
Navigating Open Source Tanzanian Health Registry Data

For a very long time, I have been trying to find opportunities to contribute to Tanzania’s health sector from abroad. Given that this is not my full-time job, I think the spurts of engagement I have put in are worth chronicling. This past week, I picked up where I left off and started brainstorming opportunities again. Ultimately, I got stuck once again just when I thought I might have a breakthrough. I dealt with a lot of emotions once I hit my latest roadblock: I felt disappointed, impressed, confused, and unsure all at once. This is when I do my best writing, so let’s jump into it.

This next section mixes in some personal history, since I’m not a robot here to print out technical information. Feel free to skim or skip it, though it does have some valuable info buried in there too.

Some Personal and Relevant Background

A list might help me keep this concise:

  1. I am Tanzanian.
  2. Both my parents are trained doctors.
  3. For a long time, I thought I would be a doctor.
  4. I pivoted to nursing after the first semester of college.
  5. I gave up and thought maybe public health at some point before graduating.
  6. I ended up studying sociology with more focus on education. By the end, I think there was general (personal?) consensus that my two fields of interest converged to education and health.
  7. I graduated, had an itch to scratch, and I really thought data science was going to be my next jam. So I used my graduating summer to try to find datasets to use.
  8. I found a really promising dataset in the Health Facility Registry (HFR).
  9. At the same time, I was also seeing the Data Innovation Challenges from Tanzania Data Lab and lots of folks were also aiming to address social sector issues. Very energizing.
  10. As I was learning programming for my potential data science shift, I decided scraping the data from HFR might be a good way to learn.
  11. The scraping and data cleaning resulted in me uploading a Kaggle dataset of all the facilities at the time.
  12. Somewhere along the way, I started using the Network tab in Chrome’s developer tools and realized the data populating the HFR website came from an InSTEDD ResourceMap collection.
  13. Ever since then, I’ve been ideating about whether there are novel ways to use the underlying data and add to the pool of innovations out there.
  14. This post documents digging into the API and its documentation.
  15. It also documents the result of that exploration: the API’s limitations and opportunities.

InSTEDD ResourceMap REST API Documentation

For the most part, the documentation can be found on the project’s GitHub wiki. I have found it to be about as complete as any documentation I have seen. So for the rest of this post, I’ll lean on what the wiki provides and write down the requests one can make for Tanzania specifically.

I will provide URL links since you can use the browser to play around with most of these, but a REST client or curl would work just as well.

What is the Collection Id for Tanzania?

The collection ID for Tanzania is 409. I have not found another one for Tanzania, though there could be a secret one out there.

https://resourcemap.instedd.org/api/collections/409.json?page=1&Admin_div[under]=TZ
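
For anyone who would rather script this than paste it into a browser, here is a minimal Python sketch that rebuilds the same URL using only the standard library. The helper names (collection_url, fetch_json) are my own for illustration and are not part of ResourceMap, and the sketch makes no assumption about what the response looks like.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

BASE_URL = "https://resourcemap.instedd.org/api"
TANZANIA_COLLECTION_ID = 409


def collection_url(collection_id, page=1, filters=None):
    """Build the JSON URL for one page of a collection.

    `filters` maps query parameter names (e.g. "Admin_div[under]") to values.
    """
    params = {"page": page, **(filters or {})}
    # safe="[]" leaves the square brackets unescaped so the URL stays readable.
    query = urlencode(params, safe="[]")
    return f"{BASE_URL}/collections/{collection_id}.json?{query}"


def fetch_json(url, timeout=30):
    """Download a URL and decode its JSON body (this performs a network call)."""
    with urlopen(url, timeout=timeout) as response:
        return json.load(response)
```

Calling collection_url(TANZANIA_COLLECTION_ID, filters={"Admin_div[under]": "TZ"}) reproduces the URL above, and fetch_json is the only part that touches the network.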

Some noteworthy query params

Technically, you could set page=all in that last request, but the response then becomes very slow. Using a page number gives a nicely paginated response with a count of roughly how many results to expect. The Admin_div query parameter also appears to be optional. For more on the options for filtering via the API, see this GitHub wiki page. It explains a lot of cool goodies like the [under] operator, which I don’t often see in the APIs that I work with. The other thing to note is that Admin_div is just one of the fields in the collection. So once you learn how to filter by field type, you can filter by pretty much anything!
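
To make the pagination idea concrete, here is a hedged sketch of walking the pages one at a time instead of asking for page=all. Two things in it are assumptions on my part and not something confirmed above: that each page is a JSON object with a "sites" list, and that a page past the end comes back with that list empty. The names page_url and iter_sites are hypothetical, and the fetch argument is whatever function you use to turn a URL into decoded JSON.

```python
from urllib.parse import urlencode

BASE_URL = "https://resourcemap.instedd.org/api"


def page_url(collection_id, page, filters=None):
    """Build the URL for a single page, with optional field filters."""
    params = {"page": page, **(filters or {})}
    return f"{BASE_URL}/collections/{collection_id}.json?" + urlencode(params, safe="[]")


def iter_sites(fetch, collection_id, filters=None, max_pages=1000):
    """Yield sites page by page until a page comes back without any.

    `fetch` is any callable that takes a URL and returns decoded JSON.
    ASSUMPTION: each page is an object with a "sites" list; check the wiki
    or a real response before relying on this key.
    """
    for page in range(1, max_pages + 1):
        body = fetch(page_url(collection_id, page, filters))
        sites = body.get("sites", [])
        if not sites:
            return  # an empty page is treated as the end of the collection
        yield from sites
```

Passing fetch in as an argument keeps the paging logic separate from the HTTP call, and max_pages is just a safety net in case the stopping assumption turns out to be wrong.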

Finding Fields Metadata

One challenging thing about a dataset like this, and frankly a lot of datasets that I make up on my own, is capturing the set of possible values in each field. That is why one of my favorite things about this dataset is that you can query for the field metadata:

https://resourcemap.instedd.org/api/collections/409/fields.json

There is a dedicated section for Metadata in the wiki.
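
Since I am not reproducing the response here, the sketch below does not assume any particular shape for the metadata. It simply summarizes whatever JSON comes back (dictionary keys, the first element of each list, and value types) so you can see where the field definitions and their allowed values live. The function names outline and fetch_fields_outline are mine, purely for illustration.

```python
import json
from urllib.request import urlopen

FIELDS_URL = "https://resourcemap.instedd.org/api/collections/409/fields.json"


def outline(value, depth=3):
    """Summarize nested JSON down to `depth` levels.

    Dicts keep their keys, non-empty lists are represented by their first
    element plus a count, and everything else becomes its type name.
    """
    if depth > 0 and isinstance(value, dict):
        return {key: outline(item, depth - 1) for key, item in value.items()}
    if depth > 0 and isinstance(value, list) and value:
        return [outline(value[0], depth - 1), f"... {len(value)} item(s) in total"]
    return type(value).__name__


def fetch_fields_outline(url=FIELDS_URL, timeout=30):
    """Download the field metadata and return its outline (network call)."""
    with urlopen(url, timeout=timeout) as response:
        return outline(json.load(response))
```

Printing the result of fetch_fields_outline() is a quick way to learn the structure before writing anything that depends on specific keys.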

Finding a Specific Site

This can be done using:

https://resourcemap.instedd.org/api/sites/1254962.json

Surprisingly, this resource is not under the specific Tanzanian collection. Do note that the ID being referenced is the globally unique one and not the custom field known as Fac_IDNumber. For this and other more comprehensive REST API goodies, this page was most helpful.
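
A small sketch of the same lookup in Python, mostly to underline that the path hangs off /sites and not off the collection. I am assuming the global ID is always an integer, as it is in the example above; site_url and fetch_site are hypothetical helper names.

```python
import json
from urllib.request import urlopen

BASE_URL = "https://resourcemap.instedd.org/api"


def site_url(site_id):
    """Build the URL for one site from its global ResourceMap ID.

    ASSUMPTION: global IDs are integers. int() raises ValueError for
    non-numeric strings, which catches a non-numeric identifier passed
    in by mistake.
    """
    return f"{BASE_URL}/sites/{int(site_id)}.json"


def fetch_site(site_id, timeout=30):
    """Download and decode a single site (this performs a network call)."""
    with urlopen(site_url(site_id), timeout=timeout) as response:
        return json.load(response)
```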

Activity Feed

The last good news I wanted to share was how excited I was to find an RSS feed for changes! How incredible was this!

https://resourcemap.instedd.org/api/activity.rss?collection_ids[]=409

The collection_ids[] parameter filters the feed down to our collection of interest, as the URL above does.
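
Here is a rough sketch of reading that feed with nothing but the Python standard library. It assumes the feed is ordinary RSS 2.0, with item elements that carry a title and a pubDate; I have not verified the exact elements ResourceMap emits, so treat the element names as assumptions. The helper names activity_feed_url and parse_activity are my own.

```python
import xml.etree.ElementTree as ET
from email.utils import parsedate_to_datetime
from urllib.parse import urlencode

BASE_URL = "https://resourcemap.instedd.org/api"


def activity_feed_url(collection_ids):
    """Build the activity feed URL for one or more collection IDs."""
    # doseq=True repeats the collection_ids[] parameter once per ID.
    query = urlencode({"collection_ids[]": list(collection_ids)}, doseq=True, safe="[]")
    return f"{BASE_URL}/activity.rss?{query}"


def parse_activity(rss_text):
    """Return (published, title) pairs for every <item> in an RSS document.

    ASSUMPTION: standard RSS 2.0 items with <title> and <pubDate>.
    `published` is a datetime, or None when an item has no <pubDate>.
    """
    root = ET.fromstring(rss_text)
    entries = []
    for item in root.iter("item"):
        title = item.findtext("title", default="")
        pub_date = item.findtext("pubDate")
        published = parsedate_to_datetime(pub_date) if pub_date else None
        entries.append((published, title))
    return entries
```

activity_feed_url([409]) gives back the URL shown above, and parse_activity works on the text of the feed however you choose to download it.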

Disappointment Is Around the Corner

I found the information about the feed and was ready to start building on top of this valuable information! Because again, up until now, I had been scraping this information the hard way. Then I noticed something strange: there had been no data updates since 2021! And I thought to myself, maybe that means Tanzania stopped building health facilities post-COVID. But that doesn’t make a lot of sense now, does it? When I revisited the HFR website, I found pages dedicated to showing changes in the past month, the past 3 months, and this year. Lo and behold, changes have been happening, but it seems that Tanzania moved away from using this wonderful tool. I suppose the next step might be to ask around and see if there are other ways to integrate with this data using APIs or other means of integration.
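
The staleness check I did by eye can be written down in a few lines. This sketch takes a list of timezone-aware publication dates from the activity feed, however you obtained them, and reports how many days have passed since the newest one. The function name days_since_last_update is hypothetical, and what counts as "too old" is left for you to decide.

```python
from datetime import datetime, timezone


def days_since_last_update(timestamps, now=None):
    """Whole days between the newest timestamp and `now`.

    `timestamps` should be timezone-aware datetimes; None entries are
    ignored. Returns None when there is nothing to compare against.
    """
    dated = [ts for ts in timestamps if ts is not None]
    if not dated:
        return None
    now = now or datetime.now(timezone.utc)
    return (now - max(dated)).days
```

A result in the hundreds or thousands of days, next to a source website that still shows changes from the past month, is a strong hint that the feed is no longer where updates land.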

Conclusion

That is pretty much all that I wanted to touch on for this post. I had a lot of fun playing around with the APIs and dreaming up ways to augment this information in a way that adds value to folks. Fortunately or unfortunately, the challenge seems to once again be building a platform on top of which others can build. While it won’t be real-time, it should be possible to build something that gets us closer to the ergonomics of that API using the lemons we have today. It might also be easier to just ask if they plan to release the data as a consumable API at a later time. I am currently planning to write a follow-up and to release something of a beta of such a platform while I figure out who to ask about this in the Tanzanian community.