Searching for answers in data
Today I saw something at DataConnect that got me very excited. Mainly because it’s work I’d like to re-use in a future work project, but also because it’s work that combined some areas I’ve got skills in.
The session was called ‘Searching for data, not just for datasets’ and looked at making it easier to search for data. For example, if you want to answer the question ‘How many cars did the UK export last year?’, wouldn’t it be nice to search for that? You know, instead of having to
- find out which organisation publishes data about cars and where
- wonder whether their datasets about cars include data about exports too
- find the statistical release for the year you’re interested in
- get the data, work out that someone else entirely publishes data about exports
- bang the keyboard and start looking again
Robin and Ross showed how they’d been tinkering with Elasticsearch to make it return results from datasets in response to natural language queries (see the prototype). For a few queries I tried, the results it returned were iffy, but they’re collecting feedback on the results. (We systemised collecting feedback on the GOV.UK Search team and built a search relevance tool to help, so I shared that with them.)
They’re also working on making those results more findable by surfacing ontologies and classifications (in RDF) that describe the data to Google (in JSON-LD) – part of the Data on the Web Best Practices and how Google understands structured data.
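To make that a bit more concrete for future me, here's a minimal sketch of what describing a dataset to Google in JSON-LD could look like, using the schema.org `Dataset` type. This is my own illustration, not Robin and Ross's code: the dataset name, description, keywords and URL are all made up, and the `dataset_jsonld` helper is hypothetical.

```python
import json


def dataset_jsonld(name, description, keywords, url):
    """Build schema.org Dataset markup as a plain dict (hypothetical helper)."""
    return {
        "@context": "https://schema.org",
        "@type": "Dataset",
        "name": name,
        "description": description,
        "keywords": keywords,
        "url": url,
    }


# Made-up example record; none of these values come from a real publisher.
markup = dataset_jsonld(
    name="UK car exports (illustrative example)",
    description="Made-up record showing how a dataset about car exports could be described.",
    keywords=["cars", "exports", "UK"],
    url="https://example.org/datasets/uk-car-exports",
)

# JSON-LD is usually embedded in the page inside a script tag like this one.
script_tag = (
    '<script type="application/ld+json">\n'
    + json.dumps(markup, indent=2)
    + "\n</script>"
)
print(script_tag)
```

The point is that the description lives alongside the page in a form a crawler can read, rather than only in the prose around the download link.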
All of this got me excited because people use data (and other information) to answer questions. And Robin and Ross are working on making those answers more findable. Making answers to common research questions more findable is an area I expect to have to look at soon-ish.
Making those answers more findable inside a product will help our existing users, but making the answers findable on the web would be a growth loop. It would help us find new prospective users and help support our users’ community of practice.
I know that sounds like VC-backed, rocketship startup bullshit, so I should point out that it wouldn’t only benefit us. It would benefit the public good because these are common research questions, and people are already looking for our data to answer those. It shouldn’t be an arcane process to find this stuff.
To nab from Maggie Appleton’s talk on data blocks: better user experiences lead to better data science, which leads to better decision-making tools.
P.S. This note is squarely aimed at my future self, so that I come back to it if/when that project starts. But if this interests you too, drop me a line. Let’s chat!
Some history, for context
While hanging out some washing this evening, it occurred to me why I was so interested in this. TL;DR: I’ve got intellectual skin in the game. These problems have been part of my work for the last 8 years.
This stuff first interested me around 2014 or 2015 when I was working at Porism. I was tasked with increasing engagement with a reporting tool they’d built for councils, which helps council officers get statistical data about the people living in the council’s boundaries. The data helps support policy decisions and service design, so making it easier for people to use that data was a thing on my plate.
One issue was helping people get to grips with the tool. It’s pretty powerful and you can answer a lot of questions with it, but people found it hard to get started or to know how to ask their question. Engagement suffered as a result. There were lots of drop-downs and query builders and the like. The interface didn’t match users’ initial mental models.
But what makes the tool powerful is the data standards behind it all and how it links things together, for example
- the concept of homelessness
- services related to homelessness
- legal duties related to homelessness
- data about homelessness
and links all that to other useful data like
- the age of homeless people in your area
- the number of people in different age ranges in your area
so that you can do some analysis.
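A rough way to picture that linking is as a handful of subject, predicate, object triples. The sketch below is only my illustration of the idea, not how the tool is actually built: every identifier and predicate name in it is hypothetical.

```python
from collections import defaultdict

# (subject, predicate, object) triples. All identifiers are hypothetical.
TRIPLES = [
    ("concept:homelessness", "hasService", "service:housing-advice"),
    ("concept:homelessness", "hasLegalDuty", "duty:prevent-homelessness"),
    ("concept:homelessness", "hasDataset", "dataset:homeless-people-by-age"),
    ("dataset:homeless-people-by-age", "comparableWith", "dataset:population-by-age"),
]


def build_index(triples):
    """Group objects by subject and then by predicate for quick lookup."""
    index = defaultdict(lambda: defaultdict(list))
    for subject, predicate, obj in triples:
        index[subject][predicate].append(obj)
    return index


def related(index, subject):
    """Return everything linked from a subject, keyed by predicate."""
    return {pred: list(objs) for pred, objs in index.get(subject, {}).items()}


index = build_index(TRIPLES)
print(related(index, "concept:homelessness"))
print(related(index, "dataset:homeless-people-by-age"))
```

Starting from the concept, you can walk to the dataset about it, and from there to the population dataset you'd need for the comparison.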
The problem was all about making answers to questions findable, but by the time I got close to having the skills to answer it (through product management and user-centred design), I’d moved on to work at GDS.
While at GDS, I looked after GOV.UK’s site search for a year. I wrote about how we were going to make site search work smarter for users in order to make results more relevant to a user’s query. It meant I got pretty deep into indexing strategies, taxonomies, and the similarity of words. But we looked at interaction patterns and common components in search experiences too. It’s all about making things more findable.
Our work on personalisation was about findability too, but also about better, more targeted service provision.