Researchers love talking about their data and methods as a “toolbox,” and with the rise of big data, they’ve got a fancy new tool. It’s important to remember, however, that the reason for carrying a toolbox is that there are very few projects where just one tool, no matter how powerful, is sufficient to get the job done. It’s necessary to recognize the weaknesses of big data as well as its strengths, and to think about what other types of data are needed to complement it.
The strengths of big data stem from its scope and depth. If we imagine the data gathered by a major payments provider, it covers not only a very large number of individuals but also has a very large number of data points for each individual (since each transaction is a data point). As financial transactions around the world become increasingly digitized, big data can shed light on patterns such as how frequently and for what purposes customers use their accounts. When combined with the data that is gathered from our mobile phones, social media and Internet searches, big data is beginning to allow service providers to simplify identity verification and to better assess the creditworthiness of individuals looking to enter the financial system, just to name two examples.
But perhaps counterintuitively, big data’s weaknesses also relate to its scope and depth. For questions of financial inclusion, it is particularly important to remember that there is still a significant population with a light data footprint. Although mobile phones have spread at lightning pace, the most recent estimate suggests that a quarter of the world’s adult population still does not have a mobile subscription. Even with a conservative definition of what counts as “active,” the share of adults with an active account is just over half (53 percent). The director of the Oxford Internet Institute recently reminded us that the majority of the world’s population remains disconnected. And while big data from transactions, mobile phones and social media can tell us what is going on, it does not lend itself to answering the question of why we are observing certain patterns.
To complement big data, we need data that can be gathered from those whose lives remain mostly undigitized, and that provides a different kind of depth by combining quantitative and qualitative methods.
A well-known example of this in financial inclusion is Portfolios of the Poor. The data in the book comes from the financial diaries methodology pioneered by Stuart Rutherford, Orlanda Ruthven, Daryl Collins and Jonathan Morduch. The diaries gather detailed financial data by interviewing households at least twice a month for a full year, “covering such territory as assets and debts, cash flows in and out of the households, financial instruments, employment, financial goals, and attitudes about money.”
The methodology has now been deployed in parts of South Asia, Africa, Central and South America, as well as the United States. Although the in-depth nature of the data gathered necessitates a relatively small sample, it is focused on the underserved. The data reveals both the complex ways that households actively manage their money and how they feel about their finances. For big-picture patterns identified by big data, such as account dormancy, data from studies like the financial diaries can help us generate theories as to whether those accounts are dormant due to distrust of banks, competition from savings groups that help individuals maintain their self-discipline – or because account holders simply forgot their PIN.
These two different types of data could be seen as opposite ends of a spectrum. But one thing they have in common is that both can seem intimidating to collect and analyze. Given that one of big data’s defining characteristics is that it cannot be processed with standard applications, it is understandable that gathering and synthesizing this data will be beyond the capacity of most organizations. Likewise, the personnel costs of conducting yearlong, in-depth mixed-methods studies make it unlikely that many organizations will commit to this type of research.
But in between the extremes of this spectrum are types of data that nearly every organization can gather to help it learn, iterate and improve. As Daniella Ballou-Aares and Tony Pipa recently argued, we need an “all of the above strategy” for development data. For instance, a small microfinance organization that may not have massive amounts of transaction data or the resources to conduct financial diaries can still learn a lot from deep-dive interviews with its clients. And a government regulator may be able to gather new insights by opening up administrative data sets that were previously inaccessible or unusable for most internal employees. Once we have the humility to recognize that our policies, products and programs won’t be perfect at first, it becomes obvious that we should think about how we can gather data to inform refinements.
We all know the saying that “If all you have is a hammer, everything looks like a nail” – and this principle applies even more if your primary tool is something as legitimately exciting as big data. Although big data has enormous potential and diverse uses that we still haven’t fully appreciated, it will not be ideally suited to all problems, and will work best in combination with the tools we already have.
The potential and limitations of big data were among the topics under discussion at Rethinking Financial Inclusion: Smart Design for Policy and Practice, a program offered by Harvard Kennedy School Executive Education and Evidence for Policy Design. The program is designed to help practitioners and policymakers working in financial inclusion think about the tools they can use to drive innovation in their organizations. It featured panel discussions with practitioners working on big data and financial inclusion, including Stefan Hunt and Christoph Riedl, and practitioners gathering in-depth data through mixed-methods approaches, including David Porteous. In addition, afternoon sessions led by Asim Khwaja, Harvard professor and co-director of Evidence for Policy Design, helped small groups think about how cycles of data gathering, learning and iteration can be applied to solve puzzles that participants bring to the course. This is the third year of a program that has attracted scores of participants from banks, microfinance institutions, NGOs and government ministries from around the world.
Michael Fryar is a research fellow with Evidence for Policy Design at Harvard Kennedy School, where he focuses on training initiatives that aim to equip policy decision-makers with practical skills and frameworks for effectively applying data and evidence in their work.
This article originally appeared in NextBillion Financial Innovation. It is reposted here with permission.