Personal data privacy – the sacrificial lamb in this era of big data?

By Michele Tucci, head of product development and data partnerships at CredoLab

Data is at the heart of what most technology companies do. Netflix uses data to recommend movies and TV series that consumers are likely to watch and enjoy. Facebook uses data to recommend products that users are likely to want and buy. CredoLab uses data to create tailor-made scorecards that help financial institutions extend loans to consumers who are excluded from traditional financial systems and in need of credit.

Among the financially excluded could be a young student in need of a university loan, an immigrant seeking to grow his small business, or an industrious, trustworthy individual in need of cash to tide him over a difficult patch. These underbanked and underserved individuals, if not integrated into the traditional financial system, stay at the bottom of the credit pile and remain socio-economically disadvantaged.

Extracting and processing data is not a problem per se. The problem arises when data is misused: when data is extracted from customers without their knowledge or consent, and when the insights produced deliver no real value to them as consumers and are of value only to the company.

The problem is further exacerbated when personal data is extracted and not properly protected. In September 2017, Equifax (one of the three major consumer credit reporting agencies in the U.S.) revealed that hackers had gained access to company data, potentially compromising the sensitive personal information of some 143 million American consumers, including social security numbers and driver’s license numbers. In May 2018, a prison technology company used by law enforcement agencies across the U.S. allegedly had its data breached by a hacker; the company’s service was reportedly able to track the live location of almost any mobile phone in the country.

Anonymous data is meaningless until used to predict consumer behavioural trends

A data compromise in any form could mean colossal damage to a data-driven business. But what if you used ‘metadata’ instead of non-anonymous data? Metadata, or information about other data, does not include personal identification details such as the borrower’s name, ID number, date of birth, race or nationality, and is extracted from the smartphones of borrowers completely anonymously.

Metadata only becomes useful when technology is used to turn this data into meaningful, highly predictive, behavioural insights. For example, by using proprietary technology to extract and analyse this metadata, we can help lenders better predict a customer’s risk profile and repayment behavior.

For example, we may access information about a digital image but we won’t see the actual picture. The metadata processed may include how large the picture is, the color depth, the image resolution, when the image was created, or even the shutter speed. A text document's metadata may contain information about how long the document is and when it was written. Metadata within web pages may also contain descriptions of page content, as well as keywords linked to the content. However, without any context or point of reference, it is almost impossible to tell what a piece of metadata describes just by looking at it.
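As a rough illustration of describing a file without looking inside it, the sketch below reads only what the operating system records about a file. It is a minimal example under stated assumptions, not a description of CredoLab's technology: the function name `describe_file` and the choice of fields are hypothetical.

```python
import os
from datetime import datetime, timezone


def describe_file(path: str) -> dict:
    """Collect metadata about a file without opening or reading its contents.

    Hypothetical helper for illustration only.
    """
    info = os.stat(path)  # asks the file system about the file; the content is never read
    return {
        "size_bytes": info.st_size,
        "modified_utc": datetime.fromtimestamp(
            info.st_mtime, tz=timezone.utc
        ).isoformat(),
        "extension": os.path.splitext(path)[1].lower(),
    }
```

Calling `describe_file` on a photo or a text document returns its size, last-modified time and file extension, while the picture or the text itself stays unseen.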

A further illustration can be seen through this example. Imagine that you have a database containing 13-digit number strings. These strings could be the result of calculations or a list of numbers to plug into an equation. In other words, without any context, the numbers themselves can be perceived as the data. However, if it is known that this database is a log of a book collection, those 13-digit numbers may now be identified as International Standard Book Numbers (ISBNs): information that refers to the book, not information within the book.
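The ISBN example can be sketched in a few lines of code. Once we know the context, a 13-digit string can be checked against the ISBN-13 rules: it starts with 978 or 979, and the sum of its digits, weighted alternately by 1 and 3, is divisible by 10. The function name `looks_like_isbn13` is hypothetical and the snippet is only an illustration of how context gives numbers meaning.

```python
def looks_like_isbn13(value: str) -> bool:
    """Return True if a string satisfies the ISBN-13 prefix and check-digit rules."""
    if len(value) != 13 or not (value.isascii() and value.isdigit()):
        return False
    if not value.startswith(("978", "979")):
        return False
    # Weights alternate 1, 3, 1, 3, ... across all 13 digits;
    # a valid ISBN-13 has a weighted sum divisible by 10.
    total = sum(
        int(digit) * (1 if index % 2 == 0 else 3)
        for index, digit in enumerate(value)
    )
    return total % 10 == 0
```

Without the knowledge that the database is a book log, nothing would prompt anyone to run such a check, and the strings would remain just numbers.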

Anonymous but data protection still critical

The lenders and card issuers we work with have the option to extract either anonymous or non-anonymous data. However, given the increasing incidence of data exploitation and violation we are seeing today, many deliberately opt for the former. The solution to the misuse of data is not to phase out the use of data completely, as this would be unthinkable. The solution lies in ensuring we use this data to do what we set out to do from the outset: to bank the unbanked and serve the underserved.

Our main goal is to improve the standard of living for people and businesses that have poor or even no access to mainstream financial services. The beauty of living in an era of advanced technology and big data is that almost every other person around you owns a smartphone. Companies like CredoLab leverage smartphone technology to turn anonymous data into highly predictive insights into consumer repayment behavior that go beyond the realms of traditional credit scoring.

The result? New customer segments and more cost-effective credit decisions for financial institutions and lenders, and higher likelihoods of consumers receiving much-needed credit. All of this can be achieved without any infringements on data privacy or security. If done right, it can be a win-win for all.