Zero Knowledge Proof
Personal data is of crucial importance to large digital companies. The GAFAMs, and more particularly social networks, exploit their users’ personal data. This allows them to sell it to third parties, especially for targeted advertising purposes, and thus generate tens of billions in profits (1). This is achieved through our data, through your data, along with the development of ever more precise tools to collect and analyse what constitutes our identity. The Cambridge Analytica scandal of 2018 showed that the interest in our data goes beyond its simple monetisation, going as far as seeking to potentially manipulate elections, and calling into question the very idea of democracy. These data gathering mechanisms are widespread, particularly in the field of advertising. Beyond the apparent goal of “improving the user experience”, we do not know what is being done with our data, who is using it and to whom it is being transferred. This is the starting point of the General Data Protection Regulation (GDPR), applicable since 25 May 2018, which aims to subject the collection of personal data to user consent. In France, the CNIL’s consent guidelines for cookies and other trackers have been enforced since the end of March 2021.
In addition, and beyond this collection and what is commonly known as “Big Data”, centralised data is of growing interest to hackers. Every week, new cases of database hacking are discovered, and the stolen information is in turn a lucrative commodity on the dark web (2).
XSL Labs’ aim is to provide a solution that will allow us to regain control of our personal data. The use of Zero Knowledge Proof (ZKP) protocols is one way of achieving this.
A ZKP is an interactive protocol between two parties, a prover and a verifier. The prover convinces the verifier that a proposition is true without transmitting any other information. Introduced in the 1980s by Goldwasser, Micali and Rackoff (3), the purpose of a ZKP is therefore to show that a piece of information is true without revealing it.
Their exact definition is as follows: “Zero-knowledge proofs are defined as those proofs that convey no additional knowledge other than the correctness of the proposition in question”.
ZKP has three essential properties:
– Completeness: If the proposition is true and both parties follow the protocol, the verifier always accepts the proof.
– Soundness: If the proposition is false, the prover cannot convince the verifier that it is true, except with an extremely low probability.
– Zero knowledge: The verifier learns only that the statement is true, and no other information.
Many interactive ZKP protocols take the form of a Σ (“sigma”) protocol, that is, a 3-move structure:
– the commitment: the prover sends the verifier a commitment, typically derived from a fresh random value, which binds the prover before the challenge is known
– the challenge: the verifier replies with a randomly chosen challenge
– the response: the prover answers this challenge, and the verifier checks the answer against the commitment.
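To make the three moves concrete, here is a minimal sketch of one well-known Σ protocol, Schnorr identification, in which the prover shows knowledge of a secret exponent x such that y = G^x mod P. The article does not name a specific protocol, so this choice is an assumption made for illustration. The group parameters P, Q and G are toy values, far too small for real use, and the function names are hypothetical.

```python
import secrets

# Toy group parameters (assumed for illustration, insecure at this size).
P = 2039  # prime modulus, P = 2*Q + 1
Q = 1019  # prime order of the subgroup generated by G
G = 4     # generator of the order-Q subgroup


def keygen():
    """Prover setup: secret x, public y = G^x mod P."""
    x = secrets.randbelow(Q - 1) + 1  # secret in [1, Q-1]
    return x, pow(G, x, P)


def commit():
    """Move 1 (prover): pick a random nonce r and send t = G^r mod P."""
    r = secrets.randbelow(Q)
    return r, pow(G, r, P)


def challenge():
    """Move 2 (verifier): send a random challenge c."""
    return secrets.randbelow(Q)


def respond(x, r, c):
    """Move 3 (prover): send s = r + c*x mod Q."""
    return (r + c * x) % Q


def verify(y, t, c, s):
    """Verifier check: G^s must equal t * y^c mod P."""
    return pow(G, s, P) == (t * pow(y, c, P)) % P


if __name__ == "__main__":
    x, y = keygen()
    r, t = commit()
    c = challenge()
    s = respond(x, r, c)
    print("proof accepted:", verify(y, t, c, s))
```

The verifier only ever sees t, c and s. Because the nonce r is fresh and random in each run, the response s masks the secret x, which is the intuition behind the zero-knowledge property.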
Ali Baba’s cave
In order to explain the zero-knowledge proof, a common example is given in the article “How to Explain Zero-Knowledge Protocols to Your Children” (4), based on the tale of Ali Baba, which we will summarise here.
In this example, Ali Baba is robbed of his purse by a thief, who escapes into a cave with two passages, one to the left and one to the right. Ali Baba, who followed the thief, does not know which way the thief went.
He chooses the left passage, but reaches a dead end. So he thinks that the thief must have gone the other way. He comes back to the entrance and then goes to the right, but realises that it is also a dead end and does not find the thief.
Over the following days, the same thing happens again: Ali Baba is robbed, arrives in front of this cave and does not find the thief, no matter which direction he takes. After 40 days, Ali Baba finally realises that this cannot be a coincidence, as the probability that he will always take the wrong direction is too small. So he decides to hide at the bottom of the cave, and discovers that the thieves use a magic word that allows them to pass from one side of the cave to the other. There was indeed a hidden door connecting the passages.
From this situation, let’s imagine two characters that we will call Peggy and Viktor, and a cave splitting into two passages A and B which are, on the other side, connected by a closed door that opens with a magic word.
Viktor wants to prove to Peggy that he knows the magic word to open the door, without giving her the word (i.e. without revealing any information).
To proceed, a scenario is created:
Step 1: Peggy waits outside the cave, without looking. Viktor enters the cave through one of the passages, at random, and goes to the door.
Step 2: Once this is done, Peggy comes to the junction and flips a coin. Depending on the result, she yells A or B.
Step 3: Viktor must appear on the side of the exit Peggy asked for in step 2.
Thus, if Viktor really knows the magic word, it is easy for him to respond to Peggy’s request and always appear on the correct side.
On the other hand, if he does not know the magic word, he has a 50/50 chance of failing at each attempt, since he is already on one side of the cave, and in order to succeed, this side must match the one requested by Peggy after the coin toss.
If the scenario is repeated, the chance that a Viktor who does not know the magic word succeeds every time halves with each round. After, say, twenty successive attempts, that probability is 1/2^20, which is less than one in a million.
In this example, the repetition of the scenario in the form of an interactive game shows that it becomes increasingly unlikely that Viktor does not know the magic word. One qualification is that this system can never provide a one hundred percent guarantee that the proposition is true.
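The repeated game can be sketched as a small simulation. This is only an illustration of the probabilities involved, not a cryptographic protocol, and the function names and the fixed seed are assumptions introduced here. A Viktor who knows the word passes every round, while a cheating Viktor passes a round only when the passage he entered happens to match Peggy's coin toss.

```python
import random


def run_round(knows_word, rng):
    """One round of the cave game; returns True if Viktor exits on the requested side."""
    entered = rng.choice("AB")  # passage Viktor picked at random
    asked = rng.choice("AB")    # side Peggy calls after her coin toss
    # With the magic word Viktor can always cross the door; without it he
    # succeeds only if he already happens to be on the requested side.
    return knows_word or entered == asked


def cheater_passes(rounds, rng):
    """True if a Viktor who does not know the word survives all rounds."""
    return all(run_round(False, rng) for _ in range(rounds))


def estimate_cheat_rate(rounds, trials, seed=0):
    """Monte Carlo estimate of the cheater's overall success probability."""
    rng = random.Random(seed)
    wins = sum(cheater_passes(rounds, rng) for _ in range(trials))
    return wins / trials


if __name__ == "__main__":
    for n in (1, 3, 5, 10):
        print(n, "rounds: estimated", estimate_cheat_rate(n, 20000),
              "theoretical", 0.5 ** n)
    # Exact bound for the twenty rounds discussed in the text.
    print("20 rounds, theoretical:", 0.5 ** 20)
```

The estimates should track the theoretical value 1/2^n, which is why a modest number of rounds is enough to make cheating implausible, without ever reaching a one hundred percent guarantee.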
However, we can imagine a different protocol where Peggy stands at the junction, then Viktor enters one side of the cave and exits the other side, which proves in one try that Viktor knows the magic word, without giving any other information to Peggy than the proof that he knows it. Note that in such a scenario, the whole process could be observed or recorded by Peggy or a third party. In this case, the recording would itself be convincing, so anyone could eventually learn of Viktor’s knowledge of the magic word.
The initial example is a case of ZKP in interactive form: as mentioned earlier, the protocol takes the form of an iterative “game” between the two parties.
ZKP and data protection
As pointed out by the CNIL (the French National Commission on Informatics and Liberty) in a September 2018 document related to the EU GDPR (5), the requirements of the GDPR are difficult to apply to blockchain.
Indeed, one of the properties of the blockchain consists in the irreversibility of the recorded data: once data is stored on the blockchain, it cannot be modified or deleted.
Therefore, while the rights of access and portability of personal data can be effectively exercised, this cannot apply to the rights of rectification and deletion of data stored on the blockchain. The CNIL notes that it is “technically impossible to comply with the data subject’s request for deletion when data are recorded in the blockchain” and presents in this document initial recommendations, pending an in-depth evaluation.
The main idea is that no unencrypted personal data should be written on the blockchain.
In practice, all traditional transactions (e.g. in Bitcoin) are public: anyone can see an address’s balance and transactions. Pseudonymity therefore offers only an illusion of privacy, as pseudonyms can ultimately be traced back to individuals (6).
Notably, the Zcash project uses non-interactive ZKP in order to be able to hide all of the transaction data (7). This project relies on “zk-SNARKs” (for Zero-Knowledge Succinct Non-Interactive Argument of Knowledge), a specific system of efficient (succinct) proof that consists of a single message transmitted from the prover to the verifier (non-interactive). Other players are interested in this type of ZKP, such as JP Morgan or ING (8).
At this point, let us recall the main difference between interactive and non-interactive ZKP protocols: the interactive protocol requires repeated exchanges between the prover and the verifier, whereas the non-interactive protocol requires minimal interaction, typically a single proof sent to the system, so that any verifier can check this proof at any time.
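One standard way to turn an interactive proof into a non-interactive one is the Fiat-Shamir heuristic: the verifier's random challenge is replaced by a hash of the prover's commitment. The sketch below applies it to a Schnorr-style proof of knowledge of a secret exponent x with y = G^x mod P. It is not how zk-SNARKs work internally, which is a far more involved construction; it only illustrates the "single message" idea. The toy parameters, the function names and the `context` argument are assumptions introduced for this example.

```python
import hashlib
import secrets

# Toy group parameters (assumed for illustration, insecure at this size).
P, Q, G = 2039, 1019, 4  # P = 2*Q + 1, G generates the order-Q subgroup


def hash_challenge(y, t, context):
    """Derive the challenge from the public values instead of asking a verifier."""
    data = f"{G}|{P}|{y}|{t}|".encode() + context
    return int.from_bytes(hashlib.sha256(data).digest(), "big") % Q


def prove(x, context=b""):
    """Produce a one-message proof of knowledge of x, where y = G^x mod P."""
    y = pow(G, x, P)
    r = secrets.randbelow(Q)           # fresh random nonce
    t = pow(G, r, P)                   # commitment
    c = hash_challenge(y, t, context)  # challenge computed by hashing
    s = (r + c * x) % Q                # response
    return y, (t, s)


def verify(y, proof, context=b""):
    """Anyone can recompute the challenge and check the proof at any time."""
    t, s = proof
    c = hash_challenge(y, t, context)
    return pow(G, s, P) == (t * pow(y, c, P)) % P


if __name__ == "__main__":
    y, proof = prove(123, b"demo")
    print("proof accepted:", verify(y, proof, b"demo"))
```

Because the proof is a single pair (t, s) that can be checked offline, it can be published once and verified by any party later, which matches the non-interactive setting described above.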
When it comes to XSL Labs, the SDI will behave like a ZKP protocol where appropriate. A common example is age verification, which will not require the transmission of information from an identity document, but solely a proof stating “age is over X”, without providing the date of birth, or even the precise age, let alone all the information that may be contained in an identity document. Another common case is a proof of possession of a certain amount of money without telling the other party the total amount of our funds or revealing particularly sensitive banking information (a typical example is Yao’s Millionaires’ Problem (9), which can be answered with ZKP protocols).
All in all, ZKP is a promising technology for solving data security issues, as well as an answer to issues related to the EU General Data Protection Regulation. Zero knowledge proofs are starting to be used in various industries, notably by some banks. This technology may in the future find applications in many areas of information protection, starting with authentication, research and even electronic voting. The use of ZKP is set to spread massively, and it could become a standard in the future. At XSL Labs, we are part of this revolution in usage, and thus participate in the evolution of the Internet of trust.