
Executive Summaries May 23, 2018

Anonymization? Think Again

Caution is advised when considering anonymization to circumvent the rigid data collection, use and communication requirements set out in Europe's General Data Protection Regulation (GDPR) or in Canada's Personal Information Protection and Electronic Documents Act (PIPEDA).

Danielle Miller Olofsson has authored this article.

Any organization considering anonymization to circumvent the rigid data collection, use, and communication requirements set out in Europe’s General Data Protection Regulation (GDPR) or, for that matter, in Canada’s Personal Information Protection and Electronic Documents Act (PIPEDA), which by all indications will be aligned with the GDPR, should think again.


Anonymization includes a variety of techniques used to dissociate individuals from their personal information so that the information can be sold or re-used, often for research or marketing purposes, without triggering GDPR or PIPEDA compliance. Both laws impose strict restrictions on the way organizations treat personal information, which they define broadly to include any information about an identifiable person. If information is rendered anonymous, so the argument goes, it is no longer about an identifiable individual and therefore its collection, use, and disclosure are no longer subject to the GDPR or PIPEDA.


Setting aside the arguments relating to the usefulness of anonymized data, given the distortion it frequently creates, a number of other factors militate in favour of caution when choosing the anonymization route.

To begin with, from a technological point of view there is no evidence that foolproof anonymization is possible. That is to say, there is always a small risk that the data can be re-identified. One only has to think of the New York Taxi case, in 2014, in which bloggers worked out the algorithm that the New York City Taxi and Limousine Commission had used to alter the medallion numbers in the information it released, and were then able to reverse the pseudonymization process the Commission had put in place. In case one is tempted to argue that the New York Taxi case was a freak occurrence, it should be noted that 63% of the population can be identified by a small amount of data such as their gender, date of birth, and postal code.
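To illustrate why pseudonymizing a small set of identifiers is fragile, here is a minimal Python sketch. It assumes, purely for illustration, that each identifier was replaced by an unsalted SHA-256 hash and that identifiers follow a made-up four-character format (digit, letter, digit, digit); neither detail is taken from the case described above. Under those assumptions, anyone who guesses the method can hash every possible identifier and look the published values up.

```python
import hashlib
from itertools import product
from string import ascii_uppercase, digits


def pseudonymize(identifier: str) -> str:
    """Unsalted hash: the same input always yields the same output."""
    return hashlib.sha256(identifier.encode("utf-8")).hexdigest()


def candidate_ids():
    """Enumerate every value of a hypothetical format such as '5X41'."""
    for d1, letter, d2, d3 in product(digits, ascii_uppercase, digits, digits):
        yield f"{d1}{letter}{d2}{d3}"


def build_lookup() -> dict[str, str]:
    """Map each possible hash back to the identifier that produced it."""
    return {pseudonymize(c): c for c in candidate_ids()}


# What a released record would contain in place of the real identifier.
released = pseudonymize("5X41")

# Only 10 * 26 * 10 * 10 = 26,000 candidates need to be hashed.
lookup = build_lookup()
recovered = lookup[released]
print(recovered)  # prints 5X41
```

The weakness in this sketch is not the hash function but the size of the input space: when every possible identifier can be enumerated in seconds, a deterministic transformation offers little protection.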

Moreover, organizations that base their decision to release anonymized information on a risk analysis should remember that, if the process fails, even a minimal risk in percentage terms still represents a substantial number of individuals: 1% of the Québec population is roughly 80,000 people – enough for a class action!

A second technicality of which organizations should be aware is that de-identification, for example removing the list of identifiers that the United States' Health Insurance Portability and Accountability Act (HIPAA) requires to be stripped from health data before it can be shared, is not anonymization but rather a preliminary step in the anonymization process. Further analysis and measures are required, such as removing or altering other information that could identify an individual and putting controls and safeguards in place to manage the risk of re-identification.
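The gap between de-identification and anonymization can be sketched in a few lines of Python. The field names and records below are invented for illustration, and the uniqueness measure is only one simple way to gauge residual risk, not a method prescribed by HIPAA, the GDPR or PIPEDA. The sketch strips the direct identifiers and then counts how many records remain unique on gender, date of birth and postal code.

```python
from collections import Counter

# Hypothetical field names, chosen for this example only.
DIRECT_IDENTIFIERS = {"name", "email"}
QUASI_IDENTIFIERS = ("gender", "birth_date", "postal_code")


def strip_direct_identifiers(record: dict) -> dict:
    """De-identification step: drop the fields that name the person."""
    return {k: v for k, v in record.items() if k not in DIRECT_IDENTIFIERS}


def unique_fraction(records: list[dict]) -> float:
    """Share of records that are the only one with their quasi-identifier values."""
    keys = [tuple(r[q] for q in QUASI_IDENTIFIERS) for r in records]
    counts = Counter(keys)
    return sum(1 for k in keys if counts[k] == 1) / len(records)


# Made-up records for illustration.
records = [
    {"name": "A. Tremblay", "email": "a@example.com", "gender": "F",
     "birth_date": "1980-02-14", "postal_code": "H2X 1Y4", "diagnosis": "asthma"},
    {"name": "B. Roy", "email": "b@example.com", "gender": "F",
     "birth_date": "1980-02-14", "postal_code": "H2X 1Y4", "diagnosis": "diabetes"},
    {"name": "C. Gagnon", "email": "c@example.com", "gender": "M",
     "birth_date": "1975-07-30", "postal_code": "G1R 4P5", "diagnosis": "asthma"},
    {"name": "D. Lavoie", "email": "d@example.com", "gender": "F",
     "birth_date": "1992-11-03", "postal_code": "H3Z 2Y7", "diagnosis": "migraine"},
]

deidentified = [strip_direct_identifiers(r) for r in records]
fraction = unique_fraction(deidentified)
print(f"{fraction:.0%} of records are still unique")  # prints 50% ...
```

In this toy data set, half of the records remain singled out after the names and e-mail addresses are gone, which is why further generalization, suppression and access controls are needed before data can reasonably be called anonymized.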

Finally, just because information cannot be traced back to a name does not mean that it is not considered personal information. The Office of the Privacy Commissioner of Canada, in a decision involving a telecommunications company, concluded that account information, demographics, and network usage, even when not linked to a name, constituted information related to specific individuals and was therefore personal information. This decision puts an end to speculation that behavioural patterns are not personal information and therefore not subject to GDPR or PIPEDA regulation.


Organizations that choose to anonymize information either to avoid regulation or to add an extra level of security to the data they collect might wish to consider the following:

  • Anonymization is not simply a question of modifying the data itself but of protecting it against the environment into which it is released. It is therefore important to assess the properties of the data, the type of user, the type of application, the type of access, the mode of release and the attacker model to establish the appropriate level and form of anonymization for each type of data;
  • While a de-identification process may seem technologically sound today, technology changes fast and ongoing re-assessment is required;
  • Establishing a clear process for anonymization. The Office of the Australian Information Commissioner has issued a guide that is well worth consulting;
  • Restricting rather than extending access to data that has been anonymized; and
  • Putting in place a de-identification governance committee.

While there is a good place for anonymization in an organization’s handling of personal information, it should not be relied on to replace compliance with existing legislation. Our Privacy, Data Protection and Cyber-Crypto Security team would be pleased to answer any questions regarding anonymization or any other data protection matters.