[ https://issues.apache.org/jira/browse/MADLIB-911?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Frank McQuillan reassigned MADLIB-911:
--------------------------------------

    Assignee: Himanshu Pandey

> Anonymization
> -------------
>
>                 Key: MADLIB-911
>                 URL: https://issues.apache.org/jira/browse/MADLIB-911
>             Project: Apache MADlib
>          Issue Type: New Feature
>          Components: Module: Utilities
>            Reporter: Frank McQuillan
>            Assignee: Himanshu Pandey
>            Priority: Major
>              Labels: starter
>
> Story
> As a data scientist, I want to perform anonymization operations on my data,
> so that I can prepare it for input to predictive analytics algorithms. I
> also want to be able to de-anonymize my data.
>
> Proposed functionality:
> * Create a conversion table for anonymization.
> * Create an anonymized version of a table.
> * Create a de-anonymized version of a table.
>
> Must be able to:
> * anonymize multiple columns in a table
> * join datasets correctly, even on masked columns
> * compute aggregates on masked columns that match those on the original data
>
> References
> [1] PDL tools
> http://pivotalsoftware.github.io/PDLTools/group__grp__anonymization.html
> [2] General information on anonymization
> https://en.wikipedia.org/wiki/Data_anonymization

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
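
For illustration only, the requirements in the story can be sketched in plain Python. This is not the MADlib or PDL Tools interface: the function names (build_conversion_table, anonymize, deanonymize) and the sample data are hypothetical, and in-memory lists of dicts stand in for database tables. The sketch assumes one conversion table per column, shared by every dataset that contains that column, which is what keeps joins and grouped aggregates consistent after masking.

```python
import secrets
from collections import Counter


def build_conversion_table(values):
    """Map each distinct original value to a unique random token."""
    table = {}
    used = set()
    for value in values:
        if value not in table:
            token = secrets.token_hex(8)
            while token in used:  # retry on the (unlikely) collision
                token = secrets.token_hex(8)
            used.add(token)
            table[value] = token
    return table


def anonymize(rows, columns, tables):
    """Return copies of rows with the named columns replaced by tokens."""
    return [
        {k: (tables[k][v] if k in columns else v) for k, v in row.items()}
        for row in rows
    ]


def deanonymize(rows, columns, tables):
    """Invert anonymize() using the same conversion tables."""
    reverse = {c: {tok: orig for orig, tok in tables[c].items()} for c in columns}
    return [
        {k: (reverse[k][v] if k in columns else v) for k, v in row.items()}
        for row in rows
    ]


# Hypothetical sample data: two datasets that share the cust_id column.
customers = [
    {"cust_id": "c1", "city": "Austin"},
    {"cust_id": "c2", "city": "Boston"},
]
orders = [
    {"cust_id": "c1", "amount": 10},
    {"cust_id": "c1", "amount": 5},
    {"cust_id": "c2", "amount": 7},
]

# One conversion table per masked column, built over all datasets using it.
tables = {
    "cust_id": build_conversion_table(r["cust_id"] for r in customers + orders),
    "city": build_conversion_table(r["city"] for r in customers),
}

masked_customers = anonymize(customers, ["cust_id", "city"], tables)
masked_orders = anonymize(orders, ["cust_id"], tables)

# Grouped counts on the masked key have the same distribution as the original.
original_counts = sorted(Counter(r["cust_id"] for r in orders).values())
masked_counts = sorted(Counter(r["cust_id"] for r in masked_orders).values())
```

Because every distinct value maps to exactly one token and no two values share a token, equality comparisons survive masking. That is enough for equi-joins, GROUP BY counts, and distinct counts; order-based aggregates such as MIN or MAX on a masked column would not be preserved by this scheme.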