Andy,

You raise a great point about considering the provenance. Unless there's a
way to exclude attributes from provenance tracking, I think we'd need to
force the issue by not allowing attributes to be an input source for
expression language. That's the only way to kinda force people to think
"hey, I shouldn't put this here." In my opinion, that's not really
something we should allow given the ramifications of people using the
feature without reading up on the relevant documentation.

On Wed, Jun 20, 2018 at 1:35 PM Andy LoPresto <alopresto.apa...@gmail.com>
wrote:

> Sivaprasanna,
>
> Thanks for joining this effort. I don’t recall what’s on the existing
> Jira, but please be very aware of the challenges in data anonymization and
> the various threat models — de-anonymizing data can lead to the leak of
> PII, EPHI, PCI data, etc. In some cases, it can even lead to physical
> danger against persons.
>
> There are a number of high impact examples of avoidable scenarios like
> this.
>
>
> https://arstechnica.com/tech-policy/2009/09/your-secrets-live-online-in-databases-of-ruin/
>
>
> https://arstechnica.com/tech-policy/2014/06/poorly-anonymized-logs-reveal-nyc-cab-drivers-detailed-whereabouts/
>
> We should use publicly reviewed algorithms, document the risks and known
> challenges well, take into consideration provenance and other NiFi-specific
> features, and write a good summary of these features if/when they are
> introduced.
>
> Andy LoPresto
> alopre...@apache.org
> alopresto.apa...@gmail.com
> PGP Fingerprint: 70EC B3E5 98A6 5A3F D3C4  BACE 3C6E F65B 2F7D EF69
>
> > On Jun 20, 2018, at 10:06, Sivaprasanna <sivaprasanna...@gmail.com>
> wrote:
> >
> > Wow, I didn't realize there was a JIRA already. I'm interested and would be
> > happy to contribute my time & efforts on this.
> >
> >> On Wed, Jun 20, 2018 at 10:34 PM, Matt Burgess <mattyb...@apache.org>
> wrote:
> >>
> >> I think this is a great idea. I filed a Jira [1] a while ago in case
> >> someone wanted to start working on it (or in case I got a chance). It
> >> mentions ARX but any Apache-friendly implementation is of course
> >> welcome. I think it should be in its own bundle as it is functionality
> >> separate from all our other bundles (and not ubiquitous enough to put
> >> in the standard NAR).
> >>
> >> Glad to hear you're interested in this, please feel free to reach out
> >> with any questions and I too would be happy to review any
> >> contributions.
> >>
> >> Thanks,
> >> Matt
> >>
> >> [1] https://issues.apache.org/jira/browse/NIFI-4492
> >>
> >> On Wed, Jun 20, 2018 at 12:57 PM Mike Thomsen <mikerthom...@gmail.com>
> >> wrote:
> >>>
> >>> There's a framework called ARX that could be very useful for this. The
> >>> only question you have is how compliant it would be with different sets
> >>> of distinct legal requirements for privacy handling. In the absence of
> >>> strong legal guidance, I'd say err on the side of complying with health
> >>> care regulations because that's where you're likely to find the clearest
> >>> guidance and established tools.
> >>>
> >>> Ping me on any PR you send.
> >>>
> >>> On Wed, Jun 20, 2018 at 12:49 PM Sivaprasanna <sivaprasanna...@gmail.com>
> >>> wrote:
> >>>
> >>>> With data becoming more critical and substantial to business
> >>>> development, new stringent regulations and laws are being introduced
> >>>> (GDPR being a recent example). I've been spending some time lately
> >>>> researching data anonymization, and after some hefty thinking I finally
> >>>> decided to go ahead with the creation of a new processor bundle that has
> >>>> processors like 'AnonymizeRecord' and 'DeanonymizeRecord' (not quite
> >>>> sure about the names, though). Following are my questions:
> >>>>
> >>>>   - What do you guys think about these proposed processors?
> >>>>   - If the processors are okay to be introduced, are they "standard"
> >>>>   enough to be added to our 'nifi-standard-bundles' module, or is it
> >>>>   better to keep them separate, much like other bundles such as AWS,
> >>>>   Azure, etc.?
> >>>>
> >>>> Having said this, I'm very much in the beginning phase of my research
> >>>> and development efforts, so all your input & feedback on this one are
> >>>> greatly appreciated.
> >>>>
> >>>> Thanks.
> >>>>
> >>>> -
> >>>> Sivaprasanna
> >>>>
> >>
>
