@Mike
That said, what we have to take into account here is also quite
frustrating at times.
But it is pretty fascinating as well.

Best regards
Kay-Uwe Moosheimer

> On 30.01.2020 at 16:23, Mike Thomsen <mikerthom...@gmail.com> wrote:
> 
> That's actually a pretty fascinating use case. Our experience on this side
> of the Atlantic is that few people really care about lineage.
> 
>> On Thu, Jan 30, 2020 at 9:48 AM u...@moosheimer.com <u...@moosheimer.com>
>> wrote:
>> 
>> I think you have the wrong picture.
>> 
>> Data lineage systems such as Atlas are being pushed because GDPR
>> requires it!
>> Data lineage is by no means purely "internal diagnostics"; it has a
>> legal basis.
>> 
>> GDPR therefore defines a record-keeping requirement.
>> It states, among other things, that
>> - a description of the categories of personal data,
>> - a description of the categories of recipients of personal data,
>> including recipients in third countries or international organisations,
>> - transfers of personal data to a third country or an international
>> organisation
>> must be recorded in an audit-proof manner.
>> 
>> And if you do all this correctly, then you have to make sure that the
>> data is erasable again (right to be forgotten).
>> 
>> By the way, this does not only apply to special Data Lineage systems but
>> also to all log files, backups etc. At least as long as no other legal
>> regulation prohibits this.
>> Data Lineage is therefore not a nice feature for internal diagnostics
>> but a must.
>> 
>> So far, too few companies have thought of this. But more and more are
>> recognizing the necessity.
>> This is also the reason why Hortonworks formerly, and Cloudera now,
>> have been working hard on Atlas.
>> 
>>> On 30.01.2020 at 15:25, Mike Thomsen wrote:
>>> IANAL, but I would be surprised if NiFi provenance data even legally
>>> falls under the Right to Be Forgotten because it's internal diagnostic
>>> data that is highly ephemeral.
>>> 
>>> On Thu, Jan 30, 2020 at 9:07 AM Emanuel Oliveira <emanu...@gmail.com>
>>> wrote:
>>> 
>>>> Hi, I don't think an API at the level of atomic records makes sense:
>>>> 
>>>>   1. You can configure the retention of data provenance (the default of
>>>>   24h is "good enough"; GDPR doesn't need millisecond, real-time
>>>>   deletion, right?)
>>>> 
>>>>   https://nifi.apache.org/docs/nifi-docs/html/administration-guide.html#persistent-provenance-repository-properties
>>>> 
>>>>   2. Even if there were an API to delete FFs with an attribute =
>>>>   <some id>, it would normally be useless as well. Inbound FFs normally
>>>>   hold hundreds or thousands of records that need to be split and
>>>>   aggregated in a complex flow, so implementing a clean-up at such a
>>>>   fine-grained, atomic level would be too hard, and the extra effort
>>>>   is not needed. Your single target record would surely be part of
>>>>   multiple FF UUIDs: some holding only your record, but most will
>>>>   surely hold hundreds of other records, with your record somewhere
>>>>   in the middle.
>>>> 
>>>> 
>>>> In my opinion, your answer to the business/management gatekeepers is
>>>> that data will be stored in data provenance for 24h (the default),
>>>> which can be configured, and that
>>>> 
>>>> 
>>>> Best Regards,
>>>> *Emanuel Oliveira*
>>>> 
>>>> 
>>>> 
>>>> On Thu, Jan 30, 2020 at 1:54 PM u...@moosheimer.com <u...@moosheimer.com>
>>>> wrote:
>>>> 
>>>>> Dear NiFi developer team,
>>>>> 
>>>>> NiFi's Data Provenance and Data Lineage are perfectly adequate within
>>>>> the NiFi environment, so there is often no need to use Atlas.
>>>>> 
>>>>> When using NiFi with customer data, a problem arises.
>>>>> The problem is the GDPR requirement that a user has the right to be
>>>>> forgotten. Unfortunately, I can't find any API call or information on
>>>>> how to delete individual user data from the NiFi Provenance Repository
>>>>> based on a user-defined attribute and its defined characteristics.
>>>>> 
>>>>> A delete request like "delete all data and dependencies where the
>>>>> attribute XYZ has the value 123" is currently not possible to my
>>>>> knowledge.
>>>>> My questions are:
>>>>> Is this actually possible and how? And if not, is it planned?
>>>>> 
>>>>> Thanks
>>>>> Uwe
>>>>> 
>> 
>> 
