Yes, it looks like we need a schema abstraction that can represent any
sensitivity information.
sensitivityType and numOfOccurrences are just two common fields of the
whole sensitivity information.

For hdfs, the sensitivity information also includes filedir, while for
hive it includes hiveResource, which could be a database, table,
column, etc.
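
To make the idea concrete, here is a rough sketch in Python of what such
an abstraction might look like. This is not Eagle code: the class name,
the helper functions, and the sample path and column are all made up for
illustration. Only sensitivityType, numOfOccurrences, filedir and
hiveResource come from this thread.

```python
from dataclasses import dataclass, field
from typing import Dict


@dataclass
class SensitivityInfo:
    """Hypothetical container: common fields plus source-specific ones."""
    sensitivity_type: str        # sensitivityType in this thread
    num_of_occurrences: int      # numOfOccurrences in this thread
    # Source-specific attributes, keyed by name, so that a new data
    # source can add its own fields without changing the common part.
    resource: Dict[str, str] = field(default_factory=dict)


def hdfs_sensitivity(filedir: str, sensitivity_type: str,
                     num_of_occurrences: int) -> SensitivityInfo:
    # For hdfs, the source-specific part is the file or directory path.
    return SensitivityInfo(sensitivity_type, num_of_occurrences,
                           {"filedir": filedir})


def hive_sensitivity(hive_resource: str, sensitivity_type: str,
                     num_of_occurrences: int) -> SensitivityInfo:
    # For hive, the source-specific part is a database, table or column.
    return SensitivityInfo(sensitivity_type, num_of_occurrences,
                           {"hiveResource": hive_resource})


# Sample values below are invented for the example.
hdfs_info = hdfs_sensitivity("/tmp/customers.csv", "MailAddress", 12)
hive_info = hive_sensitivity("default.customers.email", "MailAddress", 40)
```

The point of the free-form resource map is that a policy can read the
two common fields the same way for every source, and only source-aware
code needs to look at filedir or hiveResource.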

Thanks
Edward

On 1/13/16, 0:35, "Prasad Mujumdar" <pras...@apache.org> wrote:

> The number of occurrences is certainly a good idea.
> For HDFS and any future data sources which don't have a native schema,
> how do we handle these fields, which are defined in an external system?
> Are you proposing to add a schema abstraction as well?
>
>thanks
>Prasad
>
>
>On Tue, Jan 12, 2016 at 11:49 PM, Edward Zhang <yonzhang2...@apache.org>
>wrote:
>
>> Hi Daniel,
>>
>> That is a great idea to add more meaningful fields into the sensitivity
>> metadata; you can go ahead and design/add that.
>>
>> My only concern is: how do we name this field generally, and what else
>> is possible in the future? numOfOccurrences could be a good name, though
>> for hdfs or hive the occurrence is defined differently.
>>
>> Thanks
>> Edward
>>
>> On Mon, Jan 11, 2016 at 7:38 PM, Daniel Zhou <daniel.z...@dataguise.com>
>> wrote:
>>
>> > Hi all,
>> >
>> > Recently I have been working on a project to automatically fetch the
>> > metadata of sensitive info stored in a DB and then create an eagle
>> > policy. I am wondering if we can add a field called "threshold" to the
>> > current "fileSensitivity structure" in eagle so that we can create a
>> > policy with more details.
>> >
>> > Our company's product "DgSecure" can discover all the sensitive
>> > elements within every file in hadoop automatically, so we have many
>> > details about this sensitive information. With this information, we can
>> > make the policy more precise. For example, I want to create a policy
>> > based on two parameters: one is "sensitivity type", the other is called
>> > "threshold". Only when the total number of elements of that particular
>> > sensitive type reaches or exceeds "threshold" can the alerts be
>> > triggered.
>> >
>> > So the trigger condition could be something like this:
>> > ........ if (sensitiveType == "MailAddress" && NumberOfSensData >= threshold) .....
>> >
>> > I think this condition makes more sense than just tagging a file with
>> > a sensitive type.
>> >
>> > Please let me know if you have any opinions or suggestions. :)
>> >
>> > Thanks!
>> > Daniel
>> >
>>
