Hi Daniel,

It is a great idea to add more meaningful fields to the sensitivity metadata. Please go ahead and design and add it.

My only concern is naming: what should this field be called in general, and what else might we want to add in the future? numOfOccurrences could be a good name, though an occurrence is defined differently for HDFS and for Hive.

Thanks
Edward

On Mon, Jan 11, 2016 at 7:38 PM, Daniel Zhou <daniel.z...@dataguise.com> wrote:
> Hi all,
>
> Recently I have been working on a project to automatically fetch the
> metadata of sensitive info stored in a DB and then create an Eagle policy.
> I am wondering if we can add a field called "threshold" to the current
> "fileSensitivity structure" in Eagle so that we can create policies with
> more detail.
>
> Our company's product "DgSecure" can automatically discover all the
> sensitive elements within every file in Hadoop, so we have many details
> about this sensitive information. With this information, we can make the
> policy more precise. For example, I want to create a policy based on two
> parameters: one is "sensitivity type", the other is called "threshold".
> Alerts are triggered only when the total number of elements of that
> particular sensitive type reaches or exceeds "threshold".
>
> So the trigger condition could be something like this:
> ........ if (sensitiveType == "MailAddress" && NumberOfSensData >= threshold) .....
>
> I think this condition makes more sense than just tagging a file with a
> sensitive type.
>
> Please let me know if you have any opinions or suggestions. :)
>
> Thanks!
> Daniel
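
For reference, the trigger condition discussed in this thread could be sketched roughly as below. This is only an illustration, not Eagle's or DgSecure's actual API: the class name `SensitivityPolicy`, the method `shouldAlert`, and the per-file map of occurrence counts keyed by sensitivity type are all assumptions made for the example. The type is matched through a map lookup, which also avoids comparing Java strings with `==`.

```java
import java.util.Map;

// Hypothetical sketch: none of these names come from Eagle or DgSecure.
final class SensitivityPolicy {
    private final String sensitivityType; // e.g. "MailAddress"
    private final long threshold;         // minimum occurrence count that triggers an alert

    SensitivityPolicy(String sensitivityType, long threshold) {
        this.sensitivityType = sensitivityType;
        this.threshold = threshold;
    }

    // occurrencesByType: assumed shape of the per-file metadata, mapping each
    // sensitivity type to the number of matching elements found in the file.
    boolean shouldAlert(Map<String, Long> occurrencesByType) {
        // A type that was never detected in the file counts as zero occurrences.
        long count = occurrencesByType.getOrDefault(sensitivityType, 0L);
        // Alert when the count reaches or exceeds the threshold.
        return count >= threshold;
    }
}
```

With a policy built as `new SensitivityPolicy("MailAddress", 10L)`, a file with 10 or more mail addresses would alert, while a file with 9, or one with only other sensitivity types, would not. How an "occurrence" is counted for HDFS versus Hive is left open here, as in the discussion above.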