Here is an example query:

from hdfsAuditLogEventStream[(str:regexp(sensitivityType,'.*Social_Security.*')==true)] select * insert into outputStream;
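For anyone who wants to reproduce the evaluation outside Eagle, here is a minimal standalone sketch of how such a regexp filter runs in the underlying Siddhi engine. It assumes Siddhi 3.x with the str extension jar on the classpath, and the two-field stream definition is a simplification of the real hdfsAuditLogEventStream:

import org.wso2.siddhi.core.ExecutionPlanRuntime;
import org.wso2.siddhi.core.SiddhiManager;
import org.wso2.siddhi.core.event.Event;
import org.wso2.siddhi.core.stream.input.InputHandler;
import org.wso2.siddhi.core.stream.output.StreamCallback;

public class SensitivityPolicySketch {
    public static void main(String[] args) throws Exception {
        SiddhiManager manager = new SiddhiManager();
        // Simplified stream: the real hdfsAuditLogEventStream carries more fields.
        String plan =
            "define stream hdfsAuditLogEventStream (user string, sensitivityType string);"
          + "from hdfsAuditLogEventStream[str:regexp(sensitivityType,'.*Social_Security.*')==true] "
          + "select * insert into outputStream;";
        ExecutionPlanRuntime runtime = manager.createExecutionPlanRuntime(plan);
        runtime.addCallback("outputStream", new StreamCallback() {
            @Override
            public void receive(Event[] events) {
                for (Event e : events) {
                    System.out.println("ALERT: " + java.util.Arrays.toString(e.getData()));
                }
            }
        });
        InputHandler input = runtime.getInputHandler("hdfsAuditLogEventStream");
        runtime.start();
        input.send(new Object[]{"daniel", "Social_Security_Number"}); // matches the regexp -> alert
        input.send(new Object[]{"daniel", "PhoneNumber"});            // no match -> no alert
        Thread.sleep(500); // give the engine time to flush the callback
        runtime.shutdown();
    }
}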
-----Original Message-----
From: Zhang, Edward (GDI Hadoop) [mailto:yonzh...@ebay.com]
Sent: Friday, February 05, 2016 4:11 PM
To: dev@eagle.incubator.apache.org
Subject: Re: Policy based on sensitive types stops working if there are too many sensitive items

Can you please show one policy? I thought it might be that the policy itself is too complicated for the engine to parse and evaluate.

Thanks
Edward

On 2/5/16, 15:46, "Daniel Zhou" <daniel.z...@dataguise.com> wrote:

>Hi all,
>
>Has anyone tested Eagle's performance in this situation?
>
>1. A large number (e.g., 2000) of sensitive items are stored in the
>HBase table "fileSensitivity".
>
>2. Policies are created based on sensitivity types.
>
>I am asking because it seems that policies based on sensitivity types
>stop working if there are too many sensitive items (1700+).
>
>Here is what I did:
>At first I created about 20 HDFS policies based on 20 sensitivity
>types, such as "creditCard" and "PhoneNumber", and the table
>"fileSensitivity" held 10 sensitive entries; alerts were triggered
>when I performed HDFS operations on those sensitive items.
>Then I injected 1700 sensitive items into the table "fileSensitivity"
>by calling Eagle's API. After that, when I operated on sensitive items
>from the Hadoop terminal, alerts could NOT be triggered.
>Note that at that point, policies based on attributes such as "src"
>and "dest" still worked.
>
>To fix it, I deleted the table "fileSensitivity" and created a new one
>with the same name, this time injecting only 5 items. The 20 HDFS
>policies then started working again.
>
>So I'm wondering: is this a performance issue?
>
>My cluster contains two machines, both:
>CentOS 6, 4 cores, 15.58 GB RAM, 50 GB disk (20% used),
>HDP-2.2.9.0-3393
>
>Regards,
>Daniel
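For reference, the bulk-injection step Daniel describes above can be scripted against Eagle's generic entity REST API. Below is a minimal sketch, assuming the FileSensitivityService service name and the /eagle-service/rest/entities endpoint as in the Eagle sandbox examples; the host, port, credentials, sensitivity type, and file paths are placeholders, so verify all of them against your deployment:

import java.io.OutputStream;
import java.net.HttpURLConnection;
import java.net.URL;
import java.nio.charset.StandardCharsets;
import java.util.Base64;

public class InjectSensitivitySketch {
    public static void main(String[] args) throws Exception {
        // Placeholder host/port/credentials; endpoint shape follows the sandbox examples.
        URL url = new URL("http://localhost:9099/eagle-service/rest/entities"
                + "?serviceName=FileSensitivityService");

        // Build a JSON array of 1700 sensitivity entries, one per (hypothetical) HDFS path.
        StringBuilder body = new StringBuilder("[");
        for (int i = 0; i < 1700; i++) {
            if (i > 0) body.append(',');
            body.append("{\"prefix\":\"fileSensitivity\","
                    + "\"tags\":{\"site\":\"sandbox\",\"filedir\":\"/tmp/private" + i + "\"},"
                    + "\"sensitivityType\":\"Social_Security\"}");
        }
        body.append(']');

        HttpURLConnection conn = (HttpURLConnection) url.openConnection();
        conn.setRequestMethod("POST");
        conn.setRequestProperty("Content-Type", "application/json");
        String auth = Base64.getEncoder()
                .encodeToString("admin:secret".getBytes(StandardCharsets.UTF_8));
        conn.setRequestProperty("Authorization", "Basic " + auth);
        conn.setDoOutput(true);
        try (OutputStream os = conn.getOutputStream()) {
            os.write(body.toString().getBytes(StandardCharsets.UTF_8));
        }
        System.out.println("HTTP " + conn.getResponseCode());
    }
}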