Hi Sethukumar

Your requests are reasonable. Let's start with creating the JIRA. Also, if you
are planning to make a specific contribution, let us know.

Thanks

Bosco


From:  Sethukumar Ramachandran <[email protected]>
Reply-To:  "[email protected]"
<[email protected]>
Date:  Tuesday, April 14, 2015 at 9:01 PM
To:  "[email protected]" <[email protected]>,
"[email protected]" <[email protected]>
Subject:  RE: Some Apache Ranger queries/thoughts

> Thanks Durai for the responses. I'm happy to contribute to Ranger in whatever
> way I can. I shall create JIRAs with detailed descriptions/requirements for
> these items: (1) eliminating multiple entries for a single event, (2) auditable
> actions in HDFS and Hive (would be really nice if this is based on some
> configurable patterns), (3) Ranger to capture the exact nature of the event
> (update, create, delete, permission modified, ACL created, etc.).
>  
> On the fourth item, it is not exactly the policy changes (policy changes in
> Ranger keep track of old value and new value for any kind of change) but any
> changes happening in HDFS and Hive which can be defined in some fashion. For
> example, in HDFS we need to audit file/folder creation, modification to the
> same, deletion, user creation, user permission changes, ACL changes, Hive
> grants and revokes, etc., just to list some of them (can go into detail in JIRA
> with exact requirements). For these kinds of changes it is required to keep
> track of what changed, from what value to what value, by whom, and when. If
> such a change attempt resulted in failure, that also needs to be audited.
>  
>  
> Hope this outlines the requirements. I shall start creating JIRAs for these;
> please let me know how I can contribute to this.
>  
>  
> Thanks
> Sethukumar
>  
> 
> From: Don Bosco Durai [mailto:[email protected]] On Behalf Of Don Bosco
> Durai
> Sent: Wednesday, April 15, 2015 6:44 AM
> To: [email protected]; [email protected]
> Subject: Re: Some Apache Ranger queries/thoughts
>  
> 
> Hi Sethukumar
> 
>  
> 
> Thanks for your input. My responses are inline.
> 
>  
> 
> Regards
> 
>  
> 
> Bosco
> 
>  
> 
>  
> 
> From: Sethukumar Ramachandran <[email protected]>
> Reply-To: "[email protected]"
> <[email protected]>
> Date: Tuesday, April 14, 2015 at 2:48 AM
> To: "[email protected]" <[email protected]>
> Cc: "[email protected]" <[email protected]>
> Subject: Some Apache Ranger queries/thoughts
> 
>  
>> 
>> Hello all,
>>  
>> We are using HDP 2.2 and have set up Apache Ranger along with it on Ubuntu
>> 12.04. We are not able to fulfill our audit-related requirements through
>> Ranger. At present we have the following items which we were not able to get
>> through Ranger. Please let us know whether we are missing something or
>> whether there are ways to improve.
>>  
>>  
>> 1.      As part of our audit requirements we are required to capture
>> PermissionDenied type of exceptions (or any exceptions for that matter) in
>> HDFS and GRANT related issues in Hive. At present we are not able to capture
>> these in Ranger. But HDFS audit logs and hiveserver logs have some relevant
>> information on this. As a single point of information on audit related stuff
>> we would like to have these in Ranger rather than looking around in those
>> logs. How can we do this with Ranger?
> 
> Bosco: This is our ultimate goal. With Hive we might be auditing all user
> level activities. With HDFS, we are auditing all file access related actions.
> Would you be able to list out the actions you want to audit? This will help us
> to scope the work. Please create a JIRA to track this.
> 
>  
>> 
>> 2.      Both HDFS and Hive plugins for Ranger actually capture multiple
>> audit entries for the same event and this is a bit of an overhead from an
>> auditing perspective. Is it possible to have a single and clear audit entry
>> in Ranger for a particular auditable event? Is there some configuration
>> available for this to work?
> 
> Bosco: In the release under development (Apache Ranger 0.5), the HDFS audit
> has been optimized to only one call per request. For Hive, we are just
> capturing one action per request. I am not sure whether you are referring to
> the "USE" action. Anyway, for Hive, it would be good if you can let us know
> which ones are duplicates. We can look into it.
> 
>  
>> 
>> 3.      If we have an HDFS read, write or delete operation we get multiple
>> entries in Ranger audit. But we are not able to figure out the exact nature
>> of the change that happened in HDFS by looking through the Ranger Audit trail
>> records. Similar is the case for Hive related operations. The resource name
>> that Ranger captures is sometimes vague and points to the /tmp folder and all
> 
> Bosco: Hopefully, eliminating the multiple entries will ease some of your
> pain. Regarding Hive access to HDFS, since Hive creates a lot of temporary
> intermediate files, there is a lot of noise. Your concerns are valid. I feel
> we should extend our UI search to be smarter and help the admin users to
> suppress (filter out) accesses to /tmp folders and similar transient
> resources. Can you help us document and track the requirement by creating a
> JIRA? FYI, we are moving our audits to Solr. This gives a lot more search and
> filter capabilities, and you can also use Banana (or other BI tools) to write
> your own custom Audit dashboard. This is something that might be interesting
> to you.
> 
>  
>> 
>> 4.      If there is a change in HDFS or Hive (grants, data delete/update), as
>> a requirement we need to store the old value and new value along with who
>> made the change, when the change was made and whether it was successful or
>> not. But this is not happening now. How can we achieve this with Ranger?
> 
> Bosco: Assuming you are referring to policy changes, all Hive related policy
> changes (Ranger UI, Ranger REST or Hive GRANT/REVOKE) are logged into Ranger.
> You can check them from Ranger -> Audit -> Admin tab. For HDFS, all policy
> changes done via Ranger UI and Ranger REST are logged in Ranger.
> 
>  
> 
>  
>> 
>>  
>>  
>> Thanks & Regards,
>> Sethukumar Ramachandran

