Sethukumar

Thanks for creating the JIRAs. Will get back to you after reviewing them.

Bosco


From:  Sethukumar Ramachandran <[email protected]>
Reply-To:  "[email protected]"
<[email protected]>
Date:  Tuesday, April 21, 2015 at 1:17 AM
To:  "[email protected]" <[email protected]>,
"[email protected]" <[email protected]>
Subject:  RE: Some Apache Ranger queries/thoughts

> I have added JIRA items for these (RANGER-413, 414, 415, 416). Please let me
> know whether the descriptions are detailed enough to be taken as requirement
> statements.
>  
> Thanks
> Sethukumar
>  
> 
> From: Sethukumar Ramachandran
> Sent: Thursday, April 16, 2015 9:07 AM
> To: [email protected]; [email protected]
> Subject: RE: Some Apache Ranger queries/thoughts
>  
> Sure, and happy to contribute (from taking up requirements to coding, anything
> for that matter). Please let me know. Give me a few days to start creating the
> JIRAs. Then we can refine the requirements and get started.
>  
>  
> Thanks
> Sethukumar
>  
> 
> From: Don Bosco Durai [mailto:[email protected]] On Behalf Of Don Bosco
> Durai
> Sent: Thursday, April 16, 2015 4:59 AM
> To: [email protected]; [email protected]
> Subject: Re: Some Apache Ranger queries/thoughts
>  
> 
> Hi Sethukumar
> 
>  
> 
> Your requests are reasonable. Let's start with creating the JIRA. Also, if you
> are planning to do some specific contribution, then let us know.
> 
>  
> 
> Thanks
> 
>  
> 
> Bosco
> 
>  
> 
>  
> 
> From: Sethukumar Ramachandran <[email protected]>
> Reply-To: "[email protected]"
> <[email protected]>
> Date: Tuesday, April 14, 2015 at 9:01 PM
> To: "[email protected]" <[email protected]>,
> "[email protected]" <[email protected]>
> Subject: RE: Some Apache Ranger queries/thoughts
> 
>  
>> 
>> Thanks Durai for the responses. I'm happy to contribute to Ranger in whatever
>> way I can. I shall create JIRAs with detailed descriptions/requirements for
>> these items: (1) eliminating multiple entries for a single event, (2) auditable
>> actions in HDFS and Hive (would be really nice if this is based on some
>> configurable patterns), and (3) Ranger capturing the exact nature of an event
>> (update, create, delete, permission modified, ACL created, etc.).
>>  
>> On the fourth item it is not exactly the policy changes (policy changes in
>> Ranger keep track of old value and new value for any kind of changes) but any
>> changes happening in HDFS and HIVE which can be defined in some fashion. For
>> example, in HDFS we need to audit file/folder creation, modification to the
>> same, deletion, user creation, user permission changes, ACL changes, HIVE
>> grants and revokes etc. just to list some of them (can go in detail in JIRA
>> with exact requirements). For these kinds of changes, it is required to keep
>> track of what changed, from what value to what value, by whom, and when. If
>> such a change attempt results in failure, that also needs to be audited.
>>  
>>  
>> Hope this outlines the requirements. I shall start creating JIRAs for these.
>> Please let me know in what way I can contribute to this.
>>  
>>  
>> Thanks
>> Sethukumar
>>  
>> 
>> From: Don Bosco Durai [mailto:[email protected]] On Behalf Of Don Bosco
>> Durai
>> Sent: Wednesday, April 15, 2015 6:44 AM
>> To: [email protected]; [email protected]
>> Subject: Re: Some Apache Ranger queries/thoughts
>>  
>> 
>> Hi Sethukumar
>> 
>>  
>> 
>> Thanks for your input. My responses are inline.
>> 
>>  
>> 
>> Regards
>> 
>>  
>> 
>> Bosco
>> 
>>  
>> 
>>  
>> 
>> From: Sethukumar Ramachandran <[email protected]>
>> Reply-To: "[email protected]"
>> <[email protected]>
>> Date: Tuesday, April 14, 2015 at 2:48 AM
>> To: "[email protected]" <[email protected]>
>> Cc: "[email protected]" <[email protected]>
>> Subject: Some Apache Ranger queries/thoughts
>> 
>>  
>>> 
>>> Hello all,
>>>  
>>> We are using HDP 2.2 and have set up Apache Ranger along with it on Ubuntu
>>> 12.04. We are not able to fulfill our audit-related requirements through
>>> Ranger. At present we have the following items which we were not able to get
>>> through Ranger. Please let us know whether we are missing something or ways
>>> to improve.
>>>  
>>>  
>>> 1.      As part of our audit requirements we are required to capture
>>> PermissionDenied-type exceptions (or any exceptions, for that matter) in
>>> HDFS and GRANT-related issues in Hive. At present we are not able to capture
>>> these in Ranger, but the HDFS audit logs and hiveserver logs have some
>>> relevant information on this. As a single point of information on
>>> audit-related matters, we would like to have these in Ranger rather than
>>> looking around in those logs. How can we do this with Ranger?
>> 
>> Bosco: This is our ultimate goal. With Hive we might be auditing all user
>> level activities. With HDFS, we are auditing all file access related actions.
>> Would you be able to list out the actions you want to audit? This will help
>> us to scope the work. Please create a JIRA to track this.
>> 
>>  
>>> 
>>> 2.      Both the HDFS and Hive plugins for Ranger capture multiple
>>> audit entries for the same event, and this is a bit of an overhead from an
>>> auditing perspective. Is it possible to have a single and clear audit entry
>>> in Ranger for a particular auditable event? Is there some configuration
>>> available for this to work?
>> 
>> Bosco: In the release under development (Apache Ranger 0.5), the HDFS audit
>> has been optimized to only one call per request. For Hive, we are just
>> capturing one action per request. I am not sure whether you are referring to
>> the "USE" action. Anyway, for Hive, it would be good if you can let us know which
>> ones are duplicate. We can look into it.
>> 
>>  
>>> 
>>> 3.      If we have an HDFS read, write or delete operation, we get multiple
>>> entries in the Ranger audit. But we are not able to figure out the exact
>>> nature of the change that happened in HDFS by looking through the Ranger
>>> audit trail records. The same is the case for Hive-related operations. The
>>> resource name that Ranger captures is sometimes vague and points to /tmp
>>> folder and all
>> 
>> Bosco: Hopefully, eliminating the multiple entries will ease some of your
>> pain. Regarding Hive access to HDFS, since Hive creates a lot of temporary
>> intermediate files, there is a lot of noise. Your concerns are valid. I feel
>> we should extend our UI search to be smarter and help the admin users to
>> suppress (filter out) accesses to /tmp folders and similar transient
>> resources. Can you help us document and track the requirement by creating
>> a JIRA? FYI, we are moving our audits to Solr. This gives a lot more search
>> and filter capabilities and you can also use Banana (or other BI tools) to
>> write your own custom Audit dashboard. Something that might be interesting to
>> you.
>> 
>>  
>>> 
>>> 4.      If there is a change in HDFS or Hive (grants, data delete/update),
>>> as a requirement we need to store the old value and new value along with who
>>> made the change, when the change was made and whether it was successful or
>>> not. But this is not happening now. How can we achieve this with Ranger?
>> 
>> Bosco: Assuming you are referring to policy changes, all Hive related policy
>> changes (Ranger UI, Ranger REST or Hive GRANT/REVOKE) are logged into Ranger.
>> You can check them from Ranger -> Audit -> Admin tab. For HDFS, all policy
>> changes done via Ranger UI and Ranger REST are logged in Ranger.
>> 
>>  
>> 
>>  
>>> 
>>>  
>>>  
>>> Thanks & Regards,
>>> Sethukumar Ramachandran

