Sethukumar

Thanks for creating the JIRAs. Will get back to you after reviewing them.

Bosco

From: Sethukumar Ramachandran <[email protected]>
Reply-To: "[email protected]" <[email protected]>
Date: Tuesday, April 21, 2015 at 1:17 AM
To: "[email protected]" <[email protected]>, "[email protected]" <[email protected]>
Subject: RE: Some Apache Ranger queries/thoughts

> I have added JIRA items for these (RANGER-413, 414, 415, 416). Let me know
> whether the description is detailed enough to be taken as a requirement statement.
>
> Thanks
> Sethukumar
>
>
> From: Sethukumar Ramachandran
> Sent: Thursday, April 16, 2015 9:07 AM
> To: [email protected]; [email protected]
> Subject: RE: Some Apache Ranger queries/thoughts
>
> Sure, and happy to contribute (taking up requirements to coding, anything for
> that matter). Please let me know. Give me a few days to start creating the JIRAs.
> Then we can refine the requirements and get started.
>
>
> Thanks
> Sethukumar
>
>
> From: Don Bosco Durai [mailto:[email protected]] On Behalf Of Don Bosco Durai
> Sent: Thursday, April 16, 2015 4:59 AM
> To: [email protected]; [email protected]
> Subject: Re: Some Apache Ranger queries/thoughts
>
> Hi Sethukumar
>
> Your requests are reasonable. Let's start with creating the JIRA. Also, if you
> are planning to do some specific contribution, then let us know.
>
> Thanks
>
> Bosco
>
>
> From: Sethukumar Ramachandran <[email protected]>
> Reply-To: "[email protected]" <[email protected]>
> Date: Tuesday, April 14, 2015 at 9:01 PM
> To: "[email protected]" <[email protected]>, "[email protected]" <[email protected]>
> Subject: RE: Some Apache Ranger queries/thoughts
>
>>
>> Thanks Durai for the responses. I'm happy to contribute to Ranger in whatever
>> way I can.
>> I shall create JIRAs with detailed descriptions/requirements for
>> these items: (1) eliminating multiple entries for a single event, (2) auditable
>> actions in HDFS and Hive (would be really nice if this is based on some
>> configurable patterns), (3) Ranger to capture the exact nature of the event
>> (update, create, delete, permission modified, ACL created, etc.).
>>
>> On the fourth item, it is not exactly the policy changes (policy changes in
>> Ranger keep track of the old value and new value for any kind of change) but any
>> changes happening in HDFS and Hive which can be defined in some fashion. For
>> example, in HDFS we need to audit file/folder creation, modification to the
>> same, deletion, user creation, user permission changes, ACL changes, Hive
>> grants and revokes, etc., just to list some of them (can go into detail in the JIRA
>> with exact requirements). For these kinds of changes it is required to keep
>> track of what changed from what value to what value, by whom, and when. If
>> such a change attempt resulted in failure, that also needs to be audited.
>>
>>
>> Hope this outlines the requirements. I shall start creating JIRAs for these;
>> let me know in whatever way I can contribute to this.
>>
>>
>> Thanks
>> Sethukumar
>>
>>
>> From: Don Bosco Durai [mailto:[email protected]] On Behalf Of Don Bosco Durai
>> Sent: Wednesday, April 15, 2015 6:44 AM
>> To: [email protected]; [email protected]
>> Subject: Re: Some Apache Ranger queries/thoughts
>>
>> Hi Sethukumar
>>
>> Thanks for your input. My responses are inline.
>>
>> Regards
>>
>> Bosco
>>
>>
>> From: Sethukumar Ramachandran <[email protected]>
>> Reply-To: "[email protected]" <[email protected]>
>> Date: Tuesday, April 14, 2015 at 2:48 AM
>> To: "[email protected]" <[email protected]>
>> Cc: "[email protected]" <[email protected]>
>> Subject: Some Apache Ranger queries/thoughts
>>
>>>
>>> Hello all,
>>>
>>> We are using HDP 2.2 and have set up Apache Ranger along with it on Ubuntu 12.04.
>>> We are not able to fulfill our audit-related requirements through Ranger. At
>>> present we have the following items which we were not able to achieve through
>>> Ranger. Please let us know whether we are missing something or whether there
>>> are ways to improve.
>>>
>>>
>>> 1. As part of our audit requirements we are required to capture
>>> PermissionDenied type of exceptions (or any exceptions for that matter) in
>>> HDFS and GRANT related issues in Hive. At present we are not able to capture
>>> these in Ranger. But HDFS audit logs and hiveserver logs have some relevant
>>> information on this. As a single point of information on audit related stuff
>>> we would like to have these in Ranger rather than looking around in those logs.
>>> How can we do this with Ranger?
>>
>> Bosco: This is our ultimate goal. With Hive we might be auditing all user
>> level activities. With HDFS, we are auditing all file access related actions.
>> Would you be able to list out the actions you want to audit? This will help
>> us to scope the work. Please create a JIRA to track this.
>>
>>
>>>
>>> 2. Both the HDFS and Hive plugins for Ranger actually capture multiple
>>> audit entries for the same event and this is a bit of an overhead from an
>>> auditing perspective. Is it possible to have a single and clear audit entry in
>>> Ranger for a particular auditable event? Is there some configuration available
>>> for this to work?
>>
>> Bosco: In the release under development (Apache Ranger 0.5), the HDFS audit
>> has been optimized to only one call per request.
>> For Hive, we are just
>> capturing one action per request. I am not sure whether you are referring to the
>> "USE" action. Anyway, for Hive, it would be good if you can let us know which
>> ones are duplicates. We can look into it.
>>
>>
>>>
>>> 3. If we have an HDFS read, write or delete operation we get multiple
>>> entries in the Ranger audit. But we are not able to figure out the exact
>>> nature of the change that happened in HDFS by looking through the Ranger audit
>>> trail records. Similar is the case for Hive related operations. The resource name
>>> that Ranger captures is sometimes vague and points to the /tmp folder and all
>>
>> Bosco: Hopefully, eliminating the multiple entries will ease some of your
>> pain. Regarding Hive access to HDFS, since Hive creates a lot of temporary
>> intermediate files, there is a lot of noise. Your concerns are valid. I feel
>> we should extend our UI search to be smarter and help the admin users to
>> suppress (filter out) accesses to /tmp folders and similar transient
>> resources. Can you help us document and track the requirement by creating
>> a JIRA? FYI, we are moving our audits to Solr. This gives a lot more search
>> and filter capabilities, and you can also use Banana (or other BI tools) to
>> write your own custom audit dashboard. Something that might be interesting to
>> you.
>>
>>
>>>
>>> 4. If there is a change in HDFS or Hive (grants, data delete/update),
>>> as a requirement we need to store the old value and new value along with who
>>> made the change, when the change was made and whether it was successful or
>>> not. But this is not happening now. How can we achieve this with Ranger?
>>
>> Bosco: Assuming you are referring to policy changes, all Hive related policy
>> changes (Ranger UI, Ranger REST or Hive GRANT/REVOKE) are logged into Ranger.
>> You can check them from Ranger -> Audit -> Admin tab. For HDFS, all policy
>> changes done via Ranger UI and Ranger REST are logged in Ranger.
>>
>>
>>>
>>> Thanks & Regards,
>>> Sethukumar Ramachandran
