[jira] [Commented] (RANGER-1837) HDFS Audit Compression

2017-10-28 Thread Don Bosco Durai (JIRA)

[ 
https://issues.apache.org/jira/browse/RANGER-1837?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16223820#comment-16223820
 ] 

Don Bosco Durai commented on RANGER-1837:
-----------------------------------------

I like the approach. A couple of questions:
1. Should HDFSAuditDestinationORC extend HDFSAuditDestination? Or should 
we create a common base class for batch file output?
2. If we create a common base class, the file writers 
(HDFS/Text, HDFS/ORC, HDFS/Parquet, HDFS/Avro, etc.) can extend it. All the 
configuration and life cycle handling (batch time, rotation, recovery, etc.) 
would then live in the base class, which calls the derived class only to do 
the writing. That method can take the input file and the base folder for the 
output file.
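
To make the second question concrete, here is a minimal sketch of what such a 
base class could look like. Every name in it (BatchFileAuditWriter, 
TextAuditWriter, GzipAuditWriter, flush, write) is hypothetical and not taken 
from the Ranger code base. It uses the local file system through java.nio 
instead of HDFS and a plain gzip writer instead of ORC, only to keep the 
example self-contained.

```java
import java.io.IOException;
import java.io.OutputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;
import java.util.zip.GZIPOutputStream;

// Hypothetical base class: owns the shared life cycle and delegates
// only the format-specific write to the derived class.
abstract class BatchFileAuditWriter {
    private final Path baseFolder;

    BatchFileAuditWriter(Path baseFolder) {
        this.baseFolder = baseFolder;
    }

    // Template method: the shared steps live here. A real implementation
    // would also handle batch time, rotation and recovery at this level.
    final Path flush(Path inputFile) throws IOException {
        Files.createDirectories(baseFolder);
        Path output = write(inputFile, baseFolder);
        Files.deleteIfExists(inputFile); // the batch is consumed once written
        return output;
    }

    // Derived classes convert the spooled batch file into their own format.
    protected abstract Path write(Path inputFile, Path baseFolder) throws IOException;
}

// Hypothetical text writer: copies the JSON lines unchanged.
class TextAuditWriter extends BatchFileAuditWriter {
    TextAuditWriter(Path baseFolder) {
        super(baseFolder);
    }

    @Override
    protected Path write(Path inputFile, Path baseFolder) throws IOException {
        Path out = baseFolder.resolve(inputFile.getFileName().toString() + ".log");
        Files.copy(inputFile, out, StandardCopyOption.REPLACE_EXISTING);
        return out;
    }
}

// Hypothetical compressed writer: stands in for ORC/Parquet/Avro here.
class GzipAuditWriter extends BatchFileAuditWriter {
    GzipAuditWriter(Path baseFolder) {
        super(baseFolder);
    }

    @Override
    protected Path write(Path inputFile, Path baseFolder) throws IOException {
        Path out = baseFolder.resolve(inputFile.getFileName().toString() + ".gz");
        try (OutputStream gz = new GZIPOutputStream(Files.newOutputStream(out))) {
            Files.copy(inputFile, gz);
        }
        return out;
    }
}
```

With this shape, adding a new format means adding one small subclass; for 
example, new GzipAuditWriter(folder).flush(batchFile) would go through the 
same life cycle steps as the text writer.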



> HDFS Audit Compression
> ----------------------
>
> Key: RANGER-1837
> URL: https://issues.apache.org/jira/browse/RANGER-1837
> Project: Ranger
>  Issue Type: Improvement
>  Components: audit
>Reporter: Kevin Risden
>
> My team has done some research and found that Ranger HDFS audits are:
> * Stored as JSON objects (one per line)
> * Not compressed
> This format is very verbose, and because the data is not frequently 
> accessed it would benefit from compression.
> From Bosco on the mailing list:
> {quote}You are right, currently one of the options is saving the audits in 
> HDFS itself as JSON files in one folder per day. I have loaded these JSON 
> files from the folder into Hive as compressed ORC format. The compressed 
> files in ORC were less than 10% of the original size, so it was a 
> significant decrease in size. Also, it is easier to run analytics on the 
> Hive tables.
>  
> So, there are a couple of ways of doing it:
>  
> * Write an Oozie job which runs every night and loads the previous day's 
> worth of audit logs into ORC or another format
> * Write an AuditDestination which can write into the format you want
>  
> Regardless of which approach you take, this would be a good feature for 
> Ranger.{quote}
> http://mail-archives.apache.org/mod_mbox/ranger-user/201710.mbox/%3CCAJU9nmiYzzUUX1uDEysLAcMti4iLmX7RE%3DmN2%3DdoLaaQf87njQ%40mail.gmail.com%3E



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Created] (RANGER-1861) There are errors when the "ranger-kms start" command was executed.

2017-10-28 Thread peng.jianhua (JIRA)
peng.jianhua created RANGER-1861:


 Summary: There are errors when the "ranger-kms start" command was 
executed.
 Key: RANGER-1861
 URL: https://issues.apache.org/jira/browse/RANGER-1861
 Project: Ranger
  Issue Type: Bug
  Components: kms
Affects Versions: 1.0.0, master
Reporter: peng.jianhua
Assignee: peng.jianhua


The following error is reported when the "ranger-kms start" command is executed:
{code}
2017-10-28 23:23:33,339 ERROR XMLUtils - Error loading : 
java.lang.IllegalArgumentException: InputStream cannot be null
    at javax.xml.parsers.DocumentBuilder.parse(DocumentBuilder.java:117)
    at org.apache.ranger.plugin.util.XMLUtils.loadConfig(XMLUtils.java:64)
    at org.apache.ranger.plugin.util.XMLUtils.loadConfig(XMLUtils.java:48)
    at org.apache.ranger.server.tomcat.EmbeddedServer.<init>(EmbeddedServer.java:77)
    at org.apache.ranger.server.tomcat.EmbeddedServer.main(EmbeddedServer.java:69)
{code}
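
The exception in the trace is what javax.xml.parsers.DocumentBuilder.parse 
throws when it is handed a null InputStream. One way that can happen, and 
this is only an assumption since the report does not say which file failed 
to load, is a configuration file that is not on the classpath, so that 
getResourceAsStream returns null. A minimal sketch that reproduces the same 
exception outside Ranger (the class and method names are hypothetical):

```java
import java.io.InputStream;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;

class NullConfigStreamDemo {
    // Looks up a classpath resource and hands the stream to the XML parser.
    // Returns the IllegalArgumentException message if the parser rejects
    // the stream, or null if parsing succeeded.
    static String parseResource(String resourceName) throws Exception {
        // getResourceAsStream returns null when the resource is not found.
        InputStream in = NullConfigStreamDemo.class.getClassLoader()
                .getResourceAsStream(resourceName);
        DocumentBuilder builder =
                DocumentBuilderFactory.newInstance().newDocumentBuilder();
        try {
            builder.parse(in);
            return null;
        } catch (IllegalArgumentException e) {
            return e.getMessage();
        } finally {
            if (in != null) {
                in.close();
            }
        }
    }
}
```

If that is indeed the cause here, checking the stream for null before parsing 
and logging the missing file name would give a more useful error message.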


