Hi Lune

> My question is the following : will I encounter the famous "small files 
> problem" with my namenodes because of the number of small audit files stored 
> in HDFS ?

Based on your environment, you will have 134 files per day (one per server), 
which is around 4,000 files per month. I don't think that should be an issue 
for the NameNode.
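
For reference, the arithmetic behind that estimate can be sketched as below. The server counts come from your mail; the figure of roughly 150 bytes of NameNode heap per namespace object is only a commonly quoted rule of thumb, not a measured value, and the assumption that each small file costs two objects (one file entry plus one block) is mine.

```python
# Server counts taken from the original mail; one audit file per server per day.
servers = {
    "namenodes": 2,
    "hbase_masters": 2,
    "region_servers": 100,
    "kafka_brokers": 30,
}

files_per_day = sum(servers.values())
files_per_month = files_per_day * 30
files_per_year = files_per_day * 365

# ASSUMPTION: ~150 bytes of NameNode heap per namespace object (rule of thumb),
# and two objects per small file (the file entry and its single block).
BYTES_PER_OBJECT = 150
OBJECTS_PER_FILE = 2
heap_bytes_per_year = files_per_year * OBJECTS_PER_FILE * BYTES_PER_OBJECT

print(f"{files_per_day} files/day, {files_per_month} files/month")
print(f"~{heap_bytes_per_year / 1024 ** 2:.1f} MiB of heap per year of audit files")
```

Under those assumptions, a full year of audit files costs on the order of tens of megabytes of heap, which is small compared with a typical NameNode heap.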

> Is there a way to configure how often the audit logs are written to HDFS? Or 
> a way to configure Ranger to store files covering multiple days?

Yes, there is a property to set the duration. You can use “file.rollover.sec” 
with the destination prefix.
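
As a rough illustration, the sketch below builds the full property name and estimates how the rollover interval changes the file count. The prefix `xasecure.audit.destination.hdfs` is my assumption for the HDFS audit destination, so please check it against your plugin's audit configuration. The estimate also assumes every server writes audit events continuously and opens a new file each time the interval elapses; it does not model intervals longer than one day.

```python
import math

SECONDS_PER_DAY = 86400

# ASSUMPTION: prefix used by the HDFS audit destination; verify in your config.
HDFS_DESTINATION_PREFIX = "xasecure.audit.destination.hdfs"


def rollover_property_key(prefix=HDFS_DESTINATION_PREFIX):
    """Combine the destination prefix with the rollover setting name."""
    return f"{prefix}.file.rollover.sec"


def files_per_day(server_count, rollover_sec):
    """Upper-bound estimate: one new file per server per rollover interval."""
    return server_count * math.ceil(SECONDS_PER_DAY / rollover_sec)


print(rollover_property_key())
for rollover_sec in (3600, 21600, 86400):
    print(rollover_sec, files_per_day(134, rollover_sec))
```

A shorter interval multiplies the number of files, so for the small-files concern a daily rollover is already the conservative choice.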

> Is there a way to purge them natively (I developed a shell script and 
> scheduled it on crontab but if there is a native mechanism) ?

There is no native feature available, but it would be good to have. Would you 
want to contribute the shell script for others to use? Eventually, we could 
have an Oozie job that does a few things, e.g. compress the files, coalesce 
multiple files, purge them, or even create Hive tables out of them.
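
Until something like that exists, a purge could look roughly like the sketch below. It is not your script and not a Ranger feature: it parses the output of `hdfs dfs -ls -R`, picks files whose modification time is older than a retention period, and optionally removes them with `hdfs dfs -rm`. The function names, the retention value, and the audit root path are hypothetical.

```python
import subprocess
from datetime import datetime, timedelta


def parse_ls_line(line):
    """Return (modified, path) for a file line of `hdfs dfs -ls -R`, else None."""
    # Expected fields: permissions, replication, owner, group, size, date, time, path
    parts = line.split(None, 7)
    if len(parts) < 8 or parts[0].startswith("d"):
        return None  # skips the "Found N items" header and directory entries
    modified = datetime.strptime(f"{parts[5]} {parts[6]}", "%Y-%m-%d %H:%M")
    return modified, parts[7]


def expired_paths(ls_output, now, retention_days):
    """List the file paths last modified before now - retention_days."""
    cutoff = now - timedelta(days=retention_days)
    expired = []
    for line in ls_output.splitlines():
        parsed = parse_ls_line(line)
        if parsed is not None and parsed[0] < cutoff:
            expired.append(parsed[1])
    return expired


def purge(audit_root, retention_days, dry_run=True):
    """List expired audit files under audit_root; delete them unless dry_run."""
    listing = subprocess.run(
        ["hdfs", "dfs", "-ls", "-R", audit_root],
        check=True, capture_output=True, text=True,
    ).stdout
    paths = expired_paths(listing, datetime.now(), retention_days)
    if paths and not dry_run:
        # -skipTrash deletes immediately; drop it to keep the files in trash first.
        subprocess.run(["hdfs", "dfs", "-rm", "-skipTrash", *paths], check=True)
    return paths
```

Calling `purge("/ranger/audit", 90)` (a hypothetical path and retention) only reports what would be deleted; pass `dry_run=False` once the list looks right. For very long lists the delete would need to be batched to stay within command-line length limits.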

 

Thanks

Bosco

From: Lune Silver <lunescar.ran...@gmail.com>
Reply-To: <user@ranger.incubator.apache.org>
Date: Tuesday, July 12, 2016 at 11:40 AM
To: <user@ranger.incubator.apache.org>
Subject: About the audit stored in HDFS

 

Hello everyone !

I am writing with a question about the storage of audit logs in HDFS.

I currently use Ranger with three plugins:
- HDFS
- Kafka
- HBase

I have two NameNodes, two HBase masters, 100 region servers and 30 Kafka 
brokers.

I noticed that I have one audit file per server per day.

My question is the following : will I encounter the famous "small files 
problem" with my namenodes because of the number of small audit files stored in 
HDFS ?

Is there a way to configure how often the audit logs are written to HDFS? Or 
a way to configure Ranger to store files covering multiple days?

Is there a way to purge them natively (I developed a shell script and scheduled 
it on crontab but if there is a native mechanism) ?

BR.

Lune
