You can set up a key/value pair in the event header to indicate where the data
is coming from, e.g. sourceInfo=ArubaRadio.
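One way to populate that header from the log line itself is the
regex_extractor interceptor. A minimal sketch (the agent/source names a1/r1
and the exact regex are assumptions based on your sample lines, which carry
the topic as the second whitespace-separated token):

  a1.sources.r1.interceptors = i1
  a1.sources.r1.interceptors.i1.type = regex_extractor
  # capture the token after "t=<epoch>" into the sourceInfo header
  a1.sources.r1.interceptors.i1.regex = ^t=\\d+\\s+(\\w+)
  a1.sources.r1.interceptors.i1.serializers = s1
  a1.sources.r1.interceptors.i1.serializers.s1.name = sourceInfo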
In the HDFS sink's path you can then reference the sourceInfo header, e.g.
/path/%{sourceInfo}/more. Take a look at the escape sequences in the HDFS
Sink documentation.
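On the sink side, something like this (k1 is a placeholder sink name, and the
filePrefix just mirrors the file name in your example; hdfs.useLocalTimeStamp
is one way to supply the timestamp that the %Y/%m/%d/%H escapes need, if you
don't already set a timestamp header upstream):

  a1.sinks.k1.type = hdfs
  # header escape %{sourceInfo} splits events by topic;
  # time escapes build the date/hour subdirectories
  a1.sinks.k1.hdfs.path = /prod/hadoop/%{sourceInfo}/%Y/%m/%d/%H
  a1.sinks.k1.hdfs.filePrefix = Airwave_amp_2
  a1.sinks.k1.hdfs.useLocalTimeStamp = true

With sourceInfo=ArubaRadio that path resolves to
/prod/hadoop/ArubaRadio/2015/08/21/20, matching the layout in your example.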
-roshan
From: Sutanu Das <[email protected]>
Reply-To: "[email protected]" <[email protected]>
Date: Friday, August 21, 2015 1:44 PM
To: "[email protected]" <[email protected]>
Subject: Creating HDFS sink directories based on LogFile Pattern - POSSIBLE
with Flume?
Hi Team,
We have been asked to create HDFS directories in the HDFS Sink based on a
logfile pattern/topic. Is this possible out of the box with Flume
interceptors / extractors / serializers?
Example: a single logfile contains the following lines:
t=1440187845 ArubaPresence op="add" sta_mac="" associated="False"
ap_name="a036000000kqVoW-02i6000000T5jrU"
t=1440187845 ArubaPresence op="add" sta_mac="" associated="False"
ap_name="a036000000kqVoW-02i6000000T5jrU"
t=1440189388 ArubaRadio op="update" mac="04:bd:88:80:38:d0"
ap_mac="04:bd:88:c0:03:8c" type="RADIO_PHY_TYPE_A_HT" mode="RADIO_MODE_AP"
t=1440189388 ArubaRadio op="update" mac="04:bd:88:80:38:c0"
ap_mac="04:bd:88:c0:03:8c" type="RADIO_PHY_TYPE_A_HT_40" mode="RADIO_MODE_AP"
So, is it possible to write each line from the single sample log above to a
separate HDFS sink directory based on the keyword/pattern-topic (e.g.
ArubaPresence and ArubaRadio), so that it looks like this during the Flume
HDFS sink write:
Creating /prod/hadoop/ArubaPresence/2015/08/21/20/Airwave_amp_2.1440189722272.tmp
Creating /prod/hadoop/ArubaRadio/2015/08/21/20/Airwave_amp_2.1440189722272.tmp