Thanks. Yeah, we're actually capturing JSON POST data in the Apache logs (not 
GET data), so at this point there is no hostname in the payload, and we'd have 
to figure out a way to derive it from the virtual host. 
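
A minimal sketch of what that could look like, assuming the Apache LogFormat 
is changed so each line begins with %v (the canonical ServerName of the vhost 
serving the request). The agent, source, and sink names (a1, r1, k1) and the 
header name ApacheVirtualHostname are placeholders, not our real config, and 
the regex would need adjusting to the actual log layout:

```properties
# Attach a regex_extractor interceptor to the source that reads the Apache log.
a1.sources.r1.interceptors = i1
a1.sources.r1.interceptors.i1.type = regex_extractor
# Capture the first whitespace-delimited token of each line (assumed to be %v).
a1.sources.r1.interceptors.i1.regex = ^(\\S+)
a1.sources.r1.interceptors.i1.serializers = s1
# Store capture group 1 in an event header with this (hypothetical) name.
a1.sources.r1.interceptors.i1.serializers.s1.name = ApacheVirtualHostname

# Reference the header in the HDFS sink path using the %{header} escape.
a1.sinks.k1.type = hdfs
a1.sinks.k1.hdfs.path = /hdfs/logdata/%{ApacheVirtualHostname}
```

With that in place, events from vhost1.example.com should land under 
/hdfs/logdata/vhost1.example.com, and likewise for the other vhosts. 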

> On Jul 28, 2016, at 5:33 PM, Ahmed Vila <[email protected]> wrote:
> 
> I didn't quite get it - are you ingesting the Apache access log, or something else?
> 
> Either way, there is a regex_extractor interceptor that you can configure to 
> extract the hostname into the variable of your choice (e.g. 
> %ApacheVirtualHostname). Of course, your event payload has to contain the 
> vhost FQDN.
> https://flume.apache.org/FlumeUserGuide.html#regex-extractor-interceptor
> 
> Then you can use that variable in the HDFS sink like you described.
> 
> 
>> On Fri, Jul 29, 2016 at 12:57 AM, Guyle M. Taber <[email protected]> wrote:
>> I’m trying to determine whether I can use a substitution variable in the HDFS 
>> file path that is derived from the Apache virtual host name requested on a 
>> web server that answers to multiple vhost names. Where does the substitution 
>> variable %host get its value, and is there another variable I can use? Or 
>> can I use an interceptor to somehow extract the Apache virtual hostname 
>> that was called?
>> 
>> For instance, a single web server is hosting 3 virtual hosts.
>> 
>> vhost1.example.com
>> vhost2.example.com
>> vhost3.example.com
>> 
>> Can a single sink hdfs path be customized based on the vhost (not the 
>> server’s system hostname) called?
>> 
>> Something like   "/hdfs/logdata/%ApacheVirtualHostname"
> 
> 
> 
> -- 
> 
> Best regards,
> 
> Ahmed Vila | Senior software engineer
> 
> 
> Mobile | +387 62 139 348
> Web | www.symphony.is
> Skype | wylla_av
> 
> San Francisco | Sarajevo | Belgrade
> No one can whistle a symphony
