I didn't quite get it: are you ingesting Apache access logs, or something else? Either way, there is a regex_extractor interceptor that you can configure to extract the hostname into an event header (variable) of your choice, e.g. ApacheVirtualHostname, which a sink path then references as %{ApacheVirtualHostname}. Of course, your event payload has to contain the vhost FQDN. https://flume.apache.org/FlumeUserGuide.html#regex-extractor-interceptor
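
For illustration, here is a rough, untested sketch of what such an agent configuration might look like. The agent, source, and sink names (a1, r1, k1) are placeholders. I'm also assuming that each log line starts with the vhost name, as it would with an Apache LogFormat beginning with %v; adjust the regex to your actual log format.

```properties
# Hypothetical component names: agent a1, source r1, sink k1.
# Assumption: every event body begins with the vhost FQDN (LogFormat "%v ...").

# Attach a regex_extractor interceptor to the source.
a1.sources.r1.interceptors = i1
a1.sources.r1.interceptors.i1.type = regex_extractor
# Capture everything up to the first whitespace or colon (backslashes are
# doubled because this is a Java properties file).
a1.sources.r1.interceptors.i1.regex = ^([^\\s:]+)
# Store capture group 1 in an event header named ApacheVirtualHostname.
a1.sources.r1.interceptors.i1.serializers = s1
a1.sources.r1.interceptors.i1.serializers.s1.name = ApacheVirtualHostname

# Reference the header in the HDFS sink path with the %{header} escape.
a1.sinks.k1.type = hdfs
a1.sinks.k1.hdfs.path = /hdfs/logdata/%{ApacheVirtualHostname}
```

With a line starting with "vhost1.example.com", the event should end up under /hdfs/logdata/vhost1.example.com. Events that do not match the regex get no header, so you may want to handle that case separately.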
Then, you can use that variable in the HdfsSink like you described.

On Fri, Jul 29, 2016 at 12:57 AM, Guyle M. Taber <[email protected]> wrote:

> I’m trying to determine if I can use a substitution variable in the hdfs
> file path that is derived from the apache virtual host name that is called
> on a web server listening as multiple vhost names. Where is the
> substitution variable %host deriving that value and is there another var I
> can use? Or can I use an interceptor to somehow extract the apache virtual
> hostname called?
>
> For instance, a single web server is hosting 3 virtual hosts.
>
> vhost1.example.com
> vhost2.example.com
> vhost3.example.com
>
> Can a single sink hdfs path be customized based on the vhost (not the
> server’s system hostname) called?
>
> Something like "/hdfs/logdata/%ApacheVirtualHostname"

--
Best regards,

Ahmed Vila | Senior software engineer
Mobile | +387 62 139 348
Web | www.symphony.is
Skype | wylla_av
San Francisco | Sarajevo | Belgrade

No one can whistle a symphony
