Hi, I'll try to be as clear as I can:
I'm using
* 2 sources = thrift, spooldir
* 1 sink = kafka (no topic names configured in the agent.conf)

How are events sent to the proper topic?
Thrift writes a topic name (x) into the event's headers, then events are sent directly to kafka/x.

When a Thrift event for some reason is not sent, it is collected in a directory, /backup. Periodically, we move those events from /backup to the /spooldir.

Instead of sending events to kafka/x, the spooldir source tries to send events to a "default-flume-topic".

Question: why are the event's headers not considered by the spooldir source, and what is a good way to fix this?

Thanks in advance

Simone Roselli
ITE Sysadmin
[email protected]
http://www.plista.com

----- Original Message -----
From: "Keane, Mike" <[email protected]>
To: "user" <[email protected]>
Sent: Thursday, January 7, 2016 3:03:54 PM
Subject: RE: Spooldir needs a Kafka topic defined in the agent.conf

I assume you want to set the topic header based on the contents of the line of data read in from the Spooling Directory Source? If so, I think you want to configure a Regex Extractor Interceptor, or implement your own interceptor to do this.

http://flume.apache.org/FlumeUserGuide.html#regex-extractor-interceptor

________________________________________
From: Simone Roselli [[email protected]]
Sent: Thursday, January 07, 2016 6:52 AM
To: user
Subject: Re: Spooldir needs a Kafka topic defined in the agent.conf

Hi,

in your configuration you define the topic name in the agent.conf (spoolingAgent.sinks.kafka-sink-1.topic = data_in); This is what I do not want.
I would like the spooldir source to retrieve the topic name from the event headers.

Simone Roselli
ITE Sysadmin
[email protected]
http://www.plista.com

----- Original Message -----
From: "Keane, Mike" <[email protected]>
To: "user" <[email protected]>
Sent: Wednesday, January 6, 2016 6:30:41 PM
Subject: RE: Spooldir needs a Kafka topic defined in the agent.conf

I attempted to put together a little Flume+Kafka tutorial, including using Camus to run map-reduce jobs pulling from Kafka and writing to HDFS. My example uses a spoolDirSource, KafkaChannel & KafkaSink. This may be of some help to you.

https://github.com/mbkeane/BigDataTechCon/blob/master/README.md

________________________________________
From: Simone Roselli [[email protected]]
Sent: Wednesday, January 06, 2016 10:33 AM
To: [email protected]
Subject: Spooldir needs a Kafka topic defined in the agent.conf

Hi,

I'm having trouble configuring a spooldir source with the Kafka sink.

In Flume-NG I can use the Kafka sink without specifying a topic name in the agent.conf, since the event contains this topic name in the headers.

Things look different using the spooldir source. If you don't provide a topic name in agent.conf, it will only try a default one (default-flume-topic).

Is there a way to force the spooldir source to use the topic name in the headers?

ps: I'm using Spooldir with the AVRO deserialization; no other particular configuration. "fileHeader" is set to "true".

Many thanks

Simone Roselli
ITE Sysadmin
[email protected]
http://www.plista.com

This email and any files included with it may contain privileged, proprietary and/or confidential information that is for the sole use of the intended recipient(s). Any disclosure, copying, distribution, posting, or use of the information contained in or attached to this email is prohibited unless permitted by the sender.
If you have received this email in error, please immediately notify the sender via return email, telephone, or fax and destroy this original transmission and its included files without reading or saving it in any manner. Thank you.
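
For reference, here is a minimal sketch of the Regex Extractor Interceptor approach suggested in the thread. It is not a tested configuration. The agent and component names (a1, spool-src, mem-ch, kafka-sink), the broker address, and the body layout are all assumptions for illustration: it assumes each event body is text that begins with "topic=<name>", which is not what the AVRO deserializer mentioned above produces, so the regex would need adapting to the real data. The sink property names follow the Flume 1.6 Kafka sink; later releases rename some of them.

```properties
# Hypothetical names: a1, spool-src, mem-ch, kafka-sink.
a1.sources = spool-src
a1.channels = mem-ch
a1.sinks = kafka-sink

# Spooling Directory Source reading the re-queued events.
a1.sources.spool-src.type = spooldir
a1.sources.spool-src.spoolDir = /spooldir
a1.sources.spool-src.channels = mem-ch

# Regex Extractor Interceptor: copy the first capture group into the
# "topic" header, which the Kafka sink consults when choosing a topic.
# Assumption: the event body starts with "topic=<name>".
a1.sources.spool-src.interceptors = i1
a1.sources.spool-src.interceptors.i1.type = regex_extractor
a1.sources.spool-src.interceptors.i1.regex = ^topic=([A-Za-z0-9._-]+)
a1.sources.spool-src.interceptors.i1.serializers = s1
a1.sources.spool-src.interceptors.i1.serializers.s1.name = topic

a1.channels.mem-ch.type = memory

# Kafka sink with no "topic" property: events carrying a "topic" header
# go to that topic; all others fall back to default-flume-topic.
a1.sinks.kafka-sink.type = org.apache.flume.sink.kafka.KafkaSink
a1.sinks.kafka-sink.brokerList = localhost:9092
a1.sinks.kafka-sink.channel = mem-ch
```

Events whose body does not match the regex get no "topic" header and would still end up in default-flume-topic, so it is worth checking that topic for strays after enabling the interceptor.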
