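Hi Yanzhi,

Tail doesn't delete data from the source file. If you need that behavior, you can write your own custom source, as suggested in some earlier posts.

Flume 1.x was redesigned significantly: events are handed from source to channel to sink in transactions, so there is no separate end-to-end mode to configure as there was in Flume 0.9.x. How strong the delivery guarantee is depends on the channel, though. A memory channel loses whatever events it is holding if the agent dies, and the exec source cannot replay data that the channel failed to accept.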
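If events need to survive an agent crash or restart, a file channel can replace the memory channel. A minimal sketch for the first agent, reusing the agent, source and sink names from your configuration; the channel name fileChannel-1 and both directory paths are placeholders I made up, so adjust them to your machine:

```properties
# Assumption: fileChannel-1 and the two directories below are hypothetical.
agent_foo.channels = fileChannel-1
agent_foo.channels.fileChannel-1.type = file
# Where the channel keeps its checkpoint.
agent_foo.channels.fileChannel-1.checkpointDir = /home/hadoop/flume/checkpoint
# Where the channel writes event data (comma-separated list of directories).
agent_foo.channels.fileChannel-1.dataDirs = /home/hadoop/flume/data

# Point the existing source and sink at the new channel.
agent_foo.sources.tailsource-1.channels = fileChannel-1
agent_foo.sinks.Sink-1.channel = fileChannel-1
```

The same change would apply to the second agent if its events should also be kept on disk until the HDFS sink has written them.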
The host interceptor can be configured as follows:

# Name the interceptor, as you do for sources and sinks.
agent.sources.tail1.interceptors = i1
# Interceptor type: the fully qualified class name of its builder.
agent.sources.tail1.interceptors.i1.type = org.apache.flume.interceptor.HostInterceptor$Builder
# Set to true to preserve an existing value of the host header.
agent.sources.tail1.interceptors.i1.preserveExisting = false
# Set to true to store the IP address instead of the hostname.
agent.sources.tail1.interceptors.i1.useIP = false
# Name of the header in the Flume event that receives the value.
agent.sources.tail1.interceptors.i1.hostHeader = host

For more information, please refer to the Flume User Guide, which is very well documented:
http://flume.apache.org/FlumeUserGuide.html

----------------------------------------
Thanks & Regards,
Ashutosh Sharma
Email: [email protected]
----------------------------------------

From: Yanzhi.liu [mailto:[email protected]]
Sent: Monday, August 06, 2012 6:07 PM
To: user
Subject: About flume Host Interceptor and end-to-end source

Hello people,

I am using an exec source, an avro source, an avro sink and an HDFS sink to configure two machines.
This is my configuration. The first machine:

agent_foo.sources = tailsource-1
agent_foo.channels = memoryChannel-1
agent_foo.sinks = Sink-1
agent_foo.sources.tailsource-1.type = exec
agent_foo.sources.tailsource-1.command = tail -F /home/hadoop/hello.txt
agent_foo.sources.tailsource-1.channels = memoryChannel-1
agent_foo.sources.tailsource-1.batchSize = 1
agent_foo.channels.memoryChannel-1.type = memory
agent_foo.channels.memoryChannel-1.capacity = 1000
agent_foo.sinks.Sink-1.type=avro
agent_foo.sinks.Sink-1.channel=memoryChannel-1
agent_foo.sinks.Sink-1.hostname=221.130.18.211
agent_foo.sinks.Sink-1.port=4545
agent_foo.sinks.Sink-1.batch-size=1

The other machine:

agent_fo.sources = tailsource-1
agent_fo.channels = memoryChannel-1
agent_fo.sinks = hdfsSink-1
agent_fo.sources.tailsource-1.type = avro
agent_fo.sources.tailsource-1.channels=memoryChannel-1
agent_fo.sources.tailsource-1.bind=221.130.18.211
agent_fo.sources.tailsource-1.port=4545
agent_fo.channels.memoryChannel-1.type = memory
agent_fo.channels.memoryChannel-1.capacity = 1000
agent_fo.sinks.hdfsSink-1.type = hdfs
agent_fo.sinks.hdfsSink-1.channel = memoryChannel-1
agent_fo.sinks.hdfsSink-1.hdfs.path = hdfs://CMN-NJ-2-579:9000/user/hadoop/flume
agent_fo.sinks.hdfsSink-1.hdfs.rollInterval=600
agent_fo.sinks.hdfsSink-1.hdfs.rollCount=0
agent_fo.sinks.hdfsSink-1.hdfs.rollSize = 1048576
agent_fo.sinks.hdfsSink-1.hdfs.fileType=CompressedStream
agent_fo.sinks.hdfsSink-1.hdfs.codeC=gzip
agent_fo.sinks.hdfsSink-1.hdfs.writeFormat=Text
agent_fo.sinks.hdfsSink-1.hdfs.batchSize=10
agent_fo.sinks.hdfsSink-1.serializer=avro_event

My first problem is that old data is not deleted from hello.txt. What should I do to solve this? I would also like to know how to configure the end-to-end mode from the Flume 1.2.0 User Guide.

My second problem is about the Host Interceptor. What configuration would give me the hostname and the IP address?

I hope you can help. Thank you very much!
My name: Yanzhi Liu
