Hello All, I'm trying to load the app servers' request logs into Hadoop HDFS.
I get all the consolidated logs in one file per day. I'm running the Flume agent with the following config:

    ##
    agent.sources = apache
    agent.sources.apache.type = exec
    agent.sources.apache.command = cat /appserverlogs/requestfile/request.log.2013_06_07
    agent.sources.apache.batchSize = 1
    agent.sources.apache.channels = memoryChannel
    agent.sources.apache.interceptors = itime ihost itype

    # http://flume.apache.org/FlumeUserGuide.html#timestamp-interceptor
    agent.sources.apache.interceptors.itime.type = timestamp

    # http://flume.apache.org/FlumeUserGuide.html#host-interceptor
    agent.sources.apache.interceptors.ihost.type = host
    agent.sources.apache.interceptors.ihost.useIP = false
    agent.sources.apache.interceptors.ihost.hostHeader = host

    # http://flume.apache.org/FlumeUserGuide.html#static-interceptor
    agent.sources.apache.interceptors.itype.type = static
    agent.sources.apache.interceptors.itype.key = log_type
    agent.sources.apache.interceptors.itype.value = request_logs

    # http://flume.apache.org/FlumeUserGuide.html#memory-channel
    agent.channels = memoryChannel
    agent.channels.memoryChannel.type = memory
    agent.channels.memoryChannel.capacity = 1000
    agent.channels.memoryChannel.transactionCapacity = 100
    agent.channels.memoryChannel.keep-alive = 3
    agent.channels.memoryChannel.byteCapacityBufferPercentage = 20

    ## Send to Flume Collector on 1.2.3.4 (Hadoop Slave Node)
    # http://flume.apache.org/FlumeUserGuide.html#avro-sink
    agent.sinks = AvroSink
    agent.sinks.AvroSink.type = avro
    agent.sinks.AvroSink.channel = memoryChannel
    agent.sinks.AvroSink.hostname = h1.vgs.mypoints.com
    agent.sinks.AvroSink.port = 4545

As you can see, I'm using the cat command with one specific file. As I said, I get one file a day with the date in its name.

Q: How can I set up the config so the file name in the cat command rotates to each day's new file? Currently, once the file is loaded, I have to stop the agent, change the config, and start the agent again.
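One idea I had, as a rough sketch only (untested; it assumes Flume 1.3+, which has the spooling-directory source, and the spool directory path below is hypothetical), is to stop cat'ing a hard-coded file name and instead drop each finished daily file into a watched directory:

    # Untested sketch: spooling-directory source (Flume 1.3+) instead of exec/cat.
    # Each completed daily file is moved into the spool directory; Flume ingests
    # it once and then renames it with a .COMPLETED suffix.
    agent.sources = apache
    agent.sources.apache.type = spooldir
    agent.sources.apache.spoolDir = /appserverlogs/requestfile/spool  # hypothetical drop directory
    agent.sources.apache.fileHeader = true
    agent.sources.apache.channels = memoryChannel

I don't know whether that is the recommended approach, though.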
On the Hadoop slave I have the collector running with the following config:

    collector.sources = AvroIn
    collector.sources.AvroIn.type = avro
    collector.sources.AvroIn.bind = 0.0.0.0
    collector.sources.AvroIn.port = 4545
    collector.sources.AvroIn.channels = mc1 mc2

    ## Channels ########################################################
    ## Source writes to 2 channels, one for each sink (Fan Out)
    collector.channels = mc1 mc2
    collector.channels.mc1.type = memory
    collector.channels.mc1.capacity = 1000
    collector.channels.mc1.transactionCapacity = 100
    collector.channels.mc1.keep-alive = 3
    collector.channels.mc1.byteCapacityBufferPercentage = 20

    collector.channels.mc2.type = memory
    collector.channels.mc2.capacity = 1000
    collector.channels.mc2.transactionCapacity = 100
    collector.channels.mc2.keep-alive = 3
    collector.channels.mc2.byteCapacityBufferPercentage = 20

    ## Sinks ###########################################################
    collector.sinks = LocalOut HadoopOut

    ## Write copy to Local Filesystem (Debugging)
    # http://flume.apache.org/FlumeUserGuide.html#file-roll-sink
    collector.sinks.LocalOut.type = file_roll
    collector.sinks.LocalOut.sink.directory = /var/log/flume
    collector.sinks.LocalOut.sink.rollInterval = 0
    collector.sinks.LocalOut.channel = mc1

    ## Write to HDFS
    # http://flume.apache.org/FlumeUserGuide.html#hdfs-sink
    collector.sinks.HadoopOut.type = hdfs
    collector.sinks.HadoopOut.channel = mc2
    collector.sinks.HadoopOut.hdfs.path = /user/flume/events/%{log_type}/%{host}/%y-%m-%d
    collector.sinks.HadoopOut.hdfs.fileType = DataStream
    collector.sinks.HadoopOut.hdfs.writeFormat = Text
    collector.sinks.HadoopOut.hdfs.rollSize = 0
    collector.sinks.HadoopOut.hdfs.rollCount = 0
    collector.sinks.HadoopOut.hdfs.rollInterval = 0

Q: The collector loads the file into HDFS with a .tmp extension, and it doesn't rotate the file to its final name until I kill the collector. I've played with

    collector.sinks.HadoopOut.hdfs.rollSize = 0
    collector.sinks.HadoopOut.hdfs.rollCount = 0
    collector.sinks.HadoopOut.hdfs.rollInterval = 0

but then it creates many files. I'm looking to create one file per day of request logs.

I really appreciate any help on this issue.

-Sanjeev

--
Sanjeev Sagar
"Separate yourself from everything that separates you from others!" - Nirankari Baba Hardev Singh ji
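P.S. Here is what I was thinking of trying next for the rolling question. This is untested and assumes my Flume version supports hdfs.idleTimeout; the idea is to roll by time once per day and let the idle timeout close (and rename) the .tmp file once events stop arriving:

    # Untested sketch: roll once per day by time; disable size/count rolling.
    collector.sinks.HadoopOut.hdfs.rollInterval = 86400   # seconds: one file per day
    collector.sinks.HadoopOut.hdfs.rollSize = 0           # no size-based rolling
    collector.sinks.HadoopOut.hdfs.rollCount = 0          # no count-based rolling
    # Close idle files (removing the .tmp suffix) after 10 minutes of no events:
    collector.sinks.HadoopOut.hdfs.idleTimeout = 600

But I'm not sure this is right.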