It looks like you haven't configured any of the file-rolling properties for the HDFS sink. The default hdfs.rollCount is 10 events, which is why each of your files contains exactly 10 lines.

http://flume.apache.org/FlumeUserGuide.html#hdfs-sink

The Flume HDFS sink can be configured to roll based on size, number of events, or time:

hdfs.rollInterval   30     Number of seconds to wait before rolling current file (0 = never roll based on time interval)
hdfs.rollSize       1024   File size to trigger roll, in bytes (0: never roll based on file size)
hdfs.rollCount      10     Number of events written to file before it rolled (0 = never roll based on number of events)
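For example, to roll a new file roughly every 128 MB instead of every 10 events, you could add something like the following to your sink config (an untested sketch; the property names are from the user guide above, the values are just illustrative):

# Roll based on file size only (~128 MB here; pick whatever suits your load)
agent.sinks.hdfsSink.hdfs.rollSize = 134217728
# 0 disables event-count-based rolling (the default of 10 is what you're hitting now)
agent.sinks.hdfsSink.hdfs.rollCount = 0
# 0 disables time-based rolling
agent.sinks.hdfsSink.hdfs.rollInterval = 0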
On Thu, Feb 27, 2014 at 7:49 AM, orahad bigdata <[email protected]> wrote:
> Hi All,
>
> I'm new to Flume. I have a small Hadoop setup and a Flume agent on it; I'm
> using "tail -f logfilename" as a source.
>
> When I started the agent it ingested data into HDFS, but each file
> contains only 10 lines. Can we configure the number of lines per file on
> HDFS?
>
> Below is my agent conf file.
>
> agent.sources = pstream
> agent.channels = memoryChannel
> agent.channels.memoryChannel.type = memory
> agent.channels.memoryChannel.capacity = 100000
> agent.channels.memoryChannel.transactionCapacity = 10000
> agent.sources.pstream.channels = memoryChannel
> agent.sources.pstream.type = exec
> agent.sources.pstream.command = tail -f /root/dummylog
> agent.sources.pstream.batchSize=1000
> agent.sinks = hdfsSink
> agent.sinks.hdfsSink.type = hdfs
> agent.sinks.hdfsSink.channel = memoryChannel
> agent.sinks.hdfsSink.hdfs.path = hdfs://xxxxx:xxx/somepath
> agent.sinks.hdfsSink.hdfs.fileType = DataStream
> agent.sinks.hdfsSink.hdfs.writeFormat = Text
>
> Thanks
