RE: Configuring flume with Ganglia

2013-10-22 Thread Himanshu Patidar
Hi, This is my flume-env.sh file - # Licensed to the Apache Software Foundation (ASF) under one # or more contributor license agreements. See the NOTICE file # distributed with this work for additional information # regarding copyright ownership. The ASF licenses this file # to you under the Apache

Re: Roll based on date

2013-10-22 Thread Martinus m
Hi David, The requirement is actually only to roll once per day. Hi Devin, Thanks for sharing your experience. I also tried to set the config as follows: agent.sinks.sink.hdfs.fileSuffix = FlumeData.%Y-%m-%d agent.sinks.sink.hdfs.fileType = DataStream agent.sinks.sink.hdfs.rollInterval = 0 agent.si

Re: Why does a Flume source need to recognize the format of the message?

2013-10-22 Thread Roshan Naik
The source splits the data into individual events and inserts them into the channel. In a few cases the sources do additional parsing of data. On Tue, Oct 22, 2013 at 2:33 PM, wrote: > So, what I am gathering from the discussion below is that the > Scribe source doesn’t do the parsing or

RE: Why does a Flume source need to recognize the format of the message?

2013-10-22 Thread dwight.marzolf
So, what I am gathering from the discussion below is that the Scribe source doesn't do the parsing or splitting of data. It just takes in the data flow as is and passes it on to the sink. The right sink splits the Scribe data up based on the category. That is a good clarification for me

Re: Why does a Flume source need to recognize the format of the message?

2013-10-22 Thread Roshan Naik
I forgot to note that the syslog source also does some parsing. On Tue, Oct 22, 2013 at 1:51 PM, Roshan Naik wrote: > At a minimum it needs to know how to split incoming data into individual > events. Typically a newline is used as the separator. > > Avro & thrift are special purpose sources/sinks

Re: Why does a Flume source need to recognize the format of the message?

2013-10-22 Thread Roshan Naik
At a minimum it needs to know how to split incoming data into individual events. Typically a newline is used as the separator. Avro & thrift are special purpose sources/sinks which handle headers and body. Avro, Thrift & HTTP sources will parse the incoming data into header + body. AFAICT most ot

help with config file for scribe source

2013-10-22 Thread dwight.marzolf
I am new to both Flume and Scribe. I am trying to configure a Flume server with a Scribe source and a rolling file sink (eventually it will be an S3 sink, which I have gotten to work already). For the Scribe source I am taking in data with multiple Scribe categories. What I need to be able to do

Re: Why does a Flume source need to recognize the format of the message?

2013-10-22 Thread Jarek Jarcec Cecho
Hi Praveen, I think that there is a confusion between message and payload. Whereas Flume does not need to understand the payload structure, it does need to understand the message to know what events (what payloads) are there with what headers. To put it differently, Flume do not need to unders

Why does a Flume source need to recognize the format of the message?

2013-10-22 Thread Praveen Sripati
According to the Flume documentation >>A Flume source consumes events delivered to it by an external source like a web server. The external source sends events to Flume in a format that is recognized by the target Flume source. For example, an Avro Flume source can be used to receive Avro even

Re: Roll based on date

2013-10-22 Thread DSuiter RDX
Martinus, you have to set all the other roll options to 0 explicitly in the configuration if you want it to roll on only one parameter; otherwise it will roll on whichever parameter is met first. If you want it to roll once a day, you will have to specifically disable all the other opti
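
For readers following the thread, the advice above can be sketched as a config fragment. This is only an illustration, not a configuration taken from the thread: the `agent.sinks.sink` prefix is borrowed from Martinus's message, the HDFS path is a made-up placeholder, and the 86400-second interval is an assumption that approximates "once a day" (it counts from when the file was opened, not from midnight). The date escapes in the path also assume that events carry a timestamp header.

```properties
# Sketch only: roll on time alone, with size- and count-based rolling disabled.
agent.sinks.sink.type = hdfs

# Hypothetical path; %Y-%m-%d buckets events into one directory per day.
agent.sinks.sink.hdfs.path = /flume/events/%Y-%m-%d
agent.sinks.sink.hdfs.fileType = DataStream

# Left unset, these fall back to their defaults (1024 bytes and 10 events)
# and would trigger rolls long before the day is over. 0 disables each one.
agent.sinks.sink.hdfs.rollSize = 0
agent.sinks.sink.hdfs.rollCount = 0

# Assumption: 24 hours in seconds, measured from when the file was opened.
agent.sinks.sink.hdfs.rollInterval = 86400
```

Setting all three roll options to 0 at once would disable rolling altogether, so at least one of them has to stay non-zero unless files are closed by some other means.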

Re: Roll based on date

2013-10-22 Thread David Sinclair
Do you need to roll based on size as well? Can you tell me the requirements? On Tue, Oct 22, 2013 at 2:15 AM, Martinus m wrote: > Hi David, > > Thanks for your answer. I already did that, but using %Y-%m-%d. But, since > it still rolls based on size, it will keep generating two or more