Re: write beyond end of stream

2015-09-08 Thread Shady Xu
It seems the problem only exists when using the load-balance sink processor with groups of sinks that use LZO compression. Looking into the details. 2015-09-08 10:19 GMT+08:00 Shady Xu : > Using different prefixes does not fix the problem. Any other idea? > > 2015-09-07 10:19 GMT+08:00 Shady Xu : > >>

Kafka source with Avro throwing "Could not find schema for event" exception

2015-09-08 Thread Buntu Dev
I got Flume configured to read Avro events from Kafka source and I'm also attaching the schema like this:

~~~
f1.sources.kafka-source.interceptors.attach-f1-schema.type = static
f1.sources.kafka-source.interceptors.attach-f1-schema.key = flume.avro.schema.url
f1.sources.kafka-source.interceptors.a
~~~
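For reference, a complete version of this static-interceptor setup might look like the following; the interceptor list line and the schema URL value are filled in as placeholders, not taken from the thread:

~~~
f1.sources.kafka-source.interceptors = attach-f1-schema
f1.sources.kafka-source.interceptors.attach-f1-schema.type = static
f1.sources.kafka-source.interceptors.attach-f1-schema.key = flume.avro.schema.url
# hypothetical location -- point this at wherever the .avsc actually lives
f1.sources.kafka-source.interceptors.attach-f1-schema.value = hdfs://namenode/schemas/f1.avsc
~~~

The "Could not find schema for event" error generally indicates the Avro event serializer found neither a `flume.avro.schema.url` nor a `flume.avro.schema.literal` header on the event, so the interceptor either isn't listed in the source's `interceptors` property or isn't firing for those events.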

Re: Question Failure Behavior of HDFS Sink

2015-09-08 Thread Aljoscha Krettek
Thanks for your answer. Aljoscha On Tue, 8 Sep 2015 at 20:04 Johny Rufus wrote: > Your assumption is correct, as duplicates in a failure scenario will occur. > > Thanks, > Rufus > > On Tue, Sep 8, 2015 at 4:10 AM, Aljoscha Krettek > wrote: > >> Hi, >> as I understand it the HDFS sink uses the

Re: Question Failure Behavior of HDFS Sink

2015-09-08 Thread Johny Rufus
Your assumption is correct, as duplicates in a failure scenario will occur. Thanks, Rufus On Tue, Sep 8, 2015 at 4:10 AM, Aljoscha Krettek wrote: > Hi, > as I understand it the HDFS sink uses the transaction system to verify > that all the elements in a transaction are written. This is what I w
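The duplication Rufus confirms can be illustrated with a toy simulation (plain Java, not Flume code): a sink takes a batch inside a channel transaction, the write fails partway through, the transaction rolls back so the events stay in the channel, and the retry re-writes events that already reached the file.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

// Toy simulation (NOT Flume code) of at-least-once delivery semantics.
public class AtLeastOnceDemo {
    static List<String> run() {
        Deque<String> channel = new ArrayDeque<>(List.of("e1", "e2", "e3"));
        List<String> written = new ArrayList<>();   // what reached "HDFS"
        boolean failOnce = true;
        while (!channel.isEmpty()) {
            List<String> batch = new ArrayList<>(channel); // txn.begin + take
            try {
                for (String e : batch) {
                    written.add(e);                 // flushed to the file
                    if (failOnce && e.equals("e2")) {
                        failOnce = false;
                        throw new RuntimeException("write failed mid-batch");
                    }
                }
                channel.clear();                    // txn.commit: drain channel
            } catch (RuntimeException ex) {
                // txn.rollback: events stay in the channel, but e1 and e2
                // are already in the file; the retry writes them again
            }
        }
        return written;
    }

    public static void main(String[] args) {
        System.out.println(run());  // [e1, e2, e1, e2, e3]
    }
}
```

The first attempt flushes e1 and e2 before failing; after the rollback and retry, the file contains e1 and e2 twice — exactly the duplicate-on-failure behavior described in the thread.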

Re: HDFS hdfs.path custom config ? possible ?

2015-09-08 Thread Hari Shreedharan
No, you need the %M - else you'd still round down to the hour (since the timestamp is rounded down, but the events are still within the same hour) Thanks, Hari On Tue, Sep 8, 2015 at 10:19 AM, IT CTO wrote: > If the path stays with %H what would be the folder name with round 15 unit > minute? >
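Putting Hari's two replies together, a sink configuration that starts a new directory every 15 minutes could look like this (the agent and sink names are placeholders; note the %M in the path, since rounding alone does not change the directory name):

~~~
a1.sinks.k1.type = hdfs
a1.sinks.k1.hdfs.path = /hdfs_sink_path/%Y/%m/%d/%H%M
a1.sinks.k1.hdfs.round = true
a1.sinks.k1.hdfs.roundValue = 15
a1.sinks.k1.hdfs.roundUnit = minute
~~~

With this, events timestamped 10:00-10:14 land in .../1000, 10:15-10:29 in .../1015, and so on, because the event timestamp is rounded down to the nearest 15 minutes before the path escapes are expanded.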

Re: HDFS hdfs.path custom config ? possible ?

2015-09-08 Thread IT CTO
If the path stays with %H, what would the folder name be with round 15 unit minute? On Tue, 8 Sep 2015 at 19:52 Hari Shreedharan < hshreedha...@cloudera.com> wrote: > You can use roundValue as 15 and roundUnit as Minute - so a new dir would > get created every 15 mins. > > > Thanks, > Hari >

Re: HDFS hdfs.path custom config ? possible ?

2015-09-08 Thread Hari Shreedharan
You can use roundValue as 15 and roundUnit as Minute - so a new dir would get created every 15 mins. Thanks, Hari On Tue, Sep 8, 2015 at 9:42 AM, Sutanu Das wrote: > Hi Community, > > > > Is it possible to create /write HDFS dirs./folders every 15 mins? > > > > Right now, we use /hdfs_sink_pat

HDFS hdfs.path custom config ? possible ?

2015-09-08 Thread Sutanu Das
Hi Community, Is it possible to create/write HDFS dirs/folders every 15 mins? Right now, we use /hdfs_sink_path/%Y/%m/%d/%H --- i.e. rolling every hour. But is it possible to write every 15 mins? - not every minute... If this is not possible with out-of-the-box config, can the DEV team please advise whic

Re: Newbie question about Flume-ng - ElasticSearchSink

2015-09-08 Thread IT CTO
Thanks, FLUME-2787 and FLUME-2788 reported. I will also try getting into the WIKI and learn about your contribution process to see if I can help with it. Do you have "HOWTO" for documentation contribution? :-) Eran On Tue, Sep 8, 2015 at 6:44 PM Hari Shreedharan wrote: > Can you file separate ji

Re: Newbie question about Flume-ng - ElasticSearchSink

2015-09-08 Thread Hari Shreedharan
Can you file separate jiras for each one here: https://issues.apache.org/jira/browse/FLUME Thanks, Hari On Tue, Sep 8, 2015 at 8:41 AM, IT CTO wrote: > Hi, > I just started learning flume and using it to send events to our HDFS > cluster and log them in elasticsearch. > I am using the latest b

Newbie question about Flume-ng - ElasticSearchSink

2015-09-08 Thread IT CTO
Hi, I just started learning flume and using it to send events to our HDFS cluster and log them in elasticsearch. I am using the latest build Flume 1.6.0 Source code repository: https://git-wip-us.apache.org/repos/asf/flume.git Revision: 2561a23240a71ba20bf288c7c2cda88f443c2080 Compiled by hshreedh

Re: How to customize the key in a HDFS SequenceFile sink

2015-09-08 Thread Gonzalo Herreros
Looking at the code, I guess this sink is a bit different and the "serializer" property doesn't seem to be used. I see two options: Either configure hdfs.writeFormat with an implementation of SequenceFileSerializerType so it uses your own implementation of SequenceFileSerializer. Or extend HDFSEv
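A minimal sketch of the first option Gonzalo mentions, assuming hdfs.writeFormat accepts the fully-qualified class name of a custom serializer's Builder (the class name below is hypothetical):

~~~
a1.sinks.k1.hdfs.fileType = SequenceFile
# hypothetical class -- must implement Flume's SequenceFileSerializer,
# expose a Builder, and be on the agent's classpath
a1.sinks.k1.hdfs.writeFormat = com.example.flume.CustomKeySerializer$Builder
~~~

I have not verified that writeFormat resolves arbitrary class names in Flume 1.6.0 (its documented values are Text and Writable); if it does not, the thread's second option, extending HDFSEventSink, would be needed instead.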

Re: How to customize the key in a HDFS SequenceFile sink

2015-09-08 Thread Thomas . Beer
From: Gonzalo Herreros To: user@flume.apache.org Date: 08.09.2015 09:29 Subject: Re: How to customize the key in a HDFS SequenceFile sink Thanks for your prompt reply. May I ask you to give me some more details? I'm a little confused as I've read that the "hdfs.serializer" param

Question Failure Behavior of HDFS Sink

2015-09-08 Thread Aljoscha Krettek
Hi, as I understand it the HDFS sink uses the transaction system to verify that all the elements in a transaction are written. This is what I would call at-least-once semantics. My question is now what happens if the writing fails in the middle of writing the elements in the transaction. When the

Re: How to customize the key in a HDFS SequenceFile sink

2015-09-08 Thread Shady Xu
What Gonzalo described is how we always do it too. 2015-09-08 15:29 GMT+08:00 Gonzalo Herreros : > I would implement a custom serializer and configure it in the standard > Hdfs sink. > That way you control how you build the key for each event. > > Regards, > Gonzalo > > On 8 September 2015 at 06:42

Re: How to customize the key in a HDFS SequenceFile sink

2015-09-08 Thread Gonzalo Herreros
I would implement a custom serializer and configure it in the standard Hdfs sink. That way you control how you build the key for each event. Regards, Gonzalo On 8 September 2015 at 06:42, wrote: > > Hello, > > I'm using Flume's HDFS SequenceFile sink for writing data to HDFS. I'm > looking for