Re: Configuring FLUME to use FailOverSinkProcessor...

2012-07-13 Thread Jarek Jarcec Cecho
Hi Inder,
actually in my testing environment, events weren't drained from the memory channel 
and therefore they were not saved in the file channel. I'm not sure why, but it 
appeared that the failover did not happen as expected. Unfortunately, I have not 
yet had enough time to fully explore what happened (I might be doing 
something wrong).

Jarcec

On Jul 13, 2012, at 3:51 PM, Inder Pall wrote:

> So if this thing works what are people's thoughts about using it for PROD
> envs...
> 
> Jarek, the reason i had hdfs data sink locations same for both HDFS sinks
> was to ensure that the spooled data also makes to the final location...so
> the test to try out would be to check all numbers generated by the seqeunce
> generator source are available once the tests are finished...
> 
> btw, the avro-sink & avro source combination for file channel is a little
> heavy as it's local..perhaps something lighter could be better.
> 
> * I turned off name node, event's were not stored in file channel as
> expected
>>> here you meant they were indeed saved into file channel and drained from
> memory channel right??
> 
> Thanks,
> - Inder
> 
> On Fri, Jul 13, 2012 at 7:18 PM, Inder Pall  wrote:
> 
>> Jarek,
>> 
>> thanks for taking out time to try this..yeah i meant mem channel to be
>> used first and then file channel for some reason i thought lower number
>> means higher priority pardon the ignorance of not looking at the
>> documentation.
>> 
>> i have a dated code-base of trunk around 10-12 days...the agent comes up
>> fine but the sequence generator source wasn't sending any events...or
>> actually i didn't see anything in the logs
>>
>> Thanks,
>> - Inder
>> 
>> 
>> On Fri, Jul 13, 2012 at 3:30 PM, Jarek Jarcec Cecho wrote:
>> 
>>> Hi Inder,
>>> could you please advise what exactly was (or wasn't) happening to you?
>>> I've tried your configuration file on current trunk and it was working for
>>> me out of the box.
>>> 
>>> I was playing with it to see if your idea will work and I've ended up
>>> with attached configuration. It contains just few modification to yours:
>>> 
>>> * I've swapped priorities to firstly use the memory channel and fail over
>>> to file channel
>>> * I've renamed target file prefixes to distinguish source of the events
>>> (memory or file channel)
>>> 
>>> Scenario that I've run:
>>> 
>>> * I turned on the flume agents, event's were correctly saved on HDFS
>>> * I turned off name node, event's were not stored in file channel as
>>> expected
>>> 
>>> I'm not sure why the observed behaviour was different than expected and
>>> 'I'll investigate that later. Meantime, could you describe what exactly was
>>> happening to you?
>>> 
>>> Jarcec
>>> 
>>> On Thu, Jul 12, 2012 at 04:11:08PM +0200, Jarek Jarcec Cecho wrote:
>>>> I'm just looking on that sir.
>>>>
>>>> Jarcec
>>>>
>>>> On Thu, Jul 12, 2012 at 07:34:31PM +0530, Inder Pall wrote:
>>>>> Folks,
>>>>>
>>>>> for some reason updating the JIRA isn't triggering an email
>>>>> I need feedback from FLUME DEVS on
>>>>> FLUME-1045 <https://issues.apache.org/jira/browse/FLUME-1045?focusedCommentId=13412737&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13412737>
>>>>> wherein
>>>>> i am trying to use failover sinkprocessor and a combination of file and
>>>>> memory channel to achieve scribe like spooling/de-spooling. facing some
>>>>> issues here
>>>>>
>>>>> --
>>>>> Thanks,
>>>>> - Inder
>>>>>  Tech Platforms @Inmobi
>>>>>  Linkedin - http://goo.gl/eR4Ub
>>> 
>>> 
>>> 
>> 
>> 
>> --
>> Thanks,
>> - Inder
>>  Tech Platforms @Inmobi
>>  Linkedin - http://goo.gl/eR4Ub
>> 
> 
> 
> 
> -- 
> Thanks,
> - Inder
>  Tech Platforms @Inmobi
>  Linkedin - http://goo.gl/eR4Ub





[jira] [Created] (FLUME-1370) HDFSEventSink - file name collision in bucket path

2012-07-13 Thread Mubarak Seyed (JIRA)
Mubarak Seyed created FLUME-1370:


 Summary: HDFSEventSink - file name collision in bucket path
 Key: FLUME-1370
 URL: https://issues.apache.org/jira/browse/FLUME-1370
 Project: Flume
  Issue Type: Bug
  Components: Sinks+Sources
Affects Versions: v1.2.0
 Environment: Linux, Java 1.6.0.24, hadoop-0.20.205, flume-1.2.0
Reporter: Mubarak Seyed


It appears from testing that two HDFS sinks (on different agents/machines) are 
trying to create the same file name:

{code}
2012-07-12 22:18:51,820 WARN org.apache.hadoop.hdfs.StateChange: DIR* NameSystem.startFile: failed to create file /logs/event1/07122012/22/2/Event1.1342130410188.tmp for DFSClient_-1690064085 on client 0.0.0.1, because this file is already being created by DFSClient_1581651201 on 0.0.0.2
2012-07-12 22:18:51,820 INFO org.apache.hadoop.ipc.Server: IPC Server handler 8 on 8020, call create(/logs/event1/07122012/22/2/Event1.1342130410188.tmp, rwxr-xr-x, DFSClient_-1690064085, true, 3, 134217728) from 0.0.0.1:54280: error: org.apache.hadoop.hdfs.protocol.AlreadyBeingCreatedException: failed to create file /logs/event1/07122012/22/2/Event1.1342130410188.tmp for DFSClient_-1690064085 on client 0.0.0.1, because this file is already being created by DFSClient_1581651201 on 0.0.0.2
org.apache.hadoop.hdfs.protocol.AlreadyBeingCreatedException: failed to create file /logs/event1/07122012/22/2/Event1.1342130410188.tmp for DFSClient_-1690064085 on client 0.0.0.1, because this file is already being created by DFSClient_1581651201 on 0.0.0.2
    at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.recoverLeaseInternal(FSNamesystem.java:1338)
    at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.startFileInternal(FSNamesystem.java:1178)
    at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.startFile(FSNamesystem.java:1126)
    at org.apache.hadoop.hdfs.server.namenode.NameNode.create(NameNode.java:585)
    at sun.reflect.GeneratedMethodAccessor15.invoke(Unknown Source)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
    at java.lang.reflect.Method.invoke(Method.java:597)
    at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:557)
    at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1434)
    at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1430)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:396)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1127)
    at org.apache.hadoop.ipc.Server$Handler.run(Server.java:1428)

{code}
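
To make the collision in the log above concrete, here is a minimal sketch. It assumes the numeric part of the file name (1342130410188) is an epoch-millisecond value, which is what it looks like. The class and method names are hypothetical and this is not the HDFSEventSink code. The per-agent token is only one possible mitigation, shown as an assumption rather than as the fix chosen for this issue.

```java
/** Hypothetical illustration only; not Flume's HDFSEventSink implementation. */
final class BucketFileNames {

    /** Mirrors the shape of the name seen in the log: <bucket>/<prefix>.<millis>.tmp */
    static String timestampOnly(String bucketPath, String prefix, long millis) {
        return bucketPath + "/" + prefix + "." + millis + ".tmp";
    }

    /** Assumed mitigation: add a per-agent token so names differ across machines. */
    static String withAgentToken(String bucketPath, String prefix,
                                 String agentToken, long millis) {
        return bucketPath + "/" + prefix + "-" + agentToken + "." + millis + ".tmp";
    }

    public static void main(String[] args) {
        String bucket = "/logs/event1/07122012/22/2";
        long millis = 1342130410188L;

        // Two agents that resolve the same bucket path, prefix and counter value
        // ask the name node to create the very same file.
        String fromAgent1 = timestampOnly(bucket, "Event1", millis);
        String fromAgent2 = timestampOnly(bucket, "Event1", millis);
        System.out.println("same name without token: " + fromAgent1.equals(fromAgent2));

        // With a distinct token per agent the two names no longer clash.
        String tagged1 = withAgentToken(bucket, "Event1", "agent1", millis);
        String tagged2 = withAgentToken(bucket, "Event1", "agent2", millis);
        System.out.println("same name with token: " + tagged1.equals(tagged2));
    }
}
```

Whatever makes the name unique per agent (host name, agent name, a random suffix) would serve the same purpose. The point is only that prefix plus counter is not unique across machines writing into one bucket path.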

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




Re: Review Request: FLUME-1360: Provide documentation for static interceptor

2012-07-13 Thread Jarek Cecho

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/5879/
---

(Updated July 13, 2012, 1:57 p.m.)


Review request for Flume, Arvind Prabhakar, Juhani Connolly, Mike Percy, and 
Hari Shreedharan.


Description
---

I've documented the static interceptor and also added some examples for the 
other interceptors.


This addresses bug FLUME-1360.
https://issues.apache.org/jira/browse/FLUME-1360


Diffs
-

  /trunk/flume-ng-doc/sphinx/FlumeUserGuide.rst 1359503 

Diff: https://reviews.apache.org/r/5879/diff/


Testing
---


Thanks,

Jarek Cecho



Re: Configuring FLUME to use FailOverSinkProcessor...

2012-07-13 Thread Inder Pall
So if this thing works what are people's thoughts about using it for PROD
envs...

Jarek, the reason i had the same hdfs data sink location for both HDFS sinks
was to ensure that the spooled data also makes it to the final location... so
the test to try out would be to check that all numbers generated by the sequence
generator source are available once the tests are finished...

btw, the avro-sink & avro-source combination for the file channel is a little
heavy as it's local.. perhaps something lighter could be better.

> * I turned off name node, event's were not stored in file channel as
> expected

here you meant they were indeed saved into the file channel and drained from the
memory channel, right??

Thanks,
- Inder

On Fri, Jul 13, 2012 at 7:18 PM, Inder Pall  wrote:

> Jarek,
>
> thanks for taking out time to try this..yeah i meant mem channel to be
> used first and then file channel for some reason i thought lower number
> means higher priority pardon the ignorance of not looking at the
> documentation.
>
> i have a dated code-base of trunk around 10-12 days...the agent comes up
> fine but the sequence generator source wasn't sending any events...or
> actually i didn't see anything in the logs
>
>


> Thanks,
>  - Inder
>
>
> On Fri, Jul 13, 2012 at 3:30 PM, Jarek Jarcec Cecho wrote:
>
>> Hi Inder,
>> could you please advise what exactly was (or wasn't) happening to you?
>> I've tried your configuration file on current trunk and it was working for
>> me out of the box.
>>
>> I was playing with it to see if your idea will work and I've ended up
>> with attached configuration. It contains just few modification to yours:
>>
>> * I've swapped priorities to firstly use the memory channel and fail over
>> to file channel
>> * I've renamed target file prefixes to distinguish source of the events
>> (memory or file channel)
>>
>> Scenario that I've run:
>>
>> * I turned on the flume agents, event's were correctly saved on HDFS
>> * I turned off name node, event's were not stored in file channel as
>> expected
>>
>> I'm not sure why the observed behaviour was different than expected and
>> 'I'll investigate that later. Meantime, could you describe what exactly was
>> happening to you?
>>
>> Jarcec
>>
>> On Thu, Jul 12, 2012 at 04:11:08PM +0200, Jarek Jarcec Cecho wrote:
>> > I'm just looking on that sir.
>> >
>> > Jarcec
>> >
>> > On Thu, Jul 12, 2012 at 07:34:31PM +0530, Inder Pall wrote:
>> > > Folks,
>> > >
>> > > for some reason updating the JIRA isn't trigerring an email
>> > > I need feedback from FLUME DEVS on
>> > > FLUME-1045<
>> https://issues.apache.org/jira/browse/FLUME-1045?focusedCommentId=13412737&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13412737
>> >
>> > > wherein
>> > > i am trying to use failover sinkprocessor and a combination of file
>> and
>> > > memory channel to achieve scribe like spooling/de-spooling.facing
>> some
>> > > issues here
>> > >
>> > > --
>> > > Thanks,
>> > > - Inder
>> > >   Tech Platforms @Inmobi
>> > >   Linkedin - http://goo.gl/eR4Ub
>>
>>
>>
>
>
> --
> Thanks,
> - Inder
>   Tech Platforms @Inmobi
>   Linkedin - http://goo.gl/eR4Ub
>



-- 
Thanks,
- Inder
  Tech Platforms @Inmobi
  Linkedin - http://goo.gl/eR4Ub


Re: Configuring FLUME to use FailOverSinkProcessor...

2012-07-13 Thread Inder Pall
Jarek,

thanks for taking the time to try this. Yeah, i meant the mem channel to be
used first and then the file channel; for some reason i thought a lower number
means higher priority. Pardon the ignorance of not looking at the
documentation.
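
For reference, the priority rule can be sketched as below. This is a minimal illustration, assuming the rule from the Flume User Guide that a larger priority value means the sink is preferred. The class name, sink names and the "failed" set are hypothetical and this is not the FailoverSinkProcessor code.

```java
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Optional;
import java.util.Set;

/** Hypothetical sketch of failover selection; not Flume's FailoverSinkProcessor. */
final class FailoverPick {

    /** Returns the live sink with the largest priority value, if any sink is live. */
    static Optional<String> pick(Map<String, Integer> priorities, Set<String> failed) {
        return priorities.entrySet().stream()
                .filter(e -> !failed.contains(e.getKey()))   // skip sinks marked as failed
                .max(Map.Entry.comparingByValue())           // larger number wins
                .map(Map.Entry::getKey);
    }

    public static void main(String[] args) {
        // Assumed names: hdfsSink drains the memory channel, spoolSink hands
        // events over to the file channel when HDFS is unavailable.
        Map<String, Integer> priorities = new LinkedHashMap<>();
        priorities.put("hdfsSink", 10);
        priorities.put("spoolSink", 5);

        System.out.println(pick(priorities, Collections.<String>emptySet()));
        System.out.println(pick(priorities, Collections.singleton("hdfsSink")));
    }
}
```

With these values the HDFS sink is tried first and the spooling sink is used only while the HDFS sink is marked as failed, which is the ordering intended in this thread.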

i have a dated code-base of trunk, around 10-12 days old... the agent comes up
fine but the sequence generator source wasn't sending any events, or at least
i didn't see anything in the logs

Thanks,
 - Inder

On Fri, Jul 13, 2012 at 3:30 PM, Jarek Jarcec Cecho wrote:

> Hi Inder,
> could you please advise what exactly was (or wasn't) happening to you?
> I've tried your configuration file on current trunk and it was working for
> me out of the box.
>
> I was playing with it to see if your idea will work and I've ended up with
> attached configuration. It contains just few modification to yours:
>
> * I've swapped priorities to firstly use the memory channel and fail over
> to file channel
> * I've renamed target file prefixes to distinguish source of the events
> (memory or file channel)
>
> Scenario that I've run:
>
> * I turned on the flume agents, event's were correctly saved on HDFS
> * I turned off name node, event's were not stored in file channel as
> expected
>
> I'm not sure why the observed behaviour was different than expected and
> 'I'll investigate that later. Meantime, could you describe what exactly was
> happening to you?
>
> Jarcec
>
> On Thu, Jul 12, 2012 at 04:11:08PM +0200, Jarek Jarcec Cecho wrote:
> > I'm just looking on that sir.
> >
> > Jarcec
> >
> > On Thu, Jul 12, 2012 at 07:34:31PM +0530, Inder Pall wrote:
> > > Folks,
> > >
> > > for some reason updating the JIRA isn't trigerring an email
> > > I need feedback from FLUME DEVS on
> > > FLUME-1045<
> https://issues.apache.org/jira/browse/FLUME-1045?focusedCommentId=13412737&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13412737
> >
> > > wherein
> > > i am trying to use failover sinkprocessor and a combination of file and
> > > memory channel to achieve scribe like spooling/de-spooling.facing
> some
> > > issues here
> > >
> > > --
> > > Thanks,
> > > - Inder
> > >   Tech Platforms @Inmobi
> > >   Linkedin - http://goo.gl/eR4Ub
>
>
>


-- 
Thanks,
- Inder
  Tech Platforms @Inmobi
  Linkedin - http://goo.gl/eR4Ub


[jira] [Commented] (FLUME-1368) In user guide, property sink.directory for file roller sink should be bold

2012-07-13 Thread Jarek Jarcec Cecho (JIRA)

[ 
https://issues.apache.org/jira/browse/FLUME-1368?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13413650#comment-13413650
 ] 

Jarek Jarcec Cecho commented on FLUME-1368:
---

Yeah sure,
simply run "mvn site" to build the documentation. You can find the generated docs 
in target/site; more specifically, your change should be visible in the file 
target/site/FlumeUserGuide.html

Jarcec

> In user guide, property sink.directory for file roller sink should be bold
> --
>
> Key: FLUME-1368
> URL: https://issues.apache.org/jira/browse/FLUME-1368
> Project: Flume
>  Issue Type: Bug
>Reporter: Mark Stern
>Priority: Trivial
>  Labels: documentation
> Attachments: FLUME-1368.PATCH
>
>
> Required properties should be in bold. 'sink.directory' is a required 
> property (because there is no default and it has to know where to put the 
> files). So it should be bold in the user guide.





[jira] [Commented] (FLUME-1368) In user guide, property sink.directory for file roller sink should be bold

2012-07-13 Thread Mark Stern (JIRA)

[ 
https://issues.apache.org/jira/browse/FLUME-1368?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13413633#comment-13413633
 ] 

Mark Stern commented on FLUME-1368:
---

I see what you mean. Is there a way I can test my second attempt before I 
submit it?

> In user guide, property sink.directory for file roller sink should be bold
> --
>
> Key: FLUME-1368
> URL: https://issues.apache.org/jira/browse/FLUME-1368
> Project: Flume
>  Issue Type: Bug
>Reporter: Mark Stern
>Priority: Trivial
>  Labels: documentation
> Attachments: FLUME-1368.PATCH
>
>
> Required properties should be in bold. 'sink.directory' is a required 
> property (because there is no default and it has to know where to put the 
> files). So it should be bold in the user guide.





[jira] [Commented] (FLUME-1366) AbstractSource, AbstractChannel & AbstractSink should provide a protected method setLifecycleState()

2012-07-13 Thread Jarek Jarcec Cecho (JIRA)

[ 
https://issues.apache.org/jira/browse/FLUME-1366?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13413621#comment-13413621
 ] 

Jarek Jarcec Cecho commented on FLUME-1366:
---

Just a side note: We're in the process of moving our home brew life cycle 
engine to Guava in FLUME-966.

> AbstractSource, AbstractChannel & AbstractSink should provide a protected 
> method setLifecycleState()
> 
>
> Key: FLUME-1366
> URL: https://issues.apache.org/jira/browse/FLUME-1366
> Project: Flume
>  Issue Type: Improvement
>  Components: Channel, Sinks+Sources
>Affects Versions: v1.2.0
>Reporter: Alvaro Polo
>Priority: Minor
>
> Sources, channels and sinks are designed to extend {{AbstractSource}}, 
> {{AbstractChannel}} and {{AbstractSink}}, respectively. These classes 
> implement the basics of lifecycle state tracking for sources, channels and 
> sinks. In all of them, that state is modified exclusively by invoking the 
> {{start()}} and {{stop()}} methods.
> There is no possibility of setting the lifecycle state to {{ERROR}}. This 
> would be especially useful when overridden {{start()}} and {{stop()}} methods 
> hit an error that prevents the element from being started or stopped. 
> My suggestion here is to add a new protected method {{void 
> setLifecycleState(LifecycleState)}} that allows concrete sources, channels and 
> sinks to report such a situation.
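
The proposal above could look roughly like the following. This is a minimal sketch with stand-in types: the enum and the abstract base class are simplified local copies written for illustration, not the real Flume classes, and FlakySink is a hypothetical component.

```java
/** Local stand-in for Flume's lifecycle states; simplified for illustration. */
enum LifecycleState { IDLE, START, STOP, ERROR }

/** Stand-in for AbstractSource / AbstractChannel / AbstractSink. */
abstract class AbstractComponent {
    private LifecycleState state = LifecycleState.IDLE;

    public synchronized void start() { state = LifecycleState.START; }

    public synchronized void stop() { state = LifecycleState.STOP; }

    public synchronized LifecycleState getLifecycleState() { return state; }

    /** The proposed addition: lets subclasses report a state such as ERROR. */
    protected synchronized void setLifecycleState(LifecycleState newState) {
        state = newState;
    }
}

/** Hypothetical sink whose start() can fail. */
class FlakySink extends AbstractComponent {
    private final boolean resourceAvailable;

    FlakySink(boolean resourceAvailable) {
        this.resourceAvailable = resourceAvailable;
    }

    @Override
    public synchronized void start() {
        if (!resourceAvailable) {
            // Without the protected setter there is no way to signal this failure.
            setLifecycleState(LifecycleState.ERROR);
            return;
        }
        super.start();
    }
}

final class LifecycleDemo {
    public static void main(String[] args) {
        FlakySink healthy = new FlakySink(true);
        healthy.start();
        System.out.println("healthy sink: " + healthy.getLifecycleState());

        FlakySink broken = new FlakySink(false);
        broken.start();
        System.out.println("broken sink: " + broken.getLifecycleState());
    }
}
```

Keeping the setter protected means only the component itself can change its own state, while supervisors keep reading it through the public getter.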





Re: Configuring FLUME to use FailOverSinkProcessor...

2012-07-13 Thread Jarek Jarcec Cecho
Hi Inder,
could you please advise what exactly was (or wasn't) happening to you? I've 
tried your configuration file on current trunk and it was working for me out of 
the box.

I was playing with it to see if your idea would work and I ended up with the 
attached configuration. It contains just a few modifications to yours:

* I've swapped the priorities to use the memory channel first and fail over to 
the file channel
* I've renamed the target file prefixes to distinguish the source of the events 
(memory or file channel)

Scenario that I've run:

* I turned on the flume agents; events were correctly saved on HDFS
* I turned off the name node; contrary to expectation, events were not stored 
in the file channel

I'm not sure why the observed behaviour was different from the expected one and 
I'll investigate that later. In the meantime, could you describe what exactly 
was happening to you?

Jarcec

On Thu, Jul 12, 2012 at 04:11:08PM +0200, Jarek Jarcec Cecho wrote:
> I'm just looking on that sir.
> 
> Jarcec
> 
> On Thu, Jul 12, 2012 at 07:34:31PM +0530, Inder Pall wrote:
> > Folks,
> > 
> > for some reason updating the JIRA isn't trigerring an email
> > I need feedback from FLUME DEVS on
> > FLUME-1045
> > wherein
> > i am trying to use failover sinkprocessor and a combination of file and
> > memory channel to achieve scribe like spooling/de-spooling.facing some
> > issues here
> > 
> > -- 
> > Thanks,
> > - Inder
> >   Tech Platforms @Inmobi
> >   Linkedin - http://goo.gl/eR4Ub






[jira] [Commented] (FLUME-1368) In user guide, property sink.directory for file roller sink should be bold

2012-07-13 Thread Jarek Jarcec Cecho (JIRA)

[ 
https://issues.apache.org/jira/browse/FLUME-1368?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13413608#comment-13413608
 ] 

Jarek Jarcec Cecho commented on FLUME-1368:
---

Hi Mark,
thank you very much for your contribution. It seems to me that you've extended 
the first column by a couple of characters, but you did not change the column 
boundaries ("=" characters at the beginning of the table, at the end of the 
header and at the end of the table). Because of that, the UserGuide is failing 
to build correctly. Would you mind fixing it?

Jarcec

> In user guide, property sink.directory for file roller sink should be bold
> --
>
> Key: FLUME-1368
> URL: https://issues.apache.org/jira/browse/FLUME-1368
> Project: Flume
>  Issue Type: Bug
>Reporter: Mark Stern
>Priority: Trivial
>  Labels: documentation
> Attachments: FLUME-1368.PATCH
>
>
> Required properties should be in bold. 'sink.directory' is a required 
> property (because there is no default and it has to know where to put the 
> files). So it should be bold in the user guide.





Re: [DISCUSS] Git as primary source control for Flume

2012-07-13 Thread Arvind Prabhakar
+1 for using Git as primary source control system.

Thanks Hari for following up on this.

Regards,
Arvind Prabhakar

On Wed, Jul 11, 2012 at 7:16 PM, Leslin  wrote:

> +1 for this proposal. Git is fine for me. I never went back to SVN after I
> touched git.
>
> 2012/7/12 Mike Percy 
>
> > On Wed, Jul 11, 2012 at 5:45 PM, Ralph Goers wrote:
> >
> > > IMO the person who wrote the code is the one who should get credit.
> > >
> >
> > Of course they should get the credit for the work.
> >
> > Anyone who has ever performed a careful code review knows that it can be
> > time-consuming work. I assume that's one reason why we currently list
> both
> > the author and the committer in the commit message.
> >
> > Regards,
> > Mike
> >
>
>
>
> --
>
>
>
> Best Regards
>
> Leslin
>


Re: [VOTE] Release Apache Flume version 1.2.0 (rc1)

2012-07-13 Thread Arvind Prabhakar
+1

* Binary and Source distributions checksums and signatures match
* LICENSE file accounts for all included Jars in the binary distribution
* Sources build and test fine.
* Top level files all look good
* Jira is clean

One slight concern (not a blocker): the tag contains sources in contrib
that are not included in the source tar-ball. Since these sources are not
used for build, we can do without those for now.

Thanks for your hard work Mike!

Regards,
Arvind Prabhakar

On Wed, Jul 11, 2012 at 4:57 AM, Mike Percy  wrote:

> This is the first release for Apache Flume as a top-level project,
> version 1.2.0. We are voting on release candidate rc1.
>
> *** Please cast your vote within the next 72 hours ***
>
> The list of fixed issues:
> https://svn.apache.org/repos/asf/flume/tags/flume-1.2.0-rc1/CHANGELOG
>
> The tarball (*.tar.gz), signature (*.asc), and checksums (*.md5, *.sha1)
> for the source and binary artifacts can be found at:
> https://people.apache.org/~mpercy/flume/apache-flume-1.2.0-rc1/
>
> The tag to be voted on:
> https://svn.apache.org/repos/asf/flume/tags/flume-1.2.0-rc1
>
> The KEYS file can be found here:
> https://svn.apache.org/repos/asf/flume/dist/KEYS
>
> Changes since rc0:
>  - Updated LICENSE file
>  - Updated DEVNOTES file
>  - Removed DISCLAIMER file from dist.xml and src.xml manifests
>  - pom.xml file updated with TLP info (FLUME-1359)
>  - A build fix to prevent multiple servlet-api jars in lib dir
>