Re: Review Request: FLUME-1586. File Channel should support verifying integrity of individual events.

2013-05-22 Thread Hari Shreedharan

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/10944/
---

(Updated May 23, 2013, 1:52 a.m.)


Review request for Flume.


Changes
---

Add tests + refactoring.


Description
---

Patch to add a checksum to each event and, using a tool, replace corrupt events 
with a noop event.


This addresses bug FLUME-1586.
https://issues.apache.org/jira/browse/FLUME-1586
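
A minimal sketch of the idea the description refers to, not the code in this patch:
store a checksum alongside each event body when it is written and verify it when it
is read back, signalling corruption with an exception (the diff below adds a
CorruptEventException for that purpose). The choice of CRC32 and all class and
method names here are illustrative assumptions.

import java.util.zip.CRC32;

/** Sketch only -- illustrative, not this patch's implementation. */
final class EventChecksumSketch {

  /** Stand-in for the CorruptEventException the patch introduces. */
  static final class CorruptEventException extends Exception {
    CorruptEventException(String msg) { super(msg); }
  }

  /** Compute a CRC32 checksum over an event body. */
  static long checksum(byte[] body) {
    CRC32 crc = new CRC32();
    crc.update(body, 0, body.length);
    return crc.getValue();
  }

  /** Verify a body read back from the log against the checksum stored with it. */
  static void verify(byte[] body, long storedChecksum) throws CorruptEventException {
    if (checksum(body) != storedChecksum) {
      throw new CorruptEventException("stored and recomputed checksums differ");
    }
  }
}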


Diffs (updated)
---

  
flume-ng-channels/flume-file-channel/src/main/java/org/apache/flume/channel/file/CorruptEventException.java
 PRE-CREATION 
  
flume-ng-channels/flume-file-channel/src/main/java/org/apache/flume/channel/file/FileChannel.java
 cc0d38a 
  
flume-ng-channels/flume-file-channel/src/main/java/org/apache/flume/channel/file/FlumeEvent.java
 c447335 
  
flume-ng-channels/flume-file-channel/src/main/java/org/apache/flume/channel/file/FlumeEventPointer.java
 5f06ab7 
  
flume-ng-channels/flume-file-channel/src/main/java/org/apache/flume/channel/file/Log.java
 1918baa 
  
flume-ng-channels/flume-file-channel/src/main/java/org/apache/flume/channel/file/LogFile.java
 d3db896 
  
flume-ng-channels/flume-file-channel/src/main/java/org/apache/flume/channel/file/LogFileV3.java
 d9a2a9b 
  
flume-ng-channels/flume-file-channel/src/main/java/org/apache/flume/channel/file/NoopRecordException.java
 PRE-CREATION 
  
flume-ng-channels/flume-file-channel/src/main/java/org/apache/flume/channel/file/Pair.java
 dfcdd73 
  
flume-ng-channels/flume-file-channel/src/main/java/org/apache/flume/channel/file/Put.java
 4235a79 
  
flume-ng-channels/flume-file-channel/src/main/java/org/apache/flume/channel/file/ReplayHandler.java
 fc47b23 
  
flume-ng-channels/flume-file-channel/src/main/java/org/apache/flume/channel/file/Serialization.java
 d6897e1 
  
flume-ng-channels/flume-file-channel/src/main/java/org/apache/flume/channel/file/TransactionEventRecord.java
 073042f 
  
flume-ng-channels/flume-file-channel/src/main/java/org/apache/flume/channel/file/proto/ProtosFactory.java
 4860ac2 
  flume-ng-channels/flume-file-channel/src/main/proto/filechannel.proto 1e668d2 
  
flume-ng-channels/flume-file-channel/src/test/java/org/apache/flume/channel/file/TestFileChannel.java
 0f7d14d 
  
flume-ng-channels/flume-file-channel/src/test/java/org/apache/flume/channel/file/TestLog.java
 54978f8 
  
flume-ng-channels/flume-file-channel/src/test/java/org/apache/flume/channel/file/TestLogFile.java
 bef22ef 
  
flume-ng-channels/flume-file-channel/src/test/java/org/apache/flume/channel/file/TestTransactionEventRecordV3.java
 f403422 
  
flume-ng-channels/flume-file-channel/src/test/java/org/apache/flume/channel/file/TestUtils.java
 563dbcc 
  
flume-ng-sinks/flume-hdfs-sink/src/main/java/org/apache/flume/sink/hdfs/AbstractHDFSWriter.java
 bc3b383 
  
flume-ng-sinks/flume-hdfs-sink/src/main/java/org/apache/flume/sink/hdfs/HDFSCompressedDataStream.java
 2c2be6a 
  
flume-ng-sinks/flume-hdfs-sink/src/main/java/org/apache/flume/sink/hdfs/HDFSDataStream.java
 b8214be 
  
flume-ng-sinks/flume-hdfs-sink/src/main/java/org/apache/flume/sink/hdfs/HDFSSequenceFile.java
 0383744 
  flume-tools/pom.xml PRE-CREATION 
  
flume-tools/src/main/java/org/apache/flume/tools/FileChannelIntegrityTool.java 
PRE-CREATION 
  flume-tools/src/main/java/org/apache/flume/tools/FlumeTool.java PRE-CREATION 
  flume-tools/src/main/java/org/apache/flume/tools/FlumeToolType.java 
PRE-CREATION 
  flume-tools/src/main/java/org/apache/flume/tools/FlumeToolsMain.java 
PRE-CREATION 
  
flume-tools/src/test/java/org/apache/flume/tools/TestFileChannelIntegrityTool.java
 PRE-CREATION 
  flume-tools/src/test/java/org/apache/flume/tools/TestFlumeToolsMain.java 
PRE-CREATION 
  pom.xml a6992f6 

Diff: https://reviews.apache.org/r/10944/diff/


Testing
---

Added unit tests for the cases where corrupt and noop events are encountered. I 
will add tests for the tool as well soon. I have not yet tested the tool 
completely; this patch aims at gathering feedback on the approach.


Thanks,

Hari Shreedharan



Re: spooldir source reading Flume itself and thinking the file has changed (1.3.1)

2013-05-22 Thread Mike Percy
Hi Phil,
Since this is more of a dev discussion I'll just continue the conversation
here on this list.

FYI the latest Spool Directory Source has support for resuming reading
files. Trunk / Flume 1.4 have some new code around this aspect.

Regarding Linux, doing lsof is a pretty cool idea but not portable to all
systems. Also in Linux, two processes are allowed to have the same file
open for write (it's still a bad idea though). I don't know of a portable
way to check whether some other process has a given file open. We could,
however, check to see if the file changed, and if so just stop processing
that file for a while and try again later. I just don't want people to
think spooldir is good for "tailing" a file, because it's not… :)

Regards,
Mike
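
As a rough illustration of the back-off idea above (an assumption-laden sketch, not
the actual SpoolDirectorySource code): record the file's size and modification time
when it is opened, and if either changes, stop processing that file for a while
instead of failing.

import java.io.File;

/** Sketch only: detect that a spooled file changed after we started reading it. */
final class SpoolFileChangeGuard {
  private final File file;
  private final long lengthAtOpen;
  private final long modTimeAtOpen;

  SpoolFileChangeGuard(File file) {
    this.file = file;
    this.lengthAtOpen = file.length();
    this.modTimeAtOpen = file.lastModified();
  }

  /** True if size and modification time are unchanged since the file was opened. */
  boolean looksUnchanged() {
    return file.length() == lengthAtOpen && file.lastModified() == modTimeAtOpen;
  }
}

A caller would check looksUnchanged() before committing a batch and, if it returns
false, park the file and retry it later rather than aborting the agent.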



On Wed, May 22, 2013 at 5:24 PM, Phil Scala wrote:

> Hey Ed / Mike
>
> Aside from a comment I made on the users mailing list about the file spool
> starting from the beginning of the file if there was a failure, the code does
> have that well commented (in retireCurrentFile) [if you don't retire the
> file then you run the risk of duplicates...fine with my use :)]
>
>
> As Ed mentioned we have been chatting about ensuring there are no
> invariants muddled up during file spool processing.  I see this as 2 or 3
> pieces...I think the code is pretty solid, with one area I want to look
> into.
>
> I would like to give this more thought...
>
> The file the spool source has decided is the "next file"... is it in
> use? Has the "upload" to the spool directory completed?
>
> Discussions mentioned some "time" delay -> that could be
> artificial and still never solve the problem.   I need to do some learning
> here, coming from Windows the file locking was pretty exclusive.  I want to
> see about FileChannel locks in nio and Linux file management. This could
> maybe be an area to look at.  Right now there are no locks obtained for the
> file being processed.
>
> I will come back with something a little better formulated soon...
>
> Thanks
>
>
> Phil Scala
> Software Developer / Architect
> Global Relay
>
> phil.sc...@globalrelay.net
>
> 866.484.6630  |  i...@globalrelay.net  |  globalrelay.com
>
> -Original Message-
> From: ejsa...@gmail.com [mailto:ejsa...@gmail.com] On Behalf Of Edward
> Sargisson
> Sent: Wednesday, May 22, 2013 12:22 PM
> To: Mike Percy; dev@flume.apache.org
> Subject: Re: spooldir source reading Flume itself and thinking the file
> has changed (1.3.1)
>
> Hi Mike,
> I haven't tried log4j2 in my environments but my review of the log4j2
> change is that it should work.
>
> What would I change?
> Phil Scala may have some thoughts.
>
> It would be nice if we thought through the file locking. I want to be able
> to put a file in the spooldir and know that Flume isn't going to get
> started until I'm ready. This certainly involves thinking about what the
> file-putting process is doing but it's not clear to me how to ensure this
> whole part is safe.
>
> The thing that is currently annoying is handling stack traces. All logging
> systems I've seen (except recent log4j2) output the stack trace with each
> frame on a new line. This means that each frame gets its own log event and
> the timestamp has to be added by Flume (instead of taken from the original
> event). That Flume timestamp might be delayed by up to 1 minute (because of
> log rolling so it's pretty crap). Logstash has a multiline filter that
> somewhat solves this.
>
> My current approach is to try and get the Log4j2 FlumeAppender and Flume
> 1.3.1 reliable and trustworthy.
>
> Cheers,
> Edward
>
> "Hi Edward,
> Did the fixes in LOG4J2-254 fix your file rolling issue?
>
> What are your thoughts on how to improve spooling directory source's error
> handling when it detects a change in the file? Just bail and retry later? I
> suppose that's a pretty reasonable approach.
>
> Regards,
> Mike
>
>
> On Tue, May 14, 2013 at 4:50 PM, Edward Sargisson 
> wrote:
>
> > Unless I'm mistaken (and concurrent code is easy to be mistaken about)
> this
> > is a race condition in apache-log4j-extras RollingFileAppender. I live
> > in hope that when log4j2 becomes GA we can move to it and then be able
> > to use it to log Flume itself.
> >
> > Evidence:
> File: castellan-reader.20130514T2058.log.COMPLETED
> > 2013-05-14 20:57:05,330  INFO ...
> >
> > File: castellan-reader.20130514T2058.log
> > 2013-05-14 21:23:05,709 DEBUG ...
> >
> > Why would an event from 2123 be written into a file from 2058?
> >
> > My understanding of log4j shows that the RollingFileAppenders end up
> > calling this:
> > FileAppender:
> > public  synchronized  void setFile(String fileName, boolean append,
> boolean
> > bufferedIO, int bufferSize)
> >
> > Which shortly calls:
> > this.qw = new QuietWriter(writer, errorHandler);
> >
> > However, the code to actually write to the writer is this:
> > protected
> >   void subAppend(LoggingEvent event) {
> > this.qw.write(this.layout.format(event));
> >
> > Un

RE: spooldir source reading Flume itself and thinking the file has changed (1.3.1)

2013-05-22 Thread Phil Scala
Hey Ed / Mike 

Aside from a comment I made on the users mailing list about the file spool 
starting from the beginning of the file if there was a failure, the code does 
have that well commented (in retireCurrentFile) [if you don't retire the 
file then you run the risk of duplicates...fine with my use :)]


As Ed mentioned we have been chatting about ensuring there are no invariants 
muddled up during file spool processing.  I see this as 2 or 3 pieces...I think 
the code is pretty solid, with one area I want to look into.

I would like to give this more thought...

The file the spool source has decided is the "next file"... is it in use? Has 
the "upload" to the spool directory completed?

Discussions mentioned some "time" delay -> that could be artificial and 
still never solve the problem.   I need to do some learning here, coming from 
Windows the file locking was pretty exclusive.  I want to see about FileChannel 
locks in nio and Linux file management. This could maybe be an area to look 
at.  Right now there are no locks obtained for the file being processed.
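
For reference, a hedged sketch of what taking a java.nio lock on the file being
processed could look like (names and structure here are assumptions, not existing
Flume code). Note that on Linux these locks are advisory, so a writer that never
asks for a lock can still modify the file:

import java.io.IOException;
import java.io.RandomAccessFile;
import java.nio.channels.FileChannel;
import java.nio.channels.FileLock;

/** Sketch only: probe whether an exclusive NIO lock can be taken on a spooled file. */
final class SpoolFileLockProbe {
  static boolean tryExclusiveLock(String path) throws IOException {
    try (RandomAccessFile raf = new RandomAccessFile(path, "rw");
         FileChannel channel = raf.getChannel()) {
      FileLock lock = channel.tryLock();  // null if another process holds the lock
      if (lock == null) {
        return false;                     // in use elsewhere; skip the file for now
      }
      lock.release();
      return true;
    }
  }
}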

I will come back with something a little better formulated soon...

Thanks


Phil Scala
Software Developer / Architect
Global Relay

phil.sc...@globalrelay.net

866.484.6630  |  i...@globalrelay.net  |  globalrelay.com 

-Original Message-
From: ejsa...@gmail.com [mailto:ejsa...@gmail.com] On Behalf Of Edward Sargisson
Sent: Wednesday, May 22, 2013 12:22 PM
To: Mike Percy; dev@flume.apache.org
Subject: Re: spooldir source reading Flume itself and thinking the file has 
changed (1.3.1)

Hi Mike,
I haven't tried log4j2 in my environments but my review of the log4j2 change is 
that it should work.

What would I change?
Phil Scala may have some thoughts.

It would be nice if we thought through the file locking. I want to be able to 
put a file in the spooldir and know that Flume isn't going to get started until 
I'm ready. This certainly involves thinking about what the file-putting process 
is doing but it's not clear to me how to ensure this whole part is safe.

The thing that is currently annoying is handling stack traces. All logging 
systems I've seen (except recent log4j2) output the stack trace with each frame 
on a new line. This means that each frame gets its own log event and the 
timestamp has to be added by Flume (instead of taken from the original event). 
That Flume timestamp might be delayed by up to 1 minute (because of log rolling 
so it's pretty crap). Logstash has a multiline filter that somewhat solves this.

My current approach is to try and get the Log4j2 FlumeAppender and Flume
1.3.1 reliable and trustworthy.

Cheers,
Edward

"Hi Edward,
Did the fixes in LOG4J2-254 fix your file rolling issue?

What are your thoughts on how to improve spooling directory source's error 
handling when it detects a change in the file? Just bail and retry later? I 
suppose that's a pretty reasonable approach.

Regards,
Mike


On Tue, May 14, 2013 at 4:50 PM, Edward Sargisson  wrote:

> Unless I'm mistaken (and concurrent code is easy to be mistaken about)
this
> is a race condition in apache-log4j-extras RollingFileAppender. I live 
> in hope that when log4j2 becomes GA we can move to it and then be able 
> to use it to log Flume itself.
>
> Evidence:
> File: castellan-reader.20130514T2058.log.COMPLETED
> 2013-05-14 20:57:05,330  INFO ...
>
> File: castellan-reader.20130514T2058.log
> 2013-05-14 21:23:05,709 DEBUG ...
>
> Why would an event from 2123 be written into a file from 2058?
>
> My understanding of log4j shows that the RollingFileAppenders end up 
> calling this:
> FileAppender:
> public  synchronized  void setFile(String fileName, boolean append,
boolean
> bufferedIO, int bufferSize)
>
> Which shortly calls:
> this.qw = new QuietWriter(writer, errorHandler);
>
> However, the code to actually write to the writer is this:
> protected
>   void subAppend(LoggingEvent event) {
> this.qw.write(this.layout.format(event));
>
> Unless I'm mistaken there's no happens-before edge between setting the 
> qw and calling subappend. The code path to get to subAppend appears 
> not to go through any method synchronized on FileAppender's monitor. 
> this.qw is not volatile.
>
> Oh, and based on my cursory inspection of the log4j2 code this exists 
> in
> log4j2 as well. I've just raised log4j2-254 to cover it. We'll see if 
> I'm actually right...
>
> Cheers,
> Edward
>
>
>
>
> On Mon, May 13, 2013 at 8:45 AM, Edward Sargisson 
> wrote:
>
> > Hi Mike,
> > Based on my reading of the various logging frameworks' source code 
> > and
> the
> > Java documentation I come to the conclusion that relying on an 
> > atomic
> move
> > is not wise. (Next time I see this I might try and prove that the
spooled
> > file is incomplete).
> >
> > So I suggest two things:
> > 1) A breach of that check should not cause the entire Flume instance 
> > to stop passing traffic.
> > 2) A configurable wait time might work. If you're using the sp

Re: [DISCUSS] Flume 1.4 release plan

2013-05-22 Thread Brock Noland
+1 for Flume 1.4
+1 for Mike being RM.


On Wed, May 22, 2013 at 1:25 PM, Roshan Naik  wrote:

> +1 for both (non binding)
>
>
> On Wed, May 22, 2013 at 10:41 AM, Mubarak Seyed  wrote:
>
> > +1 for Flume 1.4
> > +1 for Mike being RM
> >
> >
> > -Mubarak
> >
> > On May 22, 2013, at 9:51 AM, Venkatesh S R  wrote:
> >
> > > +1 for both! Thanks Mike!
> > >
> > > Best,
> > > Venkatesh
> > >
> > >
> > > On Wed, May 22, 2013 at 9:41 AM, Will McQueen 
> wrote:
> > >
> > >> +1 for Flume 1.4
> > >> +1 for Mike being RM.
> > >>
> > >> On May 22, 2013, at 9:28 AM, Edward Sargisson 
> wrote:
> > >>
> > >>> Hi All,
> > >>> +1/+1 for 1.4 and Mike.
> > >>>
> > >>>
> > >>> I'm very keen to have a 1.4 for the environments I manage. There's a
> > lot
> > >> of
> > >>> stuff I'm keen on in there.
> > >>>
> > >>> On my pre-1.4 list:
> > >>> 1. compile with elasticsearch 0.90
> > >>> 2. figure out file channel state issue which is stopping Flume
> logging
> > >> via
> > >>> itself.
> > >>>
> > >>> 1. Currently we compile with es 0.19. If somebody wants to run es
> 0.20
> > >> they
> > >>> have to recompile (es made an interface change that is source
> > compatible
> > >>> but requires a recompile). es 0.90 has been out for 2-ish weeks so
> safe
> > >>> enough to change the compile to. I think I'll raise an empty Jira to
> > >> record
> > >>> this.
> > >>>
> > >>> 2. I haven't reported this because I haven't isolated it well enough.
> > I'm
> > >>> having issues with the 1.3.1 file channel which I'd like to resolve.
> > >>>
> > >>> Cheers,
> > >>> Edward
> > >>>
> > >>> "Hi folks,
> > >>> We have had over 100 commits since 1.3.1, and a bunch of new features
> > and
> > >>> improvements including a Thrift source, much improved ElasticSearch
> > sink,
> > >>> support for a new plugins directory and layout, compression support
> in
> > >> the
> > >>> avro sink/source, improved checkpointing in the file channel and
> more,
> > >> plus
> > >>> a lot of bug fixes.
> > >>>
> > >>> It seems to me that it's time to start thinking about cutting a 1.4
> > >>> release. I would be happy to volunteer to RM the release. Worth
> noting
> > >> that
> > >>> I will be unavailable for the next two weeks... but after that I'd be
> > >> happy
> > >>> to pick this up and run with it. That's also a decent amount of time
> > for
> > >>> people  to get moving on patches and reviews for their favorite
> > features,
> > >>> bug fixes, etc.
> > >>>
> > >>> If this all sounds OK, I'd like to suggest targeting the last week of
> > >> June
> > >>> as a release date. If we can release in time for Hadoop Summit then
> > that
> > >>> would be pretty nice. Otherwise, if something comes up and we can't
> get
> > >> the
> > >>> release out that week, let's shoot for the first week of July at the
> > >> latest.
> > >>>
> > >>> Please let me know your thoughts.
> > >>>
> > >>> Regards,
> > >>> Mike
> > >>>
> > >>> +1 for Flume 1.4
> > >>> +1 for Mike being RM.
> > >>>
> > >>>
> > >>> Cheers,
> > >>> Hari"
> > >>
> >
> >
>



-- 
Apache MRUnit - Unit testing MapReduce - http://mrunit.apache.org


[FLUME-1995] Remote Channel for Apache Flume

2013-05-22 Thread Israel Ekpo
I wanted to get some feedback from others before deciding whether or not to
continue working on this.

I initially filed this improvement/new feature because of use cases where
there is a hardware failure on the machine where the agent is currently
running.

In terms of disaster recovery, having the events queue up on a remote
machine (preferably in the same internal network) will allow another agent
with the same configuration to pick it up from another machine and restart
the process of data transport towards the sink.

Sometimes, events may take a while to process and they may end up staying
in the channels (FileChannel) for a long time, during which hardware
failure could occur.

If the data in the events is mission critical, this could cause a lot of
headaches if there is no easy way to recover from the hardware failure
after events have been queued up in the file channel.

What are your thoughts towards the remote channel? I understand there is a
JDBC Channel (http://flume.apache.org/FlumeUserGuide.html#jdbc-channel) but
I have heard it has performance issues.

This is why I am deciding to use a NoSQL store to solve this.

I would like to get some feedback from others so that I can prioritize the
tasks in my JIRA queue especially with the 1.4.0 release deadline drawing
nearer.

Thanks.


[jira] [Commented] (FLUME-1995) CassandraChannel - A Distributed Channel Backed By Apache Cassandra as a Persistent Store for Events

2013-05-22 Thread Israel Ekpo (JIRA)

[ 
https://issues.apache.org/jira/browse/FLUME-1995?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13664406#comment-13664406
 ] 

Israel Ekpo commented on FLUME-1995:


Hi Edward,

Thank you for the feedback. 

After reading the internal architecture document, I do see your point.

I will explore another option then as this would lead to performance and disk 
space issues down the line.

My reason for queuing outside of the same computer as the Flume agent is that 
there are use cases where the data is very important and if there happens to 
be a non-recoverable hardware failure on the agent's machine where the file 
channel is located, it would be easier to restart another agent with the same 
configuration from a separate machine since the events would still be available 
elsewhere.

We can discuss more alternatives offline (regarding what the queuing solution 
should be) but I still think there is a need for a high-performing channel that 
queues the events outside of the machine where the agent is running.



> CassandraChannel - A Distributed Channel Backed By Apache Cassandra as a 
> Persistent Store for Events
> 
>
> Key: FLUME-1995
> URL: https://issues.apache.org/jira/browse/FLUME-1995
> Project: Flume
>  Issue Type: New Feature
>  Components: Channel
>Affects Versions: v1.4.0
>Reporter: Israel Ekpo
>Assignee: Israel Ekpo
>
> Apache Cassandra Channel
> The events received by this channel are queued up in Cassandra to be picked 
> up later when sinks send pickup requests to the channel.
> This type of channel is suitable for use cases where recoverability in the 
> event of a hardware failure on the agent machine is important.
> The Cassandra cluster can be located on a remote machine.
> Cassandra also supports replication which could back up and replicate the 
> events further to other nodes.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


Re: [DISCUSS] Flume 1.4 release plan

2013-05-22 Thread Roshan Naik
+1 for both (non binding)


On Wed, May 22, 2013 at 10:41 AM, Mubarak Seyed  wrote:

> +1 for Flume 1.4
> +1 for Mike being RM
>
>
> -Mubarak
>
> On May 22, 2013, at 9:51 AM, Venkatesh S R  wrote:
>
> > +1 for both! Thanks Mike!
> >
> > Best,
> > Venkatesh
> >
> >
> > On Wed, May 22, 2013 at 9:41 AM, Will McQueen  wrote:
> >
> >> +1 for Flume 1.4
> >> +1 for Mike being RM.
> >>
> >> On May 22, 2013, at 9:28 AM, Edward Sargisson  wrote:
> >>
> >>> Hi All,
> >>> +1/+1 for 1.4 and Mike.
> >>>
> >>>
> >>> I'm very keen to have a 1.4 for the environments I manage. There's a
> lot
> >> of
> >>> stuff I'm keen on in there.
> >>>
> >>> On my pre-1.4 list:
> >>> 1. compile with elasticsearch 0.90
> >>> 2. figure out file channel state issue which is stopping Flume logging
> >> via
> >>> itself.
> >>>
> >>> 1. Currently we compile with es 0.19. If somebody wants to run es 0.20
> >> they
> >>> have to recompile (es made an interface change that is source
> compatible
> >>> but requires a recompile). es 0.90 has been out for 2-ish weeks so safe
> >>> enough to change the compile to. I think I'll raise an empty Jira to
> >> record
> >>> this.
> >>>
> >>> 2. I haven't reported this because I haven't isolated it well enough.
> I'm
> >>> having issues with the 1.3.1 file channel which I'd like to resolve.
> >>>
> >>> Cheers,
> >>> Edward
> >>>
> >>> "Hi folks,
> >>> We have had over 100 commits since 1.3.1, and a bunch of new features
> and
> >>> improvements including a Thrift source, much improved ElasticSearch
> sink,
> >>> support for a new plugins directory and layout, compression support in
> >> the
> >>> avro sink/source, improved checkpointing in the file channel and more,
> >> plus
> >>> a lot of bug fixes.
> >>>
> >>> It seems to me that it's time to start thinking about cutting a 1.4
> >>> release. I would be happy to volunteer to RM the release. Worth noting
> >> that
> >>> I will be unavailable for the next two weeks... but after that I'd be
> >> happy
> >>> to pick this up and run with it. That's also a decent amount of time
> for
> >>> people  to get moving on patches and reviews for their favorite
> features,
> >>> bug fixes, etc.
> >>>
> >>> If this all sounds OK, I'd like to suggest targeting the last week of
> >> June
> >>> as a release date. If we can release in time for Hadoop Summit then
> that
> >>> would be pretty nice. Otherwise, if something comes up and we can't get
> >> the
> >>> release out that week, let's shoot for the first week of July at the
> >> latest.
> >>>
> >>> Please let me know your thoughts.
> >>>
> >>> Regards,
> >>> Mike
> >>>
> >>> +1 for Flume 1.4
> >>> +1 for Mike being RM.
> >>>
> >>>
> >>> Cheers,
> >>> Hari"
> >>
>
>


Re: [DISCUSS] Flume 1.4 release plan

2013-05-22 Thread Mubarak Seyed
+1 for Flume 1.4 
+1 for Mike being RM


-Mubarak

On May 22, 2013, at 9:51 AM, Venkatesh S R  wrote:

> +1 for both! Thanks Mike!
> 
> Best,
> Venkatesh
> 
> 
> On Wed, May 22, 2013 at 9:41 AM, Will McQueen  wrote:
> 
>> +1 for Flume 1.4
>> +1 for Mike being RM.
>> 
>> On May 22, 2013, at 9:28 AM, Edward Sargisson  wrote:
>> 
>>> Hi All,
>>> +1/+1 for 1.4 and Mike.
>>> 
>>> 
>>> I'm very keen to have a 1.4 for the environments I manage. There's a lot
>> of
>>> stuff I'm keen on in there.
>>> 
>>> On my pre-1.4 list:
>>> 1. compile with elasticsearch 0.90
>>> 2. figure out file channel state issue which is stopping Flume logging
>> via
>>> itself.
>>> 
>>> 1. Currently we compile with es 0.19. If somebody wants to run es 0.20
>> they
>>> have to recompile (es made an interface change that is source compatible
>>> but requires a recompile). es 0.90 has been out for 2-ish weeks so safe
>>> enough to change the compile to. I think I'll raise an empty Jira to
>> record
>>> this.
>>> 
>>> 2. I haven't reported this because I haven't isolated it well enough. I'm
>>> having issues with the 1.3.1 file channel which I'd like to resolve.
>>> 
>>> Cheers,
>>> Edward
>>> 
>>> "Hi folks,
>>> We have had over 100 commits since 1.3.1, and a bunch of new features and
>>> improvements including a Thrift source, much improved ElasticSearch sink,
>>> support for a new plugins directory and layout, compression support in
>> the
>>> avro sink/source, improved checkpointing in the file channel and more,
>> plus
>>> a lot of bug fixes.
>>> 
>>> It seems to me that it's time to start thinking about cutting a 1.4
>>> release. I would be happy to volunteer to RM the release. Worth noting
>> that
>>> I will be unavailable for the next two weeks... but after that I'd be
>> happy
>>> to pick this up and run with it. That's also a decent amount of time for
>>> people  to get moving on patches and reviews for their favorite features,
>>> bug fixes, etc.
>>> 
>>> If this all sounds OK, I'd like to suggest targeting the last week of
>> June
>>> as a release date. If we can release in time for Hadoop Summit then that
>>> would be pretty nice. Otherwise, if something comes up and we can't get
>> the
>>> release out that week, let's shoot for the first week of July at the
>> latest.
>>> 
>>> Please let me know your thoughts.
>>> 
>>> Regards,
>>> Mike
>>> 
>>> +1 for Flume 1.4
>>> +1 for Mike being RM.
>>> 
>>> 
>>> Cheers,
>>> Hari"
>> 



[jira] [Created] (FLUME-2050) Upgrade to log4j2 (when GA)

2013-05-22 Thread Edward Sargisson (JIRA)
Edward Sargisson created FLUME-2050:
---

 Summary: Upgrade to log4j2 (when GA)
 Key: FLUME-2050
 URL: https://issues.apache.org/jira/browse/FLUME-2050
 Project: Flume
  Issue Type: Improvement
Reporter: Edward Sargisson


Log4j1 is being abandoned in favour of log4j2. Log4j2, by all that I've seen, 
has better concurrency handling and the Log4j2 FlumeAppender is nice (easily 
configurable, 3 different styles of agents).

Log4j1 has a concurrency defect which means that rolling over a log file into a 
directory for the Flume spool directory source will not be reliable. Log4j2 has 
fixed this.

Alternatively the log4j2 FlumeAppender may allow Flume to log its own logs via 
itself.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


Re: [DISCUSS] Flume 1.4 release plan

2013-05-22 Thread Venkatesh S R
+1 for both! Thanks Mike!

Best,
Venkatesh


On Wed, May 22, 2013 at 9:41 AM, Will McQueen  wrote:

> +1 for Flume 1.4
> +1 for Mike being RM.
>
> On May 22, 2013, at 9:28 AM, Edward Sargisson  wrote:
>
> > Hi All,
> > +1/+1 for 1.4 and Mike.
> >
> >
> > I'm very keen to have a 1.4 for the environments I manage. There's a lot
> of
> > stuff I'm keen on in there.
> >
> > On my pre-1.4 list:
> > 1. compile with elasticsearch 0.90
> > 2. figure out file channel state issue which is stopping Flume logging
> via
> > itself.
> >
> > 1. Currently we compile with es 0.19. If somebody wants to run es 0.20
> they
> > have to recompile (es made an interface change that is source compatible
> > but requires a recompile). es 0.90 has been out for 2-ish weeks so safe
> > enough to change the compile to. I think I'll raise an empty Jira to
> record
> > this.
> >
> > 2. I haven't reported this because I haven't isolated it well enough. I'm
> > having issues with the 1.3.1 file channel which I'd like to resolve.
> >
> > Cheers,
> > Edward
> >
> > "Hi folks,
> > We have had over 100 commits since 1.3.1, and a bunch of new features and
> > improvements including a Thrift source, much improved ElasticSearch sink,
> > support for a new plugins directory and layout, compression support in
> the
> > avro sink/source, improved checkpointing in the file channel and more,
> plus
> > a lot of bug fixes.
> >
> > It seems to me that it's time to start thinking about cutting a 1.4
> > release. I would be happy to volunteer to RM the release. Worth noting
> that
> > I will be unavailable for the next two weeks... but after that I'd be
> happy
> > to pick this up and run with it. That's also a decent amount of time for
> > people  to get moving on patches and reviews for their favorite features,
> > bug fixes, etc.
> >
> > If this all sounds OK, I'd like to suggest targeting the last week of
> June
> > as a release date. If we can release in time for Hadoop Summit then that
> > would be pretty nice. Otherwise, if something comes up and we can't get
> the
> > release out that week, let's shoot for the first week of July at the
> latest.
> >
> > Please let me know your thoughts.
> >
> > Regards,
> > Mike
> >
> > +1 for Flume 1.4
> > +1 for Mike being RM.
> >
> >
> > Cheers,
> > Hari"
>


Re: [DISCUSS] Flume 1.4 release plan

2013-05-22 Thread Will McQueen
+1 for Flume 1.4 
+1 for Mike being RM.

On May 22, 2013, at 9:28 AM, Edward Sargisson  wrote:

> Hi All,
> +1/+1 for 1.4 and Mike.
> 
> 
> I'm very keen to have a 1.4 for the environments I manage. There's a lot of
> stuff I'm keen on in there.
> 
> On my pre-1.4 list:
> 1. compile with elasticsearch 0.90
> 2. figure out file channel state issue which is stopping Flume logging via
> itself.
> 
> 1. Currently we compile with es 0.19. If somebody wants to run es 0.20 they
> have to recompile (es made an interface change that is source compatible
> but requires a recompile). es 0.90 has been out for 2-ish weeks so safe
> enough to change the compile to. I think I'll raise an empty Jira to record
> this.
> 
> 2. I haven't reported this because I haven't isolated it well enough. I'm
> having issues with the 1.3.1 file channel which I'd like to resolve.
> 
> Cheers,
> Edward
> 
> "Hi folks,
> We have had over 100 commits since 1.3.1, and a bunch of new features and
> improvements including a Thrift source, much improved ElasticSearch sink,
> support for a new plugins directory and layout, compression support in the
> avro sink/source, improved checkpointing in the file channel and more, plus
> a lot of bug fixes.
> 
> It seems to me that it's time to start thinking about cutting a 1.4
> release. I would be happy to volunteer to RM the release. Worth noting that
> I will be unavailable for the next two weeks... but after that I'd be happy
> to pick this up and run with it. That's also a decent amount of time for
> people  to get moving on patches and reviews for their favorite features,
> bug fixes, etc.
> 
> If this all sounds OK, I'd like to suggest targeting the last week of June
> as a release date. If we can release in time for Hadoop Summit then that
> would be pretty nice. Otherwise, if something comes up and we can't get the
> release out that week, let's shoot for the first week of July at the latest.
> 
> Please let me know your thoughts.
> 
> Regards,
> Mike
> 
> +1 for Flume 1.4
> +1 for Mike being RM.
> 
> 
> Cheers,
> Hari"


[jira] [Created] (FLUME-2049) Compile ElasticSearchSink with elasticsearch 0.90

2013-05-22 Thread Edward Sargisson (JIRA)
Edward Sargisson created FLUME-2049:
---

 Summary: Compile ElasticSearchSink with elasticsearch 0.90
 Key: FLUME-2049
 URL: https://issues.apache.org/jira/browse/FLUME-2049
 Project: Flume
  Issue Type: Improvement
  Components: Sinks+Sources
Affects Versions: v1.3.1, v1.4.0
Reporter: Edward Sargisson
Assignee: Edward Sargisson


The ElasticSearchSink currently compiles against es 0.19. Using 0.20 requires a 
recompile. I haven't tried 0.90 yet.

0.90 has been out for 2-ish weeks so we should change to compiling with it 
prior to 1.4 release.

I'm hoping to get priority for doing this in the next two weeks or so - however 
I have no issue with somebody else doing it.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


Re: [DISCUSS] Flume 1.4 release plan

2013-05-22 Thread Edward Sargisson
Hi All,
+1/+1 for 1.4 and Mike.


I'm very keen to have a 1.4 for the environments I manage. There's a lot of
stuff I'm keen on in there.

On my pre-1.4 list:
1. compile with elasticsearch 0.90
2. figure out file channel state issue which is stopping Flume logging via
itself.

1. Currently we compile with es 0.19. If somebody wants to run es 0.20 they
have to recompile (es made an interface change that is source compatible
but requires a recompile). es 0.90 has been out for 2-ish weeks so safe
enough to change the compile to. I think I'll raise an empty Jira to record
this.

2. I haven't reported this because I haven't isolated it well enough. I'm
having issues with the 1.3.1 file channel which I'd like to resolve.

Cheers,
Edward

"Hi folks,
We have had over 100 commits since 1.3.1, and a bunch of new features and
improvements including a Thrift source, much improved ElasticSearch sink,
support for a new plugins directory and layout, compression support in the
avro sink/source, improved checkpointing in the file channel and more, plus
a lot of bug fixes.

It seems to me that it's time to start thinking about cutting a 1.4
release. I would be happy to volunteer to RM the release. Worth noting that
I will be unavailable for the next two weeks... but after that I'd be happy
to pick this up and run with it. That's also a decent amount of time for
people  to get moving on patches and reviews for their favorite features,
bug fixes, etc.

If this all sounds OK, I'd like to suggest targeting the last week of June
as a release date. If we can release in time for Hadoop Summit then that
would be pretty nice. Otherwise, if something comes up and we can't get the
release out that week, let's shoot for the first week of July at the latest.

Please let me know your thoughts.

Regards,
Mike

+1 for Flume 1.4
+1 for Mike being RM.


Cheers,
Hari"


Re: spooldir source reading Flume itself and thinking the file has changed (1.3.1)

2013-05-22 Thread Edward Sargisson
Hi Mike,
I haven't tried log4j2 in my environments but my review of the log4j2
change is that it should work.

What would I change?
Phil Scala may have some thoughts.

It would be nice if we thought through the file locking. I want to be able
to put a file in the spooldir and know that Flume isn't going to get
started until I'm ready. This certainly involves thinking about what the
file-putting process is doing but it's not clear to me how to ensure this
whole part is safe.

The thing that is currently annoying is handling stack traces. All logging
systems I've seen (except recent log4j2) output the stack trace with each
frame on a new line. This means that each frame gets its own log event and
the timestamp has to be added by Flume (instead of taken from the original
event). That Flume timestamp might be delayed by up to 1 minute (because of
log rolling so it's pretty crap). Logstash has a multiline filter that
somewhat solves this.

My current approach is to try and get the Log4j2 FlumeAppender and Flume
1.3.1 reliable and trustworthy.

Cheers,
Edward

"Hi Edward,
Did the fixes in LOG4J2-254 fix your file rolling issue?

What are your thoughts on how to improve spooling directory source's error
handling when it detects a change in the file? Just bail and retry later? I
suppose that's a pretty reasonable approach.

Regards,
Mike


On Tue, May 14, 2013 at 4:50 PM, Edward Sargisson  wrote:

> Unless I'm mistaken (and concurrent code is easy to be mistaken about)
this
> is a race condition in apache-log4j-extras RollingFileAppender. I live in
> hope that when log4j2 becomes GA we can move to it and then be able to use
> it to log Flume itself.
>
> Evidence:
> File: castellan-reader.20130514T2058.log.COMPLETED
> 2013-05-14 20:57:05,330  INFO ...
>
> File: castellan-reader.20130514T2058.log
> 2013-05-14 21:23:05,709 DEBUG ...
>
> Why would an event from 2123 be written into a file from 2058?
>
> My understanding of log4j shows that the RollingFileAppenders end up
> calling this:
> FileAppender:
> public  synchronized  void setFile(String fileName, boolean append,
boolean
> bufferedIO, int bufferSize)
>
> Which shortly calls:
> this.qw = new QuietWriter(writer, errorHandler);
>
> However, the code to actually write to the writer is this:
> protected
>   void subAppend(LoggingEvent event) {
> this.qw.write(this.layout.format(event));
>
> Unless I'm mistaken there's no happens-before edge between setting the qw
> and calling subappend. The code path to get to subAppend appears not to go
> through any method synchronized on FileAppender's monitor. this.qw is not
> volatile.
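
Assuming the race described above is real, the usual shape of the fix is to publish
the writer through a volatile field (or read it under the same lock used to set it).
This sketch is illustrative only and is not log4j's code:

import java.io.IOException;
import java.io.Writer;

/** Sketch of the safe-publication fix, not actual log4j source. */
class AppenderHandoffSketch {
  private volatile Writer qw;      // volatile restores the happens-before edge

  synchronized void setFile(Writer newWriter) {
    this.qw = newWriter;           // the swap is now safely published to readers
  }

  void subAppend(String formattedEvent) throws IOException {
    qw.write(formattedEvent);      // always sees the most recently installed writer
  }
}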
>
> Oh, and based on my cursory inspection of the log4j2 code this exists in
> log4j2 as well. I've just raised log4j2-254 to cover it. We'll see if I'm
> actually right...
>
> Cheers,
> Edward
>
>
>
>
> On Mon, May 13, 2013 at 8:45 AM, Edward Sargisson 
> wrote:
>
> > Hi Mike,
> > Based on my reading of the various logging frameworks' source code and
> the
> > Java documentation I come to the conclusion that relying on an atomic
> move
> > is not wise. (Next time I see this I might try and prove that the
spooled
> > file is incomplete).
> >
> > So I suggest two things:
> > 1) A breach of that check should not cause the entire Flume instance to
> > stop passing traffic.
> > 2) A configurable wait time might work. If you're using the spooling
> > source then you've already decided to have some latency so a little more
> is
> > fine. However, there is still a risk of a race condition because there
is
> > no signal that the copy is finished.
> >
> > Cheers,
> > Edward
> >
> > "Hi Edward,
> > Thanks for investigating. I'm definitely open to suggestions for
> > improvement with this. Maybe dying is a bit extreme… the goal was to
> ensure
> > that people could not possibly try to use it to tail a file, which will
> > definitely not work correctly! :)
> >
> > Mike
> >
> >
> >
> > On Fri, May 10, 2013 at 5:02 PM, Edward Sargisson 
> > wrote:
> >
> > > Hi Mike,
> > > I was curious so I went on a bit of a hunt through logger source code.
> > > The result is that loggers can't be relied on to atomically roll the
> > > file so a feature to allow a delay before checking the file would be
> > > of great utility.
> > >
> > > For that matter, having Flume not die completely in this scenario
> > > would also be good.
> > >
> > > apache-log4j-extras does this [1]:
> > > return source.renameTo(destination);
> > >
> > > logback does this [2]:
> > > boolean result = srcFile.renameTo(targetFile);
> > >
> > > log4j2 does this [3]:
> > >  srcStream = new FileInputStream(source);
> > > destStream = new FileOutputStream(destination);
> > > srcChannel = srcStream.getChannel();
> > > destChannel = destStream.getChannel();
> > > destChannel.transferFrom(
> > srcChannel, 0, srcChannel.size());
> > >
> > > The JavaDoc for File.renameTo says:
> > >  Many aspects of the behavior of this method are inherently
> > >  platform-d

[jira] [Commented] (FLUME-1995) CassandraChannel - A Distributed Channel Backed By Apache Cassandra as a Persistent Store for Events

2013-05-22 Thread Edward Sargisson (JIRA)

[ 
https://issues.apache.org/jira/browse/FLUME-1995?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13664219#comment-13664219
 ] 

Edward Sargisson commented on FLUME-1995:
-

I think this is a bad idea. I just finished working on a team for 18 months 
heavily using Cassandra and know a little about its internal design. Generally 
speaking, the advice is that Cassandra is a bad choice for a queue. 

Queuing behaviour means that you have some producers adding items and consumers 
deleting items. Cassandra doesn't really delete - it's an append only system so 
a delete means that it creates a tombstone in the latest SSTable. Then, 
sometime later, a repair process is run which ensures that all the records are 
actually deleted. In the meantime you run the risk of some of the nodes 
replying with a 'latest' record that may have been deleted off some other node 
but the update hasn't propagated yet.

If this is not convincing enough then I'll discuss it on the Cassandra list and 
bring the results back here.

If you happen to want large scalable queueing then a common solution I've seen 
is to use Redis. However, I don't see why you wouldn't use multiple Flume 
agents and file channels to solve the same problem.

> CassandraChannel - A Distributed Channel Backed By Apache Cassandra as a 
> Persistent Store for Events
> 
>
> Key: FLUME-1995
> URL: https://issues.apache.org/jira/browse/FLUME-1995
> Project: Flume
>  Issue Type: New Feature
>  Components: Channel
>Affects Versions: v1.4.0
>Reporter: Israel Ekpo
>Assignee: Israel Ekpo
>
> Apache Cassandra Channel
> The events received by this channel are queued up in Cassandra to be picked 
> up later when sinks send pickup requests to the channel.
> This type of channel is suitable for use cases where recoverability in the 
> event of a hardware failure on the agent machine is important.
> The Cassandra cluster can be located on a remote machine.
> Cassandra also supports replication which could back up and replicate the 
> events further to other nodes.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


Re: [DISCUSS] Flume 1.4 release plan

2013-05-22 Thread Alexander Alten-Lorenz
+1 (non-binding) for both!

Thanks for the effort, Mike!

On May 22, 2013, at 10:09 AM, Jarek Jarcec Cecho  wrote:

> +1 for releasing Flume 1.4
> +1 for Mike being the RM for this release
> 
> Jarcec
> 
> On Wed, May 22, 2013 at 01:02:06AM -0700, Arvind Prabhakar wrote:
>> Thanks for taking this initiative Mike!
>> 
>> +1 for 1.4 and Mike as RM.
>> 
>> Regards,
>> Arvind Prabhakar
>> 
>> On Wed, May 22, 2013 at 12:45 AM, Hari Shreedharan <
>> hshreedha...@cloudera.com> wrote:
>> 
>>> +1 for Flume 1.4
>>> +1 for Mike being RM.
>>> 
>>> 
>>> Cheers,
>>> Hari
>>> 
>>> 
>>> On Wednesday, May 22, 2013 at 12:33 AM, Mike Percy wrote:
>>> 
 Hi folks,
 We have had over 100 commits since 1.3.1, and a bunch of new features and
 improvements including a Thrift source, much improved ElasticSearch sink,
 support for a new plugins directory and layout, compression support in
>>> the
 avro sink/source, improved checkpointing in the file channel and more,
>>> plus
 a lot of bug fixes.
 
 It seems to me that it's time to start thinking about cutting a 1.4
 release. I would be happy to volunteer to RM the release. Worth noting
>>> that
 I will be unavailable for the next two weeks... but after that I'd be
>>> happy
 to pick this up and run with it. That's also a decent amount of time for
 people to get moving on patches and reviews for their favorite features,
 bug fixes, etc.
 
 If this all sounds OK, I'd like to suggest targeting the last week of
>>> June
 as a release date. If we can release in time for Hadoop Summit then that
 would be pretty nice. Otherwise, if something comes up and we can't get
>>> the
 release out that week, let's shoot for the first week of July at the
>>> latest.
 
 Please let me know your thoughts.
 
 Regards,
 Mike
 
 
>>> 
>>> 
>>> 

--
Alexander Alten-Lorenz
http://mapredit.blogspot.com
German Hadoop LinkedIn Group: http://goo.gl/N8pCF



Re: [DISCUSS] Flume 1.4 release plan

2013-05-22 Thread Jarek Jarcec Cecho
+1 for releasing Flume 1.4
+1 for Mike being the RM for this release

Jarcec

On Wed, May 22, 2013 at 01:02:06AM -0700, Arvind Prabhakar wrote:
> Thanks for taking this initiative Mike!
> 
> +1 for 1.4 and Mike as RM.
> 
> Regards,
> Arvind Prabhakar
> 
> On Wed, May 22, 2013 at 12:45 AM, Hari Shreedharan <
> hshreedha...@cloudera.com> wrote:
> 
> > +1 for Flume 1.4
> > +1 for Mike being RM.
> >
> >
> > Cheers,
> > Hari
> >
> >
> > On Wednesday, May 22, 2013 at 12:33 AM, Mike Percy wrote:
> >
> > > Hi folks,
> > > We have had over 100 commits since 1.3.1, and a bunch of new features and
> > > improvements including a Thrift source, much improved ElasticSearch sink,
> > > support for a new plugins directory and layout, compression support in
> > the
> > > avro sink/source, improved checkpointing in the file channel and more,
> > plus
> > > a lot of bug fixes.
> > >
> > > It seems to me that it's time to start thinking about cutting a 1.4
> > > release. I would be happy to volunteer to RM the release. Worth noting
> > that
> > > I will be unavailable for the next two weeks... but after that I'd be
> > happy
> > > to pick this up and run with it. That's also a decent amount of time for
> > > people to get moving on patches and reviews for their favorite features,
> > > bug fixes, etc.
> > >
> > > If this all sounds OK, I'd like to suggest targeting the last week of
> > June
> > > as a release date. If we can release in time for Hadoop Summit then that
> > > would be pretty nice. Otherwise, if something comes up and we can't get
> > the
> > > release out that week, let's shoot for the first week of July at the
> > latest.
> > >
> > > Please let me know your thoughts.
> > >
> > > Regards,
> > > Mike
> > >
> > >
> >
> >
> >




Re: [DISCUSS] Flume 1.4 release plan

2013-05-22 Thread Arvind Prabhakar
Thanks for taking this initiative Mike!

+1 for 1.4 and Mike as RM.

Regards,
Arvind Prabhakar

On Wed, May 22, 2013 at 12:45 AM, Hari Shreedharan <
hshreedha...@cloudera.com> wrote:

> +1 for Flume 1.4
> +1 for Mike being RM.
>
>
> Cheers,
> Hari
>
>
> On Wednesday, May 22, 2013 at 12:33 AM, Mike Percy wrote:
>
> > Hi folks,
> > We have had over 100 commits since 1.3.1, and a bunch of new features and
> > improvements including a Thrift source, much improved ElasticSearch sink,
> > support for a new plugins directory and layout, compression support in
> the
> > avro sink/source, improved checkpointing in the file channel and more,
> plus
> > a lot of bug fixes.
> >
> > It seems to me that it's time to start thinking about cutting a 1.4
> > release. I would be happy to volunteer to RM the release. Worth noting
> that
> > I will be unavailable for the next two weeks... but after that I'd be
> happy
> > to pick this up and run with it. That's also a decent amount of time for
> > people to get moving on patches and reviews for their favorite features,
> > bug fixes, etc.
> >
> > If this all sounds OK, I'd like to suggest targeting the last week of
> June
> > as a release date. If we can release in time for Hadoop Summit then that
> > would be pretty nice. Otherwise, if something comes up and we can't get
> the
> > release out that week, let's shoot for the first week of July at the
> latest.
> >
> > Please let me know your thoughts.
> >
> > Regards,
> > Mike
> >
> >
>
>
>


[jira] [Updated] (FLUME-2048) Avro container file deserializer

2013-05-22 Thread Mike Percy (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLUME-2048?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mike Percy updated FLUME-2048:
--

Attachment: FLUME-2048.patch

> Avro container file deserializer
> 
>
> Key: FLUME-2048
> URL: https://issues.apache.org/jira/browse/FLUME-2048
> Project: Flume
>  Issue Type: New Feature
>  Components: Sinks+Sources
>Reporter: Mike Percy
>Assignee: Mike Percy
> Attachments: FLUME-2048.patch
>
>
> It would be great to support an avro container format deserializer in the 
> spool directory source.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (FLUME-2048) Avro container file deserializer

2013-05-22 Thread Mike Percy (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLUME-2048?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mike Percy updated FLUME-2048:
--

Attachment: (was: FLUME-2048.patch)

> Avro container file deserializer
> 
>
> Key: FLUME-2048
> URL: https://issues.apache.org/jira/browse/FLUME-2048
> Project: Flume
>  Issue Type: New Feature
>  Components: Sinks+Sources
>Reporter: Mike Percy
>Assignee: Mike Percy
> Attachments: FLUME-2048.patch
>
>
> It would be great to support an avro container format deserializer in the 
> spool directory source.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (FLUME-2048) Avro container file deserializer

2013-05-22 Thread Mike Percy (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLUME-2048?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mike Percy updated FLUME-2048:
--

Attachment: FLUME-2048.patch

Unit test is currently failing, but here is some partial progress on this 
feature.

> Avro container file deserializer
> 
>
> Key: FLUME-2048
> URL: https://issues.apache.org/jira/browse/FLUME-2048
> Project: Flume
>  Issue Type: New Feature
>  Components: Sinks+Sources
>Reporter: Mike Percy
>Assignee: Mike Percy
> Attachments: FLUME-2048.patch
>
>
> It would be great to support an avro container format deserializer in the 
> spool directory source.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Created] (FLUME-2048) Avro container file deserializer

2013-05-22 Thread Mike Percy (JIRA)
Mike Percy created FLUME-2048:
-

 Summary: Avro container file deserializer
 Key: FLUME-2048
 URL: https://issues.apache.org/jira/browse/FLUME-2048
 Project: Flume
  Issue Type: New Feature
  Components: Sinks+Sources
Reporter: Mike Percy
 Attachments: FLUME-2048.patch

It would be great to support an avro container format deserializer in the spool 
directory source.
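
Not the attached patch, just an illustrative sketch of reading the Avro container
format that such a deserializer would wrap: the schema travels in the file header,
so records can be iterated with DataFileReader and a GenericDatumReader without any
external schema. Class and method names below are assumptions.

import java.io.File;
import java.io.IOException;
import org.apache.avro.file.DataFileReader;
import org.apache.avro.generic.GenericDatumReader;
import org.apache.avro.generic.GenericRecord;

/** Sketch only: iterate the records of an Avro container (data) file. */
public final class AvroContainerReadSketch {
  public static void dump(File avroFile) throws IOException {
    GenericDatumReader<GenericRecord> datumReader = new GenericDatumReader<>();
    try (DataFileReader<GenericRecord> reader =
             new DataFileReader<>(avroFile, datumReader)) {
      while (reader.hasNext()) {
        System.out.println(reader.next());  // one GenericRecord per Avro datum
      }
    }
  }
}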

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Assigned] (FLUME-2048) Avro container file deserializer

2013-05-22 Thread Mike Percy (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLUME-2048?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mike Percy reassigned FLUME-2048:
-

Assignee: Mike Percy

> Avro container file deserializer
> 
>
> Key: FLUME-2048
> URL: https://issues.apache.org/jira/browse/FLUME-2048
> Project: Flume
>  Issue Type: New Feature
>  Components: Sinks+Sources
>Reporter: Mike Percy
>Assignee: Mike Percy
> Attachments: FLUME-2048.patch
>
>
> It would be great to support an avro container format deserializer in the 
> spool directory source.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


Re: [DISCUSS] Flume 1.4 release plan

2013-05-22 Thread Hari Shreedharan
+1 for Flume 1.4 
+1 for Mike being RM.


Cheers,
Hari


On Wednesday, May 22, 2013 at 12:33 AM, Mike Percy wrote:

> Hi folks,
> We have had over 100 commits since 1.3.1, and a bunch of new features and
> improvements including a Thrift source, much improved ElasticSearch sink,
> support for a new plugins directory and layout, compression support in the
> avro sink/source, improved checkpointing in the file channel and more, plus
> a lot of bug fixes.
> 
> It seems to me that it's time to start thinking about cutting a 1.4
> release. I would be happy to volunteer to RM the release. Worth noting that
> I will be unavailable for the next two weeks... but after that I'd be happy
> to pick this up and run with it. That's also a decent amount of time for
> people to get moving on patches and reviews for their favorite features,
> bug fixes, etc.
> 
> If this all sounds OK, I'd like to suggest targeting the last week of June
> as a release date. If we can release in time for Hadoop Summit then that
> would be pretty nice. Otherwise, if something comes up and we can't get the
> release out that week, let's shoot for the first week of July at the latest.
> 
> Please let me know your thoughts.
> 
> Regards,
> Mike
> 
> 




[DISCUSS] Flume 1.4 release plan

2013-05-22 Thread Mike Percy
Hi folks,
We have had over 100 commits since 1.3.1, and a bunch of new features and
improvements including a Thrift source, much improved ElasticSearch sink,
support for a new plugins directory and layout, compression support in the
avro sink/source, improved checkpointing in the file channel and more, plus
a lot of bug fixes.

It seems to me that it's time to start thinking about cutting a 1.4
release. I would be happy to volunteer to RM the release. Worth noting that
I will be unavailable for the next two weeks... but after that I'd be happy
to pick this up and run with it. That's also a decent amount of time for
people  to get moving on patches and reviews for their favorite features,
bug fixes, etc.

If this all sounds OK, I'd like to suggest targeting the last week of June
as a release date. If we can release in time for Hadoop Summit then that
would be pretty nice. Otherwise, if something comes up and we can't get the
release out that week, let's shoot for the first week of July at the latest.

Please let me know your thoughts.

Regards,
Mike


Re: spooldir source reading Flume itself and thinking the file has changed (1.3.1)

2013-05-22 Thread Mike Percy
Hi Edward,
Did the fixes in LOG4J2-254 fix your file rolling issue?

What are your thoughts on how to improve spooling directory source's error
handling when it detects a change in the file? Just bail and retry later? I
suppose that's a pretty reasonable approach.

Regards,
Mike


On Tue, May 14, 2013 at 4:50 PM, Edward Sargisson  wrote:

> Unless I'm mistaken (and concurrent code is easy to be mistaken about) this
> is a race condition in apache-log4j-extras RollingFileAppender. I live in
> hope that when log4j2 becomes GA we can move to it and then be able to use
> it to log Flume itself.
>
> Evidence:
> File: castellan-reader.20130514T2058.log.COMPLETED
> 2013-05-14 20:57:05,330  INFO ...
>
> File: castellan-reader.20130514T2058.log
> 2013-05-14 21:23:05,709 DEBUG ...
>
> Why would an event from 2123 be written into a file from 2058?
>
> My understanding of log4j shows that the RollingFileAppenders end up
> calling this:
> FileAppender:
> public  synchronized  void setFile(String fileName, boolean append, boolean
> bufferedIO, int bufferSize)
>
> Which shortly calls:
> this.qw = new QuietWriter(writer, errorHandler);
>
> However, the code to actually write to the writer is this:
> protected
>   void subAppend(LoggingEvent event) {
> this.qw.write(this.layout.format(event));
>
> Unless I'm mistaken there's no happens-before edge between setting the qw
> and calling subappend. The code path to get to subAppend appears not to go
> through any method synchronized on FileAppender's monitor. this.qw is not
> volatile.
>
> Oh, and based on my cursory inspection of the log4j2 code this exists in
> log4j2 as well. I've just raised log4j2-254 to cover it. We'll see if I'm
> actually right...
>
> Cheers,
> Edward
>
>
>
>
> On Mon, May 13, 2013 at 8:45 AM, Edward Sargisson 
> wrote:
>
> > Hi Mike,
> > Based on my reading of the various logging frameworks' source code and
> the
> > Java documentation I come to the conclusion that relying on an atomic
> move
> > is not wise. (Next time I see this I might try and prove that the spooled
> > file is incomplete).
> >
> > So I suggest two things:
> > 1) A breach of that check should not cause the entire Flume instance to
> > stop passing traffic.
> > 2) A configurable wait time might work. If you're using the spooling
> > source then you've already decided to have some latency so a little more
> is
> > fine. However, there is still a risk of a race condition because there is
> > no signal that the copy is finished.
> >
> > Cheers,
> > Edward
> >
> > "Hi Edward,
> > Thanks for investigating. I'm definitely open to suggestions for
> > improvement with this. Maybe dying is a bit extreme… the goal was to
> ensure
> > that people could not possibly try to use it to tail a file, which will
> > definitely not work correctly! :)
> >
> > Mike
> >
> >
> >
> > On Fri, May 10, 2013 at 5:02 PM, Edward Sargisson 
> > wrote:
> >
> > > Hi Mike,
> > > I was curious so I went on a bit of a hunt through logger source code.
> > > The result is that loggers can't be relied on to atomically roll the
> > > file so a feature to allow a delay before checking the file would be
> > > of great utility.
> > >
> > > For that matter, having Flume not die completely in this scenario
> > > would also be good.
> > >
> > > apache-log4j-extras does this [1]:
> > > return source.renameTo(destination);
> > >
> > > logback does this [2]:
> > > boolean result = srcFile.renameTo(targetFile);
> > >
> > > log4j2 does this [3]:
> > >  srcStream = new FileInputStream(source);
> > > destStream = new FileOutputStream(destination);
> > > srcChannel = srcStream.getChannel();
> > > destChannel = destStream.getChannel();
> > > destChannel.transferFrom(
> > srcChannel, 0, srcChannel.size());
> > >
> > > The JavaDoc for File.renameTo says:
> > >  Many aspects of the behavior of this method are inherently
> > >  platform-dependent: The rename operation might not be able to move a
> > >  file from one filesystem to another, it might not be atomic, and it
> > >  might not succeed if a file with the destination abstract pathname
> > >  already exists.  The return value should always be checked to make
> sure
> > >  that the rename operation was successful.
> > >
> > > My conclusion is that the loggers (except possibly log4j2) can't be
> > > relied on to atomically roll the file.
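
As an aside on the same conclusion (this is not code from any of the loggers quoted
above): the NIO.2 API lets a caller request an atomic move explicitly and detect when
the platform cannot provide one, unlike File.renameTo(), whose atomicity is
unspecified.

import java.io.IOException;
import java.nio.file.AtomicMoveNotSupportedException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

/** Sketch only: roll a file with an explicitly requested atomic move. */
final class AtomicRollSketch {
  static void roll(Path source, Path destination) throws IOException {
    try {
      Files.move(source, destination, StandardCopyOption.ATOMIC_MOVE);
    } catch (AtomicMoveNotSupportedException e) {
      // Fall back to a non-atomic move; a spooling reader would then need some
      // other signal (a delay or a marker file) that the copy has completed.
      Files.move(source, destination, StandardCopyOption.REPLACE_EXISTING);
    }
  }
}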
> > >
> > > Cheers,
> > > Edward
> > >
> > >
> > > Links:
> > > [1]
> > >
> >
> http://svn.apache.org/viewvc/logging/log4j/companions/extras/trunk/src/main/java/org/apache/log4j/rolling/helper/FileRenameAction.java?view=markup
> > > l77
> > >
> > > [2]
> > >
> >
> https://github.com/qos-ch/logback/blob/master/logback-core/src/main/java/ch/qos/logback/core/rolling/helper/RenameUtil.java
> > > ,
> > > l63
> > > [3]
> > >
> >
> https://svn.apache.org/repos/asf/logging/log4j/log4j2/trunk/core/src/main/java/org/apache/logging/log4j/core/appender/rolling/helper/FileRenameAction.java
> > >
> > >
> > > >Hi Edw