[jira] [Commented] (FLUME-1615) HBaseSink writes byte[] address instead of columnFamily name in the exceptions message
[ https://issues.apache.org/jira/browse/FLUME-1615?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13466074#comment-13466074 ]

Hudson commented on FLUME-1615:
-------------------------------

Integrated in flume-trunk #311 (See [https://builds.apache.org/job/flume-trunk/311/])
FLUME-1615. HBaseSink writes byte[] address instead of columnFamily name in the exceptions message (Revision 74d087d54e29286c5ee0216e8810ca7b5452d0ee)

Result = SUCCESS
hshreedharan : http://git-wip-us.apache.org/repos/asf/flume/repo?p=flume.git&a=commit&h=74d087d54e29286c5ee0216e8810ca7b5452d0ee
Files :
* flume-ng-sinks/flume-ng-hbase-sink/src/main/java/org/apache/flume/sink/hbase/HBaseSink.java

> HBaseSink writes byte[] address instead of columnFamily name in the exceptions message
> ---------------------------------------------------------------------------------------
>
> Key: FLUME-1615
> URL: https://issues.apache.org/jira/browse/FLUME-1615
> Project: Flume
> Issue Type: Improvement
> Components: Sinks+Sources
> Affects Versions: v1.2.0
> Reporter: Matteo Bertozzi
> Assignee: Matteo Bertozzi
> Priority: Trivial
>
> Attachments: FLUME-1615-v1.patch
>
> HBaseSink has two exceptions that print out columnFamily. columnFamily is a byte[] that is not converted to a string during concatenation, so the result is not the expected name:
> {code}
> org.apache.flume.FlumeException: Error getting column family from HBase.Please verify that the table abc and Column Family, [B@4ceafb71 exists in HBase.
> {code}
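The following is an illustrative, pure-JDK sketch of the problem and the kind of fix described above. It is not the actual HBaseSink patch (which presumably uses HBase's own byte[]-to-String helper); the class and variable names are made up for the example.

{code:java}
import java.nio.charset.StandardCharsets;

/** Illustrative only: shows why concatenating a byte[] prints its identity hash. */
public class ColumnFamilyMessageDemo {

    static String brokenMessage(String table, byte[] columnFamily) {
        // byte[] has no useful toString(), so this prints something like "[B@4ceafb71".
        return "Please verify that the table " + table
                + " and Column Family, " + columnFamily + " exists in HBase.";
    }

    static String fixedMessage(String table, byte[] columnFamily) {
        // Decode the bytes before concatenation so the real family name appears.
        return "Please verify that the table " + table
                + " and Column Family, " + new String(columnFamily, StandardCharsets.UTF_8)
                + " exists in HBase.";
    }

    public static void main(String[] args) {
        byte[] cf = "cf1".getBytes(StandardCharsets.UTF_8);
        System.out.println(brokenMessage("abc", cf)); // ... Column Family, [B@... exists ...
        System.out.println(fixedMessage("abc", cf));  // ... Column Family, cf1 exists ...
    }
}
{code}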
[jira] [Commented] (FLUME-1616) FileChannel will lose data in when rollback fails with IOException
[ https://issues.apache.org/jira/browse/FLUME-1616?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13466075#comment-13466075 ]

Hudson commented on FLUME-1616:
-------------------------------

Integrated in flume-trunk #311 (See [https://builds.apache.org/job/flume-trunk/311/])
FLUME-1616: FileChannel will lose data in when rollback fails with IOException (Revision a76beeb61700f08bafeb625040379ebd56e000a6)

Result = SUCCESS
brock : http://git-wip-us.apache.org/repos/asf/flume/repo?p=flume.git&a=commit&h=a76beeb61700f08bafeb625040379ebd56e000a6
Files :
* flume-ng-channels/flume-file-channel/src/main/java/org/apache/flume/channel/file/FileChannel.java

> FileChannel will lose data in when rollback fails with IOException
> -------------------------------------------------------------------
>
> Key: FLUME-1616
> URL: https://issues.apache.org/jira/browse/FLUME-1616
> Project: Flume
> Issue Type: Bug
> Components: Channel
> Reporter: Brock Noland
> Assignee: Hari Shreedharan
> Fix For: v1.3.0
>
> Attachments: FLUME-1616.patch
>
> In doRollback we write to the log first and then to the queue. If the log write fails the takes will be lost unless the agent is stopped quickly and the events are replayed. I think in the case of an IOException we should still put the events back on the queue.
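A minimal sketch of the rollback ordering being discussed, using simplified stand-ins for the FileChannel internals. The Log interface, the queue representation, and the method names below are assumptions for illustration, not the real FLUME-1616 patch.

{code:java}
import java.io.IOException;
import java.util.ArrayDeque;
import java.util.Deque;

class RollbackSketch {

    /** Simplified stand-in for the write-ahead log. */
    interface Log {
        void rollback(long transactionId) throws IOException;
    }

    private final Deque<Long> queue = new ArrayDeque<>(); // pointers to events held by the channel
    private final Log log;

    RollbackSketch(Log log) {
        this.log = log;
    }

    /** Roll back a transaction: the takes must return to the queue even if the log write fails. */
    void doRollback(long transactionId, Deque<Long> takes) throws IOException {
        try {
            log.rollback(transactionId);   // may throw IOException
        } finally {
            // Restore the taken events regardless of the log outcome, so they are not
            // lost if the agent keeps running after the failed rollback.
            while (!takes.isEmpty()) {
                queue.addFirst(takes.removeLast());
            }
        }
    }
}
{code}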
[jira] [Comment Edited] (FLUME-1573) Duplicated HDFS file name when multiple SinkRunner was existing
[ https://issues.apache.org/jira/browse/FLUME-1573?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13466049#comment-13466049 ]

Mike Percy edited comment on FLUME-1573 at 9/29/12 11:35 AM:
-------------------------------------------------------------

Denny, good catch. You're right about that. That should be a constructor argument if we want the sink to control it. I'm open to adding an option for UUID as well.

Edit: by constructor argument I mean BucketWriter constructor argument

was (Author: mpercy):
Denny, good catch. You're right about that. That should be a constructor argument if we want the sink to control it. I'm open to adding an option for UUID as well.

> Duplicated HDFS file name when multiple SinkRunner was existing
> ----------------------------------------------------------------
>
> Key: FLUME-1573
> URL: https://issues.apache.org/jira/browse/FLUME-1573
> Project: Flume
> Issue Type: Bug
> Components: Sinks+Sources
> Affects Versions: v1.2.0
> Reporter: Denny Ye
> Assignee: Denny Ye
> Fix For: v1.3.0
>
> Attachments: FLUME-1573.patch
>
> Multiple HDFS Sinks write events into storage, and a timeout exception keeps occurring:
> {code:xml}
> 11 Sep 2012 07:04:53,478 WARN [SinkRunner-PollingRunner-DefaultSinkProcessor] (org.apache.flume.sink.hdfs.HDFSEventSink.process:442) - HDFS IO error
> java.io.IOException: Callable timed out after 1 ms
>     at org.apache.flume.sink.hdfs.HDFSEventSink.callWithTimeout(HDFSEventSink.java:342)
>     at org.apache.flume.sink.hdfs.HDFSEventSink.append(HDFSEventSink.java:713)
>     at org.apache.flume.sink.hdfs.HDFSEventSink.process(HDFSEventSink.java:412)
>     at org.apache.flume.sink.DefaultSinkProcessor.process(DefaultSinkProcessor.java:68)
>     at org.apache.flume.SinkRunner$PollingRunner.run(SinkRunner.java:147)
>     at java.lang.Thread.run(Thread.java:619)
> Caused by: java.util.concurrent.TimeoutException
>     at java.util.concurrent.FutureTask$Sync.innerGet(FutureTask.java:228)
>     at java.util.concurrent.FutureTask.get(FutureTask.java:91)
>     at org.apache.flume.sink.hdfs.HDFSEventSink.callWithTimeout(HDFSEventSink.java:335)
>     ... 5 more
> {code}
> I suspected an HDFS timeout or a slow response. As expected, I found a duplicate-creation exception for the same file on the HDFS side, and Flume logged the same duplicated file name:
> {code:xml}
> 13 Sep 2012 02:09:35,432 INFO [hdfs-hdfsSink-3-call-runner-7] (org.apache.flume.sink.hdfs.BucketWriter.doOpen:189) - Creating /FLUME/dt=2012-09-13/02-host.1347501924111.tmp
> 13 Sep 2012 02:09:36,425 INFO [hdfs-hdfsSink-4-call-runner-8] (org.apache.flume.sink.hdfs.BucketWriter.doOpen:189) - Creating /FLUME/dt=2012-09-13/02-host.1347501924111.tmp
> {code}
> Different threads were creating the same file without any time conflict. The root cause appears to be incorrect use of the AtomicLong field 'fileExtensionCounter' in BucketWriter: the threads should share a single CAS-protected counter instead of each keeping its own private copy, which does nothing to avoid HDFS path collisions.
[jira] [Commented] (FLUME-1573) Duplicated HDFS file name when multiple SinkRunner was existing
[ https://issues.apache.org/jira/browse/FLUME-1573?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13466049#comment-13466049 ]

Mike Percy commented on FLUME-1573:
------------------------------------

Denny, good catch. You're right about that. That should be a constructor argument if we want the sink to control it. I'm open to adding an option for UUID as well.

> Duplicated HDFS file name when multiple SinkRunner was existing
> ----------------------------------------------------------------
>
> Key: FLUME-1573
> URL: https://issues.apache.org/jira/browse/FLUME-1573
> Project: Flume
> Issue Type: Bug
> Components: Sinks+Sources
> Affects Versions: v1.2.0
> Reporter: Denny Ye
> Assignee: Denny Ye
> Fix For: v1.3.0
>
> Attachments: FLUME-1573.patch
>
> Multiple HDFS Sinks write events into storage, and a timeout exception keeps occurring:
> {code:xml}
> 11 Sep 2012 07:04:53,478 WARN [SinkRunner-PollingRunner-DefaultSinkProcessor] (org.apache.flume.sink.hdfs.HDFSEventSink.process:442) - HDFS IO error
> java.io.IOException: Callable timed out after 1 ms
>     at org.apache.flume.sink.hdfs.HDFSEventSink.callWithTimeout(HDFSEventSink.java:342)
>     at org.apache.flume.sink.hdfs.HDFSEventSink.append(HDFSEventSink.java:713)
>     at org.apache.flume.sink.hdfs.HDFSEventSink.process(HDFSEventSink.java:412)
>     at org.apache.flume.sink.DefaultSinkProcessor.process(DefaultSinkProcessor.java:68)
>     at org.apache.flume.SinkRunner$PollingRunner.run(SinkRunner.java:147)
>     at java.lang.Thread.run(Thread.java:619)
> Caused by: java.util.concurrent.TimeoutException
>     at java.util.concurrent.FutureTask$Sync.innerGet(FutureTask.java:228)
>     at java.util.concurrent.FutureTask.get(FutureTask.java:91)
>     at org.apache.flume.sink.hdfs.HDFSEventSink.callWithTimeout(HDFSEventSink.java:335)
>     ... 5 more
> {code}
> I suspected an HDFS timeout or a slow response. As expected, I found a duplicate-creation exception for the same file on the HDFS side, and Flume logged the same duplicated file name:
> {code:xml}
> 13 Sep 2012 02:09:35,432 INFO [hdfs-hdfsSink-3-call-runner-7] (org.apache.flume.sink.hdfs.BucketWriter.doOpen:189) - Creating /FLUME/dt=2012-09-13/02-host.1347501924111.tmp
> 13 Sep 2012 02:09:36,425 INFO [hdfs-hdfsSink-4-call-runner-8] (org.apache.flume.sink.hdfs.BucketWriter.doOpen:189) - Creating /FLUME/dt=2012-09-13/02-host.1347501924111.tmp
> {code}
> Different threads were creating the same file without any time conflict. The root cause appears to be incorrect use of the AtomicLong field 'fileExtensionCounter' in BucketWriter: the threads should share a single CAS-protected counter instead of each keeping its own private copy, which does nothing to avoid HDFS path collisions.
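A minimal sketch of the "shared counter passed through the constructor" idea from this thread: the sink owns one AtomicLong and hands it to every BucketWriter it creates, so concurrently opened writers cannot pick the same file suffix. The class and method names are illustrative, not the actual Flume HDFS sink code.

{code:java}
import java.util.concurrent.atomic.AtomicLong;

class BucketWriterSketch {
    private final String pathPrefix;
    private final AtomicLong fileExtensionCounter; // shared across all writers of the sink

    BucketWriterSketch(String pathPrefix, AtomicLong fileExtensionCounter) {
        this.pathPrefix = pathPrefix;
        this.fileExtensionCounter = fileExtensionCounter;
    }

    String nextFilePath() {
        // incrementAndGet() is a CAS loop, so two writers never observe the same value.
        return pathPrefix + "." + fileExtensionCounter.incrementAndGet() + ".tmp";
    }
}

class HdfsSinkSketch {
    // One counter per sink; make it static if uniqueness across sinks is also required.
    private final AtomicLong counter = new AtomicLong(System.currentTimeMillis());

    BucketWriterSketch newWriter(String pathPrefix) {
        return new BucketWriterSketch(pathPrefix, counter);
    }
}
{code}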
[jira] [Assigned] (FLUME-1491) Dynamic configuration from Zookeeper watcher
[ https://issues.apache.org/jira/browse/FLUME-1491?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Hari Shreedharan reassigned FLUME-1491:
----------------------------------------

Assignee: Christopher Nagy

Assigning to Christopher since he submitted the first 2 patches.

> Dynamic configuration from Zookeeper watcher
> ---------------------------------------------
>
> Key: FLUME-1491
> URL: https://issues.apache.org/jira/browse/FLUME-1491
> Project: Flume
> Issue Type: Improvement
> Components: Configuration
> Affects Versions: v1.2.0
> Reporter: Denny Ye
> Assignee: Christopher Nagy
> Labels: Zookeeper
> Fix For: v1.3.0
>
> Attachments: FLUME-1491-2.patch, FLUME-1491-3.patch
>
> Currently, Flume only supports file-level dynamic configuration. Another frequent need in practical environments is to manage the configuration in Zookeeper and update the stored file in Zookeeper from a Web UI. Flume should support this with a Zookeeper watcher.
[jira] [Assigned] (FLUME-1491) Dynamic configuration from Zookeeper watcher
[ https://issues.apache.org/jira/browse/FLUME-1491?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Denny Ye reassigned FLUME-1491:
---------------------------------

Assignee: (was: Denny Ye)

> Dynamic configuration from Zookeeper watcher
> ---------------------------------------------
>
> Key: FLUME-1491
> URL: https://issues.apache.org/jira/browse/FLUME-1491
> Project: Flume
> Issue Type: Improvement
> Components: Configuration
> Affects Versions: v1.2.0
> Reporter: Denny Ye
> Labels: Zookeeper
> Fix For: v1.3.0
>
> Attachments: FLUME-1491-2.patch, FLUME-1491-3.patch
>
> Currently, Flume only supports file-level dynamic configuration. Another frequent need in practical environments is to manage the configuration in Zookeeper and update the stored file in Zookeeper from a Web UI. Flume should support this with a Zookeeper watcher.
[jira] [Commented] (FLUME-1491) Dynamic configuration from Zookeeper watcher
[ https://issues.apache.org/jira/browse/FLUME-1491?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13465997#comment-13465997 ]

Hari Shreedharan commented on FLUME-1491:
------------------------------------------

Good work on the patch, some comments before we start a full review:

* AbstractConfigurationProvider and related classes should not be in flume-ng-core. Either leave them in the current package or move them to flume-ng-configuration.
* Ideally flume-ng-configuration should not have dependencies on any other flume modules. The original idea was to allow this to be used for validating individual components' configuration. Eventually, I'd like to complete that work.
* The Zookeeper provider can simply be a separate package and not a submodule, right? Do we need it to be a separate module - I am ok with either, just asking.

When you feel the patch is ready for review/commit, can you submit this for review on reviewboard?

> Dynamic configuration from Zookeeper watcher
> ---------------------------------------------
>
> Key: FLUME-1491
> URL: https://issues.apache.org/jira/browse/FLUME-1491
> Project: Flume
> Issue Type: Improvement
> Components: Configuration
> Affects Versions: v1.2.0
> Reporter: Denny Ye
> Assignee: Denny Ye
> Labels: Zookeeper
> Fix For: v1.3.0
>
> Attachments: FLUME-1491-2.patch, FLUME-1491-3.patch
>
> Currently, Flume only supports file-level dynamic configuration. Another frequent need in practical environments is to manage the configuration in Zookeeper and update the stored file in Zookeeper from a Web UI. Flume should support this with a Zookeeper watcher.
[jira] [Commented] (FLUME-1573) Duplicated HDFS file name when multiple SinkRunner was existing
[ https://issues.apache.org/jira/browse/FLUME-1573?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13465993#comment-13465993 ]

Denny Ye commented on FLUME-1573:
----------------------------------

@Brock, you're right: UUID is a standard way to get a unique path, and it's another option for us.

@Arvind, the collision can also occur between different BucketWriters in the same Sink. The root cause is the same whether it involves different Sinks or different BucketWriters: if multiple BucketWriters are created at nearly the same timestamp, a collision can happen.

> Duplicated HDFS file name when multiple SinkRunner was existing
> ----------------------------------------------------------------
>
> Key: FLUME-1573
> URL: https://issues.apache.org/jira/browse/FLUME-1573
> Project: Flume
> Issue Type: Bug
> Components: Sinks+Sources
> Affects Versions: v1.2.0
> Reporter: Denny Ye
> Assignee: Denny Ye
> Fix For: v1.3.0
>
> Attachments: FLUME-1573.patch
>
> Multiple HDFS Sinks write events into storage, and a timeout exception keeps occurring:
> {code:xml}
> 11 Sep 2012 07:04:53,478 WARN [SinkRunner-PollingRunner-DefaultSinkProcessor] (org.apache.flume.sink.hdfs.HDFSEventSink.process:442) - HDFS IO error
> java.io.IOException: Callable timed out after 1 ms
>     at org.apache.flume.sink.hdfs.HDFSEventSink.callWithTimeout(HDFSEventSink.java:342)
>     at org.apache.flume.sink.hdfs.HDFSEventSink.append(HDFSEventSink.java:713)
>     at org.apache.flume.sink.hdfs.HDFSEventSink.process(HDFSEventSink.java:412)
>     at org.apache.flume.sink.DefaultSinkProcessor.process(DefaultSinkProcessor.java:68)
>     at org.apache.flume.SinkRunner$PollingRunner.run(SinkRunner.java:147)
>     at java.lang.Thread.run(Thread.java:619)
> Caused by: java.util.concurrent.TimeoutException
>     at java.util.concurrent.FutureTask$Sync.innerGet(FutureTask.java:228)
>     at java.util.concurrent.FutureTask.get(FutureTask.java:91)
>     at org.apache.flume.sink.hdfs.HDFSEventSink.callWithTimeout(HDFSEventSink.java:335)
>     ... 5 more
> {code}
> I suspected an HDFS timeout or a slow response. As expected, I found a duplicate-creation exception for the same file on the HDFS side, and Flume logged the same duplicated file name:
> {code:xml}
> 13 Sep 2012 02:09:35,432 INFO [hdfs-hdfsSink-3-call-runner-7] (org.apache.flume.sink.hdfs.BucketWriter.doOpen:189) - Creating /FLUME/dt=2012-09-13/02-host.1347501924111.tmp
> 13 Sep 2012 02:09:36,425 INFO [hdfs-hdfsSink-4-call-runner-8] (org.apache.flume.sink.hdfs.BucketWriter.doOpen:189) - Creating /FLUME/dt=2012-09-13/02-host.1347501924111.tmp
> {code}
> Different threads were creating the same file without any time conflict. The root cause appears to be incorrect use of the AtomicLong field 'fileExtensionCounter' in BucketWriter: the threads should share a single CAS-protected counter instead of each keeping its own private copy, which does nothing to avoid HDFS path collisions.
Re: [DISCUSS] 1.3.0 release
Brock,

Coming up with a 1.3.0 release sounds like a good idea. +1 for you as the RM for the release.

Regards,
Mike

On Fri, Sep 28, 2012 at 9:26 AM, Brock Noland wrote:
> Hi,
>
> I think it's about time to release, 93 issues have been resolved for 1.3: http://s.apache.org/0C and users (on user@) have said they are running 1.3.0-SNAPSHOT. The file channel has seen a large number of fixes and enhancements including Encryption. We have new sources (MultiPort Syslog / Scribe) and new features (metrics, batching support for many sinks/sources).
>
> I'd like to volunteer as the Release Manager for the 1.3 release.
>
> Cheers!
> Brock
[jira] [Updated] (FLUME-1491) Dynamic configuration from Zookeeper watcher
[ https://issues.apache.org/jira/browse/FLUME-1491?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Christopher Nagy updated FLUME-1491:
-------------------------------------

Attachment: FLUME-1491-3.patch

Same as FLUME-1491-2.patch, but added the capability to read the contents of a property file from the root flume node.

> Dynamic configuration from Zookeeper watcher
> ---------------------------------------------
>
> Key: FLUME-1491
> URL: https://issues.apache.org/jira/browse/FLUME-1491
> Project: Flume
> Issue Type: Improvement
> Components: Configuration
> Affects Versions: v1.2.0
> Reporter: Denny Ye
> Assignee: Denny Ye
> Labels: Zookeeper
> Fix For: v1.3.0
>
> Attachments: FLUME-1491-2.patch, FLUME-1491-3.patch
>
> Currently, Flume only supports file-level dynamic configuration. Another frequent need in practical environments is to manage the configuration in Zookeeper and update the stored file in Zookeeper from a Web UI. Flume should support this with a Zookeeper watcher.
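A minimal sketch, using the plain ZooKeeper client API, of the watcher-driven reload this issue describes. The node path, the reload hook, and the class name are assumptions for illustration and not the API of the attached patch.

{code:java}
import java.nio.charset.StandardCharsets;
import org.apache.zookeeper.WatchedEvent;
import org.apache.zookeeper.Watcher;
import org.apache.zookeeper.ZooKeeper;

class ZkConfigWatcherSketch implements Watcher {
    private static final String CONFIG_NODE = "/flume/agent1.properties"; // assumed node layout
    private final ZooKeeper zk;

    ZkConfigWatcherSketch(String connectString) throws Exception {
        this.zk = new ZooKeeper(connectString, 30_000, this);
        reload(); // initial read also registers the watch
    }

    private void reload() throws Exception {
        // getData re-arms the watch on every read, so later changes keep triggering process().
        byte[] data = zk.getData(CONFIG_NODE, this, null);
        String properties = new String(data, StandardCharsets.UTF_8);
        // Hand the properties to whatever provider re-configures the agent.
        System.out.println("Reloaded configuration:\n" + properties);
    }

    @Override
    public void process(WatchedEvent event) {
        if (event.getType() == Event.EventType.NodeDataChanged) {
            try {
                reload();
            } catch (Exception e) {
                e.printStackTrace();
            }
        }
    }
}
{code}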
Re: [DISCUSS] 1.3.0 release
+1

Thanks,
Mubarak

On Sep 28, 2012, at 11:31 AM, Hari Shreedharan wrote:

> I agree it is a good time for a release. Thanks Brock for volunteering to be the RM!
>
> My +1 for both!
>
> Hari
>
> --
> Hari Shreedharan
>
> On Friday, September 28, 2012 at 10:13 AM, Will McQueen wrote:
>
>> +1
>>
>> On Fri, Sep 28, 2012 at 9:26 AM, Brock Noland <br...@cloudera.com (mailto:br...@cloudera.com)> wrote:
>>
>>> Hi,
>>>
>>> I think it's about time to release, 93 issues have been resolved for 1.3: http://s.apache.org/0C and users (on user@) have said they are running 1.3.0-SNAPSHOT. The file channel has seen a large number of fixes and enhancements including Encryption. We have new sources (MultiPort Syslog / Scribe) and new features (metrics, batching support for many sinks/sources).
>>>
>>> I'd like to volunteer as the Release Manager for the 1.3 release.
>>>
>>> Cheers!
>>> Brock
Re: [DISCUSS] 1.3.0 release
I agree it is a good time for a release. Thanks Brock for volunteering to be the RM!

My +1 for both!

Hari

--
Hari Shreedharan

On Friday, September 28, 2012 at 10:13 AM, Will McQueen wrote:

> +1
>
> On Fri, Sep 28, 2012 at 9:26 AM, Brock Noland <br...@cloudera.com (mailto:br...@cloudera.com)> wrote:
>
>> Hi,
>>
>> I think it's about time to release, 93 issues have been resolved for 1.3: http://s.apache.org/0C and users (on user@) have said they are running 1.3.0-SNAPSHOT. The file channel has seen a large number of fixes and enhancements including Encryption. We have new sources (MultiPort Syslog / Scribe) and new features (metrics, batching support for many sinks/sources).
>>
>> I'd like to volunteer as the Release Manager for the 1.3 release.
>>
>> Cheers!
>> Brock
Re: [DISCUSS] 1.3.0 release
+1

On Fri, Sep 28, 2012 at 9:26 AM, Brock Noland wrote:
> Hi,
>
> I think it's about time to release, 93 issues have been resolved for 1.3: http://s.apache.org/0C and users (on user@) have said they are running 1.3.0-SNAPSHOT. The file channel has seen a large number of fixes and enhancements including Encryption. We have new sources (MultiPort Syslog / Scribe) and new features (metrics, batching support for many sinks/sources).
>
> I'd like to volunteer as the Release Manager for the 1.3 release.
>
> Cheers!
> Brock
[DISCUSS] 1.3.0 release
Hi,

I think it's about time to release, 93 issues have been resolved for 1.3: http://s.apache.org/0C and users (on user@) have said they are running 1.3.0-SNAPSHOT. The file channel has seen a large number of fixes and enhancements including Encryption. We have new sources (MultiPort Syslog / Scribe) and new features (metrics, batching support for many sinks/sources).

I'd like to volunteer as the Release Manager for the 1.3 release.

Cheers!
Brock
[jira] [Commented] (FLUME-1573) Duplicated HDFS file name when multiple SinkRunner was existing
[ https://issues.apache.org/jira/browse/FLUME-1573?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13465699#comment-13465699 ]

Brock Noland commented on FLUME-1573:
--------------------------------------

FLUME-1370 suggests using a UUID to generate the timestamp portion. How do we feel about that?

> Duplicated HDFS file name when multiple SinkRunner was existing
> ----------------------------------------------------------------
>
> Key: FLUME-1573
> URL: https://issues.apache.org/jira/browse/FLUME-1573
> Project: Flume
> Issue Type: Bug
> Components: Sinks+Sources
> Affects Versions: v1.2.0
> Reporter: Denny Ye
> Assignee: Denny Ye
> Fix For: v1.3.0
>
> Attachments: FLUME-1573.patch
>
> Multiple HDFS Sinks write events into storage, and a timeout exception keeps occurring:
> {code:xml}
> 11 Sep 2012 07:04:53,478 WARN [SinkRunner-PollingRunner-DefaultSinkProcessor] (org.apache.flume.sink.hdfs.HDFSEventSink.process:442) - HDFS IO error
> java.io.IOException: Callable timed out after 1 ms
>     at org.apache.flume.sink.hdfs.HDFSEventSink.callWithTimeout(HDFSEventSink.java:342)
>     at org.apache.flume.sink.hdfs.HDFSEventSink.append(HDFSEventSink.java:713)
>     at org.apache.flume.sink.hdfs.HDFSEventSink.process(HDFSEventSink.java:412)
>     at org.apache.flume.sink.DefaultSinkProcessor.process(DefaultSinkProcessor.java:68)
>     at org.apache.flume.SinkRunner$PollingRunner.run(SinkRunner.java:147)
>     at java.lang.Thread.run(Thread.java:619)
> Caused by: java.util.concurrent.TimeoutException
>     at java.util.concurrent.FutureTask$Sync.innerGet(FutureTask.java:228)
>     at java.util.concurrent.FutureTask.get(FutureTask.java:91)
>     at org.apache.flume.sink.hdfs.HDFSEventSink.callWithTimeout(HDFSEventSink.java:335)
>     ... 5 more
> {code}
> I suspected an HDFS timeout or a slow response. As expected, I found a duplicate-creation exception for the same file on the HDFS side, and Flume logged the same duplicated file name:
> {code:xml}
> 13 Sep 2012 02:09:35,432 INFO [hdfs-hdfsSink-3-call-runner-7] (org.apache.flume.sink.hdfs.BucketWriter.doOpen:189) - Creating /FLUME/dt=2012-09-13/02-host.1347501924111.tmp
> 13 Sep 2012 02:09:36,425 INFO [hdfs-hdfsSink-4-call-runner-8] (org.apache.flume.sink.hdfs.BucketWriter.doOpen:189) - Creating /FLUME/dt=2012-09-13/02-host.1347501924111.tmp
> {code}
> Different threads were creating the same file without any time conflict. The root cause appears to be incorrect use of the AtomicLong field 'fileExtensionCounter' in BucketWriter: the threads should share a single CAS-protected counter instead of each keeping its own private copy, which does nothing to avoid HDFS path collisions.
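A minimal sketch of the UUID-based naming that FLUME-1370 is said to suggest: replace (or supplement) the timestamp portion of the file name with a random UUID so that writers created in the same millisecond cannot collide. The path layout and helper name below are illustrative only.

{code:java}
import java.util.UUID;

class UuidFileNameSketch {
    static String nextTmpFile(String bucketPath, String hostName) {
        // e.g. /FLUME/dt=2012-09-13/02-host.3f1c2a9e-....tmp
        return bucketPath + "/" + hostName + "." + UUID.randomUUID() + ".tmp";
    }

    public static void main(String[] args) {
        System.out.println(nextTmpFile("/FLUME/dt=2012-09-13", "02-host"));
    }
}
{code}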
[jira] [Commented] (FLUME-1573) Duplicated HDFS file name when multiple SinkRunner was existing
[ https://issues.apache.org/jira/browse/FLUME-1573?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13465681#comment-13465681 ]

Arvind Prabhakar commented on FLUME-1573:
------------------------------------------

@Denny - a sink is an independent, isolated component of Flume. It cannot assume any knowledge of other sink(s) operating within the same agent. Having a synchronization requirement across multiple sinks breaks this invariant. However, if within the same sink there are problems due to collisions between different bucket writers, that would be a bug and merits fixing. From the explanation above that does not seem to be the case to me.

> Duplicated HDFS file name when multiple SinkRunner was existing
> ----------------------------------------------------------------
>
> Key: FLUME-1573
> URL: https://issues.apache.org/jira/browse/FLUME-1573
> Project: Flume
> Issue Type: Bug
> Components: Sinks+Sources
> Affects Versions: v1.2.0
> Reporter: Denny Ye
> Assignee: Denny Ye
> Fix For: v1.3.0
>
> Attachments: FLUME-1573.patch
>
> Multiple HDFS Sinks write events into storage, and a timeout exception keeps occurring:
> {code:xml}
> 11 Sep 2012 07:04:53,478 WARN [SinkRunner-PollingRunner-DefaultSinkProcessor] (org.apache.flume.sink.hdfs.HDFSEventSink.process:442) - HDFS IO error
> java.io.IOException: Callable timed out after 1 ms
>     at org.apache.flume.sink.hdfs.HDFSEventSink.callWithTimeout(HDFSEventSink.java:342)
>     at org.apache.flume.sink.hdfs.HDFSEventSink.append(HDFSEventSink.java:713)
>     at org.apache.flume.sink.hdfs.HDFSEventSink.process(HDFSEventSink.java:412)
>     at org.apache.flume.sink.DefaultSinkProcessor.process(DefaultSinkProcessor.java:68)
>     at org.apache.flume.SinkRunner$PollingRunner.run(SinkRunner.java:147)
>     at java.lang.Thread.run(Thread.java:619)
> Caused by: java.util.concurrent.TimeoutException
>     at java.util.concurrent.FutureTask$Sync.innerGet(FutureTask.java:228)
>     at java.util.concurrent.FutureTask.get(FutureTask.java:91)
>     at org.apache.flume.sink.hdfs.HDFSEventSink.callWithTimeout(HDFSEventSink.java:335)
>     ... 5 more
> {code}
> I suspected an HDFS timeout or a slow response. As expected, I found a duplicate-creation exception for the same file on the HDFS side, and Flume logged the same duplicated file name:
> {code:xml}
> 13 Sep 2012 02:09:35,432 INFO [hdfs-hdfsSink-3-call-runner-7] (org.apache.flume.sink.hdfs.BucketWriter.doOpen:189) - Creating /FLUME/dt=2012-09-13/02-host.1347501924111.tmp
> 13 Sep 2012 02:09:36,425 INFO [hdfs-hdfsSink-4-call-runner-8] (org.apache.flume.sink.hdfs.BucketWriter.doOpen:189) - Creating /FLUME/dt=2012-09-13/02-host.1347501924111.tmp
> {code}
> Different threads were creating the same file without any time conflict. The root cause appears to be incorrect use of the AtomicLong field 'fileExtensionCounter' in BucketWriter: the threads should share a single CAS-protected counter instead of each keeping its own private copy, which does nothing to avoid HDFS path collisions.