[jira] [Created] (FLUME-2500) Add a channel that uses Kafka

2014-10-13 Thread Hari Shreedharan (JIRA)
Hari Shreedharan created FLUME-2500:
---

 Summary: Add a channel that uses Kafka 
 Key: FLUME-2500
 URL: https://issues.apache.org/jira/browse/FLUME-2500
 Project: Flume
  Issue Type: Bug
Reporter: Hari Shreedharan
Assignee: Hari Shreedharan


Here is the rationale:
- Kafka gives us an HA channel, which means a dead agent does not affect the 
data in the channel - thus reducing delivery delay.
- Kafka is used by many companies - it would be useful for Flume to 
pull data from Kafka and write it to HDFS/HBase etc. 

This channel is not going to be useful where Kafka is not already 
in use, since it brings in the operational overhead of maintaining two systems, 
but where Kafka is already deployed, this is a good way to integrate Kafka and Flume.

Here is a scratch implementation: 
https://github.com/harishreedharan/flume/blob/kafka-channel/flume-ng-channels/flume-kafka-channel/src/main/java/org/apache/flume/channel/kafka/KafkaChannel.java
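For context, a Kafka-backed channel would be wired into an agent configuration like any other channel type. The sketch below is purely illustrative: the channel class name comes from the linked file, but the property keys (brokerList, topic) are invented here, not taken from the implementation.

```properties
# Hypothetical agent wiring for a Kafka-backed channel (keys are illustrative only)
agent.channels = kafkachan
agent.channels.kafkachan.type = org.apache.flume.channel.kafka.KafkaChannel
agent.channels.kafkachan.brokerList = kafkahost1:9092,kafkahost2:9092
agent.channels.kafkachan.topic = flume-channel
agent.sources.rawtext.channels = kafkachan
agent.sinks.hdfssink.channel = kafkachan
```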



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (FLUME-2298) Replication Channel

2014-10-13 Thread Hari Shreedharan (JIRA)

[ 
https://issues.apache.org/jira/browse/FLUME-2298?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14168983#comment-14168983
 ] 

Hari Shreedharan commented on FLUME-2298:
-

I also wrote one a few weeks ago, which uses Kafka: 
https://github.com/harishreedharan/flume/blob/kafka-channel/flume-ng-channels/flume-kafka-channel/src/main/java/org/apache/flume/channel/kafka/KafkaChannel.java

 Replication Channel
 ---

 Key: FLUME-2298
 URL: https://issues.apache.org/jira/browse/FLUME-2298
 Project: Flume
  Issue Type: New Feature
  Components: Channel
Reporter: Ted Malaska
Assignee: Ted Malaska
 Attachments: Flume Replication Channel Design.0.3.pdf, 
 FlumeDistributedChannelDesign.0.1.pdf, 
 FlumeDistributedChannelDesign.0.2.1.pdf, FlumeDistributedChannelDesign.0.2.pdf


 This channel will allow events to be persisted, via a pluggable method, on 
 more than one agent or node.
 The goal is to gain the following benefits:
 1. Events will continue to flow to sinks without loss or large delay, even in 
 the case of node failure.
 2. Protect against loss in the case of a complete single-node failure.





[jira] [Commented] (FLUME-2298) Replication Channel

2014-10-13 Thread Ashish Paliwal (JIRA)

[ 
https://issues.apache.org/jira/browse/FLUME-2298?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14168997#comment-14168997
 ] 

Ashish Paliwal commented on FLUME-2298:
---

Cool! 

A big +1 for this, let's get this into the release.






[jira] [Commented] (FLUME-2298) Replication Channel

2014-10-13 Thread Ted Malaska (JIRA)

[ 
https://issues.apache.org/jira/browse/FLUME-2298?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14169215#comment-14169215
 ] 

Ted Malaska commented on FLUME-2298:


Yes +1 to the Kafka Channel.  

This opens up so many options, and it also allows improvements to the channel 
to be developed outside of Flume.

Thank you Hari






Re: Review Request 16179: Patch for FLUME-2246 - event body data size can make it configurable for logger sinker

2014-10-13 Thread Ashish Paliwal


 On Oct. 11, 2014, 8:11 p.m., Jarek Cecho wrote:
  flume-ng-core/src/test/java/org/apache/flume/sink/TestLoggerSink.java, 
  lines 68-89
  https://reviews.apache.org/r/16179/diff/1/?file=396539#file396539line68
 
  I'm missing validation of inserted data. Shouldn't there be some assert 
  statements?
  
  Also the test itself can be probably enhanced. What about having 
  constant message (10 bytes) and changing the maxBytesToDump option from 1 
  to 20 to ensure that we get on the output expected outcome?

This is LoggerSink writing to a log file; I am not sure how to intercept the 
written message and apply an assert on it. The test is built upon an existing 
test case. Any pointers?
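One way to assert on a logging sink's output is to intercept the logging framework rather than the file. LoggerSink itself logs through slf4j/log4j, but the general technique can be sketched with java.util.logging as a stand-in: attach an in-memory Handler, emit, then assert on the captured records. Everything below, including the `maxBytesToDump` truncation, is illustrative rather than taken from the Flume test suite.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.logging.Handler;
import java.util.logging.Level;
import java.util.logging.LogRecord;
import java.util.logging.Logger;

public class CapturingHandlerDemo {
    // An in-memory handler that stores every log message it receives.
    static class CapturingHandler extends Handler {
        final List<String> messages = new ArrayList<>();
        @Override public void publish(LogRecord record) { messages.add(record.getMessage()); }
        @Override public void flush() {}
        @Override public void close() {}
    }

    public static void main(String[] args) {
        Logger logger = Logger.getLogger("demo.sink");
        logger.setUseParentHandlers(false);   // keep records out of the console
        CapturingHandler capture = new CapturingHandler();
        logger.addHandler(capture);
        logger.setLevel(Level.INFO);

        // Simulate the sink logging a truncated event body.
        String body = "0123456789";
        int maxBytesToDump = 4;               // hypothetical option, as discussed in the review
        logger.info("Event: " + body.substring(0, Math.min(maxBytesToDump, body.length())));

        // The captured message can now be asserted on directly.
        String logged = capture.messages.get(0);
        if (!logged.equals("Event: 0123")) throw new AssertionError(logged);
        System.out.println(logged);
    }
}
```

The same pattern works with log4j 1.x by extending its appender base class and registering it on the sink's logger before running the test.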


- Ashish


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/16179/#review56298
---


On Dec. 11, 2013, 9:58 a.m., Ashish Paliwal wrote:
 
 ---
 This is an automatically generated e-mail. To reply, visit:
 https://reviews.apache.org/r/16179/
 ---
 
 (Updated Dec. 11, 2013, 9:58 a.m.)
 
 
 Review request for Flume.
 
 
 Bugs: FLUME-2246
 https://issues.apache.org/jira/browse/FLUME-2246
 
 
 Repository: flume-git
 
 
 Description
 ---
 
 Added configuration options to let the user select the max bytes to dump in 
 the log file. The default stays at 16 bytes.
 
 Change summary:
 - Made LoggerSink implement Configurable
 - User can provide maxByteToDump via configuration
 - User Guide updated
 
 Diffs
 -
 
   flume-ng-core/src/main/java/org/apache/flume/sink/LoggerSink.java 128fa84 
   flume-ng-core/src/test/java/org/apache/flume/sink/TestLoggerSink.java 
 92ff6fe 
   flume-ng-doc/sphinx/FlumeUserGuide.rst ae66f89 
 
 Diff: https://reviews.apache.org/r/16179/diff/
 
 
 Testing
 ---
 
 Added a test case that sets the option to a value higher than 30 and uses a 
 string of length greater than 16 in the output.
 
 
 Thanks,
 
 Ashish Paliwal
 




[jira] [Commented] (FLUME-2500) Add a channel that uses Kafka

2014-10-13 Thread Jarek Jarcec Cecho (JIRA)

[ 
https://issues.apache.org/jira/browse/FLUME-2500?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14169427#comment-14169427
 ] 

Jarek Jarcec Cecho commented on FLUME-2500:
---

I like the idea of a Kafka channel, so I'm +1!






[jira] [Commented] (FLUME-2451) HDFS Sink Cannot Reconnect After NameNode Restart

2014-10-13 Thread Hari Shreedharan (JIRA)

[ 
https://issues.apache.org/jira/browse/FLUME-2451?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14169844#comment-14169844
 ] 

Hari Shreedharan commented on FLUME-2451:
-

Roshan:

- Do you think we should close *all* bucket writers on an HDFS IOException? Is 
there a specific way to identify whether the namenode is down? We don't really 
need to close all bucket writers in all cases.
- The sfWritersMap is guarded by a lock - you should use the same locking 
mechanism in this method too.

Also, the 2nd patch only has the unit test, not the code fix. Can you merge the 
two patches? Thanks!
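The locking point above is the general rule that every access to a shared map, including a close-all path, must go through the same lock. A minimal stand-alone sketch of that pattern, with invented names (the real sfWritersMap and bucket-writer types are not reproduced here):

```java
import java.util.HashMap;
import java.util.Map;

public class GuardedWritersDemo {
    // Stand-in for a bucket writer; the HDFS-specific types are not needed
    // to show the locking pattern.
    static class Writer {
        boolean closed = false;
        void close() { closed = true; }
    }

    private final Object writersLock = new Object();   // guards writersMap
    private final Map<String, Writer> writersMap = new HashMap<>();

    void put(String path, Writer w) {
        synchronized (writersLock) { writersMap.put(path, w); }
    }

    // The close-all path takes the same lock as every other access;
    // otherwise it races with concurrent put/remove calls.
    int closeAll() {
        synchronized (writersLock) {
            for (Writer w : writersMap.values()) w.close();
            int n = writersMap.size();
            writersMap.clear();
            return n;
        }
    }

    public static void main(String[] args) {
        GuardedWritersDemo demo = new GuardedWritersDemo();
        demo.put("/a", new Writer());
        demo.put("/b", new Writer());
        System.out.println(demo.closeAll()); // prints 2
    }
}
```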

 HDFS Sink Cannot Reconnect After NameNode Restart
 -

 Key: FLUME-2451
 URL: https://issues.apache.org/jira/browse/FLUME-2451
 Project: Flume
  Issue Type: Bug
  Components: File Channel, Sinks+Sources
Affects Versions: v1.4.0
 Environment: 8 node CDH 4.2.2 (2.0.0-cdh4.2.2) cluster
 All cluster machines are running Ubuntu 12.04 x86_64
Reporter: Andrew O'Neill
Assignee: Roshan Naik
  Labels: HDFS, Sink
 Attachments: FLUME-2451.patch, FLUME-2451.v2.patch


 I am testing a simple flume setup with a Sequence Generator Source, a File 
 Channel, and an HDFS Sink (see my flume.conf below). This configuration works 
 as expected until I reboot the cluster’s NameNode or until I restart the HDFS 
 service on the cluster. At this point, it appears that the Flume Agent cannot 
 reconnect to HDFS and must be manually restarted.
 Here is our flume.conf:
 appserver.sources = rawtext
 appserver.channels = testchannel
 appserver.sinks = test_sink
 appserver.sources.rawtext.type = seq
 appserver.sources.rawtext.channels = testchannel
 appserver.channels.testchannel.type = file
 appserver.channels.testchannel.capacity = 1000
 appserver.channels.testchannel.minimumRequiredSpace = 214748364800
 appserver.channels.testchannel.checkpointDir = /Users/aoneill/Desktop/testchannel/checkpoint
 appserver.channels.testchannel.dataDirs = /Users/aoneill/Desktop/testchannel/data
 appserver.channels.testchannel.maxFileSize = 2000
 appserver.sinks.test_sink.type = hdfs
 appserver.sinks.test_sink.channel = testchannel
 appserver.sinks.test_sink.hdfs.path = hdfs://cluster01:8020/user/aoneill/flumetest
 appserver.sinks.test_sink.hdfs.closeTries = 3
 appserver.sinks.test_sink.hdfs.filePrefix = events-
 appserver.sinks.test_sink.hdfs.fileSuffix = .avro
 appserver.sinks.test_sink.hdfs.fileType = DataStream
 appserver.sinks.test_sink.hdfs.writeFormat = Text
 appserver.sinks.test_sink.hdfs.inUsePrefix = inuse-
 appserver.sinks.test_sink.hdfs.inUseSuffix = .avro
 appserver.sinks.test_sink.hdfs.rollCount = 10
 appserver.sinks.test_sink.hdfs.rollInterval = 30
 appserver.sinks.test_sink.hdfs.rollSize = 10485760
 These are the two error messages that the Flume Agent outputs constantly after 
 the restart:
 2014-08-26 10:47:24,572 (SinkRunner-PollingRunner-DefaultSinkProcessor) 
 [ERROR - 
 org.apache.flume.sink.hdfs.AbstractHDFSWriter.isUnderReplicated(AbstractHDFSWriter.java:96)]
  Unexpected error while checking replication factor
 java.lang.reflect.InvocationTargetException
 at sun.reflect.GeneratedMethodAccessor7.invoke(Unknown Source)
 at 
 sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
 at java.lang.reflect.Method.invoke(Method.java:606)
 at 
 org.apache.flume.sink.hdfs.AbstractHDFSWriter.getNumCurrentReplicas(AbstractHDFSWriter.java:162)
 at 
 org.apache.flume.sink.hdfs.AbstractHDFSWriter.isUnderReplicated(AbstractHDFSWriter.java:82)
 at 
 org.apache.flume.sink.hdfs.BucketWriter.shouldRotate(BucketWriter.java:452)
 at 
 org.apache.flume.sink.hdfs.BucketWriter.append(BucketWriter.java:387)
 at 
 org.apache.flume.sink.hdfs.HDFSEventSink.process(HDFSEventSink.java:392)
 at 
 org.apache.flume.sink.DefaultSinkProcessor.process(DefaultSinkProcessor.java:68)
 at org.apache.flume.SinkRunner$PollingRunner.run(SinkRunner.java:147)
 at java.lang.Thread.run(Thread.java:744)
 Caused by: java.net.ConnectException: Connection refused
 at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
 at 
 sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:735)
 at 
 org.apache.hadoop.net.SocketIOWithTimeout.connect(SocketIOWithTimeout.java:207)
 at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:525)
 at 
 org.apache.hadoop.hdfs.DFSOutputStream.createSocketForPipeline(DFSOutputStream.java:1253)
 at 
 org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.transfer(DFSOutputStream.java:891)
 at 
 

[jira] [Created] (FLUME-2501) Updating HttpClient lib version to ensure compat with Kite Morphline v0.12

2014-10-13 Thread Roshan Naik (JIRA)
Roshan Naik created FLUME-2501:
--

 Summary: Updating HttpClient lib version to ensure compat with 
Kite Morphline v0.12
 Key: FLUME-2501
 URL: https://issues.apache.org/jira/browse/FLUME-2501
 Project: Flume
  Issue Type: Bug
  Components: Sinks+Sources
Affects Versions: v1.5.0.1
Reporter: Roshan Naik
Assignee: Roshan Naik


A mismatch between the httpclient and httpcore libs pulled in by Flume and the 
ones that come with Kite causes errors at runtime:

{code}
2014-10-13 19:52:32,042 (lifecycleSupervisor-1-1) [DEBUG - 
org.apache.solr.client.solrj.impl.HttpClientUtil.createClient(HttpClientUtil.java:106)]
 Creating new http client, 
config:maxConnections=128&maxConnectionsPerHost=32&followRedirects=false
2014-10-13 19:52:32,225 (lifecycleSupervisor-1-1) [ERROR - 
org.apache.flume.lifecycle.LifecycleSupervisor$MonitorRunnable.run(LifecycleSupervisor.java:253)]
 Unable to start SinkRunner: { 
policy:org.apache.flume.sink.DefaultSinkProcessor@4752b854 counterGroup:{ 
name:null counters:{} } } - Exception follows.
java.lang.NoSuchFieldError: DEF_CONTENT_CHARSET
at 
org.apache.http.impl.client.DefaultHttpClient.setDefaultHttpParams(DefaultHttpClient.java:175)
at 
org.apache.http.impl.client.DefaultHttpClient.createHttpParams(DefaultHttpClient.java:158)
at 
org.apache.http.impl.client.AbstractHttpClient.getParams(AbstractHttpClient.java:448)
at 
org.apache.solr.client.solrj.impl.HttpClientUtil.setFollowRedirects(HttpClientUtil.java:251)
at 
org.apache.solr.client.solrj.impl.HttpClientConfigurer.configure(HttpClientConfigurer.java:58)
at 
org.apache.solr.client.solrj.impl.HttpClientUtil.configureClient(HttpClientUtil.java:133)
at 
org.apache.solr.client.solrj.impl.HttpClientUtil.createClient(HttpClientUtil.java:109)
at 
org.apache.solr.client.solrj.impl.HttpSolrServer.init(HttpSolrServer.java:161)
at 
org.apache.solr.client.solrj.impl.HttpSolrServer.init(HttpSolrServer.java:138)
at 
org.apache.solr.client.solrj.impl.ConcurrentUpdateSolrServer.init(ConcurrentUpdateSolrServer.java:122)
at 
org.apache.solr.client.solrj.impl.ConcurrentUpdateSolrServer.init(ConcurrentUpdateSolrServer.java:114)
at 
org.apache.solr.client.solrj.impl.ConcurrentUpdateSolrServer.init(ConcurrentUpdateSolrServer.java:104)
at 
org.kitesdk.morphline.solr.SafeConcurrentUpdateSolrServer.init(SafeConcurrentUpdateSolrServer.java:39)
at 
org.kitesdk.morphline.solr.SafeConcurrentUpdateSolrServer.init(SafeConcurrentUpdateSolrServer.java:35)
at 
org.kitesdk.morphline.solr.SolrLocator.getLoader(SolrLocator.java:116)
at 
org.kitesdk.morphline.solr.LoadSolrBuilder$LoadSolr.init(LoadSolrBuilder.java:70)
at 
org.kitesdk.morphline.solr.LoadSolrBuilder.build(LoadSolrBuilder.java:52)
at 
org.kitesdk.morphline.base.AbstractCommand.buildCommand(AbstractCommand.java:303)
at 
org.kitesdk.morphline.base.AbstractCommand.buildCommandChain(AbstractCommand.java:250)
at org.kitesdk.morphline.stdlib.Pipe.init(Pipe.java:46)
at org.kitesdk.morphline.stdlib.PipeBuilder.build(PipeBuilder.java:40)
at org.kitesdk.morphline.base.Compiler.compile(Compiler.java:126)
at org.kitesdk.morphline.base.Compiler.compile(Compiler.java:55)
at 
org.apache.flume.sink.solr.morphline.MorphlineHandlerImpl.configure(MorphlineHandlerImpl.java:101)
at 
org.apache.flume.sink.solr.morphline.MorphlineSink.start(MorphlineSink.java:97)
at 
org.apache.flume.sink.DefaultSinkProcessor.start(DefaultSinkProcessor.java:46)
at org.apache.flume.SinkRunner.start(SinkRunner.java:79)
at 
org.apache.flume.lifecycle.LifecycleSupervisor$MonitorRunnable.run(LifecycleSupervisor.java:251)
at 
java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:304)
at 
java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:178)
at 
java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293)
at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:745)
{code}
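The usual remedy for this class of NoSuchFieldError is to force a single, mutually compatible httpclient/httpcore pair onto the classpath, for example via a Maven dependencyManagement section. The snippet below is only a sketch; the version shown is a placeholder, not necessarily what the eventual patch chose.

```xml
<dependencyManagement>
  <dependencies>
    <!-- Pin both artifacts together; mixing versions of httpclient and
         httpcore is what produces NoSuchFieldError: DEF_CONTENT_CHARSET. -->
    <dependency>
      <groupId>org.apache.httpcomponents</groupId>
      <artifactId>httpclient</artifactId>
      <version>4.3.3</version><!-- placeholder version -->
    </dependency>
    <dependency>
      <groupId>org.apache.httpcomponents</groupId>
      <artifactId>httpcore</artifactId>
      <version>4.3.2</version><!-- placeholder version -->
    </dependency>
  </dependencies>
</dependencyManagement>
```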





[jira] [Updated] (FLUME-2501) Updating HttpClient lib version to ensure compat with Solr

2014-10-13 Thread Roshan Naik (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLUME-2501?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Roshan Naik updated FLUME-2501:
---
Description: 
A mismatch between the httpclient and httpcore libs pulled in by Flume and the 
ones that come with Solr causes errors at runtime:


  was:
Mismatch in httpclient and http core libs pulled by flume v/s the ones that 
come with kite causes errors at runtime


[jira] [Updated] (FLUME-2501) Updating HttpClient lib version to ensure compat with Solr

2014-10-13 Thread Roshan Naik (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLUME-2501?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Roshan Naik updated FLUME-2501:
---
Summary: Updating HttpClient lib version to ensure compat with Solr  (was: 
Updating HttpClient lib version to ensure compat with Kite Morphline v0.12)

 Updating HttpClient lib version to ensure compat with Solr
 --

 Key: FLUME-2501
 URL: https://issues.apache.org/jira/browse/FLUME-2501
 Project: Flume
  Issue Type: Bug
  Components: Sinks+Sources
Affects Versions: v1.5.0.1
Reporter: Roshan Naik
Assignee: Roshan Naik






[jira] [Updated] (FLUME-2501) Updating HttpClient lib version to ensure compat with Solr

2014-10-13 Thread Roshan Naik (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLUME-2501?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Roshan Naik updated FLUME-2501:
---
Attachment: FLUME-2501.patch

 Updating HttpClient lib version to ensure compat with Solr
 --

 Key: FLUME-2501
 URL: https://issues.apache.org/jira/browse/FLUME-2501
 Project: Flume
  Issue Type: Bug
  Components: Sinks+Sources
Affects Versions: v1.5.0.1
Reporter: Roshan Naik
Assignee: Roshan Naik
 Attachments: FLUME-2501.patch







[jira] [Updated] (FLUME-2497) TCP and UDP syslog sources parsing the timestamp incorrectly

2014-10-13 Thread Johny Rufus (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLUME-2497?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Johny Rufus updated FLUME-2497:
---
Description: 
The TCP and UDP syslog sources parse the timestamp incorrectly when using the 
SyslogUtils extractEvent and buildEvent methods.


  was:
The TCP and UDP syslog sources parse the timestamp incorrectly when using the 
SyslogUtils extractEvent and buildEvent methods.
We need to align these sources to use the SyslogParser (as used by 
MultiportSyslogTCPSource) to parse the messages correctly.


 TCP and UDP syslog sources parsing the timestamp incorrectly
 

 Key: FLUME-2497
 URL: https://issues.apache.org/jira/browse/FLUME-2497
 Project: Flume
  Issue Type: Bug
Affects Versions: v1.5.0.1
Reporter: Johny Rufus
Assignee: Johny Rufus
 Fix For: v1.6.0


 TCP and UDP syslog sources parse the timestamp incorrectly while using 
 SyslogUtils extractEvent and buildEvent methods.





[jira] [Updated] (FLUME-2497) TCP and UDP syslog sources parsing the timestamp incorrectly

2014-10-13 Thread Johny Rufus (JIRA)

 [ https://issues.apache.org/jira/browse/FLUME-2497?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Johny Rufus updated FLUME-2497:
---
Attachment: FLUME-2497.patch

 TCP and UDP syslog sources parsing the timestamp incorrectly
 

 Key: FLUME-2497
 URL: https://issues.apache.org/jira/browse/FLUME-2497
 Project: Flume
  Issue Type: Bug
Affects Versions: v1.5.0.1
Reporter: Johny Rufus
Assignee: Johny Rufus
 Fix For: v1.6.0

 Attachments: FLUME-2497.patch


 TCP and UDP syslog sources parse the timestamp incorrectly while using 
 SyslogUtils extractEvent and buildEvent methods.





[jira] [Updated] (FLUME-2482) Race condition in File Channels' Log.removeOldLogs

2014-10-13 Thread Johny Rufus (JIRA)

 [ https://issues.apache.org/jira/browse/FLUME-2482?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Johny Rufus updated FLUME-2482:
---
Attachment: FLUME-2482.patch

 Race condition in File Channels' Log.removeOldLogs
 --

 Key: FLUME-2482
 URL: https://issues.apache.org/jira/browse/FLUME-2482
 Project: Flume
  Issue Type: Bug
  Components: File Channel
Affects Versions: v1.5.0.1
Reporter: Santiago M. Mola
  Labels: race-condition
 Fix For: v1.6.0

 Attachments: FLUME-2482.patch


 TestFileChannelRestart.testToggleCheckpointCompressionFromFalseToTrue 
 sometimes produces a race condition at Log.removeOldLogs.
 https://travis-ci.org/Stratio/flume/jobs/36782318#L6193
 testToggleCheckpointCompressionFromFalseToTrue(org.apache.flume.channel.file.TestFileChannelRestart)
  Time elapsed: 144362 sec  ERROR!
 java.util.ConcurrentModificationException
 at java.util.ArrayList$Itr.checkForComodification(ArrayList.java:859)
 at java.util.ArrayList$Itr.next(ArrayList.java:831)
 at org.apache.flume.channel.file.Log.removeOldLogs(Log.java:1070)
 at org.apache.flume.channel.file.Log.writeCheckpoint(Log.java:1055)
 at sun.reflect.GeneratedMethodAccessor9.invoke(Unknown Source)
 at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
 at java.lang.reflect.Method.invoke(Method.java:606)
 at org.fest.reflect.method.Invoker.invoke(Invoker.java:110)
 at org.apache.flume.channel.file.TestUtils.forceCheckpoint(TestUtils.java:134)
 at org.apache.flume.channel.file.TestFileChannelRestart.restartToggleCompression(TestFileChannelRestart.java:930)
 at org.apache.flume.channel.file.TestFileChannelRestart.testToggleCheckpointCompressionFromFalseToTrue(TestFileChannelRestart.java:896)
 at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
 at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
 at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
 at java.lang.reflect.Method.invoke(Method.java:606)
 at org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:45)
 at org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:15)
 at org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:42)
 at org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:20)
 at org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:28)
 at org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:30)
 at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:263)
 at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:68)
 at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:47)
 at org.junit.runners.ParentRunner$3.run(ParentRunner.java:231)
 at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:60)
 at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:229)
 at org.junit.runners.ParentRunner.access$000(ParentRunner.java:50)
 at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:222)
 at org.junit.runners.ParentRunner.run(ParentRunner.java:300)
 at org.apache.maven.surefire.junit4.JUnit4Provider.execute(JUnit4Provider.java:252)
 at org.apache.maven.surefire.junit4.JUnit4Provider.executeTestSet(JUnit4Provider.java:141)
 at org.apache.maven.surefire.junit4.JUnit4Provider.invoke(JUnit4Provider.java:112)
 at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
 at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
 at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
 at java.lang.reflect.Method.invoke(Method.java:606)
 at org.apache.maven.surefire.util.ReflectionUtils.invokeMethodWithArray(ReflectionUtils.java:189)
 at org.apache.maven.surefire.booter.ProviderFactory$ProviderProxy.invoke(ProviderFactory.java:165)
 at org.apache.maven.surefire.booter.ProviderFactory.invokeProvider(ProviderFactory.java:85)
 at org.apache.maven.surefire.booter.ForkedBooter.runSuitesInProcess(ForkedBooter.java:115)
 at org.apache.maven.surefire.booter.ForkedBooter.main(ForkedBooter.java:75)
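The ConcurrentModificationException in the trace above is the classic symptom of an ArrayList being mutated while it is being iterated. A small illustrative sketch (not the actual Flume fix) of the safe pattern: removal during iteration must go through the Iterator itself, since calling List.remove() inside a for-each loop over the same list triggers this exception:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

public class RemoveOldLogsSketch {
    // Removing through the Iterator keeps the list's modCount consistent;
    // list.remove() inside a for-each over the same list would throw
    // ConcurrentModificationException on the next iteration step.
    public static List<Integer> removeOld(List<Integer> logIds, int minLogId) {
        Iterator<Integer> it = logIds.iterator();
        while (it.hasNext()) {
            if (it.next() < minLogId) {
                it.remove();
            }
        }
        return logIds;
    }

    public static void main(String[] args) {
        List<Integer> ids = new ArrayList<>(Arrays.asList(1, 2, 3, 4, 5));
        System.out.println(removeOld(ids, 3)); // [3, 4, 5]
    }
}
```

When the mutation comes from another thread, as a race condition implies, Iterator.remove() alone is not enough; the list access then needs a shared lock or a concurrent collection.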





[jira] [Commented] (FLUME-2006) in Avro the batch size is called batch-size, in all other sources batchSize

2014-10-13 Thread Hari Shreedharan (JIRA)

[ https://issues.apache.org/jira/browse/FLUME-2006?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14170503#comment-14170503 ]

Hari Shreedharan commented on FLUME-2006:
-

This generally looks good, but it introduces a dependency on Guava in 
flume-ng-sdk. It looks like Guava is used only for StringUtils#isNullOrEmpty - 
you can just copy that method into a utils class to avoid having to bundle 
Guava with the SDK.
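The check being copied is a one-liner; a minimal sketch of the local utility that would let flume-ng-sdk drop the Guava dependency (the class name here is hypothetical, not from the patch):

```java
// Hypothetical local replacement for Guava's null-or-empty check,
// so flume-ng-sdk does not need to bundle Guava on the client side.
public final class SdkStringUtils {
    private SdkStringUtils() { }  // static utility, no instances

    public static boolean isNullOrEmpty(String s) {
        return s == null || s.isEmpty();
    }
}
```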

 in Avro the batch size is called batch-size, in all other sources batchSize
 ---

 Key: FLUME-2006
 URL: https://issues.apache.org/jira/browse/FLUME-2006
 Project: Flume
  Issue Type: Bug
  Components: Sinks+Sources
Affects Versions: v1.4.0, v1.3.1
Reporter: Alexander Alten-Lorenz
Assignee: Ashish Paliwal
Priority: Trivial
 Attachments: FLUME-2006-0.patch


 http://mail-archives.apache.org/mod_mbox/flume-user/201304.mbox/%3c746ae600-7783-40f4-9817-25617370c...@gmail.com%3e
 The mismatch is with Avro Sink as well as Thrift sink. Other sinks use 
 batchSize as param name





[jira] [Commented] (FLUME-2006) in Avro the batch size is called batch-size, in all other sources batchSize

2014-10-13 Thread Ashish Paliwal (JIRA)

[ https://issues.apache.org/jira/browse/FLUME-2006?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14170508#comment-14170508 ]

Ashish Paliwal commented on FLUME-2006:
---

Thanks [~hshreedharan]! The patch doesn't apply on trunk, so I am working on 
it. I noticed an issue with the fix in sub-classes of AbstractRpcSink, where I 
replaced the code with the new param 'batchSize'. I think they need to support 
both params. Taking another look at the patch. I shall remove the dependency 
on Guava as well. Once done, I shall upload the patch.
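Supporting both params during a rename usually means reading the new name first and falling back to the legacy one. A minimal sketch under that assumption; the method and key handling here are illustrative, not the final patch:

```java
import java.util.Map;

public class BatchSizeConfigSketch {
    // Prefer the standard "batchSize" key; fall back to the legacy
    // Avro/Thrift "batch-size" key; otherwise use the default.
    public static int getBatchSize(Map<String, String> params, int defaultSize) {
        String v = params.get("batchSize");
        if (v == null || v.isEmpty()) {
            v = params.get("batch-size");
        }
        return (v == null || v.isEmpty()) ? defaultSize : Integer.parseInt(v);
    }
}
```

Keeping the legacy key as a fallback means existing Avro/Thrift sink configurations keep working while new ones can use the consistent name.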

 in Avro the batch size is called batch-size, in all other sources batchSize
 ---

 Key: FLUME-2006
 URL: https://issues.apache.org/jira/browse/FLUME-2006
 Project: Flume
  Issue Type: Bug
  Components: Sinks+Sources
Affects Versions: v1.4.0, v1.3.1
Reporter: Alexander Alten-Lorenz
Assignee: Ashish Paliwal
Priority: Trivial
 Attachments: FLUME-2006-0.patch


 http://mail-archives.apache.org/mod_mbox/flume-user/201304.mbox/%3c746ae600-7783-40f4-9817-25617370c...@gmail.com%3e
 The mismatch is with Avro Sink as well as Thrift sink. Other sinks use 
 batchSize as param name


