[jira] [Created] (FLUME-2500) Add a channel that uses Kafka
Hari Shreedharan created FLUME-2500: --- Summary: Add a channel that uses Kafka Key: FLUME-2500 URL: https://issues.apache.org/jira/browse/FLUME-2500 Project: Flume Issue Type: Bug Reporter: Hari Shreedharan Assignee: Hari Shreedharan Here is the rationale: - Kafka gives us an HA channel, which means a dead agent does not affect the data in the channel, thus reducing delivery delay. - Kafka is used by many companies, so it would be a good idea to use Flume to pull data from Kafka and write it to HDFS/HBase etc. This channel is not going to be useful where Kafka is not already in use, since it brings in the operational overhead of maintaining two systems, but if Kafka is already in use, this is a good way to integrate Kafka and Flume. Here is a scratch implementation: https://github.com/harishreedharan/flume/blob/kafka-channel/flume-ng-channels/flume-kafka-channel/src/main/java/org/apache/flume/channel/kafka/KafkaChannel.java -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (FLUME-2298) Replication Channel
[ https://issues.apache.org/jira/browse/FLUME-2298?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14168983#comment-14168983 ] Hari Shreedharan commented on FLUME-2298: - I also wrote one a few weeks ago, which uses Kafka: https://github.com/harishreedharan/flume/blob/kafka-channel/flume-ng-channels/flume-kafka-channel/src/main/java/org/apache/flume/channel/kafka/KafkaChannel.java Replication Channel --- Key: FLUME-2298 URL: https://issues.apache.org/jira/browse/FLUME-2298 Project: Flume Issue Type: New Feature Components: Channel Reporter: Ted Malaska Assignee: Ted Malaska Attachments: Flume Replication Channel Design.0.3.pdf, FlumeDistributedChannelDesign.0.1.pdf, FlumeDistributedChannelDesign.0.2.1.pdf, FlumeDistributedChannelDesign.0.2.pdf This channel will allow events to be persisted with a pluggable method on more than one agent or node. The goal is to gain the following benefits: 1. Events will continue to flow to sinks without loss or large delay, even in the case of node failure. 2. Protect against loss in the case of a complete single-node failure.
[jira] [Commented] (FLUME-2298) Replication Channel
[ https://issues.apache.org/jira/browse/FLUME-2298?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14168997#comment-14168997 ] Ashish Paliwal commented on FLUME-2298: --- Cool! A big +1 for this, let's get this in the release. Replication Channel --- Key: FLUME-2298 URL: https://issues.apache.org/jira/browse/FLUME-2298
[jira] [Commented] (FLUME-2298) Replication Channel
[ https://issues.apache.org/jira/browse/FLUME-2298?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14169215#comment-14169215 ] Ted Malaska commented on FLUME-2298: Yes, +1 to the Kafka Channel. This opens up so many options and also allows improvements to the channel to be developed outside of Flume. Thank you, Hari. Replication Channel --- Key: FLUME-2298 URL: https://issues.apache.org/jira/browse/FLUME-2298
Re: Review Request 16179: Patch for FLUME-2246 - event body data size can make it configurable for logger sinker
On Oct. 11, 2014, 8:11 p.m., Jarek Cecho wrote: flume-ng-core/src/test/java/org/apache/flume/sink/TestLoggerSink.java, lines 68-89 https://reviews.apache.org/r/16179/diff/1/?file=396539#file396539line68 I'm missing validation of the inserted data. Shouldn't there be some assert statements? Also, the test itself can probably be enhanced. What about having a constant message (10 bytes) and changing the maxBytesToDump option from 1 to 20 to ensure that we get the expected outcome in the output? This is LoggerSink, which writes to a log file, so I'm not sure how to intercept the written message and apply an assert on it. The test is built upon an existing test case. Any pointers? - Ashish --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/16179/#review56298 --- On Dec. 11, 2013, 9:58 a.m., Ashish Paliwal wrote: --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/16179/ --- (Updated Dec. 11, 2013, 9:58 a.m.) Review request for Flume. Bugs: FLUME-2246 https://issues.apache.org/jira/browse/FLUME-2246 Repository: flume-git Description --- Added a configuration option to let the user select the maximum number of bytes to dump in the log file. The default stays at 16 bytes. Change summary: - Made LoggerSink implement Configurable. - The user can provide maxBytesToDump using configuration. - User Guide updated. Diffs - flume-ng-core/src/main/java/org/apache/flume/sink/LoggerSink.java 128fa84 flume-ng-core/src/test/java/org/apache/flume/sink/TestLoggerSink.java 92ff6fe flume-ng-doc/sphinx/FlumeUserGuide.rst ae66f89 Diff: https://reviews.apache.org/r/16179/diff/ Testing --- Added a test case that sets the option to a value higher than 30 and uses a string of length greater than 16 in the output. Thanks, Ashish Paliwal
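A minimal sketch of one way to act on the review comment above, assuming the truncation logic can be factored out of the logging call so that it is testable without intercepting log output. `MaxBytesToDumpSketch` and `bytesToDump` are hypothetical names and are not part of the patch or of LoggerSink; only the 16-byte default, the constant 10-byte message, and the 1-to-20 range come from the thread.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

public class MaxBytesToDumpSketch {

    // Default limit mentioned in the review request description.
    public static final int DEFAULT_MAX_BYTES_TO_DUMP = 16;

    // Hypothetical helper: returns the leading bytes of the body that would be dumped.
    public static byte[] bytesToDump(byte[] body, int maxBytesToDump) {
        if (maxBytesToDump < 0) {
            throw new IllegalArgumentException("maxBytesToDump must be >= 0");
        }
        return Arrays.copyOf(body, Math.min(body.length, maxBytesToDump));
    }

    public static void main(String[] args) {
        // Constant 10-byte message, as suggested in the review.
        byte[] message = "0123456789".getBytes(StandardCharsets.US_ASCII);
        for (int limit = 1; limit <= 20; limit++) {
            byte[] dumped = bytesToDump(message, limit);
            int expected = Math.min(message.length, limit);
            if (dumped.length != expected) {
                throw new AssertionError("limit " + limit + ": got " + dumped.length);
            }
        }
        System.out.println("checked limits 1..20");
    }
}
```

Testing the helper directly sidesteps the question of capturing logger output; whether such a split suits the real LoggerSink is left open here.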
[jira] [Commented] (FLUME-2500) Add a channel that uses Kafka
[ https://issues.apache.org/jira/browse/FLUME-2500?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14169427#comment-14169427 ] Jarek Jarcec Cecho commented on FLUME-2500: --- I like the idea of a Kafka channel, so I'm +1 on the idea! Add a channel that uses Kafka -- Key: FLUME-2500 URL: https://issues.apache.org/jira/browse/FLUME-2500
[jira] [Commented] (FLUME-2451) HDFS Sink Cannot Reconnect After NameNode Restart
[ https://issues.apache.org/jira/browse/FLUME-2451?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14169844#comment-14169844 ] Hari Shreedharan commented on FLUME-2451: - Roshan: - Do you think we should close *all* bucket writers on an HDFS IOException? Is there any specific way to identify if the namenode is down? We don't really need to close all bucket writers in all cases. - The sfWritersMap is guarded by a lock - you should use the locking mechanism in this method too. Also, the 2nd patch only has the unit test, not the code fix. Can you merge the two patches? Thanks! HDFS Sink Cannot Reconnect After NameNode Restart - Key: FLUME-2451 URL: https://issues.apache.org/jira/browse/FLUME-2451 Project: Flume Issue Type: Bug Components: File Channel, Sinks+Sources Affects Versions: v1.4.0 Environment: 8 node CDH 4.2.2 (2.0.0-cdh4.2.2) cluster All cluster machines are running Ubuntu 12.04 x86_64 Reporter: Andrew O'Neill Assignee: Roshan Naik Labels: HDFS, Sink Attachments: FLUME-2451.patch, FLUME-2451.v2.patch I am testing a simple flume setup with a Sequence Generator Source, a File Channel, and an HDFS Sink (see my flume.conf below). This configuration works as expected until I reboot the cluster’s NameNode or until I restart the HDFS service on the cluster. At this point, it appears that the Flume Agent cannot reconnect to HDFS and must be manually restarted.
Here is our flume.conf:
appserver.sources = rawtext
appserver.channels = testchannel
appserver.sinks = test_sink
appserver.sources.rawtext.type = seq
appserver.sources.rawtext.channels = testchannel
appserver.channels.testchannel.type = file
appserver.channels.testchannel.capacity = 1000
appserver.channels.testchannel.minimumRequiredSpace = 214748364800
appserver.channels.testchannel.checkpointDir = /Users/aoneill/Desktop/testchannel/checkpoint
appserver.channels.testchannel.dataDirs = /Users/aoneill/Desktop/testchannel/data
appserver.channels.testchannel.maxFileSize = 2000
appserver.sinks.test_sink.type = hdfs
appserver.sinks.test_sink.channel = testchannel
appserver.sinks.test_sink.hdfs.path = hdfs://cluster01:8020/user/aoneill/flumetest
appserver.sinks.test_sink.hdfs.closeTries = 3
appserver.sinks.test_sink.hdfs.filePrefix = events-
appserver.sinks.test_sink.hdfs.fileSuffix = .avro
appserver.sinks.test_sink.hdfs.fileType = DataStream
appserver.sinks.test_sink.hdfs.writeFormat = Text
appserver.sinks.test_sink.hdfs.inUsePrefix = inuse-
appserver.sinks.test_sink.hdfs.inUseSuffix = .avro
appserver.sinks.test_sink.hdfs.rollCount = 10
appserver.sinks.test_sink.hdfs.rollInterval = 30
appserver.sinks.test_sink.hdfs.rollSize = 10485760
These are the two error messages that the Flume Agent outputs constantly after the restart:
2014-08-26 10:47:24,572 (SinkRunner-PollingRunner-DefaultSinkProcessor) [ERROR - org.apache.flume.sink.hdfs.AbstractHDFSWriter.isUnderReplicated(AbstractHDFSWriter.java:96)] Unexpected error while checking replication factor
java.lang.reflect.InvocationTargetException
    at sun.reflect.GeneratedMethodAccessor7.invoke(Unknown Source)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:606)
    at org.apache.flume.sink.hdfs.AbstractHDFSWriter.getNumCurrentReplicas(AbstractHDFSWriter.java:162)
    at org.apache.flume.sink.hdfs.AbstractHDFSWriter.isUnderReplicated(AbstractHDFSWriter.java:82)
    at org.apache.flume.sink.hdfs.BucketWriter.shouldRotate(BucketWriter.java:452)
    at org.apache.flume.sink.hdfs.BucketWriter.append(BucketWriter.java:387)
    at org.apache.flume.sink.hdfs.HDFSEventSink.process(HDFSEventSink.java:392)
    at org.apache.flume.sink.DefaultSinkProcessor.process(DefaultSinkProcessor.java:68)
    at org.apache.flume.SinkRunner$PollingRunner.run(SinkRunner.java:147)
    at java.lang.Thread.run(Thread.java:744)
Caused by: java.net.ConnectException: Connection refused
    at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
    at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:735)
    at org.apache.hadoop.net.SocketIOWithTimeout.connect(SocketIOWithTimeout.java:207)
    at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:525)
    at org.apache.hadoop.hdfs.DFSOutputStream.createSocketForPipeline(DFSOutputStream.java:1253)
    at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.transfer(DFSOutputStream.java:891)
    at
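The locking point raised in the review comment on FLUME-2451 can be illustrated with a small sketch. It assumes a map of open writers guarded by a single lock; `GuardedWritersSketch`, `writers`, and `lock` are illustrative names and not the actual HDFSEventSink fields. Closing every writer is shown only as the behaviour under discussion, not as the agreed fix.

```java
import java.io.Closeable;
import java.io.IOException;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class GuardedWritersSketch {

    // Every access to the map goes through this lock (hypothetical names).
    private final Object lock = new Object();
    private final Map<String, Closeable> writers = new LinkedHashMap<>();

    public void put(String path, Closeable writer) {
        synchronized (lock) {
            writers.put(path, writer);
        }
    }

    public int size() {
        synchronized (lock) {
            return writers.size();
        }
    }

    // Detaches all writers under the lock, then closes them outside it so that
    // slow close() calls do not block other threads. Returns how many closed cleanly.
    public int closeAll() {
        List<Closeable> toClose;
        synchronized (lock) {
            toClose = new ArrayList<>(writers.values());
            writers.clear();
        }
        int closedCleanly = 0;
        for (Closeable writer : toClose) {
            try {
                writer.close();
                closedCleanly++;
            } catch (IOException e) {
                // Keep going: one failing writer should not stop the rest from closing.
            }
        }
        return closedCleanly;
    }

    public static void main(String[] args) {
        GuardedWritersSketch sketch = new GuardedWritersSketch();
        sketch.put("/flumetest/a", () -> { });
        sketch.put("/flumetest/b", () -> { throw new IOException("simulated failure"); });
        System.out.println("closed cleanly: " + sketch.closeAll());
        System.out.println("remaining: " + sketch.size());
    }
}
```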
[jira] [Created] (FLUME-2501) Updating HttpClient lib version to ensure compat with Kite Morphline v0.12
Roshan Naik created FLUME-2501: -- Summary: Updating HttpClient lib version to ensure compat with Kite Morphline v0.12 Key: FLUME-2501 URL: https://issues.apache.org/jira/browse/FLUME-2501 Project: Flume Issue Type: Bug Components: Sinks+Sources Affects Versions: v1.5.0.1 Reporter: Roshan Naik Assignee: Roshan Naik
Mismatch in httpclient and http core libs pulled by flume v/s the ones that come with kite causes errors at runtime
{code}
2014-10-13 19:52:32,042 (lifecycleSupervisor-1-1) [DEBUG - org.apache.solr.client.solrj.impl.HttpClientUtil.createClient(HttpClientUtil.java:106)] Creating new http client, config:maxConnections=128&maxConnectionsPerHost=32&followRedirects=false
2014-10-13 19:52:32,225 (lifecycleSupervisor-1-1) [ERROR - org.apache.flume.lifecycle.LifecycleSupervisor$MonitorRunnable.run(LifecycleSupervisor.java:253)] Unable to start SinkRunner: { policy:org.apache.flume.sink.DefaultSinkProcessor@4752b854 counterGroup:{ name:null counters:{} } } - Exception follows.
java.lang.NoSuchFieldError: DEF_CONTENT_CHARSET
    at org.apache.http.impl.client.DefaultHttpClient.setDefaultHttpParams(DefaultHttpClient.java:175)
    at org.apache.http.impl.client.DefaultHttpClient.createHttpParams(DefaultHttpClient.java:158)
    at org.apache.http.impl.client.AbstractHttpClient.getParams(AbstractHttpClient.java:448)
    at org.apache.solr.client.solrj.impl.HttpClientUtil.setFollowRedirects(HttpClientUtil.java:251)
    at org.apache.solr.client.solrj.impl.HttpClientConfigurer.configure(HttpClientConfigurer.java:58)
    at org.apache.solr.client.solrj.impl.HttpClientUtil.configureClient(HttpClientUtil.java:133)
    at org.apache.solr.client.solrj.impl.HttpClientUtil.createClient(HttpClientUtil.java:109)
    at org.apache.solr.client.solrj.impl.HttpSolrServer.<init>(HttpSolrServer.java:161)
    at org.apache.solr.client.solrj.impl.HttpSolrServer.<init>(HttpSolrServer.java:138)
    at org.apache.solr.client.solrj.impl.ConcurrentUpdateSolrServer.<init>(ConcurrentUpdateSolrServer.java:122)
    at org.apache.solr.client.solrj.impl.ConcurrentUpdateSolrServer.<init>(ConcurrentUpdateSolrServer.java:114)
    at org.apache.solr.client.solrj.impl.ConcurrentUpdateSolrServer.<init>(ConcurrentUpdateSolrServer.java:104)
    at org.kitesdk.morphline.solr.SafeConcurrentUpdateSolrServer.<init>(SafeConcurrentUpdateSolrServer.java:39)
    at org.kitesdk.morphline.solr.SafeConcurrentUpdateSolrServer.<init>(SafeConcurrentUpdateSolrServer.java:35)
    at org.kitesdk.morphline.solr.SolrLocator.getLoader(SolrLocator.java:116)
    at org.kitesdk.morphline.solr.LoadSolrBuilder$LoadSolr.<init>(LoadSolrBuilder.java:70)
    at org.kitesdk.morphline.solr.LoadSolrBuilder.build(LoadSolrBuilder.java:52)
    at org.kitesdk.morphline.base.AbstractCommand.buildCommand(AbstractCommand.java:303)
    at org.kitesdk.morphline.base.AbstractCommand.buildCommandChain(AbstractCommand.java:250)
    at org.kitesdk.morphline.stdlib.Pipe.<init>(Pipe.java:46)
    at org.kitesdk.morphline.stdlib.PipeBuilder.build(PipeBuilder.java:40)
    at org.kitesdk.morphline.base.Compiler.compile(Compiler.java:126)
    at org.kitesdk.morphline.base.Compiler.compile(Compiler.java:55)
    at org.apache.flume.sink.solr.morphline.MorphlineHandlerImpl.configure(MorphlineHandlerImpl.java:101)
    at org.apache.flume.sink.solr.morphline.MorphlineSink.start(MorphlineSink.java:97)
    at org.apache.flume.sink.DefaultSinkProcessor.start(DefaultSinkProcessor.java:46)
    at org.apache.flume.SinkRunner.start(SinkRunner.java:79)
    at org.apache.flume.lifecycle.LifecycleSupervisor$MonitorRunnable.run(LifecycleSupervisor.java:251)
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
    at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:304)
    at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:178)
    at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
    at java.lang.Thread.run(Thread.java:745)
{code}
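A NoSuchFieldError like the one above usually means a class was compiled against a newer jar than the one found first on the classpath. Below is a small sketch for checking which jar a class was actually loaded from, using only standard JDK calls. `ClassOriginSketch` is a hypothetical helper, and the two default class names are an assumption about where to look (DEF_CONTENT_CHARSET is assumed here to be a field of httpcore's org.apache.http.protocol.HTTP).

```java
import java.security.CodeSource;

public class ClassOriginSketch {

    // Returns the location (typically a jar URL) that a loaded class came from.
    public static String originOf(Class<?> type) {
        CodeSource source = type.getProtectionDomain().getCodeSource();
        return source == null ? "(bootstrap or unknown)" : String.valueOf(source.getLocation());
    }

    // Looks the class up by name without initializing it, then reports its origin.
    public static String originOf(String className) {
        try {
            ClassLoader loader = ClassOriginSketch.class.getClassLoader();
            return originOf(Class.forName(className, false, loader));
        } catch (ClassNotFoundException | LinkageError e) {
            return "(not loadable: " + e.getClass().getSimpleName() + ")";
        }
    }

    public static void main(String[] args) {
        // Assumed defaults: the class that failed and the class assumed to hold the field.
        String[] names = args.length > 0 ? args : new String[] {
            "org.apache.http.impl.client.DefaultHttpClient",
            "org.apache.http.protocol.HTTP"
        };
        for (String name : names) {
            System.out.println(name + " <- " + originOf(name));
        }
    }
}
```

Run with the agent's classpath, two different jar locations or an unexpectedly old httpcore jar would point at the mismatch described in the issue.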
[jira] [Updated] (FLUME-2501) Updating HttpClient lib version to ensure compat with Solr
[ https://issues.apache.org/jira/browse/FLUME-2501?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Roshan Naik updated FLUME-2501: --- Description: Mismatch in httpclient and http core libs pulled by flume v/s the ones that come with Solr causes errors at runtime (was: Mismatch in httpclient and http core libs pulled by flume v/s the ones that come with kite causes errors at runtime)
[jira] [Updated] (FLUME-2501) Updating HttpClient lib version to ensure compat with Solr
[ https://issues.apache.org/jira/browse/FLUME-2501?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Roshan Naik updated FLUME-2501: --- Summary: Updating HttpClient lib version to ensure compat with Solr (was: Updating HttpClient lib version to ensure compat with Kite Morphline v0.12) Updating HttpClient lib version to ensure compat with Solr -- Key: FLUME-2501 URL: https://issues.apache.org/jira/browse/FLUME-2501 Project: Flume Issue Type: Bug Components: Sinks+Sources Affects Versions: v1.5.0.1 Reporter: Roshan Naik Assignee: Roshan Naik
[jira] [Updated] (FLUME-2501) Updating HttpClient lib version to ensure compat with Solr
[ https://issues.apache.org/jira/browse/FLUME-2501?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Roshan Naik updated FLUME-2501: --- Attachment: FLUME-2501.patch Updating HttpClient lib version to ensure compat with Solr -- Key: FLUME-2501 URL: https://issues.apache.org/jira/browse/FLUME-2501 Project: Flume Issue Type: Bug Components: Sinks+Sources Affects Versions: v1.5.0.1 Reporter: Roshan Naik Assignee: Roshan Naik Attachments: FLUME-2501.patch
[jira] [Updated] (FLUME-2497) TCP and UDP syslog sources parsing the timestamp incorrectly
[ https://issues.apache.org/jira/browse/FLUME-2497?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Johny Rufus updated FLUME-2497: --- Description: TCP and UDP syslog sources parse the timestamp incorrectly while using the SyslogUtils extractEvent and buildEvent methods. was: TCP and UDP syslog sources parse the timestamp incorrectly while using the SyslogUtils extractEvent and buildEvent methods. We need to align these sources to use the syslogParser (as used by MultiportSyslogTCPSource) to parse the messages correctly. TCP and UDP syslog sources parsing the timestamp incorrectly Key: FLUME-2497 URL: https://issues.apache.org/jira/browse/FLUME-2497 Project: Flume Issue Type: Bug Affects Versions: v1.5.0.1 Reporter: Johny Rufus Assignee: Johny Rufus Fix For: v1.6.0
[jira] [Updated] (FLUME-2497) TCP and UDP syslog sources parsing the timestamp incorrectly
[ https://issues.apache.org/jira/browse/FLUME-2497?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Johny Rufus updated FLUME-2497: --- Attachment: FLUME-2497.patch TCP and UDP syslog sources parsing the timestamp incorrectly Key: FLUME-2497 URL: https://issues.apache.org/jira/browse/FLUME-2497 Project: Flume Issue Type: Bug Affects Versions: v1.5.0.1 Reporter: Johny Rufus Assignee: Johny Rufus Fix For: v1.6.0 Attachments: FLUME-2497.patch
[jira] [Updated] (FLUME-2482) Race condition in File Channels' Log.removeOldLogs
[ https://issues.apache.org/jira/browse/FLUME-2482?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Johny Rufus updated FLUME-2482:
---
Attachment: FLUME-2482.patch

Race condition in File Channels' Log.removeOldLogs
--
Key: FLUME-2482
URL: https://issues.apache.org/jira/browse/FLUME-2482
Project: Flume
Issue Type: Bug
Components: File Channel
Affects Versions: v1.5.0.1
Reporter: Santiago M. Mola
Labels: race-condition
Fix For: v1.6.0
Attachments: FLUME-2482.patch

TestFileChannelRestart.testToggleCheckpointCompressionFromFalseToTrue sometimes produces a race condition at Log.removeOldLogs.
https://travis-ci.org/Stratio/flume/jobs/36782318#L6193

testToggleCheckpointCompressionFromFalseToTrue(org.apache.flume.channel.file.TestFileChannelRestart) Time elapsed: 144362 sec ERROR!
java.util.ConcurrentModificationException
    at java.util.ArrayList$Itr.checkForComodification(ArrayList.java:859)
    at java.util.ArrayList$Itr.next(ArrayList.java:831)
    at org.apache.flume.channel.file.Log.removeOldLogs(Log.java:1070)
    at org.apache.flume.channel.file.Log.writeCheckpoint(Log.java:1055)
    at sun.reflect.GeneratedMethodAccessor9.invoke(Unknown Source)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:606)
    at org.fest.reflect.method.Invoker.invoke(Invoker.java:110)
    at org.apache.flume.channel.file.TestUtils.forceCheckpoint(TestUtils.java:134)
    at org.apache.flume.channel.file.TestFileChannelRestart.restartToggleCompression(TestFileChannelRestart.java:930)
    at org.apache.flume.channel.file.TestFileChannelRestart.testToggleCheckpointCompressionFromFalseToTrue(TestFileChannelRestart.java:896)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:606)
    at org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:45)
    at org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:15)
    at org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:42)
    at org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:20)
    at org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:28)
    at org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:30)
    at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:263)
    at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:68)
    at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:47)
    at org.junit.runners.ParentRunner$3.run(ParentRunner.java:231)
    at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:60)
    at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:229)
    at org.junit.runners.ParentRunner.access$000(ParentRunner.java:50)
    at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:222)
    at org.junit.runners.ParentRunner.run(ParentRunner.java:300)
    at org.apache.maven.surefire.junit4.JUnit4Provider.execute(JUnit4Provider.java:252)
    at org.apache.maven.surefire.junit4.JUnit4Provider.executeTestSet(JUnit4Provider.java:141)
    at org.apache.maven.surefire.junit4.JUnit4Provider.invoke(JUnit4Provider.java:112)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:606)
    at org.apache.maven.surefire.util.ReflectionUtils.invokeMethodWithArray(ReflectionUtils.java:189)
    at org.apache.maven.surefire.booter.ProviderFactory$ProviderProxy.invoke(ProviderFactory.java:165)
    at org.apache.maven.surefire.booter.ProviderFactory.invokeProvider(ProviderFactory.java:85)
    at org.apache.maven.surefire.booter.ForkedBooter.runSuitesInProcess(ForkedBooter.java:115)
    at org.apache.maven.surefire.booter.ForkedBooter.main(ForkedBooter.java:75)

--
This message was sent by Atlassian JIRA (v6.3.4#6332)
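The trace above fails inside an ArrayList iterator, which is what happens when a list is structurally modified while it is being iterated. The following is only a minimal sketch of that failure mode and of one common way to avoid it (draining a snapshot under a lock). It is not the code in FLUME-2482.patch; the class PendingDeletesSketch and its methods are hypothetical stand-ins and do not mirror the real Log class. The failure is reproduced in a single thread so that it is deterministic; in the reported case the modification would come from another thread.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for a component that tracks files awaiting deletion.
class PendingDeletesSketch {
    private final List<String> pendingDeletes = new ArrayList<>();

    synchronized void add(String name) {
        pendingDeletes.add(name);
    }

    synchronized int size() {
        return pendingDeletes.size();
    }

    // Unsafe: removes from the list while a for-each iterator is active.
    // With three or more entries, the second call to next() throws
    // java.util.ConcurrentModificationException.
    int removeUnsafe() {
        int removed = 0;
        for (String name : pendingDeletes) {
            pendingDeletes.remove(name);
            removed++;
        }
        return removed;
    }

    // Safer: copy and clear under the lock, then let the caller iterate
    // over the private snapshot, which nobody else can modify.
    synchronized List<String> drain() {
        List<String> snapshot = new ArrayList<>(pendingDeletes);
        pendingDeletes.clear();
        return snapshot;
    }

    public static void main(String[] args) {
        PendingDeletesSketch sketch = new PendingDeletesSketch();
        sketch.add("log-1");
        sketch.add("log-2");
        sketch.add("log-3");
        for (String name : sketch.drain()) {
            System.out.println("would delete " + name);
        }
    }
}
```

Whether a snapshot, a lock held for the whole iteration, or a concurrent collection fits best depends on how the real list is shared, which this sketch does not try to model.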
[jira] [Commented] (FLUME-2006) in Avro the batch size is called batch-size, in all other sources batchSize
[ https://issues.apache.org/jira/browse/FLUME-2006?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14170503#comment-14170503 ]

Hari Shreedharan commented on FLUME-2006:
-
This generally looks good, but it introduces a dependency on guava in the flume-ng-sdk. Looks like it is used only for StringUtils#isNullOrEmpty - you can just copy this over to a utils class, to avoid having to bundle guava with the sdk.

in Avro the batch size is called batch-size, in all other sources batchSize
---
Key: FLUME-2006
URL: https://issues.apache.org/jira/browse/FLUME-2006
Project: Flume
Issue Type: Bug
Components: Sinks+Sources
Affects Versions: v1.4.0, v1.3.1
Reporter: Alexander Alten-Lorenz
Assignee: Ashish Paliwal
Priority: Trivial
Attachments: FLUME-2006-0.patch

http://mail-archives.apache.org/mod_mbox/flume-user/201304.mbox/%3c746ae600-7783-40f4-9817-25617370c...@gmail.com%3e
The mismatch is with Avro Sink as well as Thrift sink. Other sinks use batchSize as param name

--
This message was sent by Atlassian JIRA (v6.3.4#6332)
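The suggestion above, copying the null-or-empty check into a local utility instead of bundling a library for it, could look roughly like this. The class name SdkStringUtilsSketch is hypothetical and is not taken from the patch or from flume-ng-sdk; only the behaviour (true for null or the empty string) is assumed.

```java
// Hypothetical local utility; the name and location are assumptions.
final class SdkStringUtilsSketch {
    private SdkStringUtilsSketch() {
        // static helpers only, no instances
    }

    // Returns true when the string is null or has length zero.
    static boolean isNullOrEmpty(String value) {
        return value == null || value.isEmpty();
    }

    public static void main(String[] args) {
        System.out.println(isNullOrEmpty(null));
        System.out.println(isNullOrEmpty(""));
        System.out.println(isNullOrEmpty("100"));
    }
}
```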
[jira] [Commented] (FLUME-2006) in Avro the batch size is called batch-size, in all other sources batchSize
[ https://issues.apache.org/jira/browse/FLUME-2006?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14170508#comment-14170508 ]

Ashish Paliwal commented on FLUME-2006:
---
Thanks [~hshreedharan]! The patch doesn't apply on trunk, so I am working on it. I noticed an issue with the fix in the sub-classes of AbstractRpcSink, where I replaced the code with the new param 'batchSize'. I think they need to support both params. I am taking another look at the patch. I shall remove the dependency on Guava as well. Once done, I shall upload the patch.

in Avro the batch size is called batch-size, in all other sources batchSize
---
Key: FLUME-2006
URL: https://issues.apache.org/jira/browse/FLUME-2006
Project: Flume
Issue Type: Bug
Components: Sinks+Sources
Affects Versions: v1.4.0, v1.3.1
Reporter: Alexander Alten-Lorenz
Assignee: Ashish Paliwal
Priority: Trivial
Attachments: FLUME-2006-0.patch

http://mail-archives.apache.org/mod_mbox/flume-user/201304.mbox/%3c746ae600-7783-40f4-9817-25617370c...@gmail.com%3e
The mismatch is with Avro Sink as well as Thrift sink. Other sinks use batchSize as param name

--
This message was sent by Atlassian JIRA (v6.3.4#6332)
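Supporting both parameter names, as discussed above, might be sketched as follows. This is an illustration only and not the patch: a plain Map stands in for the sink's configuration object, the class and method names are hypothetical, and the precedence (prefer 'batchSize', fall back to 'batch-size', then to a caller-supplied default) is an assumption that the thread does not settle.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical helper; a Map stands in for the sink configuration.
class BatchSizeConfigSketch {
    static final String BATCH_SIZE = "batchSize";
    static final String LEGACY_BATCH_SIZE = "batch-size";

    // Assumed precedence: new name first, then the legacy name, then the default.
    static int resolveBatchSize(Map<String, String> props, int defaultValue) {
        String value = props.get(BATCH_SIZE);
        if (value == null || value.isEmpty()) {
            value = props.get(LEGACY_BATCH_SIZE);
        }
        if (value == null || value.isEmpty()) {
            return defaultValue;
        }
        return Integer.parseInt(value.trim());
    }

    public static void main(String[] args) {
        Map<String, String> legacyOnly = new HashMap<>();
        legacyOnly.put(LEGACY_BATCH_SIZE, "250");
        // An existing configuration that only sets 'batch-size' keeps working.
        System.out.println(resolveBatchSize(legacyOnly, 100));
    }
}
```

Reading the legacy key as a fallback keeps existing agent configurations valid while letting new ones use the name shared by the other sinks.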