[jira] [Commented] (FLUME-3036) Create a RegexSerializer for Hive Sink
[ https://issues.apache.org/jira/browse/FLUME-3036?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15954335#comment-15954335 ]

Roshan Naik commented on FLUME-3036:

[~kalyanhadoop] thanks for your efforts to get the RegexWriter patch into Hive. Looks like there is a release candidate out for Hive v1.2.2, so we should be able to commit this in Flume as soon as that gets released. Does that sound OK? Looks like we need a revised patch here without the RegExWriter class. It would also be good to combine the docs and code patches into one.

> Create a RegexSerializer for Hive Sink
> --------------------------------------
>
> Key: FLUME-3036
> URL: https://issues.apache.org/jira/browse/FLUME-3036
> Project: Flume
> Issue Type: New Feature
> Components: Sinks+Sources
> Reporter: Kalyan
> Assignee: Kalyan
> Attachments: FLUME-3036-docs.patch, FLUME-3036.patch
>
> Providing Hive Regex Serializer to work with Flume Streaming data.
> Based on existing hive sink using hive regex serde.

--
This message was sent by Atlassian JIRA (v6.3.15#6346)
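The idea behind FLUME-3036 is to split each event body into Hive table columns with a regex, mirroring Hive's RegexSerDe. The sketch below is hypothetical and not the code from the attached patches; the class and method names are made up for illustration.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Minimal sketch of regex-based column extraction: each capture group in the
// pattern becomes one column of the target Hive table.
public class RegexEventParser {
    private final Pattern pattern;

    public RegexEventParser(String regex) {
        this.pattern = Pattern.compile(regex);
    }

    // Returns the captured columns, or null if the event body does not match.
    public String[] parse(String eventBody) {
        Matcher m = pattern.matcher(eventBody);
        if (!m.matches()) {
            return null;
        }
        String[] columns = new String[m.groupCount()];
        for (int i = 0; i < columns.length; i++) {
            columns[i] = m.group(i + 1);
        }
        return columns;
    }

    public static void main(String[] args) {
        // e.g. an access-log-like line split into ip, timestamp, request
        RegexEventParser p =
            new RegexEventParser("(\\S+) \\[([^\\]]+)\\] \"([^\"]+)\"");
        String[] cols = p.parse("10.0.0.1 [13/Mar/2017:10:00:00] \"GET /index.html\"");
        for (String c : cols) {
            System.out.println(c);
        }
    }
}
```

A non-matching event yields null, which a real serializer would have to treat as a malformed record.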
[jira] [Commented] (FLUME-2620) File channel throws NullPointerException if a header value is null
[ https://issues.apache.org/jira/browse/FLUME-2620?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15923388#comment-15923388 ]

Roshan Naik commented on FLUME-2620:

[~marcellhegedus] It is easier to review the patches if put on [Review Board|https://reviews.apache.org/r/]

> File channel throws NullPointerException if a header value is null
> ------------------------------------------------------------------
>
> Key: FLUME-2620
> URL: https://issues.apache.org/jira/browse/FLUME-2620
> Project: Flume
> Issue Type: Bug
> Components: File Channel
> Reporter: Santiago M. Mola
> Assignee: Neerja Khattar
> Attachments: FLUME-2620-0.patch, FLUME-2620-1.patch, FLUME-2620-2.patch, FLUME-2620-3.patch, FLUME-2620-4.patch, FLUME-2620-5.patch, FLUME-2620.patch, FLUME-2620.patch
>
> File channel throws NullPointerException if a header value is null.
> If this is intended, it should be reported correctly in the logs.
> Sample trace:
> org.apache.flume.ChannelException: Unable to put batch on required channel: FileChannel chan { dataDirs: [/var/lib/ingestion-csv/chan/data] }
>         at org.apache.flume.channel.ChannelProcessor.processEventBatch(ChannelProcessor.java:200)
>         at org.apache.flume.source.SpoolDirectorySource$SpoolDirectoryRunnable.run(SpoolDirectorySource.java:236)
>         at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
>         at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:304)
>         at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:178)
>         at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293)
>         at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>         at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>         at java.lang.Thread.run(Thread.java:745)
> Caused by: java.lang.NullPointerException
>         at org.apache.flume.channel.file.proto.ProtosFactory$FlumeEventHeader$Builder.setValue(ProtosFactory.java:7415)
>         at org.apache.flume.channel.file.Put.writeProtos(Put.java:85)
>         at org.apache.flume.channel.file.TransactionEventRecord.toByteBuffer(TransactionEventRecord.java:174)
>         at org.apache.flume.channel.file.Log.put(Log.java:622)
>         at org.apache.flume.channel.file.FileChannel$FileBackedTransaction.doPut(FileChannel.java:469)
>         at org.apache.flume.channel.BasicTransactionSemantics.put(BasicTransactionSemantics.java:93)
>         at org.apache.flume.channel.BasicChannelSemantics.put(BasicChannelSemantics.java:80)
>         at org.apache.flume.channel.ChannelProcessor.processEventBatch(ChannelProcessor.java:189)

--
This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Commented] (FLUME-2620) File channel throws NullPointerException if a header value is null
[ https://issues.apache.org/jira/browse/FLUME-2620?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15923387#comment-15923387 ]

Roshan Naik commented on FLUME-2620:

[~marcellhegedus] I have added you as a contributor. You should be able to go ahead and assign this JIRA to yourself if you like.

> File channel throws NullPointerException if a header value is null
> ------------------------------------------------------------------
>
> Key: FLUME-2620
> URL: https://issues.apache.org/jira/browse/FLUME-2620
> Project: Flume
> Issue Type: Bug
> Components: File Channel
> Reporter: Santiago M. Mola
> Assignee: Neerja Khattar
> Attachments: FLUME-2620-0.patch, FLUME-2620-1.patch, FLUME-2620-2.patch, FLUME-2620-3.patch, FLUME-2620-4.patch, FLUME-2620-5.patch, FLUME-2620.patch, FLUME-2620.patch
>
> File channel throws NullPointerException if a header value is null.
> If this is intended, it should be reported correctly in the logs.
> Sample trace:
> org.apache.flume.ChannelException: Unable to put batch on required channel: FileChannel chan { dataDirs: [/var/lib/ingestion-csv/chan/data] }
>         at org.apache.flume.channel.ChannelProcessor.processEventBatch(ChannelProcessor.java:200)
>         at org.apache.flume.source.SpoolDirectorySource$SpoolDirectoryRunnable.run(SpoolDirectorySource.java:236)
>         at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
>         at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:304)
>         at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:178)
>         at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293)
>         at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>         at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>         at java.lang.Thread.run(Thread.java:745)
> Caused by: java.lang.NullPointerException
>         at org.apache.flume.channel.file.proto.ProtosFactory$FlumeEventHeader$Builder.setValue(ProtosFactory.java:7415)
>         at org.apache.flume.channel.file.Put.writeProtos(Put.java:85)
>         at org.apache.flume.channel.file.TransactionEventRecord.toByteBuffer(TransactionEventRecord.java:174)
>         at org.apache.flume.channel.file.Log.put(Log.java:622)
>         at org.apache.flume.channel.file.FileChannel$FileBackedTransaction.doPut(FileChannel.java:469)
>         at org.apache.flume.channel.BasicTransactionSemantics.put(BasicTransactionSemantics.java:93)
>         at org.apache.flume.channel.BasicChannelSemantics.put(BasicChannelSemantics.java:80)
>         at org.apache.flume.channel.ChannelProcessor.processEventBatch(ChannelProcessor.java:189)

--
This message was sent by Atlassian JIRA (v6.3.15#6346)
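The NPE in the FLUME-2620 trace comes from handing a null header value to a protobuf-style builder, which rejects nulls. One defensive fix is to drop null-valued headers before serialization. This is a hypothetical sketch of that guard, not code from any of the attached patches.

```java
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.Map;

// Drops entries with null keys or values so a downstream builder
// (e.g. a protobuf Builder.setValue call) never sees a null.
public class HeaderSanitizer {
    public static Map<String, String> sanitize(Map<String, String> headers) {
        Map<String, String> clean = new LinkedHashMap<>();
        for (Map.Entry<String, String> e : headers.entrySet()) {
            if (e.getKey() != null && e.getValue() != null) {
                clean.put(e.getKey(), e.getValue());
            }
        }
        return clean;
    }

    public static void main(String[] args) {
        Map<String, String> headers = new HashMap<>();
        headers.put("host", "node1");
        headers.put("timestamp", null); // the kind of entry that triggers the NPE
        System.out.println(sanitize(headers)); // only the non-null entry survives
    }
}
```

Whether to drop such headers silently or log a warning is exactly the policy question the issue description raises.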
[jira] [Updated] (FLUME-3036) Create a RegexSerializer for Hive Sink
[ https://issues.apache.org/jira/browse/FLUME-3036?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Roshan Naik updated FLUME-3036:
-------------------------------
Summary: Create a RegexSerializer for Hive Sink (was: Create a Hive Sink based on the Streaming support with RegexSerializer)

> Create a RegexSerializer for Hive Sink
> --------------------------------------
>
> Key: FLUME-3036
> URL: https://issues.apache.org/jira/browse/FLUME-3036
> Project: Flume
> Issue Type: New Feature
> Components: Sinks+Sources
> Reporter: Kalyan
> Assignee: Kalyan
> Attachments: FLUME-3036-docs.patch, FLUME-3036.patch
>
> Providing Hive Regex Serializer to work with Flume Streaming data.
> Based on existing hive sink using hive regex serde.

--
This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (FLUME-3036) Create a Hive Sink based on the Streaming support with RegexSerializer
[ https://issues.apache.org/jira/browse/FLUME-3036?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15829101#comment-15829101 ]

Roshan Naik commented on FLUME-3036:

I put my comments in GitHub.

> Create a Hive Sink based on the Streaming support with RegexSerializer
> ----------------------------------------------------------------------
>
> Key: FLUME-3036
> URL: https://issues.apache.org/jira/browse/FLUME-3036
> Project: Flume
> Issue Type: New Feature
> Components: Sinks+Sources
> Reporter: Kalyan
> Assignee: Kalyan
> Attachments: FLUME-3036-docs.patch, FLUME-3036.patch
>
> Providing Hive Regex Serializer to work with Flume Streaming data.
> Based on existing hive sink using hive regex serde.

--
This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (FLUME-3036) Create a Hive Sink based on the Streaming support with RegexSerializer
[ https://issues.apache.org/jira/browse/FLUME-3036?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15829050#comment-15829050 ]

Roshan Naik commented on FLUME-3036:

[~kalyanhadoop] Can you please comment on what other testing you were able to do in addition to the new unit test?

> Create a Hive Sink based on the Streaming support with RegexSerializer
> ----------------------------------------------------------------------
>
> Key: FLUME-3036
> URL: https://issues.apache.org/jira/browse/FLUME-3036
> Project: Flume
> Issue Type: New Feature
> Components: Sinks+Sources
> Reporter: Kalyan
> Assignee: Kalyan
> Attachments: FLUME-3036-docs.patch, FLUME-3036.patch
>
> Providing Hive Regex Serializer to work with Flume Streaming data.
> Based on existing hive sink using hive regex serde.

--
This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (FLUME-3036) Create a Hive Sink based on the Streaming support with RegexSerializer
[ https://issues.apache.org/jira/browse/FLUME-3036?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15786173#comment-15786173 ]

Roshan Naik commented on FLUME-3036:

Reviewed the doc and put my comments there in GitHub. I have not looked at the code yet.

> Create a Hive Sink based on the Streaming support with RegexSerializer
> ----------------------------------------------------------------------
>
> Key: FLUME-3036
> URL: https://issues.apache.org/jira/browse/FLUME-3036
> Project: Flume
> Issue Type: New Feature
> Components: Sinks+Sources
> Reporter: Kalyan
> Assignee: Kalyan
> Attachments: FLUME-3036-docs.patch, FLUME-3036.patch
>
> Providing Hive Regex Serializer to work with Flume Streaming data.
> Based on existing hive sink using hive regex serde.

--
This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (FLUME-3036) Create a Hive Sink based on the Streaming support with RegexSerializer
[ https://issues.apache.org/jira/browse/FLUME-3036?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Roshan Naik updated FLUME-3036:
-------------------------------
Assignee: Kalyan

> Create a Hive Sink based on the Streaming support with RegexSerializer
> ----------------------------------------------------------------------
>
> Key: FLUME-3036
> URL: https://issues.apache.org/jira/browse/FLUME-3036
> Project: Flume
> Issue Type: New Feature
> Components: Sinks+Sources
> Reporter: Kalyan
> Assignee: Kalyan
> Attachments: FLUME-3036.patch
>
> Providing Hive Regex Serializer to work with Flume Streaming data.
> Based on existing hive sink using hive regex serde.

--
This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (FLUME-3036) Create a Hive Sink based on the Streaming support with RegexSerializer
[ https://issues.apache.org/jira/browse/FLUME-3036?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15781609#comment-15781609 ]

Roshan Naik commented on FLUME-3036:

Can you update the user guide with info on this?

> Create a Hive Sink based on the Streaming support with RegexSerializer
> ----------------------------------------------------------------------
>
> Key: FLUME-3036
> URL: https://issues.apache.org/jira/browse/FLUME-3036
> Project: Flume
> Issue Type: New Feature
> Components: Sinks+Sources
> Reporter: Kalyan
> Attachments: FLUME-3036.patch
>
> Providing Hive Regex Serializer to work with Flume Streaming data.
> Based on existing hive sink using hive regex serde.

--
This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (FLUME-2892) Unable to compile & install flume with Maven
[ https://issues.apache.org/jira/browse/FLUME-2892?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15490818#comment-15490818 ]

Roshan Naik commented on FLUME-2892:

The console output looks like an inconsistent copy/paste job. An earlier section says:

{quote}[ERROR] Failed to execute goal org.apache.maven.plugins:maven-surefire-plugin:2.12.3:test (default-test) on project flume-jdbc-channel: There are test failures.{quote}

but later it says:

{quote}[INFO] Flume NG Spillable Memory channel ... FAILURE [15:02 min]{quote}

Not sure which is right. If you are having trouble running the tests, you can run Maven using the -DskipTests option.

> Unable to compile & install flume with Maven
> --------------------------------------------
>
> Key: FLUME-2892
> URL: https://issues.apache.org/jira/browse/FLUME-2892
> Project: Flume
> Issue Type: Bug
> Affects Versions: v1.6.0
> Environment: AWS running:
> Ubuntu
> Maven 3.3.4
> Java 1.6
> Reporter: Nathan Sturgess
> Priority: Critical
>
> This is my situation:
> http://stackoverflow.com/questions/35866078/flume-kafka-integration-produces-zookeeper-related-error
> I am trying to build flume with this patch for zookeeper and it is not working. Most recent example of errors:
> [INFO] ------------------------------------------------------------------------
> [INFO] BUILD FAILURE
> [INFO] ------------------------------------------------------------------------
> [INFO] Total time: 16:34 min
> [INFO] Finished at: 2016-03-10T12:01:54+00:00
> [INFO] Final Memory: 64M/192M
> [INFO] ------------------------------------------------------------------------
> [ERROR] Failed to execute goal org.apache.maven.plugins:maven-surefire-plugin:2.12.3:test (default-test) on project flume-jdbc-channel: There are test failures.
> [ERROR]
> [ERROR] Please refer to /home/bitnami/flume/flume-ng-channels/flume-jdbc-channel/target/surefire-reports for the individual test results.
> [ERROR] -> [Help 1]
> [ERROR]
> [ERROR] To see the full stack trace of the errors, re-run Maven with the -e switch.
> [ERROR] Re-run Maven using the -X switch to enable full debug logging.
> [ERROR]
> [ERROR] For more information about the errors and possible solutions, please read the following articles:
> [ERROR] [Help 1] http://cwiki.apache.org/confluence/display/MAVEN/MojoFailureException
> [ERROR]
> [ERROR] After correcting the problems, you can resume the build with the command
> [ERROR]   mvn -rf :flume-jdbc-channel
>
> [INFO] Flume NG Spillable Memory channel ... FAILURE [15:02 min]
> [INFO] Flume NG Node ... SKIPPED
> [INFO] Flume NG Embedded Agent ... SKIPPED
> [INFO] Flume NG HBase Sink ... SKIPPED
> [INFO] Flume NG ElasticSearch Sink ... SKIPPED
> [INFO] Flume NG Morphline Solr Sink ... SKIPPED
> [INFO] Flume Kafka Sink ... SKIPPED
> [INFO] Flume NG Kite Dataset Sink ... SKIPPED
> [INFO] Flume NG Hive Sink ... SKIPPED
> [INFO] Flume Sources ... SKIPPED
> [INFO] Flume Scribe Source ... SKIPPED
> [INFO] Flume JMS Source ... SKIPPED
> [INFO] Flume Twitter Source ... SKIPPED
> [INFO] Flume Kafka Source ... SKIPPED
> [INFO] flume-kafka-channel ... SKIPPED
> [INFO] Flume legacy Sources ... SKIPPED
> [INFO] Flume legacy Avro source ... SKIPPED
> [INFO] Flume legacy Thrift Source ... SKIPPED
> [INFO] Flume NG Clients ... SKIPPED
> [INFO] Flume NG Log4j Appender ... SKIPPED
> [INFO] Flume NG Tools ... SKIPPED
> [INFO] Flume NG distribution ... SKIPPED
> [INFO] Flume NG Integration Tests ... SKIPPED
> [INFO] ------------------------------------------------------------------------
> [INFO] BUILD FAILURE
> [INFO] ------------------------------------------------------------------------
> [INFO] Total time: 34:20 min
> [INFO] Finished at: 2016-03-10T12:52:02+00:00
> [INFO] Final Memory: 52M/247M
> [INFO] ------------------------------------------------------------------------
> [ERROR] Failed to execute goal org.apache.maven.plugins:maven-surefire-plugin:2.12.3:test (default-test) on project flume-spillable-memory-channel: ExecutionException; nested exception is java.util.concurrent.ExecutionException: java.lang.RuntimeException: The forked VM terminated wit
[jira] [Commented] (FLUME-2965) race condition in SpillableMemoryChannel log print
[ https://issues.apache.org/jira/browse/FLUME-2965?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15406832#comment-15406832 ]

Roshan Naik commented on FLUME-2965:

Thanks [~lqp276]. Took a quick look, and in general it looks like the right fix. It would be easier to identify all the changes in this fix if just the diff/patch were uploaded. Also please create a code review as suggested by Denes.

> race condition in SpillableMemoryChannel log print
> --------------------------------------------------
>
> Key: FLUME-2965
> URL: https://issues.apache.org/jira/browse/FLUME-2965
> Project: Flume
> Issue Type: Bug
> Components: Channel
> Affects Versions: v1.7.0
> Reporter: liqiaoping
> Priority: Minor
> Attachments: SpillableMemoryChannel.java
>
> Use SpillableMemoryChannel with the http blob handler, and send many requests concurrently. As jetty has a threadpool to handle incoming requests, the commits to SpillableMemoryChannel will be concurrent.
> The following code:
> @Override
> protected void doCommit() throws InterruptedException {
>     if (putCalled) {
>         putCommit();
>         if (LOGGER.isDebugEnabled()) {
>             LOGGER.debug("Put Committed. Drain Order Queue state : " + drainOrder.dump());
>         }
>
> In this method, drainOrder.dump() will iterate its internal queue, which in the meantime may have been changed by another thread, thus throwing a ConcurrentModificationException. This will make the channel processor try to roll back, even though the transaction has actually committed successfully.

--
This message was sent by Atlassian JIRA (v6.3.4#6332)
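The failure mode described in FLUME-2965 — a debug-log dump iterating a live queue while another thread mutates it — can be shown deterministically in a single thread with a fail-fast collection, and avoided by dumping a snapshot copy instead. This is a hypothetical sketch of the pattern, not the code from the attached SpillableMemoryChannel.java.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Deque;
import java.util.Iterator;
import java.util.LinkedList;
import java.util.List;

public class DumpRace {
    // Unsafe: iterating the live queue while it is modified triggers the
    // fail-fast iterator's ConcurrentModificationException. The addLast()
    // call stands in for a concurrent commit from another thread.
    public static String unsafeDump(Deque<Integer> q) {
        StringBuilder sb = new StringBuilder();
        for (Iterator<Integer> it = q.iterator(); it.hasNext(); ) {
            sb.append(it.next()).append(' ');
            q.addLast(99); // concurrent mutation -> CME on the next it.next()
        }
        return sb.toString();
    }

    // Safe: copy first, then iterate the snapshot; later mutation is harmless.
    public static String safeDump(Deque<Integer> q) {
        List<Integer> snapshot = new ArrayList<>(q);
        StringBuilder sb = new StringBuilder();
        for (Integer i : snapshot) {
            sb.append(i).append(' ');
            q.addLast(99); // same mutation, but we iterate the snapshot
        }
        return sb.toString().trim();
    }

    public static void main(String[] args) {
        Deque<Integer> q = new LinkedList<>(Arrays.asList(1, 2, 3));
        System.out.println(safeDump(q)); // prints "1 2 3"
        try {
            unsafeDump(new LinkedList<>(Arrays.asList(1, 2, 3)));
        } catch (java.util.ConcurrentModificationException e) {
            System.out.println("live dump failed: " + e);
        }
    }
}
```

The snapshot may be slightly stale, which is acceptable for a debug log line; the alternative is synchronizing dump() with the commit path.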
[jira] [Created] (FLUME-2947) Upgrade Hive and thrift libraries
Roshan Naik created FLUME-2947:
----------------------------------

Summary: Upgrade Hive and thrift libraries
Key: FLUME-2947
URL: https://issues.apache.org/jira/browse/FLUME-2947
Project: Flume
Issue Type: Bug
Components: Sinks+Sources
Reporter: Roshan Naik
Assignee: Roshan Naik

Upgrade the Hive version to use a new API call that captures additional info about the agent's sink name (HIVE-11956). Also upgrade the thrift libraries, as Hive is moving to 0.9.3 (HIVE-13724). Some of the 0.9.2 methods are now missing in 0.9.3, so we need to upgrade both.

--
This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Resolved] (FLUME-2762) Improve HDFS Sink performance
[ https://issues.apache.org/jira/browse/FLUME-2762?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Roshan Naik resolved FLUME-2762.
--------------------------------
Resolution: Invalid
Release Note: my ideas didn't pan out during prototyping

> Improve HDFS Sink performance
> -----------------------------
>
> Key: FLUME-2762
> URL: https://issues.apache.org/jira/browse/FLUME-2762
> Project: Flume
> Issue Type: Bug
> Components: Sinks+Sources
> Reporter: Roshan Naik
> Assignee: Roshan Naik
> Fix For: v1.7.0
>
> Have some thoughts around improving the HDFS sink's performance.

--
This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (FLUME-2762) Improve HDFS Sink performance
[ https://issues.apache.org/jira/browse/FLUME-2762?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Roshan Naik updated FLUME-2762:
-------------------------------
Fix Version/s: (was: v1.7.0)

> Improve HDFS Sink performance
> -----------------------------
>
> Key: FLUME-2762
> URL: https://issues.apache.org/jira/browse/FLUME-2762
> Project: Flume
> Issue Type: Bug
> Components: Sinks+Sources
> Reporter: Roshan Naik
> Assignee: Roshan Naik
>
> Have some thoughts around improving the HDFS sink's performance.

--
This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (FLUME-2914) Upgrade httpclient version 4.3.6
[ https://issues.apache.org/jira/browse/FLUME-2914?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Roshan Naik updated FLUME-2914:
-------------------------------
Attachment: FLUME-2914.v2.patch

> Upgrade httpclient version 4.3.6
> --------------------------------
>
> Key: FLUME-2914
> URL: https://issues.apache.org/jira/browse/FLUME-2914
> Project: Flume
> Issue Type: Bug
> Affects Versions: v1.7.0
> Reporter: Roshan Naik
> Assignee: Roshan Naik
> Attachments: FLUME-2914.patch, FLUME-2914.v2.patch
>
> There seem to be security vulnerabilities in httpclient from httpcomponents as per https://issues.apache.org/jira/browse/HADOOP-12767

--
This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (FLUME-2799) Kafka Source - Message Offset and Partition add to headers
[ https://issues.apache.org/jira/browse/FLUME-2799?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15319493#comment-15319493 ]

Roshan Naik commented on FLUME-2799:

[~michael.andre.pearce] any update?

> Kafka Source - Message Offset and Partition add to headers
> ----------------------------------------------------------
>
> Key: FLUME-2799
> URL: https://issues.apache.org/jira/browse/FLUME-2799
> Project: Flume
> Issue Type: Improvement
> Components: Sinks+Sources
> Affects Versions: v1.6.0
> Reporter: Michael Andre Pearce (IG)
> Priority: Minor
> Labels: easyfix, patch
> Fix For: v1.7.0
> Attachments: FLUME-2799-0.patch
> Original Estimate: 24h
> Remaining Estimate: 24h
>
> Currently the Kafka source only persists the original Kafka message's topic into the Flume event headers.
> For downstream interceptors and sinks that may want the partition and the offset available to them, we need to add these.
> Also it is noted that the conversion from MessageAndMetaData to FlumeEvent is not configurable, unlike other sources such as JMS.

--
This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (FLUME-2914) Upgrade httpclient version 4.3.6
[ https://issues.apache.org/jira/browse/FLUME-2914?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Roshan Naik updated FLUME-2914:
-------------------------------
Attachment: FLUME-2914.patch

> Upgrade httpclient version 4.3.6
> --------------------------------
>
> Key: FLUME-2914
> URL: https://issues.apache.org/jira/browse/FLUME-2914
> Project: Flume
> Issue Type: Bug
> Reporter: Roshan Naik
> Attachments: FLUME-2914.patch
>
> There seem to be security vulnerabilities in httpclient from httpcomponents as per https://issues.apache.org/jira/browse/HADOOP-12767

--
This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (FLUME-2914) Upgrade httpclient version 4.3.6
Roshan Naik created FLUME-2914:
----------------------------------

Summary: Upgrade httpclient version 4.3.6
Key: FLUME-2914
URL: https://issues.apache.org/jira/browse/FLUME-2914
Project: Flume
Issue Type: Bug
Reporter: Roshan Naik

There seem to be security vulnerabilities in httpclient from httpcomponents as per https://issues.apache.org/jira/browse/HADOOP-12767

--
This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (FLUME-2792) Flume Kafka Kerberos Support
[ https://issues.apache.org/jira/browse/FLUME-2792?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15287547#comment-15287547 ]

Roshan Naik commented on FLUME-2792:

Yes, in the next release.

> Flume Kafka Kerberos Support
> ----------------------------
>
> Key: FLUME-2792
> URL: https://issues.apache.org/jira/browse/FLUME-2792
> Project: Flume
> Issue Type: Bug
> Components: Configuration, Docs, Sinks+Sources
> Affects Versions: v1.6.0, v1.5.2
> Environment: HDP 2.3 fully kerberized including Kafka 0.8.2.2 + Flume 1.5.2 or Apache Flume 1.6 downloaded from apache.org
> Reporter: Hari Sekhon
> Priority: Blocker
>
> Following on from FLUME-2790 it appears as though Flume doesn't yet have support for Kafka + Kerberos, as there is no setting documented in the Flume 1.6.0 user guide under the Kafka source section to tell Flume to use plaintextsasl as the connection mechanism to Kafka, and Kafka rejects the unauthenticated plaintext mechanism:
> {code}
> 15/09/10 16:51:22 INFO consumer.ConsumerFetcherManager: [ConsumerFetcherManager-1441903874830] Added fetcher for partitions ArrayBuffer()
> 15/09/10 16:51:22 WARN consumer.ConsumerFetcherManager$LeaderFinderThread: [flume_-1441903874763-abdc98ec-leader-finder-thread], Failed to find leader for Set([,0], [,1])
> kafka.common.BrokerEndPointNotAvailableException: End point PLAINTEXT not found for broker 0
>         at kafka.cluster.Broker.getBrokerEndPoint(Broker.scala:140)
>         at kafka.utils.ZkUtils$$anonfun$getAllBrokerEndPointsForChannel$1.apply(ZkUtils.scala:124)
>         at kafka.utils.ZkUtils$$anonfun$getAllBrokerEndPointsForChannel$1.apply(ZkUtils.scala:124)
>         at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:244)
>         at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:244)
>         at scala.collection.mutable.ResizableArray$class.foreach(ResizableArray.scala:59)
>         at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:47)
>         at scala.collection.TraversableLike$class.map(TraversableLike.scala:244)
>         at scala.collection.AbstractTraversable.map(Traversable.scala:105)
>         at kafka.utils.ZkUtils$.getAllBrokerEndPointsForChannel(ZkUtils.scala:124)
>         at kafka.consumer.ConsumerFetcherManager$LeaderFinderThread.doWork(ConsumerFetcherManager.scala:66)
>         at kafka.utils.ShutdownableThread.run(ShutdownableThread.scala:60)
> {code}

--
This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (FLUME-2889) Fixes to DateTime computations
[ https://issues.apache.org/jira/browse/FLUME-2889?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15256883#comment-15256883 ]

Roshan Naik commented on FLUME-2889:

+1. I am running the tests and shall commit them once they pass.

> Fixes to DateTime computations
> ------------------------------
>
> Key: FLUME-2889
> URL: https://issues.apache.org/jira/browse/FLUME-2889
> Project: Flume
> Issue Type: Bug
> Affects Versions: v1.6.0
> Reporter: Roshan Naik
> Assignee: Tristan Stevens
> Fix For: v1.7.0
> Attachments: FLUME-2889-2.patch, FLUME-2889-4.patch, FLUME-2889.3.patch, FLUME-2889.patch
>
> date.withYear(year+1) can lead to incorrect date calculations, for example if the date is Feb 29th. We need to use date.plusYears(1) instead.

--
This message was sent by Atlassian JIRA (v6.3.4#6332)
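The FLUME-2889 description concerns Joda-Time, but the same pitfall can be illustrated with the JDK's java.time: rebuilding a date from its fields with year+1 blows up on Feb 29th, while plusYears(1) resolves to the nearest valid date. A hypothetical sketch, not the patched Flume code.

```java
import java.time.DateTimeException;
import java.time.LocalDate;

public class YearRollover {
    // Naive: reconstruct the date with the year field incremented.
    // Throws DateTimeException when the source date is a leap day.
    public static LocalDate naiveNextYear(LocalDate d) {
        return LocalDate.of(d.getYear() + 1, d.getMonthValue(), d.getDayOfMonth());
    }

    public static void main(String[] args) {
        LocalDate leapDay = LocalDate.of(2016, 2, 29);
        System.out.println(leapDay.plusYears(1)); // 2017-02-28, clamped to a valid date
        try {
            naiveNextYear(leapDay);
        } catch (DateTimeException e) {
            System.out.println("naive field math failed: " + e.getMessage());
        }
    }
}
```

Joda-Time's withYear behaves analogously for out-of-range day-of-month values, which is why the issue recommends plusYears(1).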
[jira] [Updated] (FLUME-2889) Fixes to DateTime computations
[ https://issues.apache.org/jira/browse/FLUME-2889?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Roshan Naik updated FLUME-2889:
-------------------------------
Assignee: Tristan Stevens (was: Roshan Naik)

> Fixes to DateTime computations
> ------------------------------
>
> Key: FLUME-2889
> URL: https://issues.apache.org/jira/browse/FLUME-2889
> Project: Flume
> Issue Type: Bug
> Affects Versions: v1.6.0
> Reporter: Roshan Naik
> Assignee: Tristan Stevens
> Fix For: v1.7.0
> Attachments: FLUME-2889-2.patch, FLUME-2889-4.patch, FLUME-2889.3.patch, FLUME-2889.patch
>
> date.withYear(year+1) can lead to incorrect date calculations, for example if the date is Feb 29th. We need to use date.plusYears(1) instead.

--
This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (FLUME-2901) Document Kerberos setup for Kafka channel
[ https://issues.apache.org/jira/browse/FLUME-2901?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Roshan Naik updated FLUME-2901:
-------------------------------
Attachment: FLUME-2901.v3.patch

Uploading revised patch v3 with a fix for the issue pointed out by [~jholoman].

> Document Kerberos setup for Kafka channel
> -----------------------------------------
>
> Key: FLUME-2901
> URL: https://issues.apache.org/jira/browse/FLUME-2901
> Project: Flume
> Issue Type: Bug
> Components: Docs
> Reporter: Roshan Naik
> Assignee: Roshan Naik
> Fix For: v1.7.0
> Attachments: FLUME-2901.patch, FLUME-2901.v2.patch, FLUME-2901.v3.patch
>
> Add details about configuring Kafka channel to work with a Kerberized Kafka cluster

--
This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (FLUME-2901) Document Kerberos setup for Kafka channel
[ https://issues.apache.org/jira/browse/FLUME-2901?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15250758#comment-15250758 ]

Roshan Naik commented on FLUME-2901:

[~jholoman] thanks for spotting that problem; I shall upload a fixed patch soon. I considered providing examples for the other modes too, but was not able to test them out. After briefly looking over the other modes, I leaned away from providing examples for each of them and instead just refer to them via a link, as it seemed a bit much (e.g. with and without authorization), although useful. I decided to include this example since the Kafka folks tell me that SASL_PLAINTEXT is the most common one.

> Document Kerberos setup for Kafka channel
> -----------------------------------------
>
> Key: FLUME-2901
> URL: https://issues.apache.org/jira/browse/FLUME-2901
> Project: Flume
> Issue Type: Bug
> Components: Docs
> Reporter: Roshan Naik
> Assignee: Roshan Naik
> Fix For: v1.7.0
> Attachments: FLUME-2901.patch, FLUME-2901.v2.patch
>
> Add details about configuring Kafka channel to work with a Kerberized Kafka cluster

--
This message was sent by Atlassian JIRA (v6.3.4#6332)
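For context on what the FLUME-2901 documentation covers, a configuration of this general shape points a Kafka channel at a SASL_PLAINTEXT (Kerberized) broker. This is a sketch, not the text of the attached patch: the agent/channel names are made up, and the exact property names depend on the Flume and Kafka client versions in use.

```properties
# Hypothetical agent "a1" with a Kafka channel "k1" against a Kerberized broker.
a1.channels.k1.type = org.apache.flume.channel.kafka.KafkaChannel
a1.channels.k1.kafka.bootstrap.servers = broker1.example.com:9092
a1.channels.k1.kafka.topic = flume-channel
# Kerberos over SASL_PLAINTEXT for both the producer and consumer sides
a1.channels.k1.kafka.consumer.security.protocol = SASL_PLAINTEXT
a1.channels.k1.kafka.consumer.sasl.kerberos.service.name = kafka
a1.channels.k1.kafka.producer.security.protocol = SASL_PLAINTEXT
a1.channels.k1.kafka.producer.sasl.kerberos.service.name = kafka
```

The agent JVM would additionally need a JAAS configuration supplying the keytab/principal, typically passed via `-Djava.security.auth.login.config=/path/to/jaas.conf` in the Flume environment.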
[jira] [Commented] (FLUME-2901) Document Kerberos setup for Kafka channel
[ https://issues.apache.org/jira/browse/FLUME-2901?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15248741#comment-15248741 ]

Roshan Naik commented on FLUME-2901:

[~hshreedharan] can you take a look?

> Document Kerberos setup for Kafka channel
> -----------------------------------------
>
> Key: FLUME-2901
> URL: https://issues.apache.org/jira/browse/FLUME-2901
> Project: Flume
> Issue Type: Bug
> Components: Docs
> Reporter: Roshan Naik
> Assignee: Roshan Naik
> Fix For: v1.7.0
> Attachments: FLUME-2901.patch, FLUME-2901.v2.patch
>
> Add details about configuring Kafka channel to work with a Kerberized Kafka cluster

--
This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (FLUME-2901) Document Kerberos setup for Kafka channel
[ https://issues.apache.org/jira/browse/FLUME-2901?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Roshan Naik updated FLUME-2901:
-------------------------------
Attachment: FLUME-2901.v2.patch

Revising patch with a minor addition to the documentation.

> Document Kerberos setup for Kafka channel
> -----------------------------------------
>
> Key: FLUME-2901
> URL: https://issues.apache.org/jira/browse/FLUME-2901
> Project: Flume
> Issue Type: Bug
> Components: Docs
> Reporter: Roshan Naik
> Assignee: Roshan Naik
> Fix For: v1.7.0
> Attachments: FLUME-2901.patch, FLUME-2901.v2.patch
>
> Add details about configuring Kafka channel to work with a Kerberized Kafka cluster

--
This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (FLUME-2901) Document Kerberos setup for Kafka channel
[ https://issues.apache.org/jira/browse/FLUME-2901?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Roshan Naik updated FLUME-2901:
-------------------------------
Attachment: FLUME-2901.patch

Uploading patch. Includes Kafka channel Kerberos documentation and also a minor fix to the existing doc (removed the capacity & transactionCapacity settings from the example, as they don't apply to this channel).

> Document Kerberos setup for Kafka channel
> -----------------------------------------
>
> Key: FLUME-2901
> URL: https://issues.apache.org/jira/browse/FLUME-2901
> Project: Flume
> Issue Type: Bug
> Components: Docs
> Reporter: Roshan Naik
> Assignee: Roshan Naik
> Fix For: v1.7.0
> Attachments: FLUME-2901.patch
>
> Add details about configuring Kafka channel to work with a Kerberized Kafka cluster

--
This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (FLUME-2901) Document Kerberos setup for Kafka channel
Roshan Naik created FLUME-2901:
----------------------------------

Summary: Document Kerberos setup for Kafka channel
Key: FLUME-2901
URL: https://issues.apache.org/jira/browse/FLUME-2901
Project: Flume
Issue Type: Bug
Components: Docs
Reporter: Roshan Naik
Assignee: Roshan Naik
Fix For: v1.7.0

Add details about configuring Kafka channel to work with a Kerberized Kafka cluster

--
This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Comment Edited] (FLUME-2433) Add kerberos support for Hive sink
[ https://issues.apache.org/jira/browse/FLUME-2433?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15238492#comment-15238492 ]

Roshan Naik edited comment on FLUME-2433 at 4/13/16 2:50 AM:
-------------------------------------------------------------

[~wpwang] can you please try the same with the hdfs sink and see if the same credentials work? I think you have some setup issue going on there. Also you might have to copy hive-site.xml and hdfs-site.xml into the flume classpath.

was (Author: roshan_naik):
[~wpwang] can you please try the same with the hdfs sink and see if the same credentials work? I think you have some setup issue going on there. Most likely you need to have your hive-site.xml and hdfs-site.xml copied over to the flume classpath.

> Add kerberos support for Hive sink
> ----------------------------------
>
> Key: FLUME-2433
> URL: https://issues.apache.org/jira/browse/FLUME-2433
> Project: Flume
> Issue Type: Bug
> Components: Sinks+Sources
> Affects Versions: v1.5.0.1
> Reporter: Roshan Naik
> Assignee: Roshan Naik
> Labels: HiveSink, Kerberos,
> Attachments: FLUME-2433.patch, FLUME-2433.v2.patch
>
> Add kerberos authentication support for Hive sink.
> FYI: The HCatalog API support for Kerberos is not available in hive 0.13.1; this should be available in the next hive release.

--
This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (FLUME-2433) Add kerberos support for Hive sink
[ https://issues.apache.org/jira/browse/FLUME-2433?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15238492#comment-15238492 ] Roshan Naik commented on FLUME-2433: [~wpwang] can you please try the same with the hdfs sink and see if the same credentials work? I think you have some setup issue going on there. Most likely you need to have your hive-site.xml and hdfs-site.xml copied onto the flume classpath. > Add kerberos support for Hive sink > -- > > Key: FLUME-2433 > URL: https://issues.apache.org/jira/browse/FLUME-2433 > Project: Flume > Issue Type: Bug > Components: Sinks+Sources >Affects Versions: v1.5.0.1 >Reporter: Roshan Naik >Assignee: Roshan Naik > Labels: HiveSink, Kerberos, > Attachments: FLUME-2433.patch, FLUME-2433.v2.patch > > > Add kerberos authentication support for Hive sink > FYI: The HCatalog API support for Kerberos is not available in hive 0.13.1 > this should be available in the next hive release. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (FLUME-2781) A Kafka Channel defined as parseAsFlumeEvent=false cannot be correctly used by a Flume source
[ https://issues.apache.org/jira/browse/FLUME-2781?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15220676#comment-15220676 ] Roshan Naik commented on FLUME-2781: Thanks for confirming [~gherreros]. I got confused when I saw this statement in some of the automated comments on this JIRA: {quote}"Kafka Channel with parseAsFlumeEvent=true should write data as is, not as flume events."{quote} which I believe is exactly the opposite of what is intended. > A Kafka Channel defined as parseAsFlumeEvent=false cannot be correctly used > by a Flume source > - > > Key: FLUME-2781 > URL: https://issues.apache.org/jira/browse/FLUME-2781 > Project: Flume > Issue Type: Improvement >Affects Versions: v1.6.0 >Reporter: Gonzalo Herreros >Assignee: Gonzalo Herreros > Labels: easyfix, patch > Fix For: v1.7.0 > > Attachments: FLUME-2781.patch > > > When a Kafka channel is configured as parseAsFlumeEvent=false, the channel > will read events from the topic as text instead of serialized Avro Flume > events. > This is useful so Flume can read from an existing Kafka topic, where other > Kafka clients publish as text. > However, if you use a Flume source on that channel, it will still write the > events as Avro so it will create an inconsistency and those events will fail > to be read correctly. > Also, this would allow a Flume source to write to a Kafka channel and any > Kafka subscriber to listen to Flume events passing through without binary > dependencies. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (FLUME-2781) A Kafka Channel defined as parseAsFlumeEvent=false cannot be correctly used by a Flume source
[ https://issues.apache.org/jira/browse/FLUME-2781?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15219177#comment-15219177 ] Roshan Naik commented on FLUME-2781: [~gherreros] this is a very useful feature! I don't see any note in the User Guide about this... so I needed clarification... Setting parseAsFlumeEvent=*false* will cause events to be written as-is into the Kafka topic without the FlumeEvent wrapper, right? > A Kafka Channel defined as parseAsFlumeEvent=false cannot be correctly used > by a Flume source > - > > Key: FLUME-2781 > URL: https://issues.apache.org/jira/browse/FLUME-2781 > Project: Flume > Issue Type: Improvement >Affects Versions: v1.6.0 >Reporter: Gonzalo Herreros >Assignee: Gonzalo Herreros > Labels: easyfix, patch > Fix For: v1.7.0 > > Attachments: FLUME-2781.patch > > > When a Kafka channel is configured as parseAsFlumeEvent=false, the channel > will read events from the topic as text instead of serialized Avro Flume > events. > This is useful so Flume can read from an existing Kafka topic, where other > Kafka clients publish as text. > However, if you use a Flume source on that channel, it will still write the > events as Avro so it will create an inconsistency and those events will fail > to be read correctly. > Also, this would allow a Flume source to write to a Kafka channel and any > Kafka subscriber to listen to Flume events passing through without binary > dependencies. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
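For reference, the behavior in question is toggled with a single channel property. The agent/channel names below are hypothetical (property naming here follows the 1.7-era Kafka channel; earlier releases named some channel settings differently): with {{false}}, message bodies are read and written as plain bytes rather than Avro-serialized FlumeEvents, so Flume headers are not carried across the channel.

```properties
# Hypothetical agent "a1" with Kafka channel "c1"
a1.channels.c1.type = org.apache.flume.channel.kafka.KafkaChannel
# false = interoperate with non-Flume Kafka clients; bodies pass through as-is
a1.channels.c1.parseAsFlumeEvent = false
```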
[jira] [Commented] (FLUME-2863) How to install Flume on Windows server (best practices)
[ https://issues.apache.org/jira/browse/FLUME-2863?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15216363#comment-15216363 ] Roshan Naik commented on FLUME-2863: I know that HDP bundles Flume on Windows (and an installer) since I work on HDP. But there might be other distros out there with Windows support too. > How to install Flume on Windows server (best practices) > --- > > Key: FLUME-2863 > URL: https://issues.apache.org/jira/browse/FLUME-2863 > Project: Flume > Issue Type: Bug > Components: Configuration >Affects Versions: v1.6.0 > Environment: Windows Server 2008 R2 Datacenter > Intel Xeon CPU e5 - 4Gb ram > Flume1.6.0. > Flume0.9.4 (exe) >Reporter: Alan Paiva >Priority: Trivial > Labels: build, newbie, starter, windows > Original Estimate: 24h > Remaining Estimate: 24h > > Hi guys, > I have to install flume on Windows environment and, have some issues.. > 1st attempt - I used a file flume-node0.9.4.exe from cloudera git website > (https://github.com/cloudera/flume/downloads) than, the log files on flume > give that error: > WARN agent.MultiMasterRPC: Could not connect to any master nodes (tried 1: > [localhost:35872]) > INFO agent.MultiMasterRPC: No active master RPC connection > 2nd attempt - I downloaded flume from apache website > (https://flume.apache.org/download.html) file: apache-flume-1.6.0-bin.tar.gz, > extract all files and configure files: flume.conf, flume-env.sh and, executed > flume-ng.cmd (command: flume-ng shaman) I don't receive any error message, > just simple shutdown of flume-ng.cmd window.. > Anyone has any experience with Flume on Windows sending .log files to HDFS in > another environment? > Attach my file flume.conf to all see what I wanna do.. 
> -- define source, channel and sink > shaman.sources = tail_source1 > shaman.channels = ch1 > shaman.sinks = hdfs_sink1 > -- define tail source > shaman.sources.tail_source1.type = exec > shaman.sources.tail_source1.channels = ch1 > shaman.sources.tail_source1.shell = /bin/bash -c > shaman.sources.tail_source1.command = tail -F F:/LogFiles/w3r54 > shaman.sources.tail_source1.interceptors = ts > shaman.sources.tail_source1.interceptors.ts.type = timestamp > -- define in-memory channel > shaman.channels.ch1.type = memory > shaman.channels.ch1.capacity = 10 > shaman.channels.ch1.transactionCapacity = 1000 > -- define HDFS sink properties > shaman.sinks.hdfs_sink1.type = hdfs > shaman.sinks.hdfs_sink1.hdfs.path = > hdfs://:8020/user/admin/fromflume/%y%m%d/%H%M%S > shaman.sinks.hdfs_sink1.hdfs.fileType = DataStream > shaman.sinks.hdfs_sink1.channel = ch1 > Thanks! > Alan.. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
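A side note for anyone reading the quoted flume.conf: Flume agent configs are Java properties files, so comments must start with {{#}} (the {{--}} lines above would be parsed as bogus keys), and a memory channel's capacity must be at least its transactionCapacity (the quoted config has capacity 10 vs. transactionCapacity 1000, which Flume rejects). A corrected sketch follows; the capacity value of 10000 is an illustrative choice, and the empty namenode host in hdfs.path is left elided as in the original.

```properties
# define source, channel and sink
shaman.sources = tail_source1
shaman.channels = ch1
shaman.sinks = hdfs_sink1

# define tail source
shaman.sources.tail_source1.type = exec
shaman.sources.tail_source1.channels = ch1
shaman.sources.tail_source1.shell = /bin/bash -c
shaman.sources.tail_source1.command = tail -F F:/LogFiles/w3r54
shaman.sources.tail_source1.interceptors = ts
shaman.sources.tail_source1.interceptors.ts.type = timestamp

# define in-memory channel; capacity must be >= transactionCapacity
shaman.channels.ch1.type = memory
shaman.channels.ch1.capacity = 10000
shaman.channels.ch1.transactionCapacity = 1000

# define HDFS sink properties (namenode host omitted, as in the original)
shaman.sinks.hdfs_sink1.type = hdfs
shaman.sinks.hdfs_sink1.hdfs.path = hdfs://:8020/user/admin/fromflume/%y%m%d/%H%M%S
shaman.sinks.hdfs_sink1.hdfs.fileType = DataStream
shaman.sinks.hdfs_sink1.channel = ch1
```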
[jira] [Commented] (FLUME-2889) Fixes to DateTime computations
[ https://issues.apache.org/jira/browse/FLUME-2889?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15210557#comment-15210557 ] Roshan Naik commented on FLUME-2889: Sorry [~tmgstev], I have not. Won't be able to get to it for another week at least. [~hshreedharan] are you able to take a look? > Fixes to DateTime computations > --- > > Key: FLUME-2889 > URL: https://issues.apache.org/jira/browse/FLUME-2889 > Project: Flume > Issue Type: Bug >Affects Versions: v1.6.0 >Reporter: Roshan Naik >Assignee: Roshan Naik > Fix For: v1.7.0 > > Attachments: FLUME-2889-2.patch, FLUME-2889-4.patch, > FLUME-2889.3.patch, FLUME-2889.patch > > > date.withYear(year+1) can lead to incorrect date calculations .. for example > if the date is Feb 29th. need to use date.plusYears(1) instead. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (FLUME-2889) Fixes to DateTime computations
[ https://issues.apache.org/jira/browse/FLUME-2889?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15186250#comment-15186250 ] Roshan Naik commented on FLUME-2889: Oh, I thought I had committed it... perhaps we can wait for the revision Tristan has in mind. [~tmgstev] please go ahead and reuse this JIRA. > Fixes to DateTime computations > --- > > Key: FLUME-2889 > URL: https://issues.apache.org/jira/browse/FLUME-2889 > Project: Flume > Issue Type: Bug >Affects Versions: v1.6.0 >Reporter: Roshan Naik >Assignee: Roshan Naik > Fix For: v1.7.0 > > Attachments: FLUME-2889-2.patch, FLUME-2889.3.patch, FLUME-2889.patch > > > date.withYear(year+1) can lead to incorrect date calculations .. for example > if the date is Feb 29th. need to use date.plusYears(1) instead. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (FLUME-2889) Fixes to DateTime computations
[ https://issues.apache.org/jira/browse/FLUME-2889?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15172541#comment-15172541 ] Roshan Naik commented on FLUME-2889: Shall wait for [~tmgstev] till end of day to see if he gets a chance to review this before committing. > Fixes to DateTime computations > --- > > Key: FLUME-2889 > URL: https://issues.apache.org/jira/browse/FLUME-2889 > Project: Flume > Issue Type: Bug >Affects Versions: v1.6.0 >Reporter: Roshan Naik >Assignee: Roshan Naik > Fix For: v1.7.0 > > Attachments: FLUME-2889-2.patch, FLUME-2889.3.patch, FLUME-2889.patch > > > date.withYear(year+1) can lead to incorrect date calculations .. for example > if the date is Feb 29th. need to use date.plusYears(1) instead. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (FLUME-2889) Fixes to DateTime computations
[ https://issues.apache.org/jira/browse/FLUME-2889?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15170096#comment-15170096 ] Roshan Naik commented on FLUME-2889: Yes, will wait for Tristan. Trying to assess the impact this bug has on Flume users. Could use some input here... My assessment is below: 1) SyslogAvroEventSerializer.java: here it could cause bad data to be written out to the file. 2) SyslogParser.java: this looks like it will apply a bad date in the timestamp header. That could then either end up causing events to go to the wrong destination, if that header is used to determine the destination... or even bad data, if the header value is somehow used during serialization. 3) In both cases, is the problem limited to adjusting leap year dates (Feb 29) only? Would be nice to see an example showing the problem. Thanks > Fixes to DateTime computations > --- > > Key: FLUME-2889 > URL: https://issues.apache.org/jira/browse/FLUME-2889 > Project: Flume > Issue Type: Bug >Affects Versions: v1.6.0 >Reporter: Roshan Naik >Assignee: Roshan Naik > Fix For: v1.7.0 > > Attachments: FLUME-2889-2.patch, FLUME-2889.3.patch, FLUME-2889.patch > > > date.withYear(year+1) can lead to incorrect date calculations .. for example > if the date is Feb 29th. need to use date.plusYears(1) instead. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
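To make the Feb 29 hazard concrete: Flume's syslog code has to attach a year to dates parsed from syslog timestamps (which carry no year), and forcing the year field onto Feb 29 of a leap year yields a date that does not exist in a non-leap target year, whereas plusYears(1) is defined to resolve it to Feb 28. Below is a small illustrative sketch of the same pitfall using the JDK's java.time rather than Flume's actual Joda-Time code; the class name is made up for the example.

```java
import java.time.DateTimeException;
import java.time.LocalDate;

public class LeapYearPitfall {
    public static void main(String[] args) {
        LocalDate d = LocalDate.of(2016, 2, 29);

        // plusYears resolves Feb 29 + 1 year to the last valid day, Feb 28
        System.out.println(d.plusYears(1));  // 2017-02-28

        // Rebuilding the date with a forced year field (the withYear-style
        // approach) fails, since 2017-02-29 does not exist
        try {
            System.out.println(LocalDate.of(d.getYear() + 1,
                    d.getMonthValue(), d.getDayOfMonth()));
        } catch (DateTimeException e) {
            System.out.println("invalid date: " + e.getMessage());
        }
    }
}
```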
[jira] [Commented] (FLUME-2889) Fixes to DateTime computations
[ https://issues.apache.org/jira/browse/FLUME-2889?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15170010#comment-15170010 ] Roshan Naik commented on FLUME-2889: [~hshreedharan] I am thinking we revert/override the previous commit with this patch? > Fixes to DateTime computations > --- > > Key: FLUME-2889 > URL: https://issues.apache.org/jira/browse/FLUME-2889 > Project: Flume > Issue Type: Bug >Affects Versions: v1.6.0 >Reporter: Roshan Naik >Assignee: Roshan Naik > Fix For: v1.7.0 > > Attachments: FLUME-2889-2.patch, FLUME-2889.3.patch, FLUME-2889.patch > > > date.withYear(year+1) can lead to incorrect date calculations .. for example > if the date is Feb 29th. need to use date.plusYears(1) instead. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Comment Edited] (FLUME-2889) Fixes to DateTime computations
[ https://issues.apache.org/jira/browse/FLUME-2889?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15169995#comment-15169995 ] Roshan Naik edited comment on FLUME-2889 at 2/26/16 10:43 PM: -- Thanks [~tmgstev] for catching that. I am revising the patch with the same fix applied to SyslogAvroEventSerializer.java. Not sure why there is duplication of logic. Since I'm short on time (Feb 26 already) I won't attempt to merge the two code pieces and upload another fix. Can you please review this v3 patch? was (Author: roshan_naik): Thanks [~tmgstev] for catching that. I am revising the patch with the same fix applied to SyslogAvroEventSerializer.java. Not sure why there is duplication of logic. Since I'm short on time (Feb 26 already) I won't attempt to merge the two code pieces and upload another fix. Can you please review this v4 patch? > Fixes to DateTime computations > --- > > Key: FLUME-2889 > URL: https://issues.apache.org/jira/browse/FLUME-2889 > Project: Flume > Issue Type: Bug >Affects Versions: v1.6.0 >Reporter: Roshan Naik >Assignee: Roshan Naik > Fix For: v1.7.0 > > Attachments: FLUME-2889-2.patch, FLUME-2889.3.patch, FLUME-2889.patch > > > date.withYear(year+1) can lead to incorrect date calculations .. for example > if the date is Feb 29th. need to use date.plusYears(1) instead. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (FLUME-2889) Fixes to DateTime computations
[ https://issues.apache.org/jira/browse/FLUME-2889?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Roshan Naik updated FLUME-2889: --- Attachment: FLUME-2889.3.patch Thanks [~tmgstev] for catching that. I am revising the patch with the same fix applied to SyslogAvroEventSerializer.java. Not sure why there is duplication of logic. Since I'm short on time (Feb 26 already) I won't attempt to merge the two code pieces and upload another fix. Can you please review this v4 patch? > Fixes to DateTime computations > --- > > Key: FLUME-2889 > URL: https://issues.apache.org/jira/browse/FLUME-2889 > Project: Flume > Issue Type: Bug >Affects Versions: v1.6.0 >Reporter: Roshan Naik >Assignee: Roshan Naik > Fix For: v1.7.0 > > Attachments: FLUME-2889-2.patch, FLUME-2889.3.patch, FLUME-2889.patch > > > date.withYear(year+1) can lead to incorrect date calculations .. for example > if the date is Feb 29th. need to use date.plusYears(1) instead. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (FLUME-2889) Fixes to DateTime computations
[ https://issues.apache.org/jira/browse/FLUME-2889?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Roshan Naik updated FLUME-2889: --- Fix Version/s: v1.7.0 > Fixes to DateTime computations > --- > > Key: FLUME-2889 > URL: https://issues.apache.org/jira/browse/FLUME-2889 > Project: Flume > Issue Type: Bug >Affects Versions: v1.6.0 >Reporter: Roshan Naik >Assignee: Roshan Naik > Fix For: v1.7.0 > > Attachments: FLUME-2889.patch > > > date.withYear(year+1) can lead to incorrect date calculations .. for example > if the date is Feb 29th. need to use date.plusYears(1) instead. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (FLUME-2889) Fixes to DateTime computations
[ https://issues.apache.org/jira/browse/FLUME-2889?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Roshan Naik updated FLUME-2889: --- Affects Version/s: v1.6.0 > Fixes to DateTime computations > --- > > Key: FLUME-2889 > URL: https://issues.apache.org/jira/browse/FLUME-2889 > Project: Flume > Issue Type: Bug >Affects Versions: v1.6.0 >Reporter: Roshan Naik >Assignee: Roshan Naik > Attachments: FLUME-2889.patch > > > date.withYear(year+1) can lead to incorrect date calculations .. for example > if the date is Feb 29th. need to use date.plusYears(1) instead. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (FLUME-2889) Fixes to DateTime computations
[ https://issues.apache.org/jira/browse/FLUME-2889?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Roshan Naik updated FLUME-2889: --- Attachment: FLUME-2889.patch Uploading patch... minor one-line fixes in a few places. > Fixes to DateTime computations > --- > > Key: FLUME-2889 > URL: https://issues.apache.org/jira/browse/FLUME-2889 > Project: Flume > Issue Type: Bug >Reporter: Roshan Naik >Assignee: Roshan Naik > Attachments: FLUME-2889.patch > > > date.withYear(year+1) can lead to incorrect date calculations .. for example > if the date is Feb 29th. need to use date.plusYears(1) instead. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (FLUME-2889) Fixes to DateTime computations
Roshan Naik created FLUME-2889: -- Summary: Fixes to DateTime computations Key: FLUME-2889 URL: https://issues.apache.org/jira/browse/FLUME-2889 Project: Flume Issue Type: Bug Reporter: Roshan Naik Assignee: Roshan Naik date.withYear(year+1) can lead to incorrect date calculations .. for example if the date is Feb 29th. need to use date.plusYears(1) instead. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (FLUME-2881) Windows Launch Script fails in plugins dir code
[ https://issues.apache.org/jira/browse/FLUME-2881?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Roshan Naik updated FLUME-2881: --- Assignee: Jonathan Smith > Windows Launch Script fails in plugins dir code > --- > > Key: FLUME-2881 > URL: https://issues.apache.org/jira/browse/FLUME-2881 > Project: Flume > Issue Type: Bug > Components: Configuration, Windows >Affects Versions: v1.6.0 > Environment: Tested on Windows 7 and Windows 8 >Reporter: Jonathan Smith >Assignee: Jonathan Smith > Labels: easyfix, patch, windows > Fix For: v1.7.0 > > Attachments: fix_windows_launch.patch, op-addition-not-found.log > > > Running flume-ng.cmd results in the attached error from the Windows command > line. > The problem seems to originate in flume-ng.ps1, line 323 where the plugins > are added to the class path. Adding together directory information does not > seem to be supported on windows 7 or 8. I was able to fix the problem by > separating out the two plugin directories in the attached patch. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (FLUME-2881) Windows Launch Script fails in plugins dir code
[ https://issues.apache.org/jira/browse/FLUME-2881?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Roshan Naik updated FLUME-2881: --- Fix Version/s: v1.7.0 > Windows Launch Script fails in plugins dir code > --- > > Key: FLUME-2881 > URL: https://issues.apache.org/jira/browse/FLUME-2881 > Project: Flume > Issue Type: Bug > Components: Configuration, Windows >Affects Versions: v1.6.0 > Environment: Tested on Windows 7 and Windows 8 >Reporter: Jonathan Smith > Labels: easyfix, patch, windows > Fix For: v1.7.0 > > Attachments: fix_windows_launch.patch, op-addition-not-found.log > > > Running flume-ng.cmd results in the attached error from the Windows command > line. > The problem seems to originate in flume-ng.ps1, line 323 where the plugins > are added to the class path. Adding together directory information does not > seem to be supported on windows 7 or 8. I was able to fix the problem by > separating out the two plugin directories in the attached patch. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (FLUME-2881) Windows Launch Script fails in plugins dir code
[ https://issues.apache.org/jira/browse/FLUME-2881?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15151297#comment-15151297 ] Roshan Naik commented on FLUME-2881: Added some jars in plugins.d/1/lib/x.jar, plugins.d/1/lib/y.jar and plugins.d/1/libext/x.jar but it didn't make a difference. Anyway, I tested that your fixes to the script work on my setup... and since others may also experience the same issue that you noticed, I am +1 on this and will commit it shortly. Thanks for the patch [~jonathansmith] > Windows Launch Script fails in plugins dir code > --- > > Key: FLUME-2881 > URL: https://issues.apache.org/jira/browse/FLUME-2881 > Project: Flume > Issue Type: Bug > Components: Configuration, Windows >Affects Versions: v1.6.0 > Environment: Tested on Windows 7 and Windows 8 >Reporter: Jonathan Smith > Labels: easyfix, patch, windows > Attachments: fix_windows_launch.patch, op-addition-not-found.log > > > Running flume-ng.cmd results in the attached error from the Windows command > line. > The problem seems to originate in flume-ng.ps1, line 323 where the plugins > are added to the class path. Adding together directory information does not > seem to be supported on windows 7 or 8. I was able to fix the problem by > separating out the two plugin directories in the attached patch. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (FLUME-2881) Windows Launch Script fails in plugins dir code
[ https://issues.apache.org/jira/browse/FLUME-2881?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Roshan Naik updated FLUME-2881: --- Summary: Windows Launch Script fails in plugins dir code (was: Windows Launch Script fails) > Windows Launch Script fails in plugins dir code > --- > > Key: FLUME-2881 > URL: https://issues.apache.org/jira/browse/FLUME-2881 > Project: Flume > Issue Type: Bug > Components: Configuration, Windows >Affects Versions: v1.6.0 > Environment: Tested on Windows 7 and Windows 8 >Reporter: Jonathan Smith > Labels: easyfix, patch, windows > Attachments: fix_windows_launch.patch, op-addition-not-found.log > > > Running flume-ng.cmd results in the attached error from the Windows command > line. > The problem seems to originate in flume-ng.ps1, line 323 where the plugins > are added to the class path. Adding together directory information does not > seem to be supported on windows 7 or 8. I was able to fix the problem by > separating out the two plugin directories in the attached patch. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (FLUME-2881) Windows Launch Script fails
[ https://issues.apache.org/jira/browse/FLUME-2881?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15151168#comment-15151168 ] Roshan Naik commented on FLUME-2881: Oh I see. I did add some plugin dirs but it didn't seem to make a difference. Let me try adding some jars there and check. > Windows Launch Script fails > --- > > Key: FLUME-2881 > URL: https://issues.apache.org/jira/browse/FLUME-2881 > Project: Flume > Issue Type: Bug > Components: Configuration, Windows >Affects Versions: v1.6.0 > Environment: Tested on Windows 7 and Windows 8 >Reporter: Jonathan Smith > Labels: easyfix, patch, windows > Attachments: fix_windows_launch.patch, op-addition-not-found.log > > > Running flume-ng.cmd results in the attached error from the Windows command > line. > The problem seems to originate in flume-ng.ps1, line 323 where the plugins > are added to the class path. Adding together directory information does not > seem to be supported on windows 7 or 8. I was able to fix the problem by > separating out the two plugin directories in the attached patch. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (FLUME-2881) Windows Launch Script fails
[ https://issues.apache.org/jira/browse/FLUME-2881?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15149627#comment-15149627 ] Roshan Naik commented on FLUME-2881: [~jonathansmith] I am able to run the script just fine without the need for this fix. What version of PowerShell are you using? I used 4.0:
{code}
PS C:\Users\Administrator\Downloads\apache-flume-1.6.0-bin> $PSVersionTable

Name                      Value
----                      -----
PSVersion                 4.0
WSManStackVersion         3.0
SerializationVersion      1.1.0.1
CLRVersion                4.0.30319.42000
BuildVersion              6.3.9600.17400
PSCompatibleVersions      {1.0, 2.0, 3.0, 4.0}
PSRemotingProtocolVersion 2.2
{code}
> Windows Launch Script fails > --- > > Key: FLUME-2881 > URL: https://issues.apache.org/jira/browse/FLUME-2881 > Project: Flume > Issue Type: Bug > Components: Configuration, Windows >Affects Versions: v1.6.0 > Environment: Tested on Windows 7 and Windows 8 >Reporter: Jonathan Smith > Labels: easyfix, patch, windows > Attachments: fix_windows_launch.patch, op-addition-not-found.log > > > Running flume-ng.cmd results in the attached error from the Windows command > line. > The problem seems to originate in flume-ng.ps1, line 323 where the plugins > are added to the class path. Adding together directory information does not > seem to be supported on windows 7 or 8. I was able to fix the problem by > separating out the two plugin directories in the attached patch. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (FLUME-2881) Windows Launch Script fails
[ https://issues.apache.org/jira/browse/FLUME-2881?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15149367#comment-15149367 ] Roshan Naik commented on FLUME-2881: sure.. will take a look by tomorrow. > Windows Launch Script fails > --- > > Key: FLUME-2881 > URL: https://issues.apache.org/jira/browse/FLUME-2881 > Project: Flume > Issue Type: Bug > Components: Configuration, Windows >Affects Versions: v1.6.0 > Environment: Tested on Windows 7 and Windows 8 >Reporter: Jonathan Smith > Labels: easyfix, patch, windows > Attachments: fix_windows_launch.patch, op-addition-not-found.log > > > Running flume-ng.cmd results in the attached error from the Windows command > line. > The problem seems to originate in flume-ng.ps1, line 323 where the plugins > are added to the class path. Adding together directory information does not > seem to be supported on windows 7 or 8. I was able to fix the problem by > separating out the two plugin directories in the attached patch. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (FLUME-2799) Kafka Source - Message Offset and Partition add to headers
[ https://issues.apache.org/jira/browse/FLUME-2799?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15102905#comment-15102905 ] Roshan Naik commented on FLUME-2799: In flume-ng-doc/sphinx/FlumeUserGuide.rst look for the Kafka source section. > Kafka Source - Message Offset and Partition add to headers > -- > > Key: FLUME-2799 > URL: https://issues.apache.org/jira/browse/FLUME-2799 > Project: Flume > Issue Type: Improvement > Components: Sinks+Sources >Affects Versions: v1.6.0 >Reporter: Michael Andre Pearce (IG) >Priority: Minor > Labels: easyfix, patch > Fix For: v1.7.0 > > Attachments: FLUME-2799-0.patch > > Original Estimate: 24h > Remaining Estimate: 24h > > Currently Kafka source only persists the original kafka message's topic into > the Flume event headers. > For downstream interceptors and sinks that may want to have available to them > the partition and the offset , we need to add these. > Also it is noted that the conversion from MessageAndMetaData to FlumeEvent is > not configurable unlike other sources such as JMS. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (FLUME-2704) Configurable poll delay for spooling directory source
[ https://issues.apache.org/jira/browse/FLUME-2704?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15102901#comment-15102901 ] Roshan Naik commented on FLUME-2704: [~jrufus] can you commit this if it looks good? > Configurable poll delay for spooling directory source > - > > Key: FLUME-2704 > URL: https://issues.apache.org/jira/browse/FLUME-2704 > Project: Flume > Issue Type: Improvement > Components: Sinks+Sources >Affects Versions: v1.6.0, v1.5.2 >Reporter: Somin Mithraa >Assignee: Somin Mithraa >Priority: Minor > Labels: SpoolDir, pollDelay, sources > Attachments: FLUME-2704.patch > > > SpoolDir source polls a directory for new files at specific interval. This > interval(or poll delay) is currently hardcoded as 500ms. > 500ms may be too fast for some applications. This JIRA is to make this > property configurable. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
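Once committed, the previously hard-coded 500 ms poll interval becomes configurable roughly as sketched below. The agent/source names and spool path are hypothetical, the value is in milliseconds, and the final property name should be confirmed against the patch/User Guide.

```properties
a1.sources.spool1.type = spooldir
a1.sources.spool1.spoolDir = /var/log/incoming
a1.sources.spool1.channels = ch1
# poll for new files every 5 seconds instead of the old fixed 500 ms
a1.sources.spool1.pollDelay = 5000
```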
[jira] [Created] (FLUME-2865) Upgrade thrift version to 0.9.3
Roshan Naik created FLUME-2865: -- Summary: Upgrade thrift version to 0.9.3 Key: FLUME-2865 URL: https://issues.apache.org/jira/browse/FLUME-2865 Project: Flume Issue Type: Bug Reporter: Roshan Naik Assignee: Roshan Naik Hive is now moving to thrift v0.9.3 and some older symbols are missing in this newer thrift version. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (FLUME-2799) Kafka Source - Message Offset and Partition add to headers
[ https://issues.apache.org/jira/browse/FLUME-2799?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15098913#comment-15098913 ] Roshan Naik commented on FLUME-2799: - I don't think it is wise to block this requirement on Kafka 0.9. This ability seems useful in its own right. - Functionally, it does seem to overlap with the notion of interceptors even if that's not the intention. JMS converters deal more with the body and less with headers. - If each source implements its own converters, it is better to have a common reusable converter system shared by other sources. Which can then bring into question the need for interceptors. - Although well motivated, it feels excessive to introduce converters in this ticket, which deals with merely adding a couple of headers. - My thoughts: + Documentation needs an update. + Make the new headers optional (and disabled by default) so that existing users don't see any impact. + If you are willing to simplify it to do this without converters, it would make this a simpler review and require less debate. Unless you have any other thoughts? > Kafka Source - Message Offset and Partition add to headers > -- > > Key: FLUME-2799 > URL: https://issues.apache.org/jira/browse/FLUME-2799 > Project: Flume > Issue Type: Improvement > Components: Sinks+Sources >Affects Versions: v1.6.0 >Reporter: Michael Andre Pearce (IG) >Priority: Minor > Labels: easyfix, patch > Fix For: v1.7.0 > > Attachments: FLUME-2799-0.patch > > Original Estimate: 24h > Remaining Estimate: 24h > > Currently Kafka source only persists the original kafka message's topic into > the Flume event headers. > For downstream interceptors and sinks that may want to have available to them > the partition and the offset , we need to add these. > Also it is noted that the conversion from MessageAndMetaData to FlumeEvent is > not configurable unlike other sources such as JMS. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (FLUME-2799) Kafka Source - Message Offset and Partition add to headers
[ https://issues.apache.org/jira/browse/FLUME-2799?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15097032#comment-15097032 ] Roshan Naik commented on FLUME-2799: My comments: - Needs an update to documentation wrt the Converter, converter config and custom converters. - The default converter here is applying 4 headers (2 old, plus 2 new ones). Adding headers to every event is expensive in terms of memory (and also some CPU due to added GC pressure). Which headers to apply should be user selectable, with the default settings preserving existing behavior. - The need for introducing a Converter for adding additional headers may be a bit overkill, but acceptable. > Kafka Source - Message Offset and Partition add to headers > -- > > Key: FLUME-2799 > URL: https://issues.apache.org/jira/browse/FLUME-2799 > Project: Flume > Issue Type: Improvement > Components: Sinks+Sources >Affects Versions: v1.6.0 >Reporter: Michael Andre Pearce (IG) >Priority: Minor > Labels: easyfix, patch > Fix For: v1.7.0 > > Attachments: FLUME-2799-0.patch > > Original Estimate: 24h > Remaining Estimate: 24h > > Currently Kafka source only persists the original kafka message's topic into > the Flume event headers. > For downstream interceptors and sinks that may want to have available to them > the partition and the offset , we need to add these. > Also it is noted that the conversion from MessageAndMetaData to FlumeEvent is > not configurable unlike other sources such as JMS. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (FLUME-2799) Kafka Source - Message Offset and Partition add to headers
[ https://issues.apache.org/jira/browse/FLUME-2799?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15096971#comment-15096971 ] Roshan Naik commented on FLUME-2799: [~gwenshap] would you be able to review this? > Kafka Source - Message Offset and Partition add to headers > -- > > Key: FLUME-2799 > URL: https://issues.apache.org/jira/browse/FLUME-2799 > Project: Flume > Issue Type: Improvement > Components: Sinks+Sources >Affects Versions: v1.6.0 >Reporter: Michael Andre Pearce (IG) >Priority: Minor > Labels: easyfix, patch > Fix For: v1.7.0 > > Attachments: FLUME-2799-0.patch > > Original Estimate: 24h > Remaining Estimate: 24h > > Currently Kafka source only persists the original kafka message's topic into > the Flume event headers. > For downstream interceptors and sinks that may want to have available to them > the partition and the offset , we need to add these. > Also it is noted that the conversion from MessageAndMetaData to FlumeEvent is > not configurable unlike other sources such as JMS. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (FLUME-2806) flume-ng.ps1 Error running script to start an agent on Windows
[ https://issues.apache.org/jira/browse/FLUME-2806?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15074295#comment-15074295 ] Roshan Naik commented on FLUME-2806: Thanks [~lmousseau] for the contribution. Committed. > flume-ng.ps1 Error running script to start an agent on Windows > -- > > Key: FLUME-2806 > URL: https://issues.apache.org/jira/browse/FLUME-2806 > Project: Flume > Issue Type: Bug >Affects Versions: v1.6.0 > Environment: Windows 8 >Reporter: Liam Mousseau >Assignee: Liam Mousseau > Fix For: v1.7.0 > > Attachments: flume-ng.ps1.txt > > Original Estimate: 1h > Remaining Estimate: 1h > > Error: > {noformat} > C:\...\flume-ng.ps1 : The term 'ss' is not recognized as the name of a > cmdlet, function, > script file, or operable program. Check the spelling of the name, or if a > path was included, verify that the path is > correct and try again. > At line:1 char:1 > + .\bin\flume-ng.ps1 agent -n Flume_Test_Agent -f > conf\flume-conf.properties.templ ... > + > > + CategoryInfo : ObjectNotFound: (ss:String) [flume-ng.ps1], > CommandNotFoundException > + FullyQualifiedErrorId : CommandNotFoundException,flume-ng.ps1 > {noformat} > Fix: Remove the 'ss' on line 169: > {noformat} > ... > Function GetJavaPath { > if ($env:JAVA_HOME) { > return "$env:JAVA_HOME\bin\java.exe" }ss > Write-Host "WARN: JAVA_HOME not set" > return '"' + (Resolve-Path "java.exe").Path + '"' > } > ... > {noformat} > Work-around: Remove the ss on line 169 manually in the powershell script and > try again. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Resolved] (FLUME-2806) flume-ng.ps1 Error running script to start an agent on Windows
[ https://issues.apache.org/jira/browse/FLUME-2806?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Roshan Naik resolved FLUME-2806. Resolution: Fixed Fix Version/s: v1.7.0 > flume-ng.ps1 Error running script to start an agent on Windows > -- > > Key: FLUME-2806 > URL: https://issues.apache.org/jira/browse/FLUME-2806 > Project: Flume > Issue Type: Bug >Affects Versions: v1.6.0 > Environment: Windows 8 >Reporter: Liam Mousseau >Assignee: Liam Mousseau > Fix For: v1.7.0 > > Attachments: flume-ng.ps1.txt > > Original Estimate: 1h > Remaining Estimate: 1h > > Error: > {noformat} > C:\...\flume-ng.ps1 : The term 'ss' is not recognized as the name of a > cmdlet, function, > script file, or operable program. Check the spelling of the name, or if a > path was included, verify that the path is > correct and try again. > At line:1 char:1 > + .\bin\flume-ng.ps1 agent -n Flume_Test_Agent -f > conf\flume-conf.properties.templ ... > + > > + CategoryInfo : ObjectNotFound: (ss:String) [flume-ng.ps1], > CommandNotFoundException > + FullyQualifiedErrorId : CommandNotFoundException,flume-ng.ps1 > {noformat} > Fix: Remove the 'ss' on line 169: > {noformat} > ... > Function GetJavaPath { > if ($env:JAVA_HOME) { > return "$env:JAVA_HOME\bin\java.exe" }ss > Write-Host "WARN: JAVA_HOME not set" > return '"' + (Resolve-Path "java.exe").Path + '"' > } > ... > {noformat} > Work-around: Remove the ss on line 169 manually in the powershell script and > try again. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (FLUME-2806) flume-ng.ps1 Error running script to start an agent on Windows
[ https://issues.apache.org/jira/browse/FLUME-2806?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15074287#comment-15074287 ] Roshan Naik commented on FLUME-2806: +1 > flume-ng.ps1 Error running script to start an agent on Windows > -- > > Key: FLUME-2806 > URL: https://issues.apache.org/jira/browse/FLUME-2806 > Project: Flume > Issue Type: Bug >Affects Versions: v1.6.0 > Environment: Windows 8 >Reporter: Liam Mousseau >Assignee: Liam Mousseau > Attachments: flume-ng.ps1.txt > > Original Estimate: 1h > Remaining Estimate: 1h > > Error: > {noformat} > C:\...\flume-ng.ps1 : The term 'ss' is not recognized as the name of a > cmdlet, function, > script file, or operable program. Check the spelling of the name, or if a > path was included, verify that the path is > correct and try again. > At line:1 char:1 > + .\bin\flume-ng.ps1 agent -n Flume_Test_Agent -f > conf\flume-conf.properties.templ ... > + > > + CategoryInfo : ObjectNotFound: (ss:String) [flume-ng.ps1], > CommandNotFoundException > + FullyQualifiedErrorId : CommandNotFoundException,flume-ng.ps1 > {noformat} > Fix: Remove the 'ss' on line 169: > {noformat} > ... > Function GetJavaPath { > if ($env:JAVA_HOME) { > return "$env:JAVA_HOME\bin\java.exe" }ss > Write-Host "WARN: JAVA_HOME not set" > return '"' + (Resolve-Path "java.exe").Path + '"' > } > ... > {noformat} > Work-around: Remove the ss on line 169 manually in the powershell script and > try again. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (FLUME-2806) flume-ng.ps1 Error running script to start an agent on Windows
[ https://issues.apache.org/jira/browse/FLUME-2806?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Roshan Naik updated FLUME-2806: --- Assignee: Liam Mousseau > flume-ng.ps1 Error running script to start an agent on Windows > -- > > Key: FLUME-2806 > URL: https://issues.apache.org/jira/browse/FLUME-2806 > Project: Flume > Issue Type: Bug >Affects Versions: v1.6.0 > Environment: Windows 8 >Reporter: Liam Mousseau >Assignee: Liam Mousseau > Attachments: flume-ng.ps1.txt > > Original Estimate: 1h > Remaining Estimate: 1h > > Error: > {noformat} > C:\...\flume-ng.ps1 : The term 'ss' is not recognized as the name of a > cmdlet, function, > script file, or operable program. Check the spelling of the name, or if a > path was included, verify that the path is > correct and try again. > At line:1 char:1 > + .\bin\flume-ng.ps1 agent -n Flume_Test_Agent -f > conf\flume-conf.properties.templ ... > + > > + CategoryInfo : ObjectNotFound: (ss:String) [flume-ng.ps1], > CommandNotFoundException > + FullyQualifiedErrorId : CommandNotFoundException,flume-ng.ps1 > {noformat} > Fix: Remove the 'ss' on line 169: > {noformat} > ... > Function GetJavaPath { > if ($env:JAVA_HOME) { > return "$env:JAVA_HOME\bin\java.exe" }ss > Write-Host "WARN: JAVA_HOME not set" > return '"' + (Resolve-Path "java.exe").Path + '"' > } > ... > {noformat} > Work-around: Remove the ss on line 169 manually in the powershell script and > try again. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Issue Comment Deleted] (FLUME-2854) Parameterize jetty version in pom
[ https://issues.apache.org/jira/browse/FLUME-2854?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Roshan Naik updated FLUME-2854: --- Comment: was deleted (was: +1) > Parameterize jetty version in pom > - > > Key: FLUME-2854 > URL: https://issues.apache.org/jira/browse/FLUME-2854 > Project: Flume > Issue Type: Bug >Reporter: Sriharsha Chintalapani >Assignee: Sriharsha Chintalapani > -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (FLUME-2854) Parameterize jetty version in pom
[ https://issues.apache.org/jira/browse/FLUME-2854?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15062566#comment-15062566 ] Roshan Naik commented on FLUME-2854: +1 > Parameterize jetty version in pom > - > > Key: FLUME-2854 > URL: https://issues.apache.org/jira/browse/FLUME-2854 > Project: Flume > Issue Type: Bug >Reporter: Sriharsha Chintalapani >Assignee: Sriharsha Chintalapani > -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Comment Edited] (FLUME-2801) Performance improvement on TailDir source
[ https://issues.apache.org/jira/browse/FLUME-2801?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15062513#comment-15062513 ] Roshan Naik edited comment on FLUME-2801 at 12/17/15 6:45 PM: -- +1 Thanks [~iijima_satoshi] for the review. I'm running tests and will commit soon. was (Author: roshan_naik): Thanks [~iijima_satoshi] for the review. Im running tests.. will commit soon. > Performance improvement on TailDir source > - > > Key: FLUME-2801 > URL: https://issues.apache.org/jira/browse/FLUME-2801 > Project: Flume > Issue Type: Improvement > Components: Sinks+Sources >Affects Versions: v1.7.0 >Reporter: Jun Seok Hong >Assignee: Jun Seok Hong > Fix For: v1.7.0 > > Attachments: FLUME-2801-1.patch, FLUME-2801-2.patch, FLUME-2801.patch > > > This a proposal of performance improvement for new tailing source FLUME-2498. > Taildir source reads a file by 1byte, so the performance is very low compared > to tailing on exec source. > I tested lot's of ways to improve performance and implemented the best one. > Changes. > * Reading a file by a 8k block instead of 1 byte. > * Use byte[] for handling data instead of > ByteArrayDataOutput/ByteBuffer(direct)/.. for better performance. > * Don't convert byte[] to string and vice verse. > Simple file reading test results. > {quote} > File size: 100 MB, > Line size: 500 byte > Estimated time to read the file: > |Reading 1byte(Using the code in Taildir)|32544 ms| > |Reading 8K Block|431 ms| > {quote} > Testing on flume, it catches up the performance of tailing on exec source. > (30x performance boost) -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (FLUME-2801) Performance improvement on TailDir source
[ https://issues.apache.org/jira/browse/FLUME-2801?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15062513#comment-15062513 ] Roshan Naik commented on FLUME-2801: Thanks [~iijima_satoshi] for the review. I'm running tests and will commit soon. > Performance improvement on TailDir source > - > > Key: FLUME-2801 > URL: https://issues.apache.org/jira/browse/FLUME-2801 > Project: Flume > Issue Type: Improvement > Components: Sinks+Sources >Affects Versions: v1.7.0 >Reporter: Jun Seok Hong >Assignee: Jun Seok Hong > Fix For: v1.7.0 > > Attachments: FLUME-2801-1.patch, FLUME-2801-2.patch, FLUME-2801.patch > > > This a proposal of performance improvement for new tailing source FLUME-2498. > Taildir source reads a file by 1byte, so the performance is very low compared > to tailing on exec source. > I tested lot's of ways to improve performance and implemented the best one. > Changes. > * Reading a file by a 8k block instead of 1 byte. > * Use byte[] for handling data instead of > ByteArrayDataOutput/ByteBuffer(direct)/.. for better performance. > * Don't convert byte[] to string and vice verse. > Simple file reading test results. > {quote} > File size: 100 MB, > Line size: 500 byte > Estimated time to read the file: > |Reading 1byte(Using the code in Taildir)|32544 ms| > |Reading 8K Block|431 ms| > {quote} > Testing on flume, it catches up the performance of tailing on exec source. > (30x performance boost) -- This message was sent by Atlassian JIRA (v6.3.4#6332)
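The change the patch describes, reading the file in 8 KiB blocks rather than one byte at a time, can be illustrated with a small plain-Java sketch. This shows the general technique only and is not the actual TaildirSource code; class and method names here are made up for illustration.

```java
import java.io.File;
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.IOException;

public class BlockReadDemo {
    // Slow path: one stream call per byte, as the original Taildir reader did.
    static long countBytesOneAtATime(File f) throws IOException {
        long n = 0;
        try (FileInputStream in = new FileInputStream(f)) {
            while (in.read() != -1) {
                n++;
            }
        }
        return n;
    }

    // Fast path: fill an 8 KiB buffer per call; line splitting can then
    // scan the byte[] directly instead of converting to String and back.
    static long countBytesBlocked(File f) throws IOException {
        long n = 0;
        byte[] buf = new byte[8192];
        try (FileInputStream in = new FileInputStream(f)) {
            int r;
            while ((r = in.read(buf)) != -1) {
                n += r;
            }
        }
        return n;
    }

    public static void main(String[] args) throws IOException {
        File f = File.createTempFile("blockread", ".tmp");
        f.deleteOnExit();
        try (FileOutputStream out = new FileOutputStream(f)) {
            out.write(new byte[100000]);
        }
        // Both strategies see the same bytes; only the number of
        // underlying read calls differs (100000 vs at most 13 here).
        System.out.println(countBytesOneAtATime(f)); // 100000
        System.out.println(countBytesBlocked(f));    // 100000
    }
}
```

The large speedup reported in the issue (32544 ms vs 431 ms for a 100 MB file) comes from this reduction in per-call overhead.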
[jira] [Commented] (FLUME-2801) Performance improvement on TailDir source
[ https://issues.apache.org/jira/browse/FLUME-2801?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15060897#comment-15060897 ] Roshan Naik commented on FLUME-2801: [~iijima_satoshi] could you help review this patch ? > Performance improvement on TailDir source > - > > Key: FLUME-2801 > URL: https://issues.apache.org/jira/browse/FLUME-2801 > Project: Flume > Issue Type: Improvement > Components: Sinks+Sources >Affects Versions: v1.7.0 >Reporter: Jun Seok Hong >Assignee: Jun Seok Hong > Fix For: v1.7.0 > > Attachments: FLUME-2801-1.patch, FLUME-2801.patch > > > This a proposal of performance improvement for new tailing source FLUME-2498. > Taildir source reads a file by 1byte, so the performance is very low compared > to tailing on exec source. > I tested lot's of ways to improve performance and implemented the best one. > Changes. > * Reading a file by a 8k block instead of 1 byte. > * Use byte[] for handling data instead of > ByteArrayDataOutput/ByteBuffer(direct)/.. for better performance. > * Don't convert byte[] to string and vice verse. > Simple file reading test results. > {quote} > File size: 100 MB, > Line size: 500 byte > Estimated time to read the file: > |Reading 1byte(Using the code in Taildir)|32544 ms| > |Reading 8K Block|431 ms| > {quote} > Testing on flume, it catches up the performance of tailing on exec source. > (30x performance boost) -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (FLUME-2801) Performance improvement on TailDir source
[ https://issues.apache.org/jira/browse/FLUME-2801?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Roshan Naik updated FLUME-2801: --- Assignee: Jun Seok Hong > Performance improvement on TailDir source > - > > Key: FLUME-2801 > URL: https://issues.apache.org/jira/browse/FLUME-2801 > Project: Flume > Issue Type: Improvement > Components: Sinks+Sources >Affects Versions: v1.7.0 >Reporter: Jun Seok Hong >Assignee: Jun Seok Hong > Fix For: v1.7.0 > > Attachments: FLUME-2801-1.patch, FLUME-2801.patch > > > This a proposal of performance improvement for new tailing source FLUME-2498. > Taildir source reads a file by 1byte, so the performance is very low compared > to tailing on exec source. > I tested lot's of ways to improve performance and implemented the best one. > Changes. > * Reading a file by a 8k block instead of 1 byte. > * Use byte[] for handling data instead of > ByteArrayDataOutput/ByteBuffer(direct)/.. for better performance. > * Don't convert byte[] to string and vice verse. > Simple file reading test results. > {quote} > File size: 100 MB, > Line size: 500 byte > Estimated time to read the file: > |Reading 1byte(Using the code in Taildir)|32544 ms| > |Reading 8K Block|431 ms| > {quote} > Testing on flume, it catches up the performance of tailing on exec source. > (30x performance boost) -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (FLUME-2451) HDFS Sink Cannot Reconnect After NameNode Restart
[ https://issues.apache.org/jira/browse/FLUME-2451?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15049341#comment-15049341 ] Roshan Naik commented on FLUME-2451: Patch v2 from here went into the Flume that is included in HDP. If you are able to build the Apache version of Flume with patch v3, that would be helpful. > HDFS Sink Cannot Reconnect After NameNode Restart > - > > Key: FLUME-2451 > URL: https://issues.apache.org/jira/browse/FLUME-2451 > Project: Flume > Issue Type: Bug > Components: File Channel, Sinks+Sources >Affects Versions: v1.4.0 > Environment: 8 node CDH 4.2.2 (2.0.0-cdh4.2.2) cluster > All cluster machines are running Ubuntu 12.04 x86_64 >Reporter: Andrew O'Neill >Assignee: Roshan Naik > Labels: HDFS, Sink > Attachments: FLUME-2451.patch, FLUME-2451.v2.patch, > FLUME-2451.v3.patch > > > I am testing a simple flume setup with a Sequence Generator Source, a File > Channel, and an HDFS Sink (see my flume.conf below). This configuration works > as expected until I reboot the cluster's NameNode or until I restart the HDFS > service on the cluster. At this point, it appears that the Flume Agent cannot > reconnect to HDFS and must be manually restarted. 
> Here is our flume.conf: > appserver.sources = rawtext > appserver.channels = testchannel > appserver.sinks = test_sink > appserver.sources.rawtext.type = seq > appserver.sources.rawtext.channels = testchannel > appserver.channels.testchannel.type = file > appserver.channels.testchannel.capacity = 1000 > appserver.channels.testchannel.minimumRequiredSpace = 214748364800 > appserver.channels.testchannel.checkpointDir = > /Users/aoneill/Desktop/testchannel/checkpoint > appserver.channels.testchannel.dataDirs = > /Users/aoneill/Desktop/testchannel/data > appserver.channels.testchannel.maxFileSize = 2000 > appserver.sinks.test_sink.type = hdfs > appserver.sinks.test_sink.channel = testchannel > appserver.sinks.test_sink.hdfs.path = > hdfs://cluster01:8020/user/aoneill/flumetest > appserver.sinks.test_sink.hdfs.closeTries = 3 > appserver.sinks.test_sink.hdfs.filePrefix = events- > appserver.sinks.test_sink.hdfs.fileSuffix = .avro > appserver.sinks.test_sink.hdfs.fileType = DataStream > appserver.sinks.test_sink.hdfs.writeFormat = Text > appserver.sinks.test_sink.hdfs.inUsePrefix = inuse- > appserver.sinks.test_sink.hdfs.inUseSuffix = .avro > appserver.sinks.test_sink.hdfs.rollCount = 10 > appserver.sinks.test_sink.hdfs.rollInterval = 30 > appserver.sinks.test_sink.hdfs.rollSize = 10485760 > These are the two error message that the Flume Agent outputs constantly after > the restart: > 2014-08-26 10:47:24,572 (SinkRunner-PollingRunner-DefaultSinkProcessor) > [ERROR - > org.apache.flume.sink.hdfs.AbstractHDFSWriter.isUnderReplicated(AbstractHDFSWriter.java:96)] > Unexpected error while checking replication factor > java.lang.reflect.InvocationTargetException > at sun.reflect.GeneratedMethodAccessor7.invoke(Unknown Source) > at > sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > at java.lang.reflect.Method.invoke(Method.java:606) > at > org.apache.flume.sink.hdfs.AbstractHDFSWriter.getNumCurrentReplicas(AbstractHDFSWriter.java:162) > 
at > org.apache.flume.sink.hdfs.AbstractHDFSWriter.isUnderReplicated(AbstractHDFSWriter.java:82) > at > org.apache.flume.sink.hdfs.BucketWriter.shouldRotate(BucketWriter.java:452) > at > org.apache.flume.sink.hdfs.BucketWriter.append(BucketWriter.java:387) > at > org.apache.flume.sink.hdfs.HDFSEventSink.process(HDFSEventSink.java:392) > at > org.apache.flume.sink.DefaultSinkProcessor.process(DefaultSinkProcessor.java:68) > at org.apache.flume.SinkRunner$PollingRunner.run(SinkRunner.java:147) > at java.lang.Thread.run(Thread.java:744) > Caused by: java.net.ConnectException: Connection refused > at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method) > at > sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:735) > at > org.apache.hadoop.net.SocketIOWithTimeout.connect(SocketIOWithTimeout.java:207) > at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:525) > at > org.apache.hadoop.hdfs.DFSOutputStream.createSocketForPipeline(DFSOutputStream.java:1253) > at > org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.transfer(DFSOutputStream.java:891) > at > org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.addDatanode2ExistingPipeline(DFSOutputStream.java:881) > at > org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.setupPipelineF
[jira] [Commented] (FLUME-2716) File Channel cannot handle capacity Integer.MAX_VALUE
[ https://issues.apache.org/jira/browse/FLUME-2716?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15045807#comment-15045807 ] Roshan Naik commented on FLUME-2716: I assume that after this patch the limit for the 'capacity' setting is Integer.MAX_VALUE. Is there clarity on what the limit for the 'capacity' setting on the file channel was before this patch? > File Channel cannot handle capacity Integer.MAX_VALUE > - > > Key: FLUME-2716 > URL: https://issues.apache.org/jira/browse/FLUME-2716 > Project: Flume > Issue Type: Bug > Components: Channel, File Channel >Affects Versions: v1.6.0, v1.7.0 >Reporter: Dong Zhao > Fix For: v1.7.0 > > Attachments: FLUME-2716.patch > > > if capacity is set to Integer.MAX_VALUE(2147483647), checkpoint file size is > calculated wrongly to 8224. The calculation should first cast int to long, > then calculate the totalBytes. See the patch for details. Thanks. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
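The overflow described in the issue can be reproduced in isolation. The sketch below is hypothetical and is not the actual checkpoint-file code: the 8-bytes-per-slot figure and the 8232-byte header constant are assumptions chosen only because they reproduce the 8224 value reported in the issue.

```java
public class CheckpointSize {
    // Assumed layout: a fixed-size header plus 8 bytes per channel slot.
    // 8232 is a hypothetical header size that reproduces the reported 8224.
    static final long HEADER_BYTES = 8232;

    // Buggy version: capacity * 8 is evaluated in 32-bit int arithmetic,
    // so for Integer.MAX_VALUE it wraps to -8 before being widened to long.
    public static long buggySize(int capacity) {
        return capacity * 8 + HEADER_BYTES;
    }

    // Fixed version, as the patch description suggests: cast to long first
    // so the multiplication happens in 64-bit arithmetic.
    public static long fixedSize(int capacity) {
        return (long) capacity * 8 + HEADER_BYTES;
    }

    public static void main(String[] args) {
        System.out.println(buggySize(Integer.MAX_VALUE)); // 8224
        System.out.println(fixedSize(Integer.MAX_VALUE)); // 17179877408
    }
}
```

The general rule: in Java, `int * int` overflows silently before any widening to `long`, so the cast must precede the multiplication, not the assignment.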
[jira] [Commented] (FLUME-2792) Flume Kafka Kerberos Support
[ https://issues.apache.org/jira/browse/FLUME-2792?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14990763#comment-14990763 ] Roshan Naik commented on FLUME-2792: Updates - - PLAINTESTSAL is being renamed to SASL_PLAINTEXT in Apache Kafka. - Kafka has decided to support security only for the new Producer APIs. Apache Flume uses the old API today. So until Flume code is updated for the new APIs, this wont work. - The Kafka that is shipped as part of HDP had secure support for the old API too. > Flume Kafka Kerberos Support > > > Key: FLUME-2792 > URL: https://issues.apache.org/jira/browse/FLUME-2792 > Project: Flume > Issue Type: Bug > Components: Configuration, Docs, Sinks+Sources >Affects Versions: v1.6.0, v1.5.2 > Environment: HDP 2.3 fully kerberized including Kafka 0.8.2.2 + Flume > 1.5.2 or Apache Flume 1.6 downloaded from apache.org >Reporter: Hari Sekhon >Priority: Blocker > > Following on from FLUME-2790 it appears as though Flume doesn't yet have > support for Kafka + Kerberos as there are is no setting documented in the > Flume 1.6.0 user guide under the Kafka source section to tell Flume to use > plaintextsasl as the connection mechanism to Kafka and Kafka rejects > unauthenticated plaintext mechanism: > {code}15/09/10 16:51:22 INFO consumer.ConsumerFetcherManager: > [ConsumerFetcherManager-1441903874830] Added fetcher for partitions > ArrayBuffer() > 15/09/10 16:51:22 WARN consumer.ConsumerFetcherManager$LeaderFinderThread: > [flume_-1441903874763-abdc98ec-leader-finder-thread], Failed > to find leader for Set([,0], [,1]) > kafka.common.BrokerEndPointNotAvailableException: End point PLAINTEXT not > found for broker 0 > at kafka.cluster.Broker.getBrokerEndPoint(Broker.scala:140) > at > kafka.utils.ZkUtils$$anonfun$getAllBrokerEndPointsForChannel$1.apply(ZkUtils.scala:124) > at > kafka.utils.ZkUtils$$anonfun$getAllBrokerEndPointsForChannel$1.apply(ZkUtils.scala:124) > at > 
scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:244) > at > scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:244) > at > scala.collection.mutable.ResizableArray$class.foreach(ResizableArray.scala:59) > at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:47) > at > scala.collection.TraversableLike$class.map(TraversableLike.scala:244) > at scala.collection.AbstractTraversable.map(Traversable.scala:105) > at > kafka.utils.ZkUtils$.getAllBrokerEndPointsForChannel(ZkUtils.scala:124) > at > kafka.consumer.ConsumerFetcherManager$LeaderFinderThread.doWork(ConsumerFetcherManager.scala:66) > at > kafka.utils.ShutdownableThread.run(ShutdownableThread.scala:60){code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Comment Edited] (FLUME-2792) Flume Kafka Kerberos Support
[ https://issues.apache.org/jira/browse/FLUME-2792?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14990763#comment-14990763 ] Roshan Naik edited comment on FLUME-2792 at 11/5/15 12:09 AM: -- Updates - - PLAINTEXTSASL is being renamed to SASL_PLAINTEXT in Apache Kafka. - Kafka has decided to support security only for the new Producer APIs. Apache Flume uses the old API today. So until Flume code is updated for the new APIs, this wont work. - The Kafka that is shipped as part of HDP had secure support for the old API too. was (Author: roshan_naik): Updates - - PLAINTESTSAL is being renamed to SASL_PLAINTEXT in Apache Kafka. - Kafka has decided to support security only for the new Producer APIs. Apache Flume uses the old API today. So until Flume code is updated for the new APIs, this wont work. - The Kafka that is shipped as part of HDP had secure support for the old API too. > Flume Kafka Kerberos Support > > > Key: FLUME-2792 > URL: https://issues.apache.org/jira/browse/FLUME-2792 > Project: Flume > Issue Type: Bug > Components: Configuration, Docs, Sinks+Sources >Affects Versions: v1.6.0, v1.5.2 > Environment: HDP 2.3 fully kerberized including Kafka 0.8.2.2 + Flume > 1.5.2 or Apache Flume 1.6 downloaded from apache.org >Reporter: Hari Sekhon >Priority: Blocker > > Following on from FLUME-2790 it appears as though Flume doesn't yet have > support for Kafka + Kerberos as there are is no setting documented in the > Flume 1.6.0 user guide under the Kafka source section to tell Flume to use > plaintextsasl as the connection mechanism to Kafka and Kafka rejects > unauthenticated plaintext mechanism: > {code}15/09/10 16:51:22 INFO consumer.ConsumerFetcherManager: > [ConsumerFetcherManager-1441903874830] Added fetcher for partitions > ArrayBuffer() > 15/09/10 16:51:22 WARN consumer.ConsumerFetcherManager$LeaderFinderThread: > [flume_-1441903874763-abdc98ec-leader-finder-thread], Failed > to find leader for Set([,0], [,1]) > 
kafka.common.BrokerEndPointNotAvailableException: End point PLAINTEXT not > found for broker 0 > at kafka.cluster.Broker.getBrokerEndPoint(Broker.scala:140) > at > kafka.utils.ZkUtils$$anonfun$getAllBrokerEndPointsForChannel$1.apply(ZkUtils.scala:124) > at > kafka.utils.ZkUtils$$anonfun$getAllBrokerEndPointsForChannel$1.apply(ZkUtils.scala:124) > at > scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:244) > at > scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:244) > at > scala.collection.mutable.ResizableArray$class.foreach(ResizableArray.scala:59) > at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:47) > at > scala.collection.TraversableLike$class.map(TraversableLike.scala:244) > at scala.collection.AbstractTraversable.map(Traversable.scala:105) > at > kafka.utils.ZkUtils$.getAllBrokerEndPointsForChannel(ZkUtils.scala:124) > at > kafka.consumer.ConsumerFetcherManager$LeaderFinderThread.doWork(ConsumerFetcherManager.scala:66) > at > kafka.utils.ShutdownableThread.run(ShutdownableThread.scala:60){code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (FLUME-2835) Hive Sink tests need to create table with transactional property set
[ https://issues.apache.org/jira/browse/FLUME-2835?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14988108#comment-14988108 ] Roshan Naik commented on FLUME-2835: +1 > Hive Sink tests need to create table with transactional property set > - > > Key: FLUME-2835 > URL: https://issues.apache.org/jira/browse/FLUME-2835 > Project: Flume > Issue Type: Bug >Reporter: Sriharsha Chintalapani >Assignee: Sriharsha Chintalapani > Attachments: FLUME-hive-test.patch > > > As per Hive streaming wiki the transactional=true property needs to be set > on the table for streaming. > https://cwiki.apache.org/confluence/display/Hive/Streaming+Data+Ingest -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Comment Edited] (FLUME-2819) Kafka libs are being bundled into Flume distro
[ https://issues.apache.org/jira/browse/FLUME-2819?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14972217#comment-14972217 ] Roshan Naik edited comment on FLUME-2819 at 10/24/15 12:52 AM: --- Flume generally follows this pattern for its deps (Hadoop, Hive, HBase), and it is no doubt a common issue in other projects too. For example, Hive does not ship with Hadoop deps. We basically end up relying on the backward binary compatibility provided by those deps. Unfortunately, due to the complexity, it gets tested with only one version of each dependency. It helps a bit that each Hadoop vendor ends up testing it with different versions depending on what they ship. Your suggestion, or a variation of it, seems worth looking into. There might be some tricky things that can happen when including multiple versions of multiple jars, especially if some of those are fat/bulky jars. Perhaps put all such deps under a separate lib2 folder. Many components in Flume will need that change if we go with it, and the startup script also needs changes to support each of those. Best tracked in a separate jira. Also, this is not a Flume-specific situation. was (Author: roshan_naik): No doubt a common issue. For e.g Hive does not ship with Hadoop deps. Basically end up relying on the backward binary compat provided by those deps. Unfortunately due to the complexity it gets tested with only 1 version of each dependency. Helps a bit that each Hadoop vendor ends up testing it with a different versions depending on what they ship. Your suggestion or a variation of it seems like worth looking into. There might be some tricky things that can happen when including multiple version of multiple jars. Esp if some of those are fat/bulky jars. Perhaps put all such deps under a separate lib2 folder. Many components in Flume will need that change if we go with that...in addition to startup script needs changes to support for each of those. best tracked in a separate jira. 
Also this is not flume specific situation. > Kafka libs are being bundled into Flume distro > -- > > Key: FLUME-2819 > URL: https://issues.apache.org/jira/browse/FLUME-2819 > Project: Flume > Issue Type: Bug >Reporter: Roshan Naik > > Kafka dependency libs need to be marked as 'provided' in the pom.xml -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Comment Edited] (FLUME-2819) Kafka libs are being bundled into Flume distro
[ https://issues.apache.org/jira/browse/FLUME-2819?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14972217#comment-14972217 ] Roshan Naik edited comment on FLUME-2819 at 10/24/15 12:51 AM: --- No doubt a common issue. For e.g Hive does not ship with Hadoop deps. Basically end up relying on the backward binary compat provided by those deps. Unfortunately due to the complexity it gets tested with only 1 version of each dependency. Helps a bit that each Hadoop vendor ends up testing it with a different versions depending on what they ship. Your suggestion or a variation of it seems like worth looking into. There might be some tricky things that can happen when including multiple version of multiple jars. Esp if some of those are fat/bulky jars. Perhaps put all such deps under a separate lib2 folder. Many components in Flume will need that change if we go with that...in addition to startup script needs changes to support for each of those. best tracked in a separate jira. Also this is not flume specific situation. was (Author: roshan_naik): No doubt a common issue. For e.g Hive does not ship with Hadoop deps. Basically end up relying on the backward binary compat provided by those deps. Unfortunately due to the complexity it gets tested with only 1 version of each dependency. Helps a bit that each Hadoop vendor ends up testing it with a different versions depending on what they ship. Your suggestion or a variation of it seems like worth looking into. There might be some tricky things that can happen when including multiple version of multiple jars. Esp if some of those are fat/bulky jars. Perhaps put all such deps under a separate lib2 folder. Many components in Flume will need that change if we go with that...in addition to startup script needs changes to support for each of those. best tracked in a separate jira. But like you mentioned, this is not flume specific situation. 
> Kafka libs are being bundled into Flume distro > -- > > Key: FLUME-2819 > URL: https://issues.apache.org/jira/browse/FLUME-2819 > Project: Flume > Issue Type: Bug >Reporter: Roshan Naik > > Kafka dependency libs need to be marked as 'provided' in the pom.xml -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (FLUME-2819) Kafka libs are being bundled into Flume distro
[ https://issues.apache.org/jira/browse/FLUME-2819?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14972217#comment-14972217 ] Roshan Naik commented on FLUME-2819: No doubt a common issue. For example, Hive does not ship with Hadoop deps; it basically ends up relying on the backward binary compatibility provided by those deps. Unfortunately, due to the complexity, it gets tested with only one version of each dependency. It helps a bit that each Hadoop vendor ends up testing it with different versions, depending on what they ship. Your suggestion, or a variation of it, seems worth looking into. There might be some tricky things that can happen when including multiple versions of multiple jars, especially if some of those are fat/bulky jars. Perhaps put all such deps under a separate lib2 folder. Many components in Flume will need that change if we go with it, and the startup script needs changes to support each of those; best tracked in a separate jira. But like you mentioned, this is not a Flume-specific situation. > Kafka libs are being bundled into Flume distro > -- > > Key: FLUME-2819 > URL: https://issues.apache.org/jira/browse/FLUME-2819 > Project: Flume > Issue Type: Bug >Reporter: Roshan Naik > > Kafka dependency libs need to be marked as 'provided' in the pom.xml -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Comment Edited] (FLUME-2819) Kafka libs are being bundled into Flume distro
[ https://issues.apache.org/jira/browse/FLUME-2819?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14972022#comment-14972022 ] Roshan Naik edited comment on FLUME-2819 at 10/23/15 10:49 PM: --- - Flume depends on several external services (Hadoop and non-Hadoop), and that list keeps growing with each release, so it is not feasible to keep updating and releasing Flume frequently (i.e., each time one of them releases a new version) just to let Flume users leverage the new features released in them. - Releasing frequently with updated versions creates a different problem: it ties each Flume version to a specific dep version, and in the end the user may not want to use that version of the service. - The release-frequently model also forces Flume to release each time a security fix is released in them. - If users have to wait for new Flume releases, that leads to long waits for users... or the need to perform surgery on the flume/lib directory. Neither is reasonable. Consequently, IMO, using the "provided" model for such deps and giving users a mechanism to customize the Flume classpath easily is, in the end, the better choice. The slight OOB experience penalty is a smaller problem when considering all the trade-offs. was (Author: roshan_naik): - Flume depends on several external services (Hadoop and non-Hadoop), and that list keeps growing with each release, so it is not feasible to keep updating and releasing Flume frequently (i.e., each time one of them releases a new version) just to let Flume users leverage the new features released in them. - Releasing frequently with updated versions creates a different problem: it ties each Flume version to a specific dep version, and in the end the user may not want to use that version of the service. - The release-frequently model also forces Flume to release each time a security fix is released in them. - If users have to wait for new Flume releases, that leads to long waits for users... 
or the need to perform surgery on the flume/lib directory. Neither is reasonable. Consequently, IMO, using the "provided" model for such deps and giving users a mechanism to customize the Flume classpath easily is, in the end, the better choice. > Kafka libs are being bundled into Flume distro > -- > > Key: FLUME-2819 > URL: https://issues.apache.org/jira/browse/FLUME-2819 > Project: Flume > Issue Type: Bug >Reporter: Roshan Naik > > Kafka dependency libs need to be marked as 'provided' in the pom.xml -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (FLUME-2819) Kafka libs are being bundled into Flume distro
[ https://issues.apache.org/jira/browse/FLUME-2819?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14972022#comment-14972022 ] Roshan Naik commented on FLUME-2819: - Flume depends on several external services (Hadoop and non-Hadoop), and that list keeps growing with each release, so it is not feasible to keep updating and releasing Flume frequently (i.e., each time one of them releases a new version) just to let Flume users leverage the new features released in them. - Releasing frequently with updated versions creates a different problem: it ties each Flume version to a specific dep version, and in the end the user may not want to use that version of the service. - The release-frequently model also forces Flume to release each time a security fix is released in them. - If users have to wait for new Flume releases, that leads to long waits for users... or the need to perform surgery on the flume/lib directory. Neither is reasonable. Consequently, IMO, using the "provided" model for such deps and giving users a mechanism to customize the Flume classpath easily is, in the end, the better choice. > Kafka libs are being bundled into Flume distro > -- > > Key: FLUME-2819 > URL: https://issues.apache.org/jira/browse/FLUME-2819 > Project: Flume > Issue Type: Bug >Reporter: Roshan Naik > > Kafka dependency libs need to be marked as 'provided' in the pom.xml -- This message was sent by Atlassian JIRA (v6.3.4#6332)
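[Editor's note] The "provided" model discussed in the comments above corresponds to a Maven dependency declaration along these lines. This is an illustrative sketch only; the coordinates and version shown are not taken from Flume's actual pom.xml:

```xml
<!-- Hypothetical sketch: a 'provided' scope keeps the Kafka client out of
     the assembled Flume distro. The jar must then be supplied on the
     classpath at runtime by the user or the vendor distribution.
     groupId/artifactId/version here are illustrative. -->
<dependency>
  <groupId>org.apache.kafka</groupId>
  <artifactId>kafka_2.10</artifactId>
  <version>0.8.2.2</version>
  <scope>provided</scope>
</dependency>
```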
[jira] [Comment Edited] (FLUME-2792) Flume Kafka Kerberos Support
[ https://issues.apache.org/jira/browse/FLUME-2792?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14967495#comment-14967495 ] Roshan Naik edited comment on FLUME-2792 at 10/21/15 5:23 PM: -- opened FLUME-2819 to track the kafka libs issue noted above was (Author: roshan_naik): opened FLUME-2819 > Flume Kafka Kerberos Support > > > Key: FLUME-2792 > URL: https://issues.apache.org/jira/browse/FLUME-2792 > Project: Flume > Issue Type: Bug > Components: Configuration, Docs, Sinks+Sources >Affects Versions: v1.6.0, v1.5.2 > Environment: HDP 2.3 fully kerberized including Kafka 0.8.2.2 + Flume > 1.5.2 or Apache Flume 1.6 downloaded from apache.org >Reporter: Hari Sekhon >Priority: Blocker > > Following on from FLUME-2790 it appears as though Flume doesn't yet have > support for Kafka + Kerberos as there are is no setting documented in the > Flume 1.6.0 user guide under the Kafka source section to tell Flume to use > plaintextsasl as the connection mechanism to Kafka and Kafka rejects > unauthenticated plaintext mechanism: > {code}15/09/10 16:51:22 INFO consumer.ConsumerFetcherManager: > [ConsumerFetcherManager-1441903874830] Added fetcher for partitions > ArrayBuffer() > 15/09/10 16:51:22 WARN consumer.ConsumerFetcherManager$LeaderFinderThread: > [flume_-1441903874763-abdc98ec-leader-finder-thread], Failed > to find leader for Set([,0], [,1]) > kafka.common.BrokerEndPointNotAvailableException: End point PLAINTEXT not > found for broker 0 > at kafka.cluster.Broker.getBrokerEndPoint(Broker.scala:140) > at > kafka.utils.ZkUtils$$anonfun$getAllBrokerEndPointsForChannel$1.apply(ZkUtils.scala:124) > at > kafka.utils.ZkUtils$$anonfun$getAllBrokerEndPointsForChannel$1.apply(ZkUtils.scala:124) > at > scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:244) > at > scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:244) > at > 
scala.collection.mutable.ResizableArray$class.foreach(ResizableArray.scala:59) > at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:47) > at > scala.collection.TraversableLike$class.map(TraversableLike.scala:244) > at scala.collection.AbstractTraversable.map(Traversable.scala:105) > at > kafka.utils.ZkUtils$.getAllBrokerEndPointsForChannel(ZkUtils.scala:124) > at > kafka.consumer.ConsumerFetcherManager$LeaderFinderThread.doWork(ConsumerFetcherManager.scala:66) > at > kafka.utils.ShutdownableThread.run(ShutdownableThread.scala:60){code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (FLUME-2792) Flume Kafka Kerberos Support
[ https://issues.apache.org/jira/browse/FLUME-2792?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14967495#comment-14967495 ] Roshan Naik commented on FLUME-2792: opened FLUME-2819 > Flume Kafka Kerberos Support > > > Key: FLUME-2792 > URL: https://issues.apache.org/jira/browse/FLUME-2792 > Project: Flume > Issue Type: Bug > Components: Configuration, Docs, Sinks+Sources >Affects Versions: v1.6.0, v1.5.2 > Environment: HDP 2.3 fully kerberized including Kafka 0.8.2.2 + Flume > 1.5.2 or Apache Flume 1.6 downloaded from apache.org >Reporter: Hari Sekhon >Priority: Blocker > > Following on from FLUME-2790 it appears as though Flume doesn't yet have > support for Kafka + Kerberos as there are is no setting documented in the > Flume 1.6.0 user guide under the Kafka source section to tell Flume to use > plaintextsasl as the connection mechanism to Kafka and Kafka rejects > unauthenticated plaintext mechanism: > {code}15/09/10 16:51:22 INFO consumer.ConsumerFetcherManager: > [ConsumerFetcherManager-1441903874830] Added fetcher for partitions > ArrayBuffer() > 15/09/10 16:51:22 WARN consumer.ConsumerFetcherManager$LeaderFinderThread: > [flume_-1441903874763-abdc98ec-leader-finder-thread], Failed > to find leader for Set([,0], [,1]) > kafka.common.BrokerEndPointNotAvailableException: End point PLAINTEXT not > found for broker 0 > at kafka.cluster.Broker.getBrokerEndPoint(Broker.scala:140) > at > kafka.utils.ZkUtils$$anonfun$getAllBrokerEndPointsForChannel$1.apply(ZkUtils.scala:124) > at > kafka.utils.ZkUtils$$anonfun$getAllBrokerEndPointsForChannel$1.apply(ZkUtils.scala:124) > at > scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:244) > at > scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:244) > at > scala.collection.mutable.ResizableArray$class.foreach(ResizableArray.scala:59) > at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:47) > at > 
scala.collection.TraversableLike$class.map(TraversableLike.scala:244) > at scala.collection.AbstractTraversable.map(Traversable.scala:105) > at > kafka.utils.ZkUtils$.getAllBrokerEndPointsForChannel(ZkUtils.scala:124) > at > kafka.consumer.ConsumerFetcherManager$LeaderFinderThread.doWork(ConsumerFetcherManager.scala:66) > at > kafka.utils.ShutdownableThread.run(ShutdownableThread.scala:60){code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (FLUME-2819) Kafka libs are being bundled into Flume distro
Roshan Naik created FLUME-2819: -- Summary: Kafka libs are being bundled into Flume distro Key: FLUME-2819 URL: https://issues.apache.org/jira/browse/FLUME-2819 Project: Flume Issue Type: Bug Reporter: Roshan Naik Kafka dependency libs need to be marked as 'provided' in the pom.xml -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Resolved] (FLUME-2798) Malformed Syslog messages can lead to OutOfMemoryException
[ https://issues.apache.org/jira/browse/FLUME-2798?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Roshan Naik resolved FLUME-2798. Resolution: Fixed Fix Version/s: v1.7.0 > Malformed Syslog messages can lead to OutOfMemoryException > -- > > Key: FLUME-2798 > URL: https://issues.apache.org/jira/browse/FLUME-2798 > Project: Flume > Issue Type: Bug > Components: Sinks+Sources >Affects Versions: v1.4.0, v1.5.0, v1.6.0 >Reporter: Phil D'Amore >Assignee: Phil D'Amore >Priority: Critical > Fix For: v1.7.0 > > Attachments: FLUME-2798.patch > > > It's possible for a client submitting syslog data which is malformed in > various ways to convince SyslogUtils.extractEvent to continually fill the > ByteArrayOutputStream it uses to collect the event until the agent runs out > of memory. Since the OOM condition affects the whole agent, it's possible > that a client sending such data (due to accident or malicious intent) to > disable the agent, as long as it remains connected. > Note that this is probably only possible using SyslogTcpSource although the > fix touches common code in SyslogUtils.java. > The issue can happen in two ways: > Scenario 1: Send a message like this: > {{<> some more stuff here}} > This causes a NumberFormatException: > {code} > Sep 11, 2015 2:27:07 AM org.jboss.netty.channel.SimpleChannelHandler > WARNING: EXCEPTION, please implement > org.apache.flume.source.SyslogTcpSource$syslogTcpHandler.exceptionCaught() > for proper handling. 
> java.lang.NumberFormatException: For input string: "" > at > java.lang.NumberFormatException.forInputString(NumberFormatException.java:65) > at java.lang.Integer.parseInt(Integer.java:504) > at java.lang.Integer.parseInt(Integer.java:527) > at org.apache.flume.source.SyslogUtils.buildEvent(SyslogUtils.java:198) > at > org.apache.flume.source.SyslogUtils.extractEvent(SyslogUtils.java:344) > at > org.apache.flume.source.SyslogTcpSource$syslogTcpHandler.messageReceived(SyslogTcpSource.java:76) > at > org.jboss.netty.channel.Channels.fireMessageReceived(Channels.java:268) > at > org.jboss.netty.channel.Channels.fireMessageReceived(Channels.java:255) > at org.jboss.netty.channel.socket.nio.NioWorker.read(NioWorker.java:94) > at > org.jboss.netty.channel.socket.nio.AbstractNioWorker.processSelectedKeys(AbstractNioWorker.java:364) > at > org.jboss.netty.channel.socket.nio.AbstractNioWorker.run(AbstractNioWorker.java:238) > at org.jboss.netty.channel.socket.nio.NioWorker.run(NioWorker.java:38) > at > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) > at > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) > at java.lang.Thread.run(Thread.java:745) > {code} > This exception does not get handled, and it happens before reset() can be > called. The result is that the state machine in SyslogUtils gets stuck in > the DATA state, and all subsequent data just gets appended to the baos, while > the above exception streams to the log. Eventually the agent runs out of > memory. > Scenario 2: Send some data like this: > {{<123...}} > No length checking is done in the PRIO state so you could potentially fill > the agent memory this way too. > I'm attaching a patch which handles both of these issues and adds more > exception handling to buildEvent to make sure that reset() is called in > future unforeseen situations. > Thanks also to [~roshan_naik] for helping to make this patch better. 
-- This message was sent by Atlassian JIRA (v6.3.4#6332)
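[Editor's note] The FLUME-2798 description above can be distilled into a small sketch: an empty or non-numeric syslog priority field (as in the malformed message {{<>}}) makes Integer.parseInt throw NumberFormatException, and in the unpatched code that exception escaped before reset() ran, wedging the parser in the DATA state. The class and method names below are illustrative, not the actual patch code:

```java
// Sketch of the failure mode and the defensive parse the patch adds.
// SyslogPrioritySketch and parsePriority are hypothetical names.
public class SyslogPrioritySketch {

    // Parse the syslog priority field; return a fallback when malformed
    // instead of letting NumberFormatException propagate.
    static int parsePriority(String prioField, int fallback) {
        try {
            return Integer.parseInt(prioField);
        } catch (NumberFormatException e) {
            // Unpatched code let this exception escape before reset(),
            // so subsequent bytes accumulated in the ByteArrayOutputStream
            // until the agent ran out of memory. Catching it lets the
            // caller reset its state machine and drop the bad event.
            return fallback;
        }
    }

    public static void main(String[] args) {
        System.out.println(parsePriority("13", -1)); // well-formed priority
        System.out.println(parsePriority("", -1));   // malformed "<>" case
    }
}
```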
[jira] [Commented] (FLUME-2798) Malformed Syslog messages can lead to OutOfMemoryException
[ https://issues.apache.org/jira/browse/FLUME-2798?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14941944#comment-14941944 ] Roshan Naik commented on FLUME-2798: Its committed. Thanks very much for the patch [~tweek] > Malformed Syslog messages can lead to OutOfMemoryException > -- > > Key: FLUME-2798 > URL: https://issues.apache.org/jira/browse/FLUME-2798 > Project: Flume > Issue Type: Bug > Components: Sinks+Sources >Affects Versions: v1.4.0, v1.5.0, v1.6.0 >Reporter: Phil D'Amore >Assignee: Phil D'Amore >Priority: Critical > Attachments: FLUME-2798.patch > > > It's possible for a client submitting syslog data which is malformed in > various ways to convince SyslogUtils.extractEvent to continually fill the > ByteArrayOutputStream it uses to collect the event until the agent runs out > of memory. Since the OOM condition affects the whole agent, it's possible > that a client sending such data (due to accident or malicious intent) to > disable the agent, as long as it remains connected. > Note that this is probably only possible using SyslogTcpSource although the > fix touches common code in SyslogUtils.java. > The issue can happen in two ways: > Scenario 1: Send a message like this: > {{<> some more stuff here}} > This causes a NumberFormatException: > {code} > Sep 11, 2015 2:27:07 AM org.jboss.netty.channel.SimpleChannelHandler > WARNING: EXCEPTION, please implement > org.apache.flume.source.SyslogTcpSource$syslogTcpHandler.exceptionCaught() > for proper handling. 
> java.lang.NumberFormatException: For input string: "" > at > java.lang.NumberFormatException.forInputString(NumberFormatException.java:65) > at java.lang.Integer.parseInt(Integer.java:504) > at java.lang.Integer.parseInt(Integer.java:527) > at org.apache.flume.source.SyslogUtils.buildEvent(SyslogUtils.java:198) > at > org.apache.flume.source.SyslogUtils.extractEvent(SyslogUtils.java:344) > at > org.apache.flume.source.SyslogTcpSource$syslogTcpHandler.messageReceived(SyslogTcpSource.java:76) > at > org.jboss.netty.channel.Channels.fireMessageReceived(Channels.java:268) > at > org.jboss.netty.channel.Channels.fireMessageReceived(Channels.java:255) > at org.jboss.netty.channel.socket.nio.NioWorker.read(NioWorker.java:94) > at > org.jboss.netty.channel.socket.nio.AbstractNioWorker.processSelectedKeys(AbstractNioWorker.java:364) > at > org.jboss.netty.channel.socket.nio.AbstractNioWorker.run(AbstractNioWorker.java:238) > at org.jboss.netty.channel.socket.nio.NioWorker.run(NioWorker.java:38) > at > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) > at > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) > at java.lang.Thread.run(Thread.java:745) > {code} > This exception does not get handled, and it happens before reset() can be > called. The result is that the state machine in SyslogUtils gets stuck in > the DATA state, and all subsequent data just gets appended to the baos, while > the above exception streams to the log. Eventually the agent runs out of > memory. > Scenario 2: Send some data like this: > {{<123...}} > No length checking is done in the PRIO state so you could potentially fill > the agent memory this way too. > I'm attaching a patch which handles both of these issues and adds more > exception handling to buildEvent to make sure that reset() is called in > future unforeseen situations. > Thanks also to [~roshan_naik] for helping to make this patch better. 
-- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (FLUME-2798) Malformed Syslog messages can lead to OutOfMemoryException
[ https://issues.apache.org/jira/browse/FLUME-2798?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Roshan Naik updated FLUME-2798: --- Assignee: Phil D'Amore > Malformed Syslog messages can lead to OutOfMemoryException > -- > > Key: FLUME-2798 > URL: https://issues.apache.org/jira/browse/FLUME-2798 > Project: Flume > Issue Type: Bug > Components: Sinks+Sources >Affects Versions: v1.4.0, v1.5.0, v1.6.0 >Reporter: Phil D'Amore >Assignee: Phil D'Amore >Priority: Critical > Attachments: FLUME-2798.patch > > > It's possible for a client submitting syslog data which is malformed in > various ways to convince SyslogUtils.extractEvent to continually fill the > ByteArrayOutputStream it uses to collect the event until the agent runs out > of memory. Since the OOM condition affects the whole agent, it's possible > that a client sending such data (due to accident or malicious intent) to > disable the agent, as long as it remains connected. > Note that this is probably only possible using SyslogTcpSource although the > fix touches common code in SyslogUtils.java. > The issue can happen in two ways: > Scenario 1: Send a message like this: > {{<> some more stuff here}} > This causes a NumberFormatException: > {code} > Sep 11, 2015 2:27:07 AM org.jboss.netty.channel.SimpleChannelHandler > WARNING: EXCEPTION, please implement > org.apache.flume.source.SyslogTcpSource$syslogTcpHandler.exceptionCaught() > for proper handling. 
> java.lang.NumberFormatException: For input string: "" > at > java.lang.NumberFormatException.forInputString(NumberFormatException.java:65) > at java.lang.Integer.parseInt(Integer.java:504) > at java.lang.Integer.parseInt(Integer.java:527) > at org.apache.flume.source.SyslogUtils.buildEvent(SyslogUtils.java:198) > at > org.apache.flume.source.SyslogUtils.extractEvent(SyslogUtils.java:344) > at > org.apache.flume.source.SyslogTcpSource$syslogTcpHandler.messageReceived(SyslogTcpSource.java:76) > at > org.jboss.netty.channel.Channels.fireMessageReceived(Channels.java:268) > at > org.jboss.netty.channel.Channels.fireMessageReceived(Channels.java:255) > at org.jboss.netty.channel.socket.nio.NioWorker.read(NioWorker.java:94) > at > org.jboss.netty.channel.socket.nio.AbstractNioWorker.processSelectedKeys(AbstractNioWorker.java:364) > at > org.jboss.netty.channel.socket.nio.AbstractNioWorker.run(AbstractNioWorker.java:238) > at org.jboss.netty.channel.socket.nio.NioWorker.run(NioWorker.java:38) > at > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) > at > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) > at java.lang.Thread.run(Thread.java:745) > {code} > This exception does not get handled, and it happens before reset() can be > called. The result is that the state machine in SyslogUtils gets stuck in > the DATA state, and all subsequent data just gets appended to the baos, while > the above exception streams to the log. Eventually the agent runs out of > memory. > Scenario 2: Send some data like this: > {{<123...}} > No length checking is done in the PRIO state so you could potentially fill > the agent memory this way too. > I'm attaching a patch which handles both of these issues and adds more > exception handling to buildEvent to make sure that reset() is called in > future unforeseen situations. > Thanks also to [~roshan_naik] for helping to make this patch better. 
-- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (FLUME-2798) Malformed Syslog messages can lead to OutOfMemoryException
[ https://issues.apache.org/jira/browse/FLUME-2798?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14941911#comment-14941911 ] Roshan Naik commented on FLUME-2798: This patch looks good to me. +1 > Malformed Syslog messages can lead to OutOfMemoryException > -- > > Key: FLUME-2798 > URL: https://issues.apache.org/jira/browse/FLUME-2798 > Project: Flume > Issue Type: Bug > Components: Sinks+Sources >Affects Versions: v1.4.0, v1.5.0, v1.6.0 >Reporter: Phil D'Amore >Priority: Critical > Attachments: FLUME-2798.patch > > > It's possible for a client submitting syslog data which is malformed in > various ways to convince SyslogUtils.extractEvent to continually fill the > ByteArrayOutputStream it uses to collect the event until the agent runs out > of memory. Since the OOM condition affects the whole agent, it's possible > that a client sending such data (due to accident or malicious intent) to > disable the agent, as long as it remains connected. > Note that this is probably only possible using SyslogTcpSource although the > fix touches common code in SyslogUtils.java. > The issue can happen in two ways: > Scenario 1: Send a message like this: > {{<> some more stuff here}} > This causes a NumberFormatException: > {code} > Sep 11, 2015 2:27:07 AM org.jboss.netty.channel.SimpleChannelHandler > WARNING: EXCEPTION, please implement > org.apache.flume.source.SyslogTcpSource$syslogTcpHandler.exceptionCaught() > for proper handling. 
> java.lang.NumberFormatException: For input string: "" > at > java.lang.NumberFormatException.forInputString(NumberFormatException.java:65) > at java.lang.Integer.parseInt(Integer.java:504) > at java.lang.Integer.parseInt(Integer.java:527) > at org.apache.flume.source.SyslogUtils.buildEvent(SyslogUtils.java:198) > at > org.apache.flume.source.SyslogUtils.extractEvent(SyslogUtils.java:344) > at > org.apache.flume.source.SyslogTcpSource$syslogTcpHandler.messageReceived(SyslogTcpSource.java:76) > at > org.jboss.netty.channel.Channels.fireMessageReceived(Channels.java:268) > at > org.jboss.netty.channel.Channels.fireMessageReceived(Channels.java:255) > at org.jboss.netty.channel.socket.nio.NioWorker.read(NioWorker.java:94) > at > org.jboss.netty.channel.socket.nio.AbstractNioWorker.processSelectedKeys(AbstractNioWorker.java:364) > at > org.jboss.netty.channel.socket.nio.AbstractNioWorker.run(AbstractNioWorker.java:238) > at org.jboss.netty.channel.socket.nio.NioWorker.run(NioWorker.java:38) > at > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) > at > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) > at java.lang.Thread.run(Thread.java:745) > {code} > This exception does not get handled, and it happens before reset() can be > called. The result is that the state machine in SyslogUtils gets stuck in > the DATA state, and all subsequent data just gets appended to the baos, while > the above exception streams to the log. Eventually the agent runs out of > memory. > Scenario 2: Send some data like this: > {{<123...}} > No length checking is done in the PRIO state so you could potentially fill > the agent memory this way too. > I'm attaching a patch which handles both of these issues and adds more > exception handling to buildEvent to make sure that reset() is called in > future unforeseen situations. > Thanks also to [~roshan_naik] for helping to make this patch better. 
-- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (FLUME-2804) Hive sink - abort remaining transactions on shutdown
[ https://issues.apache.org/jira/browse/FLUME-2804?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Roshan Naik updated FLUME-2804: --- Assignee: Sriharsha Chintalapani > Hive sink - abort remaining transactions on shutdown > > > Key: FLUME-2804 > URL: https://issues.apache.org/jira/browse/FLUME-2804 > Project: Flume > Issue Type: Bug > Components: Sinks+Sources >Affects Versions: v1.6.0 >Reporter: Sriharsha Chintalapani >Assignee: Sriharsha Chintalapani > Labels: Hive > Fix For: v1.7.0 > > Attachments: FLUME-2804.patch > > > Currently the hive sink does not explicitly abort unused transactions. > Although these eventually timeout on the hive side, it is preferable to > explicitly abort them so that the associated locks on the hive > table/partition are released. As long as the locks stay open, the > table/partition cannot be dropped. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (FLUME-2792) Flume Kafka Kerberos Support
[ https://issues.apache.org/jira/browse/FLUME-2792?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14940582#comment-14940582 ] Roshan Naik commented on FLUME-2792: Here are some notes that I gathered from talking to Kafka experts. Give it a shot... It seems like it might be possible for the Kafka Sink (i.e., the Kafka producer side). This won't work for the Kafka Source (i.e., the Kafka consumer side). Below are the steps we identified: 1) *In Flume's Kafka Sink config, set:* agentName.sinks.KafkaSinkName.kafka.security.protocol=PLAINTEXTSASL The sink will forward this setting to the underlying Kafka Producer APIs. This informs the Producer APIs to use Kerberos. 2) *Pass the following JVM args to Flume:* -Djava.security.auth.login.config=/path/jaas.conf This indicates the name of the file which has the additional security settings used by the Producer APIs. JVM args for Flume can be set in flume-env.sh, which resides in the directory specified by the -c argument to the Flume startup command. If Ambari-managed, Ambari also allows you to directly edit flume-env.sh, as far as I recall. 3) *The jaas.conf file's contents should look like this:* KafkaClient { com.sun.security.auth.module.Krb5LoginModule required useKeyTab=true keyTab="/etc/security/keytabs/flume_agent.keytab" storeKey=true useTicketCache=false principal="flume_agent/host_n...@example.com" serviceName="kafka"; }; You need to customize the keytab, principal and service name. 4) *Ensure the right Kafka libraries are used by Flume:* Kerberos support is being added in the upcoming Kafka v0.9. Just ensure flume/lib does not have conflicting Kafka jar versions. 
> Flume Kafka Kerberos Support > > > Key: FLUME-2792 > URL: https://issues.apache.org/jira/browse/FLUME-2792 > Project: Flume > Issue Type: Bug > Components: Configuration, Docs, Sinks+Sources >Affects Versions: v1.6.0, v1.5.2 > Environment: HDP 2.3 fully kerberized including Kafka 0.8.2.2 + Flume > 1.5.2 or Apache Flume 1.6 downloaded from apache.org >Reporter: Hari Sekhon >Priority: Blocker > > Following on from FLUME-2790 it appears as though Flume doesn't yet have > support for Kafka + Kerberos as there are is no setting documented in the > Flume 1.6.0 user guide under the Kafka source section to tell Flume to use > plaintextsasl as the connection mechanism to Kafka and Kafka rejects > unauthenticated plaintext mechanism: > {code}15/09/10 16:51:22 INFO consumer.ConsumerFetcherManager: > [ConsumerFetcherManager-1441903874830] Added fetcher for partitions > ArrayBuffer() > 15/09/10 16:51:22 WARN consumer.ConsumerFetcherManager$LeaderFinderThread: > [flume_-1441903874763-abdc98ec-leader-finder-thread], Failed > to find leader for Set([,0], [,1]) > kafka.common.BrokerEndPointNotAvailableException: End point PLAINTEXT not > found for broker 0 > at kafka.cluster.Broker.getBrokerEndPoint(Broker.scala:140) > at > kafka.utils.ZkUtils$$anonfun$getAllBrokerEndPointsForChannel$1.apply(ZkUtils.scala:124) > at > kafka.utils.ZkUtils$$anonfun$getAllBrokerEndPointsForChannel$1.apply(ZkUtils.scala:124) > at > scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:244) > at > scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:244) > at > scala.collection.mutable.ResizableArray$class.foreach(ResizableArray.scala:59) > at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:47) > at > scala.collection.TraversableLike$class.map(TraversableLike.scala:244) > at scala.collection.AbstractTraversable.map(Traversable.scala:105) > at > kafka.utils.ZkUtils$.getAllBrokerEndPointsForChannel(ZkUtils.scala:124) > at > 
kafka.consumer.ConsumerFetcherManager$LeaderFinderThread.doWork(ConsumerFetcherManager.scala:66) > at > kafka.utils.ShutdownableThread.run(ShutdownableThread.scala:60){code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
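[Editor's note] Assembled in one place, the three configuration pieces from the comment above look like the following sketch. The agent and sink names, the jaas.conf path, and the keytab/principal values are placeholders carried over from the comment, not values to copy verbatim:

```properties
# 1) Flume Kafka Sink config ("agentName" / "KafkaSinkName" are placeholders)
agentName.sinks.KafkaSinkName.kafka.security.protocol = PLAINTEXTSASL

# 2) In flume-env.sh (found in the directory passed via Flume's -c argument):
#    JAVA_OPTS="$JAVA_OPTS -Djava.security.auth.login.config=/path/jaas.conf"

# 3) /path/jaas.conf contents (customize keytab, principal, serviceName):
#    KafkaClient {
#      com.sun.security.auth.module.Krb5LoginModule required
#      useKeyTab=true
#      keyTab="/etc/security/keytabs/flume_agent.keytab"
#      storeKey=true
#      useTicketCache=false
#      principal="flume_agent/host_n...@example.com"
#      serviceName="kafka";
#    };
```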
[jira] [Updated] (FLUME-2804) Hive sink - abort remaining transactions on shutdown
[ https://issues.apache.org/jira/browse/FLUME-2804?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Roshan Naik updated FLUME-2804: --- Component/s: Sinks+Sources > Hive sink - abort remaining transactions on shutdown > > > Key: FLUME-2804 > URL: https://issues.apache.org/jira/browse/FLUME-2804 > Project: Flume > Issue Type: Bug > Components: Sinks+Sources >Affects Versions: v1.6.0 >Reporter: Sriharsha Chintalapani > Labels: Hive > Fix For: v1.7.0 > > Attachments: FLUME-2804.patch > > > Currently the hive sink does not explicitly abort unused transactions. > Although these eventually timeout on the hive side, it is preferable to > explicitly abort them so that the associated locks on the hive > table/partition are released. As long as the locks stay open, the > table/partition cannot be dropped. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (FLUME-2804) Hive sink - abort remaining transactions on shutdown
[ https://issues.apache.org/jira/browse/FLUME-2804?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Roshan Naik updated FLUME-2804: --- Labels: Hive (was: ) > Hive sink - abort remaining transactions on shutdown > > > Key: FLUME-2804 > URL: https://issues.apache.org/jira/browse/FLUME-2804 > Project: Flume > Issue Type: Bug > Components: Sinks+Sources >Affects Versions: v1.6.0 >Reporter: Sriharsha Chintalapani > Labels: Hive > Fix For: v1.7.0 > > Attachments: FLUME-2804.patch > > > Currently the hive sink does not explicitly abort unused transactions. > Although these eventually timeout on the hive side, it is preferable to > explicitly abort them so that the associated locks on the hive > table/partition are released. As long as the locks stay open, the > table/partition cannot be dropped. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (FLUME-2804) Hive sink - abort remaining transactions on shutdown
[ https://issues.apache.org/jira/browse/FLUME-2804?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Roshan Naik updated FLUME-2804: --- Fix Version/s: v1.7.0 -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (FLUME-2804) Hive sink - abort remaining transactions on shutdown
[ https://issues.apache.org/jira/browse/FLUME-2804?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Roshan Naik updated FLUME-2804: --- Affects Version/s: v1.6.0 -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Resolved] (FLUME-2804) Hive sink - abort remaining transactions on shutdown
[ https://issues.apache.org/jira/browse/FLUME-2804?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Roshan Naik resolved FLUME-2804. Resolution: Fixed Committed. Thanks [~sriharsha] -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (FLUME-2804) Hive sink - abort remaining transactions on shutdown
[ https://issues.apache.org/jira/browse/FLUME-2804?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14935961#comment-14935961 ] Roshan Naik commented on FLUME-2804: +1 -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (FLUME-2798) Malformed Syslog messages can lead to OutOfMemoryException
[ https://issues.apache.org/jira/browse/FLUME-2798?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Roshan Naik updated FLUME-2798: --- Assignee: (was: Roshan Naik) > Malformed Syslog messages can lead to OutOfMemoryException > -- > > Key: FLUME-2798 > URL: https://issues.apache.org/jira/browse/FLUME-2798 > Project: Flume > Issue Type: Bug > Components: Sinks+Sources >Affects Versions: v1.4.0, v1.5.0, v1.6.0 >Reporter: Phil D'Amore >Priority: Critical > Attachments: FLUME-2798.patch > > > It's possible for a client submitting syslog data which is malformed in > various ways to convince SyslogUtils.extractEvent to continually fill the > ByteArrayOutputStream it uses to collect the event until the agent runs out > of memory. Since the OOM condition affects the whole agent, it's possible > for a client sending such data (by accident or with malicious intent) to > disable the agent for as long as it remains connected. > Note that this is probably only possible using SyslogTcpSource although the > fix touches common code in SyslogUtils.java. > The issue can happen in two ways: > Scenario 1: Send a message like this: > {{<> some more stuff here}} > This causes a NumberFormatException: > {code} > Sep 11, 2015 2:27:07 AM org.jboss.netty.channel.SimpleChannelHandler > WARNING: EXCEPTION, please implement > org.apache.flume.source.SyslogTcpSource$syslogTcpHandler.exceptionCaught() > for proper handling. 
> java.lang.NumberFormatException: For input string: "" > at > java.lang.NumberFormatException.forInputString(NumberFormatException.java:65) > at java.lang.Integer.parseInt(Integer.java:504) > at java.lang.Integer.parseInt(Integer.java:527) > at org.apache.flume.source.SyslogUtils.buildEvent(SyslogUtils.java:198) > at > org.apache.flume.source.SyslogUtils.extractEvent(SyslogUtils.java:344) > at > org.apache.flume.source.SyslogTcpSource$syslogTcpHandler.messageReceived(SyslogTcpSource.java:76) > at > org.jboss.netty.channel.Channels.fireMessageReceived(Channels.java:268) > at > org.jboss.netty.channel.Channels.fireMessageReceived(Channels.java:255) > at org.jboss.netty.channel.socket.nio.NioWorker.read(NioWorker.java:94) > at > org.jboss.netty.channel.socket.nio.AbstractNioWorker.processSelectedKeys(AbstractNioWorker.java:364) > at > org.jboss.netty.channel.socket.nio.AbstractNioWorker.run(AbstractNioWorker.java:238) > at org.jboss.netty.channel.socket.nio.NioWorker.run(NioWorker.java:38) > at > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) > at > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) > at java.lang.Thread.run(Thread.java:745) > {code} > This exception does not get handled, and it happens before reset() can be > called. The result is that the state machine in SyslogUtils gets stuck in > the DATA state, and all subsequent data just gets appended to the baos, while > the above exception streams to the log. Eventually the agent runs out of > memory. > Scenario 2: Send some data like this: > {{<123...}} > No length checking is done in the PRIO state so you could potentially fill > the agent memory this way too. > I'm attaching a patch which handles both of these issues and adds more > exception handling to buildEvent to make sure that reset() is called in > future unforeseen situations. > Thanks also to [~roshan_naik] for helping to make this patch better. 
-- This message was sent by Atlassian JIRA (v6.3.4#6332)
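The two failure modes above (an empty `<>` priority and an unterminated `<123...` run of digits) both come down to parsing the syslog `<PRI>` field without validation. A defensive parse might look like the following sketch — illustrative only, not the code from the attached patch; `SyslogPriority` and `MAX_PRIO_DIGITS` are hypothetical names. It bounds the field length, rejects non-digits, and reports malformed input to the caller so the state machine can reset() instead of throwing mid-parse.

```java
// Hypothetical defensive sketch of the syslog priority parse hardened by
// FLUME-2798: an empty "<>" (scenario 1) or an over-long "<123456..." field
// (scenario 2) fails fast with -1, letting the caller reset its state rather
// than hitting NumberFormatException or growing an unbounded buffer.
class SyslogPriority {
    static final int MAX_PRIO_DIGITS = 3;   // RFC 3164 PRI is 1-3 digits

    // Returns the priority value, or -1 for a malformed field.
    static int parse(String msg) {
        if (msg.length() < 3 || msg.charAt(0) != '<') return -1;
        int end = msg.indexOf('>');
        // rejects "<>", a missing '>', and too many digits
        if (end <= 1 || end > 1 + MAX_PRIO_DIGITS) return -1;
        int prio = 0;
        for (int i = 1; i < end; i++) {
            char c = msg.charAt(i);
            if (c < '0' || c > '9') return -1;
            prio = prio * 10 + (c - '0');
        }
        return prio;
    }
}
```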
[jira] [Assigned] (FLUME-2798) Malformed Syslog messages can lead to OutOfMemoryException
[ https://issues.apache.org/jira/browse/FLUME-2798?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Roshan Naik reassigned FLUME-2798: -- Assignee: Roshan Naik 
-- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (FLUME-2225) Elasticsearch Sink for ES HTTP API
[ https://issues.apache.org/jira/browse/FLUME-2225?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14739203#comment-14739203 ] Roshan Naik commented on FLUME-2225: [~mbonica] can you open a new JIRA for this? > Elasticsearch Sink for ES HTTP API > -- > > Key: FLUME-2225 > URL: https://issues.apache.org/jira/browse/FLUME-2225 > Project: Flume > Issue Type: New Feature >Affects Versions: v1.5.0 >Reporter: Otis Gospodnetic >Assignee: Pawel Rog > Fix For: v1.4.1, v1.5.0 > > Attachments: FLUME-2225-0.patch, FLUME-2225-1.patch, > FLUME-2225-5.patch, FLUME-2225-6.patch > > > The existing ElasticSearchSink uses the ES TransportClient. As such, one cannot use > the ES HTTP API, which is sometimes easier, and doesn't have issues around > client and server/cluster components using incompatible versions - currently, > both client and server/cluster need to be on the same version. > See > http://search-hadoop.com/m/k76HH9Te68/otis&subj=Elasticsearch+sink+that+uses+HTTP+API -- This message was sent by Atlassian JIRA (v6.3.4#6332)
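To make the HTTP-API idea above concrete: an HTTP-based sink would batch events into the bulk endpoint's newline-delimited payload and POST it to `/_bulk`, with no requirement that client and cluster jar versions match. The sketch below only shows the payload format; the index and type names are illustrative, and this is not the sink from the attached patches.

```java
import java.util.List;

// Illustrative sketch of the Elasticsearch bulk-API payload an HTTP-based
// sink could POST to http://host:9200/_bulk: one action line per document,
// with every line (including the last) terminated by '\n' as the bulk API
// requires.
class BulkPayload {
    static String build(String index, String type, List<String> jsonDocs) {
        String action = "{\"index\":{\"_index\":\"" + index
                + "\",\"_type\":\"" + type + "\"}}\n";
        StringBuilder sb = new StringBuilder();
        for (String doc : jsonDocs) {
            sb.append(action).append(doc).append('\n');
        }
        return sb.toString();
    }
}
```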
[jira] [Updated] (FLUME-2433) Add kerberos support for Hive sink
[ https://issues.apache.org/jira/browse/FLUME-2433?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Roshan Naik updated FLUME-2433: --- Attachment: FLUME-2433.v2.patch [~jrufus] It seems like your suggestion is the right way to go. My current implementation mimics the previous HDFS kerberos implementation and doesn't use mod-auth. Right now, I am uploading the rebased patch, as it will take me some time to figure out what changes are needed for switching to mod-auth. System testing of the new implementation will also take some time, as it requires a secure cluster setup. I'll leave it up to you whether to commit this in its current state and make the switch to mod-auth in another JIRA, or to hold this JIRA for the revised implementation. It will take me some time to do that, I think. > Add kerberos support for Hive sink > -- > > Key: FLUME-2433 > URL: https://issues.apache.org/jira/browse/FLUME-2433 > Project: Flume > Issue Type: Bug > Components: Sinks+Sources >Affects Versions: v1.5.0.1 >Reporter: Roshan Naik >Assignee: Roshan Naik > Labels: HiveSink, Kerberos, > Attachments: FLUME-2433.patch, FLUME-2433.v2.patch > > > Add kerberos authentication support for Hive sink > FYI: The HCatalog API support for Kerberos is not available in hive 0.13.1 > this should be available in the next hive release. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Resolved] (FLUME-2754) Hive Sink skipping first transaction in each Batch of Hive Transactions
[ https://issues.apache.org/jira/browse/FLUME-2754?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Roshan Naik resolved FLUME-2754. Resolution: Fixed Fix Version/s: v1.7.0 Committed. > Hive Sink skipping first transaction in each Batch of Hive Transactions > --- > > Key: FLUME-2754 > URL: https://issues.apache.org/jira/browse/FLUME-2754 > Project: Flume > Issue Type: Bug >Affects Versions: v1.5.0, 1.6 >Reporter: Roshan Naik >Assignee: Deepesh Khandelwal > Fix For: v1.7.0 > > Attachments: FLUME-2754.patch > > -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (FLUME-2754) Hive Sink skipping first transaction in each Batch of Hive Transactions
[ https://issues.apache.org/jira/browse/FLUME-2754?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14712332#comment-14712332 ] Roshan Naik commented on FLUME-2754: Thanks [~deepesh] for the patch and for including the test. +1 > Hive Sink skipping first transaction in each Batch of Hive Transactions > --- > > Key: FLUME-2754 > URL: https://issues.apache.org/jira/browse/FLUME-2754 > Project: Flume > Issue Type: Bug >Affects Versions: v1.5.0, 1.6 >Reporter: Roshan Naik >Assignee: Deepesh Khandelwal > Attachments: FLUME-2754.patch > > -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (FLUME-2498) Implement Taildir Source
[ https://issues.apache.org/jira/browse/FLUME-2498?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14709683#comment-14709683 ] Roshan Naik commented on FLUME-2498: [~evilezh] could you open a JIRA for that feature request, and consider submitting a patch for it? > Implement Taildir Source > > > Key: FLUME-2498 > URL: https://issues.apache.org/jira/browse/FLUME-2498 > Project: Flume > Issue Type: New Feature > Components: Sinks+Sources >Reporter: Satoshi Iijima > Fix For: v1.7.0 > > Attachments: FLUME-2498-2.patch, FLUME-2498-3.patch, > FLUME-2498-4.patch, FLUME-2498-5.patch, FLUME-2498.patch > > > This is a proposal to implement a new tailing source. > This source watches the specified files, and tails them in nearly real-time > once appends to these files are detected. > * This source is reliable and will not miss data even when the tailing files > rotate. > * It periodically writes the last read position of each file in a position > file using the JSON format. > * If Flume is stopped or down for some reason, it can restart tailing from > the position written in the existing position file. > * It can add event headers to each tailing file group. > The attached patch includes configuration documentation for this source. > This source requires a Unix-style file system and Java 1.7 or later. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
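The position-file mechanism in the proposal above can be sketched roughly like this: the source periodically serializes each tailed file's last read offset as JSON, and on restart resumes from the recorded offsets. The `PositionFile` class and the `file`/`pos` field names are illustrative, not the exact schema the patch uses.

```java
import java.util.Map;

// Illustrative sketch of a Taildir-style position file: serialize the last
// read offset of each tailed file as a JSON array, so tailing can resume
// from the recorded position after a restart. Field names are hypothetical.
class PositionFile {
    static String toJson(Map<String, Long> positions) {
        StringBuilder sb = new StringBuilder("[");
        boolean first = true;
        for (Map.Entry<String, Long> e : positions.entrySet()) {
            if (!first) sb.append(',');
            first = false;
            sb.append("{\"file\":\"").append(e.getKey())
              .append("\",\"pos\":").append(e.getValue()).append('}');
        }
        return sb.append(']').toString();
    }
}
```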
[jira] [Commented] (FLUME-2433) Add kerberos support for Hive sink
[ https://issues.apache.org/jira/browse/FLUME-2433?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14707801#comment-14707801 ] Roshan Naik commented on FLUME-2433: This patch probably needs to be rebased. It's actually been in production for over a year with the HDP distribution. Perhaps [~ashishpaliwal] or [~jrufus] can help with review and commit once I revise the patch. > Add kerberos support for Hive sink > -- > > Key: FLUME-2433 > URL: https://issues.apache.org/jira/browse/FLUME-2433 > Project: Flume > Issue Type: Bug > Components: Sinks+Sources >Affects Versions: v1.5.0.1 >Reporter: Roshan Naik >Assignee: Roshan Naik > Labels: HiveSink, Kerberos, > Attachments: FLUME-2433.patch > > > Add kerberos authentication support for Hive sink > FYI: The HCatalog API support for Kerberos is not available in hive 0.13.1 > this should be available in the next hive release. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (FLUME-2436) Make hadoop-2 the default build profile
[ https://issues.apache.org/jira/browse/FLUME-2436?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14707288#comment-14707288 ] Roshan Naik commented on FLUME-2436: [~raviprak] With Flume 1.6 and above, the default profile should be 'hbase-1', which also uses Hadoop 2. Not sure the hadoop-2 profile has much use anymore. > Make hadoop-2 the default build profile > --- > > Key: FLUME-2436 > URL: https://issues.apache.org/jira/browse/FLUME-2436 > Project: Flume > Issue Type: Bug >Reporter: Hari Shreedharan >Assignee: Johny Rufus > Labels: build > Attachments: FLUME-2436.patch > > -- This message was sent by Atlassian JIRA (v6.3.4#6332)