[jira] [Commented] (FLUME-3036) Create a RegexSerializer for Hive Sink

2017-04-03 Thread Roshan Naik (JIRA)

[ 
https://issues.apache.org/jira/browse/FLUME-3036?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15954335#comment-15954335
 ] 

Roshan Naik commented on FLUME-3036:


[~kalyanhadoop] thanks for your efforts to get the RegexWriter patch into Hive.
Looks like there is a release candidate out for Hive v1.2.2 ... so we should be 
able to commit this in Flume as soon as that gets released.
Does that sound ok?

Looks like we need a revised patch here without the RegexWriter class. It would 
also be good to combine the docs and code patches into one.

> Create a RegexSerializer for Hive Sink
> --
>
> Key: FLUME-3036
> URL: https://issues.apache.org/jira/browse/FLUME-3036
> Project: Flume
>  Issue Type: New Feature
>  Components: Sinks+Sources
>Reporter: Kalyan
>Assignee: Kalyan
> Attachments: FLUME-3036-docs.patch, FLUME-3036.patch
>
>
> Providing Hive Regex Serializer to work with Flume Streaming data.
> Based on existing hive sink using hive regex serde.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (FLUME-2620) File channel throws NullPointerException if a header value is null

2017-03-13 Thread Roshan Naik (JIRA)

[ 
https://issues.apache.org/jira/browse/FLUME-2620?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15923388#comment-15923388
 ] 

Roshan Naik commented on FLUME-2620:


[~marcellhegedus] It is easier to review the patches if they are put on [Review 
Board|https://reviews.apache.org/r/].

> File channel throws NullPointerException if a header value is null
> --
>
> Key: FLUME-2620
> URL: https://issues.apache.org/jira/browse/FLUME-2620
> Project: Flume
>  Issue Type: Bug
>  Components: File Channel
>Reporter: Santiago M. Mola
>Assignee: Neerja Khattar
> Attachments: FLUME-2620-0.patch, FLUME-2620-1.patch, 
> FLUME-2620-2.patch, FLUME-2620-3.patch, FLUME-2620-4.patch, 
> FLUME-2620-5.patch, FLUME-2620.patch, FLUME-2620.patch
>
>
> File channel throws NullPointerException if a header value is null.
> If this is intended, it should be reported correctly in the logs.
> Sample trace:
> org.apache.flume.ChannelException: Unable to put batch on required channel: 
> FileChannel chan { dataDirs: [/var/lib/ingestion-csv/chan/data] }
>   at 
> org.apache.flume.channel.ChannelProcessor.processEventBatch(ChannelProcessor.java:200)
>   at 
> org.apache.flume.source.SpoolDirectorySource$SpoolDirectoryRunnable.run(SpoolDirectorySource.java:236)
>   at 
> java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
>   at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:304)
>   at 
> java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:178)
>   at 
> java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293)
>   at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>   at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>   at java.lang.Thread.run(Thread.java:745)
> Caused by: java.lang.NullPointerException
>   at 
> org.apache.flume.channel.file.proto.ProtosFactory$FlumeEventHeader$Builder.setValue(ProtosFactory.java:7415)
>   at org.apache.flume.channel.file.Put.writeProtos(Put.java:85)
>   at 
> org.apache.flume.channel.file.TransactionEventRecord.toByteBuffer(TransactionEventRecord.java:174)
>   at org.apache.flume.channel.file.Log.put(Log.java:622)
>   at 
> org.apache.flume.channel.file.FileChannel$FileBackedTransaction.doPut(FileChannel.java:469)
>   at 
> org.apache.flume.channel.BasicTransactionSemantics.put(BasicTransactionSemantics.java:93)
>   at 
> org.apache.flume.channel.BasicChannelSemantics.put(BasicChannelSemantics.java:80)
>   at 
> org.apache.flume.channel.ChannelProcessor.processEventBatch(ChannelProcessor.java:189)
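
For illustration, the NPE above originates in the protobuf-generated header builder, which 
rejects null values (ProtosFactory$FlumeEventHeader$Builder.setValue in the trace). Below is a 
minimal, self-contained sketch of the kind of guard a fix could apply before values reach that 
builder; the class and header names here are made up and this is not the attached patch.

{code}
import java.util.HashMap;
import java.util.Map;

// Illustrative sketch only: skip or report null header values clearly
// instead of letting the generated builder's NullPointerException surface
// later as an opaque ChannelException.
public class NullHeaderGuardSketch {
  public static void main(String[] args) {
    Map<String, String> headers = new HashMap<String, String>();
    headers.put("host", "example-host");
    headers.put("trigger", null);   // the problematic header value

    for (Map.Entry<String, String> e : headers.entrySet()) {
      if (e.getValue() == null) {
        // Log a clear warning (or substitute an empty string) here.
        System.out.println("WARN: header '" + e.getKey() + "' has a null value; skipping");
        continue;
      }
      System.out.println(e.getKey() + "=" + e.getValue());
    }
  }
}
{code}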



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (FLUME-2620) File channel throws NullPointerException if a header value is null

2017-03-13 Thread Roshan Naik (JIRA)

[ 
https://issues.apache.org/jira/browse/FLUME-2620?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15923387#comment-15923387
 ] 

Roshan Naik commented on FLUME-2620:


[~marcellhegedus] I have added you as a contributor. You should be able to go 
ahead and assign this JIRA to yourself if you like.

> File channel throws NullPointerException if a header value is null
> --
>
> Key: FLUME-2620
> URL: https://issues.apache.org/jira/browse/FLUME-2620
> Project: Flume
>  Issue Type: Bug
>  Components: File Channel
>Reporter: Santiago M. Mola
>Assignee: Neerja Khattar
> Attachments: FLUME-2620-0.patch, FLUME-2620-1.patch, 
> FLUME-2620-2.patch, FLUME-2620-3.patch, FLUME-2620-4.patch, 
> FLUME-2620-5.patch, FLUME-2620.patch, FLUME-2620.patch
>
>
> File channel throws NullPointerException if a header value is null.
> If this is intended, it should be reported correctly in the logs.
> Sample trace:
> org.apache.flume.ChannelException: Unable to put batch on required channel: 
> FileChannel chan { dataDirs: [/var/lib/ingestion-csv/chan/data] }
>   at 
> org.apache.flume.channel.ChannelProcessor.processEventBatch(ChannelProcessor.java:200)
>   at 
> org.apache.flume.source.SpoolDirectorySource$SpoolDirectoryRunnable.run(SpoolDirectorySource.java:236)
>   at 
> java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
>   at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:304)
>   at 
> java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:178)
>   at 
> java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293)
>   at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>   at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>   at java.lang.Thread.run(Thread.java:745)
> Caused by: java.lang.NullPointerException
>   at 
> org.apache.flume.channel.file.proto.ProtosFactory$FlumeEventHeader$Builder.setValue(ProtosFactory.java:7415)
>   at org.apache.flume.channel.file.Put.writeProtos(Put.java:85)
>   at 
> org.apache.flume.channel.file.TransactionEventRecord.toByteBuffer(TransactionEventRecord.java:174)
>   at org.apache.flume.channel.file.Log.put(Log.java:622)
>   at 
> org.apache.flume.channel.file.FileChannel$FileBackedTransaction.doPut(FileChannel.java:469)
>   at 
> org.apache.flume.channel.BasicTransactionSemantics.put(BasicTransactionSemantics.java:93)
>   at 
> org.apache.flume.channel.BasicChannelSemantics.put(BasicChannelSemantics.java:80)
>   at 
> org.apache.flume.channel.ChannelProcessor.processEventBatch(ChannelProcessor.java:189)



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Updated] (FLUME-3036) Create a RegexSerializer for Hive Sink

2017-01-20 Thread Roshan Naik (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLUME-3036?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Roshan Naik updated FLUME-3036:
---
Summary: Create a RegexSerializer for Hive Sink  (was: Create a Hive Sink 
based on the Streaming support with RegexSerializer)

> Create a RegexSerializer for Hive Sink
> --
>
> Key: FLUME-3036
> URL: https://issues.apache.org/jira/browse/FLUME-3036
> Project: Flume
>  Issue Type: New Feature
>  Components: Sinks+Sources
>Reporter: Kalyan
>Assignee: Kalyan
> Attachments: FLUME-3036-docs.patch, FLUME-3036.patch
>
>
> Providing Hive Regex Serializer to work with Flume Streaming data.
> Based on existing hive sink using hive regex serde.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (FLUME-3036) Create a Hive Sink based on the Streaming support with RegexSerializer

2017-01-18 Thread Roshan Naik (JIRA)

[ 
https://issues.apache.org/jira/browse/FLUME-3036?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15829101#comment-15829101
 ] 

Roshan Naik commented on FLUME-3036:


Put my comments on GitHub.

> Create a Hive Sink based on the Streaming support with RegexSerializer
> --
>
> Key: FLUME-3036
> URL: https://issues.apache.org/jira/browse/FLUME-3036
> Project: Flume
>  Issue Type: New Feature
>  Components: Sinks+Sources
>Reporter: Kalyan
>Assignee: Kalyan
> Attachments: FLUME-3036-docs.patch, FLUME-3036.patch
>
>
> Providing Hive Regex Serializer to work with Flume Streaming data.
> Based on existing hive sink using hive regex serde.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (FLUME-3036) Create a Hive Sink based on the Streaming support with RegexSerializer

2017-01-18 Thread Roshan Naik (JIRA)

[ 
https://issues.apache.org/jira/browse/FLUME-3036?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15829050#comment-15829050
 ] 

Roshan Naik commented on FLUME-3036:


[~kalyanhadoop] Can you please comment on what other testing you were able to 
do in addition to the new unit test?

> Create a Hive Sink based on the Streaming support with RegexSerializer
> --
>
> Key: FLUME-3036
> URL: https://issues.apache.org/jira/browse/FLUME-3036
> Project: Flume
>  Issue Type: New Feature
>  Components: Sinks+Sources
>Reporter: Kalyan
>Assignee: Kalyan
> Attachments: FLUME-3036-docs.patch, FLUME-3036.patch
>
>
> Providing Hive Regex Serializer to work with Flume Streaming data.
> Based on existing hive sink using hive regex serde.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (FLUME-3036) Create a Hive Sink based on the Streaming support with RegexSerializer

2016-12-27 Thread Roshan Naik (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLUME-3036?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Roshan Naik updated FLUME-3036:
---
Assignee: Kalyan

> Create a Hive Sink based on the Streaming support with RegexSerializer
> --
>
> Key: FLUME-3036
> URL: https://issues.apache.org/jira/browse/FLUME-3036
> Project: Flume
>  Issue Type: New Feature
>  Components: Sinks+Sources
>Reporter: Kalyan
>Assignee: Kalyan
> Attachments: FLUME-3036.patch
>
>
> Providing Hive Regex Serializer to work with Flume Streaming data.
> Based on existing hive sink using hive regex serde.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (FLUME-3036) Create a Hive Sink based on the Streaming support with RegexSerializer

2016-12-27 Thread Roshan Naik (JIRA)

[ 
https://issues.apache.org/jira/browse/FLUME-3036?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15781609#comment-15781609
 ] 

Roshan Naik commented on FLUME-3036:


Can you update the user guide with info on this?

> Create a Hive Sink based on the Streaming support with RegexSerializer
> --
>
> Key: FLUME-3036
> URL: https://issues.apache.org/jira/browse/FLUME-3036
> Project: Flume
>  Issue Type: New Feature
>  Components: Sinks+Sources
>Reporter: Kalyan
> Attachments: FLUME-3036.patch
>
>
> Providing Hive Regex Serializer to work with Flume Streaming data.
> Based on existing hive sink using hive regex serde.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (FLUME-2892) Unable to compile & install flume with Maven

2016-09-14 Thread Roshan Naik (JIRA)

[ 
https://issues.apache.org/jira/browse/FLUME-2892?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15490818#comment-15490818
 ] 

Roshan Naik commented on FLUME-2892:


The console output seems like an inconsistent copy/paste job 
... initially the output says ...
{quote}[ERROR] Failed to execute goal 
org.apache.maven.plugins:maven-surefire-plugin:2.12.3:test (default-test) on 
project flume-jdbc-channel: There are test failures.{quote}

later it says
{quote}
[INFO] Flume NG Spillable Memory channel .. FAILURE [15:02 min]
{quote}

Not sure which is right.

If you are having test-run issues, you can run Maven with the -DskipTests option.


> Unable to compile & install flume with Maven
> 
>
> Key: FLUME-2892
> URL: https://issues.apache.org/jira/browse/FLUME-2892
> Project: Flume
>  Issue Type: Bug
>Affects Versions: v1.6.0
> Environment: AWS running:
> Ubuntu 
> Maven 3.3.4
> Java 1.6
>Reporter: Nathan Sturgess
>Priority: Critical
>
> This is my situation: 
> http://stackoverflow.com/questions/35866078/flume-kafka-integration-produces-zookeeper-related-error
> I am trying to build flume with this patch for zookeeper and it is not 
> working. Most recent example of errors: 
> [INFO] 
> 
> [INFO] BUILD FAILURE
> [INFO] 
> 
> [INFO] Total time: 16:34 min
> [INFO] Finished at: 2016-03-10T12:01:54+00:00
> [INFO] Final Memory: 64M/192M
> [INFO] 
> 
> [ERROR] Failed to execute goal 
> org.apache.maven.plugins:maven-surefire-plugin:2.12.3:test (default-test) on 
> project flume-jdbc-channel: There are test failures.
> [ERROR]
> [ERROR] Please refer to 
> /home/bitnami/flume/flume-ng-channels/flume-jdbc-channel/target/surefire-reports
>  for the individual test results.
> [ERROR] -> [Help 1]
> [ERROR]
> [ERROR] To see the full stack trace of the errors, re-run Maven with the -e 
> switch.
> [ERROR] Re-run Maven using the -X switch to enable full debug logging.
> [ERROR]
> [ERROR] For more information about the errors and possible solutions, please 
> read the following articles:
> [ERROR] [Help 1] 
> http://cwiki.apache.org/confluence/display/MAVEN/MojoFailureException
> [ERROR]
> [ERROR] After correcting the problems, you can resume the build with the 
> command
> [ERROR]   mvn  -rf :flume-jdbc-channel
> 
> [INFO] Flume NG Spillable Memory channel .. FAILURE [15:02 
> min]
> [INFO] Flume NG Node .. SKIPPED
> [INFO] Flume NG Embedded Agent  SKIPPED
> [INFO] Flume NG HBase Sink  SKIPPED
> [INFO] Flume NG ElasticSearch Sink  SKIPPED
> [INFO] Flume NG Morphline Solr Sink ... SKIPPED
> [INFO] Flume Kafka Sink ... SKIPPED
> [INFO] Flume NG Kite Dataset Sink . SKIPPED
> [INFO] Flume NG Hive Sink . SKIPPED
> [INFO] Flume Sources .. SKIPPED
> [INFO] Flume Scribe Source  SKIPPED
> [INFO] Flume JMS Source ... SKIPPED
> [INFO] Flume Twitter Source ... SKIPPED
> [INFO] Flume Kafka Source . SKIPPED
> [INFO] flume-kafka-channel  SKIPPED
> [INFO] Flume legacy Sources ... SKIPPED
> [INFO] Flume legacy Avro source ... SKIPPED
> [INFO] Flume legacy Thrift Source . SKIPPED
> [INFO] Flume NG Clients ... SKIPPED
> [INFO] Flume NG Log4j Appender  SKIPPED
> [INFO] Flume NG Tools . SKIPPED
> [INFO] Flume NG distribution .. SKIPPED
> [INFO] Flume NG Integration Tests . SKIPPED
> [INFO] 
> 
> [INFO] BUILD FAILURE
> [INFO] 
> 
> [INFO] Total time: 34:20 min
> [INFO] Finished at: 2016-03-10T12:52:02+00:00
> [INFO] Final Memory: 52M/247M
> [INFO] 
> 
> [ERROR] Failed to execute goal 
> org.apache.maven.plugins:maven-surefire-plugin:2.12.3:test (default-test) on 
> project flume-spillable-memory-channel: ExecutionException; nested exception 
> is java.util.concurrent.ExecutionException: java.lang.RuntimeException: The 
> forked VM terminated without saying 

[jira] [Commented] (FLUME-2965) race condition in SpillableMemoryChannel log print

2016-08-03 Thread Roshan Naik (JIRA)

[ 
https://issues.apache.org/jira/browse/FLUME-2965?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15406832#comment-15406832
 ] 

Roshan Naik commented on FLUME-2965:


Thanks [~lqp276]
Took a quick look... and in general it looks like the right fix. It would be 
easier to identify all the changes in this fix if just the diff/patch was 
uploaded. Also please create a code review as suggested by Denes.

> race condition in SpillableMemoryChannel log print
> --
>
> Key: FLUME-2965
> URL: https://issues.apache.org/jira/browse/FLUME-2965
> Project: Flume
>  Issue Type: Bug
>  Components: Channel
>Affects Versions: v1.7.0
>Reporter: liqiaoping
>Priority: Minor
> Attachments: SpillableMemoryChannel.java
>
>
> Use SpillableMemoryChannel with the HTTP blob handler and send many requests 
> concurrently. As Jetty has a thread pool to handle incoming requests, the 
> commits to SpillableMemoryChannel will be concurrent.
> The following code:
> @Override
> protected void doCommit() throws InterruptedException {
>   if (putCalled) {
> putCommit();
> if (LOGGER.isDebugEnabled()) {
>   LOGGER.debug("Put Committed. Drain Order Queue state : "
>   + drainOrder.dump());
> }
> In this method, drainOrder.dump() will iterate its internal queue; in the 
> meantime the queue may have been changed by another thread, which throws a 
> ConcurrentModificationException. This makes the channel processor try to roll 
> back, even though the transaction has actually committed successfully.
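
For illustration only (the attached SpillableMemoryChannel.java carries the actual fix): one way 
to make such a debug dump safe against concurrent commits is to snapshot the queue under a lock 
before building the string, so logging can never throw and fail an otherwise-successful commit. 
The class and field names below are hypothetical.

{code}
import java.util.ArrayDeque;
import java.util.Queue;

// Hypothetical sketch of a thread-safe dump(); not the attached fix.
public class DrainOrderSketch {
  private final Queue<Integer> queue = new ArrayDeque<Integer>();
  private final Object lock = new Object();

  public void append(int takeCount) {
    synchronized (lock) {
      queue.add(takeCount);
    }
  }

  public String dump() {
    // Copy under the lock, then build the debug string from the snapshot;
    // concurrent puts/commits can no longer trigger a
    // ConcurrentModificationException while the string is built.
    Integer[] snapshot;
    synchronized (lock) {
      snapshot = queue.toArray(new Integer[queue.size()]);
    }
    StringBuilder sb = new StringBuilder("[ ");
    for (Integer v : snapshot) {
      sb.append(v).append(' ');
    }
    return sb.append(']').toString();
  }
}
{code}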



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (FLUME-2947) Upgrade Hive and thrift libraries

2016-07-05 Thread Roshan Naik (JIRA)
Roshan Naik created FLUME-2947:
--

 Summary: Upgrade Hive and thrift libraries
 Key: FLUME-2947
 URL: https://issues.apache.org/jira/browse/FLUME-2947
 Project: Flume
  Issue Type: Bug
  Components: Sinks+Sources
Reporter: Roshan Naik
Assignee: Roshan Naik


Upgrade Hive version to use new API call that captures additional info about 
the agent's sink name (HIVE-11956). Also upgrade thrift libraries as Hive is 
moving to 0.9.3 (HIVE-13724). Some of the 0.9.2 methods are now missing in 
0.9.3. So need to upgrade both.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Resolved] (FLUME-2762) Improve HDFS Sink performance

2016-06-21 Thread Roshan Naik (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLUME-2762?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Roshan Naik resolved FLUME-2762.

  Resolution: Invalid
Release Note: my thoughts didn't pan out on prototyping

> Improve HDFS Sink performance
> -
>
> Key: FLUME-2762
> URL: https://issues.apache.org/jira/browse/FLUME-2762
> Project: Flume
>  Issue Type: Bug
>  Components: Sinks+Sources
>Reporter: Roshan Naik
>Assignee: Roshan Naik
> Fix For: v1.7.0
>
>
> Have some thoughts around improving HDFS sink's performance.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (FLUME-2914) Upgrade httpclient version 4.3.6

2016-06-08 Thread Roshan Naik (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLUME-2914?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Roshan Naik updated FLUME-2914:
---
Attachment: FLUME-2914.v2.patch

> Upgrade httpclient version 4.3.6
> 
>
> Key: FLUME-2914
> URL: https://issues.apache.org/jira/browse/FLUME-2914
> Project: Flume
>  Issue Type: Bug
>Affects Versions: v1.7.0
>Reporter: Roshan Naik
>Assignee: Roshan Naik
> Attachments: FLUME-2914.patch, FLUME-2914.v2.patch
>
>
> There seem to be security vulnerabilities in httpclient from httpcomponents 
> as per https://issues.apache.org/jira/browse/HADOOP-12767



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (FLUME-2799) Kafka Source - Message Offset and Partition add to headers

2016-06-07 Thread Roshan Naik (JIRA)

[ 
https://issues.apache.org/jira/browse/FLUME-2799?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15319493#comment-15319493
 ] 

Roshan Naik commented on FLUME-2799:


[~michael.andre.pearce] any update ?

> Kafka Source - Message Offset and Partition add to headers
> --
>
> Key: FLUME-2799
> URL: https://issues.apache.org/jira/browse/FLUME-2799
> Project: Flume
>  Issue Type: Improvement
>  Components: Sinks+Sources
>Affects Versions: v1.6.0
>Reporter: Michael Andre Pearce (IG)
>Priority: Minor
>  Labels: easyfix, patch
> Fix For: v1.7.0
>
> Attachments: FLUME-2799-0.patch
>
>   Original Estimate: 24h
>  Remaining Estimate: 24h
>
> Currently Kafka source only persists the original kafka message's topic into 
> the Flume event headers.
> For downstream interceptors and sinks that may want to have available to them 
> the partition and the offset , we need to add these.
> Also it is noted that the conversion from MessageAndMetaData to FlumeEvent is 
> not configurable unlike other sources such as JMS.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (FLUME-2914) Upgrade httpclient version 4.3.6

2016-05-26 Thread Roshan Naik (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLUME-2914?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Roshan Naik updated FLUME-2914:
---
Attachment: FLUME-2914.patch

> Upgrade httpclient version 4.3.6
> 
>
> Key: FLUME-2914
> URL: https://issues.apache.org/jira/browse/FLUME-2914
> Project: Flume
>  Issue Type: Bug
>Reporter: Roshan Naik
> Attachments: FLUME-2914.patch
>
>
> There seem to be security vulnerabilities in httpclient from httpcomponents 
> as per https://issues.apache.org/jira/browse/HADOOP-12767



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (FLUME-2914) Upgrade httpclient version 4.3.6

2016-05-26 Thread Roshan Naik (JIRA)
Roshan Naik created FLUME-2914:
--

 Summary: Upgrade httpclient version 4.3.6
 Key: FLUME-2914
 URL: https://issues.apache.org/jira/browse/FLUME-2914
 Project: Flume
  Issue Type: Bug
Reporter: Roshan Naik


There seem to be security vulnerabilities in httpclient from httpcomponents as 
per https://issues.apache.org/jira/browse/HADOOP-12767





--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (FLUME-2792) Flume Kafka Kerberos Support

2016-05-17 Thread Roshan Naik (JIRA)

[ 
https://issues.apache.org/jira/browse/FLUME-2792?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15287547#comment-15287547
 ] 

Roshan Naik commented on FLUME-2792:


Yes ..  in the next release.

> Flume Kafka Kerberos Support
> 
>
> Key: FLUME-2792
> URL: https://issues.apache.org/jira/browse/FLUME-2792
> Project: Flume
>  Issue Type: Bug
>  Components: Configuration, Docs, Sinks+Sources
>Affects Versions: v1.6.0, v1.5.2
> Environment: HDP 2.3 fully kerberized including Kafka 0.8.2.2 + Flume 
> 1.5.2 or Apache Flume 1.6 downloaded from apache.org
>Reporter: Hari Sekhon
>Priority: Blocker
>
> Following on from FLUME-2790 it appears as though Flume doesn't yet have 
> support for Kafka + Kerberos, as there is no setting documented in the 
> Flume 1.6.0 user guide under the Kafka source section to tell Flume to use 
> plaintextsasl as the connection mechanism to Kafka and Kafka rejects 
> unauthenticated plaintext mechanism:
> {code}15/09/10 16:51:22 INFO consumer.ConsumerFetcherManager: 
> [ConsumerFetcherManager-1441903874830] Added fetcher for partitions 
> ArrayBuffer()
> 15/09/10 16:51:22 WARN consumer.ConsumerFetcherManager$LeaderFinderThread: 
> [flume_-1441903874763-abdc98ec-leader-finder-thread], Failed 
> to find leader for Set([,0], [,1])
> kafka.common.BrokerEndPointNotAvailableException: End point PLAINTEXT not 
> found for broker 0
> at kafka.cluster.Broker.getBrokerEndPoint(Broker.scala:140)
> at 
> kafka.utils.ZkUtils$$anonfun$getAllBrokerEndPointsForChannel$1.apply(ZkUtils.scala:124)
> at 
> kafka.utils.ZkUtils$$anonfun$getAllBrokerEndPointsForChannel$1.apply(ZkUtils.scala:124)
> at 
> scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:244)
> at 
> scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:244)
> at 
> scala.collection.mutable.ResizableArray$class.foreach(ResizableArray.scala:59)
> at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:47)
> at 
> scala.collection.TraversableLike$class.map(TraversableLike.scala:244)
> at scala.collection.AbstractTraversable.map(Traversable.scala:105)
> at 
> kafka.utils.ZkUtils$.getAllBrokerEndPointsForChannel(ZkUtils.scala:124)
> at 
> kafka.consumer.ConsumerFetcherManager$LeaderFinderThread.doWork(ConsumerFetcherManager.scala:66)
> at 
> kafka.utils.ShutdownableThread.run(ShutdownableThread.scala:60){code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (FLUME-2889) Fixes to DateTime computations

2016-04-25 Thread Roshan Naik (JIRA)

[ 
https://issues.apache.org/jira/browse/FLUME-2889?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15256883#comment-15256883
 ] 

Roshan Naik commented on FLUME-2889:


+1 ..  am running the tests. Shall  commit them once they pass.

> Fixes to  DateTime computations
> ---
>
> Key: FLUME-2889
> URL: https://issues.apache.org/jira/browse/FLUME-2889
> Project: Flume
>  Issue Type: Bug
>Affects Versions: v1.6.0
>Reporter: Roshan Naik
>Assignee: Tristan Stevens
> Fix For: v1.7.0
>
> Attachments: FLUME-2889-2.patch, FLUME-2889-4.patch, 
> FLUME-2889.3.patch, FLUME-2889.patch
>
>
> date.withYear(year+1)  can lead to incorrect date calculations .. for example 
> if  the date is Feb 29th.   need to use date.plusYears(1) instead.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (FLUME-2889) Fixes to DateTime computations

2016-04-25 Thread Roshan Naik (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLUME-2889?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Roshan Naik updated FLUME-2889:
---
Assignee: Tristan Stevens  (was: Roshan Naik)

> Fixes to  DateTime computations
> ---
>
> Key: FLUME-2889
> URL: https://issues.apache.org/jira/browse/FLUME-2889
> Project: Flume
>  Issue Type: Bug
>Affects Versions: v1.6.0
>Reporter: Roshan Naik
>Assignee: Tristan Stevens
> Fix For: v1.7.0
>
> Attachments: FLUME-2889-2.patch, FLUME-2889-4.patch, 
> FLUME-2889.3.patch, FLUME-2889.patch
>
>
> date.withYear(year+1)  can lead to incorrect date calculations .. for example 
> if  the date is Feb 29th.   need to use date.plusYears(1) instead.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (FLUME-2901) Document Kerberos setup for Kafka channel

2016-04-20 Thread Roshan Naik (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLUME-2901?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Roshan Naik updated FLUME-2901:
---
Attachment: FLUME-2901.v3.patch

Uploading revised patch v3 with a fix for the issue pointed out by [~jholoman].

> Document Kerberos setup for Kafka channel
> -
>
> Key: FLUME-2901
> URL: https://issues.apache.org/jira/browse/FLUME-2901
> Project: Flume
>  Issue Type: Bug
>  Components: Docs
>Reporter: Roshan Naik
>Assignee: Roshan Naik
> Fix For: v1.7.0
>
> Attachments: FLUME-2901.patch, FLUME-2901.v2.patch, 
> FLUME-2901.v3.patch
>
>
> Add details about  configuring Kafka channel to work with a  Kerberized Kafka 
> cluster



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (FLUME-2901) Document Kerberos setup for Kafka channel

2016-04-20 Thread Roshan Naik (JIRA)

[ 
https://issues.apache.org/jira/browse/FLUME-2901?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15250758#comment-15250758
 ] 

Roshan Naik commented on FLUME-2901:


[~jholoman] thanks for spotting that problem .. shall upload a fixed patch soon.

I considered providing examples for other modes too but was not able to test 
them out. After briefly looking over the other modes, I leaned away from 
providing examples for each of them and instead just referred to them via a 
link... as it seemed a bit much (e.g. with and without authorization), although 
useful.

I decided to include this example since the Kafka folks tell me that 
SASL_PLAINTEXT is the most common one.

> Document Kerberos setup for Kafka channel
> -
>
> Key: FLUME-2901
> URL: https://issues.apache.org/jira/browse/FLUME-2901
> Project: Flume
>  Issue Type: Bug
>  Components: Docs
>Reporter: Roshan Naik
>Assignee: Roshan Naik
> Fix For: v1.7.0
>
> Attachments: FLUME-2901.patch, FLUME-2901.v2.patch
>
>
> Add details about  configuring Kafka channel to work with a  Kerberized Kafka 
> cluster



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (FLUME-2901) Document Kerberos setup for Kafka channel

2016-04-19 Thread Roshan Naik (JIRA)

[ 
https://issues.apache.org/jira/browse/FLUME-2901?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15248741#comment-15248741
 ] 

Roshan Naik commented on FLUME-2901:


[~hshreedharan] can you take a look?

> Document Kerberos setup for Kafka channel
> -
>
> Key: FLUME-2901
> URL: https://issues.apache.org/jira/browse/FLUME-2901
> Project: Flume
>  Issue Type: Bug
>  Components: Docs
>Reporter: Roshan Naik
>Assignee: Roshan Naik
> Fix For: v1.7.0
>
> Attachments: FLUME-2901.patch, FLUME-2901.v2.patch
>
>
> Add details about  configuring Kafka channel to work with a  Kerberized Kafka 
> cluster



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (FLUME-2901) Document Kerberos setup for Kafka channel

2016-04-19 Thread Roshan Naik (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLUME-2901?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Roshan Naik updated FLUME-2901:
---
Attachment: FLUME-2901.v2.patch

Revising the patch with a minor addition to the documentation.

> Document Kerberos setup for Kafka channel
> -
>
> Key: FLUME-2901
> URL: https://issues.apache.org/jira/browse/FLUME-2901
> Project: Flume
>  Issue Type: Bug
>  Components: Docs
>Reporter: Roshan Naik
>Assignee: Roshan Naik
> Fix For: v1.7.0
>
> Attachments: FLUME-2901.patch, FLUME-2901.v2.patch
>
>
> Add details about  configuring Kafka channel to work with a  Kerberized Kafka 
> cluster



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (FLUME-2901) Document Kerberos setup for Kafka channel

2016-04-18 Thread Roshan Naik (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLUME-2901?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Roshan Naik updated FLUME-2901:
---
Attachment: FLUME-2901.patch

Uploading patch. It includes Kafka channel Kerberos documentation and also a 
minor fix to the existing doc (removed capacity & transactionCapacity settings 
from the example as they don't apply to this channel).

> Document Kerberos setup for Kafka channel
> -
>
> Key: FLUME-2901
> URL: https://issues.apache.org/jira/browse/FLUME-2901
> Project: Flume
>  Issue Type: Bug
>  Components: Docs
>Reporter: Roshan Naik
>Assignee: Roshan Naik
> Fix For: v1.7.0
>
> Attachments: FLUME-2901.patch
>
>
> Add details about  configuring Kafka channel to work with a  Kerberized Kafka 
> cluster



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (FLUME-2901) Document Kerberos setup for Kafka channel

2016-04-18 Thread Roshan Naik (JIRA)
Roshan Naik created FLUME-2901:
--

 Summary: Document Kerberos setup for Kafka channel
 Key: FLUME-2901
 URL: https://issues.apache.org/jira/browse/FLUME-2901
 Project: Flume
  Issue Type: Bug
  Components: Docs
Reporter: Roshan Naik
Assignee: Roshan Naik
 Fix For: v1.7.0


Add details about  configuring Kafka channel to work with a  Kerberized Kafka 
cluster



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Comment Edited] (FLUME-2433) Add kerberos support for Hive sink

2016-04-12 Thread Roshan Naik (JIRA)

[ 
https://issues.apache.org/jira/browse/FLUME-2433?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15238492#comment-15238492
 ] 

Roshan Naik edited comment on FLUME-2433 at 4/13/16 2:50 AM:
-

[~wpwang] can you please try the same with the hdfs sink and see if the same 
credentials work.  I think you have some setup issue going on there. Also you 
might have to copy hive-site.xml and hdfs-site.xml into the flume classpath.


was (Author: roshan_naik):
[~wpwang] can you please try the same with the hdfs sink and see if the same 
credentials work.  I think you have some setup issue going on there.  Most 
likely you to have your hive-site.xml and hdfs-site.xml copied over the flume 
classpath.

> Add kerberos support for Hive sink
> --
>
> Key: FLUME-2433
> URL: https://issues.apache.org/jira/browse/FLUME-2433
> Project: Flume
>  Issue Type: Bug
>  Components: Sinks+Sources
>Affects Versions: v1.5.0.1
>Reporter: Roshan Naik
>Assignee: Roshan Naik
>  Labels: HiveSink, Kerberos,
> Attachments: FLUME-2433.patch, FLUME-2433.v2.patch
>
>
> Add kerberos authentication support for Hive sink
> FYI: The HCatalog API support for Kerberos is not available in hive 0.13.1 
> this should be available in the next hive release.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (FLUME-2433) Add kerberos support for Hive sink

2016-04-12 Thread Roshan Naik (JIRA)

[ 
https://issues.apache.org/jira/browse/FLUME-2433?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15238492#comment-15238492
 ] 

Roshan Naik commented on FLUME-2433:


[~wpwang] can you please try the same with the HDFS sink and see if the same 
credentials work. I think you have some setup issue going on there. Most 
likely you need to have your hive-site.xml and hdfs-site.xml copied onto the 
Flume classpath.

> Add kerberos support for Hive sink
> --
>
> Key: FLUME-2433
> URL: https://issues.apache.org/jira/browse/FLUME-2433
> Project: Flume
>  Issue Type: Bug
>  Components: Sinks+Sources
>Affects Versions: v1.5.0.1
>Reporter: Roshan Naik
>Assignee: Roshan Naik
>  Labels: HiveSink, Kerberos,
> Attachments: FLUME-2433.patch, FLUME-2433.v2.patch
>
>
> Add kerberos authentication support for Hive sink
> FYI: The HCatalog API support for Kerberos is not available in hive 0.13.1 
> this should be available in the next hive release.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (FLUME-2781) A Kafka Channel defined as parseAsFlumeEvent=false cannot be correctly used by a Flume source

2016-03-31 Thread Roshan Naik (JIRA)

[ 
https://issues.apache.org/jira/browse/FLUME-2781?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15220676#comment-15220676
 ] 

Roshan Naik commented on FLUME-2781:


Thanks for confirming  [~gherreros]

I got confused when I saw this statement in some of the automated comments here 
in this JIRA:

{quote}"Kafka Channel with parseAsFlumeEvent=true should write data as is, not 
as flume events." .{quote}

which I believe is exactly the opposite of what is intended.

> A Kafka Channel defined as parseAsFlumeEvent=false cannot be correctly used 
> by a Flume source
> -
>
> Key: FLUME-2781
> URL: https://issues.apache.org/jira/browse/FLUME-2781
> Project: Flume
>  Issue Type: Improvement
>Affects Versions: v1.6.0
>Reporter: Gonzalo Herreros
>Assignee: Gonzalo Herreros
>  Labels: easyfix, patch
> Fix For: v1.7.0
>
> Attachments: FLUME-2781.patch
>
>
> When a Kafka channel is configured as parseAsFlumeEvent=false, the channel 
> will read events from the topic as text instead of serialized Avro Flume 
> events.
> This is useful so Flume can read from an existing Kafka topic, where other 
> Kafka clients publish as text.
> However, if you use a Flume source on that channel, it will still write the 
> events as Avro so it will create an inconsistency and those events will fail 
> to be read correctly.
> Also, this would allow a Flume source to write to a Kafka channel and any 
> Kafka subscriber to listen to Flume events passing through without binary 
> dependencies.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (FLUME-2781) A Kafka Channel defined as parseAsFlumeEvent=false cannot be correctly used by a Flume source

2016-03-30 Thread Roshan Naik (JIRA)

[ 
https://issues.apache.org/jira/browse/FLUME-2781?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15219177#comment-15219177
 ] 

Roshan Naik commented on FLUME-2781:


[~gherreros] this is a very useful feature! 

Don't see any note in the UserGuide about this... so needed clarification...

Setting parseAsFlumeEvent=*false* will cause events to be written as-is into 
the Kafka topic without the FlumeEvent wrapper, right?


> A Kafka Channel defined as parseAsFlumeEvent=false cannot be correctly used 
> by a Flume source
> -
>
> Key: FLUME-2781
> URL: https://issues.apache.org/jira/browse/FLUME-2781
> Project: Flume
>  Issue Type: Improvement
>Affects Versions: v1.6.0
>Reporter: Gonzalo Herreros
>Assignee: Gonzalo Herreros
>  Labels: easyfix, patch
> Fix For: v1.7.0
>
> Attachments: FLUME-2781.patch
>
>
> When a Kafka channel is configured as parseAsFlumeEvent=false, the channel 
> will read events from the topic as text instead of serialized Avro Flume 
> events.
> This is useful so Flume can read from an existing Kafka topic, where other 
> Kafka clients publish as text.
> However, if you use a Flume source on that channel, it will still write the 
> events as Avro so it will create an inconsistency and those events will fail 
> to be read correctly.
> Also, this would allow a Flume source to write to a Kafka channel and any 
> Kafka subscriber to listen to Flume events passing through without binary 
> dependencies.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (FLUME-2889) Fixes to DateTime computations

2016-03-24 Thread Roshan Naik (JIRA)

[ 
https://issues.apache.org/jira/browse/FLUME-2889?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15210557#comment-15210557
 ] 

Roshan Naik commented on FLUME-2889:


Sorry [~tmgstev], I have not. Won't be able to get to it for another week at 
least. [~hshreedharan], are you able to take a look?

> Fixes to  DateTime computations
> ---
>
> Key: FLUME-2889
> URL: https://issues.apache.org/jira/browse/FLUME-2889
> Project: Flume
>  Issue Type: Bug
>Affects Versions: v1.6.0
>Reporter: Roshan Naik
>Assignee: Roshan Naik
> Fix For: v1.7.0
>
> Attachments: FLUME-2889-2.patch, FLUME-2889-4.patch, 
> FLUME-2889.3.patch, FLUME-2889.patch
>
>
> date.withYear(year+1)  can lead to incorrect date calculations .. for example 
> if  the date is Feb 29th.   need to use date.plusYears(1) instead.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (FLUME-2889) Fixes to DateTime computations

2016-03-08 Thread Roshan Naik (JIRA)

[ 
https://issues.apache.org/jira/browse/FLUME-2889?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15186250#comment-15186250
 ] 

Roshan Naik commented on FLUME-2889:


Oh, I thought I had committed it... perhaps we can wait for the revision 
Tristan has in mind.

[~tmgstev] please go ahead and reuse this JIRA.

> Fixes to  DateTime computations
> ---
>
> Key: FLUME-2889
> URL: https://issues.apache.org/jira/browse/FLUME-2889
> Project: Flume
>  Issue Type: Bug
>Affects Versions: v1.6.0
>Reporter: Roshan Naik
>Assignee: Roshan Naik
> Fix For: v1.7.0
>
> Attachments: FLUME-2889-2.patch, FLUME-2889.3.patch, FLUME-2889.patch
>
>
> date.withYear(year+1)  can lead to incorrect date calculations .. for example 
> if  the date is Feb 29th.   need to use date.plusYears(1) instead.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (FLUME-2889) Fixes to DateTime computations

2016-02-29 Thread Roshan Naik (JIRA)

[ 
https://issues.apache.org/jira/browse/FLUME-2889?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15172541#comment-15172541
 ] 

Roshan Naik commented on FLUME-2889:


Shall wait for [~tmgstev] till the end of the day to see if he gets a chance to 
review this before committing.

> Fixes to  DateTime computations
> ---
>
> Key: FLUME-2889
> URL: https://issues.apache.org/jira/browse/FLUME-2889
> Project: Flume
>  Issue Type: Bug
>Affects Versions: v1.6.0
>Reporter: Roshan Naik
>Assignee: Roshan Naik
> Fix For: v1.7.0
>
> Attachments: FLUME-2889-2.patch, FLUME-2889.3.patch, FLUME-2889.patch
>
>
> date.withYear(year+1)  can lead to incorrect date calculations .. for example 
> if  the date is Feb 29th.   need to use date.plusYears(1) instead.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (FLUME-2889) Fixes to DateTime computations

2016-02-26 Thread Roshan Naik (JIRA)

[ 
https://issues.apache.org/jira/browse/FLUME-2889?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15170096#comment-15170096
 ] 

Roshan Naik commented on FLUME-2889:


Yes, will wait for Tristan.

Trying to assess the impact this bug has on Flume users. Could use some inputs 
here... My assessment is below:
1) SyslogAvroEventSerializer.java: here it could cause bad data to be written 
out to the file.
2) SyslogParser.java: this looks like it will apply a bad date in the timestamp 
header. That could then either end up causing events to go to the wrong 
destination, if that header is used to determine the destination... or even bad 
data, if the header value is somehow used during serialization.
3) In both cases, is the problem limited to adjusting leap-year dates (Feb 29) 
only? Would be nice to see an example showing the problem.
Thanks
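
For reference, a standalone Joda-Time sketch of the Feb 29 scenario described in this issue 
(illustrative only; not taken from the attached patches):

{code}
import org.joda.time.DateTime;
import org.joda.time.DateTimeZone;

public class LeapYearSketch {
  public static void main(String[] args) {
    DateTime feb29 = new DateTime(2016, 2, 29, 10, 30, 0, 0, DateTimeZone.UTC);
    int year = feb29.getYear();

    // plusYears(1) cannot produce 2017-02-29, so Joda-Time adjusts the day of
    // month to the last valid day: 2017-02-28T10:30:00.000Z
    System.out.println("plusYears(1):       " + feb29.plusYears(1));

    // withYear(year + 1) is the pattern this issue flags as problematic for
    // Feb 29 (see the description above).
    try {
      System.out.println("withYear(year + 1): " + feb29.withYear(year + 1));
    } catch (RuntimeException e) {
      System.out.println("withYear(year + 1) failed: " + e);
    }
  }
}
{code}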

> Fixes to  DateTime computations
> ---
>
> Key: FLUME-2889
> URL: https://issues.apache.org/jira/browse/FLUME-2889
> Project: Flume
>  Issue Type: Bug
>Affects Versions: v1.6.0
>Reporter: Roshan Naik
>Assignee: Roshan Naik
> Fix For: v1.7.0
>
> Attachments: FLUME-2889-2.patch, FLUME-2889.3.patch, FLUME-2889.patch
>
>
> date.withYear(year+1)  can lead to incorrect date calculations .. for example 
> if  the date is Feb 29th.   need to use date.plusYears(1) instead.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (FLUME-2889) Fixes to DateTime computations

2016-02-26 Thread Roshan Naik (JIRA)

[ 
https://issues.apache.org/jira/browse/FLUME-2889?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15170010#comment-15170010
 ] 

Roshan Naik commented on FLUME-2889:


[~hshreedharan] I am thinking we revert/override the previous commit with this 
patch?

> Fixes to  DateTime computations
> ---
>
> Key: FLUME-2889
> URL: https://issues.apache.org/jira/browse/FLUME-2889
> Project: Flume
>  Issue Type: Bug
>Affects Versions: v1.6.0
>Reporter: Roshan Naik
>Assignee: Roshan Naik
> Fix For: v1.7.0
>
> Attachments: FLUME-2889-2.patch, FLUME-2889.3.patch, FLUME-2889.patch
>
>
> date.withYear(year+1)  can lead to incorrect date calculations .. for example 
> if  the date is Feb 29th.   need to use date.plusYears(1) instead.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Comment Edited] (FLUME-2889) Fixes to DateTime computations

2016-02-26 Thread Roshan Naik (JIRA)

[ 
https://issues.apache.org/jira/browse/FLUME-2889?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15169995#comment-15169995
 ] 

Roshan Naik edited comment on FLUME-2889 at 2/26/16 10:43 PM:
--

Thanks [~tmgstev] for catching that.

I am revising the patch with the same fix applied to 
SyslogAvroEventSerializer.java.

Not sure why there is duplication of logic.
Since I'm short on time (Feb 26 already) I won't attempt to merge the two 
code pieces and upload another fix.

Can you please review this v3 patch?


was (Author: roshan_naik):
Thanks  [~tmgstev] for catching that.

I am revising the patch with the same fix applied to 
SyslogAvroEventSerializer.java.   

Not sure why there is duplication of logic. 
Since short on time (feb 26 already) i wont try to attempt to merge the two 
code pieces and upload another fix. 

Can you please review this v4 patch  ?

> Fixes to  DateTime computations
> ---
>
> Key: FLUME-2889
> URL: https://issues.apache.org/jira/browse/FLUME-2889
> Project: Flume
>  Issue Type: Bug
>Affects Versions: v1.6.0
>Reporter: Roshan Naik
>Assignee: Roshan Naik
> Fix For: v1.7.0
>
> Attachments: FLUME-2889-2.patch, FLUME-2889.3.patch, FLUME-2889.patch
>
>
> date.withYear(year+1)  can lead to incorrect date calculations .. for example 
> if  the date is Feb 29th.   need to use date.plusYears(1) instead.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (FLUME-2889) Fixes to DateTime computations

2016-02-26 Thread Roshan Naik (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLUME-2889?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Roshan Naik updated FLUME-2889:
---
Attachment: FLUME-2889.3.patch

Thanks [~tmgstev] for catching that.

I am revising the patch with the same fix applied to 
SyslogAvroEventSerializer.java.

Not sure why there is duplication of logic.
Since I'm short on time (Feb 26 already) I won't attempt to merge the two 
code pieces and upload another fix.

Can you please review this v4 patch?

> Fixes to  DateTime computations
> ---
>
> Key: FLUME-2889
> URL: https://issues.apache.org/jira/browse/FLUME-2889
> Project: Flume
>  Issue Type: Bug
>Affects Versions: v1.6.0
>Reporter: Roshan Naik
>Assignee: Roshan Naik
> Fix For: v1.7.0
>
> Attachments: FLUME-2889-2.patch, FLUME-2889.3.patch, FLUME-2889.patch
>
>
> date.withYear(year+1)  can lead to incorrect date calculations .. for example 
> if  the date is Feb 29th.   need to use date.plusYears(1) instead.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (FLUME-2889) Fixes to DateTime computations

2016-02-25 Thread Roshan Naik (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLUME-2889?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Roshan Naik updated FLUME-2889:
---
Fix Version/s: v1.7.0

> Fixes to  DateTime computations
> ---
>
> Key: FLUME-2889
> URL: https://issues.apache.org/jira/browse/FLUME-2889
> Project: Flume
>  Issue Type: Bug
>Affects Versions: v1.6.0
>Reporter: Roshan Naik
>Assignee: Roshan Naik
> Fix For: v1.7.0
>
> Attachments: FLUME-2889.patch
>
>
> date.withYear(year+1)  can lead to incorrect date calculations .. for example 
> if  the date is Feb 29th.   need to use date.plusYears(1) instead.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (FLUME-2889) Fixes to DateTime computations

2016-02-24 Thread Roshan Naik (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLUME-2889?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Roshan Naik updated FLUME-2889:
---
Affects Version/s: v1.6.0

> Fixes to  DateTime computations
> ---
>
> Key: FLUME-2889
> URL: https://issues.apache.org/jira/browse/FLUME-2889
> Project: Flume
>  Issue Type: Bug
>Affects Versions: v1.6.0
>Reporter: Roshan Naik
>Assignee: Roshan Naik
> Attachments: FLUME-2889.patch
>
>
> date.withYear(year+1)  can lead to incorrect date calculations .. for example 
> if  the date is Feb 29th.   need to use date.plusYears(1) instead.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (FLUME-2889) Fixes to DateTime computations

2016-02-24 Thread Roshan Naik (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLUME-2889?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Roshan Naik updated FLUME-2889:
---
Attachment: FLUME-2889.patch

Uploading patch... minor one-liner fixes in a few places.


> Fixes to  DateTime computations
> ---
>
> Key: FLUME-2889
> URL: https://issues.apache.org/jira/browse/FLUME-2889
> Project: Flume
>  Issue Type: Bug
>Reporter: Roshan Naik
>Assignee: Roshan Naik
> Attachments: FLUME-2889.patch
>
>
> date.withYear(year+1)  can lead to incorrect date calculations .. for example 
> if  the date is Feb 29th.   need to use date.plusYears(1) instead.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (FLUME-2889) Fixes to DateTime computations

2016-02-24 Thread Roshan Naik (JIRA)
Roshan Naik created FLUME-2889:
--

 Summary: Fixes to  DateTime computations
 Key: FLUME-2889
 URL: https://issues.apache.org/jira/browse/FLUME-2889
 Project: Flume
  Issue Type: Bug
Reporter: Roshan Naik
Assignee: Roshan Naik


date.withYear(year+1)  can lead to incorrect date calculations .. for example 
if  the date is Feb 29th.   need to use date.plusYears(1) instead.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (FLUME-2881) Windows Launch Script fails in plugins dir code

2016-02-17 Thread Roshan Naik (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLUME-2881?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Roshan Naik updated FLUME-2881:
---
Fix Version/s: v1.7.0

> Windows Launch Script fails in plugins dir code
> ---
>
> Key: FLUME-2881
> URL: https://issues.apache.org/jira/browse/FLUME-2881
> Project: Flume
>  Issue Type: Bug
>  Components: Configuration, Windows
>Affects Versions: v1.6.0
> Environment: Tested on Windows 7 and Windows 8
>Reporter: Jonathan Smith
>  Labels: easyfix, patch, windows
> Fix For: v1.7.0
>
> Attachments: fix_windows_launch.patch, op-addition-not-found.log
>
>
> Running flume-ng.cmd results in the attached error from the Windows command 
> line.
> The problem seems to originate in flume-ng.ps1, line 323 where the plugins 
> are added to the class path. Adding together directory information does not 
> seem to be supported on windows 7 or 8. I was able to fix the problem by 
> separating out the two plugin directories in the attached patch.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (FLUME-2881) Windows Launch Script fails in plugins dir code

2016-02-17 Thread Roshan Naik (JIRA)

[ 
https://issues.apache.org/jira/browse/FLUME-2881?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15151297#comment-15151297
 ] 

Roshan Naik commented on FLUME-2881:


Added some jars in plugins.d/1/lib/x.jar, plugins.d/1/lib/y.jar and 
plugins.d/1/libext/x.jar, but it didn't make a difference.

Anyway, I tested that your fixes to the script work on my setup .. and since 
others may also experience the same issue that you have noticed,...

I am +1 on this and will commit it shortly.

Thanks for the patch  [~jonathansmith] 

> Windows Launch Script fails in plugins dir code
> ---
>
> Key: FLUME-2881
> URL: https://issues.apache.org/jira/browse/FLUME-2881
> Project: Flume
>  Issue Type: Bug
>  Components: Configuration, Windows
>Affects Versions: v1.6.0
> Environment: Tested on Windows 7 and Windows 8
>Reporter: Jonathan Smith
>  Labels: easyfix, patch, windows
> Attachments: fix_windows_launch.patch, op-addition-not-found.log
>
>
> Running flume-ng.cmd results in the attached error from the Windows command 
> line.
> The problem seems to originate in flume-ng.ps1, line 323 where the plugins 
> are added to the class path. Adding together directory information does not 
> seem to be supported on windows 7 or 8. I was able to fix the problem by 
> separating out the two plugin directories in the attached patch.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (FLUME-2881) Windows Launch Script fails in plugins dir code

2016-02-17 Thread Roshan Naik (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLUME-2881?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Roshan Naik updated FLUME-2881:
---
Summary: Windows Launch Script fails in plugins dir code  (was: Windows 
Launch Script fails)

> Windows Launch Script fails in plugins dir code
> ---
>
> Key: FLUME-2881
> URL: https://issues.apache.org/jira/browse/FLUME-2881
> Project: Flume
>  Issue Type: Bug
>  Components: Configuration, Windows
>Affects Versions: v1.6.0
> Environment: Tested on Windows 7 and Windows 8
>Reporter: Jonathan Smith
>  Labels: easyfix, patch, windows
> Attachments: fix_windows_launch.patch, op-addition-not-found.log
>
>
> Running flume-ng.cmd results in the attached error from the Windows command 
> line.
> The problem seems to originate in flume-ng.ps1, line 323 where the plugins 
> are added to the class path. Adding together directory information does not 
> seem to be supported on windows 7 or 8. I was able to fix the problem by 
> separating out the two plugin directories in the attached patch.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (FLUME-2881) Windows Launch Script fails

2016-02-17 Thread Roshan Naik (JIRA)

[ 
https://issues.apache.org/jira/browse/FLUME-2881?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15151168#comment-15151168
 ] 

Roshan Naik commented on FLUME-2881:


Oh I see. I did add some plugin dirs but it didn't seem to make a difference. 
Let me try adding some jars there and check.

> Windows Launch Script fails
> ---
>
> Key: FLUME-2881
> URL: https://issues.apache.org/jira/browse/FLUME-2881
> Project: Flume
>  Issue Type: Bug
>  Components: Configuration, Windows
>Affects Versions: v1.6.0
> Environment: Tested on Windows 7 and Windows 8
>Reporter: Jonathan Smith
>  Labels: easyfix, patch, windows
> Attachments: fix_windows_launch.patch, op-addition-not-found.log
>
>
> Running flume-ng.cmd results in the attached error from the Windows command 
> line.
> The problem seems to originate in flume-ng.ps1, line 323 where the plugins 
> are added to the class path. Adding together directory information does not 
> seem to be supported on windows 7 or 8. I was able to fix the problem by 
> separating out the two plugin directories in the attached patch.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (FLUME-2881) Windows Launch Script fails

2016-02-16 Thread Roshan Naik (JIRA)

[ 
https://issues.apache.org/jira/browse/FLUME-2881?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15149627#comment-15149627
 ] 

Roshan Naik commented on FLUME-2881:


[~jonathansmith] I am able to run the script just fine without the need for 
this fix. What version of PowerShell are you using?
I used 4.0:

{code}
PS C:\Users\Administrator\Downloads\apache-flume-1.6.0-bin> $PSVersionTable

Name   Value
   -
PSVersion  4.0
WSManStackVersion  3.0
SerializationVersion   1.1.0.1
CLRVersion 4.0.30319.42000
BuildVersion   6.3.9600.17400
PSCompatibleVersions   {1.0, 2.0, 3.0, 4.0}
PSRemotingProtocolVersion  2.2
{code}

> Windows Launch Script fails
> ---
>
> Key: FLUME-2881
> URL: https://issues.apache.org/jira/browse/FLUME-2881
> Project: Flume
>  Issue Type: Bug
>  Components: Configuration, Windows
>Affects Versions: v1.6.0
> Environment: Tested on Windows 7 and Windows 8
>Reporter: Jonathan Smith
>  Labels: easyfix, patch, windows
> Attachments: fix_windows_launch.patch, op-addition-not-found.log
>
>
> Running flume-ng.cmd results in the attached error from the Windows command 
> line.
> The problem seems to originate in flume-ng.ps1, line 323 where the plugins 
> are added to the class path. Adding together directory information does not 
> seem to be supported on windows 7 or 8. I was able to fix the problem by 
> separating out the two plugin directories in the attached patch.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (FLUME-2881) Windows Launch Script fails

2016-02-16 Thread Roshan Naik (JIRA)

[ 
https://issues.apache.org/jira/browse/FLUME-2881?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15149367#comment-15149367
 ] 

Roshan Naik commented on FLUME-2881:


sure.. will take a look by tomorrow.

> Windows Launch Script fails
> ---
>
> Key: FLUME-2881
> URL: https://issues.apache.org/jira/browse/FLUME-2881
> Project: Flume
>  Issue Type: Bug
>  Components: Configuration, Windows
>Affects Versions: v1.6.0
> Environment: Tested on Windows 7 and Windows 8
>Reporter: Jonathan Smith
>  Labels: easyfix, patch, windows
> Attachments: fix_windows_launch.patch, op-addition-not-found.log
>
>
> Running flume-ng.cmd results in the attached error from the Windows command 
> line.
> The problem seems to originate in flume-ng.ps1, line 323 where the plugins 
> are added to the class path. Adding together directory information does not 
> seem to be supported on windows 7 or 8. I was able to fix the problem by 
> separating out the two plugin directories in the attached patch.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (FLUME-2799) Kafka Source - Message Offset and Partition add to headers

2016-01-15 Thread Roshan Naik (JIRA)

[ 
https://issues.apache.org/jira/browse/FLUME-2799?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15102905#comment-15102905
 ] 

Roshan Naik commented on FLUME-2799:


In flume-ng-doc/sphinx/FlumeUserGuide.rst, look for the Kafka source section.

> Kafka Source - Message Offset and Partition add to headers
> --
>
> Key: FLUME-2799
> URL: https://issues.apache.org/jira/browse/FLUME-2799
> Project: Flume
>  Issue Type: Improvement
>  Components: Sinks+Sources
>Affects Versions: v1.6.0
>Reporter: Michael Andre Pearce (IG)
>Priority: Minor
>  Labels: easyfix, patch
> Fix For: v1.7.0
>
> Attachments: FLUME-2799-0.patch
>
>   Original Estimate: 24h
>  Remaining Estimate: 24h
>
> Currently Kafka source only persists the original kafka message's topic into 
> the Flume event headers.
> For downstream interceptors and sinks that may want to have available to them 
> the partition and the offset , we need to add these.
> Also it is noted that the conversion from MessageAndMetaData to FlumeEvent is 
> not configurable unlike other sources such as JMS.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (FLUME-2865) Upgrade thrift version to 0.9.3

2016-01-15 Thread Roshan Naik (JIRA)
Roshan Naik created FLUME-2865:
--

 Summary: Upgrade thrift version to 0.9.3
 Key: FLUME-2865
 URL: https://issues.apache.org/jira/browse/FLUME-2865
 Project: Flume
  Issue Type: Bug
Reporter: Roshan Naik
Assignee: Roshan Naik


Hive is now moving to thrift v0.9.3 and  some older symbols are missing in this 
newer thrift version. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (FLUME-2704) Configurable poll delay for spooling directory source

2016-01-15 Thread Roshan Naik (JIRA)

[ 
https://issues.apache.org/jira/browse/FLUME-2704?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15102901#comment-15102901
 ] 

Roshan Naik commented on FLUME-2704:


[~jrufus] can you commit this if it looks good?

> Configurable poll delay for spooling directory source
> -
>
> Key: FLUME-2704
> URL: https://issues.apache.org/jira/browse/FLUME-2704
> Project: Flume
>  Issue Type: Improvement
>  Components: Sinks+Sources
>Affects Versions: v1.6.0, v1.5.2
>Reporter: Somin Mithraa
>Assignee: Somin Mithraa
>Priority: Minor
>  Labels: SpoolDir, pollDelay, sources
> Attachments: FLUME-2704.patch
>
>
> SpoolDir source polls a directory for new files at specific interval. This 
> interval(or poll delay) is currently hardcoded as 500ms.
> 500ms may be too fast for some applications. This JIRA is to make this 
> property configurable.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (FLUME-2799) Kafka Source - Message Offset and Partition add to headers

2016-01-14 Thread Roshan Naik (JIRA)

[ 
https://issues.apache.org/jira/browse/FLUME-2799?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15098913#comment-15098913
 ] 

Roshan Naik commented on FLUME-2799:


- I don't think it is wise to block this requirement on Kafka 0.9. This ability 
seems useful in its own right.
- Functionally, it does seem to overlap with the notion of interceptors even if 
that is not the intention. JMS converters deal more with the body and less with 
the headers.
- If each source implements its own converters, it is better to have a common 
reusable converter system shared by the other sources, which can then bring into 
question the need for interceptors.
- Although well motivated, it feels excessive to introduce converters in this 
ticket, which deals with merely adding a couple of headers.
- My thoughts:
   + Documentation needs an update.
   + Make the new headers optional (and disabled by default) so that existing 
users don't see any impact (see the sketch below).
   + If you are willing to simplify it to do this without converters, it would 
make this a simpler review and require less debate. Unless you have any other 
thoughts ?
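
To make the "optional headers" idea concrete, here is a minimal sketch (plain Java, not the actual patch) of how a source could guard the extra headers behind config flags that default to false; the flag and header names here are assumptions for illustration only:

{code}
import java.util.HashMap;
import java.util.Map;

public class OptionalKafkaHeadersSketch {

  // Hypothetical config flags; disabled by default so existing users see no change.
  private final boolean addPartitionHeader;
  private final boolean addOffsetHeader;

  public OptionalKafkaHeadersSketch(boolean addPartitionHeader, boolean addOffsetHeader) {
    this.addPartitionHeader = addPartitionHeader;
    this.addOffsetHeader = addOffsetHeader;
  }

  /** Builds the per-event headers, only adding partition/offset when enabled. */
  public Map<String, String> buildHeaders(String topic, int partition, long offset) {
    Map<String, String> headers = new HashMap<String, String>();
    headers.put("topic", topic);  // existing behavior is preserved
    if (addPartitionHeader) {
      headers.put("partition", Integer.toString(partition));
    }
    if (addOffsetHeader) {
      headers.put("offset", Long.toString(offset));
    }
    return headers;
  }
}
{code}

With both flags left at their default of false, an event carries only the topic header, so existing users see no change in memory footprint.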


> Kafka Source - Message Offset and Partition add to headers
> --
>
> Key: FLUME-2799
> URL: https://issues.apache.org/jira/browse/FLUME-2799
> Project: Flume
>  Issue Type: Improvement
>  Components: Sinks+Sources
>Affects Versions: v1.6.0
>Reporter: Michael Andre Pearce (IG)
>Priority: Minor
>  Labels: easyfix, patch
> Fix For: v1.7.0
>
> Attachments: FLUME-2799-0.patch
>
>   Original Estimate: 24h
>  Remaining Estimate: 24h
>
> Currently Kafka source only persists the original kafka message's topic into 
> the Flume event headers.
> For downstream interceptors and sinks that may want to have available to them 
> the partition and the offset , we need to add these.
> Also it is noted that the conversion from MessageAndMetaData to FlumeEvent is 
> not configurable unlike other sources such as JMS.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (FLUME-2799) Kafka Source - Message Offset and Partition add to headers

2016-01-13 Thread Roshan Naik (JIRA)

[ 
https://issues.apache.org/jira/browse/FLUME-2799?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15096971#comment-15096971
 ] 

Roshan Naik commented on FLUME-2799:


[~gwenshap] would you be able to review this?

> Kafka Source - Message Offset and Partition add to headers
> --
>
> Key: FLUME-2799
> URL: https://issues.apache.org/jira/browse/FLUME-2799
> Project: Flume
>  Issue Type: Improvement
>  Components: Sinks+Sources
>Affects Versions: v1.6.0
>Reporter: Michael Andre Pearce (IG)
>Priority: Minor
>  Labels: easyfix, patch
> Fix For: v1.7.0
>
> Attachments: FLUME-2799-0.patch
>
>   Original Estimate: 24h
>  Remaining Estimate: 24h
>
> Currently Kafka source only persists the original kafka message's topic into 
> the Flume event headers.
> For downstream interceptors and sinks that may want to have available to them 
> the partition and the offset , we need to add these.
> Also it is noted that the conversion from MessageAndMetaData to FlumeEvent is 
> not configurable unlike other sources such as JMS.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (FLUME-2799) Kafka Source - Message Offset and Partition add to headers

2016-01-13 Thread Roshan Naik (JIRA)

[ 
https://issues.apache.org/jira/browse/FLUME-2799?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15097032#comment-15097032
 ] 

Roshan Naik commented on FLUME-2799:


My comments:
- The documentation needs an update w.r.t. the Converter, the converter config, 
and custom converters.
- The default converter here applies 4 headers (2 old, plus 2 new ones). 
Adding headers to every event is expensive in terms of memory (and also some 
CPU due to added GC pressure). Which headers to apply should be user selectable, 
with the default settings preserving existing behavior.
- The need to introduce a Converter just for adding additional headers may be a 
bit of an overkill, but acceptable.

> Kafka Source - Message Offset and Partition add to headers
> --
>
> Key: FLUME-2799
> URL: https://issues.apache.org/jira/browse/FLUME-2799
> Project: Flume
>  Issue Type: Improvement
>  Components: Sinks+Sources
>Affects Versions: v1.6.0
>Reporter: Michael Andre Pearce (IG)
>Priority: Minor
>  Labels: easyfix, patch
> Fix For: v1.7.0
>
> Attachments: FLUME-2799-0.patch
>
>   Original Estimate: 24h
>  Remaining Estimate: 24h
>
> Currently Kafka source only persists the original kafka message's topic into 
> the Flume event headers.
> For downstream interceptors and sinks that may want to have available to them 
> the partition and the offset , we need to add these.
> Also it is noted that the conversion from MessageAndMetaData to FlumeEvent is 
> not configurable unlike other sources such as JMS.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (FLUME-2806) flume-ng.ps1 Error running script to start an agent on Windows

2015-12-29 Thread Roshan Naik (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLUME-2806?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Roshan Naik updated FLUME-2806:
---
Assignee: Liam Mousseau

> flume-ng.ps1 Error running script to start an agent on Windows
> --
>
> Key: FLUME-2806
> URL: https://issues.apache.org/jira/browse/FLUME-2806
> Project: Flume
>  Issue Type: Bug
>Affects Versions: v1.6.0
> Environment: Windows 8
>Reporter: Liam Mousseau
>Assignee: Liam Mousseau
> Attachments: flume-ng.ps1.txt
>
>   Original Estimate: 1h
>  Remaining Estimate: 1h
>
> Error:
> {noformat}
> C:\...\flume-ng.ps1 : The term 'ss' is not recognized as the name of a 
> cmdlet, function,
> script file, or operable program. Check the spelling of the name, or if a 
> path was included, verify that the path is
> correct and try again.
> At line:1 char:1
> + .\bin\flume-ng.ps1 agent -n Flume_Test_Agent -f 
> conf\flume-conf.properties.templ ...
> + 
> 
> + CategoryInfo  : ObjectNotFound: (ss:String) [flume-ng.ps1], 
> CommandNotFoundException
> + FullyQualifiedErrorId : CommandNotFoundException,flume-ng.ps1
> {noformat}
> Fix: Remove the 'ss' on line 169:
> {noformat}
> ...
> Function GetJavaPath {
> if ($env:JAVA_HOME) {
> return "$env:JAVA_HOME\bin\java.exe" }ss
> Write-Host "WARN: JAVA_HOME not set"
> return '"' + (Resolve-Path "java.exe").Path + '"'
> }
> ...
> {noformat}
> Work-around: Remove the ss on line 169 manually in the powershell script and 
> try again.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (FLUME-2806) flume-ng.ps1 Error running script to start an agent on Windows

2015-12-29 Thread Roshan Naik (JIRA)

[ 
https://issues.apache.org/jira/browse/FLUME-2806?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15074295#comment-15074295
 ] 

Roshan Naik commented on FLUME-2806:


Thanks [~lmousseau] for the contribution.

Committed.

> flume-ng.ps1 Error running script to start an agent on Windows
> --
>
> Key: FLUME-2806
> URL: https://issues.apache.org/jira/browse/FLUME-2806
> Project: Flume
>  Issue Type: Bug
>Affects Versions: v1.6.0
> Environment: Windows 8
>Reporter: Liam Mousseau
>Assignee: Liam Mousseau
> Fix For: v1.7.0
>
> Attachments: flume-ng.ps1.txt
>
>   Original Estimate: 1h
>  Remaining Estimate: 1h
>
> Error:
> {noformat}
> C:\...\flume-ng.ps1 : The term 'ss' is not recognized as the name of a 
> cmdlet, function,
> script file, or operable program. Check the spelling of the name, or if a 
> path was included, verify that the path is
> correct and try again.
> At line:1 char:1
> + .\bin\flume-ng.ps1 agent -n Flume_Test_Agent -f 
> conf\flume-conf.properties.templ ...
> + 
> 
> + CategoryInfo  : ObjectNotFound: (ss:String) [flume-ng.ps1], 
> CommandNotFoundException
> + FullyQualifiedErrorId : CommandNotFoundException,flume-ng.ps1
> {noformat}
> Fix: Remove the 'ss' on line 169:
> {noformat}
> ...
> Function GetJavaPath {
> if ($env:JAVA_HOME) {
> return "$env:JAVA_HOME\bin\java.exe" }ss
> Write-Host "WARN: JAVA_HOME not set"
> return '"' + (Resolve-Path "java.exe").Path + '"'
> }
> ...
> {noformat}
> Work-around: Remove the ss on line 169 manually in the powershell script and 
> try again.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Resolved] (FLUME-2806) flume-ng.ps1 Error running script to start an agent on Windows

2015-12-29 Thread Roshan Naik (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLUME-2806?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Roshan Naik resolved FLUME-2806.

   Resolution: Fixed
Fix Version/s: v1.7.0

> flume-ng.ps1 Error running script to start an agent on Windows
> --
>
> Key: FLUME-2806
> URL: https://issues.apache.org/jira/browse/FLUME-2806
> Project: Flume
>  Issue Type: Bug
>Affects Versions: v1.6.0
> Environment: Windows 8
>Reporter: Liam Mousseau
>Assignee: Liam Mousseau
> Fix For: v1.7.0
>
> Attachments: flume-ng.ps1.txt
>
>   Original Estimate: 1h
>  Remaining Estimate: 1h
>
> Error:
> {noformat}
> C:\...\flume-ng.ps1 : The term 'ss' is not recognized as the name of a 
> cmdlet, function,
> script file, or operable program. Check the spelling of the name, or if a 
> path was included, verify that the path is
> correct and try again.
> At line:1 char:1
> + .\bin\flume-ng.ps1 agent -n Flume_Test_Agent -f 
> conf\flume-conf.properties.templ ...
> + 
> 
> + CategoryInfo  : ObjectNotFound: (ss:String) [flume-ng.ps1], 
> CommandNotFoundException
> + FullyQualifiedErrorId : CommandNotFoundException,flume-ng.ps1
> {noformat}
> Fix: Remove the 'ss' on line 169:
> {noformat}
> ...
> Function GetJavaPath {
> if ($env:JAVA_HOME) {
> return "$env:JAVA_HOME\bin\java.exe" }ss
> Write-Host "WARN: JAVA_HOME not set"
> return '"' + (Resolve-Path "java.exe").Path + '"'
> }
> ...
> {noformat}
> Work-around: Remove the ss on line 169 manually in the powershell script and 
> try again.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (FLUME-2806) flume-ng.ps1 Error running script to start an agent on Windows

2015-12-29 Thread Roshan Naik (JIRA)

[ 
https://issues.apache.org/jira/browse/FLUME-2806?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15074287#comment-15074287
 ] 

Roshan Naik commented on FLUME-2806:


+1

> flume-ng.ps1 Error running script to start an agent on Windows
> --
>
> Key: FLUME-2806
> URL: https://issues.apache.org/jira/browse/FLUME-2806
> Project: Flume
>  Issue Type: Bug
>Affects Versions: v1.6.0
> Environment: Windows 8
>Reporter: Liam Mousseau
>Assignee: Liam Mousseau
> Attachments: flume-ng.ps1.txt
>
>   Original Estimate: 1h
>  Remaining Estimate: 1h
>
> Error:
> {noformat}
> C:\...\flume-ng.ps1 : The term 'ss' is not recognized as the name of a 
> cmdlet, function,
> script file, or operable program. Check the spelling of the name, or if a 
> path was included, verify that the path is
> correct and try again.
> At line:1 char:1
> + .\bin\flume-ng.ps1 agent -n Flume_Test_Agent -f 
> conf\flume-conf.properties.templ ...
> + 
> 
> + CategoryInfo  : ObjectNotFound: (ss:String) [flume-ng.ps1], 
> CommandNotFoundException
> + FullyQualifiedErrorId : CommandNotFoundException,flume-ng.ps1
> {noformat}
> Fix: Remove the 'ss' on line 169:
> {noformat}
> ...
> Function GetJavaPath {
> if ($env:JAVA_HOME) {
> return "$env:JAVA_HOME\bin\java.exe" }ss
> Write-Host "WARN: JAVA_HOME not set"
> return '"' + (Resolve-Path "java.exe").Path + '"'
> }
> ...
> {noformat}
> Work-around: Remove the ss on line 169 manually in the powershell script and 
> try again.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (FLUME-2801) Performance improvement on TailDir source

2015-12-17 Thread Roshan Naik (JIRA)

[ 
https://issues.apache.org/jira/browse/FLUME-2801?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15062513#comment-15062513
 ] 

Roshan Naik commented on FLUME-2801:


Thanks [~iijima_satoshi] for the review. I'm running tests and will commit soon.

> Performance improvement on TailDir source
> -
>
> Key: FLUME-2801
> URL: https://issues.apache.org/jira/browse/FLUME-2801
> Project: Flume
>  Issue Type: Improvement
>  Components: Sinks+Sources
>Affects Versions: v1.7.0
>Reporter: Jun Seok Hong
>Assignee: Jun Seok Hong
> Fix For: v1.7.0
>
> Attachments: FLUME-2801-1.patch, FLUME-2801-2.patch, FLUME-2801.patch
>
>
> This a proposal of performance improvement for new tailing source FLUME-2498.
> Taildir source reads a file by 1byte, so the performance is very low compared 
> to tailing on exec source.
> I tested lot's of ways to improve performance and implemented the best one.
> Changes.
> * Reading a file by a 8k block instead of 1 byte.
> * Use byte[] for handling data instead of 
> ByteArrayDataOutput/ByteBuffer(direct)/.. for better performance.
> * Don't convert byte[] to string and vice verse.
> Simple file reading test results.
> {quote}
>  File size: 100 MB, 
>  Line size: 500 byte
> Estimated time to read the file:
> |Reading 1byte(Using the code in Taildir)|32544 ms|
> |Reading 8K Block|431 ms|
> {quote}
> Testing on flume, it catches up the performance of tailing on exec source. 
> (30x performance boost)
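
As a rough illustration of the block-read approach described above (a sketch, not the actual Taildir patch; only the 8k buffer size comes from the description):

{code}
import java.io.IOException;
import java.io.RandomAccessFile;

public class BlockReadSketch {

  private static final int BUFFER_SIZE = 8192;  // read 8k blocks instead of one byte at a time

  /** Reads the whole file block by block and returns the total number of bytes read. */
  public static long readInBlocks(String path) throws IOException {
    byte[] buffer = new byte[BUFFER_SIZE];
    long total = 0;
    RandomAccessFile raf = new RandomAccessFile(path, "r");
    try {
      int read;
      while ((read = raf.read(buffer, 0, BUFFER_SIZE)) != -1) {
        // A real tailing source would scan buffer[0..read) for '\n' here and
        // emit events from the accumulated byte[], without converting to String.
        total += read;
      }
    } finally {
      raf.close();
    }
    return total;
  }
}
{code}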



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Comment Edited] (FLUME-2801) Performance improvement on TailDir source

2015-12-17 Thread Roshan Naik (JIRA)

[ 
https://issues.apache.org/jira/browse/FLUME-2801?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15062513#comment-15062513
 ] 

Roshan Naik edited comment on FLUME-2801 at 12/17/15 6:45 PM:
--

+1
Thanks [~iijima_satoshi] for the review. I'm running tests and will commit soon.


was (Author: roshan_naik):
Thanks [~iijima_satoshi] for the review.  Im running tests.. will commit soon.

> Performance improvement on TailDir source
> -
>
> Key: FLUME-2801
> URL: https://issues.apache.org/jira/browse/FLUME-2801
> Project: Flume
>  Issue Type: Improvement
>  Components: Sinks+Sources
>Affects Versions: v1.7.0
>Reporter: Jun Seok Hong
>Assignee: Jun Seok Hong
> Fix For: v1.7.0
>
> Attachments: FLUME-2801-1.patch, FLUME-2801-2.patch, FLUME-2801.patch
>
>
> This a proposal of performance improvement for new tailing source FLUME-2498.
> Taildir source reads a file by 1byte, so the performance is very low compared 
> to tailing on exec source.
> I tested lot's of ways to improve performance and implemented the best one.
> Changes.
> * Reading a file by a 8k block instead of 1 byte.
> * Use byte[] for handling data instead of 
> ByteArrayDataOutput/ByteBuffer(direct)/.. for better performance.
> * Don't convert byte[] to string and vice verse.
> Simple file reading test results.
> {quote}
>  File size: 100 MB, 
>  Line size: 500 byte
> Estimated time to read the file:
> |Reading 1byte(Using the code in Taildir)|32544 ms|
> |Reading 8K Block|431 ms|
> {quote}
> Testing on flume, it catches up the performance of tailing on exec source. 
> (30x performance boost)



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Issue Comment Deleted] (FLUME-2854) Parameterize jetty version in pom

2015-12-17 Thread Roshan Naik (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLUME-2854?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Roshan Naik updated FLUME-2854:
---
Comment: was deleted

(was: +1)

> Parameterize jetty version in pom
> -
>
> Key: FLUME-2854
> URL: https://issues.apache.org/jira/browse/FLUME-2854
> Project: Flume
>  Issue Type: Bug
>Reporter: Sriharsha Chintalapani
>Assignee: Sriharsha Chintalapani
>




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (FLUME-2854) Parameterize jetty version in pom

2015-12-17 Thread Roshan Naik (JIRA)

[ 
https://issues.apache.org/jira/browse/FLUME-2854?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15062566#comment-15062566
 ] 

Roshan Naik commented on FLUME-2854:


+1

> Parameterize jetty version in pom
> -
>
> Key: FLUME-2854
> URL: https://issues.apache.org/jira/browse/FLUME-2854
> Project: Flume
>  Issue Type: Bug
>Reporter: Sriharsha Chintalapani
>Assignee: Sriharsha Chintalapani
>




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (FLUME-2451) HDFS Sink Cannot Reconnect After NameNode Restart

2015-12-09 Thread Roshan Naik (JIRA)

[ 
https://issues.apache.org/jira/browse/FLUME-2451?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15049341#comment-15049341
 ] 

Roshan Naik commented on FLUME-2451:


Patch v2 from here went into the Flume included in HDP. If you are able to 
build the Apache version of Flume with this patch v3, that would be helpful.

> HDFS Sink Cannot Reconnect After NameNode Restart
> -
>
> Key: FLUME-2451
> URL: https://issues.apache.org/jira/browse/FLUME-2451
> Project: Flume
>  Issue Type: Bug
>  Components: File Channel, Sinks+Sources
>Affects Versions: v1.4.0
> Environment: 8 node CDH 4.2.2 (2.0.0-cdh4.2.2) cluster
> All cluster machines are running Ubuntu 12.04 x86_64
>Reporter: Andrew O'Neill
>Assignee: Roshan Naik
>  Labels: HDFS, Sink
> Attachments: FLUME-2451.patch, FLUME-2451.v2.patch, 
> FLUME-2451.v3.patch
>
>
> I am testing a simple flume setup with a Sequence Generator Source, a File 
> Channel, and an HDFS Sink (see my flume.conf below). This configuration works 
> as expected until I reboot the cluster’s NameNode or until I restart the HDFS 
> service on the cluster. At this point, it appears that the Flume Agent cannot 
> reconnect to HDFS and must be manually restarted.
> Here is our flume.conf:
> appserver.sources = rawtext
> appserver.channels = testchannel
> appserver.sinks = test_sink
> appserver.sources.rawtext.type = seq
> appserver.sources.rawtext.channels = testchannel
> appserver.channels.testchannel.type = file
> appserver.channels.testchannel.capacity = 1000
> appserver.channels.testchannel.minimumRequiredSpace = 214748364800
> appserver.channels.testchannel.checkpointDir = 
> /Users/aoneill/Desktop/testchannel/checkpoint
> appserver.channels.testchannel.dataDirs = 
> /Users/aoneill/Desktop/testchannel/data
> appserver.channels.testchannel.maxFileSize = 2000
> appserver.sinks.test_sink.type = hdfs
> appserver.sinks.test_sink.channel = testchannel
> appserver.sinks.test_sink.hdfs.path = 
> hdfs://cluster01:8020/user/aoneill/flumetest
> appserver.sinks.test_sink.hdfs.closeTries = 3
> appserver.sinks.test_sink.hdfs.filePrefix = events-
> appserver.sinks.test_sink.hdfs.fileSuffix = .avro
> appserver.sinks.test_sink.hdfs.fileType = DataStream
> appserver.sinks.test_sink.hdfs.writeFormat = Text
> appserver.sinks.test_sink.hdfs.inUsePrefix = inuse-
> appserver.sinks.test_sink.hdfs.inUseSuffix = .avro
> appserver.sinks.test_sink.hdfs.rollCount = 10
> appserver.sinks.test_sink.hdfs.rollInterval = 30
> appserver.sinks.test_sink.hdfs.rollSize = 10485760
> These are the two error message that the Flume Agent outputs constantly after 
> the restart:
> 2014-08-26 10:47:24,572 (SinkRunner-PollingRunner-DefaultSinkProcessor) 
> [ERROR - 
> org.apache.flume.sink.hdfs.AbstractHDFSWriter.isUnderReplicated(AbstractHDFSWriter.java:96)]
>  Unexpected error while checking replication factor
> java.lang.reflect.InvocationTargetException
> at sun.reflect.GeneratedMethodAccessor7.invoke(Unknown Source)
> at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> at java.lang.reflect.Method.invoke(Method.java:606)
> at 
> org.apache.flume.sink.hdfs.AbstractHDFSWriter.getNumCurrentReplicas(AbstractHDFSWriter.java:162)
> at 
> org.apache.flume.sink.hdfs.AbstractHDFSWriter.isUnderReplicated(AbstractHDFSWriter.java:82)
> at 
> org.apache.flume.sink.hdfs.BucketWriter.shouldRotate(BucketWriter.java:452)
> at 
> org.apache.flume.sink.hdfs.BucketWriter.append(BucketWriter.java:387)
> at 
> org.apache.flume.sink.hdfs.HDFSEventSink.process(HDFSEventSink.java:392)
> at 
> org.apache.flume.sink.DefaultSinkProcessor.process(DefaultSinkProcessor.java:68)
> at org.apache.flume.SinkRunner$PollingRunner.run(SinkRunner.java:147)
> at java.lang.Thread.run(Thread.java:744)
> Caused by: java.net.ConnectException: Connection refused
> at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
> at 
> sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:735)
> at 
> org.apache.hadoop.net.SocketIOWithTimeout.connect(SocketIOWithTimeout.java:207)
> at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:525)
> at 
> org.apache.hadoop.hdfs.DFSOutputStream.createSocketForPipeline(DFSOutputStream.java:1253)
> at 
> org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.transfer(DFSOutputStream.java:891)
> at 
> org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.addDatanode2ExistingPipeline(DFSOutputStream.java:881)
> at 
> 

[jira] [Commented] (FLUME-2716) File Channel cannot handle capacity Integer.MAX_VALUE

2015-12-07 Thread Roshan Naik (JIRA)

[ 
https://issues.apache.org/jira/browse/FLUME-2716?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15045807#comment-15045807
 ] 

Roshan Naik commented on FLUME-2716:


I assume the new limit for the 'capacity' setting after this patch is 
Integer.MAX_VALUE.
Is there clarity on what the limit for the 'capacity' setting on the file 
channel was before this patch?

> File Channel cannot handle capacity Integer.MAX_VALUE
> -
>
> Key: FLUME-2716
> URL: https://issues.apache.org/jira/browse/FLUME-2716
> Project: Flume
>  Issue Type: Bug
>  Components: Channel, File Channel
>Affects Versions: v1.6.0, v1.7.0
>Reporter: Dong Zhao
> Fix For: v1.7.0
>
> Attachments: FLUME-2716.patch
>
>
> if capacity is set to Integer.MAX_VALUE(2147483647), checkpoint file size is 
> calculated wrongly to 8224. The calculation should first cast int to long, 
> then calculate the totalBytes. See the patch for details. Thanks.
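
To illustrate the int-to-long cast issue described above (a standalone sketch, not the actual FileChannel code; the per-slot byte size used here is hypothetical):

{code}
public class CheckpointSizeSketch {

  // Hypothetical bytes-per-slot value, purely to show the overflow.
  private static final long SLOT_SIZE = 8L;

  public static void main(String[] args) {
    int capacity = Integer.MAX_VALUE;

    // Broken: the 32-bit multiply wraps around before the widening to long.
    long wrong = (long) (capacity * 8);       // -8

    // Fixed: cast to long first, then multiply.
    long right = (long) capacity * SLOT_SIZE; // 17179869176

    System.out.println("wrong = " + wrong + ", right = " + right);
  }
}
{code}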



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (FLUME-2792) Flume Kafka Kerberos Support

2015-11-04 Thread Roshan Naik (JIRA)

[ 
https://issues.apache.org/jira/browse/FLUME-2792?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14990763#comment-14990763
 ] 

Roshan Naik commented on FLUME-2792:


Updates - 
 - PLAINTEXTSASL is being renamed to SASL_PLAINTEXT in Apache Kafka.
 - Kafka has decided to support security only for the new Producer APIs. Apache 
Flume uses the old API today, so until the Flume code is updated for the new APIs, 
this won't work.
 - The Kafka that is shipped as part of HDP had secure support for the old API 
too.

> Flume Kafka Kerberos Support
> 
>
> Key: FLUME-2792
> URL: https://issues.apache.org/jira/browse/FLUME-2792
> Project: Flume
>  Issue Type: Bug
>  Components: Configuration, Docs, Sinks+Sources
>Affects Versions: v1.6.0, v1.5.2
> Environment: HDP 2.3 fully kerberized including Kafka 0.8.2.2 + Flume 
> 1.5.2 or Apache Flume 1.6 downloaded from apache.org
>Reporter: Hari Sekhon
>Priority: Blocker
>
> Following on from FLUME-2790 it appears as though Flume doesn't yet have 
> support for Kafka + Kerberos as there are is no setting documented in the 
> Flume 1.6.0 user guide under the Kafka source section to tell Flume to use 
> plaintextsasl as the connection mechanism to Kafka and Kafka rejects 
> unauthenticated plaintext mechanism:
> {code}15/09/10 16:51:22 INFO consumer.ConsumerFetcherManager: 
> [ConsumerFetcherManager-1441903874830] Added fetcher for partitions 
> ArrayBuffer()
> 15/09/10 16:51:22 WARN consumer.ConsumerFetcherManager$LeaderFinderThread: 
> [flume_-1441903874763-abdc98ec-leader-finder-thread], Failed 
> to find leader for Set([,0], [,1])
> kafka.common.BrokerEndPointNotAvailableException: End point PLAINTEXT not 
> found for broker 0
> at kafka.cluster.Broker.getBrokerEndPoint(Broker.scala:140)
> at 
> kafka.utils.ZkUtils$$anonfun$getAllBrokerEndPointsForChannel$1.apply(ZkUtils.scala:124)
> at 
> kafka.utils.ZkUtils$$anonfun$getAllBrokerEndPointsForChannel$1.apply(ZkUtils.scala:124)
> at 
> scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:244)
> at 
> scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:244)
> at 
> scala.collection.mutable.ResizableArray$class.foreach(ResizableArray.scala:59)
> at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:47)
> at 
> scala.collection.TraversableLike$class.map(TraversableLike.scala:244)
> at scala.collection.AbstractTraversable.map(Traversable.scala:105)
> at 
> kafka.utils.ZkUtils$.getAllBrokerEndPointsForChannel(ZkUtils.scala:124)
> at 
> kafka.consumer.ConsumerFetcherManager$LeaderFinderThread.doWork(ConsumerFetcherManager.scala:66)
> at 
> kafka.utils.ShutdownableThread.run(ShutdownableThread.scala:60){code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Comment Edited] (FLUME-2792) Flume Kafka Kerberos Support

2015-11-04 Thread Roshan Naik (JIRA)

[ 
https://issues.apache.org/jira/browse/FLUME-2792?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14990763#comment-14990763
 ] 

Roshan Naik edited comment on FLUME-2792 at 11/5/15 12:09 AM:
--

Updates - 
 - PLAINTEXTSASL is being renamed to SASL_PLAINTEXT in Apache Kafka.
 - Kafka has decided to support security only for the new Producer APIs. Apache 
Flume uses the old API today, so until the Flume code is updated for the new APIs, 
this won't work.
 - The Kafka that is shipped as part of HDP had secure support for the old API 
too.


was (Author: roshan_naik):
Updates - 
 - PLAINTESTSAL  is being renamed to  SASL_PLAINTEXT in Apache Kafka.
 - Kafka has decided to support security only for the new Producer APIs. Apache 
Flume uses the old API today. So until Flume code is updated for the new APIs, 
this wont work.
- The Kafka that is shipped as part of HDP had secure support for the old API 
too. 

> Flume Kafka Kerberos Support
> 
>
> Key: FLUME-2792
> URL: https://issues.apache.org/jira/browse/FLUME-2792
> Project: Flume
>  Issue Type: Bug
>  Components: Configuration, Docs, Sinks+Sources
>Affects Versions: v1.6.0, v1.5.2
> Environment: HDP 2.3 fully kerberized including Kafka 0.8.2.2 + Flume 
> 1.5.2 or Apache Flume 1.6 downloaded from apache.org
>Reporter: Hari Sekhon
>Priority: Blocker
>
> Following on from FLUME-2790 it appears as though Flume doesn't yet have 
> support for Kafka + Kerberos as there are is no setting documented in the 
> Flume 1.6.0 user guide under the Kafka source section to tell Flume to use 
> plaintextsasl as the connection mechanism to Kafka and Kafka rejects 
> unauthenticated plaintext mechanism:
> {code}15/09/10 16:51:22 INFO consumer.ConsumerFetcherManager: 
> [ConsumerFetcherManager-1441903874830] Added fetcher for partitions 
> ArrayBuffer()
> 15/09/10 16:51:22 WARN consumer.ConsumerFetcherManager$LeaderFinderThread: 
> [flume_-1441903874763-abdc98ec-leader-finder-thread], Failed 
> to find leader for Set([,0], [,1])
> kafka.common.BrokerEndPointNotAvailableException: End point PLAINTEXT not 
> found for broker 0
> at kafka.cluster.Broker.getBrokerEndPoint(Broker.scala:140)
> at 
> kafka.utils.ZkUtils$$anonfun$getAllBrokerEndPointsForChannel$1.apply(ZkUtils.scala:124)
> at 
> kafka.utils.ZkUtils$$anonfun$getAllBrokerEndPointsForChannel$1.apply(ZkUtils.scala:124)
> at 
> scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:244)
> at 
> scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:244)
> at 
> scala.collection.mutable.ResizableArray$class.foreach(ResizableArray.scala:59)
> at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:47)
> at 
> scala.collection.TraversableLike$class.map(TraversableLike.scala:244)
> at scala.collection.AbstractTraversable.map(Traversable.scala:105)
> at 
> kafka.utils.ZkUtils$.getAllBrokerEndPointsForChannel(ZkUtils.scala:124)
> at 
> kafka.consumer.ConsumerFetcherManager$LeaderFinderThread.doWork(ConsumerFetcherManager.scala:66)
> at 
> kafka.utils.ShutdownableThread.run(ShutdownableThread.scala:60){code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (FLUME-2835) Hive Sink tests need to create table with transactional property set

2015-11-03 Thread Roshan Naik (JIRA)

[ 
https://issues.apache.org/jira/browse/FLUME-2835?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14988108#comment-14988108
 ] 

Roshan Naik commented on FLUME-2835:


+1

>  Hive Sink tests need to create table with transactional property set
> -
>
> Key: FLUME-2835
> URL: https://issues.apache.org/jira/browse/FLUME-2835
> Project: Flume
>  Issue Type: Bug
>Reporter: Sriharsha Chintalapani
>Assignee: Sriharsha Chintalapani
> Attachments: FLUME-hive-test.patch
>
>
> As per Hive streaming wiki the  transactional=true property needs to be set 
> on the table for streaming. 
> https://cwiki.apache.org/confluence/display/Hive/Streaming+Data+Ingest



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Comment Edited] (FLUME-2819) Kafka libs are being bundled into Flume distro

2015-10-23 Thread Roshan Naik (JIRA)

[ 
https://issues.apache.org/jira/browse/FLUME-2819?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14972022#comment-14972022
 ] 

Roshan Naik edited comment on FLUME-2819 at 10/23/15 10:49 PM:
---

- Flume depends on several external services (Hadoop and non-Hadoop), and that 
list keeps growing with each release. So it is not feasible to keep updating and 
releasing Flume frequently (i.e. each time one of them releases a new version) 
just so Flume users can leverage the new features being released in them.
- Releasing frequently with updated versions creates a different problem: it ties 
each Flume version to a specific dependency version, and in the end the user may 
not want to use that version of the service.
- The release-frequently model also forces Flume to release each time there is a 
security fix release in one of them.
- If users have to wait for new Flume releases, that then leads to long waits 
for users... or the need to perform surgery on the flume/lib directory. Neither 
is reasonable.

Consequently, IMO, using the "provided" model for such deps and giving the 
users a mechanism to customize the Flume classpath easily is, in the end, the 
better choice. The slight out-of-the-box (OOB) experience penalty is a smaller 
problem when considering all the trade-offs.


was (Author: roshan_naik):
- Flume depends on several external services (hadoop and non hadoop)  and that 
list keeps growing with each release. So not feasible to keep updating and 
releasing Flume  frequently (i.e each time one of them release a new version) 
to enable Flume users leverage new features being release in them.  
- Releasing frequently with updated version creates a diff problem .. it ties 
each Flume version to a specific dep version. And in the end user may not want 
to use that version of the service.
- Release frequently model also forces Flume to release each time there is a 
security fix releases in them
- If users have to wait for new Flume releases, that then leads to long waits 
for users... or the need to perform surgery on the flume/lib directory. Neither 
is reasonable.


Consequently, IMO, using the "provided" model for such deps and giving the 
users a mechanism to customize the Flume Classpath easily, is in the end a 
better choice.

> Kafka libs are being bundled into Flume distro
> --
>
> Key: FLUME-2819
> URL: https://issues.apache.org/jira/browse/FLUME-2819
> Project: Flume
>  Issue Type: Bug
>Reporter: Roshan Naik
>
> Kafka dependency libs need to be marked as 'provided' in the pom.xml 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (FLUME-2819) Kafka libs are being bundled into Flume distro

2015-10-23 Thread Roshan Naik (JIRA)

[ 
https://issues.apache.org/jira/browse/FLUME-2819?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14972217#comment-14972217
 ] 

Roshan Naik commented on FLUME-2819:


No doubt this is a common issue. For example, Hive does not ship with its Hadoop deps.

Basically, we end up relying on the backward binary compatibility provided by those deps.

Unfortunately, due to the complexity it gets tested with only one version of each 
dependency. It helps a bit that each Hadoop vendor ends up testing it with 
different versions depending on what they ship.

Your suggestion, or a variation of it, seems worth looking into. There might 
be some tricky things that can happen when including multiple versions of 
multiple jars, especially if some of those are fat/bulky jars. Perhaps put all 
such deps under a separate lib2 folder.

Many components in Flume will need that change if we go with that... in 
addition, the startup script needs changes to support each of those. Best 
tracked in a separate jira.

But like you mentioned, this is not a flume-specific situation.

> Kafka libs are being bundled into Flume distro
> --
>
> Key: FLUME-2819
> URL: https://issues.apache.org/jira/browse/FLUME-2819
> Project: Flume
>  Issue Type: Bug
>Reporter: Roshan Naik
>
> Kafka dependency libs need to be marked as 'provided' in the pom.xml 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Comment Edited] (FLUME-2819) Kafka libs are being bundled into Flume distro

2015-10-23 Thread Roshan Naik (JIRA)

[ 
https://issues.apache.org/jira/browse/FLUME-2819?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14972217#comment-14972217
 ] 

Roshan Naik edited comment on FLUME-2819 at 10/24/15 12:51 AM:
---

No doubt this is a common issue. For example, Hive does not ship with its Hadoop deps.

Basically, we end up relying on the backward binary compatibility provided by those deps.

Unfortunately, due to the complexity it gets tested with only one version of each 
dependency. It helps a bit that each Hadoop vendor ends up testing it with 
different versions depending on what they ship.

Your suggestion, or a variation of it, seems worth looking into. There might 
be some tricky things that can happen when including multiple versions of 
multiple jars, especially if some of those are fat/bulky jars. Perhaps put all 
such deps under a separate lib2 folder.

Many components in Flume will need that change if we go with that... in 
addition, the startup script needs changes to support each of those. Best 
tracked in a separate jira.

Also, this is not a flume-specific situation.


was (Author: roshan_naik):
No doubt a common issue. For e.g Hive does not ship with Hadoop deps.

Basically end up relying on the backward binary compat provided by those deps. 

Unfortunately due to the complexity it gets tested with only 1 version of each 
dependency.  Helps a bit that each Hadoop vendor ends up testing it with a 
different versions depending on what they ship.

Your suggestion or a variation of it seems like worth looking into. There might 
be some tricky things that can happen when including multiple version of 
multiple jars. Esp if some of those are fat/bulky jars. Perhaps put all such 
deps under a separate lib2 folder.

   Many components in Flume will need that change if we go with that...in 
addition to startup script needs changes to support for each of those. best 
tracked in a separate jira. 

But like you mentioned, this is not flume specific situation.

> Kafka libs are being bundled into Flume distro
> --
>
> Key: FLUME-2819
> URL: https://issues.apache.org/jira/browse/FLUME-2819
> Project: Flume
>  Issue Type: Bug
>Reporter: Roshan Naik
>
> Kafka dependency libs need to be marked as 'provided' in the pom.xml 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Comment Edited] (FLUME-2819) Kafka libs are being bundled into Flume distro

2015-10-23 Thread Roshan Naik (JIRA)

[ 
https://issues.apache.org/jira/browse/FLUME-2819?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14972217#comment-14972217
 ] 

Roshan Naik edited comment on FLUME-2819 at 10/24/15 12:52 AM:
---

Flume generally follows this pattern for its deps (Hadoop, Hive, HBase).

And it is no doubt a common issue in other projects too. For example, Hive does 
not ship with its Hadoop deps.

Basically, we end up relying on the backward binary compatibility provided by those deps.

Unfortunately, due to the complexity it gets tested with only one version of each 
dependency. It helps a bit that each Hadoop vendor ends up testing it with 
different versions depending on what they ship.

Your suggestion, or a variation of it, seems worth looking into. There might 
be some tricky things that can happen when including multiple versions of 
multiple jars, especially if some of those are fat/bulky jars. Perhaps put all 
such deps under a separate lib2 folder.

Many components in Flume will need that change if we go with that... in 
addition, the startup script needs changes to support each of those. Best 
tracked in a separate jira.

Also, this is not a flume-specific situation.


was (Author: roshan_naik):
No doubt a common issue. For e.g Hive does not ship with Hadoop deps.

Basically end up relying on the backward binary compat provided by those deps. 

Unfortunately due to the complexity it gets tested with only 1 version of each 
dependency.  Helps a bit that each Hadoop vendor ends up testing it with a 
different versions depending on what they ship.

Your suggestion or a variation of it seems like worth looking into. There might 
be some tricky things that can happen when including multiple version of 
multiple jars. Esp if some of those are fat/bulky jars. Perhaps put all such 
deps under a separate lib2 folder.

   Many components in Flume will need that change if we go with that...in 
addition to startup script needs changes to support for each of those. best 
tracked in a separate jira. 

Also this is not flume specific situation.

> Kafka libs are being bundled into Flume distro
> --
>
> Key: FLUME-2819
> URL: https://issues.apache.org/jira/browse/FLUME-2819
> Project: Flume
>  Issue Type: Bug
>Reporter: Roshan Naik
>
> Kafka dependency libs need to be marked as 'provided' in the pom.xml 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (FLUME-2819) Kafka libs are being bundled into Flume distro

2015-10-23 Thread Roshan Naik (JIRA)

[ 
https://issues.apache.org/jira/browse/FLUME-2819?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14972022#comment-14972022
 ] 

Roshan Naik commented on FLUME-2819:


- Flume depends on several external services (Hadoop and non-Hadoop), and that 
list keeps growing with each release. So it is not feasible to keep updating and 
releasing Flume frequently (i.e. each time one of them releases a new version) 
just so Flume users can leverage the new features being released in them.
- Releasing frequently with updated versions creates a different problem: it ties 
each Flume version to a specific dependency version, and in the end the user may 
not want to use that version of the service.
- The release-frequently model also forces Flume to release each time there is a 
security fix release in one of them.
- If users have to wait for new Flume releases, that then leads to long waits 
for users... or the need to perform surgery on the flume/lib directory. Neither 
is reasonable.

Consequently, IMO, using the "provided" model for such deps and giving the 
users a mechanism to customize the Flume classpath easily is, in the end, the 
better choice.

> Kafka libs are being bundled into Flume distro
> --
>
> Key: FLUME-2819
> URL: https://issues.apache.org/jira/browse/FLUME-2819
> Project: Flume
>  Issue Type: Bug
>Reporter: Roshan Naik
>
> Kafka dependency libs need to be marked as 'provided' in the pom.xml 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (FLUME-2819) Kafka libs are being bundled into Flume distro

2015-10-21 Thread Roshan Naik (JIRA)
Roshan Naik created FLUME-2819:
--

 Summary: Kafka libs are being bundled into Flume distro
 Key: FLUME-2819
 URL: https://issues.apache.org/jira/browse/FLUME-2819
 Project: Flume
  Issue Type: Bug
Reporter: Roshan Naik


Kafka dependency libs need to be marked as 'provided' in the pom.xml 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Comment Edited] (FLUME-2792) Flume Kafka Kerberos Support

2015-10-21 Thread Roshan Naik (JIRA)

[ 
https://issues.apache.org/jira/browse/FLUME-2792?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14967495#comment-14967495
 ] 

Roshan Naik edited comment on FLUME-2792 at 10/21/15 5:23 PM:
--

Opened FLUME-2819 to track the Kafka libs issue noted above.


was (Author: roshan_naik):
opened FLUME-2819

> Flume Kafka Kerberos Support
> 
>
> Key: FLUME-2792
> URL: https://issues.apache.org/jira/browse/FLUME-2792
> Project: Flume
>  Issue Type: Bug
>  Components: Configuration, Docs, Sinks+Sources
>Affects Versions: v1.6.0, v1.5.2
> Environment: HDP 2.3 fully kerberized including Kafka 0.8.2.2 + Flume 
> 1.5.2 or Apache Flume 1.6 downloaded from apache.org
>Reporter: Hari Sekhon
>Priority: Blocker
>
> Following on from FLUME-2790 it appears as though Flume doesn't yet have 
> support for Kafka + Kerberos as there are is no setting documented in the 
> Flume 1.6.0 user guide under the Kafka source section to tell Flume to use 
> plaintextsasl as the connection mechanism to Kafka and Kafka rejects 
> unauthenticated plaintext mechanism:
> {code}15/09/10 16:51:22 INFO consumer.ConsumerFetcherManager: 
> [ConsumerFetcherManager-1441903874830] Added fetcher for partitions 
> ArrayBuffer()
> 15/09/10 16:51:22 WARN consumer.ConsumerFetcherManager$LeaderFinderThread: 
> [flume_-1441903874763-abdc98ec-leader-finder-thread], Failed 
> to find leader for Set([,0], [,1])
> kafka.common.BrokerEndPointNotAvailableException: End point PLAINTEXT not 
> found for broker 0
> at kafka.cluster.Broker.getBrokerEndPoint(Broker.scala:140)
> at 
> kafka.utils.ZkUtils$$anonfun$getAllBrokerEndPointsForChannel$1.apply(ZkUtils.scala:124)
> at 
> kafka.utils.ZkUtils$$anonfun$getAllBrokerEndPointsForChannel$1.apply(ZkUtils.scala:124)
> at 
> scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:244)
> at 
> scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:244)
> at 
> scala.collection.mutable.ResizableArray$class.foreach(ResizableArray.scala:59)
> at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:47)
> at 
> scala.collection.TraversableLike$class.map(TraversableLike.scala:244)
> at scala.collection.AbstractTraversable.map(Traversable.scala:105)
> at 
> kafka.utils.ZkUtils$.getAllBrokerEndPointsForChannel(ZkUtils.scala:124)
> at 
> kafka.consumer.ConsumerFetcherManager$LeaderFinderThread.doWork(ConsumerFetcherManager.scala:66)
> at 
> kafka.utils.ShutdownableThread.run(ShutdownableThread.scala:60){code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (FLUME-2792) Flume Kafka Kerberos Support

2015-10-21 Thread Roshan Naik (JIRA)

[ 
https://issues.apache.org/jira/browse/FLUME-2792?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14967495#comment-14967495
 ] 

Roshan Naik commented on FLUME-2792:


opened FLUME-2819

> Flume Kafka Kerberos Support
> 
>
> Key: FLUME-2792
> URL: https://issues.apache.org/jira/browse/FLUME-2792
> Project: Flume
>  Issue Type: Bug
>  Components: Configuration, Docs, Sinks+Sources
>Affects Versions: v1.6.0, v1.5.2
> Environment: HDP 2.3 fully kerberized including Kafka 0.8.2.2 + Flume 
> 1.5.2 or Apache Flume 1.6 downloaded from apache.org
>Reporter: Hari Sekhon
>Priority: Blocker
>
> Following on from FLUME-2790 it appears as though Flume doesn't yet have 
> support for Kafka + Kerberos as there are is no setting documented in the 
> Flume 1.6.0 user guide under the Kafka source section to tell Flume to use 
> plaintextsasl as the connection mechanism to Kafka and Kafka rejects 
> unauthenticated plaintext mechanism:
> {code}15/09/10 16:51:22 INFO consumer.ConsumerFetcherManager: 
> [ConsumerFetcherManager-1441903874830] Added fetcher for partitions 
> ArrayBuffer()
> 15/09/10 16:51:22 WARN consumer.ConsumerFetcherManager$LeaderFinderThread: 
> [flume_-1441903874763-abdc98ec-leader-finder-thread], Failed 
> to find leader for Set([,0], [,1])
> kafka.common.BrokerEndPointNotAvailableException: End point PLAINTEXT not 
> found for broker 0
> at kafka.cluster.Broker.getBrokerEndPoint(Broker.scala:140)
> at 
> kafka.utils.ZkUtils$$anonfun$getAllBrokerEndPointsForChannel$1.apply(ZkUtils.scala:124)
> at 
> kafka.utils.ZkUtils$$anonfun$getAllBrokerEndPointsForChannel$1.apply(ZkUtils.scala:124)
> at 
> scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:244)
> at 
> scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:244)
> at 
> scala.collection.mutable.ResizableArray$class.foreach(ResizableArray.scala:59)
> at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:47)
> at 
> scala.collection.TraversableLike$class.map(TraversableLike.scala:244)
> at scala.collection.AbstractTraversable.map(Traversable.scala:105)
> at 
> kafka.utils.ZkUtils$.getAllBrokerEndPointsForChannel(ZkUtils.scala:124)
> at 
> kafka.consumer.ConsumerFetcherManager$LeaderFinderThread.doWork(ConsumerFetcherManager.scala:66)
> at 
> kafka.utils.ShutdownableThread.run(ShutdownableThread.scala:60){code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (FLUME-2804) Hive sink - abort remaining transactions on shutdown

2015-10-02 Thread Roshan Naik (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLUME-2804?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Roshan Naik updated FLUME-2804:
---
Assignee: Sriharsha Chintalapani

> Hive sink - abort remaining transactions on shutdown
> 
>
> Key: FLUME-2804
> URL: https://issues.apache.org/jira/browse/FLUME-2804
> Project: Flume
>  Issue Type: Bug
>  Components: Sinks+Sources
>Affects Versions: v1.6.0
>Reporter: Sriharsha Chintalapani
>Assignee: Sriharsha Chintalapani
>  Labels: Hive
> Fix For: v1.7.0
>
> Attachments: FLUME-2804.patch
>
>
> Currently the hive sink does not explicitly abort unused transactions. 
> Although these eventually timeout on the hive side, it is preferable to 
> explicitly abort them so that the associated locks on the hive 
> table/partition are released. As long as the locks stay open, the 
> table/partition cannot be dropped.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (FLUME-2798) Malformed Syslog messages can lead to OutOfMemoryException

2015-10-02 Thread Roshan Naik (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLUME-2798?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Roshan Naik updated FLUME-2798:
---
Assignee: Phil D'Amore

> Malformed Syslog messages can lead to OutOfMemoryException
> --
>
> Key: FLUME-2798
> URL: https://issues.apache.org/jira/browse/FLUME-2798
> Project: Flume
>  Issue Type: Bug
>  Components: Sinks+Sources
>Affects Versions: v1.4.0, v1.5.0, v1.6.0
>Reporter: Phil D'Amore
>Assignee: Phil D'Amore
>Priority: Critical
> Attachments: FLUME-2798.patch
>
>
> It's possible for a client submitting syslog data which is malformed in 
> various ways to convince SyslogUtils.extractEvent to continually fill the 
> ByteArrayOutputStream it uses to collect the event until the agent runs out 
> of memory.  Since the OOM condition affects the whole agent, it's possible 
> that a client sending such data (due to accident or malicious intent) to 
> disable the agent, as long as it remains connected.
> Note that this is probably only possible using SyslogTcpSource although the 
> fix touches common code in SyslogUtils.java.
> The issue can happen in two ways:
> Scenario 1: Send a message like this:
> {{<> some more stuff here}}
> This causes a NumberFormatException:
> {code}
> Sep 11, 2015 2:27:07 AM org.jboss.netty.channel.SimpleChannelHandler
> WARNING: EXCEPTION, please implement 
> org.apache.flume.source.SyslogTcpSource$syslogTcpHandler.exceptionCaught() 
> for proper handling.
> java.lang.NumberFormatException: For input string: ""
>   at 
> java.lang.NumberFormatException.forInputString(NumberFormatException.java:65)
>   at java.lang.Integer.parseInt(Integer.java:504)
>   at java.lang.Integer.parseInt(Integer.java:527)
>   at org.apache.flume.source.SyslogUtils.buildEvent(SyslogUtils.java:198)
>   at 
> org.apache.flume.source.SyslogUtils.extractEvent(SyslogUtils.java:344)
>   at 
> org.apache.flume.source.SyslogTcpSource$syslogTcpHandler.messageReceived(SyslogTcpSource.java:76)
>   at 
> org.jboss.netty.channel.Channels.fireMessageReceived(Channels.java:268)
>   at 
> org.jboss.netty.channel.Channels.fireMessageReceived(Channels.java:255)
>   at org.jboss.netty.channel.socket.nio.NioWorker.read(NioWorker.java:94)
>   at 
> org.jboss.netty.channel.socket.nio.AbstractNioWorker.processSelectedKeys(AbstractNioWorker.java:364)
>   at 
> org.jboss.netty.channel.socket.nio.AbstractNioWorker.run(AbstractNioWorker.java:238)
>   at org.jboss.netty.channel.socket.nio.NioWorker.run(NioWorker.java:38)
>   at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>   at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>   at java.lang.Thread.run(Thread.java:745)
> {code}
> This exception does not get handled, and it happens before reset() can be 
> called.  The result is that the state machine in SyslogUtils gets stuck in 
> the DATA state, and all subsequent data just gets appended to the baos, while 
> the above exception streams to the log.  Eventually the agent runs out of 
> memory.
> Scenario 2: Send some data like this:
> {{<123...}}
> No length checking is done in the PRIO state so you could potentially fill 
> the agent memory this way too.
> I'm attaching a patch which handles both of these issues and adds more 
> exception handling to buildEvent to make sure that reset() is called in 
> future unforeseen situations.
> Thanks also to [~roshan_naik] for helping to make this patch better.
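
For illustration only (a sketch, not the actual SyslogUtils code; the 3-digit bound is an assumption), defensive parsing of the priority field avoids the unhandled NumberFormatException on input like "<>" and lets the caller reset its state instead of accumulating data forever:

{code}
public class SyslogPrioritySketch {

  private static final int MAX_PRIO_DIGITS = 3;  // assumed bound on the digits between '<' and '>'

  /**
   * Returns the parsed priority, or -1 for malformed input such as "<>",
   * a non-numeric priority, or an over-long priority field.
   */
  public static int parsePriority(String msg) {
    if (msg == null || msg.length() < 3 || msg.charAt(0) != '<') {
      return -1;
    }
    int end = msg.indexOf('>');
    if (end <= 1 || end - 1 > MAX_PRIO_DIGITS) {  // empty "<>" or too many digits
      return -1;
    }
    try {
      return Integer.parseInt(msg.substring(1, end));
    } catch (NumberFormatException e) {
      return -1;  // caller can reset its state machine instead of getting stuck
    }
  }
}
{code}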



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (FLUME-2798) Malformed Syslog messages can lead to OutOfMemoryException

2015-10-02 Thread Roshan Naik (JIRA)

[ 
https://issues.apache.org/jira/browse/FLUME-2798?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14941944#comment-14941944
 ] 

Roshan Naik commented on FLUME-2798:


It's committed.

Thanks very much for the patch, [~tweek].

> Malformed Syslog messages can lead to OutOfMemoryException
> --
>
> Key: FLUME-2798
> URL: https://issues.apache.org/jira/browse/FLUME-2798
> Project: Flume
>  Issue Type: Bug
>  Components: Sinks+Sources
>Affects Versions: v1.4.0, v1.5.0, v1.6.0
>Reporter: Phil D'Amore
>Assignee: Phil D'Amore
>Priority: Critical
> Attachments: FLUME-2798.patch
>
>
> It's possible for a client submitting syslog data which is malformed in 
> various ways to convince SyslogUtils.extractEvent to continually fill the 
> ByteArrayOutputStream it uses to collect the event until the agent runs out 
> of memory.  Since the OOM condition affects the whole agent, it's possible 
> that a client sending such data (due to accident or malicious intent) to 
> disable the agent, as long as it remains connected.
> Note that this is probably only possible using SyslogTcpSource although the 
> fix touches common code in SyslogUtils.java.
> The issue can happen in two ways:
> Scenario 1: Send a message like this:
> {{<> some more stuff here}}
> This causes a NumberFormatException:
> {code}
> Sep 11, 2015 2:27:07 AM org.jboss.netty.channel.SimpleChannelHandler
> WARNING: EXCEPTION, please implement 
> org.apache.flume.source.SyslogTcpSource$syslogTcpHandler.exceptionCaught() 
> for proper handling.
> java.lang.NumberFormatException: For input string: ""
>   at 
> java.lang.NumberFormatException.forInputString(NumberFormatException.java:65)
>   at java.lang.Integer.parseInt(Integer.java:504)
>   at java.lang.Integer.parseInt(Integer.java:527)
>   at org.apache.flume.source.SyslogUtils.buildEvent(SyslogUtils.java:198)
>   at 
> org.apache.flume.source.SyslogUtils.extractEvent(SyslogUtils.java:344)
>   at 
> org.apache.flume.source.SyslogTcpSource$syslogTcpHandler.messageReceived(SyslogTcpSource.java:76)
>   at 
> org.jboss.netty.channel.Channels.fireMessageReceived(Channels.java:268)
>   at 
> org.jboss.netty.channel.Channels.fireMessageReceived(Channels.java:255)
>   at org.jboss.netty.channel.socket.nio.NioWorker.read(NioWorker.java:94)
>   at 
> org.jboss.netty.channel.socket.nio.AbstractNioWorker.processSelectedKeys(AbstractNioWorker.java:364)
>   at 
> org.jboss.netty.channel.socket.nio.AbstractNioWorker.run(AbstractNioWorker.java:238)
>   at org.jboss.netty.channel.socket.nio.NioWorker.run(NioWorker.java:38)
>   at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>   at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>   at java.lang.Thread.run(Thread.java:745)
> {code}
> This exception does not get handled, and it happens before reset() can be 
> called.  The result is that the state machine in SyslogUtils gets stuck in 
> the DATA state, and all subsequent data just gets appended to the baos, while 
> the above exception streams to the log.  Eventually the agent runs out of 
> memory.
> Scenario 2: Send some data like this:
> {{<123...}}
> No length checking is done in the PRIO state so you could potentially fill 
> the agent memory this way too.
> I'm attaching a patch which handles both of these issues and adds more 
> exception handling to buildEvent to make sure that reset() is called in 
> future unforeseen situations.
> Thanks also to [~roshan_naik] for helping to make this patch better.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Resolved] (FLUME-2798) Malformed Syslog messages can lead to OutOfMemoryException

2015-10-02 Thread Roshan Naik (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLUME-2798?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Roshan Naik resolved FLUME-2798.

   Resolution: Fixed
Fix Version/s: v1.7.0

> Malformed Syslog messages can lead to OutOfMemoryException
> --
>
> Key: FLUME-2798
> URL: https://issues.apache.org/jira/browse/FLUME-2798
> Project: Flume
>  Issue Type: Bug
>  Components: Sinks+Sources
>Affects Versions: v1.4.0, v1.5.0, v1.6.0
>Reporter: Phil D'Amore
>Assignee: Phil D'Amore
>Priority: Critical
> Fix For: v1.7.0
>
> Attachments: FLUME-2798.patch
>
>
> It's possible for a client submitting syslog data which is malformed in 
> various ways to convince SyslogUtils.extractEvent to continually fill the 
> ByteArrayOutputStream it uses to collect the event until the agent runs out 
> of memory.  Since the OOM condition affects the whole agent, it's possible 
> that a client sending such data (due to accident or malicious intent) to 
> disable the agent, as long as it remains connected.
> Note that this is probably only possible using SyslogTcpSource although the 
> fix touches common code in SyslogUtils.java.
> The issue can happen in two ways:
> Scenario 1: Send a message like this:
> {{<> some more stuff here}}
> This causes a NumberFormatException:
> {code}
> Sep 11, 2015 2:27:07 AM org.jboss.netty.channel.SimpleChannelHandler
> WARNING: EXCEPTION, please implement 
> org.apache.flume.source.SyslogTcpSource$syslogTcpHandler.exceptionCaught() 
> for proper handling.
> java.lang.NumberFormatException: For input string: ""
>   at 
> java.lang.NumberFormatException.forInputString(NumberFormatException.java:65)
>   at java.lang.Integer.parseInt(Integer.java:504)
>   at java.lang.Integer.parseInt(Integer.java:527)
>   at org.apache.flume.source.SyslogUtils.buildEvent(SyslogUtils.java:198)
>   at 
> org.apache.flume.source.SyslogUtils.extractEvent(SyslogUtils.java:344)
>   at 
> org.apache.flume.source.SyslogTcpSource$syslogTcpHandler.messageReceived(SyslogTcpSource.java:76)
>   at 
> org.jboss.netty.channel.Channels.fireMessageReceived(Channels.java:268)
>   at 
> org.jboss.netty.channel.Channels.fireMessageReceived(Channels.java:255)
>   at org.jboss.netty.channel.socket.nio.NioWorker.read(NioWorker.java:94)
>   at 
> org.jboss.netty.channel.socket.nio.AbstractNioWorker.processSelectedKeys(AbstractNioWorker.java:364)
>   at 
> org.jboss.netty.channel.socket.nio.AbstractNioWorker.run(AbstractNioWorker.java:238)
>   at org.jboss.netty.channel.socket.nio.NioWorker.run(NioWorker.java:38)
>   at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>   at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>   at java.lang.Thread.run(Thread.java:745)
> {code}
> This exception does not get handled, and it happens before reset() can be 
> called.  The result is that the state machine in SyslogUtils gets stuck in 
> the DATA state, and all subsequent data just gets appended to the baos, while 
> the above exception streams to the log.  Eventually the agent runs out of 
> memory.
> Scenario 2: Send some data like this:
> {{<123...}}
> No length checking is done in the PRIO state so you could potentially fill 
> the agent memory this way too.
> I'm attaching a patch which handles both of these issues and adds more 
> exception handling to buildEvent to make sure that reset() is called in 
> future unforeseen situations.
> Thanks also to [~roshan_naik] for helping to make this patch better.
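
A rough, hypothetical sketch of the defensive pattern described above (this is 
not the attached patch): catch the priority-parse failure so the state machine 
can call reset(), and cap how many priority digits may accumulate so the PRIO 
state cannot grow the buffer without bound. Class and method names here are 
illustrative, not the actual SyslogUtils code.

{code}
import java.io.ByteArrayOutputStream;

public class SyslogPrioGuard {
  // RFC 3164 priority is "<" + at most 3 digits + ">", so cap what we buffer.
  private static final int MAX_PRIO_DIGITS = 3;
  private final ByteArrayOutputStream prio = new ByteArrayOutputStream();

  /** Returns the parsed priority, or -1 after resetting state on malformed input. */
  public int parsePriority(String prioField) {
    try {
      if (prioField.isEmpty() || prioField.length() > MAX_PRIO_DIGITS) {
        throw new NumberFormatException("bad priority length: " + prioField.length());
      }
      return Integer.parseInt(prioField);
    } catch (NumberFormatException e) {
      reset();  // never leave the state machine stuck in a partial state
      return -1;
    }
  }

  private void reset() {
    prio.reset();
  }
}
{code}

Per the description above, the attached patch applies the same idea inside 
SyslogUtils itself so that reset() is reached even on unforeseen failures.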



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (FLUME-2792) Flume Kafka Kerberos Support

2015-10-01 Thread Roshan Naik (JIRA)

[ 
https://issues.apache.org/jira/browse/FLUME-2792?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14940582#comment-14940582
 ] 

Roshan Naik commented on FLUME-2792:


Here are some notes that I gathered from talking to Kafka experts. Give it a 
shot...


It seems like it might be possible for the Kafka Sink (i.e. the Kafka producer 
side). This won't work for the Kafka Source (i.e. the Kafka consumer side).
Below are the steps we identified:

1)  *In Flume's Kafka Sink config set:*
   agentName.sinks.KafkaSinkName.kafka.security.protocol=PLAINTEXTSASL
The sink will forward this setting to the underlying Kafka Producer APIs. This 
informs the Producer APIs to use Kerberos.

2) *Pass the following JVM args to Flume:*
-Djava.security.auth.login.config=/path/jaas.conf
This indicates the file that holds the additional security settings used by the 
Producer APIs.
JVM args for Flume can be set using flume-env.sh, which resides in the 
directory specified by the -c argument to the Flume startup command. If Ambari 
managed, Ambari also allows you to directly edit flume-env.sh, as far as I 
recall.

3) *The jaas.conf file's contents should look like this:*
{code}
KafkaClient {
  com.sun.security.auth.module.Krb5LoginModule required
  useKeyTab=true
  keyTab="/etc/security/keytabs/flume_agent.keytab"
  storeKey=true
  useTicketCache=false
  principal="flume_agent/host_n...@example.com"
  serviceName="kafka";
};
{code}
You need to customize the key tab, principal and service name.

4) *Ensure the right Kafka libraries are used by Flume:*
  The Kerberos support is being added in the upcoming Kafka v0.9 release. Just 
ensure flume/lib does not have conflicting Kafka jar versions. A consolidated 
configuration sketch follows below.
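
Putting the steps above together, a minimal sketch of what the sink 
configuration might look like. The agent/sink/channel names (a1, k1, c1), the 
broker host, and the topic are placeholders; only the kafka.security.protocol 
property and the JVM flag come from the notes above.

{code}
# flume.conf (illustrative names)
a1.sinks.k1.type = org.apache.flume.sink.kafka.KafkaSink
a1.sinks.k1.brokerList = broker1.example.com:6667
a1.sinks.k1.topic = flume-events
a1.sinks.k1.kafka.security.protocol = PLAINTEXTSASL
a1.sinks.k1.channel = c1

# flume-env.sh (picked up from the directory passed with -c)
export JAVA_OPTS="$JAVA_OPTS -Djava.security.auth.login.config=/path/jaas.conf"
{code}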



> Flume Kafka Kerberos Support
> 
>
> Key: FLUME-2792
> URL: https://issues.apache.org/jira/browse/FLUME-2792
> Project: Flume
>  Issue Type: Bug
>  Components: Configuration, Docs, Sinks+Sources
>Affects Versions: v1.6.0, v1.5.2
> Environment: HDP 2.3 fully kerberized including Kafka 0.8.2.2 + Flume 
> 1.5.2 or Apache Flume 1.6 downloaded from apache.org
>Reporter: Hari Sekhon
>Priority: Blocker
>
> Following on from FLUME-2790 it appears as though Flume doesn't yet have 
> support for Kafka + Kerberos as there are is no setting documented in the 
> Flume 1.6.0 user guide under the Kafka source section to tell Flume to use 
> plaintextsasl as the connection mechanism to Kafka and Kafka rejects 
> unauthenticated plaintext mechanism:
> {code}15/09/10 16:51:22 INFO consumer.ConsumerFetcherManager: 
> [ConsumerFetcherManager-1441903874830] Added fetcher for partitions 
> ArrayBuffer()
> 15/09/10 16:51:22 WARN consumer.ConsumerFetcherManager$LeaderFinderThread: 
> [flume_-1441903874763-abdc98ec-leader-finder-thread], Failed 
> to find leader for Set([,0], [,1])
> kafka.common.BrokerEndPointNotAvailableException: End point PLAINTEXT not 
> found for broker 0
> at kafka.cluster.Broker.getBrokerEndPoint(Broker.scala:140)
> at 
> kafka.utils.ZkUtils$$anonfun$getAllBrokerEndPointsForChannel$1.apply(ZkUtils.scala:124)
> at 
> kafka.utils.ZkUtils$$anonfun$getAllBrokerEndPointsForChannel$1.apply(ZkUtils.scala:124)
> at 
> scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:244)
> at 
> scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:244)
> at 
> scala.collection.mutable.ResizableArray$class.foreach(ResizableArray.scala:59)
> at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:47)
> at 
> scala.collection.TraversableLike$class.map(TraversableLike.scala:244)
> at scala.collection.AbstractTraversable.map(Traversable.scala:105)
> at 
> kafka.utils.ZkUtils$.getAllBrokerEndPointsForChannel(ZkUtils.scala:124)
> at 
> kafka.consumer.ConsumerFetcherManager$LeaderFinderThread.doWork(ConsumerFetcherManager.scala:66)
> at 
> kafka.utils.ShutdownableThread.run(ShutdownableThread.scala:60){code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (FLUME-2804) Hive sink - abort remaining transactions on shutdown

2015-09-29 Thread Roshan Naik (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLUME-2804?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Roshan Naik updated FLUME-2804:
---
Affects Version/s: v1.6.0

> Hive sink - abort remaining transactions on shutdown
> 
>
> Key: FLUME-2804
> URL: https://issues.apache.org/jira/browse/FLUME-2804
> Project: Flume
>  Issue Type: Bug
>  Components: Sinks+Sources
>Affects Versions: v1.6.0
>Reporter: Sriharsha Chintalapani
>  Labels: Hive
> Fix For: v1.7.0
>
> Attachments: FLUME-2804.patch
>
>
> Currently the hive sink does not explicitly abort unused transactions. 
> Although these eventually timeout on the hive side, it is preferable to 
> explicitly abort them so that the associated locks on the hive 
> table/partition are released. As long as the locks stay open, the 
> table/partition cannot be dropped.
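
A minimal, hypothetical sketch of the shutdown path being described (not the 
attached patch): on stop(), walk any writers that still hold an open 
transaction batch and abort it so Hive releases the table/partition locks 
promptly instead of waiting for a server-side timeout. The TxnBatch interface 
below is a stand-in for the real Hive streaming handle.

{code}
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

public class HiveSinkShutdownSketch {
  interface TxnBatch { void abort() throws Exception; void close() throws Exception; }

  private final Map<String, TxnBatch> openBatches = new ConcurrentHashMap<>();

  public void stop() {
    for (Map.Entry<String, TxnBatch> e : openBatches.entrySet()) {
      try {
        e.getValue().abort();   // release locks held by the unused transactions
      } catch (Exception ex) {
        // best effort: log and keep shutting down the remaining writers
      } finally {
        try { e.getValue().close(); } catch (Exception ignored) { }
      }
    }
    openBatches.clear();
  }
}
{code}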



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (FLUME-2804) Hive sink - abort remaining transactions on shutdown

2015-09-29 Thread Roshan Naik (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLUME-2804?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Roshan Naik updated FLUME-2804:
---
Fix Version/s: v1.7.0

> Hive sink - abort remaining transactions on shutdown
> 
>
> Key: FLUME-2804
> URL: https://issues.apache.org/jira/browse/FLUME-2804
> Project: Flume
>  Issue Type: Bug
>  Components: Sinks+Sources
>Affects Versions: v1.6.0
>Reporter: Sriharsha Chintalapani
>  Labels: Hive
> Fix For: v1.7.0
>
> Attachments: FLUME-2804.patch
>
>
> Currently the hive sink does not explicitly abort unused transactions. 
> Although these eventually timeout on the hive side, it is preferable to 
> explicitly abort them so that the associated locks on the hive 
> table/partition are released. As long as the locks stay open, the 
> table/partition cannot be dropped.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (FLUME-2804) Hive sink - abort remaining transactions on shutdown

2015-09-29 Thread Roshan Naik (JIRA)

[ 
https://issues.apache.org/jira/browse/FLUME-2804?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14935961#comment-14935961
 ] 

Roshan Naik commented on FLUME-2804:


+1 

> Hive sink - abort remaining transactions on shutdown
> 
>
> Key: FLUME-2804
> URL: https://issues.apache.org/jira/browse/FLUME-2804
> Project: Flume
>  Issue Type: Bug
>Reporter: Sriharsha Chintalapani
> Attachments: FLUME-2804.patch
>
>
> Currently the hive sink does not explicitly abort unused transactions. 
> Although these eventually timeout on the hive side, it is preferable to 
> explicitly abort them so that the associated locks on the hive 
> table/partition are released. As long as the locks stay open, the 
> table/partition cannot be dropped.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Resolved] (FLUME-2804) Hive sink - abort remaining transactions on shutdown

2015-09-29 Thread Roshan Naik (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLUME-2804?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Roshan Naik resolved FLUME-2804.

Resolution: Fixed

Committed.

Thanks [~sriharsha]

> Hive sink - abort remaining transactions on shutdown
> 
>
> Key: FLUME-2804
> URL: https://issues.apache.org/jira/browse/FLUME-2804
> Project: Flume
>  Issue Type: Bug
>Reporter: Sriharsha Chintalapani
> Attachments: FLUME-2804.patch
>
>
> Currently the hive sink does not explicitly abort unused transactions. 
> Although these eventually timeout on the hive side, it is preferable to 
> explicitly abort them so that the associated locks on the hive 
> table/partition are released. As long as the locks stay open, the 
> table/partition cannot be dropped.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (FLUME-2804) Hive sink - abort remaining transactions on shutdown

2015-09-29 Thread Roshan Naik (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLUME-2804?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Roshan Naik updated FLUME-2804:
---
Labels: Hive  (was: )

> Hive sink - abort remaining transactions on shutdown
> 
>
> Key: FLUME-2804
> URL: https://issues.apache.org/jira/browse/FLUME-2804
> Project: Flume
>  Issue Type: Bug
>  Components: Sinks+Sources
>Affects Versions: v1.6.0
>Reporter: Sriharsha Chintalapani
>  Labels: Hive
> Fix For: v1.7.0
>
> Attachments: FLUME-2804.patch
>
>
> Currently the hive sink does not explicitly abort unused transactions. 
> Although these eventually timeout on the hive side, it is preferable to 
> explicitly abort them so that the associated locks on the hive 
> table/partition are released. As long as the locks stay open, the 
> table/partition cannot be dropped.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (FLUME-2804) Hive sink - abort remaining transactions on shutdown

2015-09-29 Thread Roshan Naik (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLUME-2804?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Roshan Naik updated FLUME-2804:
---
Component/s: Sinks+Sources

> Hive sink - abort remaining transactions on shutdown
> 
>
> Key: FLUME-2804
> URL: https://issues.apache.org/jira/browse/FLUME-2804
> Project: Flume
>  Issue Type: Bug
>  Components: Sinks+Sources
>Affects Versions: v1.6.0
>Reporter: Sriharsha Chintalapani
>  Labels: Hive
> Fix For: v1.7.0
>
> Attachments: FLUME-2804.patch
>
>
> Currently the hive sink does not explicitly abort unused transactions. 
> Although these eventually timeout on the hive side, it is preferable to 
> explicitly abort them so that the associated locks on the hive 
> table/partition are released. As long as the locks stay open, the 
> table/partition cannot be dropped.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (FLUME-2798) Malformed Syslog messages can lead to OutOfMemoryException

2015-09-23 Thread Roshan Naik (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLUME-2798?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Roshan Naik updated FLUME-2798:
---
Assignee: (was: Roshan Naik)

> Malformed Syslog messages can lead to OutOfMemoryException
> --
>
> Key: FLUME-2798
> URL: https://issues.apache.org/jira/browse/FLUME-2798
> Project: Flume
>  Issue Type: Bug
>  Components: Sinks+Sources
>Affects Versions: v1.4.0, v1.5.0, v1.6.0
>Reporter: Phil D'Amore
>Priority: Critical
> Attachments: FLUME-2798.patch
>
>
> It's possible for a client submitting syslog data which is malformed in 
> various ways to convince SyslogUtils.extractEvent to continually fill the 
> ByteArrayOutputStream it uses to collect the event until the agent runs out 
> of memory.  Since the OOM condition affects the whole agent, it's possible 
> for a client sending such data (by accident or with malicious intent) to 
> disable the agent, as long as it remains connected.
> Note that this is probably only possible using SyslogTcpSource although the 
> fix touches common code in SyslogUtils.java.
> The issue can happen in two ways:
> Scenario 1: Send a message like this:
> {{<> some more stuff here}}
> This causes a NumberFormatException:
> {code}
> Sep 11, 2015 2:27:07 AM org.jboss.netty.channel.SimpleChannelHandler
> WARNING: EXCEPTION, please implement 
> org.apache.flume.source.SyslogTcpSource$syslogTcpHandler.exceptionCaught() 
> for proper handling.
> java.lang.NumberFormatException: For input string: ""
>   at 
> java.lang.NumberFormatException.forInputString(NumberFormatException.java:65)
>   at java.lang.Integer.parseInt(Integer.java:504)
>   at java.lang.Integer.parseInt(Integer.java:527)
>   at org.apache.flume.source.SyslogUtils.buildEvent(SyslogUtils.java:198)
>   at 
> org.apache.flume.source.SyslogUtils.extractEvent(SyslogUtils.java:344)
>   at 
> org.apache.flume.source.SyslogTcpSource$syslogTcpHandler.messageReceived(SyslogTcpSource.java:76)
>   at 
> org.jboss.netty.channel.Channels.fireMessageReceived(Channels.java:268)
>   at 
> org.jboss.netty.channel.Channels.fireMessageReceived(Channels.java:255)
>   at org.jboss.netty.channel.socket.nio.NioWorker.read(NioWorker.java:94)
>   at 
> org.jboss.netty.channel.socket.nio.AbstractNioWorker.processSelectedKeys(AbstractNioWorker.java:364)
>   at 
> org.jboss.netty.channel.socket.nio.AbstractNioWorker.run(AbstractNioWorker.java:238)
>   at org.jboss.netty.channel.socket.nio.NioWorker.run(NioWorker.java:38)
>   at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>   at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>   at java.lang.Thread.run(Thread.java:745)
> {code}
> This exception does not get handled, and it happens before reset() can be 
> called.  The result is that the state machine in SyslogUtils gets stuck in 
> the DATA state, and all subsequent data just gets appended to the baos, while 
> the above exception streams to the log.  Eventually the agent runs out of 
> memory.
> Scenario 2: Send some data like this:
> {{<123...}}
> No length checking is done in the PRIO state so you could potentially fill 
> the agent memory this way too.
> I'm attaching a patch which handles both of these issues and adds more 
> exception handling to buildEvent to make sure that reset() is called in 
> future unforeseen situations.
> Thanks also to [~roshan_naik] for helping to make this patch better.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Assigned] (FLUME-2798) Malformed Syslog messages can lead to OutOfMemoryException

2015-09-23 Thread Roshan Naik (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLUME-2798?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Roshan Naik reassigned FLUME-2798:
--

Assignee: Roshan Naik

> Malformed Syslog messages can lead to OutOfMemoryException
> --
>
> Key: FLUME-2798
> URL: https://issues.apache.org/jira/browse/FLUME-2798
> Project: Flume
>  Issue Type: Bug
>  Components: Sinks+Sources
>Affects Versions: v1.4.0, v1.5.0, v1.6.0
>Reporter: Phil D'Amore
>Assignee: Roshan Naik
>Priority: Critical
> Attachments: FLUME-2798.patch
>
>
> It's possible for a client submitting syslog data which is malformed in 
> various ways to convince SyslogUtils.extractEvent to continually fill the 
> ByteArrayOutputStream it uses to collect the event until the agent runs out 
> of memory.  Since the OOM condition affects the whole agent, it's possible 
> for a client sending such data (by accident or with malicious intent) to 
> disable the agent, as long as it remains connected.
> Note that this is probably only possible using SyslogTcpSource although the 
> fix touches common code in SyslogUtils.java.
> The issue can happen in two ways:
> Scenario 1: Send a message like this:
> {{<> some more stuff here}}
> This causes a NumberFormatException:
> {code}
> Sep 11, 2015 2:27:07 AM org.jboss.netty.channel.SimpleChannelHandler
> WARNING: EXCEPTION, please implement 
> org.apache.flume.source.SyslogTcpSource$syslogTcpHandler.exceptionCaught() 
> for proper handling.
> java.lang.NumberFormatException: For input string: ""
>   at 
> java.lang.NumberFormatException.forInputString(NumberFormatException.java:65)
>   at java.lang.Integer.parseInt(Integer.java:504)
>   at java.lang.Integer.parseInt(Integer.java:527)
>   at org.apache.flume.source.SyslogUtils.buildEvent(SyslogUtils.java:198)
>   at 
> org.apache.flume.source.SyslogUtils.extractEvent(SyslogUtils.java:344)
>   at 
> org.apache.flume.source.SyslogTcpSource$syslogTcpHandler.messageReceived(SyslogTcpSource.java:76)
>   at 
> org.jboss.netty.channel.Channels.fireMessageReceived(Channels.java:268)
>   at 
> org.jboss.netty.channel.Channels.fireMessageReceived(Channels.java:255)
>   at org.jboss.netty.channel.socket.nio.NioWorker.read(NioWorker.java:94)
>   at 
> org.jboss.netty.channel.socket.nio.AbstractNioWorker.processSelectedKeys(AbstractNioWorker.java:364)
>   at 
> org.jboss.netty.channel.socket.nio.AbstractNioWorker.run(AbstractNioWorker.java:238)
>   at org.jboss.netty.channel.socket.nio.NioWorker.run(NioWorker.java:38)
>   at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>   at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>   at java.lang.Thread.run(Thread.java:745)
> {code}
> This exception does not get handled, and it happens before reset() can be 
> called.  The result is that the state machine in SyslogUtils gets stuck in 
> the DATA state, and all subsequent data just gets appended to the baos, while 
> the above exception streams to the log.  Eventually the agent runs out of 
> memory.
> Scenario 2: Send some data like this:
> {{<123...}}
> No length checking is done in the PRIO state so you could potentially fill 
> the agent memory this way too.
> I'm attaching a patch which handles both of these issues and adds more 
> exception handling to buildEvent to make sure that reset() is called in 
> future unforeseen situations.
> Thanks also to [~roshan_naik] for helping to make this patch better.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (FLUME-2225) Elasticsearch Sink for ES HTTP API

2015-09-10 Thread Roshan Naik (JIRA)

[ 
https://issues.apache.org/jira/browse/FLUME-2225?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14739203#comment-14739203
 ] 

Roshan Naik commented on FLUME-2225:


[~mbonica] can you open a new JIRA for this?

> Elasticsearch Sink for ES HTTP API
> --
>
> Key: FLUME-2225
> URL: https://issues.apache.org/jira/browse/FLUME-2225
> Project: Flume
>  Issue Type: New Feature
>Affects Versions: v1.5.0
>Reporter: Otis Gospodnetic
>Assignee: Pawel Rog
> Fix For: v1.4.1, v1.5.0
>
> Attachments: FLUME-2225-0.patch, FLUME-2225-1.patch, 
> FLUME-2225-5.patch, FLUME-2225-6.patch
>
>
> Existing ElasticSearchSink uses ES TransportClient.  As such, one cannot use 
> the ES HTTP API, which is sometimes easier, and doesn't have issues around 
> client and server/cluster components using incompatible versions - currently, 
> both client and server/cluster need to be on the same version.
> See
> http://search-hadoop.com/m/k76HH9Te68/otis=Elasticsearch+sink+that+uses+HTTP+API



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (FLUME-2433) Add kerberos support for Hive sink

2015-08-27 Thread Roshan Naik (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLUME-2433?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Roshan Naik updated FLUME-2433:
---
Attachment: FLUME-2433.v2.patch

[~jrufus]  It seems like your suggestion is the right way to go.

My current implementation mimics the previous HDFS Kerberos implementation and 
doesn't use mod-auth. Right now, I am uploading the rebased patch, as it will 
take me some time to figure out what changes are needed for switching to 
mod-auth etc.  System testing of the new implementation will also take some 
time, as it requires a secure cluster setup.

I'll leave it up to you whether you want to commit this in its current state 
and handle the switch to mod-auth in a separate JIRA, or hold this JIRA for the 
revised implementation. It will take me some time to do that, I think.
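
For context, a rough sketch of the "mimic the HDFS Kerberos handling" approach 
mentioned above, assuming the standard Hadoop UserGroupInformation API; the 
principal and keytab values are placeholders and this is not the attached 
patch.

{code}
import java.security.PrivilegedExceptionAction;
import org.apache.hadoop.security.UserGroupInformation;

public class HiveSinkKerberosSketch {
  public static void main(String[] args) throws Exception {
    // Log in once from the keytab, then run the Hive writes under that identity.
    UserGroupInformation.loginUserFromKeytab(
        "flume/host@EXAMPLE.COM", "/etc/security/keytabs/flume.keytab");
    UserGroupInformation ugi = UserGroupInformation.getLoginUser();
    ugi.doAs(new PrivilegedExceptionAction<Void>() {
      @Override
      public Void run() throws Exception {
        // open the Hive streaming connection and write events here
        return null;
      }
    });
  }
}
{code}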

 Add kerberos support for Hive sink
 --

 Key: FLUME-2433
 URL: https://issues.apache.org/jira/browse/FLUME-2433
 Project: Flume
  Issue Type: Bug
  Components: Sinks+Sources
Affects Versions: v1.5.0.1
Reporter: Roshan Naik
Assignee: Roshan Naik
  Labels: HiveSink, Kerberos,
 Attachments: FLUME-2433.patch, FLUME-2433.v2.patch


 Add kerberos authentication support for Hive sink
 FYI: The HCatalog API support for Kerberos is not available in Hive 0.13.1; 
 this should be available in the next Hive release.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (FLUME-2754) Hive Sink skipping first transaction in each Batch of Hive Transactions

2015-08-25 Thread Roshan Naik (JIRA)

[ 
https://issues.apache.org/jira/browse/FLUME-2754?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14712332#comment-14712332
 ] 

Roshan Naik commented on FLUME-2754:


Thanks [~deepesh] for the patch and for including the test.
+1

 Hive Sink skipping first transaction in each Batch of Hive Transactions
 ---

 Key: FLUME-2754
 URL: https://issues.apache.org/jira/browse/FLUME-2754
 Project: Flume
  Issue Type: Bug
Affects Versions: v1.5.0, 1.6
Reporter: Roshan Naik
Assignee: Deepesh Khandelwal
 Attachments: FLUME-2754.patch






--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Resolved] (FLUME-2754) Hive Sink skipping first transaction in each Batch of Hive Transactions

2015-08-25 Thread Roshan Naik (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLUME-2754?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Roshan Naik resolved FLUME-2754.

   Resolution: Fixed
Fix Version/s: v1.7.0

Committed.

 Hive Sink skipping first transaction in each Batch of Hive Transactions
 ---

 Key: FLUME-2754
 URL: https://issues.apache.org/jira/browse/FLUME-2754
 Project: Flume
  Issue Type: Bug
Affects Versions: v1.5.0, 1.6
Reporter: Roshan Naik
Assignee: Deepesh Khandelwal
 Fix For: v1.7.0

 Attachments: FLUME-2754.patch






--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (FLUME-2498) Implement Taildir Source

2015-08-24 Thread Roshan Naik (JIRA)

[ 
https://issues.apache.org/jira/browse/FLUME-2498?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14709683#comment-14709683
 ] 

Roshan Naik commented on FLUME-2498:


[~evilezh] could you open a JIRA for that feature request, and consider 
submitting a patch for it?

 Implement Taildir Source
 

 Key: FLUME-2498
 URL: https://issues.apache.org/jira/browse/FLUME-2498
 Project: Flume
  Issue Type: New Feature
  Components: Sinks+Sources
Reporter: Satoshi Iijima
 Fix For: v1.7.0

 Attachments: FLUME-2498-2.patch, FLUME-2498-3.patch, 
 FLUME-2498-4.patch, FLUME-2498-5.patch, FLUME-2498.patch


 This is the proposal of implementing a new tailing source.
 This source watches the specified files, and tails them in nearly real-time 
 once appends are detected to these files.
 * This source is reliable and will not miss data even when the tailing files 
 rotate.
 * It periodically writes the last read position of each file in a position 
 file using the JSON format.
 * If Flume is stopped or down for some reason, it can restart tailing from 
 the position written on the existing position file.
 * It can add event headers to each tailing file group. 
 The attached patch includes configuration documentation for this.
 This source requires Unix-style file system and Java 1.7 or later.
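
To make the description concrete, a hedged configuration sketch for the 
proposed source; the property names follow the docs patch attached to this 
JIRA, and the agent/source names, paths, group names, and header values are 
placeholders.

{code}
a1.sources.r1.type = TAILDIR
a1.sources.r1.positionFile = /var/lib/flume/taildir_position.json
a1.sources.r1.filegroups = f1 f2
a1.sources.r1.filegroups.f1 = /var/log/app1/app.log
a1.sources.r1.filegroups.f2 = /var/log/app2/.*log
a1.sources.r1.headers.f1.logtype = app1
a1.sources.r1.fileHeader = true
a1.sources.r1.channels = c1
{code}

The position file is plain JSON, roughly one record of file path plus last read 
offset per tailed file, which is what lets a restarted agent resume from where 
it left off.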



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (FLUME-2433) Add kerberos support for Hive sink

2015-08-21 Thread Roshan Naik (JIRA)

[ 
https://issues.apache.org/jira/browse/FLUME-2433?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14707801#comment-14707801
 ] 

Roshan Naik commented on FLUME-2433:


This patch probably needs to be rebased.  It's actually been in production for 
over a year with the HDP distribution.

Perhaps [~ashishpaliwal] or [~jrufus] can help with review and commit once I 
revise the patch.

 Add kerberos support for Hive sink
 --

 Key: FLUME-2433
 URL: https://issues.apache.org/jira/browse/FLUME-2433
 Project: Flume
  Issue Type: Bug
  Components: Sinks+Sources
Affects Versions: v1.5.0.1
Reporter: Roshan Naik
Assignee: Roshan Naik
  Labels: HiveSink, Kerberos,
 Attachments: FLUME-2433.patch


 Add kerberos authentication support for Hive sink
 FYI: The HCatalog API support for Kerberos is not available in Hive 0.13.1; 
 this should be available in the next Hive release.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (FLUME-2436) Make hadoop-2 the default build profile

2015-08-21 Thread Roshan Naik (JIRA)

[ 
https://issues.apache.org/jira/browse/FLUME-2436?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14707288#comment-14707288
 ] 

Roshan Naik commented on FLUME-2436:


[~raviprak] With Flume 1.6 and above, the default profile should be 'hbase-1', 
which also uses Hadoop 2.

Not sure the hadoop-2 profile has much use anymore.

 Make hadoop-2 the default build profile
 ---

 Key: FLUME-2436
 URL: https://issues.apache.org/jira/browse/FLUME-2436
 Project: Flume
  Issue Type: Bug
Reporter: Hari Shreedharan
Assignee: Johny Rufus
  Labels: build
 Attachments: FLUME-2436.patch






--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (FLUME-2767) How to extract records from Facebook and linkedin

2015-08-21 Thread Roshan Naik (JIRA)

[ 
https://issues.apache.org/jira/browse/FLUME-2767?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14707276#comment-14707276
 ] 

Roshan Naik commented on FLUME-2767:


For general questions it is better to use the mailing list.

There are no Flume sources currently implemented for Facebook or LinkedIn that 
I am aware of.

 How to extract records from Facebook and linkedin 
 --

 Key: FLUME-2767
 URL: https://issues.apache.org/jira/browse/FLUME-2767
 Project: Flume
  Issue Type: Question
  Components: Configuration
Affects Versions: v1.6.0
 Environment: POC
Reporter: swayam
Priority: Critical
   Original Estimate: 48h
  Remaining Estimate: 48h

 Hi Team,
 Could you let me know how we can extract data from Facebook and LinkedIn? We 
 are able to extract from Twitter.
 Please let me know the process.
 Thanks!!
 Swayam



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (FLUME-2766) Add deserializer support to TailDir source

2015-08-17 Thread Roshan Naik (JIRA)
Roshan Naik created FLUME-2766:
--

 Summary: Add deserializer support to TailDir source
 Key: FLUME-2766
 URL: https://issues.apache.org/jira/browse/FLUME-2766
 Project: Flume
  Issue Type: Bug
  Components: Sinks+Sources
Affects Versions: v1.7.0
Reporter: Roshan Naik


TailDir does not support deserializers currently.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (FLUME-2498) Implement Taildir Source

2015-08-17 Thread Roshan Naik (JIRA)

[ 
https://issues.apache.org/jira/browse/FLUME-2498?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14700643#comment-14700643
 ] 

Roshan Naik commented on FLUME-2498:


Seems like there are no blocker issues. And all changes have been reviewed by 
others and myself.
So +1 from me. 
Will initiate the commit now.

 Implement Taildir Source
 

 Key: FLUME-2498
 URL: https://issues.apache.org/jira/browse/FLUME-2498
 Project: Flume
  Issue Type: New Feature
  Components: Sinks+Sources
Reporter: Satoshi Iijima
 Fix For: v1.7.0

 Attachments: FLUME-2498-2.patch, FLUME-2498-3.patch, 
 FLUME-2498-4.patch, FLUME-2498-5.patch, FLUME-2498.patch


 This is the proposal of implementing a new tailing source.
 This source watches the specified files, and tails them in nearly real-time 
 once appends are detected to these files.
 * This source is reliable and will not miss data even when the tailing files 
 rotate.
 * It periodically writes the last read position of each file in a position 
 file using the JSON format.
 * If Flume is stopped or down for some reason, it can restart tailing from 
 the position written on the existing position file.
 * It can add event headers to each tailing file group. 
 The attached patch includes configuration documentation for this.
 This source requires Unix-style file system and Java 1.7 or later.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (FLUME-2498) Implement Taildir Source

2015-08-14 Thread Roshan Naik (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLUME-2498?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Roshan Naik updated FLUME-2498:
---
Attachment: FLUME-2498-5.patch

Uploading patch.
  - Changed the file consume order to be based on modification time (let me 
know if you feel this is an inappropriate choice; can easily switch to creation 
time if needed). A small illustrative sketch of this ordering is included below.
  - Also added a unit test to check the file consume order.
  - It includes changes from patch v4 submitted by Johny (can you please verify 
this, [~jrufus]?).
  - Corrected the description of the 'filegroups' setting in the doc. 
[~iijima_satoshi], can you please validate?


Although I have a unit test to verify the consume order, it would be good if 
someone reviews it to ensure I didn't mess up. [~iijima_satoshi], Johny, Ashish 
or others? I have kept my changes to a bare minimum to reduce churn.
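
As referenced in the first bullet, an illustrative sketch (not the patch 
itself) of ordering candidate files by modification time; switching to creation 
time would mean reading the file's creation attribute instead.

{code}
import java.io.File;
import java.util.Arrays;
import java.util.Comparator;

public class ConsumeOrderSketch {
  /** Returns a copy of the files sorted oldest-modified first. */
  public static File[] sortByModificationTime(File[] files) {
    File[] sorted = Arrays.copyOf(files, files.length);
    Arrays.sort(sorted, new Comparator<File>() {
      @Override
      public int compare(File a, File b) {
        return Long.compare(a.lastModified(), b.lastModified());
      }
    });
    return sorted;
  }

  public static void main(String[] args) {
    File[] candidates = new File(args.length > 0 ? args[0] : ".").listFiles();
    if (candidates == null) return;
    for (File f : sortByModificationTime(candidates)) {
      System.out.println(f.lastModified() + "  " + f.getName());
    }
  }
}
{code}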

 Implement Taildir Source
 

 Key: FLUME-2498
 URL: https://issues.apache.org/jira/browse/FLUME-2498
 Project: Flume
  Issue Type: New Feature
  Components: Sinks+Sources
Reporter: Satoshi Iijima
 Fix For: v1.7.0

 Attachments: FLUME-2498-2.patch, FLUME-2498-3.patch, 
 FLUME-2498-4.patch, FLUME-2498-5.patch, FLUME-2498.patch


 This is the proposal of implementing a new tailing source.
 This source watches the specified files, and tails them in nearly real-time 
 once appends are detected to these files.
 * This source is reliable and will not miss data even when the tailing files 
 rotate.
 * It periodically writes the last read position of each file in a position 
 file using the JSON format.
 * If Flume is stopped or down for some reason, it can restart tailing from 
 the position written on the existing position file.
 * It can add event headers to each tailing file group. 
 The attached patch includes configuration documentation for this.
 This source requires Unix-style file system and Java 1.7 or later.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Comment Edited] (FLUME-2498) Implement Taildir Source

2015-08-13 Thread Roshan Naik (JIRA)

[ 
https://issues.apache.org/jira/browse/FLUME-2498?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14696110#comment-14696110
 ] 

Roshan Naik edited comment on FLUME-2498 at 8/13/15 10:59 PM:
--

Quick update: changing the order has proved a bit more complex than expected. I 
thought I had it, but the tests indicate otherwise.

Still working on it.
Also, it is becoming evident to me that consumption order based on lastModified 
time is more appropriate than creation time, as old files may still be getting 
updated. That way a file-deleting agent can remove files that have not been 
modified for a long time, rather than based on creation time.


was (Author: roshan_naik):
Quick update: changing the order has proved a bit more complex than expected. I 
thought I had it, but the tests indicate otherwise.

Still working on it.
Also, it is becoming evident to me that lastModified time is more appropriate 
than creation time, as old files may still be getting updated. That way a 
file-deleting agent can remove files that have not been modified for a long 
time, rather than based on creation time.

 Implement Taildir Source
 

 Key: FLUME-2498
 URL: https://issues.apache.org/jira/browse/FLUME-2498
 Project: Flume
  Issue Type: New Feature
  Components: Sinks+Sources
Reporter: Satoshi Iijima
 Fix For: v1.7.0

 Attachments: FLUME-2498-2.patch, FLUME-2498-3.patch, 
 FLUME-2498-4.patch, FLUME-2498.patch


 This is the proposal of implementing a new tailing source.
 This source watches the specified files, and tails them in nearly real-time 
 once appends are detected to these files.
 * This source is reliable and will not miss data even when the tailing files 
 rotate.
 * It periodically writes the last read position of each file in a position 
 file using the JSON format.
 * If Flume is stopped or down for some reason, it can restart tailing from 
 the position written on the existing position file.
 * It can add event headers to each tailing file group. 
 The attached patch includes configuration documentation for this.
 This source requires Unix-style file system and Java 1.7 or later.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (FLUME-2498) Implement Taildir Source

2015-08-12 Thread Roshan Naik (JIRA)

[ 
https://issues.apache.org/jira/browse/FLUME-2498?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14694141#comment-14694141
 ] 

Roshan Naik commented on FLUME-2498:


Yes, I guess that sounds like a good idea. It would be good to have a little 
unit test for that function with two or three different types of lines feeding 
into it.


 Implement Taildir Source
 

 Key: FLUME-2498
 URL: https://issues.apache.org/jira/browse/FLUME-2498
 Project: Flume
  Issue Type: New Feature
  Components: Sinks+Sources
Reporter: Satoshi Iijima
 Fix For: v1.7.0

 Attachments: FLUME-2498-2.patch, FLUME-2498-3.patch, FLUME-2498.patch


 This is the proposal of implementing a new tailing source.
 This source watches the specified files, and tails them in nearly real-time 
 once appends are detected to these files.
 * This source is reliable and will not miss data even when the tailing files 
 rotate.
 * It periodically writes the last read position of each file in a position 
 file using the JSON format.
 * If Flume is stopped or down for some reason, it can restart tailing from 
 the position written on the existing position file.
 * It can add event headers to each tailing file group. 
 The attached patch includes configuration documentation for this.
 This source requires Unix-style file system and Java 1.7 or later.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (FLUME-2498) Implement Taildir Source

2015-08-12 Thread Roshan Naik (JIRA)

[ 
https://issues.apache.org/jira/browse/FLUME-2498?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14694144#comment-14694144
 ] 

Roshan Naik commented on FLUME-2498:


If it works well, we should use your implementation for FLUME-2508 as well.

 Implement Taildir Source
 

 Key: FLUME-2498
 URL: https://issues.apache.org/jira/browse/FLUME-2498
 Project: Flume
  Issue Type: New Feature
  Components: Sinks+Sources
Reporter: Satoshi Iijima
 Fix For: v1.7.0

 Attachments: FLUME-2498-2.patch, FLUME-2498-3.patch, FLUME-2498.patch


 This is the proposal of implementing a new tailing source.
 This source watches the specified files, and tails them in nearly real-time 
 once appends are detected to these files.
 * This source is reliable and will not miss data even when the tailing files 
 rotate.
 * It periodically writes the last read position of each file in a position 
 file using the JSON format.
 * If Flume is stopped or down for some reason, it can restart tailing from 
 the position written on the existing position file.
 * It can add event headers to each tailing file group. 
 The attached patch includes configuration documentation for this.
 This source requires Unix-style file system and Java 1.7 or later.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

