[jira] [Resolved] (FLUME-2920) Kafka Channel Should Not Commit Offsets When Stopping

2016-06-10 Thread Jarek Jarcec Cecho (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLUME-2920?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jarek Jarcec Cecho resolved FLUME-2920.
---
   Resolution: Fixed
Fix Version/s: v1.7.0

Thank you for your contribution [~kevinconaway] and for the review [~jholoman] 
/ [~granthenke]!

> Kafka Channel Should Not Commit Offsets When Stopping
> -
>
> Key: FLUME-2920
> URL: https://issues.apache.org/jira/browse/FLUME-2920
> Project: Flume
>  Issue Type: Bug
>  Components: Channel
>Affects Versions: v1.6.0
>Reporter: Kevin Conaway
>Assignee: Kevin Conaway
>Priority: Critical
> Fix For: v1.7.0
>
> Attachments: FLUME-2920.patch
>
>
> The Kafka channel commits the consumer offsets when shutting down (via stop() 
> -> decommissionConsumerAndRecords())
> This can lead to data loss if the channel is shut down while messages have 
> been fetched in a transaction but the transaction has not yet been committed.
> The only time that the offsets should be committed is when a transaction is 
> complete.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Assigned] (FLUME-2920) Kafka Channel Should Not Commit Offsets When Stopping

2016-06-10 Thread Jarek Jarcec Cecho (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLUME-2920?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jarek Jarcec Cecho reassigned FLUME-2920:
-

Assignee: Kevin Conaway

> Kafka Channel Should Not Commit Offsets When Stopping
> -
>
> Key: FLUME-2920
> URL: https://issues.apache.org/jira/browse/FLUME-2920
> Project: Flume
>  Issue Type: Bug
>  Components: Channel
>Affects Versions: v1.6.0
>Reporter: Kevin Conaway
>Assignee: Kevin Conaway
>Priority: Critical
> Attachments: FLUME-2920.patch
>
>
> The Kafka channel commits the consumer offsets when shutting down (via stop() 
> -> decommissionConsumerAndRecords())
> This can lead to data loss if the channel is shut down while messages have 
> been fetched in a transaction but the transaction has not yet been committed.
> The only time that the offsets should be committed is when a transaction is 
> complete.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (FLUME-2905) NetcatSource - Socket not closed when an exception is encountered during start() leading to file descriptor leaks

2016-05-18 Thread Jarek Jarcec Cecho (JIRA)

[ 
https://issues.apache.org/jira/browse/FLUME-2905?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15289528#comment-15289528
 ] 

Jarek Jarcec Cecho commented on FLUME-2905:
---

My local build is failing with:

{code}
Tests run: 8, Failures: 0, Errors: 1, Skipped: 0, Time elapsed: 6.481 sec <<< 
FAILURE!
testSourceStoppedOnFlumeException(org.apache.flume.source.TestNetcatSource)  
Time elapsed: 6 sec  <<< ERROR!
java.lang.IllegalStateException: No channel processor configured
at 
com.google.common.base.Preconditions.checkState(Preconditions.java:145)
at org.apache.flume.source.AbstractSource.start(AbstractSource.java:45)
at org.apache.flume.source.NetcatSource.start(NetcatSource.java:192)
at 
org.apache.flume.source.TestNetcatSource.startSource(TestNetcatSource.java:344)
at 
org.apache.flume.source.TestNetcatSource.testSourceStoppedOnFlumeException(TestNetcatSource.java:314)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at 
org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:45)
at 
org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:15)
at 
org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:42)
at 
org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:20)
at 
org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:28)
at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:263)
at 
org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:68)
at 
org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:47)
at org.junit.runners.ParentRunner$3.run(ParentRunner.java:231)
at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:60)
at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:229)
at org.junit.runners.ParentRunner.access$000(ParentRunner.java:50)
at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:222)
at org.junit.runners.ParentRunner.run(ParentRunner.java:300)
at 
org.apache.maven.surefire.junit4.JUnit4Provider.execute(JUnit4Provider.java:252)
at 
org.apache.maven.surefire.junit4.JUnit4Provider.executeTestSet(JUnit4Provider.java:141)
at 
org.apache.maven.surefire.junit4.JUnit4Provider.invoke(JUnit4Provider.java:112)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at 
org.apache.maven.surefire.util.ReflectionUtils.invokeMethodWithArray(ReflectionUtils.java:189)
at 
org.apache.maven.surefire.booter.ProviderFactory$ProviderProxy.invoke(ProviderFactory.java:165)
at 
org.apache.maven.surefire.booter.ProviderFactory.invokeProvider(ProviderFactory.java:85)
at 
org.apache.maven.surefire.booter.ForkedBooter.runSuitesInProcess(ForkedBooter.java:115)
at 
org.apache.maven.surefire.booter.ForkedBooter.main(ForkedBooter.java:75)
{code}

Wondering if you saw that in your environment as well?

> NetcatSource - Socket not closed when an exception is encountered during 
> start() leading to file descriptor leaks
> -
>
> Key: FLUME-2905
> URL: https://issues.apache.org/jira/browse/FLUME-2905
> Project: Flume
>  Issue Type: Bug
>  Components: Sinks+Sources
>Affects Versions: v1.6.0
>Reporter: Siddharth Ahuja
>Assignee: Siddharth Ahuja
> Attachments: FLUME-2905-0.patch, FLUME-2905-1.patch
>
>
> During the flume agent start-up, the flume configuration containing the 
> NetcatSource is parsed and the source's start() is called. If there is an 
> issue while binding the channel's socket to a local address to configure the 
> socket to listen for connections following exception is thrown but the socket 
> open just before is not closed. 
> {code}
> 2016-05-01 03:04:37,273 ERROR org.apache.flume.lifecycle.LifecycleSupervisor: 
> Unable to start EventDrivenSourceRunner: { 
> source:org.apache.flume.source.NetcatSource{name:src-1,state:IDLE} } - 
> Exception follows.
> org.apache.flume.FlumeException: java.net.

[jira] [Commented] (FLUME-2905) NetcatSource - Socket not closed when an exception is encountered during start() leading to file descriptor leaks

2016-05-12 Thread Jarek Jarcec Cecho (JIRA)

[ 
https://issues.apache.org/jira/browse/FLUME-2905?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15281953#comment-15281953
 ] 

Jarek Jarcec Cecho commented on FLUME-2905:
---

Sure, my pleasure.  Quickly looking at the patch, can you please remove the 
trailing white space warnings [~sahuja]?

{code}
jarcec@arlene flume % git apply FLUME-2905-0.patch  


   
[trunk ✗] (1.8.0_77-b03) (ruby-2.2.3) [12:54:00]
FLUME-2905-0.patch:25: space before tab in indent.
.setNameFormat("netcat-handler-%d").build());
FLUME-2905-0.patch:50: trailing whitespace.

FLUME-2905-0.patch:52: trailing whitespace.
   * Tests that the source is stopped when an exception is thrown
FLUME-2905-0.patch:54: trailing whitespace.
   * clean up the sockets opened during source.start().
FLUME-2905-0.patch:55: trailing whitespace.
   *
warning: squelched 5 whitespace errors
warning: 10 lines add whitespace errors.
{code}

> NetcatSource - Socket not closed when an exception is encountered during 
> start() leading to file descriptor leaks
> -
>
> Key: FLUME-2905
> URL: https://issues.apache.org/jira/browse/FLUME-2905
> Project: Flume
>  Issue Type: Bug
>  Components: Sinks+Sources
>Affects Versions: v1.6.0
>Reporter: Siddharth Ahuja
>Assignee: Siddharth Ahuja
> Attachments: FLUME-2905-0.patch
>
>
> During the flume agent start-up, the flume configuration containing the 
> NetcatSource is parsed and the source's start() is called. If there is an 
> issue while binding the channel's socket to a local address to configure the 
> socket to listen for connections following exception is thrown but the socket 
> open just before is not closed. 
> {code}
> 2016-05-01 03:04:37,273 ERROR org.apache.flume.lifecycle.LifecycleSupervisor: 
> Unable to start EventDrivenSourceRunner: { 
> source:org.apache.flume.source.NetcatSource{name:src-1,state:IDLE} } - 
> Exception follows.
> org.apache.flume.FlumeException: java.net.BindException: Address already in 
> use
> at org.apache.flume.source.NetcatSource.start(NetcatSource.java:173)
> at 
> org.apache.flume.source.EventDrivenSourceRunner.start(EventDrivenSourceRunner.java:44)
> at 
> org.apache.flume.lifecycle.LifecycleSupervisor$MonitorRunnable.run(LifecycleSupervisor.java:251)
> at 
> java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
> at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:304)
> at 
> java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:178)
> at 
> java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293)
> at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
> at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
> at java.lang.Thread.run(Thread.java:745)
> Caused by: java.net.BindException: Address already in use
> at sun.nio.ch.Net.bind0(Native Method)
> at sun.nio.ch.Net.bind(Net.java:444)
> at sun.nio.ch.Net.bind(Net.java:436)
> at 
> sun.nio.ch.ServerSocketChannelImpl.bind(ServerSocketChannelImpl.java:214)
> at sun.nio.ch.ServerSocketAdaptor.bind(ServerSocketAdaptor.java:74)
> at sun.nio.ch.ServerSocketAdaptor.bind(ServerSocketAdaptor.java:67)
> at org.apache.flume.source.NetcatSource.start(NetcatSource.java:167)
> ... 9 more
> {code}
> The source's start() is then called again leading to another socket being 
> opened but not closed and so on. This leads to file descriptor (socket) leaks.
> This can be easily reproduced as follows:
> 1. Set Netcat as the source in flume agent configuration.
> 2. Set the bind port for the netcat source to a port which is already in use. 
> e.g. in my case I used 50010 which is the port for DataNode's XCeiver 
> Protocol in use by the HDFS service.
> 3. Start flume agent and perform "lsof -p  | wc -l". Notice 
> the file descriptors keep on growing due to socket leaks with errors like: 
> "can't identify protocol".



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (FLUME-2620) File channel throws NullPointerException if a header value is null

2016-05-05 Thread Jarek Jarcec Cecho (JIRA)

[ 
https://issues.apache.org/jira/browse/FLUME-2620?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15272559#comment-15272559
 ] 

Jarek Jarcec Cecho commented on FLUME-2620:
---

I have two comments to the [latest 
patch|https://issues.apache.org/jira/secure/attachment/12802350/FLUME-2620.patch]:

* Our code style is to put opening bracket next to the previous statement. E.g. 
{{if(headerVal.getValue()==null) {}} on one single line rather then two. Can 
you please fix the method {{checkAndReplaceNullInHeaders}} to look like that?
* Can we introduce a configuration option that will allow user to configure the 
default "replacement" value for {{null}}? I'm concerned that someone might need 
to distinguish the case of value being {{null}} or empty.

> File channel throws NullPointerException if a header value is null
> --
>
> Key: FLUME-2620
> URL: https://issues.apache.org/jira/browse/FLUME-2620
> Project: Flume
>  Issue Type: Bug
>  Components: File Channel
>Reporter: Santiago M. Mola
>Assignee: Neerja Khattar
> Attachments: FLUME-2620.patch, FLUME-2620.patch
>
>
> File channel throws NullPointerException if a header value is null.
> If this is intended, it should be reported correctly in the logs.
> Sample trace:
> org.apache.flume.ChannelException: Unable to put batch on required channel: 
> FileChannel chan { dataDirs: [/var/lib/ingestion-csv/chan/data] }
>   at 
> org.apache.flume.channel.ChannelProcessor.processEventBatch(ChannelProcessor.java:200)
>   at 
> org.apache.flume.source.SpoolDirectorySource$SpoolDirectoryRunnable.run(SpoolDirectorySource.java:236)
>   at 
> java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
>   at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:304)
>   at 
> java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:178)
>   at 
> java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293)
>   at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>   at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>   at java.lang.Thread.run(Thread.java:745)
> Caused by: java.lang.NullPointerException
>   at 
> org.apache.flume.channel.file.proto.ProtosFactory$FlumeEventHeader$Builder.setValue(ProtosFactory.java:7415)
>   at org.apache.flume.channel.file.Put.writeProtos(Put.java:85)
>   at 
> org.apache.flume.channel.file.TransactionEventRecord.toByteBuffer(TransactionEventRecord.java:174)
>   at org.apache.flume.channel.file.Log.put(Log.java:622)
>   at 
> org.apache.flume.channel.file.FileChannel$FileBackedTransaction.doPut(FileChannel.java:469)
>   at 
> org.apache.flume.channel.BasicTransactionSemantics.put(BasicTransactionSemantics.java:93)
>   at 
> org.apache.flume.channel.BasicChannelSemantics.put(BasicChannelSemantics.java:80)
>   at 
> org.apache.flume.channel.ChannelProcessor.processEventBatch(ChannelProcessor.java:189)



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (FLUME-2898) Kafka source test will freeze on JDK8

2016-03-29 Thread Jarek Jarcec Cecho (JIRA)
Jarek Jarcec Cecho created FLUME-2898:
-

 Summary: Kafka source test will freeze on JDK8
 Key: FLUME-2898
 URL: https://issues.apache.org/jira/browse/FLUME-2898
 Project: Flume
  Issue Type: Bug
Reporter: Jarek Jarcec Cecho


After committing FLUME-2821, I've noticed that the test case doesn't work 
properly on JDK8. It gets stuck:

{code}
"main" #1 prio=5 os_prio=31 tid=0x7fa662000800 nid=0x1703 runnable 
[0x70217000]
   java.lang.Thread.State: RUNNABLE
at sun.nio.ch.KQueueArrayWrapper.kevent0(Native Method)
at sun.nio.ch.KQueueArrayWrapper.poll(KQueueArrayWrapper.java:198)
at sun.nio.ch.KQueueSelectorImpl.doSelect(KQueueSelectorImpl.java:103)
at sun.nio.ch.SelectorImpl.lockAndDoSelect(SelectorImpl.java:86)
- locked <0x00076bc0dd78> (a sun.nio.ch.Util$2)
- locked <0x00076bc0dd68> (a java.util.Collections$UnmodifiableSet)
- locked <0x00076bc0dc48> (a sun.nio.ch.KQueueSelectorImpl)
at sun.nio.ch.SelectorImpl.select(SelectorImpl.java:97)
at org.apache.kafka.common.network.Selector.select(Selector.java:425)
at org.apache.kafka.common.network.Selector.poll(Selector.java:254)
at org.apache.kafka.clients.NetworkClient.poll(NetworkClient.java:256)
at 
org.apache.kafka.clients.consumer.internals.ConsumerNetworkClient.clientPoll(ConsumerNetworkClient.java:320)
at 
org.apache.kafka.clients.consumer.internals.ConsumerNetworkClient.poll(ConsumerNetworkClient.java:213)
at 
org.apache.kafka.clients.consumer.internals.ConsumerNetworkClient.poll(ConsumerNetworkClient.java:193)
at 
org.apache.kafka.clients.consumer.internals.ConsumerNetworkClient.awaitMetadataUpdate(ConsumerNetworkClient.java:134)
at 
org.apache.kafka.clients.consumer.internals.AbstractCoordinator.ensureCoordinatorKnown(AbstractCoordinator.java:184)
at 
org.apache.kafka.clients.consumer.KafkaConsumer.pollOnce(KafkaConsumer.java:886)
at 
org.apache.kafka.clients.consumer.KafkaConsumer.poll(KafkaConsumer.java:853)
at 
org.apache.flume.source.kafka.KafkaSource.doStart(KafkaSource.java:352)
at 
org.apache.flume.source.BasicSourceSemantics.start(BasicSourceSemantics.java:83)
- locked <0x0006c81d1698> (a 
org.apache.flume.source.kafka.KafkaSource)
at 
org.apache.flume.source.kafka.TestKafkaSource.testOffsets(TestKafkaSource.java:97)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:483)
at 
org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:45)
at 
org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:15)
at 
org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:42)
at 
org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:20)
at 
org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:28)
at 
org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:30)
at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:263)
at 
org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:68)
at 
org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:47)
at org.junit.runners.ParentRunner$3.run(ParentRunner.java:231)
at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:60)
at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:229)
at org.junit.runners.ParentRunner.access$000(ParentRunner.java:50)
at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:222)
at org.junit.runners.ParentRunner.run(ParentRunner.java:300)
at 
org.apache.maven.surefire.junit4.JUnit4Provider.execute(JUnit4Provider.java:252)
at 
org.apache.maven.surefire.junit4.JUnit4Provider.executeTestSet(JUnit4Provider.java:141)
at 
org.apache.maven.surefire.junit4.JUnit4Provider.invoke(JUnit4Provider.java:112)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:483)
at 
org.apache.maven.surefire.util.ReflectionUtils.invokeMethodWithArray(ReflectionUtils.java:189)
at 
org.apache.maven.surefire.booter.ProviderFactory$ProviderPro

[jira] [Resolved] (FLUME-2820) Support New Kafka APIs

2016-03-29 Thread Jarek Jarcec Cecho (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLUME-2820?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jarek Jarcec Cecho resolved FLUME-2820.
---
   Resolution: Fixed
Fix Version/s: v1.7.0

I'm resolving the umbrella JIRA as all subtasks has been committed.

> Support New Kafka APIs
> --
>
> Key: FLUME-2820
> URL: https://issues.apache.org/jira/browse/FLUME-2820
> Project: Flume
>  Issue Type: Improvement
>Reporter: Jeff Holoman
>Assignee: Jeff Holoman
> Fix For: v1.7.0
>
>
> Flume utilizes the older Kafka producer and consumer APIs. We should add 
> implementations of the Source, Channel and Sink that utilize the new APIs.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Resolved] (FLUME-2823) Flume-Kafka-Channel with new APIs

2016-03-29 Thread Jarek Jarcec Cecho (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLUME-2823?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jarek Jarcec Cecho resolved FLUME-2823.
---
   Resolution: Fixed
Fix Version/s: v1.7.0

Thank you for your contribution [~jholoman]!

> Flume-Kafka-Channel with new APIs
> -
>
> Key: FLUME-2823
> URL: https://issues.apache.org/jira/browse/FLUME-2823
> Project: Flume
>  Issue Type: Sub-task
>Reporter: Jeff Holoman
>Assignee: Jeff Holoman
> Fix For: v1.7.0
>
> Attachments: FLUME-2823v4.patch, FLUME-2823v6.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Resolved] (FLUME-2822) Flume-Kafka-Sink with new Producer

2016-03-29 Thread Jarek Jarcec Cecho (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLUME-2822?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jarek Jarcec Cecho resolved FLUME-2822.
---
   Resolution: Fixed
Fix Version/s: v1.7.0

Thank you for your contribution [~jholoman]!

> Flume-Kafka-Sink with new Producer
> --
>
> Key: FLUME-2822
> URL: https://issues.apache.org/jira/browse/FLUME-2822
> Project: Flume
>  Issue Type: Sub-task
>Reporter: Jeff Holoman
>Assignee: Jeff Holoman
> Fix For: v1.7.0
>
> Attachments: FLUME-2822.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (FLUME-2897) AsyncHBase sink NPE when Channel.getTransaction() fails

2016-03-29 Thread Jarek Jarcec Cecho (JIRA)

[ 
https://issues.apache.org/jira/browse/FLUME-2897?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15216136#comment-15216136
 ] 

Jarek Jarcec Cecho commented on FLUME-2897:
---

Thanks [~mpercy]. I've quickly looked into other Sinks to see how they are 
handling this and both 
[AvroSink|https://github.com/apache/flume/blob/trunk/flume-ng-core/src/main/java/org/apache/flume/sink/AbstractRpcSink.java#L336]
 and 
[HdfsSink|https://github.com/apache/flume/blob/trunk/flume-ng-sinks/flume-hdfs-sink/src/main/java/org/apache/flume/sink/hdfs/HDFSEventSink.java#L368]
 are using the idiom that you've mentioned. Some of the newer sinks 
([KafkaSink|https://github.com/apache/flume/blob/trunk/flume-ng-sinks/flume-ng-kafka-sink/src/main/java/org/apache/flume/sink/kafka/KafkaSink.java#L83])
 are getting the transaction inside the {{try-catch}}, but then they are 
properly guarding the transaction object against being {{null}}. This seems a 
bit simpler, so I'm +1.

> AsyncHBase sink NPE when Channel.getTransaction() fails
> ---
>
> Key: FLUME-2897
> URL: https://issues.apache.org/jira/browse/FLUME-2897
> Project: Flume
>  Issue Type: Bug
>  Components: Sinks+Sources
>Affects Versions: v1.6.0
>Reporter: Mike Percy
>Assignee: Mike Percy
> Attachments: FLUME-2897-1.patch
>
>
> There is a possibility for a NPE in the AsyncHBaseSink when a channel 
> getTransaction() call fails. This is possible when the FileChannel is out of 
> disk space.
> Patch forthcoming.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (FLUME-2821) Flume-Kafka Source with new Consumer

2016-03-28 Thread Jarek Jarcec Cecho (JIRA)

[ 
https://issues.apache.org/jira/browse/FLUME-2821?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15214994#comment-15214994
 ] 

Jarek Jarcec Cecho commented on FLUME-2821:
---

Can you please upload the latest patch from the review board to this JIRA as 
well [~groghkov]? I need that because by attaching the patch here, you're 
implicitly giving ASF rights to include it (which is not the case on review 
board). After that I'll go ahead and commit it as discussed on FLUME-2820.

> Flume-Kafka Source with new Consumer
> 
>
> Key: FLUME-2821
> URL: https://issues.apache.org/jira/browse/FLUME-2821
> Project: Flume
>  Issue Type: Sub-task
>Reporter: Jeff Holoman
>Assignee: Grigoriy Rozhkov
> Attachments: FLUME-2821.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (FLUME-2820) Support New Kafka APIs

2016-03-23 Thread Jarek Jarcec Cecho (JIRA)

[ 
https://issues.apache.org/jira/browse/FLUME-2820?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15209030#comment-15209030
 ] 

Jarek Jarcec Cecho commented on FLUME-2820:
---

Updated channel and sink got several +1s, so I believe that they are ready to 
be committed. I've asked [~singhashish] from Kafka community to take a look on 
the source patch. Once that will be done I'll go ahead and commit all those 
patches at once. As they obviously overlap a bit (overriding root {{pom.xml}} 
the same way), I'll remove the overlap myself and commit that change.

> Support New Kafka APIs
> --
>
> Key: FLUME-2820
> URL: https://issues.apache.org/jira/browse/FLUME-2820
> Project: Flume
>  Issue Type: Improvement
>Reporter: Jeff Holoman
>Assignee: Jeff Holoman
>
> Flume utilizes the older Kafka producer and consumer APIs. We should add 
> implementations of the Source, Channel and Sink that utilize the new APIs.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (FLUME-2891) Revert FLUME-2712 and FLUME-2886

2016-03-09 Thread Jarek Jarcec Cecho (JIRA)

[ 
https://issues.apache.org/jira/browse/FLUME-2891?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15187678#comment-15187678
 ] 

Jarek Jarcec Cecho commented on FLUME-2891:
---

+1, seems reasonable to me

> Revert FLUME-2712 and FLUME-2886
> 
>
> Key: FLUME-2891
> URL: https://issues.apache.org/jira/browse/FLUME-2891
> Project: Flume
>  Issue Type: Bug
>Reporter: Hari Shreedharan
>Assignee: Hari Shreedharan
> Attachments: FLUME-2891.patch
>
>
> FLUME-2712 can be fixed by simply setting keep-alive to 0. I think it added 
> additional complexity, which we can probably avoid. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Resolved] (FLUME-2886) Optional Channels can cause OOMs

2016-02-23 Thread Jarek Jarcec Cecho (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLUME-2886?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jarek Jarcec Cecho resolved FLUME-2886.
---
   Resolution: Fixed
Fix Version/s: v1.7.0

Thank you for your contribution [~hshreedharan]!

> Optional Channels can cause OOMs
> 
>
> Key: FLUME-2886
> URL: https://issues.apache.org/jira/browse/FLUME-2886
> Project: Flume
>  Issue Type: Bug
>Reporter: Hari Shreedharan
>Assignee: Hari Shreedharan
> Fix For: v1.7.0
>
> Attachments: FLUME-2886.patch
>
>
> If an optional channel is full, the queue backing the executor that is 
> asynchronously submitting the events to the channel can grow indefinitely in 
> size leading to a huge number of events on the heap and causing OOMs.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (FLUME-2886) Optional Channels can cause OOMs

2016-02-23 Thread Jarek Jarcec Cecho (JIRA)

[ 
https://issues.apache.org/jira/browse/FLUME-2886?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15159118#comment-15159118
 ] 

Jarek Jarcec Cecho commented on FLUME-2886:
---

+1

> Optional Channels can cause OOMs
> 
>
> Key: FLUME-2886
> URL: https://issues.apache.org/jira/browse/FLUME-2886
> Project: Flume
>  Issue Type: Bug
>Reporter: Hari Shreedharan
>Assignee: Hari Shreedharan
> Attachments: FLUME-2886.patch
>
>
> If an optional channel is full, the queue backing the executor that is 
> asynchronously submitting the events to the channel can grow indefinitely in 
> size leading to a huge number of events on the heap and causing OOMs.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (FLUME-2712) Optional channel errors slows down the Source to Main channel event rate

2016-01-28 Thread Jarek Jarcec Cecho (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLUME-2712?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jarek Jarcec Cecho updated FLUME-2712:
--
Fix Version/s: v1.7.0

> Optional channel errors slows down the Source to Main channel event rate
> 
>
> Key: FLUME-2712
> URL: https://issues.apache.org/jira/browse/FLUME-2712
> Project: Flume
>  Issue Type: Bug
>Reporter: Johny Rufus
>Assignee: Johny Rufus
> Fix For: v1.7.0
>
> Attachments: FLUME-2712-1.patch, FLUME-2712-2.patch, 
> FLUME-2712.patch, FLUME-2712.patch
>
>
> When we have a source configured to deliver events to a main channel and an 
> optional channel, and if the delivery to optional channel fails, this 
> significantly slows down the rate at which events are delivered to the main 
> channel by the source.
> We need to evaluate async means of delivering events to the optional channel 
> and isolate the errors happening in optional channel from the delivery to the 
> main channel



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Assigned] (FLUME-2718) HTTP Source to support generic Stream Handler

2016-01-15 Thread Jarek Jarcec Cecho (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLUME-2718?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jarek Jarcec Cecho reassigned FLUME-2718:
-

Assignee: Hari

The patch is in, thank you for your contribution Hari!

> HTTP Source to support generic Stream Handler
> -
>
> Key: FLUME-2718
> URL: https://issues.apache.org/jira/browse/FLUME-2718
> Project: Flume
>  Issue Type: Improvement
>  Components: Sinks+Sources
>Reporter: Hari
>Assignee: Hari
> Attachments: 
> 0001-FLUME-2718-HTTP-Source-to-support-generic-Stream-Han.patch, 
> 0002-FLUME-2718-HTTP-Source-to-support-generic-Stream-Han.patch
>
>
> Currently the HTTP Source supports JSONHandler as the default implementation. 
> A more generic approach will be having a BLOBHandler which accepts any 
> request input stream (that loads the stream as Event payload). Furthermore, 
> this Handler lets you define mandatory request parameters and maps those 
> parameters into Event Headers. 
> This way HTTPSource can be used as a generic Data Ingress endpoint for any 
> sink, where one can specify attributes run like basepath, filename & 
> timestamp as request parameters and access those values via HEADER values in 
> sink properties.
> All this can be done without developing any custom Handler code.
> For e.g.
> With the below agent configuration, you can send any type of data 
> (JSON/CSV/TSV) and store it in any sink, HDFS in this case. 
> {code:title=sample command|borderStyle=solid}
> curl -v -X POST 
> "http://testHost:8080/?basepath=/data/=test.json=1434101498275;
>  --data @test.json
> {code}
> {code:title=HDFS data path |borderStyle=solid}
> /data/2015/06/12/test.json.1434101498275.lzo
> {code}
> {code:title=agent.conf|borderStyle=solid}
> #Agent configuration
> #HTTP Source configuration
> agent.sources = httpSrc
> agent.channels = memChannel
> agent.sources.httpSrc.type = http
> agent.sources.httpSrc.channels = memChannel
> agent.sources.httpSrc.bind = testHost
> agent.sources.httpSrc.port = 8080
> agent.sources.httpSrc.handler = org.apache.flume.source.http.BLOBHandler
> agent.sources.httpSrc.handler.mandatoryParameters = basepath, filename
> #Memory channel with default configuration
> agent.channels.memChannel.type = memory
> agent.channels.memChannel.capacity = 10
> agent.channels.memChannel.transactionCapacity = 1000
> #HDFS Sink configuration
> agent.sinks.hdfsSink.type = hdfs
> agent.sinks.hdfsSink.hdfs.path = %{basepath}/%Y/%m/%d
> agent.sinks.hdfsSink.hdfs.useLocalTimeStamp = true
> agent.sinks.hdfsSink.hdfs.filePrefix = %{filename}
> agent.sinks.hdfsSink.hdfs.fileType = CompressedStream
> agent.sinks.hdfsSink.hdfs.codeC = lzop
> agent.sinks.hdfsSink.channel = memChannel
> #Finally, activate.
> agent.channels = memChannel
> agent.sources = httpSrc
> agent.sinks = hdfsSink
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Resolved] (FLUME-2718) HTTP Source to support generic Stream Handler

2016-01-15 Thread Jarek Jarcec Cecho (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLUME-2718?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jarek Jarcec Cecho resolved FLUME-2718.
---
   Resolution: Fixed
Fix Version/s: v1.7.0

> HTTP Source to support generic Stream Handler
> -
>
> Key: FLUME-2718
> URL: https://issues.apache.org/jira/browse/FLUME-2718
> Project: Flume
>  Issue Type: Improvement
>  Components: Sinks+Sources
>Reporter: Hari
>Assignee: Hari
> Fix For: v1.7.0
>
> Attachments: 
> 0001-FLUME-2718-HTTP-Source-to-support-generic-Stream-Han.patch, 
> 0002-FLUME-2718-HTTP-Source-to-support-generic-Stream-Han.patch
>
>
> Currently the HTTP Source supports JSONHandler as the default implementation. 
> A more generic approach will be having a BLOBHandler which accepts any 
> request input stream (that loads the stream as Event payload). Furthermore, 
> this Handler lets you define mandatory request parameters and maps those 
> parameters into Event Headers. 
> This way HTTPSource can be used as a generic Data Ingress endpoint for any 
> sink, where one can specify attributes run like basepath, filename & 
> timestamp as request parameters and access those values via HEADER values in 
> sink properties.
> All this can be done without developing any custom Handler code.
> For e.g.
> With the below agent configuration, you can send any type of data 
> (JSON/CSV/TSV) and store it in any sink, HDFS in this case. 
> {code:title=sample command|borderStyle=solid}
> curl -v -X POST 
> "http://testHost:8080/?basepath=/data/=test.json=1434101498275;
>  --data @test.json
> {code}
> {code:title=HDFS data path |borderStyle=solid}
> /data/2015/06/12/test.json.1434101498275.lzo
> {code}
> {code:title=agent.conf|borderStyle=solid}
> #Agent configuration
> #HTTP Source configuration
> agent.sources = httpSrc
> agent.channels = memChannel
> agent.sources.httpSrc.type = http
> agent.sources.httpSrc.channels = memChannel
> agent.sources.httpSrc.bind = testHost
> agent.sources.httpSrc.port = 8080
> agent.sources.httpSrc.handler = org.apache.flume.source.http.BLOBHandler
> agent.sources.httpSrc.handler.mandatoryParameters = basepath, filename
> #Memory channel with default configuration
> agent.channels.memChannel.type = memory
> agent.channels.memChannel.capacity = 10
> agent.channels.memChannel.transactionCapacity = 1000
> #HDFS Sink configuration
> agent.sinks.hdfsSink.type = hdfs
> agent.sinks.hdfsSink.hdfs.path = %{basepath}/%Y/%m/%d
> agent.sinks.hdfsSink.hdfs.useLocalTimeStamp = true
> agent.sinks.hdfsSink.hdfs.filePrefix = %{filename}
> agent.sinks.hdfsSink.hdfs.fileType = CompressedStream
> agent.sinks.hdfsSink.hdfs.codeC = lzop
> agent.sinks.hdfsSink.channel = memChannel
> #Finally, activate.
> agent.channels = memChannel
> agent.sources = httpSrc
> agent.sinks = hdfsSink
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (FLUME-2718) HTTP Source to support generic Stream Handler

2016-01-14 Thread Jarek Jarcec Cecho (JIRA)

[ 
https://issues.apache.org/jira/browse/FLUME-2718?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15098323#comment-15098323
 ] 

Jarek Jarcec Cecho commented on FLUME-2718:
---

I left few comments on the review board.

> HTTP Source to support generic Stream Handler
> -
>
> Key: FLUME-2718
> URL: https://issues.apache.org/jira/browse/FLUME-2718
> Project: Flume
>  Issue Type: Improvement
>  Components: Sinks+Sources
>Reporter: Hari
> Attachments: 
> 0001-FLUME-2718-HTTP-Source-to-support-generic-Stream-Han.patch
>
>
> Currently the HTTP Source supports JSONHandler as the default implementation. 
> A more generic approach will be having a BLOBHandler which accepts any 
> request input stream (that loads the stream as Event payload). Furthermore, 
> this Handler lets you define mandatory request parameters and maps those 
> parameters into Event Headers. 
> This way HTTPSource can be used as a generic Data Ingress endpoint for any 
> sink, where one can specify attributes run like basepath, filename & 
> timestamp as request parameters and access those values via HEADER values in 
> sink properties.
> All this can be done without developing any custom Handler code.
> For e.g.
> With the below agent configuration, you can send any type of data 
> (JSON/CSV/TSV) and store it in any sink, HDFS in this case. 
> {code:title=sample command|borderStyle=solid}
> curl -v -X POST 
> "http://testHost:8080/?basepath=/data/=test.json=1434101498275;
>  --data @test.json
> {code}
> {code:title=HDFS data path |borderStyle=solid}
> /data/2015/06/12/test.json.1434101498275.lzo
> {code}
> {code:title=agent.conf|borderStyle=solid}
> #Agent configuration
> #HTTP Source configuration
> agent.sources = httpSrc
> agent.channels = memChannel
> agent.sources.httpSrc.type = http
> agent.sources.httpSrc.channels = memChannel
> agent.sources.httpSrc.bind = testHost
> agent.sources.httpSrc.port = 8080
> agent.sources.httpSrc.handler = org.apache.flume.source.http.BLOBHandler
> agent.sources.httpSrc.handler.mandatoryParameters = basepath, filename
> #Memory channel with default configuration
> agent.channels.memChannel.type = memory
> agent.channels.memChannel.capacity = 10
> agent.channels.memChannel.transactionCapacity = 1000
> #HDFS Sink configuration
> agent.sinks.hdfsSink.type = hdfs
> agent.sinks.hdfsSink.hdfs.path = %{basepath}/%Y/%m/%d
> agent.sinks.hdfsSink.hdfs.useLocalTimeStamp = true
> agent.sinks.hdfsSink.hdfs.filePrefix = %{filename}
> agent.sinks.hdfsSink.hdfs.fileType = CompressedStream
> agent.sinks.hdfsSink.hdfs.codeC = lzop
> agent.sinks.hdfsSink.channel = memChannel
> #Finally, activate.
> agent.channels = memChannel
> agent.sources = httpSrc
> agent.sinks = hdfsSink
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (FLUME-2703) HDFS sink: Ability to exclude time counter in fileName via sink configuration

2016-01-13 Thread Jarek Jarcec Cecho (JIRA)

[ 
https://issues.apache.org/jira/browse/FLUME-2703?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15097170#comment-15097170
 ] 

Jarek Jarcec Cecho commented on FLUME-2703:
---

I'm a bit concerned about introducing this functionality to flume - flume has 
been designed as a event based system and not necessary file based one. Trying 
to preserve the original filename might make it seem like we're transferring 
whole files which is not the case. Even with SpoolDirectorySource that reads 
whole files we can change order or generate duplicates at the end so the 
resulting file on HDFS might not end up having the same checksum. Also this can 
lead to a lot of issues with two independent flume agents will try to write to 
the same output file on HDFS.

> HDFS sink: Ability to exclude time counter in fileName via sink configuration 
> --
>
> Key: FLUME-2703
> URL: https://issues.apache.org/jira/browse/FLUME-2703
> Project: Flume
>  Issue Type: Improvement
>  Components: Sinks+Sources
>Affects Versions: v1.7.0
>Reporter: Hari
>Priority: Minor
> Attachments: FLUME-2703-0.patch
>
>
> HDFS sinks always append time counter to filenames which is not configurable.
> In some use cases, it is desirable to retain the original filename. 
> For e.g. While ingesting a blob using Spool directory source, it's desirable 
> to retain the original filename (basename) in HDFS.  
> This patch allows to configure a HDFS sink to override this behavior 
> retaining the backward compatible file naming convention by default i.e,
> hdfs.appendTimeCounter = false



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (FLUME-2718) HTTP Source to support generic Stream Handler

2016-01-13 Thread Jarek Jarcec Cecho (JIRA)

[ 
https://issues.apache.org/jira/browse/FLUME-2718?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15097166#comment-15097166
 ] 

Jarek Jarcec Cecho commented on FLUME-2718:
---

Thanks for the patch [~hariprasad kuppuswamy]! Would you mind uploading it to 
[review board|https://reviews.apache.org/dashboard/] for deeper review?

> HTTP Source to support generic Stream Handler
> -
>
> Key: FLUME-2718
> URL: https://issues.apache.org/jira/browse/FLUME-2718
> Project: Flume
>  Issue Type: Improvement
>  Components: Sinks+Sources
>Reporter: Hari
> Attachments: 
> 0001-FLUME-2718-HTTP-Source-to-support-generic-Stream-Han.patch
>
>
> Currently the HTTP Source supports JSONHandler as the default implementation. 
> A more generic approach will be having a BLOBHandler which accepts any 
> request input stream (that loads the stream as Event payload). Furthermore, 
> this Handler lets you define mandatory request parameters and maps those 
> parameters into Event Headers. 
> This way HTTPSource can be used as a generic Data Ingress endpoint for any 
> sink, where one can specify attributes run like basepath, filename & 
> timestamp as request parameters and access those values via HEADER values in 
> sink properties.
> All this can be done without developing any custom Handler code.
> For e.g.
> With the below agent configuration, you can send any type of data 
> (JSON/CSV/TSV) and store it in any sink, HDFS in this case. 
> {code:title=sample command|borderStyle=solid}
> curl -v -X POST 
> "http://testHost:8080/?basepath=/data/=test.json=1434101498275;
>  --data @test.json
> {code}
> {code:title=HDFS data path |borderStyle=solid}
> /data/2015/06/12/test.json.1434101498275.lzo
> {code}
> {code:title=agent.conf|borderStyle=solid}
> #Agent configuration
> #HTTP Source configuration
> agent.sources = httpSrc
> agent.channels = memChannel
> agent.sources.httpSrc.type = http
> agent.sources.httpSrc.channels = memChannel
> agent.sources.httpSrc.bind = testHost
> agent.sources.httpSrc.port = 8080
> agent.sources.httpSrc.handler = org.apache.flume.source.http.BLOBHandler
> agent.sources.httpSrc.handler.mandatoryParameters = basepath, filename
> #Memory channel with default configuration
> agent.channels.memChannel.type = memory
> agent.channels.memChannel.capacity = 10
> agent.channels.memChannel.transactionCapacity = 1000
> #HDFS Sink configuration
> agent.sinks.hdfsSink.type = hdfs
> agent.sinks.hdfsSink.hdfs.path = %{basepath}/%Y/%m/%d
> agent.sinks.hdfsSink.hdfs.useLocalTimeStamp = true
> agent.sinks.hdfsSink.hdfs.filePrefix = %{filename}
> agent.sinks.hdfsSink.hdfs.fileType = CompressedStream
> agent.sinks.hdfsSink.hdfs.codeC = lzop
> agent.sinks.hdfsSink.channel = memChannel
> #Finally, activate.
> agent.channels = memChannel
> agent.sources = httpSrc
> agent.sinks = hdfsSink
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (FLUME-2720) HDFS Sink: Autogenerate LZO indexes while creating LZO files

2016-01-13 Thread Jarek Jarcec Cecho (JIRA)

[ 
https://issues.apache.org/jira/browse/FLUME-2720?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15097155#comment-15097155
 ] 

Jarek Jarcec Cecho commented on FLUME-2720:
---

Thanks for the patch [~hariprasad kuppuswamy]. I like the idea of making it 
simpler and auto-generate the LZO indexes. Sadly the LZO libraries are shared 
under GPL license that [is not compatible with 
ASLv2|http://www.apache.org/legal/resolved.html] and therefore we can't accept 
this improvement :(

> HDFS Sink: Autogenerate LZO indexes while creating LZO files
> 
>
> Key: FLUME-2720
> URL: https://issues.apache.org/jira/browse/FLUME-2720
> Project: Flume
>  Issue Type: Improvement
>  Components: Sinks+Sources
>Affects Versions: v1.7.0
>Reporter: Hari
>Priority: Minor
> Attachments: 
> 0001-FLUME-2720-Autogenerate-LZO-indexes-while-creating-L.patch
>
>
> The LZO indexes are now generated offline using DistributedLZOIndexer. 
> It will be nice to auto generate these index files during the ingestion 
> itself.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


Re: [DISCUSSION] Flume-Kafka 0.9 Support

2015-12-22 Thread Jarek Jarcec Cecho
It’s unfortunate that in order to support the new features in Kafka 0.9 
(primarily the security additions), one have to lose support of previous 
version (0.8).

I do believe that the security additions that have been added recently are 
important enough for us to migrate to the new version of Kafka and use it for 
the next Flume release. If some people will need to continue using future Flume 
version with Kafka 0.8, they should be able to simply take 1.6.0 version of 
Kafka Channel/Source/Sink jars and use them with the new agent, so we do have a 
mitigation plan if needed.

Jarcec

> On Dec 22, 2015, at 3:26 PM, Jeff Holoman  wrote:
> 
> With the new release of Kafka I wanted to start the discussion on how best
> to handle updating Flume to be able to make use of some of the new features
> available in 0.9.
> 
> First, it is important for Flume to adopt the 0.9 Kafka Clients as the new
> Consumer / Producer API's are the only APIs that support new Security
> features put into the latest Kafka release such as SSL.
> 
> If we agree that this is important, then we need to consider how best to
> make this switch. With many projects, we could just update the jars/clients
> and move along happily, however, the Kafka compatibility story complicates
> this.
> 
> 
>   -
> 
>   Kafka promises to be backward compatible with clients
>   -
> 
>  i.e. A 0.8.x client can talk to a 0.9.x broker
>  -
> 
>   Kafka does not promise to be forward compatible (at all) from client
>   perspective:
>   -
> 
>  i.e. A 0.9.x client can not talk to a 0.8.x broker
>  -
> 
>  If it works, its is by luck and not reliable, even for old
>  functionality
>  -
> 
>  This is due to protocol changes and no way for the client to know the
>  version of Kafka it’s talking to. Hopefully KIP-35 (Retrieving protocol
>  version) will move this in the right direction.
> 
> 
> 
>   -
> 
>   Integrations that utilize Kafka 0.9.x clients will not be able to talk
>   to Kafka 0.8.x brokers at all and may get cryptic error messages when doing
>   so.
>   -
> 
>   Integrations will only be able to support one major version of Kafka at
>   a time without more complex class-loading
>   -
> 
>  Note: The kafka_2.10 artifact depends on the kafka-clients artifact
>  so you cannot have both kafka-clients & kafka_2.10 of different
> versions at
>  the same time without collision
>  -
> 
>   However older clients (0.8.x) will work when talking to 0.9.x server.
>   -
> 
>  But that is pretty much useless as the benefits of 0.9.x (security
>  features) won’t be available.
> 
> 
> Given these constraints, and after careful consideration, I propose that we
> do the following.
> 
> 1) Update the Kafka libraries to the latest 0.9/0.9+ release and update the
> Source, Sink and Channel Implementations to make use of the new Kafka
> Clients
> 2) Document that Flume no longer supports Kafka Brokers < 0.9
> 
> Given that both producer and clients will be updated, there will need to be
> changes in agent configurations to support the new clients.
> 
> This means if upgrading Flume, existing agent configurations will break. I
> don't see a clean way around this, unfortunately. This seems to be a
> situation where we break things, and document this to be the case.
> 
> 
> 
> 
> -- 
> Jeff Holoman
> Systems Engineer



[jira] [Assigned] (FLUME-2858) Add better exception message for malformed zookeeper, also add a default of 2181 if the port is not given

2015-12-18 Thread Jarek Jarcec Cecho (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLUME-2858?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jarek Jarcec Cecho reassigned FLUME-2858:
-

Assignee: Samuel David Glover  (was: Ted Malaska)

Assigning to [~samglover] per the JIRA comments.

> Add better exception message for malformed  zookeeper, also add a default of 
> 2181 if the port is not given 
> ---
>
> Key: FLUME-2858
> URL: https://issues.apache.org/jira/browse/FLUME-2858
> Project: Flume
>  Issue Type: Bug
>Reporter: Ted Malaska
>Assignee: Samuel David Glover
>Priority: Minor
> Attachments: FLUME-2858-0.patch, FLUME-2858-1.patch
>
>
> I lot time in my last set up because my zookeeperQuorum didn't have ports in 
> it.  The error message required me to go into the code to figure out.
> So this jira is to give a better exception message for this
> Also we should add a default port of 2181 if the port is not given.  This is 
> because HBase has this default port.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


Re: Potential Gerrit support in review/commit flow

2015-11-06 Thread Jarek Jarcec Cecho
I like the gerrit flow a lot. I can’t speak for the flume community, but I 
would be in favor of that :)

You might consider sending similar email to Sqoop community where I would 
support gerrit as well.

Jarcec

> On Nov 6, 2015, at 10:27 AM, Zhe Zhang  wrote:
> 
> Hi Flume contributors,
> 
> The Hadoop community is considering adding Gerrit as a review / commit
> tool. Since this will require support from the Apache Infra team, it makes
> more sense if multiple projects can benefit from the effort.
> 
> The main benefit of Gerrit over ReviewBoard is better integration with git
> and Jenkins. A Gerrit review request is created through a simple "git push"
> instead of manually creating and uploading a diff file. Conflicts detection
> and rebase can be done on the review UI as well. When the programmed commit
> criteria are met (e.g. a code review +1 and a Jenkins verification),
> committing can also be done with a single button click, or even
> automatically.
> 
> The main benefit of Gerrit over Github pull requests is the rebase workflow
> (rather than git merge), which avoids merge commits. The rebase workflow
> also enables a clear interdiff view, rather than reviewing every patch rev
> from scratch.
> 
> This also just augments instead of replacing the current review / commit
> flow. Every task will still start as a JIRA. Review comments can be made on
> both JIRA and Gerrit and will be bi-directionally mirrored. Patches can
> also be directly committed through git command line (Gerrit will recognize
> a direct commit and close the review request as long as a simple git hook
> is installed:
> https://gerrit.googlecode.com/svn/documentation/2.0/user-changeid.html).
> 
> I wonder if the Flume community would be interested in moving on this
> direction as well. Any feedback is much appreciated.
> 
> Thanks,
> Zhe Zhang



[jira] [Updated] (FLUME-2781) A Kafka Channel defined as parseAsFlumeEvent=false cannot be correctly used by a Flume source

2015-10-29 Thread Jarek Jarcec Cecho (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLUME-2781?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jarek Jarcec Cecho updated FLUME-2781:
--
Fix Version/s: v1.7.0

> A Kafka Channel defined as parseAsFlumeEvent=false cannot be correctly used 
> by a Flume source
> -
>
> Key: FLUME-2781
> URL: https://issues.apache.org/jira/browse/FLUME-2781
> Project: Flume
>  Issue Type: Improvement
>Affects Versions: v1.6.0
>Reporter: Gonzalo Herreros
>Assignee: Gonzalo Herreros
>  Labels: easyfix, patch
> Fix For: v1.7.0
>
> Attachments: FLUME-2781.patch
>
>
> When a Kafka channel is configured as parseAsFlumeEvent=false, the channel 
> will read events from the topic as text instead of serialized Avro Flume 
> events.
> This is useful so Flume can read from an existing Kafka topic, where other 
> Kafka clients publish as text.
> However, if you use a Flume source on that channel, it will still write the 
> events as Avro so it will create an inconsistency and those events will fail 
> to be read correctly.
> Also, this would allow a Flume source to write to a Kafka channel and any 
> Kafka subscriber to listen to Flume events passing through without binary 
> dependencies.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


Re: [ANNOUNCE] Change of Apache Flume PMC Chair

2015-10-22 Thread Jarek Jarcec Cecho
Congratulations Hari!

Jarcec

> On Oct 21, 2015, at 5:50 PM, Arvind Prabhakar  wrote:
> 
> Dear Flume Users and Developers,
> 
> I have had the pleasure of serving as the PMC Chair of Apache Flume since its 
> graduation three years ago. I sincerely thank you and the Flume PMC for this 
> opportunity. However, I have decided to step down from this responsibility 
> due to personal reasons.
> 
> I am very happy to announce that on the request of Flume PMC and with the 
> approval from the board of directors at The Apache Software Foundation, Hari 
> Shreedharan is hereby appointed as the new PMC Chair. I am confident that 
> Hari will do everything possible to help further grow the community and 
> adoption of Apache Flume.
> 
> Please join me in congratulating Hari on his appointment and welcoming him to 
> this role.
> 
> Regards,
> Arvind Prabhakar



[jira] [Commented] (FLUME-2734) Kafka Channel timeout property is overridden by default value

2015-09-30 Thread Jarek Jarcec Cecho (JIRA)

[ 
https://issues.apache.org/jira/browse/FLUME-2734?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14937481#comment-14937481
 ] 

Jarek Jarcec Cecho commented on FLUME-2734:
---

+1

> Kafka Channel timeout property is overridden by default value
> -
>
> Key: FLUME-2734
> URL: https://issues.apache.org/jira/browse/FLUME-2734
> Project: Flume
>  Issue Type: Bug
>Reporter: Johny Rufus
>Assignee: Johny Rufus
> Attachments: FLUME-2734.patch
>
>
> Kafka Channel timeout property which will be passed to the Kafka Consumer 
> internally, does not  work as expected, as in the code, it is overridden by 
> default value (or value specified by .timeout property which is undocumented)



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (FLUME-2751) Upgrade Derby version to 10.11.1.1

2015-09-30 Thread Jarek Jarcec Cecho (JIRA)

[ 
https://issues.apache.org/jira/browse/FLUME-2751?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14937464#comment-14937464
 ] 

Jarek Jarcec Cecho commented on FLUME-2751:
---

+1

> Upgrade Derby version to 10.11.1.1
> --
>
> Key: FLUME-2751
> URL: https://issues.apache.org/jira/browse/FLUME-2751
> Project: Flume
>  Issue Type: Bug
>Reporter: Johny Rufus
>Assignee: Johny Rufus
> Attachments: FLUME-2751.patch
>
>
> Derby has lot has bug fixes in the past releases over 3 years, we should 
> upgrade to the latest stable release to 10.11.1.1



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Resolved] (FLUME-2737) Documentation for Pollable Source config parameters introduced in FLUME-2729

2015-07-14 Thread Jarek Jarcec Cecho (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLUME-2737?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jarek Jarcec Cecho resolved FLUME-2737.
---
   Resolution: Fixed
Fix Version/s: v1.7.0

Thank you for your contribution [~malaskat]!

 Documentation for Pollable Source config parameters introduced in FLUME-2729
 

 Key: FLUME-2737
 URL: https://issues.apache.org/jira/browse/FLUME-2737
 Project: Flume
  Issue Type: Documentation
Reporter: Johny Rufus
Assignee: Ted Malaska
 Fix For: v1.7.0

 Attachments: FLUME-2737.patch, FLUME-2737.patch.1


 Documentation needs to be updated in Flume User Guide, for all Pollable 
 Sources, that would use the new config parameters introduced in FLUME-2729



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


Re: [ANNOUNCE] New Flume committer - Johny Rufus

2015-07-06 Thread Jarek Jarcec Cecho
Congratulations Johny!

Jarcec

 On Jun 19, 2015, at 1:38 PM, Hari Shreedharan hshreedha...@apache.org wrote:
 
 On behalf of the Apache Flume PMC, I am excited to welcome Johny Rufus as a 
 committer on the Apache Flume project. Johny has actively contributed several 
 patches to the Flume project, including bug fixes, authentication and other 
 new features. 
 
 Congratulations and Welcome, Johny!
 
 
 Cheers,
 Hari Shreedharan



Re: The error I meet When I compile the source of apache-flume-1.5.2-src with eclipse with m2

2015-05-31 Thread Jarek Jarcec Cecho
Hi Maojun,
those sort of questions are best asked on the flume user mailing list. You can 
find instructions how to sing up on following web page:

http://flume.apache.org/mailinglists.html

Jarcec

 On May 28, 2015, at 11:16 PM, Thanks 864157...@qq.com wrote:
 
 Dear Jaroslav Cecho,
   This is Maojun Wang,from JJWorld(Beijing) network technology 
 Co.,LTD.When I compile the source of apache-flume-1.5.2-src with eclipse with 
 m2,I meet an error as follows:
 
 1 the error I meet in my eclipse :The import com.cloudera cannot be resolved
CAC19F0B@A662E310.57046855
 
 
 2 when I use mvn install -DskipTests to install the 
 apache-flume-1.5.2-src,every thing is Ok.
 
 3 when I unpack the flume-avro-source-1.5.2.jar from 2,I find the missing 
 classes(as 1 describe)
B2B97585@A662E310.57046855
 
 4 Where is the problem? Help me Please!
 
 Thank you very much!
 Best regards
 Sincerely,Maojun Wang 
   
   



[jira] [Commented] (FLUME-1934) Spoolingdir source exception when reading multiple zero size files

2015-04-03 Thread Jarek Jarcec Cecho (JIRA)

[ 
https://issues.apache.org/jira/browse/FLUME-1934?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14394766#comment-14394766
 ] 

Jarek Jarcec Cecho commented on FLUME-1934:
---

I've granted you contributor role on JIRA [~granthenke]. You should be able 
to assign the JIRA to yourself.

 Spoolingdir source exception when reading multiple   zero size files 
 -

 Key: FLUME-1934
 URL: https://issues.apache.org/jira/browse/FLUME-1934
 Project: Flume
  Issue Type: Bug
Affects Versions: v1.3.1
 Environment: windows 7, flume 1.3.1, spooling dir source.
Reporter: andy zhou
 Attachments: FLUME-1934.patch


 move more than one files to the spool dir, and each file size is 0, then 
 flume agent will throw IllegalStateException forever, and never work 
 again,its' main cause is commited flag will not set to true.
 logs:
 08 三月 2013 08:00:14,406 ERROR [pool-5-thread-1] 
 (org.apache.flume.source.SpoolDirectorySource$SpoolDirectoryRunnable.run:148) 
  - Uncaught exception in Runnable
 java.lang.IllegalStateException: File should not roll when  commit is 
 outstanding.
   at 
 org.apache.flume.client.avro.SpoolingFileLineReader.readLines(SpoolingFileLineReader.java:164)
   at 
 org.apache.flume.source.SpoolDirectorySource$SpoolDirectoryRunnable.run(SpoolDirectorySource.java:135)
   at 
 java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:441)
   at 
 java.util.concurrent.FutureTask$Sync.innerRunAndReset(FutureTask.java:317)
   at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:150)
   at 
 java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$101(ScheduledThreadPoolExecutor.java:98)
   at 
 java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.runPeriodic(ScheduledThreadPoolExecutor.java:181)
   at 
 java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:205)
   at 
 java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
   at 
 java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
   at java.lang.Thread.run(Thread.java:619)



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


Re: Is 1.6 a good time to switch to hadoop2 ?

2015-04-01 Thread Jarek Jarcec Cecho
Sqoop2 method is very simple - we have different profiles for Hadoop 1 and 2 
and simply use classifiers to mark the appropriate hadoop “profile”. No need to 
generate different pom.xml file. Check it out here:

https://github.com/apache/sqoop/blob/branch-1.99.4/execution/mapreduce/pom.xml#L65

Jarcec

 On Apr 1, 2015, at 2:56 PM, Hari Shreedharan hshreedha...@cloudera.com 
 wrote:
 
 We should choose which ever is the easiest to do I guess (Sqoop method or 
 HBase method).
 
 
 
 
 Thanks, Hari
 
 On Wed, Apr 1, 2015 at 1:33 PM, Roshan Naik ros...@hortonworks.com
 wrote:
 
 FWIWŠ From what I recollect, Hbase did this when they started supporting
 hadoop12. Not sure if its still the same method they use..
 - their default pom had the artifacts named for hadoop1 binaries
 - they had a shell script to produce a modified pom ..with -hadoop2 suffix
 for the artifact names
 - they built and published the hadoop2 binaries using the modified pom,
 and hadoop1 binaries form the original pom
 -roshan
 On 4/1/15 1:24 PM, Hari Shreedharan hshreedha...@cloudera.com wrote:
 I know sqoop2 does it. Maybe Jarcec can help?
 
 
 
 
 Thanks, Hari
 
 On Wed, Apr 1, 2015 at 1:13 PM, Roshan Naik ros...@hortonworks.com
 wrote:
 
 To push artifacts for both hadoop2 and hadoop1 will need to name the
 hadoop2 and hadoop1 artifacts differently. Tricky to do that with one
 pom
 I think.
 -roshan
 On 4/1/15 1:05 PM, Hari Shreedharan hshreedha...@cloudera.com wrote:
 I am a +1 for doing this, since this is the last JDK 6 release. Maybe we
 should push artifacts for both Hadoop-1 and hbase-98 profiles this time
 and switch out to Hadoop 2 exclusively from 1.7?
 
 
 
 
 Thanks, Hari
 
 On Wed, Apr 1, 2015 at 1:04 PM, Roshan Naik ros...@hortonworks.com
 wrote:
 
 Don't recall if this has been discussed before. Its been sometime
 since
 hadoop2 has been out.
 Sooner or later Flume will switch to hadoop2 based builds (hbase98
 profile?) as the default. Not sure if 1.6 is the time or worth waiting
 longer.
 -roshan



30 tech skills that will get you a $110,000-plus salary

2015-03-23 Thread Jarek Jarcec Cecho
It seems that being Flume developer is a well paying job :)

Jarcec


[jira] [Resolved] (FLUME-2630) Update documentation for Thrift SRc/Sink SSL support

2015-03-14 Thread Jarek Jarcec Cecho (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLUME-2630?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jarek Jarcec Cecho resolved FLUME-2630.
---
Resolution: Fixed

Thank you for the fix [~jrufus]!

 Update documentation for Thrift SRc/Sink SSL support
 

 Key: FLUME-2630
 URL: https://issues.apache.org/jira/browse/FLUME-2630
 Project: Flume
  Issue Type: Documentation
  Components: Sinks+Sources
Affects Versions: v1.5.2
Reporter: Johny Rufus
Assignee: Johny Rufus
 Fix For: v1.6.0

 Attachments: FLUME-2630-1.patch, FLUME-2630-2.patch, 
 FLUME-2630-3.patch, FLUME-2630.patch


 Update User guide to document the SSL related input parameters for Thrift Src 
 and Thrift Sink



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (FLUME-2641) Drop Java 6 support post Flume 1.6

2015-03-12 Thread Jarek Jarcec Cecho (JIRA)

[ 
https://issues.apache.org/jira/browse/FLUME-2641?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14357161#comment-14357161
 ] 

Jarek Jarcec Cecho commented on FLUME-2641:
---

I'm +1 on dropping JKD 6 support as this version is not supported for more then 
two years. I don't mind dropping Hadoop 1 support either as Hadoop 2 is stable 
and used in all major distributions.

 Drop Java 6 support post Flume 1.6
 --

 Key: FLUME-2641
 URL: https://issues.apache.org/jira/browse/FLUME-2641
 Project: Flume
  Issue Type: Bug
Reporter: Hari Shreedharan

 We should also test Java 7 + Hadoop 1. We will not target Java 6 at all, so 
 we might have to drop Hadoop 1 support if Hadoop 1 jars won't run on Java 7



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (FLUME-2630) Update documentation for Thrift SRc/Sink SSL support

2015-03-12 Thread Jarek Jarcec Cecho (JIRA)

[ 
https://issues.apache.org/jira/browse/FLUME-2630?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14359831#comment-14359831
 ] 

Jarek Jarcec Cecho commented on FLUME-2630:
---

I still see some formatting issue [~jrufus]:

{code}
System Message: ERROR/3 
(/Users/jarcec/apache/flume/flume-ng-doc/sphinx/FlumeUserGuide.rst, line 1960)

Malformed table. Text in column margin at line offset 17.
{code}

 Update documentation for Thrift SRc/Sink SSL support
 

 Key: FLUME-2630
 URL: https://issues.apache.org/jira/browse/FLUME-2630
 Project: Flume
  Issue Type: Documentation
  Components: Sinks+Sources
Affects Versions: v1.5.2
Reporter: Johny Rufus
Assignee: Johny Rufus
 Fix For: v1.6.0

 Attachments: FLUME-2630-1.patch, FLUME-2630.patch


 Update User guide to document the SSL related input parameters for Thrift Src 
 and Thrift Sink



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (FLUME-2126) Problem in elasticsearch sink when the event body is a complex field

2015-03-11 Thread Jarek Jarcec Cecho (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLUME-2126?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jarek Jarcec Cecho updated FLUME-2126:
--
Fix Version/s: 1.6

 Problem in elasticsearch sink when the event body is a complex field
 

 Key: FLUME-2126
 URL: https://issues.apache.org/jira/browse/FLUME-2126
 Project: Flume
  Issue Type: Bug
  Components: Sinks+Sources
 Environment: 1.3.1 and 1.4
Reporter: Massimo Paladin
 Fix For: 1.6

 Attachments: FLUME-2126-0.patch


 I have found a bug in the elasticsearch sink, the problem is in the 
 {{ContentBuilderUtil.addComplexField}} method, when it does 
 {{builder.field(fieldName, tmp);}} the {{tmp}} object is taken as {{Object}} 
 with the result of being serialized with the {{toString}} method in the 
 {{XContentBuilder}}. In the end you get the object reference as content.
 The following change workaround the problem for me, the bad point is that it 
 has to parse the content twice, I guess there is a better way to solve the 
 problem but I am not an elasticsearch api expert. 
 {code}
 --- 
 a/flume-ng-sinks/flume-ng-elasticsearch-sink/src/main/java/org/apache/flume/sink/elasticsearch/ContentBuilderUtil.java
 +++ 
 b/flume-ng-sinks/flume-ng-elasticsearch-sink/src/main/java/org/apache/flume/sink/elasticsearch/ContentBuilderUtil.java
 @@ -61,7 +61,12 @@ public class ContentBuilderUtil {
parser = XContentFactory.xContent(contentType).createParser(data);
parser.nextToken();
tmp.copyCurrentStructure(parser);
 -  builder.field(fieldName, tmp);
 +
 +  // if it is a valid structure then we include it
 +  parser = XContentFactory.xContent(contentType).createParser(data);
 +  parser.nextToken();
 +  builder.field(fieldName);
 +  builder.copyCurrentStructure(parser);
  } catch (JsonParseException ex) {
// If we get an exception here the most likely cause is nested JSON 
 that
// can't be figured out in the body. At this point just push it through
 {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Resolved] (FLUME-2126) Problem in elasticsearch sink when the event body is a complex field

2015-03-11 Thread Jarek Jarcec Cecho (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLUME-2126?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jarek Jarcec Cecho resolved FLUME-2126.
---
Resolution: Fixed

I didn't get any response on the JIRA, so I'll return it back to resolved 
state. Let's take any subsequent discussion to separate JIRA.

 Problem in elasticsearch sink when the event body is a complex field
 

 Key: FLUME-2126
 URL: https://issues.apache.org/jira/browse/FLUME-2126
 Project: Flume
  Issue Type: Bug
  Components: Sinks+Sources
 Environment: 1.3.1 and 1.4
Reporter: Massimo Paladin
 Attachments: FLUME-2126-0.patch


 I have found a bug in the elasticsearch sink, the problem is in the 
 {{ContentBuilderUtil.addComplexField}} method, when it does 
 {{builder.field(fieldName, tmp);}} the {{tmp}} object is taken as {{Object}} 
 with the result of being serialized with the {{toString}} method in the 
 {{XContentBuilder}}. In the end you get the object reference as content.
 The following change workaround the problem for me, the bad point is that it 
 has to parse the content twice, I guess there is a better way to solve the 
 problem but I am not an elasticsearch api expert. 
 {code}
 --- 
 a/flume-ng-sinks/flume-ng-elasticsearch-sink/src/main/java/org/apache/flume/sink/elasticsearch/ContentBuilderUtil.java
 +++ 
 b/flume-ng-sinks/flume-ng-elasticsearch-sink/src/main/java/org/apache/flume/sink/elasticsearch/ContentBuilderUtil.java
 @@ -61,7 +61,12 @@ public class ContentBuilderUtil {
parser = XContentFactory.xContent(contentType).createParser(data);
parser.nextToken();
tmp.copyCurrentStructure(parser);
 -  builder.field(fieldName, tmp);
 +
 +  // if it is a valid structure then we include it
 +  parser = XContentFactory.xContent(contentType).createParser(data);
 +  parser.nextToken();
 +  builder.field(fieldName);
 +  builder.copyCurrentStructure(parser);
  } catch (JsonParseException ex) {
// If we get an exception here the most likely cause is nested JSON 
 that
// can't be figured out in the body. At this point just push it through
 {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (FLUME-2126) Problem in elasticsearch sink when the event body is a complex field

2015-03-06 Thread Jarek Jarcec Cecho (JIRA)

[ 
https://issues.apache.org/jira/browse/FLUME-2126?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14350602#comment-14350602
 ] 

Jarek Jarcec Cecho commented on FLUME-2126:
---

Since the patch on this JIRA has been committed and not reverted, might I 
suggest to either:

1) Resolve this JIRA and move the open questions to subsequent discussion
2) Revert the committed patch

 Problem in elasticsearch sink when the event body is a complex field
 

 Key: FLUME-2126
 URL: https://issues.apache.org/jira/browse/FLUME-2126
 Project: Flume
  Issue Type: Bug
  Components: Sinks+Sources
 Environment: 1.3.1 and 1.4
Reporter: Massimo Paladin
 Attachments: FLUME-2126-0.patch


 I have found a bug in the elasticsearch sink, the problem is in the 
 {{ContentBuilderUtil.addComplexField}} method, when it does 
 {{builder.field(fieldName, tmp);}} the {{tmp}} object is taken as {{Object}} 
 with the result of being serialized with the {{toString}} method in the 
 {{XContentBuilder}}. In the end you get the object reference as content.
 The following change workaround the problem for me, the bad point is that it 
 has to parse the content twice, I guess there is a better way to solve the 
 problem but I am not an elasticsearch api expert. 
 {code}
 --- 
 a/flume-ng-sinks/flume-ng-elasticsearch-sink/src/main/java/org/apache/flume/sink/elasticsearch/ContentBuilderUtil.java
 +++ 
 b/flume-ng-sinks/flume-ng-elasticsearch-sink/src/main/java/org/apache/flume/sink/elasticsearch/ContentBuilderUtil.java
 @@ -61,7 +61,12 @@ public class ContentBuilderUtil {
parser = XContentFactory.xContent(contentType).createParser(data);
parser.nextToken();
tmp.copyCurrentStructure(parser);
 -  builder.field(fieldName, tmp);
 +
 +  // if it is a valid structure then we include it
 +  parser = XContentFactory.xContent(contentType).createParser(data);
 +  parser.nextToken();
 +  builder.field(fieldName);
 +  builder.copyCurrentStructure(parser);
  } catch (JsonParseException ex) {
// If we get an exception here the most likely cause is nested JSON 
 that
// can't be figured out in the body. At this point just push it through
 {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (FLUME-2633) Update Kite dependency to 1.0.0

2015-02-26 Thread Jarek Jarcec Cecho (JIRA)

[ 
https://issues.apache.org/jira/browse/FLUME-2633?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14339490#comment-14339490
 ] 

Jarek Jarcec Cecho commented on FLUME-2633:
---

Since this was already committed, could you open a new JIRA for it [~whoschek]?

 Update Kite dependency to 1.0.0
 ---

 Key: FLUME-2633
 URL: https://issues.apache.org/jira/browse/FLUME-2633
 Project: Flume
  Issue Type: Bug
  Components: Sinks+Sources
Reporter: Tom White
Assignee: Tom White
 Fix For: v1.6.0

 Attachments: FLUME-2633.patch


 Update the dataset sink to use Kite 1.0.0 which has a stable API.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (FLUME-2631) End to End authentication in Flume

2015-02-25 Thread Jarek Jarcec Cecho (JIRA)

[ 
https://issues.apache.org/jira/browse/FLUME-2631?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14336646#comment-14336646
 ] 

Jarek Jarcec Cecho commented on FLUME-2631:
---

We've recently added dependency on {{hadoop-common}} on Sqoop client side 
because {{hadoop-auth}} didn't contain all required code. [~abec] have more 
context there and perhaps can help to see if we can limit the hadoop dependency 
here?

 End to End authentication in Flume 
 ---

 Key: FLUME-2631
 URL: https://issues.apache.org/jira/browse/FLUME-2631
 Project: Flume
  Issue Type: New Feature
  Components: Sinks+Sources
Reporter: Johny Rufus
Assignee: Johny Rufus
 Fix For: v1.6.0

 Attachments: FLUME-2631.patch


 1. The idea is to enable authentication primarily by using 
 SASL/GSSAPI/Kerberos with Thrift RPC. [Thrift already has support for SASL 
 api that supports kerberos, so implementing right now for Thrift. For Avro 
 RPC kerberos support, Avro needs to support SASL first for its Netty Server, 
 before we can use it in flume]
 2. Authentication will happen hop to hop[Client to source, intermediate 
 sources to sinks, final sink to destination]. 
 3. As per the initial model, the user principals won’t be carried forward. 
 The flume client[ThriftRpcClient] will authenticate itself to the KDC. All 
 the intermediate agents [Thrift Sources/Sinks] will authenticate as principal 
 ‘flume’ (typically, but this can be any valid principal that KDC can 
 autenticate) to each other and the final agent will authenticate to the 
 destination as the principal it wishes to identify to the destination



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (FLUME-2631) End to End authentication in Flume

2015-02-25 Thread Jarek Jarcec Cecho (JIRA)

[ 
https://issues.apache.org/jira/browse/FLUME-2631?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14337038#comment-14337038
 ] 

Jarek Jarcec Cecho commented on FLUME-2631:
---

Sqoop client is actually a dependency of the shell that exposes public Java 
APIs that users can use to talk to Sqoop 2 server from their Java applications 
:) I don't think that we have any active users yet, but considering how many 
requests we've seen for public Java API, I'm assuming that it will be used a 
lot.

 End to End authentication in Flume 
 ---

 Key: FLUME-2631
 URL: https://issues.apache.org/jira/browse/FLUME-2631
 Project: Flume
  Issue Type: New Feature
  Components: Sinks+Sources
Reporter: Johny Rufus
Assignee: Johny Rufus
 Fix For: v1.6.0

 Attachments: FLUME-2631.patch


 1. The idea is to enable authentication primarily by using 
 SASL/GSSAPI/Kerberos with Thrift RPC. [Thrift already has support for SASL 
 api that supports kerberos, so implementing right now for Thrift. For Avro 
 RPC kerberos support, Avro needs to support SASL first for its Netty Server, 
 before we can use it in flume]
 2. Authentication will happen hop to hop[Client to source, intermediate 
 sources to sinks, final sink to destination]. 
 3. As per the initial model, the user principals won’t be carried forward. 
 The flume client[ThriftRpcClient] will authenticate itself to the KDC. All 
 the intermediate agents [Thrift Sources/Sinks] will authenticate as principal 
 ‘flume’ (typically, but this can be any valid principal that KDC can 
 autenticate) to each other and the final agent will authenticate to the 
 destination as the principal it wishes to identify to the destination



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (FLUME-2631) End to End authentication in Flume

2015-02-25 Thread Jarek Jarcec Cecho (JIRA)

[ 
https://issues.apache.org/jira/browse/FLUME-2631?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14337151#comment-14337151
 ] 

Jarek Jarcec Cecho commented on FLUME-2631:
---

(For the record, I don't particularly like depending on Hadoop common in Sqoop 
client either, I'm just saying that we did that)

 End to End authentication in Flume 
 ---

 Key: FLUME-2631
 URL: https://issues.apache.org/jira/browse/FLUME-2631
 Project: Flume
  Issue Type: New Feature
  Components: Sinks+Sources
Reporter: Johny Rufus
Assignee: Johny Rufus
 Fix For: v1.6.0

 Attachments: FLUME-2631.patch


 1. The idea is to enable authentication primarily by using 
 SASL/GSSAPI/Kerberos with Thrift RPC. [Thrift already has support for SASL 
 api that supports kerberos, so implementing right now for Thrift. For Avro 
 RPC kerberos support, Avro needs to support SASL first for its Netty Server, 
 before we can use it in flume]
 2. Authentication will happen hop to hop[Client to source, intermediate 
 sources to sinks, final sink to destination]. 
 3. As per the initial model, the user principals won’t be carried forward. 
 The flume client[ThriftRpcClient] will authenticate itself to the KDC. All 
 the intermediate agents [Thrift Sources/Sinks] will authenticate as principal 
 ‘flume’ (typically, but this can be any valid principal that KDC can 
 autenticate) to each other and the final agent will authenticate to the 
 destination as the principal it wishes to identify to the destination



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (FLUME-2580) Sink side interceptors

2015-02-05 Thread Jarek Jarcec Cecho (JIRA)

[ 
https://issues.apache.org/jira/browse/FLUME-2580?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14307566#comment-14307566
 ] 

Jarek Jarcec Cecho commented on FLUME-2580:
---

Thanks for the document [~hshreedharan] - I agree with your assessment that it 
looks like a hack, but it seems as a reasonable solution for now. Quick 
question - are we planning to support event dropping as well?

 Sink side interceptors
 --

 Key: FLUME-2580
 URL: https://issues.apache.org/jira/browse/FLUME-2580
 Project: Flume
  Issue Type: New Feature
Reporter: Hari Shreedharan
Assignee: Hari Shreedharan
 Attachments: SinkSideInterceptors.pdf


 Currently, we only have source-side interceptors that help routing. But if we 
 use something like Kafka Channel, having a sink-side interceptor can help us 
 modify events as they come in. We could also do validation on event schemas 
 and drop them before they hit the sink, rather than have an infinite loop due 
 to such events. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Resolved] (FLUME-1207) Add sink-side decorators

2015-02-05 Thread Jarek Jarcec Cecho (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLUME-1207?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jarek Jarcec Cecho resolved FLUME-1207.
---
Resolution: Duplicate

 Add sink-side decorators
 

 Key: FLUME-1207
 URL: https://issues.apache.org/jira/browse/FLUME-1207
 Project: Flume
  Issue Type: New Feature
Reporter: Joey Echeverria
Priority: Minor

 FLUME-1157 added support for interceptors (source-side decorators) which 
 enables a number of the use cases that decorators were used for in Flume 0.xx.
 It would be nice to have a sink-side equivalent so that the same source can 
 feed multiple channels/sinks with some getting decorated event, and others 
 getting the original.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (FLUME-2443) org.apache.hadoop.fs.FSDataOutputStream.sync() is deprecated in hadoop 2.4

2015-02-03 Thread Jarek Jarcec Cecho (JIRA)

[ 
https://issues.apache.org/jira/browse/FLUME-2443?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14304044#comment-14304044
 ] 

Jarek Jarcec Cecho commented on FLUME-2443:
---

+1

 org.apache.hadoop.fs.FSDataOutputStream.sync() is deprecated in hadoop 2.4
 --

 Key: FLUME-2443
 URL: https://issues.apache.org/jira/browse/FLUME-2443
 Project: Flume
  Issue Type: Dependency upgrade
  Components: Sinks+Sources, Technical Debt
Affects Versions: v1.5.0, v1.5.0.1, v1.6.0
 Environment: Linux PPC / x86
Reporter: Corentin Baron
Assignee: Hari Shreedharan
 Fix For: v1.6.0

 Attachments: FLUME-2443.patch


 HDFS sink uses this method, which is deprecated in hadoop 2.4.1, and no 
 longer present in the current hadoop developments:
 java.lang.NoSuchMethodError: org/apache/hadoop/fs/FSDataOutputStream.sync()V
   at 
 org.apache.flume.sink.hdfs.HDFSDataStream.sync(HDFSDataStream.java:131)
   at org.apache.flume.sink.hdfs.BucketWriter$6.call(BucketWriter.java:502)
   at org.apache.flume.sink.hdfs.BucketWriter$6.call(BucketWriter.java:499)
   at 
 org.apache.flume.sink.hdfs.BucketWriter$9$1.run(BucketWriter.java:718)
   at 
 org.apache.flume.sink.hdfs.BucketWriter.runPrivileged(BucketWriter.java:183)
   at 
 org.apache.flume.sink.hdfs.BucketWriter.access$1700(BucketWriter.java:59)
   at org.apache.flume.sink.hdfs.BucketWriter$9.call(BucketWriter.java:715)
   at java.util.concurrent.FutureTask.run(FutureTask.java:273)
   at 
 java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1176)
   at 
 java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:641)
   at java.lang.Thread.run(Thread.java:853)



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (FLUME-2594) Close Async HBase Client if there are large number of consecutive timeouts

2015-01-21 Thread Jarek Jarcec Cecho (JIRA)

[ 
https://issues.apache.org/jira/browse/FLUME-2594?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14286348#comment-14286348
 ] 

Jarek Jarcec Cecho commented on FLUME-2594:
---

+1

 Close Async HBase Client if there are large number of consecutive timeouts
 --

 Key: FLUME-2594
 URL: https://issues.apache.org/jira/browse/FLUME-2594
 Project: Flume
  Issue Type: Bug
Reporter: Hari Shreedharan
Assignee: Hari Shreedharan
 Attachments: FLUME-2594.patch


 A large number of timeouts can lead to several thousands of messages being 
 buffered but never removed from the HBaseClient's buffers. This can cause a 
 major heap spike. So we should close and dereference the HBaseClient if we 
 hit many failures from the HBase side to clear up the buffers.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Resolved] (FLUME-2594) Close Async HBase Client if there are large number of consecutive timeouts

2015-01-21 Thread Jarek Jarcec Cecho (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLUME-2594?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jarek Jarcec Cecho resolved FLUME-2594.
---
   Resolution: Fixed
Fix Version/s: v1.6.0

Thank you for the contribution [~hshreedharan]!

 Close Async HBase Client if there are large number of consecutive timeouts
 --

 Key: FLUME-2594
 URL: https://issues.apache.org/jira/browse/FLUME-2594
 Project: Flume
  Issue Type: Bug
Reporter: Hari Shreedharan
Assignee: Hari Shreedharan
 Fix For: v1.6.0

 Attachments: FLUME-2594.patch


 A large number of timeouts can lead to several thousands of messages being 
 buffered but never removed from the HBaseClient's buffers. This can cause a 
 major heap spike. So we should close and dereference the HBaseClient if we 
 hit many failures from the HBase side to clear up the buffers.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (FLUME-2562) Metrics for Flafka components

2015-01-17 Thread Jarek Jarcec Cecho (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLUME-2562?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jarek Jarcec Cecho updated FLUME-2562:
--
Fix Version/s: v1.6.0

 Metrics for Flafka components
 -

 Key: FLUME-2562
 URL: https://issues.apache.org/jira/browse/FLUME-2562
 Project: Flume
  Issue Type: Improvement
Reporter: Gwen Shapira
Assignee: Gwen Shapira
 Fix For: v1.6.0

 Attachments: FLUME-2562.1.patch, FLUME-2562.2.patch


 Kafka source, sink and channel should have metrics. This will help us track 
 down possible issues or performance problems.
 Here are the metrics I came up with:
 kafka.next.time - Time spent waiting for events from Kafka (source and 
 channel)
 kafka.send.time - Time spent sending events (channel and sink) 
 kafka.commit.time - Time spent committing (source and channel)
 events.sent - Number of events sent to Kafka (sink and channel)
 events.read - Number of events read from Kafka (channel and source)
 events.rollback - Number of event rolled back (channel) or number of rollback 
 calls (sink)
 kafka.empty - Number of times backing off due to empty kafka topic (source)



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


Fwd: [ NOTICE ] Service Downtime Notification - R/W git repos

2015-01-13 Thread Jarek Jarcec Cecho
FYI: Our git repo will be down on Thursday.

Jarcec

 Begin forwarded message:
 
 Reply-To: priv...@sqoop.apache.org
 Date: January 13, 2015 at 6:48:51 AM PST
 Subject: [ NOTICE ] Service Downtime Notification - R/W git repos
 From: Tony Stevenson pct...@apache.org
 To: undisclosed-recipients:;
 
 Folks,
 
 Please note than on Thursday 15th at 20:00 UTC the Infrastructure team
 will be taking the read/write git repositories offline.  We expect
 that this migration to last about 4 hours.
 
 During the outage the service will be migrated from an old host to a
 new one.   We intend to keep the URL the same for access to the repos
 after the migration, but an alternate name is already in place in case
 DNS updates take too long.   Please be aware it might take some hours
 after the completion of the downtime for github to update and reflect
 any changes.
 
 The Infrastructure team have been trialling the new host for about a
 week now, and [touch wood] have not had any problems with it.
 
 The service is current;y available by accessing repos via:
 https://git-wip-us.apache.org
 
 If you have any questions please address them to infrastruct...@apache.org
 
 
 
 
 -- 
 Cheers,
 Tony
 
 On behalf of the Apache Infrastructure Team
 
 --
 Tony Stevenson
 
 t...@pc-tony.com
 pct...@apache.org
 
 http://www.pc-tony.com
 
 GPG - 1024D/51047D66
 --



[jira] [Assigned] (FLUME-2488) TestElasticSearchRestClient fails on Oracle JDK 8

2014-11-21 Thread Jarek Jarcec Cecho (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLUME-2488?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jarek Jarcec Cecho reassigned FLUME-2488:
-

Assignee: Johny Rufus  (was: Santiago M. Mola)

 TestElasticSearchRestClient fails on Oracle JDK 8
 -

 Key: FLUME-2488
 URL: https://issues.apache.org/jira/browse/FLUME-2488
 Project: Flume
  Issue Type: Bug
  Components: Sinks+Sources
Affects Versions: v1.5.0.1
Reporter: Santiago M. Mola
Assignee: Johny Rufus
  Labels: jdk8
 Fix For: v1.6.0

 Attachments: FLUME-2488-0.patch, FLUME-2488-1.patch


 The JSON comparison should be performed independently of the keys order:
 https://travis-ci.org/Stratio/flume/jobs/36847693#L6735
 shouldAddNewEventWithoutTTL(org.apache.flume.sink.elasticsearch.client.TestElasticSearchRestClient)
  Time elapsed: 493 sec  FAILURE!
 junit.framework.ComparisonFailure: 
 expected:{index:{_[type:bar_type,_index:foo_index]}}
 {body:test}
  but was:{index:{_[index:foo_index,_type:bar_type]}}
 {body:test}
 
 at junit.framework.Assert.assertEquals(Assert.java:85)
 at junit.framework.Assert.assertEquals(Assert.java:91)
 at 
 org.apache.flume.sink.elasticsearch.client.TestElasticSearchRestClient.shouldAddNewEventWithoutTTL(TestElasticSearchRestClient.java:105)



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


Re: SSLv2Hello is required on Java 6

2014-11-12 Thread Jarek Jarcec Cecho
Hi Hari,
I’ve just reviewed and committed FLUME-2547 and it’s subtasks. As we are 
supporting JDK6, I would be in favor of doing another quick release.

Jarcec

 On Nov 11, 2014, at 2:03 PM, Hari Shreedharan hshreedha...@cloudera.com 
 wrote:
 
 Hi,
 
 
 Right after we pushed 1.5.1 out (I have not even sent the announce email), we 
 discovered that Java 6 requires SSLv2Hello on the server side for negotiation 
 even if TLS is used (unless the client code is also changed to disable 
 SSLv2Hello). So:
 
 
 - For HTTP Source, any clients running on Java 6 would need code changes to 
 also disable SSLv2Hello to be able to send data via TLSv1.
 - For Avro Source, any clients running Flume SDK  1.5.1 on Java 6 would 
 break and requires the client application to upgrade to 1.5.1. 
 
 
 I filed FLUME-2547 to fix this.
 
 
 My question to the community here is whether we want a new release bringing 
 SSLv2Hello back or if we are willing to just document this and move forward?
 
 
 I am willing to put together an RC if required.
 
 Thanks,
 Hari



Re: [VOTE] Apache Flume 1.5.2 RC1

2014-11-12 Thread Jarek Jarcec Cecho
+1

* Verified checksums and signature files
* Verified that each jar in binary tarball is in the license
* Checked top level files (NOTICE, ...)
* Run tests

(pretty much the same email I’ve sent for 1.5.1 :))

Jarcec
 On Nov 12, 2014, at 1:15 PM, Hari Shreedharan hshreedha...@cloudera.com 
 wrote:
 
 This is the eighth release for Apache Flume as a top-level project,
 version 1.5.2. We are voting on release candidate RC1.
 
 This release fixes an incompatibility with Java 6 based clients found
 in Apache Flume 1.5.1 Release.
 
 It fixes the following
 issues:https://git-wip-us.apache.org/repos/asf?p=flume.git;a=blob;f=CHANGELOG;h=cc7321361d0b702ba870de20d6a3d2106987186a;hb=229442aa6835ee0faa17e3034bcab42754c460f5
 
 *** Please cast your vote within the next 72 hours ***
 
 The tarball (*.tar.gz), signature (*.asc), and checksums (*.md5, *.sha1)
 for the source and binary artifacts can be found here:
  *https://people.apache.org/~hshreedharan/apache-flume-1.5.2-rc1/
 https://people.apache.org/~hshreedharan/apache-flume-1.5.2-rc1/*
 
 Maven staging repo:
  *https://repository.apache.org/content/repositories/orgapacheflume-1008/
 https://repository.apache.org/content/repositories/orgapacheflume-1008/*
 
 The tag to be voted on:
  
 *https://git-wip-us.apache.org/repos/asf?p=flume.git;a=commit;h=229442aa6835ee0faa17e3034bcab42754c460f5
 https://git-wip-us.apache.org/repos/asf?p=flume.git;a=commit;h=229442aa6835ee0faa17e3034bcab42754c460f5*
 
 Flume's KEYS file containing PGP keys we use to sign the release:
  http://www.apache.org/dist/flume/KEYS
 
 
 Thanks,
 Hari



Re: [VOTE] Release Apache Flume 1.5.1 RC1

2014-11-10 Thread Jarek Jarcec Cecho
+1

* Verified checksums and signature files
* Verified that each jar in binary tarball is in the license
* Checked top level files (NOTICE, ...)
* Run tests

Jarcec

 On Nov 6, 2014, at 3:17 PM, Hari Shreedharan hshreedha...@cloudera.com 
 wrote:
 
 This is the seventh release for Apache Flume as a top-level project,
 version 1.5.1. We are voting on release candidate RC1.
 
 It fixes the following issues:
 https://git-wip-us.apache.org/repos/asf?p=flume.git;a=blob_plain;f=CHANGELOG;hb=c74804226bcee59823c0cbc09cdf803a3d9e6920
 
 *** Please cast your vote within the next 72 hours ***
 
 The tarball (*.tar.gz), signature (*.asc), and checksums (*.md5, *.sha1)
 for the source and binary artifacts can be found here:
  https://people.apache.org/~hshreedharan/apache-flume-1.5.1-rc1/
 
 Maven staging repo:
  https://repository.apache.org/content/repositories/orgapacheflume-1006/
 
 The tag to be voted on:
  
 https://git-wip-us.apache.org/repos/asf?p=flume.git;a=commit;h=c74804226bcee59823c0cbc09cdf803a3d9e6920
 
 Flume's KEYS file containing PGP keys we use to sign the release:
  http://www.apache.org/dist/flume/KEYS
 
 Thanks,
 Hari



[jira] [Commented] (FLUME-2505) Test added in FLUME-2502 is flaky

2014-11-08 Thread Jarek Jarcec Cecho (JIRA)

[ 
https://issues.apache.org/jira/browse/FLUME-2505?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14203574#comment-14203574
 ] 

Jarek Jarcec Cecho commented on FLUME-2505:
---

+1

 Test added in FLUME-2502 is flaky
 -

 Key: FLUME-2505
 URL: https://issues.apache.org/jira/browse/FLUME-2505
 Project: Flume
  Issue Type: Bug
Reporter: Hari Shreedharan
Assignee: Hari Shreedharan
 Attachments: FLUME-2505.patch, FLUME-2505.patch


 I added a test to Prateek's patch - which is flaky on Jenkins (not locally) - 
 probably due to slower machines. I think we should make the test a bit more 
 tolerant.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


Re: Flume 1.5.1 Release

2014-11-06 Thread Jarek Jarcec Cecho
+1 go for 1.5.1 release Hari

Jarcec

 On Nov 6, 2014, at 1:39 PM, Hari Shreedharan hshreedha...@cloudera.com 
 wrote:
 
 As some of you may have noticed, I have been pushing some commits to trunk, 
 1.5 and 1.6 branches. This is primarily meant for a 1.5.1 maintenance release 
 that includes the following patches:
 
 
 FLUME-2520: HTTP Source should be able to block a prefixed set of protocols.
 
 
 FLUME-2511. Allow configuration of enabled protocols in Avro source and 
 RpcClient.
 
 
 FLUME-2441. HTTP Source Unit tests fail on IBM JDK 7
 
 
 FLUME-2533: HTTPS tests fail on Java 6
 
 
 I will try to spin up an RC today so we can vote through the weekend. Let me 
 know if you have any concerns.
 
 
 
 
 
 
 
 
 
 Thanks,
 Hari



[jira] [Commented] (FLUME-2533) HTTPS tests fail on Java 6

2014-11-05 Thread Jarek Jarcec Cecho (JIRA)

[ 
https://issues.apache.org/jira/browse/FLUME-2533?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14199490#comment-14199490
 ] 

Jarek Jarcec Cecho commented on FLUME-2533:
---

+1

 HTTPS tests fail on Java 6
 --

 Key: FLUME-2533
 URL: https://issues.apache.org/jira/browse/FLUME-2533
 Project: Flume
  Issue Type: Bug
Reporter: Hari Shreedharan
Assignee: Hari Shreedharan
 Attachments: FLUME-2533.patch






--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


Re: [ANNOUNCE] New Flume PMC Member - Roshan Naik

2014-11-04 Thread Jarek Jarcec Cecho
Congratulations Roshan!

Jarcec

 On Nov 4, 2014, at 2:12 PM, Arvind Prabhakar arv...@apache.org wrote:
 
 On behalf of Apache Flume PMC, it is my pleasure to announce that Roshan Naik 
 has been elected to the Flume Project Management Committee. Roshan has been 
 active with the project for many years and has been a committer on the 
 project since September of 2013.
 
 Please join me in congratulating Roshan and welcoming him to the Flume PMC.
 
 Regards,
 Arvind Prabhakar



[jira] [Commented] (FLUME-2500) Add a channel that uses Kafka

2014-10-16 Thread Jarek Jarcec Cecho (JIRA)

[ 
https://issues.apache.org/jira/browse/FLUME-2500?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14173787#comment-14173787
 ] 

Jarek Jarcec Cecho commented on FLUME-2500:
---

Could you also post the patch to review board [~hshreedharan]?

 Add a channel that uses Kafka 
 --

 Key: FLUME-2500
 URL: https://issues.apache.org/jira/browse/FLUME-2500
 Project: Flume
  Issue Type: Bug
Reporter: Hari Shreedharan
Assignee: Hari Shreedharan
 Attachments: FLUME-2500.patch


 Here is the rationale:
 - Kafka does give a HA channel, which means a dead agent does not affect the 
 data in the channel - thus reducing delay of delivery.
 - Kafka is used by many companies - it would be a good idea to use Flume to 
 pull data from Kafka and write it to HDFS/HBase etc. 
 This channel is not going to be useful for cases where Kafka is not already 
 used, since it brings is operational overhead of maintaining two systems, but 
 if there is Kafka in use - this is good way to integrate Kafka and Flume.
 Here is an a scratch implementation: 
 https://github.com/harishreedharan/flume/blob/kafka-channel/flume-ng-channels/flume-kafka-channel/src/main/java/org/apache/flume/channel/kafka/KafkaChannel.java



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (FLUME-2500) Add a channel that uses Kafka

2014-10-13 Thread Jarek Jarcec Cecho (JIRA)

[ 
https://issues.apache.org/jira/browse/FLUME-2500?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14169427#comment-14169427
 ] 

Jarek Jarcec Cecho commented on FLUME-2500:
---

I like the idea of Kafka channel, so I'm +1 on the idea!

 Add a channel that uses Kafka 
 --

 Key: FLUME-2500
 URL: https://issues.apache.org/jira/browse/FLUME-2500
 Project: Flume
  Issue Type: Bug
Reporter: Hari Shreedharan
Assignee: Hari Shreedharan

 Here is the rationale:
 - Kafka does give a HA channel, which means a dead agent does not affect the 
 data in the channel - thus reducing delay of delivery.
 - Kafka is used by many companies - it would be a good idea to use Flume to 
 pull data from Kafka and write it to HDFS/HBase etc. 
 This channel is not going to be useful for cases where Kafka is not already 
 used, since it brings is operational overhead of maintaining two systems, but 
 if there is Kafka in use - this is good way to integrate Kafka and Flume.
 Here is an a scratch implementation: 
 https://github.com/harishreedharan/flume/blob/kafka-channel/flume-ng-channels/flume-kafka-channel/src/main/java/org/apache/flume/channel/kafka/KafkaChannel.java



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (FLUME-2250) Add support for Kafka Source

2014-09-14 Thread Jarek Jarcec Cecho (JIRA)

[ 
https://issues.apache.org/jira/browse/FLUME-2250?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14133601#comment-14133601
 ] 

Jarek Jarcec Cecho commented on FLUME-2250:
---

Good point, thank you for raising the concern about legality [~hshreedharan]! I 
can second [~gwenshap] sentiment. Contributors are agreeing with including 
their work in ASF software by attaching their patches on JIRA - commit to a 
repository is not necessary from a legal perspective. Hence, this situation 
when one contributor took uploaded patch of another contributor to finish the 
work is completely fine and in fact has happened several times through my time 
at Apache.

 Add support for Kafka Source
 

 Key: FLUME-2250
 URL: https://issues.apache.org/jira/browse/FLUME-2250
 Project: Flume
  Issue Type: Sub-task
  Components: Sinks+Sources
Affects Versions: v1.5.0
Reporter: Ashish Paliwal
Priority: Minor
 Attachments: FLUME-2250-0.patch, FLUME-2250-1.patch, 
 FLUME-2250-2.patch, FLUME-2250.patch


 Add support for Kafka Source



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


Re: Interest in a Flume Meetup?

2014-07-30 Thread Jarek Jarcec Cecho
Yeah, I don’t even remember when we had last Flume meet up, so I’m definitely 
up for doing it.

I don’t have preference to either place or time at the moment.

Jarcec

On Jul 28, 2014, at 4:34 PM, Jonathan Natkins na...@streamsets.com wrote:

 Hi Flume folks,
 
 I've noticed that it's been quite a while since the last Flume meetup, and
 was curious if there would be interest in doing one sometime in August
 around SF or somewhere else in the bay area. Would anybody on this list be
 interested in attending if one were scheduled? Are there preferences as to
 whether or not it is in San Francisco or somewhere in the South Bay (most
 likely Palo Alto)?
 
 Feel free to respond to the list or me directly with questions or comments.
 
 Thanks!
 Natty



[jira] [Commented] (FLUME-2416) Use CodecPool in compressed stream to prevent leak of direct buffers

2014-06-26 Thread Jarek Jarcec Cecho (JIRA)

[ 
https://issues.apache.org/jira/browse/FLUME-2416?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14045493#comment-14045493
 ] 

Jarek Jarcec Cecho commented on FLUME-2416:
---

+1

 Use CodecPool in compressed stream to prevent leak of direct buffers
 

 Key: FLUME-2416
 URL: https://issues.apache.org/jira/browse/FLUME-2416
 Project: Flume
  Issue Type: Bug
Reporter: Hari Shreedharan
Assignee: Hari Shreedharan
 Attachments: FLUME-2416.patch


 Even though they may no longer be references, Java only cleans up direct 
 buffers on full gc. If there is enough heap available, a full GC is never hit 
 and these buffers are leaked. Hadoop keeps creating new compressors instead 
 of using the pools causing a leak - which is a bug in itself which is being 
 addressed by HADOOP-10591



--
This message was sent by Atlassian JIRA
(v6.2#6252)


Re: [VOTE] Apache Flume 1.5.0.1 RC1

2014-06-10 Thread Jarek Jarcec Cecho
+1.

Looks good to me.

Jarcec

On Tue, Jun 10, 2014 at 03:40:25PM -0700, Hari Shreedharan wrote:
  This is a vote for the next release of Apache Flume, version 1.5.0.1. We
 are voting on release candidate RC1.
 
 It fixes the following issues:
   http://s.apache.org/v7X
 
 *** Please cast your vote within the next 72 hours ***
 
 The tarball (*.tar.gz), signature (*.asc), and checksums (*.md5, *.sha1)
 for the source and binary artifacts can be found here:
https://people.apache.org/~hshreedharan/apache-flume-1.5.0.1-rc1/
 
 Maven staging repo:
   https://repository.apache.org/content/repositories/orgapacheflume-1004/
 
 The tag to be voted on:
 
 https://git-wip-us.apache.org/repos/asf?p=flume.git;a=commit;h=ceda6aa1126a01370641caf729d8b1dd6d80aa61
 
 Flume's KEYS file containing PGP keys we use to sign the release:
   http://www.apache.org/dist/flume/KEYS
 
 
 Thanks,
 Hari


signature.asc
Description: Digital signature


Re: Preparing For Flume 1.5.0.1

2014-06-09 Thread Jarek Jarcec Cecho
+1 thank you for working on it Hari!

Jarcec

On Mon, Jun 09, 2014 at 11:25:37AM -0700, Hari Shreedharan wrote:
 Hi all, 
 
 Flume 1.5.0 did not have a build compatible with HBase-98, which is blocking 
 the next BigTop release. I expect the new release vote to be out early this 
 week and the release in the later part of the week. This release will not 
 have any new features, and is only a build-related release to unblock BigTop 
 
 
 Thanks,
 Hari
 


signature.asc
Description: Digital signature


[jira] [Commented] (FLUME-2397) HBase-98 compatibility

2014-06-05 Thread Jarek Jarcec Cecho (JIRA)

[ 
https://issues.apache.org/jira/browse/FLUME-2397?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14018920#comment-14018920
 ] 

Jarek Jarcec Cecho commented on FLUME-2397:
---

+1

 HBase-98 compatibility
 --

 Key: FLUME-2397
 URL: https://issues.apache.org/jira/browse/FLUME-2397
 Project: Flume
  Issue Type: Bug
Affects Versions: v1.5.0
Reporter: Hari Shreedharan
Assignee: Hari Shreedharan
Priority: Blocker
 Fix For: v1.5.0.1

 Attachments: FLUME-2397.patch


 In addition to adding a new profile in the POM, it looks like the new version 
 of asynchbase uses the same executor to create a channel selector boss and 
 worker threads. We need to pass in two different single threaded executors to 
 ensure that the test does not fail.
 This is *required* for HBase 98 support



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Assigned] (FLUME-2397) HBase-98 compatibility

2014-06-05 Thread Jarek Jarcec Cecho (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLUME-2397?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jarek Jarcec Cecho reassigned FLUME-2397:
-

Assignee: Jarek Jarcec Cecho  (was: Hari Shreedharan)

 HBase-98 compatibility
 --

 Key: FLUME-2397
 URL: https://issues.apache.org/jira/browse/FLUME-2397
 Project: Flume
  Issue Type: Bug
Affects Versions: v1.5.0
Reporter: Hari Shreedharan
Assignee: Jarek Jarcec Cecho
Priority: Blocker
 Fix For: v1.5.0.1

 Attachments: FLUME-2397.patch


 In addition to adding a new profile in the POM, it looks like the new version 
 of asynchbase uses the same executor to create a channel selector boss and 
 worker threads. We need to pass in two different single threaded executors to 
 ensure that the test does not fail.
 This is *required* for HBase 98 support



--
This message was sent by Atlassian JIRA
(v6.2#6252)


Re: Flume 1.6.0 - 80% issues with patches

2014-05-28 Thread Jarek Jarcec Cecho
Hi Otis,
we've just released flume 1.5.0, so I'm wondering what is the motivation for 
1.6.0? Is there any patch you need to get in?

Jarcec

On Wed, May 28, 2014 at 05:10:25PM -0400, Otis Gospodnetic wrote:
 Btw. I had a peek at
 https://issues.apache.org/jira/browse/FLUME/fixforversion/12327047/?selectedTab=com.atlassian.jira.jira-projects-plugin:version-issues-paneland
 noticed that in version 1.6.0:
 
 * there are a total of 19 open issues
 * 80% of issues already have a patch available - nice :)
 
 Is it thus realistic to see Flume 1.6.0 towards the end of June?
 
 Thanks,
 Otis


signature.asc
Description: Digital signature


Re: Flume 1.5.0.1 branching

2014-05-28 Thread Jarek Jarcec Cecho
Hi Roshan,
I believe that we are using branches to encoding minor line (e.g. 1.4 or 1.5) 
and then for the particular release we have tags (1.5.0, 1.4.0). Hence I would 
assume that we will release from branch flume-1.5.

Jarcec

On Wed, May 28, 2014 at 01:47:51PM -0700, Roshan Naik wrote:
  Based on discussion on FLUME-1618 there appears to be an intent to release
 Flume 1.5.0.1
 
 Wondering if this release would have a branch of its own ?
 
 -roshan
 
 -- 
 CONFIDENTIALITY NOTICE
 NOTICE: This message is intended for the use of the individual or entity to 
 which it is addressed and may contain information that is confidential, 
 privileged and exempt from disclosure under applicable law. If the reader 
 of this message is not the intended recipient, you are hereby notified that 
 any printing, copying, dissemination, distribution, disclosure or 
 forwarding of this communication is strictly prohibited. If you have 
 received this communication in error, please contact the sender immediately 
 and delete it from your system. Thank You.


signature.asc
Description: Digital signature


Re: [VOTE] Apache Flume 1.5.0 RC1

2014-05-11 Thread Jarek Jarcec Cecho
+1

* Verified checksums and signature files
* Verified that each jar in binary tarball is in the license
* Notice have copyright from 2012, we should update it (not a blocker in my 
mind)
* Checked top level files (NOTICE, ...)
* Run tests

Jarcec

On Wed, May 07, 2014 at 03:28:36PM -0700, Hari Shreedharan wrote:
 This is a vote for the next release of Apache Flume, version 1.5.0. We are
 voting on release candidate RC1.
 
 It fixes the following issues:
   http://s.apache.org/4eQ
 
 *** Please cast your vote within the next 72 hours ***
 
 The tarball (*.tar.gz), signature (*.asc), and checksums (*.md5,
 *.sha1) for the source and binary artifacts can be found here:
https://people.apache.org/~hshreedharan/apache-flume-1.5.0-rc1/
 
 Maven staging repo:
   https://repository.apache.org/content/repositories/orgapacheflume-1001/
 
 
 The tag to be voted on:
 
 https://git-wip-us.apache.org/repos/asf?p=flume.git;a=commit;h=8633220df808c4cd0c13d1cf0320454a94f1ea97
 
 Flume's KEYS file containing PGP keys we use to sign the release:
   http://www.apache.org/dist/flume/KEYS
 
 
 Thanks,
 Hari


signature.asc
Description: Digital signature


[jira] [Commented] (FLUME-2381) Upgrade Hadoop version in Hadoop 2 profile to 2.4.0

2014-05-03 Thread Jarek Jarcec Cecho (JIRA)

[ 
https://issues.apache.org/jira/browse/FLUME-2381?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13988910#comment-13988910
 ] 

Jarek Jarcec Cecho commented on FLUME-2381:
---

+1

 Upgrade Hadoop version in Hadoop 2 profile to 2.4.0
 ---

 Key: FLUME-2381
 URL: https://issues.apache.org/jira/browse/FLUME-2381
 Project: Flume
  Issue Type: Bug
Affects Versions: v1.4.0
Reporter: Hari Shreedharan
Assignee: Hari Shreedharan
 Fix For: v1.5.0

 Attachments: FLUME-2381.patch






--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Resolved] (FLUME-2381) Upgrade Hadoop version in Hadoop 2 profile to 2.4.0

2014-05-03 Thread Jarek Jarcec Cecho (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLUME-2381?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jarek Jarcec Cecho resolved FLUME-2381.
---

Resolution: Fixed

Thank you [~hshreedharan]!

 Upgrade Hadoop version in Hadoop 2 profile to 2.4.0
 ---

 Key: FLUME-2381
 URL: https://issues.apache.org/jira/browse/FLUME-2381
 Project: Flume
  Issue Type: Bug
Affects Versions: v1.4.0
Reporter: Hari Shreedharan
Assignee: Hari Shreedharan
 Fix For: v1.5.0

 Attachments: FLUME-2381.patch






--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (FLUME-2347) Add FLUME_JAVA_OPTS which allows users to inject java properties from cmd line

2014-03-24 Thread Jarek Jarcec Cecho (JIRA)

[ 
https://issues.apache.org/jira/browse/FLUME-2347?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13945615#comment-13945615
 ] 

Jarek Jarcec Cecho commented on FLUME-2347:
---

+1

 Add FLUME_JAVA_OPTS which allows users to inject java properties from cmd line
 --

 Key: FLUME-2347
 URL: https://issues.apache.org/jira/browse/FLUME-2347
 Project: Flume
  Issue Type: Bug
Reporter: Brock Noland
Assignee: Brock Noland
 Attachments: FLUME-2347.patch


 In order to set java properties such as -X, -D, and -javaagent we have teh 
 following:
 * flume-ng takes -X and -D as native properties
 * JAVA_OPTS can be placed in the flume-env.sh file
 However, there is no way to set properties on the command line which do not 
 start with -X or -D.
 eg.
 env JAVA_OPTS=-javaagent flume-ng 
 Therefore I suggest we introduce FLUME_JAVA_OPTS which sets properties from 
 the env starting the flume-ng command. This will not impact users who use 
 JAVA_OPTs in the non-flume environment incompatibly with flume.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Resolved] (FLUME-2347) Add FLUME_JAVA_OPTS which allows users to inject java properties from cmd line

2014-03-24 Thread Jarek Jarcec Cecho (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLUME-2347?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jarek Jarcec Cecho resolved FLUME-2347.
---

   Resolution: Fixed
Fix Version/s: v1.5.0

Thank you for your contribution [~brocknoland]!

 Add FLUME_JAVA_OPTS which allows users to inject java properties from cmd line
 --

 Key: FLUME-2347
 URL: https://issues.apache.org/jira/browse/FLUME-2347
 Project: Flume
  Issue Type: Bug
Reporter: Brock Noland
Assignee: Brock Noland
 Fix For: v1.5.0

 Attachments: FLUME-2347.patch


 In order to set java properties such as -X, -D, and -javaagent we have teh 
 following:
 * flume-ng takes -X and -D as native properties
 * JAVA_OPTS can be placed in the flume-env.sh file
 However, there is no way to set properties on the command line which do not 
 start with -X or -D.
 eg.
 env JAVA_OPTS=-javaagent flume-ng 
 Therefore I suggest we introduce FLUME_JAVA_OPTS which sets properties from 
 the env starting the flume-ng command. This will not impact users who use 
 JAVA_OPTs in the non-flume environment incompatibly with flume.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (FLUME-2336) HBase tests that pass in ZK configs must use a new context object

2014-03-01 Thread Jarek Jarcec Cecho (JIRA)

[ 
https://issues.apache.org/jira/browse/FLUME-2336?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13917229#comment-13917229
 ] 

Jarek Jarcec Cecho commented on FLUME-2336:
---

+1

 HBase tests that pass in ZK configs must use a new context object
 -

 Key: FLUME-2336
 URL: https://issues.apache.org/jira/browse/FLUME-2336
 Project: Flume
  Issue Type: Bug
Reporter: Hari Shreedharan
Assignee: Hari Shreedharan
 Attachments: FLUME-2336.patch


 In java 7, since the test order is undefined, calling testZkIncorrectPorts 
 followed by other tests causes a series of failures.



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Updated] (FLUME-2336) HBase tests that pass in ZK configs must use a new context object

2014-03-01 Thread Jarek Jarcec Cecho (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLUME-2336?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jarek Jarcec Cecho updated FLUME-2336:
--

Fix Version/s: (was: v0.9.5)
   v1.5.0

 HBase tests that pass in ZK configs must use a new context object
 -

 Key: FLUME-2336
 URL: https://issues.apache.org/jira/browse/FLUME-2336
 Project: Flume
  Issue Type: Bug
Reporter: Hari Shreedharan
Assignee: Hari Shreedharan
 Fix For: v1.5.0

 Attachments: FLUME-2336.patch


 In java 7, since the test order is undefined, calling testZkIncorrectPorts 
 followed by other tests causes a series of failures.



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Commented] (FLUME-2324) Support writing to multiple HBase clusters using HBaseSink

2014-02-28 Thread Jarek Jarcec Cecho (JIRA)

[ 
https://issues.apache.org/jira/browse/FLUME-2324?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13916538#comment-13916538
 ] 

Jarek Jarcec Cecho commented on FLUME-2324:
---

Couple of nits:

* I've noticed that this fragment is in the code twice, perhaps we should have 
it only once?
{code}
  logger.info(Using ZK Quorum:  + zkQuorum);
{code}

* Do you think that you could update the user guide with this argument? It 
seems that we have currently documented the {{zookeeperQuorum}} only for 
AsyncHBaseSink.


 Support writing to multiple HBase clusters using HBaseSink
 --

 Key: FLUME-2324
 URL: https://issues.apache.org/jira/browse/FLUME-2324
 Project: Flume
  Issue Type: Bug
Reporter: Hari Shreedharan
Assignee: Hari Shreedharan
 Attachments: FLUME-2324.patch


 The AsyncHBaseSink can already write to multiple HBase clusters, but 
 HBaseSink cannot. 



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Commented] (FLUME-2328) FileChannel Dual Checkpoint Backup Thread not released on Application stop

2014-02-28 Thread Jarek Jarcec Cecho (JIRA)

[ 
https://issues.apache.org/jira/browse/FLUME-2328?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13916552#comment-13916552
 ] 

Jarek Jarcec Cecho commented on FLUME-2328:
---

+1

 FileChannel Dual Checkpoint Backup Thread not released on Application stop
 --

 Key: FLUME-2328
 URL: https://issues.apache.org/jira/browse/FLUME-2328
 Project: Flume
  Issue Type: Bug
  Components: File Channel
Affects Versions: v1.4.0
Reporter: Arun
Assignee: Hari Shreedharan
 Attachments: FLUME-2328.patch


 In my application wired the filechannel with dual checkpoint enabled. Even 
 after calling application.stop() i can see checkpoint backup thread is still 
 in waiting state.
 [channel=c1] - CheckpointBackUpThread prio=6 tid=0x3a069400 nid=0x8a4 
 waiting on condition [0x3b17f000]
 Since i am usign java service wrapper to run application and for stopping 
 service i am waiting all user threads to be released, service is not stopping 
 gracefully even after waiting for 5 mins.
 in code i can see checkpointBackUpExecutor is started 
 if (shouldBackup) { 
   checkpointBackUpExecutor = Executors.newSingleThreadExecutor( 
 new ThreadFactoryBuilder().setNameFormat(   getName() +  - 
 CheckpointBackUpThread).build()); 
 } else {   checkpointBackUpExecutor = null; } 
 
 there is no shutdown call for checkpointBackUpExecutor in anywhere in 
 EventQueueBackingStoreFile
  



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Commented] (FLUME-2323) Morphline sink must increment eventDrainAttemptCount when it takes event from channel

2014-02-28 Thread Jarek Jarcec Cecho (JIRA)

[ 
https://issues.apache.org/jira/browse/FLUME-2323?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13916580#comment-13916580
 ] 

Jarek Jarcec Cecho commented on FLUME-2323:
---

+1

 Morphline sink must increment eventDrainAttemptCount when it takes event from 
 channel
 -

 Key: FLUME-2323
 URL: https://issues.apache.org/jira/browse/FLUME-2323
 Project: Flume
  Issue Type: Bug
Reporter: Hari Shreedharan
Assignee: Hari Shreedharan
 Attachments: FLUME-2323.patch






--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Resolved] (FLUME-2283) Spool Dir source must check interrupt flag before writing to channel

2014-02-28 Thread Jarek Jarcec Cecho (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLUME-2283?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jarek Jarcec Cecho resolved FLUME-2283.
---

   Resolution: Fixed
Fix Version/s: v1.5.0

Your contribution is appreciated [~hshreedharan]!

 Spool Dir source must check interrupt flag before writing to channel
 

 Key: FLUME-2283
 URL: https://issues.apache.org/jira/browse/FLUME-2283
 Project: Flume
  Issue Type: Bug
Reporter: Hari Shreedharan
Assignee: Hari Shreedharan
 Fix For: v1.5.0

 Attachments: FLUME-2283.patch


 FLUME-2282 was likely caused by the Spool Dir Source continuing to attempt 
 writing the channel even though the stop method was called, because it never 
 checks the interrupt flag.



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Updated] (FLUME-2307) Remove Log writetimeout

2014-02-10 Thread Jarek Jarcec Cecho (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLUME-2307?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jarek Jarcec Cecho updated FLUME-2307:
--

Description: 
I've observed Flume failing to clean up old log data in FileChannels. The 
amount of old log data can range anywhere from tens to hundreds of GB. I was 
able to confirm that the channels were in fact empty. This behavior always 
occurs after lock timeouts when attempting to put, take, rollback, or commit to 
a FileChannel. Once the timeout occurs, Flume stops cleaning up the old files. 
I was able to confirm that the Log's writeCheckpoint method was still being 
called and successfully obtaining a lock from tryLockExclusive(), but I was not 
able to confirm removeOldLogs being called. The application log did not include 
Removing old file: log-xyz for the old files which the Log class would output 
if they were correctly being removed. I suspect the lock timeouts were due to 
high I/O load at the time.

Some stack traces:

{code}
org.apache.flume.ChannelException: Failed to obtain lock for writing to the 
log. Try increasing the log write timeout value. [channel=fileChannel]
at 
org.apache.flume.channel.file.FileChannel$FileBackedTransaction.doPut(FileChannel.java:478)
at 
org.apache.flume.channel.BasicTransactionSemantics.put(BasicTransactionSemantics.java:93)
at 
org.apache.flume.channel.BasicChannelSemantics.put(BasicChannelSemantics.java:80)
at 
org.apache.flume.channel.ChannelProcessor.processEventBatch(ChannelProcessor.java:189)

org.apache.flume.ChannelException: Failed to obtain lock for writing to the 
log. Try increasing the log write timeout value. [channel=fileChannel]
at 
org.apache.flume.channel.file.FileChannel$FileBackedTransaction.doCommit(FileChannel.java:594)
at 
org.apache.flume.channel.BasicTransactionSemantics.commit(BasicTransactionSemantics.java:151)
at 
dataxu.flume.plugins.avro.AsyncAvroSink.process(AsyncAvroSink.java:548)
at 
dataxu.flume.plugins.ClassLoaderFlumeSink.process(ClassLoaderFlumeSink.java:33)
at 
org.apache.flume.sink.DefaultSinkProcessor.process(DefaultSinkProcessor.java:68)
at org.apache.flume.SinkRunner$PollingRunner.run(SinkRunner.java:147)
at java.lang.Thread.run(Thread.java:619)

org.apache.flume.ChannelException: Failed to obtain lock for writing to the 
log. Try increasing the log write timeout value. [channel=fileChannel]
at 
org.apache.flume.channel.file.FileChannel$FileBackedTransaction.doRollback(FileChannel.java:621)
at 
org.apache.flume.channel.BasicTransactionSemantics.rollback(BasicTransactionSemantics.java:168)
at 
org.apache.flume.channel.ChannelProcessor.processEventBatch(ChannelProcessor.java:194)
at dataxu.flume.plugins.avro.AvroSource.appendBatch(AvroSource.java:209)
at sun.reflect.GeneratedMethodAccessor19.invoke(Unknown Source)
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
at java.lang.reflect.Method.invoke(Method.java:597)
at 
org.apache.avro.ipc.specific.SpecificResponder.respond(SpecificResponder.java:91)
at org.apache.avro.ipc.Responder.respond(Responder.java:151)
at 
org.apache.avro.ipc.NettyServer$NettyServerAvroHandler.messageReceived(NettyServer.java:188)
at 
org.jboss.netty.channel.SimpleChannelUpstreamHandler.handleUpstream(SimpleChannelUpstreamHandler.java:75)
at 
org.apache.avro.ipc.NettyServer$NettyServerAvroHandler.handleUpstream(NettyServer.java:173)
at 
org.jboss.netty.channel.DefaultChannelPipeline.sendUpstream(DefaultChannelPipeline.java:564)
at 
org.jboss.netty.channel.DefaultChannelPipeline$DefaultChannelHandlerContext.sendUpstream(DefaultChannelPipeline.java:792)
at 
org.jboss.netty.channel.Channels.fireMessageReceived(Channels.java:296)
at 
org.jboss.netty.handler.codec.frame.FrameDecoder.unfoldAndFireMessageReceived(FrameDecoder.java:321)
at 
org.jboss.netty.handler.codec.frame.FrameDecoder.callDecode(FrameDecoder.java:303)
at 
org.jboss.netty.handler.codec.frame.FrameDecoder.messageReceived(FrameDecoder.java:220)
at 
org.jboss.netty.channel.SimpleChannelUpstreamHandler.handleUpstream(SimpleChannelUpstreamHandler.java:75)
at 
org.jboss.netty.channel.DefaultChannelPipeline.sendUpstream(DefaultChannelPipeline.java:564)
at 
org.jboss.netty.channel.DefaultChannelPipeline.sendUpstream(DefaultChannelPipeline.java:559)
at 
org.jboss.netty.channel.Channels.fireMessageReceived(Channels.java:268)
at 
org.jboss.netty.channel.Channels.fireMessageReceived(Channels.java:255)
at org.jboss.netty.channel.socket.nio.NioWorker.read(NioWorker.java:94)
at 
org.jboss.netty.channel.socket.nio.AbstractNioWorker.processSelectedKeys(AbstractNioWorker.java:364

[jira] [Commented] (FLUME-2314) Upgrade to Mapdb 0.9.9

2014-02-09 Thread Jarek Jarcec Cecho (JIRA)

[ 
https://issues.apache.org/jira/browse/FLUME-2314?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13895931#comment-13895931
 ] 

Jarek Jarcec Cecho commented on FLUME-2314:
---

+1

 Upgrade to Mapdb 0.9.9
 --

 Key: FLUME-2314
 URL: https://issues.apache.org/jira/browse/FLUME-2314
 Project: Flume
  Issue Type: Bug
Reporter: Hari Shreedharan
Assignee: Hari Shreedharan
 Attachments: FLUME-2314.patch


 This affects classes which we use though I am not sure it affects our 
 use-case: https://github.com/jankotek/MapDB/issues/259. This is fixed in the 
 latest release



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Commented] (FLUME-2312) Add utility for adorning HTTP contexts in Jetty

2014-02-05 Thread Jarek Jarcec Cecho (JIRA)

[ 
https://issues.apache.org/jira/browse/FLUME-2312?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13892927#comment-13892927
 ] 

Jarek Jarcec Cecho commented on FLUME-2312:
---

Thank you [~hshreedharan]!

 Add utility for adorning HTTP contexts in Jetty
 ---

 Key: FLUME-2312
 URL: https://issues.apache.org/jira/browse/FLUME-2312
 Project: Flume
  Issue Type: Bug
Reporter: Hari Shreedharan
Assignee: Hari Shreedharan
 Attachments: FLUME-2312.patch, FLUME-2312.patch


 The idea is to accomplish the same as HBASE-10473 to disallow TRACE and 
 OPTIONS



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


Re: [DISCUSS] Release Flume 1.5.0

2014-01-30 Thread Jarek Jarcec Cecho
Strong +1 on creating a new release.

Jarcec

On Thu, Jan 30, 2014 at 09:17:43AM -0800, Hari Shreedharan wrote:
 Hi folks,
 
 It has been about 6 months since we did a release. We have added several
 new features and fixed a lot of bugs. What do you guys think about
 releasing Flume 1.5.0?
 
 
 Thanks
 Hari


signature.asc
Description: Digital signature


[jira] [Commented] (FLUME-2305) BucketWriter#close must cancel idleFuture

2014-01-28 Thread Jarek Jarcec Cecho (JIRA)

[ 
https://issues.apache.org/jira/browse/FLUME-2305?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13884924#comment-13884924
 ] 

Jarek Jarcec Cecho commented on FLUME-2305:
---

+1

 BucketWriter#close must cancel idleFuture
 -

 Key: FLUME-2305
 URL: https://issues.apache.org/jira/browse/FLUME-2305
 Project: Flume
  Issue Type: Bug
Reporter: Hari Shreedharan
Assignee: Hari Shreedharan
 Attachments: FLUME-2305.patch


 We use a ScheduledExecutorService to close the files after a time period. If 
 the task has been submitted, the service does not cancel that task by 
 default.  In the ScheduledThreadPoolExecutor class: this field 
 executeExistingDelayedTasksAfterShutdown is set to true, by default - which 
 causes the task to be executed even if the executor was shutdown. 
 If we don't cancel it, the agent does not die till idleTimeout even though 
 every other component is stopped.



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Resolved] (FLUME-2302) TestHDFS Sink fails with Can't get Kerberos realm

2014-01-20 Thread Jarek Jarcec Cecho (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLUME-2302?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jarek Jarcec Cecho resolved FLUME-2302.
---

   Resolution: Fixed
Fix Version/s: v1.5.0
 Assignee: Hari Shreedharan

Thank you for the fix [~hshreedharan]!

 TestHDFS Sink fails with Can't get Kerberos realm
 -

 Key: FLUME-2302
 URL: https://issues.apache.org/jira/browse/FLUME-2302
 Project: Flume
  Issue Type: Bug
Reporter: Hari Shreedharan
Assignee: Hari Shreedharan
 Fix For: v1.5.0

 Attachments: FLUME-2302.patch


 Adding this to the test fixes it:
 {code}
   static {
 System.setProperty(java.security.krb5.realm, flume);
 System.setProperty(java.security.krb5.kdc, blah);
   }
 {code}
 The same issue caused HBASE-8842, and the fix is pretty much the same.



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Commented] (FLUME-2303) HBaseSink tests can fail based on order of execution

2014-01-18 Thread Jarek Jarcec Cecho (JIRA)

[ 
https://issues.apache.org/jira/browse/FLUME-2303?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13875819#comment-13875819
 ] 

Jarek Jarcec Cecho commented on FLUME-2303:
---

+1

 HBaseSink tests can fail based on order of execution
 

 Key: FLUME-2303
 URL: https://issues.apache.org/jira/browse/FLUME-2303
 Project: Flume
  Issue Type: Bug
Reporter: Hari Shreedharan
Assignee: Hari Shreedharan
 Attachments: FLUME-2303.patch


 Some tests do not remove all events from the channel because of batch sizes 
 getting reset in other tests



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Commented] (FLUME-2301) Update HBaseSink tests to reflect sink returning backoff only on empty batches

2014-01-16 Thread Jarek Jarcec Cecho (JIRA)

[ 
https://issues.apache.org/jira/browse/FLUME-2301?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13873806#comment-13873806
 ] 

Jarek Jarcec Cecho commented on FLUME-2301:
---

+1

 Update HBaseSink tests to reflect sink returning backoff only on empty batches
 --

 Key: FLUME-2301
 URL: https://issues.apache.org/jira/browse/FLUME-2301
 Project: Flume
  Issue Type: Bug
Reporter: Hari Shreedharan
Assignee: Hari Shreedharan
 Attachments: FLUME-2301.patch






--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Resolved] (FLUME-2301) Update HBaseSink tests to reflect sink returning backoff only on empty batches

2014-01-16 Thread Jarek Jarcec Cecho (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLUME-2301?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jarek Jarcec Cecho resolved FLUME-2301.
---

   Resolution: Fixed
Fix Version/s: v1.5.0

Committed to 
[trunk|https://git-wip-us.apache.org/repos/asf?p=flume.git;a=commit;h=9a4f047668eb56895fb4bf5c1ee3e5dd6add8601]
 and 
[flume-1.5|https://git-wip-us.apache.org/repos/asf?p=flume.git;a=commit;h=4bb2658348f836e0e03b14f5a949a05663b54bfe].
 Thank you Hari for your contribution!

 Update HBaseSink tests to reflect sink returning backoff only on empty batches
 --

 Key: FLUME-2301
 URL: https://issues.apache.org/jira/browse/FLUME-2301
 Project: Flume
  Issue Type: Bug
Reporter: Hari Shreedharan
Assignee: Hari Shreedharan
 Fix For: v1.5.0

 Attachments: FLUME-2301.patch






--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


Re: Git commit confirmation on jira

2014-01-14 Thread Jarek Jarcec Cecho
We are using svngit2jira on other projects and we are quite happy with that. 
Thank you for enabling that Hari!

Jarcec

On Tue, Jan 14, 2014 at 01:45:17PM -0800, Hari Shreedharan wrote:
 Hi folks, 
 
 I requested ASF Infra to enable svngit2jira to put in a message on each jira 
 immediately when it is committed. Infra has already enabled it, but if you do 
 not agree with this change and think this will inconvenience you in any way, 
 please reply to this email. 
 
 
 Thanks,
 Hari
 


signature.asc
Description: Digital signature


[jira] [Commented] (FLUME-2289) Disable maxUnderReplication test which is extremely flakey

2014-01-07 Thread Jarek Jarcec Cecho (JIRA)

[ 
https://issues.apache.org/jira/browse/FLUME-2289?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13864088#comment-13864088
 ] 

Jarek Jarcec Cecho commented on FLUME-2289:
---

+1

 Disable maxUnderReplication test which is extremely flakey
 --

 Key: FLUME-2289
 URL: https://issues.apache.org/jira/browse/FLUME-2289
 Project: Flume
  Issue Type: Bug
Reporter: Hari Shreedharan
Assignee: Hari Shreedharan
 Attachments: FLUME-2289.patch


 There is a test that already tests that under replication causes file rolls. 
 The max under replication test is inherently a bit flakey because it depends 
 on HDFS client API noticing a bunch of things. I think we should disable it



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Resolved] (FLUME-2265) Closed bucket writers should be removed from sfwriters map

2013-12-23 Thread Jarek Jarcec Cecho (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLUME-2265?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jarek Jarcec Cecho resolved FLUME-2265.
---

   Resolution: Fixed
Fix Version/s: v1.5.0

Thank you for your contribution [~hshreedharan]! I've committed the patch to 
both 
[trunk|https://git-wip-us.apache.org/repos/asf?p=flume.git;a=commit;h=3b1034e8229eb9ad3e27ed0faab77c3f68f708c6]
 and 
[flume-1.5|https://git-wip-us.apache.org/repos/asf?p=flume.git;a=commit;h=c6117b50d998d8989890bd1dca911cdf402ba1a2]
 branch.

 Closed bucket writers should be removed from sfwriters map
 --

 Key: FLUME-2265
 URL: https://issues.apache.org/jira/browse/FLUME-2265
 Project: Flume
  Issue Type: Bug
Reporter: Hari Shreedharan
Assignee: Hari Shreedharan
 Fix For: v1.5.0

 Attachments: FLUME-2265.patch


 If we don't remove the bucket writer from the map, the reference is kept in 
 the map and not removed till the maxOpenFiles limit is hit, even though the 
 bucket writer is closed. This leads to HDFS buffers sticking around for a 
 long time after the bucket writer is closed.



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


Re: [DISCUSS] Feature bloat and contrib module

2013-12-14 Thread Jarek Jarcec Cecho
Hi Hari,
thank you very much for starting the discussion about contrib module. Flume is 
a very pluggable project where user have opportunity to plug in variety of 
different components (sink, source, channel, interceptor, ...). I believe that 
having a place where users can easily exchange their own plugins would be 
extremely useful and would help with further adoption of the project. Having a 
contrib module seems as one way how to accomplish this, another way would be 
perhaps to have a special wiki page (which we already do have [1]).

I do see benefits of repository based contrib module in having the ability to 
track the authorship of the changes and their history as the time passed. As I 
do see the contrib space more as a user driven, so I would see the bar of 
including a new component very low. It should be very easy for one user to 
submit component that works in his particular environment and let other user to 
tweak it for yet another purpose. With the bar keeping low we can't expect that 
the components will always work with current flume version nor that they will 
be backward compatible. Having said that I don't think that such contrib module 
should be official part of a Flume release. We can do our best to make a 
contrib release as a part of our usual release process and offer it as a 
separate download. I would not consider broken or removed components in the 
contrib as blocker for flume release though.

I'm happy to hear others opinions!

Jarcec

Links:
1: https://cwiki.apache.org/confluence/display/FLUME/Flume+NG+Plugins

On Fri, Dec 13, 2013 at 04:21:29PM -0800, Hari Shreedharan wrote:
 Hi all  
 
 Over the past few weeks, there were some discussions about including 
 additional features in core flume itself. This led to a discussion about 
 adding a contrib module. I thought I’d start an official discussion regarding 
 this. The argument for contrib module was that there are too many components 
 without generic use which are getting submitted/committed. The use-cases for 
 such components  are limited, and hence they should not be part of core flume 
 itself. First, we should answer the question if we want to separate 
 components into a contrib module and why? What components go into contrib and 
 what into core flume? What does it mean to be make a component part of the 
 contrib module. Do contrib components get released with Flume? Can they break 
 compatibility with older versions (what does this mean if they are not 
 getting released?) etc. How supported are these?
 
 Please respond and let us know your view!  
 
 
 Thanks,
 Hari
 


signature.asc
Description: Digital signature


[jira] [Updated] (FLUME-2262) Log4j Appender should use timeStamp field not getTimestamp

2013-12-09 Thread Jarek Jarcec Cecho (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLUME-2262?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jarek Jarcec Cecho updated FLUME-2262:
--

Fix Version/s: v1.5.0

 Log4j Appender should use timeStamp field not getTimestamp
 --

 Key: FLUME-2262
 URL: https://issues.apache.org/jira/browse/FLUME-2262
 Project: Flume
  Issue Type: Bug
Reporter: Brock Noland
Assignee: Brock Noland
 Fix For: v1.5.0

 Attachments: FLUME-2262.patch


 getTimestamp was added in log4j 1.2.15, we should use the timestamp field 
 instead:
 1.2.14: 
 https://github.com/apache/log4j/blob/v1_2_14/src/java/org/apache/log4j/spi/LoggingEvent.java#L124
 trunk: 
 https://github.com/apache/log4j/blob/trunk/src/main/java/org/apache/log4j/spi/LoggingEvent.java#L569



--
This message was sent by Atlassian JIRA
(v6.1.4#6159)


[jira] [Commented] (FLUME-2235) idleFuture should be cancelled at the start of append

2013-11-07 Thread Jarek Jarcec Cecho (JIRA)

[ 
https://issues.apache.org/jira/browse/FLUME-2235?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13816747#comment-13816747
 ] 

Jarek Jarcec Cecho commented on FLUME-2235:
---

Thank you for the patch [~hshreedharan], it looks good to me - +1.

 idleFuture should be cancelled at the start of append
 -

 Key: FLUME-2235
 URL: https://issues.apache.org/jira/browse/FLUME-2235
 Project: Flume
  Issue Type: Bug
Reporter: Hari Shreedharan
Assignee: Hari Shreedharan
 Attachments: FLUME-2235.patch, FLUME-2235.patch


 It seems like if idleTimeout is reached in the middle of an append call, it 
 will rotate the file while a write is happening, since the idleFuture is not 
 cancelled before the append starts.



--
This message was sent by Atlassian JIRA
(v6.1#6144)


[jira] [Commented] (FLUME-2235) idleFuture should be cancelled at the start of append

2013-11-07 Thread Jarek Jarcec Cecho (JIRA)

[ 
https://issues.apache.org/jira/browse/FLUME-2235?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13816748#comment-13816748
 ] 

Jarek Jarcec Cecho commented on FLUME-2235:
---

Committed to 
[trunk|https://git-wip-us.apache.org/repos/asf?p=flume.git;a=commit;h=705abaf00fbf8ee69ac88cbccae47c1a33f4b4b2]
 and 
[flume-1.5|https://git-wip-us.apache.org/repos/asf?p=flume.git;a=commit;h=ad612c28bec3845775e12d6bc51ef724e6f78f06].

 idleFuture should be cancelled at the start of append
 -

 Key: FLUME-2235
 URL: https://issues.apache.org/jira/browse/FLUME-2235
 Project: Flume
  Issue Type: Bug
Reporter: Hari Shreedharan
Assignee: Hari Shreedharan
 Fix For: v1.5.0

 Attachments: FLUME-2235.patch, FLUME-2235.patch


 It seems like if idleTimeout is reached in the middle of an append call, it 
 will rotate the file while a write is happening, since the idleFuture is not 
 cancelled before the append starts.



--
This message was sent by Atlassian JIRA
(v6.1#6144)


[jira] [Commented] (FLUME-2229) Backoff period gets reset too often in OrderSelector

2013-10-31 Thread Jarek Jarcec Cecho (JIRA)

[ 
https://issues.apache.org/jira/browse/FLUME-2229?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13810626#comment-13810626
 ] 

Jarek Jarcec Cecho commented on FLUME-2229:
---

+1

 Backoff period gets reset too often in OrderSelector
 

 Key: FLUME-2229
 URL: https://issues.apache.org/jira/browse/FLUME-2229
 Project: Flume
  Issue Type: Bug
Reporter: Hari Shreedharan
Assignee: Hari Shreedharan
 Attachments: FLUME-2229.patch


 In OrderSelector.java, the backoff period is calculated by using the number 
 of sequential failures seen to the same host. Since CONSIDER_SEQUENTIAL_RANGE 
 is set to 2 seconds, if the host is not selected for connection within 2 
 seconds of the backoff period ending, the sequential failures variable is 
 reset and backoff period is not increased. 
 We must increase the value of CONSIDER_SEQUENTIAL_RANGE so that the backoff 
 is not reset too often.



--
This message was sent by Atlassian JIRA
(v6.1#6144)


[jira] [Commented] (FLUME-2229) Backoff period gets reset too often in OrderSelector

2013-10-31 Thread Jarek Jarcec Cecho (JIRA)

[ 
https://issues.apache.org/jira/browse/FLUME-2229?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13810630#comment-13810630
 ] 

Jarek Jarcec Cecho commented on FLUME-2229:
---

Committed to 
[trunk|https://git-wip-us.apache.org/repos/asf?p=flume.git;a=commit;h=a89897bec4e7d6f3342ed966c61668e8a8139af5]
 and 
[cherry-picked|https://git-wip-us.apache.org/repos/asf?p=flume.git;a=commit;h=7fc23d7d41757ce75058cead963b5bd54c395727]
 to flume-1.5. Thank you for your contribution [~hshreedharan]!

 Backoff period gets reset too often in OrderSelector
 

 Key: FLUME-2229
 URL: https://issues.apache.org/jira/browse/FLUME-2229
 Project: Flume
  Issue Type: Bug
Reporter: Hari Shreedharan
Assignee: Hari Shreedharan
 Attachments: FLUME-2229.patch


 In OrderSelector.java, the backoff period is calculated by using the number 
 of sequential failures seen to the same host. Since CONSIDER_SEQUENTIAL_RANGE 
 is set to 2 seconds, if the host is not selected for connection within 2 
 seconds of the backoff period ending, the sequential failures variable is 
 reset and backoff period is not increased. 
 We must increase the value of CONSIDER_SEQUENTIAL_RANGE so that the backoff 
 is not reset too often.



--
This message was sent by Atlassian JIRA
(v6.1#6144)


  1   2   >