hdfs-bolt write/sync problems

2015-04-29 Thread Volker Janz
Hi, we are using the storm-hdfs bolt (0.9.4) to write data from Kafka to Hadoop (Hadoop 2.5.0-cdh5.2.0). This works fine for us but we discovered some unexpected behavior: Our bolt uses the TimedRotationPolicy to rotate finished files from one location within HDFS to another. Unfortunately,

Re: Storm Trident Topology -- ParallelismHint

2015-04-29 Thread P. Taylor Goetz
In scenario “b”, set the parallelism for the spout before the `shuffle()` operation. Trident topologies compile down to regular spouts and bolts. Partitioning operations like `shuffle()`, etc. define the bolt boundaries and hence where parallelism hints take effect. -Taylor On Apr 29, 2015,

Re: Storm multilang performance

2015-04-29 Thread Srikanth
I guess tuples are waiting to be read by your Python bolt. 200 ms per tuple is a lot of processing time. Your upstream bolt/spout might have emitted thousands of tuples by then, and they have nowhere to go. Have you measured how many tuples were emitted per second by your spout? Add a time

Storm Trident Topology -- ParallelismHint

2015-04-29 Thread nitin sharma
Hi Team, I am trying to understand ParallelismHint in a Trident topology but am not getting anywhere. It would be great if someone could help me. Also, kindly explain the difference between MasterCoordinator and SpoutCoordinator... Things that I have tried so far: a. I created a

Re: Shared static data between bolts

2015-04-29 Thread Huy Le Van
Hi Michael, if a bolt of type A on one JVM modifies the shared data, will another bolt of type B on another JVM see the changes? To be clear, I have a use case where I need multiple bolts of multiple types to be able to read from and write to the shared collection. You can think of it as a shared