Storm-Kafka, TransactionalSpout, latest version dependencies

2014-05-23 Thread Romain Leroux
Hi,

I am currently testing Storm 0.9.1-incubating, and the only working
KafkaSpout for Kafka 0.8.x that I found was:
<groupId>net.wurstmeister.storm</groupId>
<artifactId>storm-kafka-0.8-plus</artifactId>

Sadly, I have never been able to get the TransactionalSpout working whenever
I also used a TransactionalState (Cassandra or Memcached) in my
topology.
If anyone has experience with this, I would be glad to hear about it.

Anyway, after some tests I concluded that something was buggy on the
Storm-Kafka side, so I tried to switch to the new:
https://github.com/apache/incubator-storm/tree/master/external/storm-kafka

But when I try it locally on a very basic topology, I get the following
exception:

java.lang.NoSuchMethodError: org.apache.zookeeper.ZooKeeper.<init>(Ljava/lang/String;ILorg/apache/zookeeper/Watcher;Z)V
    at org.apache.curator.utils.DefaultZookeeperFactory.newZooKeeper(DefaultZookeeperFactory.java:29) ~[storm-ad-0.0.1-jar-with-dependencies.jar:na]
    at org.apache.curator.framework.imps.CuratorFrameworkImpl$2.newZooKeeper(CuratorFrameworkImpl.java:169) ~[storm-ad-0.0.1-jar-with-dependencies.jar:na]
    at org.apache.curator.HandleHolder$1.getZooKeeper(HandleHolder.java:94) ~[storm-ad-0.0.1-jar-with-dependencies.jar:na]
    at org.apache.curator.HandleHolder.getZooKeeper(HandleHolder.java:55) ~[storm-ad-0.0.1-jar-with-dependencies.jar:na]
    at org.apache.curator.ConnectionState.reset(ConnectionState.java:219) ~[storm-ad-0.0.1-jar-with-dependencies.jar:na]
    at org.apache.curator.ConnectionState.start(ConnectionState.java:103) ~[storm-ad-0.0.1-jar-with-dependencies.jar:na]
    at org.apache.curator.CuratorZookeeperClient.start(CuratorZookeeperClient.java:188) ~[storm-ad-0.0.1-jar-with-dependencies.jar:na]
    at org.apache.curator.framework.imps.CuratorFrameworkImpl.start(CuratorFrameworkImpl.java:234) ~[storm-ad-0.0.1-jar-with-dependencies.jar:na]
    at storm.kafka.DynamicBrokersReader.<init>(DynamicBrokersReader.java:53) ~[storm-ad-0.0.1-jar-with-dependencies.jar:na]
    at storm.kafka.trident.ZkBrokerReader.<init>(ZkBrokerReader.java:41) ~[storm-ad-0.0.1-jar-with-dependencies.jar:na]
    at storm.kafka.KafkaUtils.makeBrokerReader(KafkaUtils.java:57) ~[storm-ad-0.0.1-jar-with-dependencies.jar:na]
    at storm.kafka.trident.Coordinator.<init>(Coordinator.java:33) ~[storm-ad-0.0.1-jar-with-dependencies.jar:na]
    at storm.kafka.trident.TransactionalTridentKafkaSpout.getCoordinator(TransactionalTridentKafkaSpout.java:41) ~[storm-ad-0.0.1-jar-with-dependencies.jar:na]
    at storm.trident.spout.PartitionedTridentSpoutExecutor$Coordinator.<init>(PartitionedTridentSpoutExecutor.java:47) ~[storm-core-0.9.1-incubating.jar:0.9.1-incubating]
    at storm.trident.spout.PartitionedTridentSpoutExecutor.getCoordinator(PartitionedTridentSpoutExecutor.java:154) ~[storm-core-0.9.1-incubating.jar:0.9.1-incubating]
    at storm.trident.topology.MasterBatchCoordinator.open(MasterBatchCoordinator.java:109) ~[storm-core-0.9.1-incubating.jar:0.9.1-incubating]
    at backtype.storm.daemon.executor$eval5100$fn__5101$fn__5116.invoke(executor.clj:519) ~[na:na]
    at backtype.storm.util$async_loop$fn__390.invoke(util.clj:431) ~[na:na]
    at clojure.lang.AFn.run(AFn.java:24) ~[clojure-1.4.0.jar:na]
    at java.lang.Thread.run(Thread.java:724) ~[na:1.7.0_25]

My ZooKeeper is 3.4.5, which seems to be the version required for this spout
(Curator 2.4.0)...
Does anyone know what settings/dependencies (pom.xml) should be used to
make this Storm-Kafka TransactionalSpout work?
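
For what it's worth, this particular NoSuchMethodError usually means an older
ZooKeeper (3.3.x) ended up on the classpath: the four-argument
ZooKeeper.<init>(String, int, Watcher, boolean) constructor that Curator 2.x
calls only exists from ZooKeeper 3.4.0 onwards, and Storm 0.9.1-incubating
still depends on the 3.3.x line transitively. A hedged pom.xml sketch that
forces 3.4.5 (adapt the versions to your build):

    <dependencyManagement>
      <dependencies>
        <!-- Pin the ZooKeeper version Curator 2.4.0 was built against;
             otherwise the 3.3.x version inherited transitively wins. -->
        <dependency>
          <groupId>org.apache.zookeeper</groupId>
          <artifactId>zookeeper</artifactId>
          <version>3.4.5</version>
        </dependency>
      </dependencies>
    </dependencyManagement>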


Re: Is it possible to join another spout/bolt later to the topology?

2014-05-23 Thread Sajith
Hi Ghosh,

My concern is that I don't want to include all the dependencies required by
the bulky bolt in the topology jar. Because that bolt will run on a single
node in the cluster (I wrote a scheduler to guarantee that), it is not
necessary to distribute its jar across the whole cluster.

Actually, I was asking whether it's possible to change the wiring of the
components of the topology dynamically at runtime. But I might be able to use
the approach that you suggested.
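
(For reference, a minimal sketch of the jar-loading approach described in the
quoted reply below; the jar path, class name, and method name are
hypothetical, and the jar must already exist on whichever node the bolt is
scheduled:)

    import java.io.File;
    import java.lang.reflect.Method;
    import java.net.URL;
    import java.net.URLClassLoader;
    import java.util.Map;

    import backtype.storm.task.OutputCollector;
    import backtype.storm.task.TopologyContext;
    import backtype.storm.topology.OutputFieldsDeclarer;
    import backtype.storm.topology.base.BaseRichBolt;
    import backtype.storm.tuple.Tuple;

    public class DelegatingBolt extends BaseRichBolt {
        private transient Object delegate;   // instance from the external jar
        private transient Method process;    // its per-tuple entry point
        private OutputCollector collector;

        public void prepare(Map conf, TopologyContext context, OutputCollector collector) {
            this.collector = collector;
            try {
                // Hypothetical path; the jar is loaded lazily at runtime
                // instead of being packaged into the topology jar.
                URL jar = new File("/opt/app/bulky-bolt.jar").toURI().toURL();
                ClassLoader loader =
                    new URLClassLoader(new URL[] { jar }, getClass().getClassLoader());
                Class<?> clazz = loader.loadClass("com.example.BulkyProcessor");
                delegate = clazz.newInstance();
                process = clazz.getMethod("process", Object.class);
            } catch (Exception e) {
                throw new RuntimeException("could not load external bolt jar", e);
            }
        }

        public void execute(Tuple tuple) {
            try {
                process.invoke(delegate, tuple.getValue(0));
                collector.ack(tuple);
            } catch (Exception e) {
                collector.fail(tuple);
            }
        }

        public void declareOutputFields(OutputFieldsDeclarer declarer) {}
    }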

Thanks,
Sajith.


On Thu, May 22, 2014 at 10:35 PM, P Ghosh javadevgh...@gmail.com wrote:

 Can you elaborate on what exactly you mean by join? You can have a bolt
 defined as part of the topology that loads the other jar in its prepare()
 method and calls the actual functional methods in the execute() method. This
 way you are dynamically loading the other jar into your Storm topology...
 (which, in my view, is not a great idea). However, the point of concern is
 how you will distribute the other jar... that's a hassle. How big is your
 OSGi jar, and why is it such a big concern?

 Prasun




 On Thu, May 22, 2014 at 2:23 AM, Sajith sajith...@gmail.com wrote:

 Hi all,

 Is it possible for us to join another spout or bolt (not another instance,
 but a separate component) to the topology after the topology has been
 deployed?

 My requirement is to deliver all the tuples processed by a Storm cluster
 to a single bolt, which is developed on OSGi and has many other
 dependencies. I don't want this bolt to be added to the original topology,
 since the JAR that contains the topology would become bulky.

 Therefore, is it possible for me to join this special spout to the
 cluster to receive messages once the topology is deployed?

 Any other suggestions or recommendations to achieve this are appreciated.

 Thanks,
 Sajith.





Re: Storm-Kafka, TransactionalSpout, latest version dependencies

2014-05-23 Thread Irek Khasyanov
I’m using storm-kafka-0.8-plus 0.4 with ZooKeeper 3.3.6. If you are using
Storm 0.9.1-incubating, you should downgrade storm-kafka-0.8-plus. Or use
Storm 0.9.2-incubating, which is compatible with the newer ZooKeeper and Curator.
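
A hedged sketch of the matching dependency pair for the Storm 0.9.1 route
(coordinates taken from the original post; the 0.4.0 version string is an
assumption to adapt):

    <dependency>
      <groupId>net.wurstmeister.storm</groupId>
      <artifactId>storm-kafka-0.8-plus</artifactId>
      <version>0.4.0</version>
    </dependency>
    <!-- Stay on the ZooKeeper 3.3.x line that Storm 0.9.1-incubating expects. -->
    <dependency>
      <groupId>org.apache.zookeeper</groupId>
      <artifactId>zookeeper</artifactId>
      <version>3.3.6</version>
    </dependency>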


— With best regards, Irek Khasyanov 

Re: Sample project using Storm and Kafka (using Scala 2.10)

2014-05-23 Thread János Háber
If only the interfaces from Kafka are used (I only looked superficially), the
provided scope would be a good idea... (for both the Scala and Kafka dependencies)
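
A hedged sketch of what that could look like in the storm-kafka pom (artifact
names and versions are placeholders; the topology jar would then supply
whichever Scala/Kafka builds it wants):

    <dependency>
      <groupId>org.scala-lang</groupId>
      <artifactId>scala-library</artifactId>
      <version>2.10.3</version>
      <scope>provided</scope>
    </dependency>
    <dependency>
      <groupId>org.apache.kafka</groupId>
      <artifactId>kafka_2.10</artifactId>
      <version>0.8.1</version>
      <scope>provided</scope>
    </dependency>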

János Háber
Fine Solution Ltd



On Thu, May 22, 2014 at 6:35 PM, P. Taylor Goetz ptgo...@gmail.com wrote:

 Come to think about it, this is a non-issue. This is just java code that
 happens to depend on a library written in scala. There isn’t anything that
 needs to be cross compiled.

 So it comes down to dependency management and documentation. So the scala
 and kafka dependencies would be set to “provided” scope, and it would be up
 to the user to pull in the kafka/scala dependency versions they want.

 Correct me if I’m missing anything here.

 - Taylor



 On May 22, 2014, at 6:08 AM, Marco zentrop...@yahoo.co.uk wrote:

 Taylor,
 cross-compatibility is a Scala thingie and THEREFORE it reflects onto
 Kafka and THEREFORE onto storm-kafka (or whatever-kafka).

 Kafka (if I understand correctly
 https://github.com/apache/kafka/blob/0.8.1/build.gradle#l131) manages
 this with Gradle (which I'm personally not much a fan of).

 I don't think it'd be much of an overhead having that in storm-kafka (and
 so effectively have storm-kafka_2_8_0, storm-kafka_2_9_1,
 storm-kafka_2_9_2, storm-kafka_2_10_1).

 Maven profiles can do that (as Michael proved).
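
 (A hedged sketch of such profiles, reusing the scalaVersion/kafkaArtifact
 property names that the storm-kafka build already accepts on the command
 line; the version numbers are illustrative:)

    <properties>
      <scalaVersion>2.9.2</scalaVersion>
      <kafkaArtifact>kafka_2.9.2</kafkaArtifact>
    </properties>
    <profiles>
      <profile>
        <id>scala-2.10</id>
        <properties>
          <scalaVersion>2.10.3</scalaVersion>
          <kafkaArtifact>kafka_2.10</kafkaArtifact>
        </properties>
      </profile>
    </profiles>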

 Plus Storm seems to have put a lot of interest towards polyglottism
 (mixing Clojure and Java for example).

 Also, I agree with all the points János made.
 Having it on Maven Central would be THE perfect starting point for any
 (Scala) developer willing to give it a spin.

 ps: Scala 2.11 is just out

 -Marco
   On Thursday, 22 May 2014 at 3:43, P. Taylor Goetz ptgo...@gmail.com
 wrote:


 János,

 I sense and appreciate your frustration.

 The initial goal with bringing the storm-kafka project under the Apache
 Storm project umbrella is to provide a definitive source for the code and
 eliminate fragmentation, which has been an issue in the past. This is a
 good first step for many reasons, and will hopefully improve greatly in
 future iterations.

 Just to be clear, Apache releases source code, not binaries. Any binaries
 built against an Apache project's source are provided as a convenience.

 The cross-compilation ballet you describe is a feature of scala [1], not
 anything Storm or maven related. Yes, we can and will improve on the build
 and binary release process. But for now the goal is to provide a definitive
 source, and make sure users can build that source as needed -- which I
 think we have done.

 -Taylor

 [1] a not-so-subtle dig against scala's build process, not the language.
 Hopefully this will get sorted out someday.

 On May 21, 2014, at 7:48 PM, János Háber janos.ha...@finesolution.hu
 wrote:

 Dear Taylor, I love your work, but:
 - I don't want to build it myself.
 - Dependent libraries (like tormenta-kafka) need a cross-compiled version
 of storm-kafka; without it they have to clone the project, change the
 group id, handle every change by hand, and publish to the central repository.
 - I would need my own Maven repository to store the cross-compiled version
 (which would need to be public if somebody wants to use my application) and
 to maintain the changes.
 - I think a hand-built library is the best way to make a project unstable,
 because if I clone the project I also need to clone the tormenta-kafka
 project and handle both version changes myself, solve the compatibility
 issues, etc...

 I know how to compile for 2.10 by hand, but every other project (for example
 Kafka, which is an Apache project too) has a cross-compiled version in the
 CENTRAL Maven repository. If a project requires a cross-compiled Scala
 library, then, in my opinion, the project needs to provide a cross-compiled
 version too, no exception.

 b0c1

 János Háber
 Fine Solution Ltd



 On Wed, May 21, 2014 at 11:48 PM, P. Taylor Goetz ptgo...@gmail.com wrote:

 If you build yourself, you can do the following:

 mvn install -DscalaVersion=2.10.3 -DkafkaArtifact=kafka_2.10

 - Taylor


 On May 21, 2014, at 5:32 PM, János Háber janos.ha...@finesolution.hu
 wrote:


 https://github.com/apache/incubator-storm/blob/master/external/storm-kafka/pom.xml#L34
 And 2.10 is not backward compatible with 2.9... that's why Scala
 applications/libraries are cross-compiled for multiple Scala versions.
 (That's why SBT handles this natively; but I think SBT is not allowed at
 Apache, so you need to create multiple Maven projects (the hard way) or
 switch to Gradle (like Kafka) to produce multiple versions.)

 János Háber
 Fine Solution Ltd



 On Wed, May 21, 2014 at 11:28 PM, János Háber janos.ha...@finesolution.hu
  wrote:


 On Wed, May 21, 2014 at 11:27 PM, P. Taylor Goetz ptgo...@gmail.com wrote:

 will include the Kafka spout (for Kafka 0.8.x).


 Yeah, but the current Maven build compiles to Scala 2.9, not 2.10...


 János Háber
 Fine Solution Ltd










Re: Sample project using Storm and Kafka (using Scala 2.10)

2014-05-23 Thread János Háber
I've updated the issue ( https://issues.apache.org/jira/browse/STORM-325 )...

János Háber
Fine Solution Ltd



On Fri, May 23, 2014 at 10:21 AM, János Háber
janos.ha...@finesolution.hu wrote:

 If only the interfaces from Kafka are used (I only looked superficially), the
 provided scope would be a good idea... (for both the Scala and Kafka dependencies)

 János Háber
 Fine Solution Ltd




Extend Supervisors

2014-05-23 Thread Matteo Nardelli
I'd like to implement new application logic in a distributed manner, and I
was wondering whether extending the Supervisor process is the best way of
doing it.

Basically, I want to implement a distributed scheduler, and Nimbus's
peculiarities seem to be too restrictive.

Is there a way to extend the Supervisor while preserving its default logic?
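
(As a hedged aside: if the goal is custom scheduling rather than changing the
Supervisor itself, Storm also exposes a pluggable scheduler interface on
Nimbus. A minimal sketch against the 0.9.x API, with the class name
hypothetical; whether it can be made distributed is exactly the open question
here:)

    import java.util.Map;

    import backtype.storm.scheduler.Cluster;
    import backtype.storm.scheduler.EvenScheduler;
    import backtype.storm.scheduler.IScheduler;
    import backtype.storm.scheduler.Topologies;

    public class MyScheduler implements IScheduler {
        public void prepare(Map conf) {}

        public void schedule(Topologies topologies, Cluster cluster) {
            // ... custom placement decisions go here ...
            // Fall back to the stock scheduler for anything left unassigned.
            new EvenScheduler().schedule(topologies, cluster);
        }
    }

It would be registered in storm.yaml via storm.scheduler: "MyScheduler"
(fully qualified class name).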

Thank you all

-- 
Matteo Nardelli


kafka-storm-starter released: code examples that integrate Kafka 0.8 and Storm 0.9

2014-05-23 Thread Michael G. Noll

Hi everyone,

to sweeten the upcoming long weekend I have released code examples that 
show how to integrate Kafka 0.8+ with Storm 0.9+, while using Apache 
Avro as the data serialization format.


https://github.com/miguno/kafka-storm-starter

Since the integration of the latest Kafka and Storm versions has been
a popular topic on the mailing lists (read: many questions/threads), I
hope that this will help a little, not only to drive adoption but
also to minimize the frustration when getting started. ;-)



Enjoy!
Michael


PS: Of course the code above can be integrated into, say, the recently 
added storm-kafka component if there is community interest.


(This message was cross-posted to the kafka-user mailing list.)


Re: Sample project using Storm and Kafka (using Scala 2.10)

2014-05-23 Thread Michael G. Noll
Marco,

Take a look at kafka-storm-starter:
https://github.com/miguno/kafka-storm-starter

The code is Scala 2.10.

Best,
Michael


 On 21.05.2014, at 22:32, Marco zentrop...@yahoo.co.uk wrote:
 
 Hi,
 I'm having some trouble understanding how to bootstrap a Kafka + Storm
 project.
 
 Specifically, I don't get whether storm-kafka is indeed included in the Storm
 release or not.
 Also, I'm currently using Scala 2.10, but it seems that the Kafka main
 release is only available in its Scala 2.9.2 version.
 
 Thanks for your help
 
 -Marco


Config maxClientCnxns=100 in LocalCluster

2014-05-23 Thread Max, Shen
Hi all,

I'm running some tests on LocalCluster and having issues connecting to
the local ZooKeeper created by the LocalCluster when the test calls
submitTopology.

I've resolved the issue in remote mode by increasing the maxClientCnxns
number in the zoo.cfg on my zookeeper box.

However, I couldn't figure out an approach to increase the number for the
zookeeper created by the LocalCluster.

Any suggestions?
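
(A hedged workaround: LocalCluster in 0.9.x also has a constructor that takes
an external ZooKeeper host and port, so you can start your own ZooKeeper from
a zoo.cfg with maxClientCnxns raised, as you did in remote mode, and point the
test at it. A sketch, with host/port as placeholders:)

    import backtype.storm.LocalCluster;

    public class LocalClusterWithExternalZk {
        public static void main(String[] args) {
            // Connect to an already-running ZooKeeper (configured with
            // maxClientCnxns=100) instead of the in-process one that the
            // no-arg constructor would spawn.
            LocalCluster cluster = new LocalCluster("localhost", 2181L);
            // cluster.submitTopology(...);
            cluster.shutdown();
        }
    }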

-- 

- Max


Workers constantly restarted due to session timeout

2014-05-23 Thread Michael Dev
Hi all,

We are seeing our workers constantly being killed by Storm with the
following logs:
worker: 2014-05-23 20:15:08 INFO ClientCxn:1157 - Client session timed out, 
have not heard from the server in 28105ms for sessionid 0x14619bf2f4e0109, 
closing socket and attempting reconnect
supervisor: 2014-05-23 20:17:30 INFO supervisor:0 - Shutting down and clearing 
state for id 94349373-74ec-484b-a9f8-a5076e17d474. Current supervisor time: 
1400876250. State: :disallowed, Heartbeat: 
#backtype.storm.daemon.common.WorkerHeartbeat{{:time-secs 1400876249, :storm-id 
test-46-1400863199, :executors #{[-1 -1]}, :port 6700}

Eventually Storm decides to kill the worker and restart it, as you can see in 
the supervisor log. Our theory is that the ZooKeeper heartbeat thread is being 
choked out by very high CPU load on the machine (near 100%).

I have increased the connection timeouts in the storm.yaml config file yet 
Storm seems to continue to use some unknown value for the above client session 
timeout messages:
storm.zookeeper.connection.timeout: 30
storm.zookeeper.session.timeout: 30

1) What timeout config is appropriate for the above timeout message?
2) Is this expected behavior for Storm to be unable to keep up with heartbeat 
threads under high CPU, or is our theory incorrect?

Thanks,
Michael
  

Building Storm

2014-05-23 Thread Justin Workman
I have seen this question a couple of times, but for the life of me, I
cannot get around it and have not seen an answer that resolves it. Here is
the process I am walking through.

# git clone https://github.com/apache/incubator-storm
# cd incubator-storm
# git checkout master
Already on 'master'

** Before building I had to add the kryo dependency to the parent pom and
the storm-core pom.xml files

# mvn -e clean install

This is where it fails with the following stack trace. There were no errors
in the output prior to this error, and passing the -X flag for debug does
not produce any other details. Any help getting past this will be greatly
appreciated.

Thanks
Justin

[ERROR] Failed to execute goal com.theoryinpractise:clojure-maven-plugin:1.3.18:compile (compile-clojure) on project storm-core: Clojure failed. -> [Help 1]
org.apache.maven.lifecycle.LifecycleExecutionException: Failed to execute goal com.theoryinpractise:clojure-maven-plugin:1.3.18:compile (compile-clojure) on project storm-core: Clojure failed.
    at org.apache.maven.lifecycle.internal.MojoExecutor.execute(MojoExecutor.java:217)
    at org.apache.maven.lifecycle.internal.MojoExecutor.execute(MojoExecutor.java:153)
    at org.apache.maven.lifecycle.internal.MojoExecutor.execute(MojoExecutor.java:145)
    at org.apache.maven.lifecycle.internal.LifecycleModuleBuilder.buildProject(LifecycleModuleBuilder.java:84)
    at org.apache.maven.lifecycle.internal.LifecycleModuleBuilder.buildProject(LifecycleModuleBuilder.java:59)
    at org.apache.maven.lifecycle.internal.LifecycleStarter.singleThreadedBuild(LifecycleStarter.java:183)
    at org.apache.maven.lifecycle.internal.LifecycleStarter.execute(LifecycleStarter.java:161)
    at org.apache.maven.DefaultMaven.doExecute(DefaultMaven.java:320)
    at org.apache.maven.DefaultMaven.execute(DefaultMaven.java:156)
    at org.apache.maven.cli.MavenCli.execute(MavenCli.java:537)
    at org.apache.maven.cli.MavenCli.doMain(MavenCli.java:196)
    at org.apache.maven.cli.MavenCli.main(MavenCli.java:141)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:622)
    at org.codehaus.plexus.classworlds.launcher.Launcher.launchEnhanced(Launcher.java:290)
    at org.codehaus.plexus.classworlds.launcher.Launcher.launch(Launcher.java:230)
    at org.codehaus.plexus.classworlds.launcher.Launcher.mainWithExitCode(Launcher.java:409)
    at org.codehaus.plexus.classworlds.launcher.Launcher.main(Launcher.java:352)
Caused by: org.apache.maven.plugin.MojoExecutionException: Clojure failed.
    at com.theoryinpractise.clojure.AbstractClojureCompilerMojo.callClojureWith(AbstractClojureCompilerMojo.java:451)
    at com.theoryinpractise.clojure.AbstractClojureCompilerMojo.callClojureWith(AbstractClojureCompilerMojo.java:367)
    at com.theoryinpractise.clojure.AbstractClojureCompilerMojo.callClojureWith(AbstractClojureCompilerMojo.java:344)
    at com.theoryinpractise.clojure.ClojureCompilerMojo.execute(ClojureCompilerMojo.java:47)
    at org.apache.maven.plugin.DefaultBuildPluginManager.executeMojo(DefaultBuildPluginManager.java:101)
    at org.apache.maven.lifecycle.internal.MojoExecutor.execute(MojoExecutor.java:209)
    ... 19 more


Re: Workers constantly restarted due to session timeout

2014-05-23 Thread Derek Dagit

2) Is this expected behavior for Storm to be unable to keep up with heartbeat 
threads under high CPU or is our theory incorrect?


Check your JVM max heap size (-Xmx).  If you use too much, the JVM will 
garbage-collect, and that will stop everything, including the thread whose job 
it is to do the heartbeating.
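
A hedged aside on the timeout settings from the original mail: Storm's
ZooKeeper timeouts are specified in milliseconds, so a value of 30 would mean
30 ms rather than 30 seconds. A 30-second budget would look like:

    storm.zookeeper.connection.timeout: 30000
    storm.zookeeper.session.timeout: 30000

Note also that the ZooKeeper server caps sessions at its maxSessionTimeout
(20 * tickTime by default), so the server side may need raising as well.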



--
Derek






Conceptual question on Streams definition...

2014-05-23 Thread P Ghosh
My definition of a stream is a continuous feed of data of a certain type or
with a certain purpose (depending on how you want to define your process).

I have a situation where the domain object is the same across the whole
topology; however, each component works on bits and pieces to construct
the final document (a JSON document).
Option 1 sounds logical when I think that everything is working on the same
domain object. Option 2 sounds logical when I think that those streams
represent different parts of the domain object, so they are not the same in
reality.
[image: Inline image 2]


Source link for the image -
https://drive.google.com/file/d/0B7Y7mM2uzsNFTWRJekZzS0FXUDQ/edit?usp=sharing

Please note that SpoutA to BoltC1 is part of a transaction, so SpoutA
should get an ACK only when all bolts have acked.

What I'm trying to understand is how Option 1 and Option 2 affect
the functionality.

Just an FYI: BoltC1 has a

RotatingMap<List<Object>, Map<GlobalStreamId, Tuple>> pendingTuples

which it uses to ensure that it acks back only when it has received all
tuples from the previous bolts.
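
(A hedged sketch of that bookkeeping, with the correlation key, the expected
stream count, and the field positions as hypothetical stand-ins:)

    import java.util.HashMap;
    import java.util.Map;

    import backtype.storm.generated.GlobalStreamId;
    import backtype.storm.task.OutputCollector;
    import backtype.storm.tuple.Tuple;
    import backtype.storm.utils.RotatingMap;

    public class PendingTupleTracker {
        // Partial tuples keyed by a correlation id; call rotate() on a timer
        // to evict entries that never completed.
        private final RotatingMap<Object, Map<GlobalStreamId, Tuple>> pending =
            new RotatingMap<Object, Map<GlobalStreamId, Tuple>>(3);
        private final int expectedStreams = 2; // hypothetical upstream count

        @SuppressWarnings("unchecked")
        public void track(Tuple tuple, OutputCollector collector) {
            Object key = tuple.getValue(0);    // hypothetical correlation field
            Map<GlobalStreamId, Tuple> parts =
                (Map<GlobalStreamId, Tuple>) pending.remove(key);
            if (parts == null) {
                parts = new HashMap<GlobalStreamId, Tuple>();
            }
            parts.put(tuple.getSourceGlobalStreamid(), tuple);
            if (parts.size() == expectedStreams) {
                for (Tuple t : parts.values()) {
                    collector.ack(t);          // ack only once every part arrived
                }
            } else {
                pending.put(key, parts);       // still waiting on other streams
            }
        }
    }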

Thanks,
Prasun