Re: storm ui show problem

2014-07-23 Thread hushengbo
Hey, have you ever found that the Storm UI shows the wrong number of executed
tuples? Sometimes the executed-tuple count is bigger than the real number, and
sometimes it is less.

Can anybody help?



2014-07-23 



hushengbo 



From: 唐思成 
Sent: 2014-07-17 19:24:08 
To: storm_user 
Cc: 
Subject: storm upgrade issue 
 
Hi all:
I am trying to upgrade Storm from 0.9.1 to 0.9.2-incubating. When the supervisor
on a worker node starts up, the nimbus process goes down; here is what nimbus.log
says:

Before upgrading I had already changed storm.local.dir to a new location and
removed the storm node in ZooKeeper using zkCli.sh; however, that didn't help.

Any ideas?

2014-07-17 19:15:29 b.s.d.nimbus [ERROR] Error when processing event
java.lang.RuntimeException: java.io.InvalidClassException: 
backtype.storm.daemon.common.SupervisorInfo; local class incompatible: stream 
classdesc serialVersionUID = 7648414326720210054, local class serialVersionUID 
= 7463898661547835557
at backtype.storm.utils.Utils.deserialize(Utils.java:93) 
~[storm-core-0.9.2-incubating.jar:0.9.2-incubating]
at backtype.storm.cluster$maybe_deserialize.invoke(cluster.clj:200) 
~[storm-core-0.9.2-incubating.jar:0.9.2-incubating]
at 
backtype.storm.cluster$mk_storm_cluster_state$reify__2284.supervisor_info(cluster.clj:299)
 ~[storm-core-0.9.2-incubating.jar:0.9.2-incubating]
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) ~[na:na]
at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39) 
~[na:na]
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
 ~[na:na]
at java.lang.reflect.Method.invoke(Method.java:597) ~[na:na]
at clojure.lang.Reflector.invokeMatchingMethod(Reflector.java:93) 
~[clojure-1.5.1.jar:na]
at clojure.lang.Reflector.invokeInstanceMethod(Reflector.java:28) 
~[clojure-1.5.1.jar:na]
at 
backtype.storm.daemon.nimbus$all_supervisor_info$fn__4715.invoke(nimbus.clj:277)
 ~[storm-core-0.9.2-incubating.jar:0.9.2-incubating]
at clojure.core$map$fn__4207.invoke(core.clj:2487) 
~[clojure-1.5.1.jar:na]
at clojure.lang.LazySeq.sval(LazySeq.java:42) ~[clojure-1.5.1.jar:na]
at clojure.lang.LazySeq.seq(LazySeq.java:60) ~[clojure-1.5.1.jar:na]
at clojure.lang.RT.seq(RT.java:484) ~[clojure-1.5.1.jar:na]
at clojure.core$seq.invoke(core.clj:133) ~[clojure-1.5.1.jar:na]
at clojure.core$apply.invoke(core.clj:617) ~[clojure-1.5.1.jar:na]
at clojure.core$mapcat.doInvoke(core.clj:2514) ~[clojure-1.5.1.jar:na]
at clojure.lang.RestFn.invoke(RestFn.java:423) ~[clojure-1.5.1.jar:na]
at 
backtype.storm.daemon.nimbus$all_supervisor_info.invoke(nimbus.clj:275) 
~[storm-core-0.9.2-incubating.jar:0.9.2-incubating]
at 
backtype.storm.daemon.nimbus$all_scheduling_slots.invoke(nimbus.clj:288) 
~[storm-core-0.9.2-incubating.jar:0.9.2-incubating]
at 
backtype.storm.daemon.nimbus$compute_new_topology__GT_executor__GT_node_PLUS_port.invoke(nimbus.clj:580)
 ~[storm-core-0.9.2-incubating.jar:0.9.2-incubating]
at backtype.storm.daemon.nimbus$mk_assignments.doInvoke(nimbus.clj:660) 
~[storm-core-0.9.2-incubating.jar:0.9.2-incubating]
at clojure.lang.RestFn.invoke(RestFn.java:410) ~[clojure-1.5.1.jar:na]
at 
backtype.storm.daemon.nimbus$fn__5210$exec_fn__1396__auto5211$fn__5216$fn__5217.invoke(nimbus.clj:905)
 ~[storm-core-0.9.2-incubating.jar:0.9.2-incubating]
at 
backtype.storm.daemon.nimbus$fn__5210$exec_fn__1396__auto5211$fn__5216.invoke(nimbus.clj:904)
 ~[storm-core-0.9.2-incubating.jar:0.9.2-incubating]
at 
backtype.storm.timer$schedule_recurring$this__1134.invoke(timer.clj:99) 
~[storm-core-0.9.2-incubating.jar:0.9.2-incubating]
at backtype.storm.timer$mk_timer$fn__1117$fn__1118.invoke(timer.clj:50) 
~[storm-core-0.9.2-incubating.jar:0.9.2-incubating]
at backtype.storm.timer$mk_timer$fn__1117.invoke(timer.clj:42) 
[storm-core-0.9.2-incubating.jar:0.9.2-incubating]
at clojure.lang.AFn.run(AFn.java:24) [clojure-1.5.1.jar:na]
at java.lang.Thread.run(Thread.java:619) [na:na]
Caused by: java.io.InvalidClassException: 
backtype.storm.daemon.common.SupervisorInfo; local class incompatible: stream 
classdesc serialVersionUID = 7648414326720210054, local class serialVersionUID 
= 7463898661547835557
at java.io.ObjectStreamClass.initNonProxy(ObjectStreamClass.java:562) 
~[na:na]
at 
java.io.ObjectInputStream.readNonProxyDesc(ObjectInputStream.java:1583) ~[na:na]
at java.io.ObjectInputStream.readClassDesc(ObjectInputStream.java:1496) 
~[na:na]
at 
java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1732) 
~[na:na]
at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1329) 
~[na:na]
at java.io.ObjectInputStream.readObject(ObjectInputStream.java:351) 
~[na:na]
at backtype.storm.utils.Utils.deserialize(Utils.java:89) 

How to find out when the Supervisor fails

2014-07-23 Thread Sai SaiGraph
I have figured out how to track the spouts and bolts. I noticed that the
supervisor had stopped when I submitted a topology, and nothing was happening
because of it.
Is there an automatic way of knowing when daemons such as the supervisor,
nimbus, or ZooKeeper stop?

Thanks
Sai


use normal storm bolt in trident topology

2014-07-23 Thread A.M.
Is it recommended to use regular storm bolts in a trident topology? Or is it 
possible at all? Thanks for your advice in advance.

-Costco

Modifying tuples, is the below use case a fit for STORM

2014-07-23 Thread Anil Karamchandani

 
 Hi,
 
 I am a newbie to Storm and wanted to try it in my project, since the project 
 involves streams of messages and big data. 
 
 I have the following use case.
 
 I have 2 files. One file has 20 million user records,
 and the other file has user IDs and the scores associated with each user. 
 
 I need to perform the following actions using Storm.
 
 For each user record I need to get the associated scores for that user 
 and append them to the user record.
 
 The score data is around 110 million records. 
 
 The user record looks something like:
 
 EmailAdd        ID
 a...@gmail.com  1
 
 Score data (ID, score):
 
 1  80
 1  90
 1  10
 1  20
 
 2  180
 2  920
 2  110
 2  210
 
 Result:
 
 ani...@gmail.com 1  80 90 10 20
 
 Can you please help with how I could do this association using Storm?
 
 Thanks!
 
 anil
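Independent of Storm, the association described above is a keyed join of score
rows against a user table. Below is a minimal plain-Java sketch of that
per-record logic (class, field, and sample names are hypothetical); inside a
Storm bolt, execute() would perform the same lookup for each incoming score
tuple, with the user table held in a store the bolts can query:

```java
import java.util.*;

public class UserScoreJoin {
    // Join score rows (id, score) against a user table (id -> email),
    // collecting each user's scores in arrival order.
    public static Map<String, List<Integer>> join(
            Map<Integer, String> users,   // id -> email address
            List<int[]> scores) {         // each element: {id, score}
        Map<String, List<Integer>> result = new LinkedHashMap<>();
        for (int[] s : scores) {
            String email = users.get(s[0]);
            if (email == null) continue;  // skip scores for unknown user ids
            result.computeIfAbsent(email, k -> new ArrayList<>()).add(s[1]);
        }
        return result;
    }

    public static void main(String[] args) {
        Map<Integer, String> users = Map.of(1, "anil@example.com");
        List<int[]> scores = List.of(new int[]{1, 80}, new int[]{1, 90},
                                     new int[]{1, 10}, new int[]{1, 20});
        // prints {anil@example.com=[80, 90, 10, 20]}
        System.out.println(join(users, scores));
    }
}
```

At 20 million users and 110 million scores, the user table would not fit in one
bolt's memory as a plain HashMap; a fields-grouping on the user ID would let
each bolt instance hold only its shard of the user table.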
 


Re: use normal storm bolt in trident topology

2014-07-23 Thread P. Taylor Goetz
You cannot use regular storm bolts in a trident topology. Regular spouts can be 
used with trident, but not bolts.

- Taylor

On Jul 23, 2014, at 1:16 PM, A.M. costco.s...@yahoo.com wrote:

 Is it recommended to use regular storm bolts in a trident topology? or is it 
 at all possible? Thanks for your advice in advance.
 
 -Costco
 





Re: use normal storm bolt in trident topology

2014-07-23 Thread A.M.
Hi Taylor, thanks for confirming. 



-Costco
On Wednesday, July 23, 2014 10:24 AM, P. Taylor Goetz ptgo...@gmail.com wrote:
 


You cannot use regular storm bolts in a trident topology. Regular spouts can be 
used with trident, but not bolts.

- Taylor


On Jul 23, 2014, at 1:16 PM, A.M. costco.s...@yahoo.com wrote:

Is it recommended to use regular storm bolts in a trident topology? or is it at 
all possible? Thanks for your advice in advance.


-Costco




Re: Netty transport errors.

2014-07-23 Thread Tomas Mazukna
No, it's on RHEL on real hardware...
although we went through a network upgrade a couple of days ago, and all sorts
of things started failing.


On Wed, Jul 23, 2014 at 1:43 PM, Andrew Montalenti and...@parsely.com
wrote:

 Tomas,

 You don't happen to be running Ubuntu 14.04 on a Xen kernel, do you? E.g. on
 Amazon EC2.

 I discovered an issue where running Storm across many workers on that OS
 led to me hitting an annoying network driver bug that would cause timeouts
 and topology freezes like you are seeing. Check dmesg for odd messages from
 your network stack. Just a guess.

 (copied from my reply to another similar thread)

 -AM
 On Jul 23, 2014 10:07 AM, Tomas Mazukna tomas.mazu...@gmail.com wrote:

 I am really puzzled about why processing stopped in the topology.
 It looks like the acking threads all stopped communicating. The only hint I
 saw was this Netty exception.
 Any hints on how to prevent this from happening again?

 2014-07-23 08:56:03 b.s.m.n.Client [INFO] Closing Netty Client
 Netty-Client-ndhhadappp3.tsh.mis.mckesson.com/10.48.132.224:9703

 2014-07-23 08:56:03 b.s.m.n.Client [INFO] Waiting for pending batchs to
 be sent with
 Netty-Client-ndhhadappp3.tsh.mis.mckesson.com/10.48.132.224:9703...,
 timeout: 60ms, pendings: 0

 2014-07-23 08:56:03 b.s.m.n.Client [INFO] Closing Netty Client
 Netty-Client-ndhhadappp3.tsh.mis.mckesson.com/10.48.132.224:9700

 2014-07-23 08:56:03 b.s.m.n.Client [INFO] Waiting for pending batchs to
 be sent with
 Netty-Client-ndhhadappp3.tsh.mis.mckesson.com/10.48.132.224:9700...,
 timeout: 60ms, pendings: 0

 2014-07-23 08:56:03 b.s.util [ERROR] Async loop died!

 java.lang.RuntimeException: java.lang.RuntimeException: Client is being
 closed, and does not take requests any more

 at
 backtype.storm.utils.DisruptorQueue.consumeBatchToCursor(DisruptorQueue.java:128)
 ~[storm-core-0.9.2-incubating.jar:0.9.2-incubating]

 at
 backtype.storm.utils.DisruptorQueue.consumeBatchWhenAvailable(DisruptorQueue.java:99)
 ~[storm-core-0.9.2-incubating.jar:0.9.2-incubating]

 at
 backtype.storm.disruptor$consume_batch_when_available.invoke(disruptor.clj:80)
 ~[storm-core-0.9.2-incubating.jar:0.9.2-incubating]

 at
 backtype.storm.disruptor$consume_loop_STAR_$fn__758.invoke(disruptor.clj:94)
 ~[storm-core-0.9.2-incubating.jar:0.9.2-incubating]

 at backtype.storm.util$async_loop$fn__457.invoke(util.clj:431)
 ~[storm-core-0.9.2-incubating.jar:0.9.2-incubating]

 at clojure.lang.AFn.run(AFn.java:24) [clojure-1.5.1.jar:na]

 at java.lang.Thread.run(Thread.java:745) [na:1.7.0_60]

 Caused by: java.lang.RuntimeException: Client is being closed, and does
 not take requests any more

 at backtype.storm.messaging.netty.Client.send(Client.java:194)
 ~[storm-core-0.9.2-incubating.jar:0.9.2-incubating]

 at
 backtype.storm.utils.TransferDrainer.send(TransferDrainer.java:54)
 ~[storm-core-0.9.2-incubating.jar:0.9.2-incubating]

 at
 backtype.storm.daemon.worker$mk_transfer_tuples_handler$fn__5927$fn__5928.invoke(worker.clj:322)
 ~[storm-core-0.9.2-incubating.jar:0.9.2-incubating]

 at
 backtype.storm.daemon.worker$mk_transfer_tuples_handler$fn__5927.invoke(worker.clj:320)
 ~[storm-core-0.9.2-incubating.jar:0.9.2-incubating]

 at
 backtype.storm.disruptor$clojure_handler$reify__745.onEvent(disruptor.clj:58)
 ~[storm-core-0.9.2-incubating.jar:0.9.2-incubating]

 at
 backtype.storm.utils.DisruptorQueue.consumeBatchToCursor(DisruptorQueue.java:125)
 ~[storm-core-0.9.2-incubating.jar:0.9.2-incubating]

 ... 6 common frames omitted

 2014-07-23 08:56:03 b.s.util [INFO] Halting process: (Async loop died!)


 Configuration:

 worker.childopts: -Xmx2048m -Xss256k -XX:MaxPermSize=256m
 -XX:+UseConcMarkSweepGC -XX:+UseParNewGC -XX:+UseConcMarkSweepGC
 -XX:NewSize=128m -XX:CMSInitiatingOccupancyFraction=70
 -XX:-CMSConcurrentMTEnabled -Djava.net.preferIPv4Stack=true

 supervisor.childopts: -Xmx256m -Djava.net.preferIPv4Stack=true

 nimbus.childopts: -Xmx1024m -Djava.net.preferIPv4Stack=true

 ui.childopts: -Xmx768m -Djava.net.preferIPv4Stack=true

 nimbus.thrift.threads: 256


 storm.messaging.transport: backtype.storm.messaging.netty.Context

 storm.messaging.netty.server_worker_threads: 1

 storm.messaging.netty.client_worker_threads: 1

 storm.messaging.netty.buffer_size: 5242880

 storm.messaging.netty.max_retries: 100

 storm.messaging.netty.max_wait_ms: 1000

 storm.messaging.netty.min_wait_ms: 100


 Thanks,
 --
 Tomas Mazukna
 678-557-3834




-- 
Tomas Mazukna
678-557-3834


intra-topology SSL transport

2014-07-23 Thread Isaac Councill
Hi,

I've been working with storm on mesos but I need to make sure all workers
are messaging over SSL since streams may contain sensitive information for
almost all of my use cases.

stunnel seems like a viable option but I dislike having complex port
forwarding arrangements and would prefer code to config in this case.

As an exercise to see how much work it would be, I forked storm and
modified the storm-netty package to use SSL with the existing nio. Not so
bad, and lein tests pass.

Still wrapping my head around the storm codebase. Would using my modified
storm-netty Context as storm.messaging.transport be enough to ensure
streams are encrypted, or would I need to also attack the thrift transport
plugin?

Also, is anyone else interested in locking storm down with SSL?
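For reference, the heart of such a storm-netty modification is building an
SSLEngine for Netty's SslHandler to wrap around the existing channel pipeline.
A minimal stdlib-only sketch (class and method names are hypothetical; a real
deployment would initialize the context with key/trust managers loaded from a
keystore rather than the JVM defaults used here):

```java
import javax.net.ssl.SSLContext;
import javax.net.ssl.SSLEngine;

public class WorkerSslEngine {
    // Build a server-side SSLEngine; a Netty pipeline would wrap this
    // in an SslHandler placed before the existing codec handlers.
    public static SSLEngine serverEngine() throws Exception {
        SSLContext ctx = SSLContext.getInstance("TLS");
        // null key/trust managers fall back to JVM defaults; a real worker
        // would load its certificate and trust store here.
        ctx.init(null, null, null);
        SSLEngine engine = ctx.createSSLEngine();
        engine.setUseClientMode(false);   // worker accepting connections
        engine.setNeedClientAuth(false);  // flip on for mutual auth
        return engine;
    }

    public static void main(String[] args) throws Exception {
        SSLEngine e = serverEngine();
        System.out.println("enabled protocols: "
                + String.join(",", e.getEnabledProtocols()));
    }
}
```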


Re: intra-topology SSL transport

2014-07-23 Thread Derek Dagit

In the security branch of storm, worker-worker communication is encrypted 
(blowfish) with a shared secret.

STORM-348 will add authentication to worker-worker.

For thrift (nimbus & drpc), the security branch has SASL/Kerberos 
authentication, and you should be able to configure encryption via SASL as well.  
We have not tried enabling encryption with SASL.
--
Derek

On 7/23/14, 14:05, Isaac Councill wrote:

Hi,

I've been working with storm on mesos but I need to make sure all workers
are messaging over SSL since streams may contain sensitive information for
almost all of my use cases.

stunnel seems like a viable option but I dislike having complex port
forwarding arrangements and would prefer code to config in this case.

As an exercise to see how much work it would be, I forked storm and
modified the storm-netty package to use SSL with the existing nio. Not so
bad, and lein tests pass.

Still wrapping my head around the storm codebase. Would using my modified
storm-netty Context as storm.messaging.transport be enough to ensure
streams are encrypted, or would I need to also attack the thrift transport
plugin?

Also, is anyone else interested in locking storm down with SSL?



Re: intra-topology SSL transport

2014-07-23 Thread Isaac Councill
great, thanks!


On Wed, Jul 23, 2014 at 3:25 PM, Derek Dagit der...@yahoo-inc.com wrote:

 In the security branch of storm, worker-worker communication is encrypted
 (blowfish) with a shared secret.

 STORM-348 will add authentication to worker-worker.

 For thrift (nimbus & drpc), the security branch has SASL/Kerberos
 authentication, and you should be able to configure encryption via SASL as
 well.  We have not tried enabling encryption with SASL.
 --
 Derek


 On 7/23/14, 14:05, Isaac Councill wrote:

 Hi,

 I've been working with storm on mesos but I need to make sure all workers
 are messaging over SSL since streams may contain sensitive information for
 almost all of my use cases.

 stunnel seems like a viable option but I dislike having complex port
 forwarding arrangements and would prefer code to config in this case.

 As an exercise to see how much work it would be, I forked storm and
 modified the storm-netty package to use SSL with the existing nio. Not so
 bad, and lein tests pass.

 Still wrapping my head around the storm codebase. Would using my modified
 storm-netty Context as storm.messaging.transport be enough to ensure
 streams are encrypted, or would I need to also attack the thrift transport
 plugin?

 Also, is anyone else interested in locking storm down with SSL?




Parallelism for KafkaSpout

2014-07-23 Thread Kashyap Mhaisekar
Hi,
Is the number of executors for KafkaSpout dependent on the number of partitions
for the topic?
E.g., say Kafka topic TopicA has 5 partitions.
If I have a KafkaSpout with the parallelism hint set to 10, will all
the executors get data? Or am I constrained by the number of partitions
declared for the topic?

Regards,
Kashyap


Re: Parallelism for KafkaSpout

2014-07-23 Thread Robert Lee
From my understanding, the executors without a partition to read remain
idle. It's recommended for efficiency to set the number of executors equal to
the number of partitions.
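The arithmetic behind this can be sketched without Storm at all. Below is a
hypothetical round-robin model of partition-to-executor assignment, not the
actual storm-kafka code; it just illustrates why executors beyond the partition
count sit idle:

```java
import java.util.ArrayList;
import java.util.List;

public class PartitionAssignment {
    // Distribute `partitions` Kafka partitions over `executors` spout
    // executors round-robin; with executors > partitions, the extras
    // receive nothing and stay idle.
    public static List<List<Integer>> assign(int partitions, int executors) {
        List<List<Integer>> perExecutor = new ArrayList<>();
        for (int e = 0; e < executors; e++) perExecutor.add(new ArrayList<>());
        for (int p = 0; p < partitions; p++) perExecutor.get(p % executors).add(p);
        return perExecutor;
    }

    public static void main(String[] args) {
        // Kashyap's example: 5 partitions, parallelism hint 10.
        List<List<Integer>> a = assign(5, 10);
        long idle = a.stream().filter(List::isEmpty).count();
        System.out.println("idle executors: " + idle);  // prints 5
    }
}
```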


On Wed, Jul 23, 2014 at 5:11 PM, Kashyap Mhaisekar kashya...@gmail.com
wrote:

 Hi,
 Is the no. of executors for KafkaSpout dependent on the partitions for the
 topic?
 For E.g.,
 Say kafka TopicA has 5 partitions.
 If I have a KafkaSpout that has the parallelism hint set to 10, will all
 the executors get the data? Or am i constrained by the no. of partitions
 declared for the topic?

 Regards,
 Kashyap



Re: Parallelism for KafkaSpout

2014-07-23 Thread Nathan Leung
In your example, five spouts would get data and the other five would not.
On Jul 23, 2014 5:11 PM, Kashyap Mhaisekar kashya...@gmail.com wrote:

 Hi,
 Is the no. of executors for KafkaSpout dependent on the partitions for the
 topic?
 For E.g.,
 Say kafka TopicA has 5 partitions.
 If I have a KafkaSpout that has the parallelism hint set to 10, will all
 the executors get the data? Or am i constrained by the no. of partitions
 declared for the topic?

 Regards,
 Kashyap



Re: Parallelism for KafkaSpout

2014-07-23 Thread Raphael Hsieh
How do you tell how many partitions are used by a specific bolt in the
topology?


On Wed, Jul 23, 2014 at 2:15 PM, Nathan Leung ncle...@gmail.com wrote:

 In your example, five spouts would get data and the other five would not.
 On Jul 23, 2014 5:11 PM, Kashyap Mhaisekar kashya...@gmail.com wrote:

 Hi,
 Is the no. of executors for KafkaSpout dependent on the partitions for
 the topic?
 For E.g.,
 Say kafka TopicA has 5 partitions.
 If I have a KafkaSpout that has the parallelism hint set to 10, will all
 the executors get the data? Or am i constrained by the no. of partitions
 declared for the topic?

 Regards,
 Kashyap




-- 
Raphael Hsieh


Re: NotAliveException via Storm UI

2014-07-23 Thread Vincent Russell
I think I solved my own issue: there is a space in my topology name, so I
believe the Storm UI is not escaping the link.


On Wed, Jul 23, 2014 at 6:49 PM, Vincent Russell vincent.russ...@gmail.com
wrote:

 I'm using Storm 0.9.0.1 and I'm getting a NotAliveException via the Storm
 UI:


 NotAliveException(msg:Ingest-33-1406154838)
   at 
 backtype.storm.generated.Nimbus$getTopologyInfo_result.read(Nimbus.java:11330)
   at org.apache.thrift7.TServiceClient.receiveBase(TServiceClient.java:78)
   at 
 backtype.storm.generated.Nimbus$Client.recv_getTopologyInfo(Nimbus.java:474)
   at 
 backtype.storm.generated.Nimbus$Client.getTopologyInfo(Nimbus.java:461)
   at backtype.storm.ui.core$topology_page.invoke(core.clj:481)
   at backtype.storm.ui.core$fn__7877.invoke(core.clj:745)
   at compojure.core$make_route$fn__3855.invoke(core.clj:93)
   at compojure.core$if_route$fn__3843.invoke(core.clj:39)
   at compojure.core$if_method$fn__3836.invoke(core.clj:24)
   at compojure.core$routing$fn__3861.invoke(core.clj:106)
   at clojure.core$some.invoke(core.clj:2390)
   at compojure.core$routing.doInvoke(core.clj:106)
   at clojure.lang.RestFn.applyTo(RestFn.java:139)
   at clojure.core$apply.invoke(core.clj:603)
   at compojure.core$routes$fn__3865.invoke(core.clj:111)
   at ring.middleware.reload$wrap_reload$fn__7170.invoke(reload.clj:14)
   at backtype.storm.ui.core$catch_errors$fn__7912.invoke(core.clj:798)
   at 
 ring.middleware.keyword_params$wrap_keyword_params$fn__4391.invoke(keyword_params.clj:27)
   at 
 ring.middleware.nested_params$wrap_nested_params$fn__4428.invoke(nested_params.clj:65)
   at ring.middleware.params$wrap_params$fn__4365.invoke(params.clj:55)
   at 
 ring.middleware.multipart_params$wrap_multipart_params$fn__4454.invoke(multipart_params.clj:103)
   at ring.middleware.flash$wrap_flash$fn__4729.invoke(flash.clj:14)
   at ring.middleware.session$wrap_session$fn__4720.invoke(session.clj:43)
   at ring.middleware.cookies$wrap_cookies$fn__4657.invoke(cookies.clj:160)
   at ring.adapter.jetty$proxy_handler$fn__4204.invoke(jetty.clj:16)
   at 
 ring.adapter.jetty.proxy$org.mortbay.jetty.handler.AbstractHandler$0.handle(Unknown
  Source)
   at 
 org.mortbay.jetty.handler.HandlerWrapper.handle(HandlerWrapper.java:152)
   at org.mortbay.jetty.Server.handle(Server.java:326)
   at 
 org.mortbay.jetty.HttpConnection.handleRequest(HttpConnection.java:542)
   at 
 org.mortbay.jetty.HttpConnection$RequestHandler.headerComplete(HttpConnection.java:928)
   at org.mortbay.jetty.HttpParser.parseNext(HttpParser.java:549)
   at org.mortbay.jetty.HttpParser.parseAvailable(HttpParser.java:212)
   at org.mortbay.jetty.HttpConnection.handle(HttpConnection.java:404)
   at 
 org.mortbay.jetty.bio.SocketConnector$Connection.run(SocketConnector.java:228)
   at 
 org.mortbay.thread.QueuedThreadPool$PoolThread.run(QueuedThreadPool.java:582)


 but if I get the status from the command line it looks fine:


 # storm list
 Running: java -client -Dstorm.options=
 -Dstorm.home=/opt/storm/storm-0.9.0.1
 -Djava.library.path=/usr/lib64:/usr/lib -Dstorm.conf.file= -cp
 

storm reliability

2014-07-23 Thread Chen Wang
Guys,
I am a bit confused about the fault-tolerance feature of Storm. I have read
https://storm.incubator.apache.org/documentation/Guaranteeing-message-processing.html

My question is: if a bolt fails with some runtime exception, does that mean
the tuple has failed and the same tuple sent to this bolt will be
replayed? Or does it mean the message counts as processed even though it
failed, since it already met the at-least-once guarantee, and will not be
replayed?
Or does it depend on how and when I call .ack and .fail?

Also, about replaying a message on timeout: does it mean that if a downstream
bolt takes more than the specified timeout, the message will be replayed?
Or is it again dependent on when I call ack and fail?

Thanks for the clarification.
Chen
Chen


Re: storm reliability

2014-07-23 Thread Naresh Kosgi
All replays occur from the spout.  Think of what you emit from the spout as
your root tuple and all other tuples generated from that tuple as child
tuples. If any downstream tuple fails, you can code to re-emit the same tuple
from your spout and replay the entire set.  There is no way I know of
to replay just the bolt tuples.

When all downstream processing is complete, the ack method in the spout is
called by the Storm framework, and fail is called when one of the downstream
tuples fails or times out.  Storm provides you a framework to set up replay,
but it is you as the developer who needs to code for it.

Hope this helps
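For background on how Storm knows a tuple tree is complete: the acker tracks
each tree as an XOR of the ids of all tuples still in flight. Here is a toy,
self-contained sketch of that bookkeeping (an illustration of the idea, not
Storm's actual implementation):

```java
import java.util.HashMap;
import java.util.Map;

public class AckerSketch {
    // rootId -> XOR of the ids of all outstanding tuples in that tree.
    private final Map<Long, Long> pending = new HashMap<>();

    // A tuple anchored to rootId was emitted: XOR its id in.
    public void emitted(long rootId, long tupleId) {
        pending.merge(rootId, tupleId, (a, b) -> a ^ b);
    }

    // A tuple anchored to rootId was acked: XORing the same id a second
    // time cancels it, so the entry reaches 0 exactly when every emitted
    // tuple in the tree has been acked.
    public void acked(long rootId, long tupleId) {
        long v = pending.merge(rootId, tupleId, (a, b) -> a ^ b);
        if (v == 0) pending.remove(rootId);  // whole tree done -> spout.ack()
    }

    public boolean treeComplete(long rootId) {
        return !pending.containsKey(rootId);
    }

    public static void main(String[] args) {
        AckerSketch acker = new AckerSketch();
        acker.emitted(42, 100);                      // spout emits root tuple
        acker.emitted(42, 101);                      // bolt emits anchored child
        acker.acked(42, 100);                        // bolt acks the root tuple
        System.out.println(acker.treeComplete(42));  // false: child still live
        acker.acked(42, 101);                        // child acked downstream
        System.out.println(acker.treeComplete(42));  // true: spout.ack() fires
    }
}
```

If any tuple in the tree is never acked before the timeout, the entry never
reaches 0 and the spout's fail() is called instead.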


On Wed, Jul 23, 2014 at 9:13 PM, Chen Wang chen.apache.s...@gmail.com
wrote:

 Guys,
 i am a bit confused about the fault tolerance feature of storm. I have read

 https://storm.incubator.apache.org/documentation/Guaranteeing-message-processing.html

 My question is, if a bolt failed with some runtime exception, does it
 means that the tuple is failed and the same tuple sent to this bolt will be
 replayed again? Or it means that the message has been processed although it
 failed. It already met the at least one processing, and will not be
 replayed.
 Or it depends on how and when I call .ack and .fail?

 Also about replaying message if time out, does it mean that if a
 downstream bolt takes more than the specified time out, the message will be
 replayed? Or its again depending on when I call ack and fail..

 Thanks for clarification.
 Chen



Re: storm reliability

2014-07-23 Thread Chen Wang
So if my downstream bolt is doing some time-consuming task after receiving
a signal (tuple) from the spout, e.g.:

 public void execute(Tuple tuple) {
     String sentence = tuple.getString(0);
     while (notDone) {
         // time-consuming job
     }
     _collector.ack(tuple);
 }

If the while loop takes longer than the specified timeout (30s), then
Storm will consider the signal failed and .fail() will be called
in the spout. Is that correct? And whether the tuple is replayed or
not depends entirely on my implementation of fail()?

Another case: if I move _collector.ack(tuple) before the while loop,
will Storm then consider the tuple to have at least succeeded on
this bolt? Even if an exception is thrown in the while loop, fail()
will not be called on the spout. Is that right?

Thanks,

Chen





On Wed, Jul 23, 2014 at 6:22 PM, Naresh Kosgi nareshko...@gmail.com wrote:

 All replays occur from the spout.  Think of what you emit from the spout
 as your root tuple and all other tuples generated from that tuple as child
 tuples. If any downstream tuple fails, you can code to re-emit the same tuple
 from your spout and replay the entire set.  There is no way I know of
 to replay just the bolt tuples.

 When all downstream processing is complete, the ack method in the spout is
 called by the Storm framework, and fail is called when one of the downstream
 tuples fails or times out.  Storm provides you a framework to set up replay,
 but it is you as the developer who needs to code for it.

 Hope this helps


 On Wed, Jul 23, 2014 at 9:13 PM, Chen Wang chen.apache.s...@gmail.com
 wrote:

 Guys,
 i am a bit confused about the fault tolerance feature of storm. I have
 read

 https://storm.incubator.apache.org/documentation/Guaranteeing-message-processing.html

 My question is, if a bolt failed with some runtime exception, does it
 means that the tuple is failed and the same tuple sent to this bolt will be
 replayed again? Or it means that the message has been processed although it
 failed. It already met the at least one processing, and will not be
 replayed.
 Or it depends on how and when I call .ack and .fail?

 Also about replaying message if time out, does it mean that if a
 downstream bolt takes more than the specified time out, the message will be
 replayed? Or its again depending on when I call ack and fail..

 Thanks for clarification.
 Chen





Re: storm reliability

2014-07-23 Thread Naresh Kosgi
Yes, that is correct.

Two quick points:
1.  If you are not going to use the fail method for some kind of replay, then
you probably should not anchor your tuples, as Storm runs a little bit
faster when the replay framework is not used.
2. The timeout setting is configurable, so you can set it to more than 30
seconds if your process takes more time.
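On point 2, the relevant setting is topology.message.timeout.secs (default 30).
A sketch of raising it, assuming the standard storm.yaml key:

```
# Cluster-wide default in storm.yaml; can also be set per topology
# from Java via Config.setMessageTimeoutSecs(120) before submitting.
topology.message.timeout.secs: 120
```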


On Wed, Jul 23, 2014 at 10:46 PM, Chen Wang chen.apache.s...@gmail.com
wrote:

 So if my downstream bolt is doing some time-consuming task after receiving
 a signal (tuple) from the spout, e.g.:

  public void execute(Tuple tuple) {
      String sentence = tuple.getString(0);
      while (notDone) {
          // time-consuming job
      }
      _collector.ack(tuple);
  }

 If the while loop takes longer than the specified timeout (30s), then Storm 
 will consider the signal failed and .fail() will be called in the spout. 
 Is that correct? And whether the tuple is replayed or not depends entirely 
 on my implementation of fail()?

 Another case: if I move _collector.ack(tuple) before the while loop, will 
 Storm then consider the tuple to have at least succeeded on this bolt? 
 Even if an exception is thrown in the while loop, fail() will not be called 
 on the spout. Is that right?

 Thanks,

 Chen





 On Wed, Jul 23, 2014 at 6:22 PM, Naresh Kosgi nareshko...@gmail.com
 wrote:

 All replays occur from the spout.  Think of what you emit from the spout
 as your root tuple and all other tuples generated from that tuple as child
 tuples. If any downstream tuple fails, you can code to re-emit the same tuple
 from your spout and replay the entire set.  There is no way I know of
 to replay just the bolt tuples.

 When all downstream processing is complete, the ack method in the spout is
 called by the Storm framework, and fail is called when one of the downstream
 tuples fails or times out.  Storm provides you a framework to set up replay,
 but it is you as the developer who needs to code for it.

 Hope this helps


 On Wed, Jul 23, 2014 at 9:13 PM, Chen Wang chen.apache.s...@gmail.com
 wrote:

 Guys,
 i am a bit confused about the fault tolerance feature of storm. I have
 read

 https://storm.incubator.apache.org/documentation/Guaranteeing-message-processing.html

 My question is, if a bolt failed with some runtime exception, does it
 means that the tuple is failed and the same tuple sent to this bolt will be
 replayed again? Or it means that the message has been processed although it
 failed. It already met the at least one processing, and will not be
 replayed.
 Or it depends on how and when I call .ack and .fail?

 Also about replaying message if time out, does it mean that if a
 downstream bolt takes more than the specified time out, the message will be
 replayed? Or its again depending on when I call ack and fail..

 Thanks for clarification.
 Chen