Process/Thread ids for bolts and spouts

2016-01-19 Thread Milind Vaidya
Is there any way to find the process/thread IDs of the Kafka spout and the underlying bolts in a topology from the Linux command line? As an extension of the other thread about failure scenarios, I want to manually kill these individual workers/executors/tasks, if possible, to simulate the corresponding failure scenarios an
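For context, each Storm worker is a separate JVM process and executors run as threads inside it, so a worker has a process ID that tools such as `ps` can show, while an executor does not have a process ID of its own. One way to tie a component to a worker is to log the IDs from inside the spout or bolt. The following is a minimal sketch, not part of Storm: the class `WorkerIds` and its methods are hypothetical, it assumes the JVM reports its runtime name as `pid@hostname` (a common convention, not a guarantee), and the Java thread ID it prints is not the Linux thread ID.

```java
import java.lang.management.ManagementFactory;

// Hypothetical helper: call describeCurrentThread() from a spout's open()
// or a bolt's prepare() and write the result to the worker log.
public class WorkerIds {

    /** Extracts the pid from a "pid@hostname" runtime name; returns -1 if the format differs. */
    static long parsePid(String runtimeName) {
        int at = runtimeName.indexOf('@');
        if (at <= 0) {
            return -1L;
        }
        try {
            return Long.parseLong(runtimeName.substring(0, at));
        } catch (NumberFormatException e) {
            return -1L;
        }
    }

    /** Describes the JVM process and the thread that is running the caller. */
    static String describeCurrentThread() {
        // Assumption: the runtime name looks like "12345@hostname".
        String jvmName = ManagementFactory.getRuntimeMXBean().getName();
        Thread t = Thread.currentThread();
        // t.getId() is the JVM's own thread id, not the Linux thread id.
        return "pid=" + parsePid(jvmName)
                + " thread=" + t.getName()
                + " javaThreadId=" + t.getId();
    }

    public static void main(String[] args) {
        System.out.println(describeCurrentThread());
    }
}
```

With the pid in the worker log, killing that process from the shell simulates a worker failure. Individual executor threads cannot be killed separately this way, because a signal sent to the process affects the whole worker.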

Re: Data loss scenarios

2016-01-19 Thread Milind Vaidya
Yes. In a sunny-day scenario there is no data loss. But we are trying to list the cases where there will be data loss, or at least to consider different scenarios in which one or more components fail, and see how the Kafka-Storm setup reacts and whether any data is lost. We had

Re: How to redeploy the topology

2016-01-19 Thread Matthias J. Sax
That is not possible. You first need to "kill" the running topology, and then resubmit with the new code... -Matthias On 01/19/2016 05:49 PM, Noppanit Charassinvichai wrote: > Right now I'm using storm jar command to deploy the topology to the > storm cluster. I have setup Jenkins to deploy the

Re: STDIO [INFO] java.io.FileNotFoundException

2016-01-19 Thread Kashyap Mhaisekar
If the project worked fine in local mode, then it is referring to a path that is available locally but not on the cluster. I would suggest going through the code, figuring out where the file is meant to be, and making changes on the cluster accordingly. On Tue, Jan 19, 2016 at 11:05 AM, resea

Re: How to redeploy the topology

2016-01-19 Thread Noppanit Charassinvichai
Doh! I totally missed this. http://storm.apache.org/documentation/Command-line-client.html On Tue, 19 Jan 2016 at 13:22 Noppanit Charassinvichai wrote: > How can I tell storm to stop the current topology from command line? > Ultimately, I want to automate the process. > > On Tue, 19 Jan 2016 at 13

Re: How to redeploy the topology

2016-01-19 Thread Noppanit Charassinvichai
How can I tell storm to stop the current topology from command line? Ultimately, I want to automate the process. On Tue, 19 Jan 2016 at 13:00 Stephen Powis wrote: > You would need to stop or kill the running topology first. There is an > argument to the stop command that tells storm how long (i

Re: How to redeploy the topology

2016-01-19 Thread Stephen Powis
You would need to stop or kill the running topology first. There is an argument to the stop command that tells storm how long (in secs) to wait before killing the topology. My understanding of how this works is when you issue the stop command, the topology simply disables the spouts in the topolo
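Since the goal in this thread is to automate the redeploy, the kill-then-resubmit sequence can be sketched as a small Java helper that a Jenkins job could call. This is an illustration under stated assumptions, not a definitive implementation: it uses the `storm kill <name> -w <secs>` and `storm jar <jar> <main-class> [args]` forms of the command-line client, while the class `Redeploy`, the jar path, the main class, the topology name, and the choice to pass the topology name as the main class's first argument are all hypothetical.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical redeploy helper: builds the two CLI invocations and, when
// started with --run, executes them in order. Without --run it only prints them.
public class Redeploy {

    /** Command that kills a running topology after waiting waitSecs seconds. */
    static List<String> killCommand(String topology, int waitSecs) {
        return Arrays.asList("storm", "kill", topology, "-w", String.valueOf(waitSecs));
    }

    /** Command that submits a jar; passing the topology name as an argument is an assumption. */
    static List<String> submitCommand(String jar, String mainClass, String topology) {
        return Arrays.asList("storm", "jar", jar, mainClass, topology);
    }

    /** Runs a command, forwarding its output, and returns the exit code. */
    static int run(List<String> command) throws Exception {
        Process p = new ProcessBuilder(command).inheritIO().start();
        return p.waitFor();
    }

    public static void main(String[] args) throws Exception {
        boolean execute = args.length > 0 && "--run".equals(args[0]);
        int waitSecs = 30; // placeholder wait time
        List<String> kill = killCommand("my-topology", waitSecs);
        List<String> submit = submitCommand("target/my-topology.jar", "com.example.MyTopology", "my-topology");

        if (!execute) {
            System.out.println(String.join(" ", kill));
            System.out.println(String.join(" ", submit));
            return;
        }
        // A non-zero exit here may just mean the topology was not running.
        run(kill);
        // The kill takes effect only after the wait time, so pause before resubmitting.
        // The extra 10 seconds is a guess; a more careful job would poll until the name is gone.
        Thread.sleep((waitSecs + 10) * 1000L);
        System.exit(run(submit));
    }
}
```

Keeping the command construction separate from execution makes the job easy to dry-run before pointing it at a real cluster.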

Re: Is calling ack() in Bolt for only guarantee?

2016-01-19 Thread Noppanit Charassinvichai
Ahhh, got it! Thanks a lot. On Tue, 19 Jan 2016 at 12:23 John Yost wrote: > Close. :) The tuples that failed to process in the failed/failing bolt > will be replayed, and the failed bolt executor will be restarted within a > new worker process. > > --John > > On Tue, Jan 19, 2016 at 12:18 PM, No

Re: Is calling ack() in Bolt for only guarantee?

2016-01-19 Thread John Yost
Close. :) The tuples that failed to process in the failed/failing bolt will be replayed, and the failed bolt executor will be restarted within a new worker process. --John On Tue, Jan 19, 2016 at 12:18 PM, Noppanit Charassinvichai < noppani...@gmail.com> wrote: > Thanks for your reply. So that m

Re: Is calling ack() in Bolt for only guarantee?

2016-01-19 Thread Noppanit Charassinvichai
Thanks for your reply. So does that mean that if I don't do any fieldsGrouping, the bolt that fails will be replayed? I ask because my topology only uses shuffleGrouping. Thanks very much. On Tue, 19 Jan 2016 at 11:00 John Yost wrote: > Storm would replay the tuple and there is no guarantee which bolt it g

The best way to unit test a Bolt with dependency?

2016-01-19 Thread Noppanit Charassinvichai
I'm trying to unit test my Bolt, which has a dependency on RabbitMQ. This is what I put in my prepare method. @Override public void prepare(Map map, TopologyContext topologyContext, OutputCollector outputCollector) { this.outputCollector = outputCollector; this.gson = new Gso
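One common way to approach this kind of test is to put the RabbitMQ calls behind a small interface, so that the bolt's logic can be exercised with an in-memory fake instead of a live broker. The sketch below is only an illustration of that idea: `Publisher`, `MessageHandler`, and `RecordingPublisher` are hypothetical names, and the Storm and RabbitMQ classes are deliberately left out so that the example compiles on its own. In a real bolt, `prepare` would build the RabbitMQ-backed publisher and `execute` would delegate to the handler, then ack or fail the tuple based on the result.

```java
import java.util.ArrayList;
import java.util.List;

public class BoltDependencySketch {

    /** Hypothetical seam: the only thing the bolt logic needs from RabbitMQ. */
    interface Publisher {
        void publish(String message);
    }

    /** Hypothetical class holding the logic that execute() would delegate to. */
    static class MessageHandler {
        private final Publisher publisher;

        MessageHandler(Publisher publisher) {
            this.publisher = publisher;
        }

        /** Returns true if the payload was published (ack), false if it was rejected (fail). */
        boolean handle(String payload) {
            String trimmed = payload == null ? "" : payload.trim();
            if (trimmed.isEmpty()) {
                return false;
            }
            publisher.publish(trimmed);
            return true;
        }
    }

    /** Test double that records messages instead of sending them to a broker. */
    static class RecordingPublisher implements Publisher {
        final List<String> sent = new ArrayList<>();

        @Override
        public void publish(String message) {
            sent.add(message);
        }
    }

    public static void main(String[] args) {
        RecordingPublisher fake = new RecordingPublisher();
        MessageHandler handler = new MessageHandler(fake);
        System.out.println(handler.handle(" hello ") + " " + fake.sent);
    }
}
```

The same seam also works with a mocking library; a hand-written fake is used here only to keep the sketch free of extra dependencies.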

Re: STDIO [INFO] java.io.FileNotFoundException

2016-01-19 Thread researcher cs
I checked it before. This file contains entries like "2 too", "4 for", "4get forget", "abt about". The code is supposed to use the content of this file, but I checked the jar file after building it with mvn and didn't find the file. The project worked fine in local mode but does not work in distributed mode, so I checked the logs again f

How to redeploy the topology

2016-01-19 Thread Noppanit Charassinvichai
Right now I'm using the `storm jar` command to deploy the topology to the Storm cluster. I have set up Jenkins to deploy the code. However, when I want to redeploy, how can I do it without interrupting the current stream? I get an error saying the topology name already exists. Thanks

Re: STDIO [INFO] java.io.FileNotFoundException

2016-01-19 Thread Kashyap Mhaisekar
You should probably check the code to see what file.txt is. Storm has nothing to do with file.txt, and whether it is [ERROR] or [INFO] depends on how the logger has been logging. So again, check the code. On Tue, Jan 19, 2016 at 1:26 AM, researcher cs wrote: > i mean from last question is that err

Re: Is calling ack() in Bolt for only guarantee?

2016-01-19 Thread John Yost
Storm would replay the tuple and there is no guarantee which bolt it goes to unless you are using fieldsGrouping. --John On Tue, Jan 19, 2016 at 10:56 AM, Noppanit Charassinvichai < noppani...@gmail.com> wrote: > Also, if I have one spout and two bolts. If both two bolts call ack() and > if one

Re: Is calling ack() in Bolt for only guarantee?

2016-01-19 Thread Noppanit Charassinvichai
Also, if I have one spout and two bolts, both bolts call ack(), and one of them fails, would the tuple be replayed only for that bolt, or for both bolts? On Tue, 19 Jan 2016 at 10:42 John Yost wrote: > Yes, acking a tuple confirms to Storm that a tuple was processed within a

Re: Is calling ack() in Bolt for only guarantee?

2016-01-19 Thread John Yost
Yes, acking a tuple confirms to Storm that a tuple was processed within a Spout or Bolt, and is used to guarantee at least once processing for all tuples processed by your topology. --John On Tue, Jan 19, 2016 at 9:49 AM, Noppanit Charassinvichai < noppani...@gmail.com> wrote: > I'm new to Storm

Is calling ack() in Bolt for only guarantee?

2016-01-19 Thread Noppanit Charassinvichai
I'm new to Storm, and I've seen some examples that do not call `ack()` in a Bolt. From the documentation and my understanding, is calling ack() just for guaranteeing that the message will be processed at least once? Thanks,

Re: How to retrieve the offsets of messages storm-kafka

2016-01-19 Thread Abhishek Agarwal
Not in 0.10.0 but it looks like it is possible in master. https://github.com/apache/storm/blob/master/external/storm-kafka/src/jvm/org/apache/storm/kafka/MessageMetadataScheme.java On Tue, Jan 19, 2016 at 4:08 PM, Florian Hussonnois wrote: > Hi all, > > I would to know if there is way to get the

How to retrieve the offsets of messages storm-kafka

2016-01-19 Thread Florian Hussonnois
Hi all, I would like to know if there is a way to get the offset/partition of each message using the KafkaSpout? Thanks in advance -- Florian HUSSONNOIS