Nathan,

Can you explain in a little more detail what you mean by “When you have more 
tasks than executors, the spout thread does the same logic, it just does it for 
more tasks during its main loop.”  I thought the spout thread emits tuples 
based on the max spout pending and how quickly the downstream bolts are 
processing the incoming tuples.

+1 for setting the number of tasks of a bolt higher than the number of 
executors so that you can rebalance later based on need.

The other time I would consider having more than one task per executor thread 
is when the task is I/O-intensive and spends most of its time waiting on 
responses rather than using the CPU.
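As a rough sketch of that over-provisioning idea (component names, counts, and the MySpout/MyBolt classes are made up; this assumes the classic pre-1.0 backtype.storm TopologyBuilder API that this thread is discussing):

```java
import backtype.storm.topology.TopologyBuilder;

// Start the bolt with 4 executors but 16 tasks. The task count is fixed
// at submit time, so setting it higher leaves headroom to rebalance the
// bolt up to 16 executors later without redeploying the topology.
TopologyBuilder builder = new TopologyBuilder();
builder.setSpout("spout", new MySpout(), 2);
builder.setBolt("count-bolt", new MyBolt(), 4)  // 4 executors initially
       .setNumTasks(16)                         // cannot change in rebalance
       .shuffleGrouping("spout");
// Later, from the CLI:
//   storm rebalance my-topology -e count-bolt=16
```

This is a wiring fragment, not a complete topology; it needs a Storm cluster (and the storm-core jar) to actually run.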


From: Nathan Leung [mailto:ncle...@gmail.com]
Sent: Thursday, May 14, 2015 8:05 AM
To: yang...@bupt.edu.cn
Cc: user
Subject: Re: Re: How much is the overhead of time to deploy a system on Storm ?


I would expect that it depends on how many executors you have. In Storm, an 
executor corresponds to an OS thread, while a task is more of a logical unit of 
work. The only situation where I would personally use more tasks than executors 
is if I wanted to over-provision the tasks so that I can rebalance to use more 
executors in the future (you cannot change the number of tasks during a 
rebalance).

When you have more tasks than executors, the spout thread runs the same logic; 
it just does it for more tasks during its main loop. I'm not sure why that 
would increase your per-thread throughput.
On May 13, 2015 10:13 PM, "yang...@bupt.edu.cn" <yang...@bupt.edu.cn> wrote:
Hi Nathan,
Actually I tried many ways to make my program fit Storm:
1: a 'while(true)' loop in nextTuple()
2: executing the processing logic n times in one nextTuple() call.
I don't need to batch messages, because what I really care about is the 
processing speed (the emit phase is not the bottleneck).
I want to mention this: I only created one single spout task on one machine 
node.
I also read some papers on Storm evaluation, and they applied some degree of 
parallelism. So I tried adding parallelism (10 tasks per executor per node), 
and I got a pretty good result (the same throughput as the Java program).
I wonder if this is the design pattern we should pick in Storm?

________________________________
yang...@bupt.edu.cn

From: Nathan Leung <ncle...@gmail.com>
Date: 2015-05-12 20:57
To: user <user@storm.apache.org>
Subject: Re: How much is the overhead of time to deploy a system on Storm ?

I'm not very surprised. See, for example, published single-machine benchmarks 
(IIRC 1.6 million tuples/s is the official figure from Nathan Marz, though 
that figure is a little old). That is spout to bolt and matches my observations 
for trivial cases. With some processing logic and only one spout, I can see how 
it's lower.

You can reduce the overhead by batching your work differently, e.g. by doing 
more work in each call to nextTuple().
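A minimal sketch of that batching idea (the queue, the batch size of 100, and the field layout are all hypothetical; this assumes the standard Storm spout API where nextTuple() is called repeatedly by the executor thread):

```java
// Emit up to a fixed batch of tuples per nextTuple() call instead of one,
// amortizing the per-call framework overhead across many emits.
public void nextTuple() {
    for (int i = 0; i < 100; i++) {          // batch size is illustrative
        String msg = queue.poll();           // assumed in-memory source
        if (msg == null) {
            return;  // nothing left; Storm's wait strategy kicks in
        }
        collector.emit(new Values(msg));
    }
}
```

This is a method fragment from a hypothetical spout class, not a complete spout; max spout pending still caps how many emitted tuples can be in flight at once.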
On May 12, 2015 4:56 AM, "Matthias J. Sax" <mj...@informatik.hu-berlin.de> 
wrote:
Can you share your code?

Do you process a single tuple each time nextTuple() is called? If a
spout does not emit anything, Storm applies a waiting-penalty to avoid
busy waiting. That might slow down your code.

You can configure the waiting strategy:
https://storm.apache.org/2012/09/06/storm081-released.html
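For example, using the standard config keys (the 1 ms value and the strategy class shown are just illustrations; SleepSpoutWaitStrategy is the default in the pre-1.0 backtype.storm API):

```java
import backtype.storm.Config;

// Shorten the sleep penalty applied when nextTuple() emits nothing,
// or swap in a different ISpoutWaitStrategy implementation entirely.
Config conf = new Config();
conf.put(Config.TOPOLOGY_SLEEP_SPOUT_WAIT_STRATEGY_TIME_MS, 1);
conf.put(Config.TOPOLOGY_SPOUT_WAIT_STRATEGY,
         "backtype.storm.spout.SleepSpoutWaitStrategy");
```

This is a config fragment; the conf object would be passed to StormSubmitter or LocalCluster when submitting the topology.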

-Matthias


On 05/12/2015 09:31 AM, Daniel Compton wrote:
> I'm also interested on the answers to this question, but to add to the
> discussion, take a look at
> http://aadrake.com/command-line-tools-can-be-235x-faster-than-your-hadoop-cluster.html.
> I suspect Storm is still introducing coordination overhead even running
> on a single machine.
> On Tue, 12 May 2015 at 1:39 pm yang...@bupt.edu.cn
> <yang...@bupt.edu.cn> wrote:
>
>     Hi and thanks .
>
>     I'm working on a parallel algorithm, which counts massive numbers of
>     items in data streams. Previous research on parallelizing this
>     algorithm focused on multi-core CPUs; however, I want to take
>     advantage of Storm.
>
>     Processing latency is extremely important for this algorithm, and I
>     did some evaluation of the performance.
>
>     First, I implemented the algorithm in Java (one thread, with no
>     parallelism) and measured the performance: it could process 3 million
>     items per second.
>
>     Second, I wrapped this implementation of the algorithm in Storm (just
>     one spout to do the processing) and measured the performance: it
>     could process only 0.75 million items per second. I changed my
>     implementation a little to fit Storm's structure, but in the end the
>     performance is still not good...
>
>     P.S. I didn't take network overhead into consideration because I
>     just run the program on the single spout node, so there is no
>     emit or transfer (so I don't care how Storm emits messages between
>     nodes for now). The program in the spout is actually doing the same
>     thing as the former one (I just copied the program into the
>     nextTuple() method with some necessary changes).
>
>     1. Is the degradation (to 1/4 of the speed) inevitable?
>     2. What caused the degradation?
>     3. How can I reduce it?
>
>     Thank you all.
>
>     ------------------------------------------------------------------------
>     yang...@bupt.edu.cn
>
