Actually that figure is from a Nathan Marz tweet, but he also cites the
million mark here: http://nathanmarz.com/blog/storms-1st-birthday.html

When I saw this type of throughput it was with a canned example that I
created solely for testing throughput. It was also run on pretty beefy
hardware, so YMMV.
On May 13, 2015 9:24 AM, "Jeffery Maass" <maas...@gmail.com> wrote:

> Nathan:
>
> Where can I find this?
> "See for example published single machine benchmarks"
>
> Thank you for your time!
>
> +++++++++++++++++++++
> Jeff Maass <maas...@gmail.com>
> linkedin.com/in/jeffmaass
> stackoverflow.com/users/373418/maassql
> +++++++++++++++++++++
>
>
> On Tue, May 12, 2015 at 7:57 AM, Nathan Leung <ncle...@gmail.com> wrote:
>
>> I'm not very surprised. See, for example, published single-machine
>> benchmarks (IIRC 1.6 million tuples/s is the official figure from Nathan
>> Marz, though that figure is a little old). This is spout-to-bolt and
>> matches my observations for trivial cases. With some processing logic and
>> only one spout I can see how it's lower.
>>
>> You can reduce the overhead by batching your work differently, e.g. by
>> doing more work in each call to nextTuple().
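>> Something like the following (a hypothetical, simplified sketch; in a
>> real spout you would emit via the SpoutOutputCollector, which is stood
>> in for here by a plain list, and the batch size is an arbitrary choice):

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.List;
import java.util.Queue;

// Sketch of a spout-like nextTuple() that drains up to BATCH_SIZE items
// per invocation instead of one, amortizing the per-call framework overhead.
public class BatchingSpoutSketch {
    static final int BATCH_SIZE = 100;
    final Queue<String> source = new ArrayDeque<>();
    final List<String> emitted = new ArrayList<>(); // stand-in for collector.emit()

    void nextTuple() {
        for (int i = 0; i < BATCH_SIZE; i++) {
            String item = source.poll();
            if (item == null) return;  // nothing left; Storm would apply its wait strategy
            emitted.add(item);         // real spout: collector.emit(new Values(item));
        }
    }

    public static void main(String[] args) {
        BatchingSpoutSketch spout = new BatchingSpoutSketch();
        for (int i = 0; i < 250; i++) spout.source.add("item-" + i);
        int calls = 0;
        while (!spout.source.isEmpty()) { spout.nextTuple(); calls++; }
        // 250 items drained in 3 calls rather than 250 one-at-a-time calls
        System.out.println(spout.emitted.size() + " items in " + calls + " calls");
    }
}
```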
>> On May 12, 2015 4:56 AM, "Matthias J. Sax" <mj...@informatik.hu-berlin.de>
>> wrote:
>>
>>> Can you share your code?
>>>
>>> Do you process a single tuple each time nextTuple() is called? If a
>>> spout does not emit anything, Storm applies a waiting-penalty to avoid
>>> busy waiting. That might slow down your code.
>>>
>>> You can configure the waiting strategy:
>>> https://storm.apache.org/2012/09/06/storm081-released.html
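>>> A minimal sketch of what that configuration might look like (the key
>>> names come from the 0.8.1 release notes linked above; verify them
>>> against your Storm version):

```yaml
# Spout wait strategy (pluggable since Storm 0.8.1). The default
# SleepSpoutWaitStrategy sleeps between empty nextTuple() calls.
topology.spout.wait.strategy: "backtype.storm.spout.SleepSpoutWaitStrategy"
# Sleep duration used by the default strategy; lowering it (or plugging
# in your own strategy) trades CPU usage for latency.
topology.sleep.spout.wait.strategy.time.ms: 1
```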
>>>
>>> -Matthias
>>>
>>>
>>> On 05/12/2015 09:31 AM, Daniel Compton wrote:
>>> > I'm also interested in the answers to this question, but to add to the
>>> > discussion, take a look at
>>> > http://aadrake.com/command-line-tools-can-be-235x-faster-than-your-hadoop-cluster.html
>>> > I suspect Storm is still introducing coordination overhead even when
>>> > running on a single machine.
>>> > On Tue, 12 May 2015 at 1:39 pm yang...@bupt.edu.cn
>>> > <yang...@bupt.edu.cn> wrote:
>>> >
>>> >     Hi, and thanks.
>>> >
>>> >     I'm working on a parallel algorithm for counting massive numbers
>>> >     of items in data streams. Previous research on parallelizing this
>>> >     algorithm focused on multi-core CPUs; however, I want to take
>>> >     advantage of Storm.
>>> >
>>> >     Processing latency is extremely important for this algorithm, and
>>> >     I did some evaluation of its performance.
>>> >
>>> >     Firstly, I implemented the algorithm in Java (one thread, with no
>>> >     parallelism) and measured its performance: it could process 3
>>> >     million items per second.
>>> >
>>> >     Secondly, I wrapped this implementation of the algorithm into
>>> >     Storm (just one spout to process) and measured its performance: it
>>> >     could process only 0.75 million items per second. I changed my
>>> >     implementation a little to fit Storm's structure, but in the end
>>> >     the performance is still not good....
>>> >
>>> >     P.S. I didn't take network overhead into consideration, because I
>>> >     just run the program on a single spout node so that there is no
>>> >     emit or transfer (so I don't care how Storm emits messages between
>>> >     nodes for now). The program in the spout is actually doing the
>>> >     same thing as the standalone one (I just copied the program into
>>> >     the nextTuple() method with some necessary changes).
>>> >
>>> >     1. Is the degradation (to 1/4 of the speed) inevitable?
>>> >     2. What caused the degradation?
>>> >     3. How can I reduce the degradation?
>>> >
>>> >     Thank you all.
>>> >
>>> >
>>>  ------------------------------------------------------------------------
>>> >     yang...@bupt.edu.cn
>>> >
>>>
>>>
>
