It surely did! Thanks for such a precise answer!
Thanks,
Baek
> On Jun 8, 2015, at 12:43 AM, Vineet Mishra wrote:
>
> Any Storm streaming job runs in its own space and doesn't interact with other
> topologies. Your tuple distribution will be across the topology within the
> number of workers on
For your case, if messages have the same field value, they will be sent to
only one executor in the whole topology.
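This routing can be illustrated with a tiny simulation (plain Python, not the Storm API; the task count and field values are invented for the example): fields grouping picks the target task by hashing the grouping field, so every tuple carrying the same value lands on the same task, cluster-wide.

```python
# Toy model of Storm's fields grouping: route each tuple to a task
# by hashing the grouping field modulo the number of bolt tasks.
# (Illustrative only -- Storm's real partitioning function differs.)

NUM_TASKS = 8  # hypothetical bolt parallelism

def fields_grouping_task(field_value, num_tasks=NUM_TASKS):
    """Deterministically map a field value to one task index."""
    return hash(field_value) % num_tasks

# Every tuple with the same field value goes to the same task,
# no matter which worker or spout emitted it.
routes = {fields_grouping_task("user-42") for _ in range(1000)}
print(routes)  # a single task index, e.g. {3}
```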
Best regards,
Dmytro Dragan
On Jun 8, 2015 08:31, "Seungtack Baek" wrote:
> Thanks a lot for such a timely response.
>
> So, even if each bolt tasks resides in different worker (differ
Any Storm streaming job runs in its own space and doesn't interact with
other topologies. Your tuple distribution will be across the topology, within
the number of workers and the number of bolts defined. So, for instance, if
you have shuffle grouping enabled and specific data of your interest
For unique tuple access across the bolts, use shuffle grouping
(otherwise, for some specific use cases, refer to the links in my last mail). It will
distribute the data uniformly across all the bolts without heavily loading
any one bolt. It basically works on the hashing principle, assigning the
tuple
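The uniform spread described above can be sketched with a short simulation (plain Python, not Storm code; the task and tuple counts are made up). Shuffle grouping hands each tuple to a task independently of its content, so every task ends up with roughly the same load:

```python
import random
from collections import Counter

NUM_TASKS = 4        # hypothetical bolt parallelism
NUM_TUPLES = 100_000

# Toy model of shuffle grouping: each tuple goes to a randomly
# chosen task, independent of the tuple's contents.
counts = Counter(random.randrange(NUM_TASKS) for _ in range(NUM_TUPLES))

for task, n in sorted(counts.items()):
    print(f"task {task}: {n} tuples")  # each close to NUM_TUPLES / NUM_TASKS
```

Contrast this with the fields-grouping case, where the task is a function of the tuple's field value rather than a random draw.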
@Vineet,
Thanks a lot for "another" timely response!
Actually, I have read that section, but it still wasn't clear (to me, and I
guess to me only) whether fields grouping applies to the whole cluster
(or topology) or only within the same worker. Maybe I am just not too
familiar with the "zoo" yet.
Thanks
Hi Seung,
You can refer to the Stream Groupings section in the link below:
https://storm.apache.org/documentation/Concepts.html
It will give you a better understanding of tuple distribution in Storm.
For a clearer understanding, here is the pictorial representation of the sa
Thanks a lot for such a timely response.
So, even if the bolt tasks reside in different workers (different servers
in our use case), the messages go to all 32 tasks, right?
Also, this leads me to another question. (I think the answer is yes.)
Given that fields grouping guarantees that messages with s
Hi, Seungtack!
The distribution of messages depends only on the grouping. In the case of
"shuffle grouping", tuples are randomly distributed across the bolt's
tasks in a way such that each task is guaranteed to get an equal number of
tuples.
Best regards,
Dmytro Dragan
On Jun 8, 2015 07:12, "Seu
Hi,
I have read in the documentation that if you have more spout tasks than
Kafka partitions, the excess tasks will remain idle for the entire lifecycle
of the topology.
Now, let's consider 4 spout tasks, 32 bolt tasks (of one class) in 4
workers (on 4 nodes), and 2 partitions in Kafka. Then 2 tas
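The idle-task arithmetic in the scenario above can be sketched as follows (plain Python, using the hypothetical numbers from the question): each Kafka partition is consumed by exactly one spout task, so any tasks beyond the partition count receive nothing.

```python
# Toy model: assign Kafka partitions to spout tasks one-to-one.
# With fewer partitions than tasks, the leftover tasks stay idle.
SPOUT_TASKS = 4
KAFKA_PARTITIONS = 2

assignment = {task: (task if task < KAFKA_PARTITIONS else None)
              for task in range(SPOUT_TASKS)}

active = [t for t, p in assignment.items() if p is not None]
idle = [t for t, p in assignment.items() if p is None]
print(f"active spout tasks: {active}, idle: {idle}")
# active spout tasks: [0, 1], idle: [2, 3]
```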
You should emit with a message ID; together with max spout pending, this
prevents too many messages from being in flight simultaneously, which will
alleviate your out-of-memory condition.
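The effect of emitting with a message ID can be sketched like this (plain Python, not the Storm spout API; `MAX_PENDING` stands in for Storm's `topology.max.spout.pending` setting): once tuples are anchored with IDs, the framework can cap how many are un-acked at once, which keeps memory bounded.

```python
MAX_PENDING = 3  # stands in for topology.max.spout.pending

class ThrottledSpout:
    """Toy spout: refuses to emit while too many tuples await acks."""

    def __init__(self):
        self.pending = set()
        self.next_id = 0

    def next_tuple(self):
        if len(self.pending) >= MAX_PENDING:
            return None           # back off: too many in flight
        msg_id = self.next_id
        self.next_id += 1
        self.pending.add(msg_id)  # emit anchored with msg_id
        return msg_id

    def ack(self, msg_id):
        self.pending.discard(msg_id)

spout = ThrottledSpout()
print([spout.next_tuple() for _ in range(5)])  # [0, 1, 2, None, None]
spout.ack(0)
print(spout.next_tuple())  # 3 -- an ack frees a slot
```

Without message IDs there are no acks, so the spout has no signal to stop emitting, and an unbounded backlog builds up in memory.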
On Jun 7, 2015 5:05 AM, "Michail Toutoudakis" wrote:
> What is the best spout implementation for reading input data from file? I
> have
Somewhere in your code you are starting way too many threads (thousands of
them). I don't see that in the code you posted, so it must be in one of the
classes you haven't posted.
Are you using multithreading anywhere? Are you instantiating services that
spawn threads (like network clients)? If
I am trying to read some data from a text file and process it. I am currently
using a Scanner. In the beginning, everything works fine for the first 1
values, and then it looks like no more input lines are sent to the bolt that
implements the algorithm. Finally, after a few minutes of running, I get
What is the best spout implementation for reading input data from a file? I have
implemented a spout that reads input data from a file using a Scanner, which
seems to perform better than a buffered file reader.
However, I still lose some values, not many this time, about 1%, but the problem
is that af