True, but that's what I wanted to confirm by mentioning spouts S1 and S2.
Will S1 and S2 each use their own hash-mod-n function, or is it a common
function decided by Storm? (If anyone could offer a pointer to where I
could find this in the Storm source code, I could try finding it myself too.)
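
For illustration, here is a minimal Java sketch of what "a modulo of a hash
of the field" could look like. It is an assumption about the general shape,
not Storm's actual code: the class FieldsGroupingSketch and the method
chooseTask are hypothetical names. What it shows is that a stateless function
of the field values and the task count gives the same answer no matter which
spout calls it, provided every caller uses the same function.

```java
import java.util.Arrays;
import java.util.List;

public class FieldsGroupingSketch {
    // Hypothetical helper, not a Storm API: picks a target task index
    // from the grouping-field values and the number of bolt tasks only.
    // No per-key map is stored anywhere.
    public static int chooseTask(List<Object> groupingValues, int numTasks) {
        int hash = Arrays.deepHashCode(groupingValues.toArray());
        // floorMod keeps the result in [0, numTasks) even for negative hashes.
        return Math.floorMod(hash, numTasks);
    }

    public static void main(String[] args) {
        int numBoltTasks = 10;                       // 10 tasks of bolt B
        List<Object> key = Arrays.<Object>asList("x");

        int fromS1 = chooseTask(key, numBoltTasks);  // as if computed for S1's emit
        int fromS2 = chooseTask(key, numBoltTasks);  // as if computed for S2's emit

        System.out.println("S1 -> task index " + fromS1); // S1 -> task index 1
        System.out.println("S2 -> task index " + fromS2); // S2 -> task index 1
    }
}
```

If S1 and S2 each used a different hash function, the two indices could
differ, which is exactly the case I am asking about.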

On Thu, Aug 11, 2016 at 2:36 PM, Gireesh Ramji <gireeshra...@yahoo.com>
wrote:

> It does not matter who hashes it; as long as they all use the same hash
> function, it will go to the same bolt.
>
>
> ------------------------------
> *From:* Navin Ipe <navin....@searchlighthealth.com>
> *To:* user@storm.apache.org
> *Sent:* Thursday, August 11, 2016 4:56 PM
> *Subject:* Re: How long until fields grouping gets overwhelmed with data?
>
> If the hash is dynamically computed and is stateless, then that brings up
> one more question.
>
> Let's say there are two spout classes S1 and S2. I create 10 tasks of S1
> and 10 tasks of S2.
> There are 10 tasks of a bolt B.
>
> S1 and S2 are fieldsGrouped with B.
>
> I receive data x in S1 and another data x in S2.
>
> If S1's emit of x goes to task1 of B, then will S2's emit of x also go to
> task1 of B?
>
> *Basically the question is: *Is the hash value decided by the Spout or by
> Storm? Because if it is decided by the spout, then S1's emit of x can go to
> task 1 but S2's emit of x might go to some other task of the bolt, and that
> won't serve the purpose of someone who wants all x'es to go to one bolt.
>
>
>
>
> On Wed, Aug 10, 2016 at 8:58 PM, Navin Ipe <navin.ipe@searchlighthealth.com> wrote:
>
> Oh that's good to know. I assume it works like this:
> https://en.wikipedia.org/wiki/Hash_function#Hashing_uniformly_distributed_data
>
> On Wed, Aug 10, 2016 at 6:23 PM, Nathan Leung <ncle...@gmail.com> wrote:
>
> It's based on a modulo of a hash of the field. The fields grouping is
> stateless.
>
> On Aug 10, 2016 8:18 AM, "Navin Ipe" <navin.ipe@searchlighthealth.com> wrote:
>
> Hi,
>
> For a spout to be able to keep sending a fields-grouped tuple to the
> same bolt, it would have to store a key-value map something like this,
> right?
>
> field1023 ---> Bolt1
> field1343 ---> Bolt3
> field1629 ---> Bolt5
> field1726 ---> Bolt1
> field1481 ---> Bolt3
>
> So if my topology runs for a very long time and the spout generates many
> unique field values, won't this key value map run out of memory eventually?
>
> OR is there a failsafe or a map limit that Storm has to handle this
> without crashing?
>
> If memory problems could happen, what would be an alternative way to solve
> this problem where many unique fields could get generated over time?
>
> --
> Regards,
> Navin
>
>
>
>
> --
> Regards,
> Navin
>
>
>
>
> --
> Regards,
> Navin
>
>
>


-- 
Regards,
Navin
