If I'm reading this correctly, you're not getting the result you want:
all tuples with a given key processed by the same bolt2 instance.

If you want all messages for a given key to be processed by the same
bolt2 instance, you need fields grouping from bolt1 to bolt2. With fields
grouping on the spout-bolt1 hop and shuffle/local on the bolt1-bolt2 hop,
you ensure that a given key always reaches the same bolt1 instance, but is
there any guarantee that the bolt2 you want is the nearest/only local bolt
available to that bolt1 instance?
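
To make the difference concrete, here is a rough, Storm-free sketch of the two routing rules. It is only a model: the class and method names are made up for illustration, the hash-modulo rule stands in for whatever fieldsGrouping does internally, and the seeded random pick stands in for however localOrShuffleGrouping chooses among the bolt2 tasks in the same worker. The point is just that the first rule depends on the key and the second does not.

```java
import java.util.HashSet;
import java.util.Random;
import java.util.Set;

// Hypothetical model of tuple routing; not Storm code and not Storm's implementation.
class GroupingSketch {

    // Fields-style routing: the target task is a function of the key only.
    static int fieldsTask(String key, int numTasks) {
        return Math.floorMod(key.hashCode(), numTasks);
    }

    // Distinct bolt2 tasks that see `key` over `emits` tuples under fields-style routing.
    static Set<Integer> fieldsTargets(String key, int emits, int numTasks) {
        Set<Integer> seen = new HashSet<>();
        for (int i = 0; i < emits; i++) {
            seen.add(fieldsTask(key, numTasks));
        }
        return seen;
    }

    // Local-style routing: the key is ignored; each tuple goes to one of the
    // bolt2 tasks that happen to live in the same worker as the emitting bolt1.
    // A seeded random pick is an assumption standing in for the real selection.
    static Set<Integer> localTargets(int[] localTasks, int emits, long seed) {
        Random rng = new Random(seed);
        Set<Integer> seen = new HashSet<>();
        for (int i = 0; i < emits; i++) {
            seen.add(localTasks[rng.nextInt(localTasks.length)]);
        }
        return seen;
    }

    public static void main(String[] args) {
        // Same key, 1000 tuples, 50 bolt2 tasks.
        System.out.println("fields:          " + fieldsTargets("user-42", 1000, 50));
        // One bolt1 instance with a single local bolt2 task vs. three local bolt2 tasks.
        System.out.println("local (1 task):  " + localTargets(new int[] {7}, 1000, 1L));
        System.out.println("local (3 tasks): " + localTargets(new int[] {7, 8, 9}, 1000, 1L));
    }
}
```

In this model a key sticks to one bolt2 task under local routing only when the emitting bolt1 instance has exactly one local bolt2 task to choose from; as soon as there are several, the same key spreads across them.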

Regards,
Javier
On Oct 5, 2015 7:33 AM, "John Yost" <soozandjohny...@gmail.com> wrote:

> Hi Everyone,
>
> I am currently prototyping FieldsGrouping at the KafkaSpout vs Bolt level.
> I am curious as to whether anyone else has tried this and, if so, how well
> this worked.
>
> The reason I am attempting to do FieldsGrouping in the KafkaSpout is that
> I moved from fieldsGrouping to localOrShuffleGrouping between Bolt 1 and
> Bolt 2 in my topology due to a 4 to 1 fan in from Bolt 1 to Bolt 2 (for
> example, 200 Bolt 1 executors and 50 Bolt 2 executors) which was
> dramatically slowing throughput. It is still highly preferable to do
> fieldsGrouping one way or another so that I am getting all values for a
> given key to the same Bolt 2 executor, which is the impetus for attempting
> to do fieldsGrouping in the KafkaSpout.
>
> If anyone has any thoughts on this approach, I'd very much like to get
> your thoughts.
>
> Thanks
>
> --John
>
