If I'm reading this correctly, I don't think you're getting the result you want: having all tuples with a given key processed by the same bolt2 instance.
If you want all messages for a given key to be processed by the same bolt2 instance, you need fields grouping from bolt1 to bolt2. With fields grouping on the spout-bolt1 hop and shuffle/local grouping on the bolt1-bolt2 hop, you ensure that each key always lands on the same bolt1 instance, but is there any guarantee that the bolt2 you want is the nearest/only local bolt available to any given instance of bolt1?

Regards,
Javier

On Oct 5, 2015 7:33 AM, "John Yost" <soozandjohny...@gmail.com> wrote:

> Hi Everyone,
>
> I am currently prototyping FieldsGrouping at the KafkaSpout vs Bolt level.
> I am curious as to whether anyone else has tried this and, if so, how well
> this worked.
>
> The reason I am attempting to do FieldsGrouping in the KafkaSpout is that
> I moved from fieldsGrouping to localOrShuffleGrouping between Bolt 1 and
> Bolt 2 in my topology due to a 4 to 1 fan in from Bolt 1 to Bolt 2 (for
> example, 200 Bolt 1 executors and 50 Bolt 2 executors) which was
> dramatically slowing throughput. It is still highly preferable to do
> fieldsGrouping one way or another so that I am getting all values for a
> given key to the same Bolt 2 executor, which is the impetus for attempting
> to do fieldsGrouping in the KafkaSpout.
>
> If anyone has any thoughts on this approach, I'd very much like to get
> your thoughts.
>
> Thanks
>
> --John
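
P.S. To make the routing argument concrete, here is a rough simulation in plain Java. It is not Storm code and not how Storm implements groupings internally. It assumes fields grouping behaves like "hash of the key modulo the number of target tasks" and that local-or-shuffle picks at random among the bolt2 tasks living in the same worker as the sending bolt1 task. The 200/50 task counts come from John's message; the worker count, the round-robin placement of tasks on workers, the key "user-42", and all class and method names are made up for illustration.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Random;
import java.util.Set;

public class GroupingSketch {
    static final int BOLT1_TASKS = 200; // from the thread
    static final int BOLT2_TASKS = 50;  // from the thread
    static final int WORKERS = 10;      // assumption: not stated in the thread

    // Simplified fields grouping: the target task depends only on the key.
    static int fieldsTask(String key, int numTasks) {
        return Math.floorMod(key.hashCode(), numTasks);
    }

    // Assumed placement: task t of either bolt runs in worker (t % WORKERS).
    static List<Integer> localBolt2Tasks(int bolt1Task) {
        int worker = bolt1Task % WORKERS;
        List<Integer> local = new ArrayList<>();
        for (int t = 0; t < BOLT2_TASKS; t++) {
            if (t % WORKERS == worker) {
                local.add(t);
            }
        }
        return local;
    }

    // Simplified local-or-shuffle: pick any bolt2 task in the sender's worker.
    static int localOrShuffleTask(int bolt1Task, Random rng) {
        List<Integer> local = localBolt2Tasks(bolt1Task);
        return local.get(rng.nextInt(local.size()));
    }

    // Sends `tuples` tuples with the same key and returns the bolt2 tasks that saw them.
    static Set<Integer> bolt2TasksSeen(String key, int tuples,
                                       boolean fieldsOnSecondHop, Random rng) {
        // Spout -> bolt1 uses fields grouping, so the key sticks to one bolt1 task.
        int bolt1Task = fieldsTask(key, BOLT1_TASKS);
        Set<Integer> seen = new HashSet<>();
        for (int i = 0; i < tuples; i++) {
            seen.add(fieldsOnSecondHop
                    ? fieldsTask(key, BOLT2_TASKS)
                    : localOrShuffleTask(bolt1Task, rng));
        }
        return seen;
    }

    public static void main(String[] args) {
        Random rng = new Random(42);
        System.out.println("local-or-shuffle on hop 2: "
                + bolt2TasksSeen("user-42", 1000, false, rng).size() + " bolt2 task(s)");
        System.out.println("fields grouping on hop 2:  "
                + bolt2TasksSeen("user-42", 1000, true, rng).size() + " bolt2 task(s)");
    }
}
```

Under these assumptions a single key fans out to every bolt2 task that shares a worker with its bolt1 task, and stays on one bolt2 task only when the second hop itself is keyed. In a real topology that would mean declaring bolt2 with something like fieldsGrouping("bolt1", new Fields("key")), where the component id and field name are placeholders for whatever your topology uses.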