Hi everyone, I am trying to get fieldsGrouping to scale, with no success. With localOrShuffleGrouping I get roughly 10x the throughput that I get with fieldsGrouping. In addition, with fieldsGrouping the topology eventually slows down dramatically and tuples start to fail. Finally, with fieldsGrouping, ~99% of the time in the Bolt that has proven to be the bottleneck is spent in lmax.disruptor.ProcessingSequenceBuffer, whereas with localOrShuffleGrouping that figure is 55-60%.
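To make the difference between the two groupings concrete, here is a minimal sketch of how I understand the routing. It is a simplified model, not Storm's actual code: it assumes fieldsGrouping picks a task by hashing the grouping field modulo the number of target tasks, and it models shuffle routing as plain round-robin. The class and method names (GroupingSketch, fieldsTask, fieldsCounts, shuffleCounts) and the "hot" key are made up for illustration. If a few keys dominate the stream, the fields-style routing sends most tuples to one task, which might be related to what I am seeing.

```java
import java.util.Arrays;

// Simplified model of tuple routing. NOT Storm's implementation; all names are hypothetical.
class GroupingSketch {
    // Fields-style routing (assumed): the same key always maps to the same task index.
    static int fieldsTask(String key, int numTasks) {
        return Math.floorMod(key.hashCode(), numTasks);
    }

    // Count how many tuples each task receives under fields-style routing.
    static int[] fieldsCounts(String[] keys, int numTasks) {
        int[] counts = new int[numTasks];
        for (String key : keys) {
            counts[fieldsTask(key, numTasks)]++;
        }
        return counts;
    }

    // Shuffle-style routing, modeled here as round-robin regardless of key.
    static int[] shuffleCounts(String[] keys, int numTasks) {
        int[] counts = new int[numTasks];
        for (int i = 0; i < keys.length; i++) {
            counts[i % numTasks]++;
        }
        return counts;
    }

    public static void main(String[] args) {
        // Skewed input: 90% of tuples carry the same (hypothetical) key "hot".
        String[] keys = new String[1000];
        for (int i = 0; i < keys.length; i++) {
            keys[i] = (i % 10 == 0) ? "key-" + i : "hot";
        }
        System.out.println("fields:  " + Arrays.toString(fieldsCounts(keys, 4)));
        System.out.println("shuffle: " + Arrays.toString(shuffleCounts(keys, 4)));
    }
}
```

With a skewed key distribution like the one in main, the fields-style counts pile up on a single task while the shuffle-style counts stay even. I have not yet checked whether my real key distribution is skewed like this.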
I have tried changing the number of netty server threads, the number of netty client threads, and the send and receive buffer sizes, but none of these changes has helped. Does anyone have any thoughts on this?

Thanks
--John
