To parallelize some code, I considered having this topology. The single [Spout] or [Bolt] represent multiple Spouts or Bolts.
*[Spout]--emit--->[Bolt A]--emit--->[Bolt B]* If any of the bolts in Bolt A emit a Tuple of value 1, and it gets processed by a certain bolt in Bolt B, then it is imperative that if any of the bolts in Bolt A again emits the value 1, it should compulsorily be processed by the same bolt in Bolt B. I assume fields grouping can handle this. To have many spouts work in parallel, my initial thoughts were to have: * Integer numberOfSpouts = 10; String partialSpoutName = "mongoSpout"; String partialBoltName = "mongoBolt"; for(Integer i = 0; i < numberOfSpouts; ++i) { String spoutName = partialSpoutName + i.toString(); String boltName = partialBoltName + i.toString(); builder.setSpout(spoutName, new MongoSpout()); builder.setBolt(boltName, new GenBolt()).shuffleGrouping(spoutName); }* But I realized it was probably not the right way, because in case I wanted all of Bolt A's tuples to go to a single Bolt B, then I'd have to include cases like this: * switch (numberOfSpouts) { case 3: builder.setBolt("sqlWriterBolt", new SqlWriterBolt(appConfig),3) .shuffleGrouping(partialBoltName+"2") .shuffleGrouping(partialBoltName+"1") .shuffleGrouping(partialBoltName+"0"); break; case 2: builder.setBolt("sqlWriterBolt", new SqlWriterBolt(appConfig),2) .shuffleGrouping(partialBoltName+"1") .shuffleGrouping(partialBoltName+"0"); break; case 1: builder.setBolt("sqlWriterBolt", new SqlWriterBolt(appConfig),1) .shuffleGrouping(partialBoltName+"0"); break; default: System.out.println("No case here"); }//switch* *So three questions:* *1. *The way I'm using the for loop above is wrong isn't it? If I use a single builder.setSpout and set the numTasks value, then Storm would create those many Spout instances automatically? *2.* When you specify something like this for fields grouping: public void declareOutputFields(OutputFieldsDeclarer declarer) { declarer.declare(new Fields("A","B","C","D")); } What does it mean? Does it mean that 4 types of tuples might be emitted? When they are received as builder.setBolt("*sqlWriterBolt*", new Bolt_AB(), 3).fieldsGrouping("*mongoBolt*", new Fields("A", "B")); Does it mean that my the first 2 tuples will go to the Bolt_AB and the next 2 tuples may go to any other bolt that receives tuples from mongoBolt, and then the next two tuples will go to Bolt_AB again? Is that how it works? *3. *If I don't specify any new Fields("A", "B"), how does Storm know the grouping? How does it decide? If I have a tuple like this: class SomeTuple extends IRichBolt { private Integer someID; public Integer getSomeID() {return someID;} } and if Bolt A sends SomeTuple to Bolt B, with SomeID value (assume a SomeID value is 9) and the next time Bolt A generates a Tuple with someID = 9 (and it may generate many tuples with someID=9), how do I ensure that Storm sees the SomeID value and decides to send it to the Bolt B instance that processes all someID's which have a value of 9? -- Regards, Navin