Hi,
I am processing CDC messages using Storm. My topology has two bolts: the
first writes data to S3 and the second writes to a database.
I am using anchored tuples, and I am now facing the issue of handling
duplicate writes.
When a message is processed successfully in the S3 bolt and a failure then
happens in the DB bolt, the tuple is replayed. During the replay the S3 bolt
is invoked again and the same data is written to S3 a second time.
Is there any way I can have the tuple replayed only for the failed bolt?
Topology
-----------
topologyBuilder.setSpout("mdpSpout", new SQSMessageReaderSpout(queueUrl), SPOUT_PARALLELISM);
topologyBuilder.setBolt("mdpS3Bolt", new S3WriteBolt(), BOLT_PARALLELISM)
        .shuffleGrouping("mdpSpout");
topologyBuilder.setBolt("dbBolt", new DbBolt(), BOLT_PARALLELISM)
        .shuffleGrouping("mdpS3Bolt");
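
To make the replay behaviour concrete, here is a minimal stand-alone sketch of what I am seeing. It does not use any Storm classes; `ReplaySketch`, `s3Bolt`, `dbBolt`, `process` and `emitWithReplay` are hypothetical names that only model the spout -> S3 bolt -> DB bolt chain, and it assumes the DB write fails exactly once.

```java
import java.util.ArrayList;
import java.util.List;

/** Stand-alone simulation of the replay problem; all names are hypothetical. */
public class ReplaySketch {
    final List<String> s3Writes = new ArrayList<>();
    final List<String> dbWrites = new ArrayList<>();

    // Assumption for the sketch: the DB write fails on the first attempt only.
    private boolean dbShouldFail = true;

    // Stand-in for the S3 bolt: the write always succeeds.
    void s3Bolt(String msg) {
        s3Writes.add(msg);
    }

    // Stand-in for the DB bolt: returns false to model a failed tuple.
    boolean dbBolt(String msg) {
        if (dbShouldFail) {
            dbShouldFail = false;
            return false;
        }
        dbWrites.add(msg);
        return true;
    }

    // One pass through the whole tuple tree, starting at the spout.
    boolean process(String msg) {
        s3Bolt(msg);
        return dbBolt(msg);
    }

    // Models the spout re-emitting the message until the tuple tree succeeds.
    void emitWithReplay(String msg) {
        while (!process(msg)) {
            // failed tuple: the message goes through every bolt again
        }
    }

    public static void main(String[] args) {
        ReplaySketch topology = new ReplaySketch();
        topology.emitWithReplay("cdc-1");
        System.out.println("S3 writes: " + topology.s3Writes.size()); // 2
        System.out.println("DB writes: " + topology.dbWrites.size()); // 1
    }
}
```

The single DB failure leads to two S3 writes for one message, which is the duplicate I would like to avoid.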
Regards
Pradeep S