Trying to do "exactly once" in distributed systems is not the easiest or safest path Francisco. Making the downstream workflow idempotent is usually a lot easier and prolly a much robust solution.
On Wed, Jun 8, 2016 at 3:02 PM, Francisco Lopes <[email protected]> wrote: > Hello, > > I'm new to Storm/Trident and I'm using it to read messages from Kafka and > send them to an external API exactly once. My topology is as simple as: > > IPartitionedTridentSpout kafkaSpout = getKafkaSpout(); > TridentTopology topology = new TridentTopology(); > > topology > .newStream("kafka", kafkaSpout) > .each(new Fields("str"), new Processor(), new Fields()); > > I'm not sure how I should implement state to guarantee a message is not > processed twice. > > Can anyone please enlighten me? > > All the examples I found show states as counts or sums and that's not what > I really need. I'm inclined to use a Redis instance to store state ( > https://github.com/kstyrc/trident-redis), but I don't know on what I'd > actually persistentAggregate. > > Thank you. > > Regards, > Francisco >
