The spout emits batches of 100 ids to process. Some steps are faster when executed in batches, such as fetching data from the database, which is done with an aggregator that emits the same rows with additional values.
We need Trident because we need joins, merges, aggregators, etc., but the batches are independent of each other. As my colleague said, with a max spout pending > 1 it is acceptable in our context for the second batch to finish before the first one. Currently, though, the second batch waits until the first batch is completed, which slows down our processing.

Is it possible to keep Trident and its features while allowing unordered batch processing? Is the problem the kind of spout, or is it because we use a StateUpdater at the end? We tried removing the StateUpdater and using an aggregator instead, but it did not help.

Is that clearer?

________________________________
From: Pascal Arnal <[email protected]>
Sent: November 21, 2014 12:15
To: [email protected]
Subject: RE: Trident topology

This post from a colleague is about the same thing:
https://mail-archives.apache.org/mod_mbox/storm-user/201401.mbox/%3c2730f9f8f8a44d16858c346886978...@by2pr08mb144.namprd08.prod.outlook.com%3E

________________________________
From: Brunner, Bill <[email protected]>
Sent: November 21, 2014 12:04
To: [email protected]
Subject: RE: Trident topology

Still not very clear.

From: Pascal Arnal [mailto:[email protected]]
Sent: Friday, November 21, 2014 9:33 AM
To: [email protected]
Subject: RE: Trident topology

Any help?

________________________________
From: Pascal Arnal <[email protected]>
Sent: November 20, 2014 14:01
To: [email protected]
Subject: RE: Trident topology

If I run a topology with a max spout pending of 3, the StateUpdater actually executes batch 1, then batch 2, then batch 3. A new batch 4 is generated after the commit of batch 1, batch 5 after batch 2, and so on. If batch 2 finishes its execution before batch 1, it has to wait until batch 1 is committed. I don't want it to wait: I want the sequence in the StateUpdater to be batch 2, then batch 1, then batch 3, and so on, with a new batch 4 after batch 2, batch 5 after batch 1, and so on.

Is that clearer, and is it possible?
Thanks

________________________________
From: P. Taylor Goetz <[email protected]>
Sent: November 20, 2014 12:53
To: [email protected]
Subject: Re: Trident topology

Hi Pascal,

I'm not sure I understand what you are asking. Could you elaborate?

-Taylor

On Nov 20, 2014, at 10:52 AM, Pascal Arnal <[email protected]> wrote:

Nobody has a response? Should I create an issue / feature request in Jira?

________________________________
From: Pascal Arnal <[email protected]>
Sent: November 19, 2014 10:58
To: [email protected]
Subject: Trident topology

Hi,

I am trying to build a topology with Trident that uses some functions, filters and aggregators. I don't care about transactions, and I would like my batches to be unordered. I use IBatchSpout for the spout and BaseStateUpdater for the updater, with TridentState.

Is it possible to build a topology that meets these requirements? Maybe with another state updater, or simply by using an aggregator?

Thanks
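
For readers of the thread, the behaviour being asked about can be illustrated with a small toy model. This is only a sketch under assumptions: it uses no Storm classes, and the class and method names (CommitOrderSketch, orderedCommits, unorderedCommits) are hypothetical. It assumes batch ids start at 1 and compares the commit order described above (a finished batch waits until all earlier batches are committed) with the unordered commit order that is being requested.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.TreeSet;

// Toy model only: it does not use or reproduce any Storm/Trident code.
final class CommitOrderSketch {

    // Ordered commits, as described in the thread: a finished batch is held
    // back until every batch with a smaller id has been committed.
    static List<Integer> orderedCommits(List<Integer> finishOrder) {
        List<Integer> committed = new ArrayList<>();
        TreeSet<Integer> waiting = new TreeSet<>();
        int next = 1; // id of the next batch allowed to commit (assumed to start at 1)
        for (int batchId : finishOrder) {
            waiting.add(batchId);
            // Commit every waiting batch whose turn has come.
            while (!waiting.isEmpty() && waiting.first() == next) {
                committed.add(waiting.pollFirst());
                next++;
            }
        }
        return committed;
    }

    // Unordered commits, as requested in the thread: each batch commits
    // as soon as it finishes processing.
    static List<Integer> unorderedCommits(List<Integer> finishOrder) {
        return new ArrayList<>(finishOrder);
    }

    public static void main(String[] args) {
        // Batch 2 finishes processing before batch 1.
        List<Integer> finishOrder = Arrays.asList(2, 1, 3);
        System.out.println("ordered:   " + orderedCommits(finishOrder));
        System.out.println("unordered: " + unorderedCommits(finishOrder));
    }
}
```

With the finish order 2, 1, 3, the ordered model commits 1, 2, 3 (batch 2 waits for batch 1), while the unordered model commits 2, 1, 3.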
