No, as per the code, only individual messages are replayed.

On Sep 13, 2016 6:09 PM, "fanxi...@travelsky.com" <fanxi...@travelsky.com> wrote:
> Hi:
>
> I'd like to clarify something about how the Kafka spout handles acking.
>
> For example, suppose the kafka-spout fetches offsets 5000-6000 from the
> Kafka server, but one tuple, whose offset is 5101, is failed by a bolt.
> Will the whole batch of 5000-6000 remain in the kafka-spout until the 5101
> tuple is acked? If the 5101 tuple cannot be acked for a long time, the
> batch 5000-6000 will remain for a long time, and the kafka-spout will stop
> fetching data from Kafka during that time.
>
> Am I right?
>
> ------------------------------
> Josh
>
> *From:* Tech Id <tech.login....@gmail.com>
> *Date:* 2016-09-14 06:26
> *To:* user <user@storm.apache.org>
> *Subject:* Re: How will storm replay the tuple tree?
>
> I agree with this statement about code/architecture, but in the case of
> some system outages, such as one of the end-points (Solr, Couchbase,
> Elastic-Search etc.) being down temporarily, a very large number of
> otherwise fully-functional and healthy systems will receive a large number
> of duplicate replays (especially in high-throughput topologies).
>
> If you can elaborate a little more on the performance cost of tracking
> tuples, or point to a document describing it, that would be a great help.
>
> Best,
> T.I.
>
> On Tue, Sep 13, 2016 at 12:26 PM, Hart, James W. <jwh...@seic.com> wrote:
>
>> Failures should be very infrequent; if they are not, rethink the code
>> and architecture. The performance cost of tracking tuples in the way that
>> would be required to replay at the point of failure is large; basically,
>> that method would slow everything down for the sake of very infrequent
>> failures.
>>
>> *From:* S G [mailto:sg.online.em...@gmail.com]
>> *Sent:* Tuesday, September 13, 2016 3:17 PM
>> *To:* user@storm.apache.org
>> *Subject:* Re: How will storm replay the tuple tree?
>>
>> Hi,
>>
>> I am a little curious to know why we begin at the spout level for case 1.
>> If we replay at the failing bolt's parent level (BoltA in this case),
>> then it should be more performant due to a decrease in duplicate
>> processing (as compared to replaying the whole tuple tree).
>>
>> If BoltA crashes for some reason while replaying, only then should the
>> spout receive this as a failure, and the whole tuple tree be replayed.
>>
>> This saving in duplicate processing would be even more visible with
>> several layers of bolts.
>>
>> I am sure there is a good reason to replay the whole tuple tree, and I
>> would like to know what it is.
>>
>> Thanks,
>> SG
>>
>> On Tue, Sep 13, 2016 at 10:22 AM, P. Taylor Goetz <ptgo...@gmail.com>
>> wrote:
>>
>> Hi Cheney,
>>
>> Replays happen at the spout level. So if there is a failure at any point
>> in the tuple tree (the tuple tree being the anchored emits; unanchored
>> emits don't count), the original spout tuple will be replayed. The
>> replayed tuple will traverse the topology again, including unanchored
>> points.
>>
>> If an unanchored tuple fails downstream, it will not trigger a replay.
>>
>> Hope this helps.
>>
>> -Taylor
>>
>> On Sep 13, 2016, at 4:42 AM, Cheney Chen <tbcql1...@gmail.com> wrote:
>>
>> Hi there,
>>
>> We're using Storm 1.0.1, and I'm reading through
>> http://storm.apache.org/releases/1.0.1/Guaranteeing-message-processing.html
>>
>> I have questions about the two scenarios below.
>>
>> Assume topology: S (spout) --> BoltA --> BoltB
>>
>> 1. S: anchored emit, BoltA: anchored emit
>> Suppose BoltB's processing fails (the tuple is failed rather than acked).
>> What will the replay be: will it re-execute both BoltA and BoltB, or only
>> the failed BoltB processing?
>>
>> 2. S: anchored emit, BoltA: unanchored emit
>> Suppose BoltB's processing fails (the tuple is failed rather than acked).
>> Replay will not happen, correct?
>>
>> --
>> Regards,
>> Qili Chen (Cheney)
>>
>> E-mail: tbcql1...@gmail.com
>> MP: (+1) 4086217503
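[Editor's note] Taylor's point, that replay can only happen from the spout, follows from how Storm tracks the tuple tree. Per the Guaranteeing-message-processing docs, the acker keeps a single 64-bit XOR checksum ("ack val") per spout tuple: every anchored emit XORs a random tuple id in, and every ack XORs the same id out, so the tree is known to be complete exactly when the value returns to zero. No per-bolt intermediate state is kept, so there is nothing to resume from mid-tree. The following is a toy sketch of that bookkeeping, not Storm's actual acker code; the class and method names are invented for illustration:

```java
import java.util.Random;

// Toy model of Storm's acker bookkeeping (the XOR trick). One 64-bit
// value per spout tuple is all the acker stores: anchored emits XOR a
// tuple id in, acks XOR the same id out (XOR is its own inverse). The
// tree is known complete only when the value returns to 0; no
// intermediate bolt state is recorded, which is why a failure can only
// be signaled back to the spout for a full replay from the root.
public class AckValSketch {
    private final Random rng = new Random(42);
    private long ackVal = 0;

    // An anchored emit registers a new edge in the tuple tree.
    public long anchoredEmit() {
        long id = rng.nextLong();
        ackVal ^= id;
        return id;
    }

    // Acking a tuple removes its edge from the checksum.
    public void ack(long id) {
        ackVal ^= id;
    }

    public boolean treeComplete() {
        return ackVal == 0;
    }

    public static void main(String[] args) {
        AckValSketch acker = new AckValSketch();
        long spoutToA = acker.anchoredEmit(); // S -> BoltA
        long aToB = acker.anchoredEmit();     // BoltA -> BoltB (anchored)
        acker.ack(spoutToA);                  // BoltA acks its input
        acker.ack(aToB);                      // BoltB acks its input
        System.out.println(acker.treeComplete()); // prints "true"
    }
}
```

An unanchored emit in this model simply never calls anchoredEmit(), so its id is not part of the checksum and its downstream failure can never be reported against the spout tuple, which matches Taylor's scenario 2 answer above.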
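[Editor's note] The answer to Josh's question at the top of the thread (no, only individual messages are replayed) can be illustrated with a toy pending-offset tracker. This is not the real storm-kafka spout code; the names and structure here are invented. The idea is that the spout tracks each emitted offset individually: fail() re-queues only that one offset for re-emission, acked offsets are simply dropped, and new offsets keep being emitted, so a single stuck tuple does not pin the whole fetched range:

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.HashSet;
import java.util.Set;

// Toy model of per-message replay in a Kafka-style spout (illustrative
// only, not the actual storm-kafka implementation). Each offset is
// tracked individually: fail() re-queues just that offset, while other
// in-flight offsets and new fetches are unaffected.
public class OffsetTrackerSketch {
    private final Set<Long> pending = new HashSet<>();      // emitted, awaiting ack
    private final Deque<Long> retries = new ArrayDeque<>(); // failed, re-emit first
    private long nextOffset;

    public OffsetTrackerSketch(long startOffset) {
        this.nextOffset = startOffset;
    }

    // nextTuple(): failed offsets are replayed before new ones are emitted.
    public long emitNext() {
        long offset = retries.isEmpty() ? nextOffset++ : retries.poll();
        pending.add(offset);
        return offset;
    }

    public void ack(long offset) {
        pending.remove(offset);
    }

    public void fail(long offset) {
        pending.remove(offset);
        retries.add(offset);
    }

    public int inFlight() {
        return pending.size();
    }
}
```

With offsets 5000-5002 emitted and only 5001 failed, the next emit replays 5001 alone; 5000 and 5002, once acked, never come back. What can stall fetching in a real topology is the cap on in-flight tuples (Storm's topology.max.spout.pending setting): if a tuple stays unacked long enough, in-flight tuples accumulate until the spout is throttled, which is the grain of truth in Josh's concern.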