It seems to me that this can be solved by allowing a user to attach some 
arbitrary data to a call to fail(), which is then passed to the spout.
So there would be an overload of fail in IOutputCollector which takes both the 
Tuple input and some object to give to the spout. The spout's fail method 
would then accept that object as a second argument.

The spout can then decide what to do about the failure based on the content of 
the object.
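
As a rough sketch of what I mean (this is not actual Storm code: ContextAwareSpout, NoRetry and DemoSpout are made-up names for illustration, and today's ISpout.fail takes only the message id), the spout side could look something like this:

```java
import java.util.ArrayList;
import java.util.List;

public class FailContextSketch {

    /** Hypothetical spout-side callback: fail() with an extra context object. */
    interface ContextAwareSpout {
        void fail(Object msgId, Object context);
    }

    /** Hypothetical marker a bolt could attach to say "do not replay this". */
    static final class NoRetry {
        final Exception cause;

        NoRetry(Exception cause) {
            this.cause = cause;
        }
    }

    /** Toy spout that routes failures based on the attached context object. */
    static class DemoSpout implements ContextAwareSpout {
        final List<Object> retryQueue = new ArrayList<>();
        final List<Object> deadLetters = new ArrayList<>();

        @Override
        public void fail(Object msgId, Object context) {
            if (context instanceof NoRetry) {
                // Permanent failure: set aside instead of replaying.
                deadLetters.add(msgId);
            } else {
                // No context (or an unknown one): keep the usual replay behavior.
                retryQueue.add(msgId);
            }
        }
    }

    public static void main(String[] args) {
        DemoSpout spout = new DemoSpout();
        // Transient failure, e.g. a network glitch: no context attached.
        spout.fail("msg-1", null);
        // Bad message, e.g. a deserialization error: ask the spout not to retry.
        spout.fail("msg-2", new NoRetry(new IllegalArgumentException("bad payload")));
        System.out.println("retry=" + spout.retryQueue
                + " deadLetters=" + spout.deadLetters);
    }
}
```

Here the dead-letter list just stands in for whatever store (Kafka, Solr, etc.) the spout would actually write to, and a null context keeps the existing retry behavior so current bolts would not need to change.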

This makes it generic, possibly useful for other things like reporting, etc. I 
only looked at the relevant code briefly, but it looks like it would also be 
relatively simple to implement.

-- Kyle

On Tuesday, September 27, 2016 12:06 PM, Tech Id <tech.login....@gmail.com> wrote:

Any more thoughts on this?
Seems like a useful feature for all the spouts/bolts.

On Wed, Sep 21, 2016 at 9:09 AM, S G <sg.online.em...@gmail.com> wrote:

> Thank you Aaron.
>
> We use Kafka and JMS spouts and several bolts - Elastic-Search, Solr,
> Cassandra, Couchbase and HDFS in different scenarios and need to have the
> dead letter functionality for almost all these scenarios.
> Locally we have this functionality almost ready for writing dead-letters to
> Solr or Kafka.
> I will try to contribute the same to Storm as a PR and we can then look
> into adding the failing tuple as well. I agree adding the failing tuple
> would be somewhat more complicated.
>
>
> On Tue, Sep 20, 2016 at 4:34 PM, Aaron Niskodé-Dossett <doss...@gmail.com>
> wrote:
>
> > I like the idea, especially if it can be implemented as generically as
> > possible. Ideally we could "dead letter" both the original tuple and the
> > tuple that itself failed. Intervening transformations could have changed
> > the original tuple. I realize that adds a lot of complexity to your idea
> > and may not be feasible.
> >
> > On Tue, Sep 20, 2016 at 1:15 AM S G <sg.online.em...@gmail.com> wrote:
> >
> > > Hi,
> > >
> > > I want to gather some thoughts on a suggestion to provide a dead-letter
> > > functionality common to all spouts/bolts.
> > >
> > > Currently, if any spout / bolt reports a failure, it is retried by the
> > > spout.
> > > For a single bolt failure in a large DAG, this retry logic can cause
> > > several perfectly successful components to replay, and yet the Tuple
> > > could fail at exactly the same bolt on retry.
> > >
> > > This is usually fine (if the failure was temporary, say due to a
> > > network glitch), but sometimes the message is bad enough that it should
> > > not be retried, yet important enough that its failure should not be
> > > ignored.
> > >
> > > Example: ElasticSearch-bolt receiving bytes from Kafka-Spout.
> > >
> > > Most of the time, it is able to deserialize the bytes correctly, but
> > > sometimes a badly formatted message fails to deserialize. In such cases,
> > > the Kafka-Spout should not retry, nor should the ES-bolt report a
> > > success. It should, however, be reported to the user somehow that a
> > > badly serialized message entered the stream.
> > >
> > > For cases like a temporary network glitch, the Tuple should be retried.
> > >
> > > So the proposal is to implement a dead-letter topic as:
> > >
> > > 1) Add a new method *failWithoutRetry(Tuple, Exception)* in the
> > > collector. Bolts will begin using it once it's available for use.
> > >
> > > 2) Provide the ability to *configure a dead-letter data-store in the
> > > spout* for failed messages reported by #1 above.
> > >
> > >
> > > The configurable data-store should support Kafka, Solr and Redis to
> > > begin with (plus the option to implement one's own and drop a jar file
> > > in the classpath).
> > >
> > > Such a feature should benefit all the spouts as:
> > >
> > > 1) Topologies will not block replaying the same doomed-to-fail tuples.
> > > 2) Users can set alerts on dead-letters and easily find actual problems
> > > in their topologies, rather than analyzing all failed tuples only to
> > > find that they failed because of a temporary network glitch.
> > > 3) Since the entire Tuple is put into the dead-letter, all the data is
> > > available for retrying after fixing the topology code.
> > >
> > > Please share your thoughts if you think it can benefit Storm in a
> > > generic way.
> > >
> > > Thx,
> > > SG
> > >
> >
>

   
