Hey Tom, I'm not aware of any patterns for this problem, sorry. Intuitively, I would send dead letters to a separate Kafka topic.

Best,
Robert

On Wed, Jul 22, 2020 at 7:22 PM Tom Fennelly <tfenne...@cloudbees.com> wrote:

> Thanks Chen.
>
> I'm thinking about errors that occur while processing a record/message that shouldn't be retried until after some "action" has been taken, vs. flooding the system with pointless retries, e.g.
>
> - A side output step might involve an API call to an external system, and that system is down atm, so there's no point retrying until further notice. For this we want to be able to send something to a DLQ.
> - We have some bad code that is resulting in an uncaught exception in very specific cases. We want these to go to a DLQ and only be retried after the appropriate fix has been made.
>
> The possible scenarios for this are numerous, so I think my main question would be: are there established general Flink patterns or best practices that can be applied for this, or is it something we'd need to hand-roll on a case-by-case basis with a side-output-type solution such as in your example? We can do that, but I just wanted to make sure I wasn't missing anything before heading down that road.
>
> Regards,
>
> Tom.
>
>
> On Wed, Jul 22, 2020 at 5:46 PM Chen Qin <qinnc...@gmail.com> wrote:
>
>> Could you be more specific on what "failed message" means here?
>>
>> In general, side output can do something like:
>>
>> def process(ele) {
>>   try {
>>     biz
>>   } catch {
>>     Sideout(ele + exception context)
>>   }
>> }
>>
>> process(func).sideoutput(tag).addSink(kafkasink)
>>
>> Thanks,
>> Chen
>>
>> *From:* Eleanore Jin <eleanore....@gmail.com>
>> *Sent:* Wednesday, July 22, 2020 9:25 AM
>> *To:* Tom Fennelly <tfenne...@cloudbees.com>
>> *Cc:* user <user@flink.apache.org>
>> *Subject:* Re: Recommended pattern for implementing a DLQ with Flink+Kafka
>>
>> +1, we have a similar use case for message schema validation.
>>
>> Eleanore
>>
>> On Wed, Jul 22, 2020 at 4:12 AM Tom Fennelly <tfenne...@cloudbees.com> wrote:
>>
>> Hi.
>>
>> I've been searching blogs etc. trying to see if there are established patterns/mechanisms for reprocessing of failed messages via something like a DLQ. I've read about using checkpointing and restarting tasks (not what we want, because we want to keep processing forward), and also how some use side outputs to filter "bad" data to a DLQ-style topic. Kafka has dead-letter-topic configs too, but it seems they can't really be used from inside Flink (from what I can see).
>>
>> We're running Flink+Kafka. I'm new to Flink+Kafka, so I'm not sure if there just isn't a defined pattern for this, or if I'm just not asking the right questions in my searches. I searched the archives here and don't see anything either, which obviously makes me think that I'm not thinking about this in the "Flink way" :-|
>>
>> Regards,
>>
>> Tom.
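[Editor's note: Chen's quoted pseudocode can be made concrete. Below is a Flink-free, plain-Java sketch of that routing logic; the `DlqSketch`, `DeadLetter`, and `biz` names are illustrative, not from the thread. In an actual Flink job the catch block would call `ctx.output(dlqTag, ...)` inside a `ProcessFunction`, and the side-output stream retrieved via `getSideOutput(dlqTag)` would be wired to its own Kafka sink for the dead-letter topic.]

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Function;

public class DlqSketch {
    // A dead letter pairs the original element with its failure context,
    // so the record can be inspected and replayed after a fix is deployed.
    public record DeadLetter(String element, String error) {}

    // Routes each element either to the main output or to the dead-letter
    // list, mirroring a ProcessFunction that side-outputs on exception.
    public static void process(List<String> input,
                               Function<String, String> biz,
                               List<String> mainOut,
                               List<DeadLetter> dlq) {
        for (String ele : input) {
            try {
                mainOut.add(biz.apply(ele));
            } catch (Exception e) {
                // In a real Flink job this would be ctx.output(dlqTag, ...),
                // and the side-output stream would get its own Kafka sink.
                dlq.add(new DeadLetter(ele,
                        e.getClass().getSimpleName() + ": " + e.getMessage()));
            }
        }
    }

    public static void main(String[] args) {
        List<String> mainOut = new ArrayList<>();
        List<DeadLetter> dlq = new ArrayList<>();
        // "oops" fails the hypothetical business logic and is diverted
        // to the DLQ; healthy records keep flowing forward.
        process(List.of("1", "2", "oops", "4"),
                s -> String.valueOf(Integer.parseInt(s) * 10),
                mainOut, dlq);
        System.out.println(mainOut);
        System.out.println(dlq);
    }
}
```

The key property this illustrates is the one Tom asks for: a failure does not stop or retry the pipeline; the bad record is captured with its exception context and processing continues forward.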