The scheduler should never fail hard. The scheduling logic that inserts a new task instance should insert one only if it doesn't already exist, and should isolate that check+insert inside a database transaction.
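To make the check+insert idea concrete, here is a minimal sketch. It is not Airflow's actual scheduler code: it uses the standard-library sqlite3 module as a stand-in database, and the table layout and the names make_db and ensure_task_instance are hypothetical, chosen only for illustration. The uniqueness key (task id, dag id, execution date) is the combination discussed in this thread.

```python
import sqlite3


def make_db():
    """Create an in-memory stand-in database (hypothetical schema)."""
    # isolation_level=None disables implicit transactions so that the
    # explicit BEGIN/COMMIT/ROLLBACK statements below are in full control.
    conn = sqlite3.connect(":memory:", isolation_level=None)
    conn.execute(
        """
        CREATE TABLE task_instance (
            task_id TEXT NOT NULL,
            dag_id TEXT NOT NULL,
            execution_date TEXT NOT NULL,
            state TEXT,
            PRIMARY KEY (task_id, dag_id, execution_date)
        )
        """
    )
    return conn


def ensure_task_instance(conn, task_id, dag_id, execution_date):
    """Insert the row only if it is missing.

    Returns True if a row was inserted, False if it already existed.
    A duplicate key is treated as "already there", never as a crash.
    """
    key = (task_id, dag_id, execution_date)
    # BEGIN IMMEDIATE takes the write lock up front, so the check and the
    # insert below happen inside one transaction.
    conn.execute("BEGIN IMMEDIATE")
    try:
        exists = conn.execute(
            "SELECT 1 FROM task_instance "
            "WHERE task_id = ? AND dag_id = ? AND execution_date = ?",
            key,
        ).fetchone()
        inserted = exists is None
        if inserted:
            conn.execute(
                "INSERT INTO task_instance "
                "(task_id, dag_id, execution_date, state) "
                "VALUES (?, ?, ?, 'scheduled')",
                key,
            )
        conn.execute("COMMIT")
        return inserted
    except sqlite3.IntegrityError:
        # Safety net: if a concurrent writer still won the race, treat the
        # duplicate key as "already exists" instead of raising.
        conn.execute("ROLLBACK")
        return False
    except Exception:
        conn.execute("ROLLBACK")
        raise
```

Calling ensure_task_instance twice with the same key inserts one row and returns False the second time. On a server database the same shape should apply, but the locking statement and the exception type would differ by backend, so treat those details as assumptions.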

Max

On Fri, Nov 2, 2018 at 5:38 AM Abhishek Sinha <abhis...@infoworks.io> wrote:

> Brian,
>
> We use the trigger dag CLI command to trigger it manually.
>
> Even when you have custom operators, the duplicate key error should not
> happen right? Shouldn't the combination of task id, dag id and execution
> date be unique?
>
> On 30 October 2018 at 10:23:27 PM, Abhishek Sinha (abhis...@infoworks.io) wrote:
>
> Max,
>
> The schedule interval is 1 day.
>
> Sent from my iPhone
>
> > On 30-Oct-2018, at 9:29 PM, Maxime Beauchemin <maximebeauche...@gmail.com> wrote:
> >
> > Also what's your schedule interval? I'm just trying to confirm that this
> > isn't a "run every minute, or anytime someone blinks" kind of DAG.
> >
> > Max
> >
> > On Tue, Oct 30, 2018 at 5:48 AM Brian Greene <br...@heisenbergwoodworking.com> wrote:
> >
> >> How do you trigger it externally?
> >>
> >> We have several custom operators that trigger other jobs and we had to be
> >> really careful or we’d get duplicate keys for the dag run and it would fail
> >> to kick off.
> >>
> >> One scheduler, but we saw it repeatedly and have it noted as a thing to
> >> watch out for.
> >>
> >> Brian
> >>
> >> Sent from a device with less than stellar autocorrect
> >>
> >>> On Oct 29, 2018, at 2:03 PM, Abhishek Sinha <abhis...@infoworks.io> wrote:
> >>>
> >>> Attaching the scheduler crash logs as well.
> >>>
> >>> https://pastebin.com/B2WEJKRB
> >>>
> >>> Regards,
> >>>
> >>> Abhishek Sinha | m: +919035191078 | e: abhis...@infoworks.io
> >>>
> >>> On Tue, Oct 30, 2018 at 12:19 AM Abhishek Sinha <abhis...@infoworks.io> wrote:
> >>>
> >>>> Max,
> >>>>
> >>>> We always trigger the DAG externally. I am not sure if there is still any
> >>>> backfill involved.
> >>>>
> >>>> Is there a way where I can find out in logs, if more than one instance of
> >>>> scheduler is running?
> >>>>
> >>>> On 29 October 2018 at 10:43:19 PM, Maxime Beauchemin (maximebeauche...@gmail.com) wrote:
> >>>>
> >>>> The stacktrace seems to be pointing in that direction. Id check that
> >>>> first. It seems like it **could** be a race condition with a backfill as
> >>>> well, unclear.
> >>>>
> >>>> It's still a bug though, and the scheduler should make sure to handle this
> >>>> and not raise/crash.
> >>>>
> >>>> On Mon, Oct 29, 2018, 10:05 AM Abhishek Sinha <abhis...@infoworks.io> wrote:
> >>>>
> >>>>> Max,
> >>>>>
> >>>>> I do not think there was more than one instance of scheduler running.
> >>>>> Since the scheduler crashed and it has been restarted, I cannot confirm it
> >>>>> now. Is there any log that can provide this information?
> >>>>>
> >>>>> Could there be a different cause apart from multiple scheduler instances
> >>>>> running?
> >>>>>
> >>>>> On 29 October 2018 at 9:30:56 PM, Maxime Beauchemin (maximebeauche...@gmail.com) wrote:
> >>>>>
> >>>>> Abhishek, are you running more than one scheduler instance at once?
> >>>>>
> >>>>> Max
> >>>>>
> >>>>> On Mon, Oct 29, 2018 at 8:17 AM Abhishek Sinha <abhis...@infoworks.io> wrote:
> >>>>>
> >>>>>> The issue is happening more frequently now. Can someone please look into
> >>>>>> this?
> >>>>>>
> >>>>>> On 24 September 2018 at 12:42:49 PM, Abhishek Sinha (abhis...@infoworks.io) wrote:
> >>>>>>
> >>>>>> Can someone please help in looking into this issue? It is critical since
> >>>>>> this has come up in one of our production environment. Also, this issue has
> >>>>>> appeared only once till now.
> >>>>>>
> >>>>>> Regards,
> >>>>>>
> >>>>>> Abhishek
> >>>>>>
> >>>>>> On 20-Sep-2018, at 10:18 PM, Abhishek Sinha <abhis...@infoworks.io> wrote:
> >>>>>>
> >>>>>> Any update on this?
> >>>>>>
> >>>>>> Regards,
> >>>>>>
> >>>>>> Abhishek
> >>>>>>
> >>>>>> On 18-Sep-2018, at 12:48 AM, Abhishek Sinha <abhis...@infoworks.io> wrote:
> >>>>>>
> >>>>>> Pastebin: https://pastebin.com/K6BMTb5K
> >>>>>>
> >>>>>> Regards,
> >>>>>>
> >>>>>> Abhishek
> >>>>>>
> >>>>>> On 18-Sep-2018, at 12:31 AM, Stefan Seelmann <m...@stefan-seelmann.de> wrote:
> >>>>>>
> >>>>>> On 9/17/18 8:19 PM, Abhishek Sinha wrote:
> >>>>>>
> >>>>>> Any update on this?
> >>>>>>
> >>>>>> Please find the scheduler error log attached.
> >>>>>>
> >>>>>> Can you share the full python stack trace?
> >>>>>>
> >>>>>> Seems the mailing list doesn't allow attachments. Either post the
> >>>>>> stacktrace inline, or post it somewhere at pastebin or so.