Yes, exactly. Sensors are ultimately just a few methods on top of a standard operator: https://airflow.apache.org/_modules/airflow/operators/sensors.html
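To make the "few methods on top of a standard operator" point concrete, here is a minimal plain-Python sketch with no Airflow import. SketchOperator, SketchSensor and SensorTimeout are hypothetical names used only for illustration, and the loop is loosely modelled on how a sensor polls; it is not Airflow's actual source.

```python
import time


class SensorTimeout(Exception):
    """Raised when the condition is still false after `timeout` seconds."""


class SketchOperator:
    """Hypothetical stand-in for a generic operator: a task with execute()."""

    def execute(self, context):
        raise NotImplementedError


class SketchSensor(SketchOperator):
    """Hypothetical sensor: an operator whose execute() keeps calling poke()."""

    def __init__(self, poke_interval=60.0, timeout=3600.0,
                 sleep=time.sleep, clock=time.monotonic):
        self.poke_interval = poke_interval  # seconds to wait between checks
        self.timeout = timeout              # give up after this many seconds
        self.sleep = sleep                  # injectable so tests need not wait
        self.clock = clock                  # injectable monotonic clock

    def poke(self, context):
        """Return True once the awaited condition holds. Subclasses override."""
        raise NotImplementedError

    def execute(self, context):
        started = self.clock()
        while not self.poke(context):
            if self.clock() - started > self.timeout:
                # From the outside this is an ordinary task failure.
                raise SensorTimeout("condition not met within timeout")
            self.sleep(self.poke_interval)
```

Since the polling loop runs inside execute(), anything that kills the process mid-wait ends the task the same way any other task would end.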
The BaseSensorOperator doesn't modify how retries work. You definitely want
a retry in the case of the worker running the sensor dying. But even a
temporary DNS outage or a dropped SSH connection might merit a retry too,
depending on how the operator was implemented (whether it performs any
retrying itself before causing a task failure).

On Tue, Jun 5, 2018 at 8:12 PM, Pedro Machado <pe...@205datalab.com> wrote:

> Hi James,
>
> I've noticed that some dags fail if the services are restarted while a
> sensor is waiting. Originally I didn't think retries would be relevant for
> a time sensor but it sounds like if the worker crashes, the only way for
> the sensor to rerun is if the retry count hasn't been met. Is this one of
> the points you are making?
>
> Thanks.
>
> On Tue, Jun 5, 2018 at 9:41 AM James Meickle <jmeic...@quantopian.com>
> wrote:
>
> > We have to use a lot of time sensors like this, for reports that shouldn't
> > be filed to a third party before a certain time of day. Since these sensors
> > are themselves tasks, they can fail to be scheduled or can fail, like if
> > the underlying worker instance dies. I would recommend double checking your
> > concurrency settings (esp. since you will have multiple days worth of DAGs
> > concurrently running) and your retry settings.
> >
> > On Tue, Jun 5, 2018 at 10:34 AM, Pedro Machado <pe...@205datalab.com>
> > wrote:
> >
> > > Thanks, Max!
> > >
> > > On Mon, Jun 4, 2018 at 12:47 PM Maxime Beauchemin <
> > > maximebeauche...@gmail.com> wrote:
> > >
> > > > The common standard is to have the execution_date aligned with the
> > > > partition date in the database (say 2018-08-08) and contain data from
> > > > 2018-08-08T00:00:000
> > > > to 2018-08-09T23:59:999.
> > > >
> > > > The partition date and execution_date match and correspond to the left
> > > > bound of the time interval processed.
> > > >
> > > > Then you'd use some sensors to make sure this cannot run until the
> > > > desired time or conditions are met.
> > > >
> > > > Max
> > > >
> > > > On Mon, Jun 4, 2018 at 5:46 AM Pedro Machado <pe...@205datalab.com>
> > > > wrote:
> > > >
> > > > > Hi. What is the recommended way to deal with data latency? For
> > > > > example, I have a feed that is not considered final until 72 hours
> > > > > have passed after the end of the daily period.
> > > > >
> > > > > For example, Monday's data would be ready by Thursday at 23:59.
> > > > >
> > > > > Should I pull data based on the execution date minus a 72 hour offset
> > > > > or use the execution date and somehow delay the data pull for 72
> > > > > hours?
> > > > >
> > > > > The latter would be more intuitive (data pull date = execution date)
> > > > > but I am not sure if it's a good pattern.
> > > > >
> > > > > Thanks,
> > > > >
> > > > > Pedro
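For the 72-hour latency question at the bottom of the thread, the "keep execution_date equal to the data date and wait" option comes down to a small date calculation. The sketch below uses only the standard library; earliest_run_time and data_is_final are hypothetical helper names, and the daily interval and 72-hour latency are assumptions taken from the example in the original question.

```python
from datetime import datetime, timedelta


def earliest_run_time(execution_date,
                      schedule_interval=timedelta(days=1),
                      latency=timedelta(hours=72)):
    """Return the first moment the data for `execution_date` counts as final.

    `execution_date` is the left bound of the period, so the period ends at
    execution_date + schedule_interval; the feed is final `latency` later.
    """
    period_end = execution_date + schedule_interval
    return period_end + latency


def data_is_final(execution_date, now):
    """Condition a time-based sensor could check on each poll."""
    return now >= earliest_run_time(execution_date)


# Monday 2018-06-04: the period ends Tuesday 00:00, so the data is final at
# Friday 2018-06-08 00:00, i.e. right after Thursday 23:59.
monday = datetime(2018, 6, 4)
print(earliest_run_time(monday))
```

A sensor whose check is data_is_final would hold the downstream pull until that moment, while the partition date and execution_date stay aligned.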