Thanks, Max!

On Mon, Jun 4, 2018 at 12:47 PM Maxime Beauchemin <maximebeauche...@gmail.com> wrote:
> The common standard is to have the execution_date aligned with the
> partition date in the database (say 2018-08-08) and contain data from
> 2018-08-08T00:00:00.000 to 2018-08-08T23:59:59.999.
>
> The partition date and execution_date match and correspond to the left
> bound of the time interval processed.
>
> Then you'd use some sensors to make sure this cannot run until the desired
> time or conditions are met.
>
> Max
>
> On Mon, Jun 4, 2018 at 5:46 AM Pedro Machado <pe...@205datalab.com> wrote:
>
> > Hi. What is the recommended way to deal with data latency? For example, I
> > have a feed that is not considered final until 72 hours have passed after
> > the end of the daily period.
> >
> > For example, Monday's data would be ready by Thursday at 23:59.
> >
> > Should I pull data based on the execution date minus a 72 hour offset or
> > use the execution date and somehow delay the data pull for 72 hours?
> >
> > The latter would be more intuitive (data pull date = execution date) but I
> > am not sure if it's a good pattern.
> >
> > Thanks,
> >
> > Pedro
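
The pattern described in the quoted reply (execution_date as the left bound of the interval, plus a sensor that holds the task back) can be sketched in plain Python. This is only an illustration under stated assumptions: PERIOD, LATENCY, and the three function names are hypothetical and do not come from anyone's actual DAG. In Airflow itself, the built-in TimeDeltaSensor (which waits for a timedelta after the execution_date plus the schedule interval) may cover this case; its import path depends on the Airflow version, so it is left out of the sketch.

```python
from datetime import datetime, timedelta

# Hypothetical settings for illustration; adjust to the real feed.
PERIOD = timedelta(days=1)      # one daily partition per execution_date
LATENCY = timedelta(hours=72)   # feed is final 72 hours after the period ends


def interval_bounds(execution_date):
    """Return (start, end) of the processed period; end is exclusive."""
    return execution_date, execution_date + PERIOD


def earliest_pull_time(execution_date):
    """Earliest moment the data for this execution_date counts as final."""
    _, end = interval_bounds(execution_date)
    return end + LATENCY


def is_ready(execution_date, now):
    """The condition a sensor would poll before letting the pull run."""
    return now >= earliest_pull_time(execution_date)


if __name__ == "__main__":
    monday = datetime(2018, 6, 4)  # execution_date == partition date
    print(interval_bounds(monday))
    print(earliest_pull_time(monday))
```

With this layout the pull for Monday's partition still carries Monday's execution_date, so partition date and execution date stay equal; only the start of the task is delayed until the end of Thursday.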