In the context of AIP-71 (slightly directing your attention there for
discussion purposes), I think it would be nice to be able to do:
dag = dag_load(ObjectStoragePath("dagfs://mydag?version=1"))
Having dag versioning as an fs implementation would open up additional
interesting avenues for DAG manipulation.
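
To make the idea concrete, here is a minimal sketch of how such a URI could be split into a DAG id and a version. Everything in it is an assumption for discussion: the "dagfs" scheme does not exist today, and DagRef and parse_dagfs_uri are hypothetical names, not Airflow APIs. It only uses the standard library.

```python
from dataclasses import dataclass
from typing import Optional
from urllib.parse import parse_qs, urlsplit


@dataclass(frozen=True)
class DagRef:
    """Hypothetical reference to a DAG, optionally pinned to a version."""

    dag_id: str
    version: Optional[int]  # None means no version was pinned in the URI


def parse_dagfs_uri(uri: str) -> DagRef:
    """Split a hypothetical dagfs://<dag_id>?version=<n> URI into its parts."""
    parts = urlsplit(uri)
    if parts.scheme != "dagfs":
        raise ValueError(f"expected a dagfs:// URI, got {uri!r}")
    if not parts.netloc:
        raise ValueError(f"missing DAG id in {uri!r}")

    # parse_qs returns a dict of lists; take the first "version" if present.
    raw_version = parse_qs(parts.query).get("version", [None])[0]
    version = int(raw_version) if raw_version is not None else None
    return DagRef(dag_id=parts.netloc, version=version)


ref = parse_dagfs_uri("dagfs://mydag?version=1")
print(ref.dag_id, ref.version)
```

A real fs implementation would sit behind ObjectStoragePath and resolve the version to actual DAG code; the sketch only covers the addressing part.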
BTW there is a data contract implementation that is gaining some traction:
https://github.com/datacontract/datacontract-cli
Bolke
> On 28 May 2024, at 16:16, Constance Martineau
> <[email protected]> wrote:
>
> Agreed. When Jed and team wrote the AIP, we intentionally limited the
> scope to DAGs since the AIPs were already really large, but the intention
> is to extend the concept to datasets.
>
> Funny that you bring up point #2. A few of us met last week to talk about
> DAG Versioning, and that use-case came up. Not only should you be allowed
> to declare the state of each version, you should also be able to pick a
> version for normally scheduled runs that is not necessarily the most recent
> (for example the most recent version tagged as prod), while also running
> other versions ad hoc, such as the draft version that may have just been
> deployed. Like Kaxil said, this will be covered by AIP-66.
>
>> On Tue, May 28, 2024 at 5:52 AM Kaxil Naik <[email protected]> wrote:
>>
>> Yes to both of the questions below, @Elad Kalif <[email protected]>. The
>> upcoming Data-Awareness AIPs should cover the first one, and the second
>> should be covered by AIP-66 once it is out of draft.
>>
>>> 1. Should datasets be also versioned?
>>> 2. Should we support executing more than 1 DAG version at a given time?
>>
>>
>>> On Tue, 28 May 2024 at 10:07, Elad Kalif <[email protected]> wrote:
>>>
>>> I have a general question (maybe somehow related to the DAG Bundle
>>> concept introduced in the AIPs).
>>> The way I see it, DAGs are tightly coupled with Datasets. Tasks take a
>>> dependency on a dataset and/or produce a dataset.
>>> We are focused on the versions of the code (DAG), but to make this play
>>> nicely we should also consider applying versions to datasets.
>>> Granted, not every change to DAG code means a change in dataset version,
>>> but we should consider whether we want to leave datasets versionless.
>>>
>>> I previously worked with some data products that allow versioning of
>>> tables, and it was really nice! It enabled the concept of a Data Contract
>>> (treating tables much like you treat an API) and it made things much
>>> easier. I sometimes even had two versions of the same workflow running,
>>> one for the new version and one for the deprecated version, thus allowing
>>> my customers the flexibility to migrate between the table versions before
>>> the deprecated version is discontinued.
>>>
>>> I am raising two main questions here:
>>> 1. Should datasets be also versioned?
>>> 2. Should we support executing more than 1 DAG version at a given time
>>> (allowing the user to declare a Draft/Production/Deprecated/Deleted state
>>> for each version)?
>>>
>>> On Wed, Mar 6, 2024 at 1:58 AM Jed Cunningham <[email protected]>
>>> wrote:
>>>
>>>> Hello everyone!
>>>>
>>>> I'm excited to start a discussion around DAG Versioning in Airflow.
>>>> It's been the most requested feature in the last 3 community surveys!
>>>>
>>>> AIP-63: DAG Versioning
>>>> <https://cwiki.apache.org/confluence/display/AIRFLOW/AIP-63%3A+DAG+Versioning>
>>>>
>>>> As this topic quickly becomes rather large, I've made AIP-63 an
>>>> umbrella AIP and split the specifics into separate AIPs:
>>>>
>>>> AIP-64: Keep TaskInstance try history
>>>> <https://cwiki.apache.org/confluence/display/AIRFLOW/AIP-64%3A+Keep+TaskInstance+try+history>
>>>> AIP-65: Improve DAG history in UI
>>>> <https://cwiki.apache.org/confluence/display/AIRFLOW/AIP-65%3A+Improve+DAG+history+in+UI>
>>>> [WIP] AIP-66: Execution of specific DAG code versions
>>>> <https://cwiki.apache.org/confluence/display/AIRFLOW/%5BWIP%5D+AIP-66%3A+Execution+of+specific+DAG+versions>
>>>>
>>>> AIP-64 and AIP-65 are ready to be discussed in depth, while AIP-66 is
>>>> there to provide an intentionally high level vision of what we may want
>>>> to tackle before Airflow's "DAG versioning" story is complete.
>>>>
>>>> Thanks,
>>>> Jed
>>>>
>>>
>>