Sorry about being late to this. Love this change. Will look deep into this
in the next couple of days.

Thank you,
Kevin Y

On Sat, Feb 23, 2019 at 10:54 PM Tao Feng <[email protected]> wrote:

> I manage to test out the latest version of the branch which I think all my
> previous concerns don't exist anymore. Great job! The latest screenshot for
> the new graph view could be found in the pr comment.
>
> On Sat, Feb 23, 2019 at 3:30 PM Tao Feng <[email protected]> wrote:
>
> > Thanks for bringing this up.  I look at the pr before and here recaps
> some
> > of the previous concerns:
> > 1. There will have a new *giant *button next to "go" button in graph
> > view(it could be found in the screenshot in the pr) based on my last read
> > of the pr. It even exists after DAGRun is created. I don't think it is
> > ideal. Have you removed the button or not? I understand we need to read
> the
> > DAG File when run doesn't exist. But IMO we need a better UI than this.
> If
> > the latest UI removed the button, how user found out this link(
> > http://host/graph?dag_id=my_dag&read_from_file=True) ?
> > 2. Previous discussion mentioned that it requires user to run `airflow
> > resetdb` if there are historical DAGRun. Is it still the case or not?
> > 3. During Andrew's testing, he found that the DAG shape keeps changing
> > during the backfill. Have we fixed the issue?
> > 4. By reading the pr, I recall the graph_id is always increment by 1
> based
> > on last graph_id. The last graph_id matches the latest execution date
> > DAGRun. What happens to the case when the latest execution date doesn't
> > change while user backfill multiple days? Will all these backfill DAGRuns
> > get the same graph id in this case? Is it intentded?
> >
> > Let me if the latest version made this change. And I would suggest the
> > community to test out the pr in your setup(with historical DAGRuns) if
> > possible as it is potentially a big(breaking) change.  We would like to
> > make sure nothing unexpected happens if we are going with this direction.
> >
> > Thanks,
> > -Tao
> >
> > On Sat, Feb 23, 2019 at 11:41 AM Bas Harenslak <
> > [email protected]> wrote:
> >
> >> Let's discuss AIP-12 here:
> >>
> https://cwiki.apache.org/confluence/display/AIRFLOW/AIP-12+Persist+DAG+into+DB
> .
> >> It involves persisting the entire DAG into the metastore. For full
> details,
> >> please read the AIP.
> >>
> >> A PR was made to create “versioned graphs” given by option #3 in the
> AIP:
> >> https://github.com/apache/airflow/pull/4396. This led to a long
> >> discussion but has been quiet for the last few days. It would be sad to
> see
> >> the effort put in by @ffinfo not leading to anything. To recap the
> summary
> >> at the end of the PR, the current status is:
> >>
> >> Internal changes:
> >>
> >>
> >>   *   This PR persists task dependencies in a new table `dag_edge`.
> >>   *   The term "graph" is introduced in the code, this contains the
> >> structure of a DAG, so the "edges" (dependencies) and "nodes" (tasks).
> >>   *   A DagRun is bound to one `graph_id`.
> >>   *   Currently in Airflow only the latest version of a DAG is displayed
> >> in the UI (both graph & tree view). This means if you delete a task, you
> >> cannot see runs of that task in the past anymore.
> >>   *   In the graph view you can now see different graph versions,
> because
> >> we store both tasks and edges.
> >>   *   For the record: in the tree view you still only the latest version
> >> because it is not possible to combine all history into a single view.
> >>
> >> Changes from a user perspective:
> >>
> >>
> >>   *   Nothing in the tree view.
> >>   *   In the graph view, you can now view different "graphs" if you
> >> change the structure of your DAG. Note the graph view shows DAG runs. If
> >> you change your DAG without running it, it does not show in the graph
> view.
> >>   *   When you have no DAG runs, there is no graph to show. So, as
> >> @ffinfo described above he then reads the graph from the DAG file
> instead.
> >> You can see this behaviour in the graph view url:
> >>      *   if DagRuns exist: http://host/graph?dag_id=my_dag
> >>      *   if no DagRuns exist:
> >> http://host/graph?dag_id=my_dag&read_from_file=True
> >>   *   In the screenshots in
> >> https://github.com/apache/airflow/pull/4396#issuecomment-465217731, you
> >> see this case. Since this is more of an internal thing how Airflow
> works,
> >> and not really informative for the user, @ffinfo removed the message in
> his
> >> last commit.
> >>
> >> Judging by the PR comments, everybody likes the idea of persisting more
> >> of the DAG in the DB. All issues mentioned were addressed. It would be
> >> great to see this work merged in Airflow, so please discuss anything
> about
> >> the PR/AIP here.
> >>
> >> Cheers,
> >> Bas
> >>
> >
>

Reply via email to