+1 to the idea of standardizing the format of the system test results output

On Tue, Jun 25, 2024 at 10:40 AM Jarek Potiuk <ja...@potiuk.com> wrote:

> So we have almost everyone on board!
>
> Now we also need the Teradata team to add whatever JSON/XML we come up
> with :). In case people have not noticed, among our dashboards [1] we
> also have a Teradata dashboard [2].
>
> [1]
>
> https://airflow.apache.org/ecosystem/#airflow-provider-system-test-dashboards
> [2] https://teradata.github.io/airflow/
>
> Would anyone like to take the lead on it? I am personally fine with
> either approach - JUnit XML (or a JSON version of it), or a custom JSON
> format.
>
> J.
>
>
> On Tue, Jun 25, 2024 at 10:24 AM Pankaj Koti
> <pankaj.k...@astronomer.io.invalid> wrote:
>
> > For context, the Astronomer LLM providers dashboard operates as follows:
> >
> > 1. Fetch the latest source code for providers and system tests/example
> > DAGs from the Airflow repository, deploy them to an Airflow instance,
> > and execute the DAGs.
> > 2. Use the Airflow API to retrieve the DAG run statuses and produce a
> > JSON output of these statuses.
> > 3. The dashboard, hosted on GitHub Pages, consumes the JSON data
> > generated in step 2.
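[Editor's note: a minimal sketch of step 2 of the pipeline above. The endpoint and response fields follow the Airflow 2 stable REST API (`GET /api/v1/dags/{dag_id}/dagRuns`); the output shape is an illustrative assumption, not an agreed spec, and the fetching/auth step is omitted.]

```python
# Reduce an Airflow REST API /dagRuns response payload into a compact
# per-run status list, ready to be dumped as dashboard JSON.
import json


def summarize_dag_runs(api_payload: dict) -> list[dict]:
    """Keep only the fields a status dashboard needs from each DAG run."""
    return [
        {
            "dag_id": run["dag_id"],
            "state": run["state"],            # "success" / "failed" / ...
            "end_date": run.get("end_date"),  # ISO-8601 timestamp or None
        }
        for run in api_payload.get("dag_runs", [])
    ]


if __name__ == "__main__":
    # Sample payload shaped like an Airflow /dagRuns response.
    sample = {
        "dag_runs": [
            {"dag_id": "example_s3", "state": "success",
             "end_date": "2024-06-24T10:00:00+00:00"},
            {"dag_id": "example_gcs", "state": "failed",
             "end_date": "2024-06-24T10:05:00+00:00"},
        ]
    }
    print(json.dumps(summarize_dag_runs(sample), indent=2))
```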
> >
> > We are willing to adopt and adhere to a JSON or XML specification and a
> > model HTML view if one is established.
> >
> > Best regards,
> >
> > *Pankaj Koti*
> > Senior Software Engineer (Airflow OSS Engineering team)
> > Location: Pune, Maharashtra, India
> > Timezone: Indian Standard Time (IST)
> >
> >
> > On Mon, Jun 24, 2024 at 11:40 PM Ferruzzi, Dennis
> > <ferru...@amazon.com.invalid> wrote:
> >
> > > >  The information in our database is similar to the structure of
> > > >  the AWS providers json file
> > > >  https://aws-mwaa.github.io/open-source/system-tests/dashboard.json
> > > >  + a field for logs.
> > > >  We also have an extra field that specifies the commit-id against
> > > >  which the CI was run, which I believe is helpful in case users
> > > >  want to know whether their PR was merged before or after a failure.
> > >
> > > The commit ID is a handy addition for sure; I may look into adding
> > > that to the AWS dashboard.  I haven't had a chance to look into
> > > junit-xml yet, but I think what we could do is agree on a minimum
> > > structure and allow for extras.  For example, logs are great, but if
> > > Google provides them and AWS doesn't, that shouldn't break anything
> > > for a user trying to fetch logs.  But the test name, timestamp, and
> > > success/fail state are definitely among the required minimum fields.
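[Editor's note: a minimal sketch of the "minimum structure plus extras" idea above — required fields are enforced, and anything else a provider adds (logs, commit id, ...) passes through untouched. The field names here are illustrative, not an agreed schema.]

```python
# Validate the agreed minimum fields of a dashboard record while keeping
# provider-specific extras intact.
REQUIRED_FIELDS = {"test_name", "timestamp", "state"}


def normalize(record: dict) -> dict:
    """Raise if a required field is missing; otherwise return a copy,
    extras (e.g. "logs", "commit_id") included."""
    missing = REQUIRED_FIELDS - record.keys()
    if missing:
        raise ValueError(f"record missing required fields: {sorted(missing)}")
    return dict(record)
```

A consumer that only understands the minimum fields can ignore the extras; a consumer that knows about, say, a `logs` field can use it when present.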
> > >
> > > > we could consider enforcing the presence of *some* dashboard that
> > > > shows results of regular system tests executions for any new
> > > > provider.
> > >
> > > The issue there is that smaller providers come and go, and are often
> > > added by community members, not even necessarily with the provider's
> > > knowledge.  We can't force them to provide any support.  If Random
> > > Contributor adds support for a new provider, neither the contributor
> > > nor the provider can be required to provide hosting for a dashboard
> > > and infrastructure to run the tests.  So (for the foreseeable future)
> > > the dashboards need to be an opt-in project by/for the providers.
> > > Maybe some day the project might be able to provide hosting for the
> > > smaller dashboards or something, but I think the infrastructure to
> > > run the tests will always be optional and at the expense (and effort)
> > > of some other interested party (almost certainly the provider
> > > themselves, but who knows...).
> > >
> > >
> > >  - ferruzzi
> > >
> > >
> > > ________________________________
> > > From: Michał Modras <michalmod...@google.com.INVALID>
> > > Sent: Monday, June 24, 2024 5:20 AM
> > > To: dev@airflow.apache.org
> > > Subject: RE: [EXT] System Test Dashboards - Phase Two??
> > >
> > > Hi,
> > >
> > > +1 to this idea. I think standardizing the format of the presented
> > > test run results makes sense. I also agree that we don't necessarily
> > > need to enforce it in any hard way. However, given that we have
> > > dashboards from these three major providers, we could consider
> > > enforcing the presence of *some* dashboard that shows results of
> > > regular system tests executions for any new provider. WDYT?
> > >
> > > Best,
> > > Michal
> > >
> > > On Sun, Jun 23, 2024 at 10:09 PM Freddy Demiane
> > > <fdemi...@google.com.invalid>
> > > wrote:
> > >
> > > > Hello,
> > > >
> > > > Thank you for the comments! Indeed, +1 to the idea; I believe this
> > > > would be a good step to increase the quality of providers. From our
> > > > (Google) side, the dashboard's CI outputs the results to a database,
> > > > which are then used to generate an HTML page. Yet, generating and
> > > > publishing a JSON or a JUnit XML style file would be a simple task
> > > > for us. The information in our database is similar to the structure
> > > > of the AWS providers json file
> > > > https://aws-mwaa.github.io/open-source/system-tests/dashboard.json
> > > > + a field for logs. We also have an extra field that specifies the
> > > > commit-id against which the CI was run, which I believe is helpful
> > > > in case users want to know whether their PR was merged before or
> > > > after a failure.
> > > > If we want to go with the junit-xml style format (I checked this
> > > > reference:
> > > > https://www.ibm.com/docs/en/developer-for-zos/16.0?topic=formats-junit-xml-format
> > > > ), one thing I could think of is to make each "Dashboard CI run"
> > > > generate an xml file where each test is represented by a testcase,
> > > > which, as Jarek mentioned, could be used in some way in the canary
> > > > builds.
> > > > Let me know what you think.
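[Editor's note: a hedged sketch of the idea above — one `<testsuite>` per "Dashboard CI run", one `<testcase>` per system test. The attribute names follow common junit-xml conventions; the exact schema would still need to be agreed between the teams.]

```python
# Serialize per-test results into a junit-xml style document using only
# the standard library.
import xml.etree.ElementTree as ET


def to_junit_xml(run_name: str, results: list[dict]) -> str:
    """One <testcase> per system test; failed tests get a <failure> child."""
    failed = [r for r in results if r["state"] != "success"]
    suite = ET.Element("testsuite", name=run_name,
                       tests=str(len(results)), failures=str(len(failed)))
    for r in results:
        case = ET.SubElement(suite, "testcase", name=r["test_name"])
        if r["state"] != "success":
            ET.SubElement(case, "failure",
                          message=r.get("message", "system test failed"))
    return ET.tostring(suite, encoding="unicode")
```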
> > > >
> > > > Best,
> > > > Freddy
> > > >
> > > >
> > > > On Fri, Jun 21, 2024 at 11:12 AM Jarek Potiuk <ja...@potiuk.com>
> > wrote:
> > > >
> > > > > This is a fantastic idea! I love it!
> > > > >
> > > > > It also has some very far-reaching possible spin-offs in the
> > > > > future - literally a few days ago, when I discussed some of the
> > > > > future security-related work that we might want to do, there was
> > > > > a concept of having a sort of CI of all CIs, where we (and by we
> > > > > I mean the wider Python ecosystem) could gather the status of
> > > > > pre-release versions of dependencies before they hit the release
> > > > > stage - and some kind of machine-parseable interchange between
> > > > > those CI systems is pretty much a prerequisite for that. So we
> > > > > could try it out and sort out some issues, see how it works in
> > > > > our small "airflow" world, but in the future we might be able to
> > > > > use similar mechanisms to get alerts for a number of our
> > > > > dependencies. Even further than that, we could make such an
> > > > > approach much more widespread (I am discussing it with people
> > > > > from the Python Software Foundation / Packaging team / Python
> > > > > security, so there is a chance this might actually materialize in
> > > > > the long term). This would be the first step.
> > > > >
> > > > > I think the first step could be rather simple, and we do not
> > > > > have to invent our own standard - we could easily start with
> > > > > junit-xml style output produced by each dashboard and available
> > > > > under some URL that we could pull in our canary builds, and have
> > > > > a step in our canary builds that aggregates the multiple
> > > > > junit-xml files coming from various dashboards, displays them as
> > > > > the output, and fails the job in case some tests are failing
> > > > > (with maybe some thresholds). Pytest and a number of other tools
> > > > > natively support the junit-xml format, it's pretty established as
> > > > > a machine-readable test results format, and I think it has all we
> > > > > need to start with:
> > > > > https://docs.pytest.org/en/latest/how-to/usage.html#creating-junitxml-format-files
> > > > > There is a lot of tooling around this format - including easy
> > > > > ways we could possibly integrate it with GitHub Actions output
> > > > > (think links to the failed tests directly in the GitHub UI),
> > > > > showing logs of failed tests, etc.
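[Editor's note: a sketch of the aggregation step described above — given several junit-xml documents (one per provider dashboard; here passed in as strings, with the fetching-from-URLs part assumed), total up the results and decide whether the canary job should fail under a configurable threshold.]

```python
# Aggregate <testsuite> counters from multiple junit-xml documents and
# apply a simple failure threshold.
import xml.etree.ElementTree as ET


def aggregate(xml_docs: list[str], max_failures: int = 0) -> tuple[int, int, bool]:
    """Return (total tests, total failures, passed-under-threshold)."""
    total = failures = 0
    for doc in xml_docs:
        suite = ET.fromstring(doc)  # root element assumed to be <testsuite>
        total += int(suite.get("tests", "0"))
        failures += int(suite.get("failures", "0"))
    return total, failures, failures <= max_failures
```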
> > > > >
> > > > > If we can get the Astronomer, Amazon, and Google teams on board
> > > > > with it, we could likely implement a simple version quickly and
> > > > > iterate on it - later we could think about possibly evolving that
> > > > > into a more extensible approach.
> > > > >
> > > > > J.
> > > > >
> > > > >
> > > > >
> > > > >
> > > > > On Thu, Jun 20, 2024 at 11:27 PM Ferruzzi, Dennis
> > > > > <ferru...@amazon.com.invalid> wrote:
> > > > >
> > > > > > Congrats to the Google team for getting their dashboard live,
> > > > > > it looks great!  I've been thinking of something for a while
> > > > > > and thought I'd mention it here.  I'm wearing a few different
> > > > > > hats here, so I'll try to clarify the context of my plural
> > > > > > pronouns the best I can.
> > > > > >
> > > > > > Now that we [Providers] have a couple of big dashboards up, I'm
> > > > > > curious if we [Airflow dev community] might collaborate on a
> > > > > > community "optional guideline" for a json (or yaml or whatever)
> > > > > > format output on the dashboards for any providers interested in
> > > > > > participating.  I'm not trying to impose any kind of hard-line
> > > > > > policy or standard here, but I wonder if we [owners of the
> > > > > > existing dashboards] might set some non-binding precedent for
> > > > > > future providers to join.  If others don't follow suit, then
> > > > > > they wouldn't benefit from whatever uses folks come up with for
> > > > > > the data, but I personally don't think we [Airflow] can or
> > > > > > should try to impose this on providers.
> > > > > >
> > > > > > To my knowledge there are three provider-owned system test
> > > > > > dashboards currently live, and I look forward to seeing more
> > > > > > in time:
> > > > > >
> > > > > > Astronomer (found this LLM-specific one, not sure if there is
> > > > > > another one): https://astronomer.github.io/llm-dags-dashboard/
> > > > > > AWS: https://aws-mwaa.github.io/open-source/system-tests/dashboard.html
> > > > > > and https://aws-mwaa.github.io/open-source/system-tests/dashboard.json
> > > > > > Google: https://storage.googleapis.com/providers-dashboard-html/dashboard.html
> > > > > >
> > > > > > Each was developed independently, and the path/name of the
> > > > > > Google one may hint that there is already an alternative to the
> > > > > > html view that I'm just not familiar with, so maybe we [the
> > > > > > three providers] could collaborate on some precedent that
> > > > > > others could follow?  We [AWS] already have ours exporting in
> > > > > > json, so discussion might start there and see where it goes?
> > > > > > Either way... even if we [Airflow] don't do anything with the
> > > > > > json, I bet a user could find interesting things to build if we
> > > > > > give them the tools.  Maybe aggregating a dashboard which
> > > > > > monitors (and alerts on?) the status of the system tests which
> > > > > > cover the operators their workflow depends on?  Who knows what
> > > > > > someone may come up with once they have the tools to mix and
> > > > > > match the data from various providers.
> > > > > >
> > > > > > Is there any interest in the idea of a "standard json schema"
> > > > > > for these and any future system test dashboards?
> > > > > >
> > > > > >
> > > > > >  - ferruzzi
> > > > > >
> > > > >
> > > >
> > >
> >
>


-- 
Eugene
