I was unaware of the Teradata dashboard!  Outstanding to see.

I can take point on design discussion and documentation for this, but in the 
end it'll be up to each provider to update their own infra, so there is only so 
much I can do.

I didn't really expect this to catch on so enthusiastically.  One other thing
I had considered but dropped from the initial idea was adding an optional
field to provider.yaml with a dashboard URL.  Currently it is up to the
provider to manually add a link to the list on the ecosystem page.  If we
make it part of the yaml file, new providers might see it when looking for a
template and jump on board.  It would also make the dashboards more
programmatically discoverable, perhaps even something that could be used to
generate a docs page and skip the manual step of adding the link to the
ecosystem page, if someone wants to do that at some point.  Given how this
discussion has caught on, maybe it should be two fields: 'dashboard-html' and
'dashboard-json' (or -xml, or whatever we decide the vended format is) -
something like the snippet below.
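
Purely illustrative, not an agreed convention - the field names are just the
strawman above, and the URLs are the existing AWS ones mentioned later in this
thread:

# Hypothetical additions to a provider.yaml; names are not final.
dashboard-html: https://aws-mwaa.github.io/open-source/system-tests/dashboard.html
dashboard-json: https://aws-mwaa.github.io/open-source/system-tests/dashboard.json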


Since at least half of the existing dashboards already export some form of
JSON, I'd propose we stick with JSON unless someone has a compelling reason to
switch to XML.  I looked into junit-xml and I like the way it breaks down its
schema, so maybe we JSON-ify that with some tweaks?

Proposed formatting:

{
  "testsuite": {
    "provider": string,    [REQUIRED]
    "tests": int,          [REQUIRED]  // Could drop this since it's just
                                       // len(testcases), but by that same
                                       // logic it's easy enough to include.
    "errors": int,         [REQUIRED]
    "failures": int,       [REQUIRED]
    "skipped": int,        [REQUIRED]
    "timestamp": string,   [REQUIRED]  // Standardize on UTC?
    "duration": float,     [OPTIONAL]  // Seconds?
    "properties": {},      [OPTIONAL]  // Free-form; a provider may add
                                       // whatever extra values they want
                                       // in this block.
    "testcases": [
      {
        "name": string,        [OPTIONAL]
        "file": string,        [REQUIRED]
        "duration": float,     [OPTIONAL]  // Seconds?
        "result": {
            "state": "SUCCESS" | "SKIPPED" | "FAILURE",   [REQUIRED]
            "message": string, [OPTIONAL]
            "type": string,    [OPTIONAL]  // Exception type in the case
                                           // of a failure.
        },
        "properties": {}       [OPTIONAL]  // Free-form and optional,
                                           // same as above.
      }
    ]
  }
}
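
To show how a provider might produce this, here is a rough, untested Python
sketch that converts a pytest junit-xml report into the shape above.  Treat it
as a starting point only: the function name and the "report.xml" path are
placeholders, and the "file" attribute is only emitted for some pytest
junit_family settings, so it falls back to "classname".

# Sketch: convert a pytest junit-xml report into the proposed dashboard JSON.
import json
import xml.etree.ElementTree as ET


def junit_to_dashboard_json(xml_path: str, provider: str) -> dict:
    root = ET.parse(xml_path).getroot()
    # pytest wraps the suite in <testsuites>; handle both layouts.
    suite = root if root.tag == "testsuite" else root.find("testsuite")

    testcases = []
    for case in suite.iter("testcase"):
        failure = case.find("failure")
        error = case.find("error")
        skipped = case.find("skipped")
        if failure is not None or error is not None:
            # junit-xml distinguishes errors from failures; both map to
            # FAILURE here, while the suite-level "errors" count is kept.
            node = failure if failure is not None else error
            result = {
                "state": "FAILURE",
                "message": node.get("message", ""),
                "type": node.get("type", ""),
            }
        elif skipped is not None:
            result = {"state": "SKIPPED", "message": skipped.get("message", "")}
        else:
            result = {"state": "SUCCESS"}
        testcases.append({
            "name": case.get("name"),
            "file": case.get("file") or case.get("classname"),
            "duration": float(case.get("time", 0)),
            "result": result,
        })

    return {
        "testsuite": {
            "provider": provider,
            "tests": int(suite.get("tests", len(testcases))),
            "errors": int(suite.get("errors", 0)),
            "failures": int(suite.get("failures", 0)),
            "skipped": int(suite.get("skipped", 0)),
            "timestamp": suite.get("timestamp", ""),
            "duration": float(suite.get("time", 0)),
            "testcases": testcases,
        }
    }


if __name__ == "__main__":
    print(json.dumps(junit_to_dashboard_json("report.xml", "AWS"), indent=2))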

Sample:


{
  "testsuite": {
    "provider": "AWS",
    "tests": 3,
    "errors": 0,
    "failures": 1,
    "skipped": 1,
    "timestamp": "2020-01-26T13:45:02",
    "duration": 139.89,
    "properties": {
      "commit": "ef7bebf",
      "executor": "celery"
    },
    "testcases": [
      // Example of a successful test.
      {
        "name": "example_appflow",
        "file": "tests/system/providers/amazon/aws/example_appflow.py",
        "duration": 45.87,
        "result": {
            "state": "SUCCESS"
        },
        "properties": {
          "source": "https://github.com/apache/airflow/blob/main/tests/system/providers/amazon/aws/example_appflow.py",
          "operators": [
            "AppflowRunOperator",
            "S3CreateBucketOperator",
            "S3CreateObjectOperator",
            "S3DeleteBucketOperator"
          ]
        }
      },
      // Example of a test case that was skipped.
      {
        "name": "example_athena",
        "file": "tests/system/providers/amazon/aws/example_athena.py",
        "duration": 0.01,
        "result": {
            "state": "SKIPPED",
            "message": "Message explaining why."
        },
        "properties": {
          "source": "https://github.com/apache/airflow/blob/main/tests/system/providers/amazon/aws/example_athena.py",
          "operators": [
            "AthenaOperator",
            "S3CreateBucketOperator",
            "S3CreateObjectOperator",
            "S3DeleteBucketOperator"
          ]
        }
      },
      // Example of a test case that failed.
      {
        "name": "example_batch",
        "file": "tests/system/providers/amazon/aws/example_batch.py",
        "duration": 94.01,
        "result": {
            "state": "FAILURE",
            "message": "Some failure message, maybe a link to logs or a stack trace?",
            "type": "AssertionError"
        },
        "properties": {
          "source": "https://github.com/apache/airflow/blob/main/tests/system/providers/amazon/aws/example_batch.py",
          "operators": [
            "BatchCreateComputeEnvironmentOperator",
            "BatchComputeEnvironmentSensor",
            "BatchJobQueueSensor",
            "BatchOperator",
            "BatchSensor"
          ]
        }
      }
    ]
  }
}
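
And to illustrate the sort of thing a consumer (a user-built tool, or maybe a
canary-build step like Jarek described) could do once a few providers publish
this, a minimal sketch - the URLs are placeholders and it assumes every feed
has adopted the schema above:

# Sketch of a consumer: pull each provider's published JSON and summarize failures.
import json
from urllib.request import urlopen

DASHBOARD_FEEDS = {
    # Hypothetical URLs; each provider would publish its own endpoint,
    # ideally discoverable via the provider.yaml field floated above.
    "AWS": "https://example.com/aws/dashboard.json",
    "Google": "https://example.com/google/dashboard.json",
}


def summarize(feeds: dict) -> None:
    for provider, url in feeds.items():
        with urlopen(url) as resp:
            suite = json.load(resp)["testsuite"]
        failed = [
            case["name"]
            for case in suite.get("testcases", [])
            if case["result"]["state"] == "FAILURE"
        ]
        print(f"{provider}: {suite['failures']}/{suite['tests']} failing", failed or "")


if __name__ == "__main__":
    summarize(DASHBOARD_FEEDS)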


 - ferruzzi


________________________________
From: Eugen Kosteev <eu...@kosteev.com>
Sent: Tuesday, June 25, 2024 6:04 AM
To: dev@airflow.apache.org
Subject: RE: [EXT] System Test Dashboards - Phase Two??

+1 to the idea of standardizing the format of the system test results output

On Tue, Jun 25, 2024 at 10:40 AM Jarek Potiuk <ja...@potiuk.com> wrote:

> So we have almost everyone on board!
>
> Now we also need the Teradata team to add whatever JSON/XML we come up with
> :). In case people have not noticed, among our dashboards [1] we also have
> Teradata dashboard [2]
>
> [1]
>
> https://airflow.apache.org/ecosystem/#airflow-provider-system-test-dashboards
> [2] https://teradata.github.io/airflow/
>
> Anyone would like to take a lead on it? I am personally fine with either
> approach - Junit xml (or json version of it), or custom json is fine for
> me.
>
> J.
>
>
> On Tue, Jun 25, 2024 at 10:24 AM Pankaj Koti
> <pankaj.k...@astronomer.io.invalid> wrote:
>
> > For context, the Astronomer LLM providers dashboard operates as follows:
> >
> > 1. Fetch the latest source code for providers and system tests/example
> DAGs
> > from the Airflow repository, deploy them to an Airflow instance, and
> > execute the
> > DAGs.
> > 2. Use the Airflow API to retrieve the DAG run statuses and produce a
> JSON
> > output of these statuses.
> > 3. The dashboard, hosted on GitHub Pages, consumes the JSON data
> > generated in step 2.
> >
> > We are willing to adopt and adhere to a JSON or XML specification and a
> > model HTML view if one is established.
> >
> > Best regards,
> >
> > *Pankaj Koti*
> > Senior Software Engineer (Airflow OSS Engineering team)
> > Location: Pune, Maharashtra, India
> > Timezone: Indian Standard Time (IST)
> >
> >
> > On Mon, Jun 24, 2024 at 11:40 PM Ferruzzi, Dennis
> > <ferru...@amazon.com.invalid> wrote:
> >
> > > >  The information in our database is similar to the structure of the
> AWS
> > > providers json file
> > > >  https://aws-mwaa.github.io/open-source/system-tests/dashboard.json
> +
> > a
> > > field for logs.
> > > >  We also have an extra field that specifies the commit-id against
> which
> > > the CI was run,
> > > >  which I believe is helpful in case users want to know whether their
> PR
> > > was merged before
> > > >  or after a failure.
> > >
> > > The commit ID is a handy addition for sure, I may look into adding that
> > to
> > > the AWS dashboard.  I haven't had a chance to look into junit-xml yet,
> > but
> > > I think what we could do is agree on a minimum structure and allow for
> > > extras.   For example, logs are great, but if Google provides them and
> > AWS
> > > doesn't, that shouldn't break anything for the user trying to fetch
> logs.
> > > But the test name, timestamp, and success/fail state are definitely
> among
> > > the required minimum fields.
> > >
> > > > we could consider enforcing the presence of *some* dashboard that
> shows
> > > results of regular system tests executions for any new provider.
> > >
> > > The issue there is that smaller providers come and go, and are often
> > added
> > > by community members, not even necessarily with the provider's
> knowledge.
> > >  We can't force them to provide any support.  If Random Contributor
> adds
> > > support for a new provider, neither the contributor nor the provider
> can
> > be
> > > required to provide hosting for a dashboard and infrastructure to run
> the
> > > tests.  So (for the foreseeable future) the dashboards need to be an
> > opt-in
> > > project by/for the providers.   Maybe some day the project might be
> able
> > to
> > > provide hosting for the smaller dashboards or something, but I think
> the
> > > infrastructure to run the tests will always be optional and at the
> > expense
> > > (and effort) of some other interested party (almost certainly the
> > provider
> > > themselves, but who knows... ).
> > >
> > >
> > >  - ferruzzi
> > >
> > >
> > > ________________________________
> > > From: Michał Modras <michalmod...@google.com.INVALID>
> > > Sent: Monday, June 24, 2024 5:20 AM
> > > To: dev@airflow.apache.org
> > > Subject: RE: [EXT] System Test Dashboards - Phase Two??
> > >
> > > Hi,
> > >
> > > +1 to this idea. I think standardizing the format of the presented test
> > run
> > > results makes sense. I also agree that we don't necessarily need to
> > enforce
> > > it in any hard way. However, given that we have dashboards of these
> three
> > > major providers, we could consider enforcing the presence of *some*
> > > dashboard
> > > that shows results of regular system tests executions for any new
> > provider.
> > > WDYT?
> > >
> > > Best,
> > > Michal
> > >
> > > On Sun, Jun 23, 2024 at 10:09 PM Freddy Demiane
> > > <fdemi...@google.com.invalid>
> > > wrote:
> > >
> > > > Hello,
> > > >
> > > > Thank you for the comments! Indeed, +1 to the idea, I believe this
> > would
> > > be
> > > > a good step to increase the quality of providers. From our (Google)
> > side,
> > > > the dashboard's CI outputs the results to a database, which are then
> > used
> > > > to generate an HTML page. Yet, generating and publishing a JSON or a
> > > JUnit
> > > > XML style file would be a simple task for us. The information in our
> > > > database is similar to the structure of the AWS providers json file
> > > > https://aws-mwaa.github.io/open-source/system-tests/dashboard.json +
> > > > a field for logs. We also have an extra field that specifies the
> > > commit-id
> > > > against which the CI was run, which I believe is helpful in case
> users
> > > want
> > > > to know whether their PR was merged before or after a failure.
> > > > If we want to go with the junit-xml style format (I checked this
> > > reference
> > > > <
> > > >
> > >
> >
> https://www.ibm.com/docs/en/developer-for-zos/16.0?topic=formats-junit-xml-format
> > > > >),
> > > > one thing I could think of is to make each "Dashboard CI run"
> generate
> > an
> > > > xml file where each test is represented by a testcase, which as Jarek
> > > > mentioned, could be used in some way in the canary builds.
> > > > Let me know what you think.
> > > >
> > > > Best,
> > > > Freddy
> > > >
> > > >
> > > > On Fri, Jun 21, 2024 at 11:12 AM Jarek Potiuk <ja...@potiuk.com>
> > wrote:
> > > >
> > > > > This is a fantastic idea! I love it !
> > > > >
> > > > > It also has some very far reaching possible spin-offs in the
> future -
> > > > > literally few days ago, when I discussed some of the future
> security
> > > > > related work that we might want to do, there was a concept of
> having
> > a
> > > > sort
> > > > > of CI of all CIs where we (and by we I mean wider Python ecosystem)
> > > could
> > > > > gather a status of pre-release versions of dependencies before they
> > hit
> > > > > release stage, and some kind of interchange between those CI
> systems
> > > that
> > > > > will be machine-parseable is pretty much prerequisite for that. So
> we
> > > > could
> > > > > generally try it out and sort out some issues, see how it works in
> > our
> > > > > small "airflow" world, but in the future we might be able to use
> > > similar
> > > > > mechanisms to get alerts for a number of our dependencies - and
> even
> > > > > further than that, we could make such approach much more
> wide-spread
> > (I
> > > > am
> > > > > discussing it with people from Python Software Foundation/Packaging
> > > team
> > > > /
> > > > > Python security, so there is a chance this might actually
> materialize
> > > in
> > > > a
> > > > > long term). This would be the first step.
> > > > >
> > > > > I think the first step for it could be rather simple and we do not
> > have
> > > > to
> > > > > invent our own standard - we could easily start with junit-xml
> style
> > > > output
> > > > > produced by each dashboard and available under some URL that we
> could
> > > > pull
> > > > > in our canary builds and have a step in our canary builds that
> could
> > > > > aggregate multiple xmlunit files coming from various dashboards,
> > > display
> > > > > them as the output, and fail the job in case some tests are failing
> > > (with
> > > > > maybe some thresholds). Pytest and a number of tools natively
> > supports
> > > > the
> > > > > junit-xml format, it's pretty established as machine-readable test
> > > > results,
> > > > > and I think it has all we need to start with
> > > > >
> > > > >
> > > >
> > >
> >
> https://docs.pytest.org/en/latest/how-to/usage.html#creating-junitxml-format-files
> > > > > .
> > > > > There is a lot of tooling around this format - including easy ways
> > how
> > > we
> > > > > could possibly integrate it with Github Actions output (think links
> > to
> > > > the
> > > > > tests that failed directly in GitHub UI), showing logs of failed
> > tests
> > > > etc.
> > > > > etc.
> > > > >
> > > > > If we can get the Astronomer, Amazon and Google team on board with
> > it,
> > > we
> > > > > could likely implement a simple version quickly and iterate over
> it -
> > > > later
> > > > > we could think about possibly evolving that into a more extensible
> > > > > approach.
> > > > >
> > > > > J.
> > > > >
> > > > >
> > > > >
> > > > >
> > > > > On Thu, Jun 20, 2024 at 11:27 PM Ferruzzi, Dennis
> > > > > <ferru...@amazon.com.invalid> wrote:
> > > > >
> > > > > > Congrats to the Google team for getting their dashboard live, it
> > > looks
> > > > > > great!  I've been thinking of something for a while and thought
> I'd
> > > > > mention
> > > > > > it here.  I'm wearing a few different hats here so I'll try to
> > > clarify
> > > > > > context on my plural pronouns the best I can.
> > > > > >
> > > > > > Now that we [Providers] have a couple of big dashboards up, I'm
> > > curious
> > > > > if
> > > > > > we [Airflow dev community] might collaborate on a community
> > "optional
> > > > > > guideline" for a json (or yaml or whatever) format output on the
> > > > > dashboards
> > > > > > for any providers interested in participating.  I'm not
> interested
> > in
> > > > (or
> > > > > > trying to) impose any kind of hard-line policy or standard here,
> > but
> > > I
> > > > > > wonder if we [owners of the existing dashboards] might set some
> > > > > non-binding
> > > > > > precedent for future providers to join.  If others don't follow
> > suit,
> > > > > then
> > > > > > they wouldn't benefit from whatever uses folks come up with for
> the
> > > > data,
> > > > > > but I personally don't think we [Airflow] can or should try to
> > impose
> > > > > this
> > > > > > on providers.
> > > > > >
> > > > > > To my knowledge there are three provider-owned system test
> > dashboards
> > > > > > currently live, and I look forward to seeing more in time:
> > > > > >
> > > > > > Astronomer (found this LLM-specific one, not sure if there is
> > another
> > > > > > one): https://astronomer.github.io/llm-dags-dashboard/
> > > > > > AWS:
> > > > https://aws-mwaa.github.io/open-source/system-tests/dashboard.html
> > > > > > and
> > > > https://aws-mwaa.github.io/open-source/system-tests/dashboard.json
> > > > > > Google:
> > > > > >
> > > https://storage.googleapis.com/providers-dashboard-html/dashboard.html
> > > > > >
> > > > > > Each was developed independently, and the path/name of the Google
> > one
> > > > may
> > > > > > hint that there is already an alternative to the html view that
> I'm
> > > > just
> > > > > > not familiar with, so maybe we [the three providers] could
> > > collaborate
> > > > on
> > > > > > some precedent that others could follow?  We [AWS] already have
> > ours
> > > > > > exporting in json so discussion might start there and see where
> it
> > > > goes?
> > > > > > Either way... Even if we [Airflow] don't do anything with the
> > json, I
> > > > > bet a
> > > > > > user could find interesting things to build if we give them the
> > > tools.
> > > > > >  Maybe aggregating a dashboard which monitors (and alerts?) the
> > > status
> > > > of
> > > > > > the system tests which cover the operators their workflow depends
> > on,
> > > > > > maybe?  Who knows what someone may come up with once they have
> the
> > > > tools
> > > > > to
> > > > > > mix and match the data from various providers.
> > > > > >
> > > > > > Is there any interest in the idea of a "standard json schema" for
> > > these
> > > > > > and any future system test dashboards?
> > > > > >
> > > > > >
> > > > > >  - ferruzzi
> > > > > >
> > > > >
> > > >
> > >
> >
>


--
Eugene
