Re: [openstack-dev] [neutron][grafana][infra] how to read grafana

Ihar Hrachyshka Mon, 08 Aug 2016 07:49:13 -0700

Matthew Treinish <[email protected]> wrote:

On Mon, Aug 08, 2016 at 02:40:31PM +0200, Ihar Hrachyshka wrote:

Hi,


I was looking at grafana today, and spotted another weirdness.

See the periodic jobs dashboard:

http://grafana.openstack.org/dashboard/db/neutron-failure-rate?panelId=4&fullscreen

Currently it shows for me 100% failure rate for py34/oslo-master job,
starting from ~Aug 3. But when I go to openstack-health, I don’t see those
runs at all:

http://status.openstack.org/openstack-health/#/job/periodic-neutron-py34-with-neutron-lib-master

(^ The last run is July 31.)

But then when I drill down into files, I can see more recent runs, like:

http://logs.openstack.org/periodic/periodic-neutron-py34-with-neutron-lib-master/?C=M;O=A
http://logs.openstack.org/periodic/periodic-neutron-py34-with-neutron-lib-master/faa24e0/testr_results.html.gz

The last link points to a run from yesterday. And as you can see it is
passing.

That run isn't actually from yesterday; it's from July 30th. The directory shows a recent date, but the last modified dates for the individual files are older:


http://logs.openstack.org/periodic/periodic-neutron-py34-with-neutron-lib-master/faa24e0/
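
A minimal sketch of that check, assuming the listing has already been read into a mapping of file name to last-modified time. The timestamps below are made up for illustration and are not copied from the real listing; `run_timestamp` is a hypothetical helper.

```python
from datetime import datetime


def run_timestamp(file_mtimes):
    """Date a run by its newest file rather than by the directory entry."""
    return max(file_mtimes.values())


# Hypothetical values for illustration only.
directory_mtime = datetime(2016, 8, 7, 6, 10)
file_mtimes = {
    "console.html.gz": datetime(2016, 7, 30, 6, 12),
    "testr_results.html.gz": datetime(2016, 7, 30, 6, 12),
}

actual = run_timestamp(file_mtimes)
if actual.date() != directory_mtime.date():
    # The directory date is misleading; trust the file dates.
    print(f"directory says {directory_mtime:%b %d}, files say {actual:%b %d}")
```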

The openstack-health data goes up until the job started failing. This is likely because the failures occur early enough in the test run that no subunit output is generated for the run.

So, what’s wrong with the grafana dashboard? And why doesn’t
openstack-health show the latest runs?

On the openstack-health side it looks like you're running into an issue with using subunit2sql as the primary data source there. If you look at an example output from what's not in openstack-health, like:

http://logs.openstack.org/periodic/periodic-neutron-py34-with-neutron-lib-master/37cd5eb/console.html.gz

Nice! I guess you just picked one of those that is not present on the Health dashboard? Or did you do something more elaborate to come up with the link?

You'll see that the failure is occurring before any subunit output is generated (during the discovery phase of testr). If there is no subunit file in the log output for the run, then there is nothing to populate the subunit2sql DB with.
The grafana/graphite data doesn't share this limitation because it gets
populated directly by zuul.
This is a known limitation with openstack-health right now, and the plan to solve it is to add a zuul sql data store that we can query like subunit2sql for job-level information, and then use subunit2sql for more fine-grained details. The work on that currently depends on https://review.openstack.org/#/c/223333/, which adds the datastore to zuul. Once that lands we can work on having the openstack-health side consume that data in conjunction with subunit2sql.
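
A minimal sketch of the difference between the two data paths, assuming a run's log directory is given as a list of file names. The subunit file suffixes and the helper names are assumptions made for illustration, not the actual openstack-health or subunit2sql logic.

```python
# Assumed naming for subunit streams in a run's log directory.
SUBUNIT_SUFFIXES = (".subunit", ".subunit.gz")


def has_subunit_output(filenames):
    """True if the run left a subunit stream that subunit2sql could ingest."""
    return any(name.endswith(SUBUNIT_SUFFIXES) for name in filenames)


def visible_in(filenames):
    """List the dashboards on which a run would be expected to appear."""
    # zuul reports every run to graphite, whether or not tests started.
    sources = ["grafana/graphite"]
    # openstack-health only knows about runs that produced subunit output.
    if has_subunit_output(filenames):
        sources.append("openstack-health")
    return sources
```

A run that fails during test discovery leaves only something like `console.html.gz`, so `visible_in(["console.html.gz"])` lists graphite alone. That matches the symptom in this thread: failures show up in grafana but the runs are missing from openstack-health.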

-Matt Treinish

Just want to say a huge thank you for the reply. It both pointed me to the immediate problem to solve and gave me a wider perspective on the mechanics that I should be aware of. It’s great to work in a community of individuals who so often go the extra mile for their fellows.


Ihar


