Re: [openstack-dev] Auditing Openstack

Jacob Bushman Wed, 31 Jul 2013 08:41:31 -0700

That is an excellent point. I do think that catching this in the functional testing at the gate would be a great idea.


On 07/30/2013 08:55 PM, Sean Dague wrote:

I would definitely encourage you to think about how we could apply a tool like this in the OpenStack gate itself as you go through the process of opening it up. If we could catch those kinds of corruptions before the commits land, we would move the cost of finding those problems way down.

It obviously won't be able to do the scale you guys are doing, but I'd bet a large number of these corruptions are findable in the gate.

On 07/30/2013 10:48 PM, Jacob Bushman wrote:
I haven't opened it because currently it is too tied to our proprietary
platform.  I have actually submitted a talk for the summit and planned
on having an open version ready for this.

It is good to hear that I am not the only one out there dealing with
these sorts of issues and trying to find solutions.

On 07/30/2013 05:37 PM, Joshua Harlow wrote:
I would love that tool, is it opened??

I've thought about such a tool myself actually. Something that keeps
enough info on the compute node to be able to analyze the actual state of
the cluster and find discrepancies with what the various OpenStack DBs
believe is the 'state' of the cluster.

Seems like a great analysis tool. What corrective actions does it take (if
any)? E.g., the DB says X instances but there are really Y, then what? (Delete them?)

On 7/30/13 11:59 AM, "Jacob Bushman" wrote:
In our deployment we have a custom solution for the orchestration of
OpenStack through the API that connects with billing and other external
systems on the back end.

We have found that most of the corruption is introduced by messaging
issues in OpenStack. There are a myriad of edge cases where the status
in the database can become out of sync with what is actually running on
a compute node, for instance.

The basic concept of the auditing tools is to compare the information in
the database with the actual state of the compute node and identify
discrepancies.
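
To make the comparison described above concrete, here is a minimal sketch in Python. It is not the tool discussed in this thread. It assumes both views have already been collected as plain {instance_id: state} mappings; the names audit_host and Discrepancy and the three discrepancy kinds are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Discrepancy:
    instance_id: str
    kind: str    # "missing_on_host", "unknown_on_host" or "state_mismatch"
    detail: str


def audit_host(db_view, host_view):
    """Compare what the database believes with what a compute node reports.

    Both arguments are assumed to be {instance_id: state} mappings that
    were gathered elsewhere (hypothetical inputs, not a real OpenStack API).
    """
    findings = []
    # Recorded in the database but not found on the compute node.
    for instance_id in sorted(db_view.keys() - host_view.keys()):
        findings.append(Discrepancy(
            instance_id, "missing_on_host",
            f"database says {db_view[instance_id]}"))
    # Running on the compute node but unknown to the database.
    for instance_id in sorted(host_view.keys() - db_view.keys()):
        findings.append(Discrepancy(
            instance_id, "unknown_on_host",
            f"host says {host_view[instance_id]}"))
    # Known to both sides, but the recorded and observed states differ.
    for instance_id in sorted(db_view.keys() & host_view.keys()):
        if db_view[instance_id] != host_view[instance_id]:
            findings.append(Discrepancy(
                instance_id, "state_mismatch",
                f"database says {db_view[instance_id]}, "
                f"host says {host_view[instance_id]}"))
    return findings
```

The sketch only reports discrepancies. Deciding on a corrective action for each kind is left to the caller, since the right response depends on the deployment.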

This is accomplished by parsing the instance XML and the external ids of the
tap device, and by gathering other relevant data from the compute node. The
data is then passed through an API to our orchestration system, and a
combination of OpenStack API calls and DB queries is used to audit the compute
nodes and make sure the database and the compute nodes are in sync.
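
As an illustration of the first step only, the sketch below pulls the instance name, UUID, and tap device names out of a libvirt-style domain XML document using Python's standard library. The sample XML, its values, and the function name summarize_domain are made up for this example. Looking up the tap device's external ids and sending the result to an orchestration system are not shown.

```python
import xml.etree.ElementTree as ET

# Hypothetical, heavily trimmed domain XML; real definitions carry far more.
SAMPLE_DOMAIN_XML = """
<domain type='kvm'>
  <name>instance-00000001</name>
  <uuid>11111111-2222-3333-4444-555555555555</uuid>
  <devices>
    <interface type='bridge'>
      <mac address='fa:16:3e:00:00:01'/>
      <target dev='tap0a1b2c3d-4e'/>
    </interface>
  </devices>
</domain>
"""


def summarize_domain(xml_text):
    """Return the fields an audit might compare against the database."""
    root = ET.fromstring(xml_text.strip())
    interfaces = []
    for iface in root.findall("./devices/interface"):
        mac = iface.find("mac")
        target = iface.find("target")
        interfaces.append({
            # Either element may be absent, so guard before reading attributes.
            "mac": mac.get("address") if mac is not None else None,
            "tap": target.get("dev") if target is not None else None,
        })
    return {
        "name": root.findtext("name"),
        "uuid": root.findtext("uuid"),
        "interfaces": interfaces,
    }


summary = summarize_domain(SAMPLE_DOMAIN_XML)
```

A summary like this gives one record per instance on the node, which can then be matched by UUID against what the database holds.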

On 07/30/2013 11:17 AM, Joshua Harlow wrote:
Do you have a writeup of the corruption issues you have seen?

I would most definitely appreciate said tools.

Any little overview of what they do/are??

On 7/30/13 9:44 AM, "Jacob Bushman" wrote:
I have been working with various corruption issues within OpenStack:
issues like failed or partial provisions, Quantum port / IP corruption,
and database corruption. There are several edge cases that I have run
into where the existing periodic tasks to clean up corruption were
inadequate for our use case.

We really needed a more unified way to query through the entire stack.
To handle this on the scale that I am working with, I have developed
out-of-band auditing tools.

I feel something like this belongs in OpenStack and would be useful to
the community. I am wondering what other tools are available and if
this is something that is of interest.

~ Jacob

_______________________________________________
OpenStack-dev mailing list
[email protected]
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev