Re: [controller-dev] [integration-dev] Clustering acceptance tests

Vratko Polak -X (vrpolak - PANTHEON TECHNOLOGIES at Cisco) Tue, 07 Feb 2017 06:05:59 -0800

Thanks Luis.

> but other tests are not, I will have to investigate this.

Keep us informed.

> 3) & 4) is probably controller cluster limitation.

Both jobs occasionally pass,
and I have opened a Bug [0] for the exceptions in the karaf log.
To me, it looks like an error in OpenflowPlugin
(as opposed to Controller) code.

> writing very fast (REST or internal app) on a shard follower DS, and reading 
> on the other follower.

We plan to expand controller-csit-3node-rest-clust-cars-perf-only-carbon,
but we are not sure yet whether this scenario will be included.

Vratko.

[0] https://bugs.opendaylight.org/show_bug.cgi?id=7750

From: Luis Gomez [mailto:ece...@gmail.com]
Sent: 7 February, 2017 08:35
To: Vratko Polak -X (vrpolak - PANTHEON TECHNOLOGIES at Cisco) 
<vrpo...@cisco.com>
Cc: integration-...@lists.opendaylight.org; 
controller-dev@lists.opendaylight.org; openflowplugin-dev 
<openflowplugin-...@lists.opendaylight.org>
Subject: Re: [integration-dev] Clustering acceptance tests

Here is what I know from OpenFlow plugin (cc-ing ofplugin devs):

* Does your project have a test plan mentioning specific cluster scenarios?

No written test plan, but we are running a bunch of cluster tests.

* Do you have any of such scenarios implemented as Robot suites?

1) 
https://jenkins.opendaylight.org/releng/view/CSIT-3node/job/openflowplugin-csit-3node-clustering-only-boron/
 -> Cluster HA test (DPN connects to all nodes). It used to pass except for 1 
test (member isolation with iptables); now I see this test is stable 
but other tests are not, I will have to investigate this.

2) 
https://jenkins.opendaylight.org/releng/view/CSIT-3node/job/openflowplugin-csit-3node-clustering-only-boron/
 -> Cluster non-HA test (DPN connects to 1 node), failing because of this old bug: 
https://bugs.opendaylight.org/show_bug.cgi?id=6459.

3) 
https://jenkins.opendaylight.org/releng/view/CSIT-3node/job/openflowplugin-csit-3node-periodic-bulkomatic-clustering-perf-daily-only-boron/
 -> Max flows/sec using bulk-o-matic DS on cluster setup. Not fully working 
because of some cluster backend limitation: 
https://bugs.opendaylight.org/show_bug.cgi?id=6755

4) 
https://jenkins.opendaylight.org/releng/view/CSIT-3node/job/openflowplugin-csit-3node-periodic-restconf-clustering-perf-daily-only-boron/
 -> Max flows/sec using NB REST on cluster setup. This never worked very well 
because of the previous bug.

* Do the robot suites have failures, suspected to be caused by clustering
  (as opposed to application logic, or mistakes in Robot code)?

So far I think the issue in 2) is in the OpenFlow cluster implementation, and 
the issue in 3) & 4) is probably controller cluster limitation.

* Are there open Bugs corresponding to the clustering failures?

Yes, except for 1), which will require some analysis of the unstable tests.

* Are you planning to implement more Robot 3node suites until Carbon release?

I will probably replace one of the performance suites (no point in running two 
if they do not work) with a cluster switch scalability test.

* Are there scenarios you would like Controller team to cover using mock apps?

I think the issue in 3) & 4) could be reproduced in the controller project by 
just writing very fast (REST or internal app) on a shard follower DS, and 
reading on the other follower.
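
A rough sketch of that reproduction over REST, for illustration only. The
member addresses, the RESTCONF path, the "example:items" model and the payload
shape are all hypothetical placeholders, not taken from any existing suite;
authentication is left out. It writes items in a tight loop on one follower and
reads each one back on the other follower right away, collecting the ids that
were not visible yet.

```python
import json
import urllib.error
import urllib.request

# Hypothetical addresses of the two shard followers; replace with real ones.
WRITER = "http://10.0.0.2:8181"
READER = "http://10.0.0.3:8181"
# Assumed RESTCONF path of a test list; the real model path will differ.
PATH = "/restconf/config/example:items/item/{id}"


def http_send(method, url, body=None):
    """Send one request and return the HTTP status code (no auth handling)."""
    data = json.dumps(body).encode() if body is not None else None
    req = urllib.request.Request(
        url, data=data, method=method,
        headers={"Content-Type": "application/json"})
    try:
        with urllib.request.urlopen(req, timeout=5) as resp:
            return resp.status
    except urllib.error.HTTPError as err:
        return err.code


def write_then_read(count, send=http_send):
    """Write `count` items on one follower, reading each back on the other.

    Returns the ids whose read did not succeed right after the write.
    """
    missing = []
    for i in range(count):
        # Hypothetical payload shape for the placeholder model.
        send("PUT", WRITER + PATH.format(id=i), {"item": [{"id": i}]})
        status = send("GET", READER + PATH.format(id=i))
        if status != 200:
            missing.append(i)
    return missing
```

Calling write_then_read(1000) against a real cluster would return the list of
ids that the second follower could not serve immediately. The `send` parameter
lets the loop be exercised without a cluster by passing a stub.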

On Feb 6, 2017, at 5:31 AM, Vratko Polak -X (vrpolak - PANTHEON TECHNOLOGIES at 
Cisco) <vrpo...@cisco.com> wrote:

Hello Test Contacts.

In the Controller project, our highest priority
for the Carbon release is to make sure ODL clustering
is usable and stable.

We are in the phase of formulating explicit acceptance criteria,
so we can create an execution plan for turning them into Robot suites.

Of course, clustering is not very useful just by itself;
it is a tool applications use to reach their goals.
So real acceptance criteria for clustering should also
take into account whether ODL applications can work in cluster.

Many projects are already running their 3node CSIT tests,
but some important scenarios might not be covered yet,
and some suites might be too unstable to serve as acceptance tests.

Controller team is small and busy, so we are asking for help.
Here is a set of quick questions for test contacts:
* Does your project have a test plan mentioning specific cluster scenarios?
* Do you have any of such scenarios implemented as Robot suites?
* Do the robot suites have failures, suspected to be caused by clustering
  (as opposed to application logic, or mistakes in Robot code)?
* Are there open Bugs corresponding to the clustering failures?
* Are you planning to implement more Robot 3node suites until Carbon release?
* Are there scenarios you would like Controller team to cover using mock apps?

Vratko (as a Controller test contact).
_______________________________________________
integration-dev mailing list
integration-...@lists.opendaylight.org
https://lists.opendaylight.org/mailman/listinfo/integration-dev

_______________________________________________
controller-dev mailing list
controller-dev@lists.opendaylight.org
https://lists.opendaylight.org/mailman/listinfo/controller-dev
