Hi Sam, Robert,

Going by the observations made as early as September 2017 (https://lists.opendaylight.org/pipermail/netvirt-dev/2017-September/005518.html; thanks to Jamo for testing this out), enabling the tell-based protocol resulted in a 22% CSIT failure rate at the releng level. More details on the last sandbox and releng runs are below.
Having said that, since this result is three months old and multiple changes will have gone into netvirt + genius since then, it would be prudent to repeat the test with the latest Oxygen build (at the least, it would reduce the chance of misinterpreting netvirt + genius issues as MD-SAL issues). We will do one more sandbox run here at Ericsson with the latest ODL master and re-publish the results, with and without the tell-based protocol enabled, by the middle of next week. We will also try to run one round of bulk-flow provisioning with OFPlugin's bulk-o-matic test driver to see how the tell-based protocol behaves at scale.

Two runs were actually performed against Nitrogen between the last week of August and mid-September 2017, one on releng and one in the sandbox:

Releng run :
==========
https://logs.opendaylight.org/releng/jenkins092/netvirt-csit-3node-openstack-ocata-gate-stateful-nitrogen/7/log.html.gz

Sandbox run :
===========
https://logs.opendaylight.org/sandbox/jenkins091/netvirt-csit-3node-openstack-ocata-jamo-upstream-stateful-nitrogen/1/odl1_karaf.log.gz

Jamo's observations from sandbox run :

Results are not good. Looks like things pass from a black box perspective in our first l2 connectivity suite, but then lots of failures after that. I also notice that our non-failing keyword to write to the karaf log using ssh to the karaf shell is failing, even in the above passing suite.

Also, it's worth noting that in order to enable tell-based protocol I'm just stealing a controller robot suite to do the work and running it first. It makes the config change and reboots all the controllers.

In one karaf log (I only looked at one) I saw a bunch of WARN messages about "Unknown history .... ignoring..."
example:

FrontendClientMetadataBuilder | 215 - org.opendaylight.controller.sal-distributed-datastore - 1.7.0.SNAPSHOT | member 1-shard-topology-operational: Unknown history for aborted transaction member-1-datastore-operational-fe-4-txn-7810-1, ignoring

I also saw an ERROR about failure to serialize something or other:

2017-08-29 04:25:12,719 | ERROR | -dispatcher-3279 | EndpointWriter | 41 - com.typesafe.akka.slf4j - 2.4.18 | Failed to serialize remote message [class akka.actor.Status$Failure] | using serializer [class akka.serialization.JavaSerializer]. Transient association error (association remains live)
akka.remote.MessageSerializer$SerializationException: Failed to serialize remote message [class akka.actor.Status$Failure] using serializer [class akka.serialization.JavaSerializer].

Observations:
===========

Regards
Muthu

-----Original Message-----
From: Robert Varga [mailto:[email protected]]
Sent: Friday, January 12, 2018 2:11 AM
To: Sam Hague
Cc: Michael Vorburger; Muthukumaran K; Tom Pantelis; controller-dev; [email protected]; Kency Kurian
Subject: Re: [controller-dev] Should application code persist do retries on TransactionCommitFailedException caused by AskTimeoutException or could CDS be configured to retry more?

On 11/01/18 21:26, Sam Hague wrote:
> Robert,
>
> when you mention odlparent/yangtools integrated - what does that mean?

I meant the yangtools-2.0.0 stuff needs to be merged up -- which obviously was delayed way longer than anticipated.

> do we think that will happen for oxygen?

I would love to have it in, but it does have potential to cause breakage -- hence I am afraid we are out of runway.

> There are a number of clustering bugs open that all have
> AskTimeoutException listed in the traces. I think the idea is the tell
> based change will help and then we can dig deeper if the bugs still exist.

Yup.

> Muthu,
>
> how did your testing with tell for netvirt tests go? Were we safe
> switching to it?
*This* is the most critical question that needs to be answered. If netvirt and BGP greenlight it, I think we can make the switch ...

Regards,
Robert

_______________________________________________
controller-dev mailing list
[email protected]
https://lists.opendaylight.org/mailman/listinfo/controller-dev
