Re: [controller-dev] [mdsal-dev] [integration-dev] 3node cluster regression in Carbon - since Jan 5th

Peretz, Ravit Tue, 10 Jan 2017 06:43:23 -0800

Hi guys,

Unfortunately, it seems like this may not be the only issue.
3node cluster jobs still fail after the patch was merged.

This was reported earlier by Vladimir Lavor:
akka.remote.artery.OutboundHandshake$HandshakeTimeoutException: Handshake with 
[akka://[email protected]:2550] did not complete within 
20000 ms

https://bugs.opendaylight.org/show_bug.cgi?id=7493

Please note that the commit Robert referred to first appeared in:
distribution-karaf-0.6.0-20170105.235121-2883.zip
https://logs.opendaylight.org/releng/jenkins092/controller-distribution-carbon/155/console.log.gz

The error we saw appeared a few hours (and a few distro builds) earlier:
0.6.0-20170105.222635-2880.zip
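
To make the ordering of the two snapshots explicit, here is a minimal sketch that parses the timestamp and build number out of the file names above. It assumes the names follow the usual Maven snapshot pattern (yyyyMMdd.HHmmss-build); the function and variable names are only illustrative.

```python
import re
from datetime import datetime

# Assumption: snapshot names end in <yyyyMMdd.HHmmss>-<build>.zip
SNAPSHOT_RE = re.compile(r"(\d{8}\.\d{6})-(\d+)\.zip$")

def parse_snapshot(name):
    """Return (timestamp, build_number) parsed from a snapshot file name."""
    match = SNAPSHOT_RE.search(name)
    if match is None:
        raise ValueError(f"not a snapshot name: {name}")
    stamp = datetime.strptime(match.group(1), "%Y%m%d.%H%M%S")
    return stamp, int(match.group(2))

# Snapshot that first contained the commit Robert referred to
commit_time, commit_build = parse_snapshot(
    "distribution-karaf-0.6.0-20170105.235121-2883.zip")
# Snapshot in which the handshake error was already seen
error_time, error_build = parse_snapshot("0.6.0-20170105.222635-2880.zip")

print("error seen before the commit landed:", error_time < commit_time)
print("time between snapshots:", commit_time - error_time)
print("builds between snapshots:", commit_build - error_build)
```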

Can controller folks please take another look?

Thanks,
Ravit.

-----Original Message-----
From: [email protected] 
[mailto:[email protected]] On Behalf Of Mainzer, Gal
Sent: Tuesday, 10 January 2017 06:34
To: Robert Varga <[email protected]>
Cc: [email protected]; [email protected]; Luis 
Gomez <[email protected]>; Jamo Luhrsen <[email protected]>; 
[email protected]; [email protected]; 
[email protected]
Subject: Re: [mdsal-dev] [integration-dev] [controller-dev] 3node cluster 
regression in Carbon - since Jan 5th

Maybe not as a gate job, but rather as a periodic job that runs every 4-6 hours.

At this stage, those jobs are stable enough (and if not, we are really close to 
that point) for a single failure to indicate a regression. All we need to agree 
on is that if that cloud suite is failing, all relevant projects should stop 
merging (even as a process rather than a mechanical Gerrit lock) until we have 
recovered from the regression.

We can add an additional job that, with a single click, collects all suspected 
commits from all relevant projects - as Jamo said, ~15 are dependent. This will 
reduce our analysis time, possibly even by reverting suspected commits just to 
recover from the regression and release the "lock".
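
A minimal sketch of what such a collection job could compute, assuming the merged commits of each project are already available as records with a merge time (how they are fetched from Gerrit is left out here). The Commit type, the function name and the sample data are hypothetical.

```python
from datetime import datetime
from typing import NamedTuple

class Commit(NamedTuple):
    project: str
    sha: str
    merged_at: datetime

def suspect_commits(commits, last_pass, first_fail, projects):
    """Commits from the given projects merged after the last passing run
    and no later than the first failing run, oldest first."""
    window = [c for c in commits
              if c.project in projects and last_pass < c.merged_at <= first_fail]
    return sorted(window, key=lambda c: c.merged_at)

# Hypothetical sample data, for illustration only
commits = [
    Commit("controller", "aaa111", datetime(2017, 1, 5, 21, 0)),
    Commit("ovsdb", "bbb222", datetime(2017, 1, 5, 23, 30)),
    Commit("mdsal", "ccc333", datetime(2017, 1, 5, 22, 10)),
    Commit("dlux", "ddd444", datetime(2017, 1, 6, 3, 0)),
]
suspects = suspect_commits(
    commits,
    last_pass=datetime(2017, 1, 5, 22, 0),
    first_fail=datetime(2017, 1, 6, 0, 0),
    projects={"controller", "mdsal", "ovsdb"},
)
for c in suspects:
    print(c.project, c.sha, c.merged_at.isoformat())
```

The resulting list is the candidate set for analysis or revert; a shorter period between runs keeps that set small.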

Without a proper dashboard I don't really expect all projects to monitor this, 
but as a first stage we can monitor that job (like we do today) and send a 
critical mail on certain failures.

Sent from my iPhone

On 10 Jan 2017, at 1:31, Robert Varga <[email protected]> wrote:

> On 01/09/2017 10:37 PM, Jamo Luhrsen wrote:
> so you mean to have this "cloud suite" run as a gating job on gerrit 
> patches for all projects that our "ODL for openstack" needs, I think. 
> That would be nice, but we would need to convince a lot of projects to do it. 
> Looks like at least 12 projects are dependencies for netvirt:
> 
> controller,dlux,genius,infrautils,mdsal,netconf,neutron,odlparent,openflowplugin,ovsdb,sfc,yangtools

Judging from how long it takes for -autorelease and -distcheck to stabilize for 
each release, I would hate to see such a job gate offset-0 patches.

In this particular set of projects, there is a history of breakage happening on 
OFP/OVSDB and OVSDB/SFC (I think) boundaries.

Just my .02,
Robert

_______________________________________________
mdsal-dev mailing list
[email protected]
https://lists.opendaylight.org/mailman/listinfo/mdsal-dev
_______________________________________________
controller-dev mailing list
[email protected]
https://lists.opendaylight.org/mailman/listinfo/controller-dev
