On 2/16/18 11:33 AM, Tom Pantelis wrote:
> 
> 
> On Fri, Feb 16, 2018 at 2:26 PM, Jamo Luhrsen <jluhr...@gmail.com> wrote:
> 
>     I'm analyzing CSIT failures for our Carbon SR3 candidate.
> 
>     Something nasty went wrong in a netvirt CSIT job in the middle of
>     the robot tests. Seems like all functionality is probably broken
>     after that.
> 
>     In the karaf.log [0] I see a message about an akka circuit breaker
>     that "Timed out", followed by a bunch of RuntimeExceptions:
>     Transaction aborted due to shutdown.
> 
> 
> Yeah, that means akka persistence failed, i.e. it timed out waiting for
> data to be written to disk. That kills the shard actor with no recovery.
> This can happen if there's slow disk access or contention in the env. We've
> seen this happen before in an internal CSIT env, before the disk issue there
> was resolved.

Thanks. I'll report to the infra guys that we are still likely seeing
some high disk IO latency. There was another job with similar issues.
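
For checking whether other jobs show the same signature, here is a rough
Python sketch that counts the two messages discussed above in a karaf log.
The patterns are assumptions taken from the wording in this thread, not
from the exact log text, and the names scan() and open_log() are made up
for illustration, so adjust them as needed.

```python
import gzip
import re
import sys

# Assumed patterns, based on how the messages are described in this thread.
# The real karaf.log wording may differ; tweak the regexes if so.
MARKERS = {
    "circuit_breaker_timeout": re.compile(r"circuit breaker timed out", re.IGNORECASE),
    "aborted_due_to_shutdown": re.compile(r"Transaction aborted due to shutdown", re.IGNORECASE),
}


def scan(lines):
    """Return {marker: (hit_count, first_line_number_or_None)} for log lines."""
    result = {name: [0, None] for name in MARKERS}
    for lineno, line in enumerate(lines, start=1):
        for name, pattern in MARKERS.items():
            if pattern.search(line):
                entry = result[name]
                entry[0] += 1
                if entry[1] is None:
                    entry[1] = lineno  # remember where the trouble started
    return {name: tuple(entry) for name, entry in result.items()}


def open_log(path):
    """Open a plain or gzip-compressed log file as text."""
    opener = gzip.open if path.endswith(".gz") else open
    return opener(path, "rt", encoding="utf-8", errors="replace")


if __name__ == "__main__":
    if len(sys.argv) == 2:
        with open_log(sys.argv[1]) as fh:
            for name, (count, first) in scan(fh).items():
                print(f"{name}: {count} hits, first at line {first}")
```

Run it against a downloaded copy of the log, passing the file path as the
only argument. If the first circuit breaker hit comes before the first
"aborted due to shutdown" hit, that would be consistent with the
persistence timeout killing the shard actor.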

JamO

>     Any ideas what's happening here?
> 
>     Thanks,
>     JamO
> 
>     
>     [0] https://logs.opendaylight.org/releng/vex-yul-odl-jenkins-1/netvirt-csit-1node-openstack-pike-upstream-stateful-snat-conntrack-carbon/200/odl_1/odl1_karaf.log.gz
>     _______________________________________________
>     controller-dev mailing list
>     controller-dev@lists.opendaylight.org 
> <mailto:controller-dev@lists.opendaylight.org>
>     https://lists.opendaylight.org/mailman/listinfo/controller-dev
>     <https://lists.opendaylight.org/mailman/listinfo/controller-dev>
> 
> 