Hello, Mathew.

I'm seeing same issues with the gate.
I also tried to found out why gate job is failing. First ran into issue related 
to cinder installation failure in devstack. But then I found same problem as 
you described. The best option is to increase job time range. 
Thanks for such research. I hope gate will be fixed in the easiest way and for 
the shortest period of time.

Best regards
Denis Makogon.
Sent from an iPad

> 16 февр. 2014, в 00:46, "Lowery, Mathew" <mlow...@ebay.com> написал(а):
> 
> Hi all,
> 
> Issue #1: Jobs that need more than one hour
> 
> Of the last 30 Trove-Gate builds (spanning three days), 7 have failed due to 
> a Jenkins job-level timeout (not a proboscis timeout). These jobs had no 
> failed tests when the timeout occurred.
> 
> Not having access to the job config to see what the job looks like, I used 
> the console output to guess what was going on. It appears that a Jenkins 
> plugin named boot-hpcloud-vm is booting a VM and running the commands given, 
> including redstack int-tests. From the console output, it states that it was 
> supplied with an ssh_shell_timeout="7200". This is passed down to another 
> library called net-ssh-simple. net-ssh-simple has two timeouts: an idle 
> timeout and an operation timeout.
> 
> In the latest boot-hpcloud-vm, ssh_shell_timeout is passed down to 
> net-ssh-simple for both the idle timeout and the operation timeout. But in 
> older versions of boot-hp-cloud-vm, ssh_shell_timeout is passed down to 
> net-ssh-simple for only the idle timeout, leaving a default operation timeout 
> of 3600. This is why I believe these jobs are failing after exactly one hour.
> 
> FYI: Here are the jobs that failed due to the Jenkins job-level timeout (and 
> had no test failures when the timeout occurred) along with their associated 
> patch sets:
> https://rdjenkins.dyndns.org/job/Trove-Gate/2532/console 
> (http://review.openstack.org/73786)
> https://rdjenkins.dyndns.org/job/Trove-Gate/2530/console 
> (http://review.openstack.org/73736)
> https://rdjenkins.dyndns.org/job/Trove-Gate/2517/console 
> (http://review.openstack.org/63789)
> https://rdjenkins.dyndns.org/job/Trove-Gate/2514/console 
> (https://review.openstack.org/50944)
> https://rdjenkins.dyndns.org/job/Trove-Gate/2513/console 
> (https://review.openstack.org/50944)
> https://rdjenkins.dyndns.org/job/Trove-Gate/2504/console 
> (https://review.openstack.org/73147)
> https://rdjenkins.dyndns.org/job/Trove-Gate/2503/console 
> (https://review.openstack.org/73147)
> 
> Suggested action items:
> If it is acceptable to have jobs that run over one hour, then install the 
> latest boot-hpcloud-vm plugin for Jenkins which will increase the make the 
> operation timeout match the idle timeout.
> 
> Issue #2: The running time of all jobs is 1 hr 1 min
> 
> While the Jenkins job-level timeout will end the job after one hour, it also 
> appears to keep every job running for a minimum of one hour.  To be more 
> precise, the timeout (or minimum running time) occurs on the part of the 
> Jenkins job that runs commands on the VM; the VM provision (which takes about 
> one minute) is excluded from this timeout which is why the running time of 
> all jobs is around 1 hr 1 min. A sampling of console logs showing the time 
> the int-tests completed and when the timeout kicks in:
> 
> https://rdjenkins.dyndns.org/job/Trove-Gate/2531/console (00:01:03 wasted)
> 04:51:12 COMMAND_0: echo refs/changes/36/73736/2
> ...
> 05:50:10     335.41     proboscis.case.MethodTest (test_instance_created)
> 05:50:10     194.05     proboscis.case.MethodTest 
> (test_instance_returns_to_active_after_resize)
> 05:51:13 **************************************
> 05:51:13 ****** STDERR-BEGIN ******
> 
> https://rdjenkins.dyndns.org/job/Trove-Gate/2521/console (00:06:44 wasted)
> 21:11:44 COMMAND_0: echo refs/changes/89/63789/13
> ...
> 22:05:00     195.11     proboscis.case.MethodTest 
> (test_instance_returns_to_active_after_resize)
> 22:05:00     186.89     proboscis.case.MethodTest (test_resize_down)
> 22:11:44 **************************************
> 22:11:44 ****** STDERR-BEGIN ******
> 
> https://rdjenkins.dyndns.org/job/Trove-Gate/2518/consoleFull (00:06:01 wasted)
> 17:46:59 COMMAND_0: echo refs/changes/02/64302/20
> ...
> 18:40:57     210.03     proboscis.case.MethodTest 
> (test_instance_returns_to_active_after_resize)
> 18:40:57     187.89     proboscis.case.MethodTest (test_resize_down)
> 18:46:58 **************************************
> 18:46:58 ****** STDERR-BEGIN ******
> 
> Suggested action items:
> Given that the minimum running time is one hour, I assume the problem is in 
> the net-ssh-simple library. Needs more investigation.
> 
> Issue #3: Jenkins console log line timestamps different between full and 
> truncated views
> 
> I assume this is due to JENKINS-17779.
> 
> Suggested action items:
> Upgrade the timestamper plugin.
> _______________________________________________
> OpenStack-dev mailing list
> OpenStack-dev@lists.openstack.org
> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
_______________________________________________
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev

Reply via email to