+dev list

I personally don't mind letting the regression suite run overnight. The important thing is that we do not push changes that have not passed the full automated test suite. In the interest of efficiency, we shouldn't even be reviewing most PRs until after they have passed the automated tests.

Deron, are you seeing a backlog of not-yet-started builds queueing up on the PR build server? If the queue is getting long, we can add additional machines to the Jenkins cluster.

Fred

From: Deron Eriksson/San Francisco/IBM
To: Niketan Pansare/Almaden/IBM@IBMUS
Cc: Berthold Reinwald/Almaden/IBM@IBMUS, Frederick R Reiss/Almaden/IBM@IBMUS
Date: 12/08/2016 11:06 AM
Subject: Re: test suite running slowly after disable cache/sparse commit?

Hi Niketan,

Perhaps Berthold or Fred could add a little guidance here in terms of what is acceptable? Having the test suite go from 2:21 to 3:41 (one pull request yesterday took 4:11 to complete - https://sparktc.ibmcloud.com/jenkins/job/SystemML-PullRequestBuilder/909/) is very serious to me. Even if the test suite runs at 3:00, this is a serious slowdown. It slows down our ability to validate pull requests and other code on jenkins.

Deron

----- Original message -----
From: Niketan Pansare/Almaden/IBM
To: Deron Eriksson/San Francisco/IBM@ibmus
Cc: Berthold Reinwald/Almaden/IBM@ibmus, Frederick R Reiss/Almaden/IBM@ibmus
Subject: Re: test suite running slowly after disable cache/sparse commit?
Date: Thu, Dec 8, 2016 8:55 AM

Hi Deron,

The commit replicated application tests for disable sparse and disable caching. So, the test time should increase. We should increase the duration or reduce the number of application tests we want to test with caching and sparse disabled.

Thanks
Niketan

On Dec 8, 2016, at 7:47 AM, Deron Eriksson <de...@us.ibm.com> wrote:

Hi Niketan,

I noticed the daily test yesterday timed out, probably because of a long-running test. Looking at the commits from the day before (https://github.com/apache/incubator-systemml/commits/master), I noticed that [SYSTEMML-769] [SYSTEMML-1140] Removed -disable-caching and -disable-… (https://github.com/apache/incubator-systemml/commit/caaaec90b61e529e50021d89f9f108230fa307a8) updated some of the tests.
So I ran the tests on the previous commit (https://sparktc.ibmcloud.com/jenkins/job/SystemML-OnDemand/227/) and the tests ran in 2hr 21min. I ran the tests on the 'disable caching...' commit (https://sparktc.ibmcloud.com/jenkins/job/SystemML-OnDemand/228/) and the tests ran in 3hr 41min.

One thing that is confusing to me is that the nightly test just completed successfully (https://sparktc.ibmcloud.com/jenkins/job/SystemML-DailyTest/674/) in 2hr 57min and did not time out like yesterday afternoon. So it is always possible it could be a server issue.

Could you look into this and see if that commit introduced an issue with the tests?

Thanks!
Deron