Jake, everything is dockerized on both Jenkins and Travis. The current Jenkins failure (the D test hang) is caused by a difference in the environment on Jenkins.
On Fri, Apr 15, 2016 at 12:45 AM John Sirois <jsir...@apache.org> wrote:

> On Thu, Apr 14, 2016 at 9:41 AM, John Sirois <jsir...@apache.org> wrote:
>
>> On Thu, Apr 14, 2016 at 9:34 AM, Aki Sukegawa <ns...@apache.org> wrote:
>>
>>> Quoting from my previous mail.
>>>
>>> > Other than Travis, make check is hanging for almost every build of
>>> > Jenkins.
>>> > The log is not that clear but I think it's the D test.
>>> > AFAIK the test was running fine a few weeks ago and nobody touched it
>>> > since then.
>>> > It might be due to insufficient resources on Jenkins.
>>>
>>> I suspect the default task limit introduced in a recent version of docker
>>> is not lifted on ASF jenkins.
>>>
>>> I'm not sure if it's worth maintaining a sub-set of builds on another CI
>>> that has a relatively unstable basis that cannot even be touched by
>>> committers.
>>> Less resource is fine because it can detect failures on such platforms
>>> like last time John enabled it.
>>> But it's apparently changing.
>>>
>>
>> Aha - that would be an interesting cause of the D hangs.
>>
>> I'm not clear on what you meant by the rest, but I assume you're
>> addressing the confusing fact that thrift maintains 2 sets of broken CI
>> jobs (fwict) for pull requests, TravisCI and Apache Jenkins.
>>
>> It seems to me 4 steps are needed to provide baseline sanity for
>> contributing to the project:
>> 1. Halt accepting any changes immediately.
>> 2. Pick Travis or Jenkins, kill the other.
>> 3. Get the winner from 2 green.
>> 4. Resume accepting patches that are green in CI and only green in CI.
>>
>
> Towards step 2, I think I now have the git issue solved for Jenkins after
> enabling `git clean -fdx && git reset --hard HEAD` (or the equivalent in
> the Jenkins git plugin) and modifying the `docker run`s to use
> `--user $(id -u):$(id -g)` so that the container modifications to the
> Jenkins WORKSPACE volume mount are done as the Jenkins user instead of as
> root.
> https://builds.apache.org/job/Thrift-precommit/417/ is spinning with
> these fixes and hopefully goes clean to a hang in the D tests.
>
>>> On Thu, Apr 14, 2016 at 11:45 PM John Sirois <jsir...@apache.org> wrote:
>>>
>>>> On Thu, Apr 14, 2016 at 8:29 AM, John Sirois <jsir...@apache.org> wrote:
>>>>
>>>> > On Thu, Apr 14, 2016 at 8:14 AM, Jim King <jim.k...@simplivity.com> wrote:
>>>> >
>>>> >> I got one build through (which failed in "d" tests) and now it's
>>>> >> stuck in the same state, see:
>>>> >> https://builds.apache.org/job/Thrift-precommit/411/
>>>> >>
>>>> >> FATAL: Could not checkout master with start point origin/master
>>>> >> hudson.plugins.git.GitException: Could not checkout master with start point origin/master
>>>> >>     at org.jenkinsci.plugins.gitclient.CliGitAPIImpl$9.execute(CliGitAPIImpl.java:1962)
>>>> >>     at org.jenkinsci.plugins.gitclient.AbstractGitAPIImpl.checkoutBranch(AbstractGitAPIImpl.java:82)
>>>> >>     at org.jenkinsci.plugins.gitclient.CliGitAPIImpl.checkoutBranch(CliGitAPIImpl.java:62)
>>>> >>     at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>>>> >>     at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
>>>> >>     at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>>>> >>     at java.lang.reflect.Method.invoke(Method.java:606)
>>>> >>     at hudson.remoting.RemoteInvocationHandler$RPCRequest.perform(RemoteInvocationHandler.java:608)
>>>> >>     at hudson.remoting.RemoteInvocationHandler$RPCRequest.call(RemoteInvocationHandler.java:583)
>>>> >>     at hudson.remoting.RemoteInvocationHandler$RPCRequest.call(RemoteInvocationHandler.java:542)
>>>> >>     at hudson.remoting.UserRequest.perform(UserRequest.java:120)
>>>> >>     at hudson.remoting.UserRequest.perform(UserRequest.java:48)
>>>> >>     at hudson.remoting.Request$2.run(Request.java:326)
>>>> >>     at hudson.remoting.InterceptingExecutorService$1.call(InterceptingExecutorService.java:68)
>>>> >>     at java.util.concurrent.FutureTask.run(FutureTask.java:262)
>>>> >>     at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>>>> >>     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>>>> >>     at java.lang.Thread.run(Thread.java:745)
>>>> >>     at ......remote call to H10(Native Method)
>>>> >>     at hudson.remoting.Channel.attachCallSiteStackTrace(Channel.java:1416)
>>>> >>     at hudson.remoting.UserResponse.retrieve(UserRequest.java:220)
>>>> >>     at hudson.remoting.Channel.call(Channel.java:781)
>>>> >>     at hudson.remoting.RemoteInvocationHandler.invoke(RemoteInvocationHandler.java:250)
>>>> >>     at com.sun.proxy.$Proxy115.checkoutBranch(Unknown Source)
>>>> >>     at org.jenkinsci.plugins.gitclient.RemoteGitImpl.checkoutBranch(RemoteGitImpl.java:327)
>>>> >>     at com.cloudbees.jenkins.plugins.git.vmerge.BuildChooserImpl.getCandidateRevisions(BuildChooserImpl.java:78)
>>>> >>     at hudson.plugins.git.GitSCM.determineRevisionToBuild(GitSCM.java:951)
>>>> >>     at hudson.plugins.git.GitSCM.checkout(GitSCM.java:1054)
>>>> >>     at hudson.scm.SCM.checkout(SCM.java:485)
>>>> >>     at hudson.model.AbstractProject.checkout(AbstractProject.java:1276)
>>>> >>     at hudson.model.AbstractBuild$AbstractBuildExecution.defaultCheckout(AbstractBuild.java:607)
>>>> >>     at jenkins.scm.SCMCheckoutStrategy.checkout(SCMCheckoutStrategy.java:86)
>>>> >>     at hudson.model.AbstractBuild$AbstractBuildExecution.run(AbstractBuild.java:529)
>>>> >>     at hudson.model.Run.execute(Run.java:1738)
>>>> >>     at hudson.model.FreeStyleBuild.run(FreeStyleBuild.java:43)
>>>> >>     at hudson.model.ResourceController.execute(ResourceController.java:98)
>>>> >>     at hudson.model.Executor.run(Executor.java:410)
>>>> >> Caused by: hudson.plugins.git.GitException: Command "git checkout -b master origin/master" returned status code 1:
>>>> >> stdout: lib/lua/TCompactProtocol.lua: needs merge
>>>> >>
>>>> >> stderr: error: you need to resolve your current index first
>>>> >>
>>>> >> It looks like the build environment is not forced clean at the
>>>> >> beginning of each build.
>>>> >>
>>>> >
>>>> > Ack - looking now.
>>>> >
>>>> > It is odd that the git portion of these builds went sideways since the
>>>> > Jenkins Job Config History auditing plugin shows the last change
>>>> > (before my tweak last night) was 2016-02-16_02-09-39. I expect jenkins
>>>> > or its plugins were updated by infra causing the previously working
>>>> > job config to not work any longer.
>>>> >
>>>>
>>>> OK - that analysis was wrong, clearly there has been a change in the
>>>> build itself that modifies source code and this causes the issue.
>>>> I've enabled <hudson.plugins.git.extensions.impl.CleanBeforeCheckout/>
>>>> with the following description:
>>>>
>>>> Clean up the workspace before every checkout by deleting all untracked
>>>> files and directories, including those which are specified in
>>>> .gitignore. It also resets all *tracked* files to their versioned
>>>> state. This ensures that the workspace is in the same state as if you
>>>> cloned and checked out in a brand-new empty directory, and ensures
>>>> that your build is not affected by the files generated by the previous
>>>> build.
>>>>
>>>> That sounds like ~ `git clean -fdx && git reset --hard HEAD` to me,
>>>> which should do it. That should insulate CI from bad tests that modify
>>>> checked in repo state, but those tests shouldn't exist either.
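As a rough illustration of the cleanup discussed above, here is a minimal shell sketch, assuming a POSIX shell with git installed. The function names clean_workspace and container_user are hypothetical and are not taken from the Jenkins job or the Thrift repository; the job itself relies on the git plugin's CleanBeforeCheckout extension rather than a script. The second helper only prints the current uid:gid pair, which is the kind of value the `docker run --user` option accepts.

```shell
#!/bin/sh
# Sketch only: the function names are hypothetical, not part of any
# Jenkins job configuration or of the Thrift build scripts.

# Delete untracked and ignored files, then reset tracked files to HEAD,
# in the checkout given as the first argument. Runs in a subshell so the
# caller's working directory is left unchanged.
clean_workspace() {
    ( cd "$1" && git clean -fdx && git reset --hard HEAD )
}

# Print "uid:gid" for the invoking user, suitable as the value of
# `docker run --user`, so that files written to a mounted volume are
# owned by that user instead of root.
container_user() {
    printf '%s:%s\n' "$(id -u)" "$(id -g)"
}
```

One possible use is to call `clean_workspace "$WORKSPACE"` before the build and pass `--user "$(container_user)"` to each `docker run`; the image name and the remaining arguments are whatever the job already uses.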
>>>>
>>>> COMMITTERS:
>>>> I'd like to reiterate to any committers out there that red CI must be a
>>>> hard bright line that is not crossed when accepting patches; otherwise
>>>> we'll be right back here after getting this thing green again. Here "we"
>>>> is you - I won't be interested in helping out a third time if this
>>>> relapses.
>>>>
>>>> >> - Jim
>>>> >>
>>>> >> -----Original Message-----
>>>> >> From: Jim King
>>>> >> Sent: Wednesday, April 13, 2016 10:34 PM
>>>> >> To: dev@thrift.apache.org; 'jsir...@apache.org' <jsir...@apache.org>
>>>> >> Subject: RE: Build Failures
>>>> >>
>>>> >> The builds were failing claiming that a file was in the middle of
>>>> >> being merged and they were all failing early.
>>>> >> I think the build environment itself is compromised and there's
>>>> >> nothing I can do about that.
>>>> >>
>>>> >> -----Original Message-----
>>>> >> From: John Sirois [mailto:jsir...@apache.org]
>>>> >> Sent: Wednesday, April 13, 2016 9:58 PM
>>>> >> To: dev@thrift.apache.org
>>>> >> Subject: Re: Build Failures
>>>> >>
>>>> >> On Wed, Apr 13, 2016 at 7:54 PM, John Sirois <jsir...@apache.org> wrote:
>>>> >>
>>>> >> > On Wed, Apr 13, 2016 at 7:51 PM, Jim King <jim.k...@simplivity.com> wrote:
>>>> >> >
>>>> >> >> I’m still looking for answers on pull request build failures.
>>>> >> >>
>>>> >> >> I have 2 or 3 PRs open right now and they’ve failed in the apache
>>>> >> >> precommit builds for strange reasons.
>>>> >> >>
>>>> >> >> The apache internal builds seem to be failing.
>>>> >> >>
>>>> >> >
>>>> >> > I think the answer is the breaks need a fixer; hopefully you can
>>>> >> > find time to help fix.
>>>> >> >
>>>> >> > I say this because I started down a series of patches to the java
>>>> >> > codegen/lib a while back and found a similar state - though on the
>>>> >> > pull request builder (apache jenkins).
>>>> >> > I stopped my java stuff and fixed that CI with the help of Aki and
>>>> >> > Jake reviewing and providing guidance. I am not a thrift committer.
>>>> >> >
>>>> >>
>>>> >> I will say that it's discouraging that that CI is now solid red too:
>>>> >> https://builds.apache.org/job/Thrift-precommit/
>>>> >> Part of the answer IMO is for committers to hold a hard line on
>>>> >> accepting any patch, or pushing their own, w/o full green CIs.
>>>> >>
>>>> >> >> James E. King, III
>>>> >> >> Architect
>>>> >> >> 8 Technology Drive, 2nd Floor
>>>> >> >> Westborough, MA 01581-1756
>>>> >> >> Ph: 855-SVT-INFO
>>>> >> >>
>>>> >> >> ------------------------------
>>>> >> >> PRIVACY STATEMENT:
>>>> >> >> This message is a PRIVATE communication. This message and all
>>>> >> >> attachments are a private communication sent by SimpliVity and are
>>>> >> >> considered to be confidential or protected by privilege. If you are
>>>> >> >> not the intended recipient, you are hereby notified that any
>>>> >> >> disclosure, copying, distribution or use of the information
>>>> >> >> contained in or attached to this message is strictly prohibited.
>>>> >> >> Please notify the sender of the delivery error by replying to this
>>>> >> >> message, and then delete it from your system.
>>>> >> >> For more information please visit http://www.simplivity.com
>>>> >> >> ------------------------------