Re: [Gluster-infra] Jenkins slave32 seems broken?

2016-01-13 Thread Niels de Vos
On Wed, Jan 13, 2016 at 10:35:42AM +0100, Xavier Hernandez wrote:
> The same has happened to slave34.cloud.gluster.org. I've disabled it to
> allow regressions to be run on other slaves.
> 
> There are two files owned by root inside
> /home/jenkins/root/workspace/rackspace-regression-2GB-triggered:
> 
> -rwxr-xr-x  1 root root 10124 Jan  7 17:54 file_lock
> drwxr-xr-x  3 root root  4096 Jan  7 18:31 slave34.cloud.gluster.org:

Thanks!

I've looked into this a little more now, and might have identified the
problem.

This one failed with an unrelated error:

  https://build.gluster.org/job/rackspace-regression-2GB-triggered/17413/console

  ...
  Building remotely on slave34.cloud.gluster.org (rackspace_regression_2gb) in 
workspace /home/jenkins/root/workspace/rackspace-regression-2GB-triggered
   > git rev-parse --is-inside-work-tree # timeout=10
  Fetching changes from the remote Git repository
   > git config remote.origin.url git://review.gluster.org/glusterfs.git # 
timeout=10
  Fetching upstream changes from git://review.gluster.org/glusterfs.git
  ...

The next run on slave34 failed because of the weird directory:

  https://build.gluster.org/job/rackspace-regression-2GB-triggered/17440/console
  
  ...
  Building remotely on slave34.cloud.gluster.org (rackspace_regression_2gb) in 
workspace /home/jenkins/root/workspace/rackspace-regression-2GB-triggered
  Wiping out workspace first.
  java.io.IOException: remote file operation failed: 
/home/jenkins/root/workspace/rackspace-regression-2GB-triggered at 
hudson.remoting.Channel@62ecdacb:slave34.cloud.gluster.org: 
  ...

Note the "Wiping out workspace first." line. It comes from a recently
added "Additional Behaviour" option in the regression job configuration.
Did anyone add this on purpose, or did it appear automatically with a
Jenkins update or something?
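
For reference, this corresponds to the git plugin's WipeWorkspace
extension (the same class name that shows up in the stack traces below).
Assuming API access to build.gluster.org, something like this should
show whether the job config contains it (untested sketch; USER:TOKEN is
a placeholder for valid Jenkins credentials):

  # Fetch the job config and look for the wipe-workspace extension.
  curl -s -u USER:TOKEN \
    'https://build.gluster.org/job/rackspace-regression-2GB-triggered/config.xml' \
    | grep -i WipeWorkspace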

Niels

> 
> Xavi
> 
> On 12/01/16 12:06, Niels de Vos wrote:
> >Hi,
> >
> >I've disabled slave32.cloud.gluster.org because it failed multiple
> >regression tests with a weird error. After disabling slave32 and
> >retriggering the failed run, the same job executed fine on a different
> >slave.
> >
> >The affected directory is owned by root, so the jenkins user is not
> >allowed to wipe it. Does anyone know how this could happen? The dirname
> >is rather awkward too...
> >
> >   /home/jenkins/root/workspace/rackspace-regression-2GB-triggered/slave32.cloud.gluster.org:/d
> >
> >I think we can just remove that dir and the slave can be enabled again.
> >Leaving the status as is for further investigation.
> >
> >Thanks,
> >Niels
> >
> >
> >Full error:
> >
> > Wiping out workspace first.
> > java.io.IOException: remote file operation failed: 
> > /home/jenkins/root/workspace/rackspace-regression-2GB-triggered at 
> > hudson.remoting.Channel@7bc1e07d:slave32.cloud.gluster.org: 
> > java.nio.file.AccessDeniedException: 
> > /home/jenkins/root/workspace/rackspace-regression-2GB-triggered/slave32.cloud.gluster.org:/d
> > at hudson.FilePath.act(FilePath.java:986)
> > at hudson.FilePath.act(FilePath.java:968)
> > at hudson.FilePath.deleteContents(FilePath.java:1183)
> > at 
> > hudson.plugins.git.extensions.impl.WipeWorkspace.beforeCheckout(WipeWorkspace.java:28)
> > at hudson.plugins.git.GitSCM.checkout(GitSCM.java:1040)
> > at hudson.scm.SCM.checkout(SCM.java:485)
> > at 
> > hudson.model.AbstractProject.checkout(AbstractProject.java:1276)
> > at 
> > hudson.model.AbstractBuild$AbstractBuildExecution.defaultCheckout(AbstractBuild.java:607)
> > at 
> > jenkins.scm.SCMCheckoutStrategy.checkout(SCMCheckoutStrategy.java:86)
> > at 
> > hudson.model.AbstractBuild$AbstractBuildExecution.run(AbstractBuild.java:529)
> > at hudson.model.Run.execute(Run.java:1738)
> > at hudson.model.FreeStyleBuild.run(FreeStyleBuild.java:43)
> > at 
> > hudson.model.ResourceController.execute(ResourceController.java:98)
> > at hudson.model.Executor.run(Executor.java:410)
> > Caused by: java.nio.file.AccessDeniedException: 
> > /home/jenkins/root/workspace/rackspace-regression-2GB-triggered/slave32.cloud.gluster.org:/d
> > at 
> > sun.nio.fs.UnixException.translateToIOException(UnixException.java:84)
> > at 
> > sun.nio.fs.UnixException.rethrowAsIOException(UnixException.java:102)
> > at 
> > sun.nio.fs.UnixException.rethrowAsIOException(UnixException.java:107)
> > at 
> > sun.nio.fs.UnixFileSystemProvider.implDelete(UnixFileSystemProvider.java:244)
> > at 
> > sun.nio.fs.AbstractFileSystemProvider.delete(AbstractFileSystemProvider.java:103)
> > at java.nio.file.Files.delete(Files.java:1079)
> > at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> > ...

Re: [Gluster-infra] Jenkins slave32 seems broken?

2016-01-13 Thread Xavier Hernandez
The same has happened to slave34.cloud.gluster.org. I've disabled it to 
allow regressions to be run on other slaves.


There are two files owned by root inside 
/home/jenkins/root/workspace/rackspace-regression-2GB-triggered:


-rwxr-xr-x  1 root root 10124 Jan  7 17:54 file_lock
drwxr-xr-x  3 root root  4096 Jan  7 18:31 slave34.cloud.gluster.org:

Xavi

On 12/01/16 12:06, Niels de Vos wrote:

Hi,

I've disabled slave32.cloud.gluster.org because it failed multiple
regression tests with a weird error. After disabling slave32 and
retriggering the failed run, the same job executed fine on a different
slave.

The affected directory is owned by root, so the jenkins user is not
allowed to wipe it. Does anyone know how this could happen? The dirname
is rather awkward too...

   
/home/jenkins/root/workspace/rackspace-regression-2GB-triggered/slave32.cloud.gluster.org:/d

I think we can just remove that dir and the slave can be enabled again.
Leaving the status as is for further investigation.

Thanks,
Niels


Full error:

 Wiping out workspace first.
 java.io.IOException: remote file operation failed: 
/home/jenkins/root/workspace/rackspace-regression-2GB-triggered at 
hudson.remoting.Channel@7bc1e07d:slave32.cloud.gluster.org: 
java.nio.file.AccessDeniedException: 
/home/jenkins/root/workspace/rackspace-regression-2GB-triggered/slave32.cloud.gluster.org:/d
at hudson.FilePath.act(FilePath.java:986)
at hudson.FilePath.act(FilePath.java:968)
at hudson.FilePath.deleteContents(FilePath.java:1183)
at 
hudson.plugins.git.extensions.impl.WipeWorkspace.beforeCheckout(WipeWorkspace.java:28)
at hudson.plugins.git.GitSCM.checkout(GitSCM.java:1040)
at hudson.scm.SCM.checkout(SCM.java:485)
at hudson.model.AbstractProject.checkout(AbstractProject.java:1276)
at 
hudson.model.AbstractBuild$AbstractBuildExecution.defaultCheckout(AbstractBuild.java:607)
at jenkins.scm.SCMCheckoutStrategy.checkout(SCMCheckoutStrategy.java:86)
at 
hudson.model.AbstractBuild$AbstractBuildExecution.run(AbstractBuild.java:529)
at hudson.model.Run.execute(Run.java:1738)
at hudson.model.FreeStyleBuild.run(FreeStyleBuild.java:43)
at hudson.model.ResourceController.execute(ResourceController.java:98)
at hudson.model.Executor.run(Executor.java:410)
 Caused by: java.nio.file.AccessDeniedException: 
/home/jenkins/root/workspace/rackspace-regression-2GB-triggered/slave32.cloud.gluster.org:/d
at 
sun.nio.fs.UnixException.translateToIOException(UnixException.java:84)
at sun.nio.fs.UnixException.rethrowAsIOException(UnixException.java:102)
at sun.nio.fs.UnixException.rethrowAsIOException(UnixException.java:107)
at 
sun.nio.fs.UnixFileSystemProvider.implDelete(UnixFileSystemProvider.java:244)
at 
sun.nio.fs.AbstractFileSystemProvider.delete(AbstractFileSystemProvider.java:103)
at java.nio.file.Files.delete(Files.java:1079)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at hudson.Util.deleteFile(Util.java:255)
at hudson.FilePath.deleteRecursive(FilePath.java:1203)
at hudson.FilePath.deleteContentsRecursive(FilePath.java:1212)
at hudson.FilePath.deleteRecursive(FilePath.java:1194)
at hudson.FilePath.deleteContentsRecursive(FilePath.java:1212)
at hudson.FilePath.access$1100(FilePath.java:190)
at hudson.FilePath$15.invoke(FilePath.java:1186)
at hudson.FilePath$15.invoke(FilePath.java:1183)
at hudson.FilePath$FileCallableWrapper.call(FilePath.java:2719)
at hudson.remoting.UserRequest.perform(UserRequest.java:120)
at hudson.remoting.UserRequest.perform(UserRequest.java:48)
at hudson.remoting.Request$2.run(Request.java:326)
at 
hudson.remoting.InterceptingExecutorService$1.call(InterceptingExecutorService.java:68)
at java.util.concurrent.FutureTask.run(FutureTask.java:262)
at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:745)
at ..remote call to slave32.cloud.gluster.org(Native Method)
at hudson.remoting.Channel.attachCallSiteStackTrace(Channel.java:1416)
at hudson.remoting.UserResponse.retrieve(UserRequest.java:220)
at hudson.remoting.Channel.call(Channel.java:781)
at hudson.FilePath.act(FilePath.java:979)
... 13 more
 Finished: FAILURE




Re: [Gluster-infra] Jenkins slave32 seems broken?

2016-01-13 Thread Xavier Hernandez

slave23 has the same problem.

I've deleted both entries on this slave to see if it works. If it does, 
the same can be done on the other slaves unless someone wants to 
investigate it.


The 'file_lock' file seems to come from the test
tests/basic/tier/locked_file_migration.t. I don't know how the other
directory appears.
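
For anyone hitting this on the other slaves, something like this should
do the cleanup (an untested sketch; it needs root, since jenkins cannot
delete the entries):

  cd /home/jenkins/root/workspace/rackspace-regression-2GB-triggered
  # List top-level entries owned by root (file_lock and the odd directory).
  find . -maxdepth 1 -user root
  # Remove them.
  sudo find . -maxdepth 1 -user root -exec rm -rf {} +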


Xavi

On 13/01/16 10:35, Xavier Hernandez wrote:

The same has happened to slave34.cloud.gluster.org. I've disabled it to
allow regressions to be run on other slaves.

There are two files owned by root inside
/home/jenkins/root/workspace/rackspace-regression-2GB-triggered:

-rwxr-xr-x  1 root root 10124 Jan  7 17:54 file_lock
drwxr-xr-x  3 root root  4096 Jan  7 18:31 slave34.cloud.gluster.org:

Xavi

On 12/01/16 12:06, Niels de Vos wrote:

Hi,

I've disabled slave32.cloud.gluster.org because it failed multiple
regression tests with a weird error. After disabling slave32 and
retriggering the failed run, the same job executed fine on a different
slave.

The affected directory is owned by root, so the jenkins user is not
allowed to wipe it. Does anyone know how this could happen? The dirname
is rather awkward too...


/home/jenkins/root/workspace/rackspace-regression-2GB-triggered/slave32.cloud.gluster.org:/d


I think we can just remove that dir and the slave can be enabled again.
Leaving the status as is for further investigation.

Thanks,
Niels


Full error:

 Wiping out workspace first.
 java.io.IOException: remote file operation failed:
/home/jenkins/root/workspace/rackspace-regression-2GB-triggered at
hudson.remoting.Channel@7bc1e07d:slave32.cloud.gluster.org:
java.nio.file.AccessDeniedException:
/home/jenkins/root/workspace/rackspace-regression-2GB-triggered/slave32.cloud.gluster.org:/d

 at hudson.FilePath.act(FilePath.java:986)
 at hudson.FilePath.act(FilePath.java:968)
 at hudson.FilePath.deleteContents(FilePath.java:1183)
 at
hudson.plugins.git.extensions.impl.WipeWorkspace.beforeCheckout(WipeWorkspace.java:28)

 at hudson.plugins.git.GitSCM.checkout(GitSCM.java:1040)
 at hudson.scm.SCM.checkout(SCM.java:485)
 at
hudson.model.AbstractProject.checkout(AbstractProject.java:1276)
 at
hudson.model.AbstractBuild$AbstractBuildExecution.defaultCheckout(AbstractBuild.java:607)

 at
jenkins.scm.SCMCheckoutStrategy.checkout(SCMCheckoutStrategy.java:86)
 at
hudson.model.AbstractBuild$AbstractBuildExecution.run(AbstractBuild.java:529)

 at hudson.model.Run.execute(Run.java:1738)
 at hudson.model.FreeStyleBuild.run(FreeStyleBuild.java:43)
 at
hudson.model.ResourceController.execute(ResourceController.java:98)
 at hudson.model.Executor.run(Executor.java:410)
 Caused by: java.nio.file.AccessDeniedException:
/home/jenkins/root/workspace/rackspace-regression-2GB-triggered/slave32.cloud.gluster.org:/d

 at
sun.nio.fs.UnixException.translateToIOException(UnixException.java:84)
 at
sun.nio.fs.UnixException.rethrowAsIOException(UnixException.java:102)
 at
sun.nio.fs.UnixException.rethrowAsIOException(UnixException.java:107)
 at
sun.nio.fs.UnixFileSystemProvider.implDelete(UnixFileSystemProvider.java:244)

 at
sun.nio.fs.AbstractFileSystemProvider.delete(AbstractFileSystemProvider.java:103)

 at java.nio.file.Files.delete(Files.java:1079)
 at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
 at
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)

 at
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)

 at java.lang.reflect.Method.invoke(Method.java:606)
 at hudson.Util.deleteFile(Util.java:255)
 at hudson.FilePath.deleteRecursive(FilePath.java:1203)
 at hudson.FilePath.deleteContentsRecursive(FilePath.java:1212)
 at hudson.FilePath.deleteRecursive(FilePath.java:1194)
 at hudson.FilePath.deleteContentsRecursive(FilePath.java:1212)
 at hudson.FilePath.access$1100(FilePath.java:190)
 at hudson.FilePath$15.invoke(FilePath.java:1186)
 at hudson.FilePath$15.invoke(FilePath.java:1183)
 at hudson.FilePath$FileCallableWrapper.call(FilePath.java:2719)
 at hudson.remoting.UserRequest.perform(UserRequest.java:120)
 at hudson.remoting.UserRequest.perform(UserRequest.java:48)
 at hudson.remoting.Request$2.run(Request.java:326)
 at
hudson.remoting.InterceptingExecutorService$1.call(InterceptingExecutorService.java:68)

 at java.util.concurrent.FutureTask.run(FutureTask.java:262)
 at
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)

 at
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)

 at java.lang.Thread.run(Thread.java:745)
 at ..remote call to slave32.cloud.gluster.org(Native Method)
 ...

Re: [Gluster-infra] Jenkins slave32 seems broken?

2016-01-13 Thread Niels de Vos
On Wed, Jan 13, 2016 at 04:50:51PM +0530, Raghavendra Talur wrote:
> On Wed, Jan 13, 2016 at 3:49 PM, Niels de Vos wrote:
> 
> > On Wed, Jan 13, 2016 at 10:35:42AM +0100, Xavier Hernandez wrote:
> > > The same has happened to slave34.cloud.gluster.org. I've disabled it to
> > > allow regressions to be run on other slaves.
> > >
> > > There are two files owned by root inside
> > > /home/jenkins/root/workspace/rackspace-regression-2GB-triggered:
> > >
> > > -rwxr-xr-x  1 root root 10124 Jan  7 17:54 file_lock
> > > drwxr-xr-x  3 root root  4096 Jan  7 18:31 slave34.cloud.gluster.org:
> >
> > Thanks!
> >
> > I've looked into this a little more now, and might have identified the
> > problem.
> >
> > This one failed with an unrelated error:
> >
> >
> > https://build.gluster.org/job/rackspace-regression-2GB-triggered/17413/console
> >
> >   ...
> >   Building remotely on slave34.cloud.gluster.org
> > (rackspace_regression_2gb) in workspace
> > /home/jenkins/root/workspace/rackspace-regression-2GB-triggered
> >> git rev-parse --is-inside-work-tree # timeout=10
> >   Fetching changes from the remote Git repository
> >> git config remote.origin.url git://review.gluster.org/glusterfs.git
> > # timeout=10
> >   Fetching upstream changes from git://review.gluster.org/glusterfs.git
> >   ...
> >
> > The next run on slave34 failed because of the weird directory:
> >
> >
> > https://build.gluster.org/job/rackspace-regression-2GB-triggered/17440/console
> >
> >   ...
> >   Building remotely on slave34.cloud.gluster.org
> > (rackspace_regression_2gb) in workspace
> > /home/jenkins/root/workspace/rackspace-regression-2GB-triggered
> >   Wiping out workspace first.
> >   java.io.IOException: remote file operation failed:
> > /home/jenkins/root/workspace/rackspace-regression-2GB-triggered at
> > hudson.remoting.Channel@62ecdacb:slave34.cloud.gluster.org:
> >   ...
> >
> > Note the "Wiping out workspace first." line. It comes from a recently
> > added "Additional Behaviour" option in the regression job configuration.
> > Did anyone add this on purpose, or did it appear automatically with a
> > Jenkins update or something?
> >
> 
> The three additional behaviours added to the regression configuration were
> added by me; I simply copied whatever was in the smoke configuration page.
> We can try removing this configuration line, but the tests weren't getting
> started without it (we got it running after a restart, so this might be a
> false requirement).

Ok, at least we know where it came from :) If we could put the Jenkins
job XML files in a git repo, we could then follow the changes that are
being made. Is this something you can get done? We can just export the
XML and commit the changes to the repo whenever a change is needed; most
importantly, we would have a history we can track.
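
A rough sketch of the export step (untested; USER:TOKEN is a placeholder
for valid credentials, and config.xml is the standard Jenkins remote API
endpoint for a job's configuration):

  # Snapshot one job's configuration into a git checkout.
  JOB=rackspace-regression-2GB-triggered
  curl -s -u USER:TOKEN \
    "https://build.gluster.org/job/$JOB/config.xml" > "jobs/$JOB.xml"
  git add "jobs/$JOB.xml"
  git commit -m "jenkins: snapshot $JOB config"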

> Technically, it is not a harmful configuration, and the wiki pages recommend
> it. They say such errors occur only if files were created and left
> open/locked by some tests, or were created with different permissions. We
> still need to identify which tests are responsible for this.

Yes, I think it is a good thing to have. Unfortunately, some of the tests
seem to create files as root in the workspace directory. We should correct
those tests to create files only under the paths that are used for bricks
or the installation. (Xavi found one of those tests in another email in
this thread.)
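
For illustration, the fix in such a .t test would be along these lines
(a sketch, not the actual test; $B0 is the brick root variable from the
test framework):

  # Bad: a relative path lands in the Jenkins workspace, created as root.
  touch file_lock

  # Better: keep scratch files under the brick/test directories.
  touch $B0/file_lock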

Thanks,
Niels

> 
> 
> >
> > Niels
> >
> > >
> > > Xavi
> > >
> > > On 12/01/16 12:06, Niels de Vos wrote:
> > > >Hi,
> > > >
> > > >I've disabled slave32.cloud.gluster.org because it failed multiple
> > > >regression tests with a weird error. After disabling slave32 and
> > > >retriggering the failed run, the same job executed fine on a different
> > > >slave.
> > > >
> > > >The affected directory is owned by root, so the jenkins user is not
> > > >allowed to wipe it. Does anyone know how this could happen? The dirname
> > > >is rather awkward too...
> > > >
> > > >
> >   /home/jenkins/root/workspace/rackspace-regression-2GB-triggered/slave32.cloud.gluster.org:/d
> > > >
> > > >I think we can just remove that dir and the slave can be enabled again.
> > > >Leaving the status as is for further investigation.
> > > >
> > > >Thanks,
> > > >Niels
> > > >
> > > >
> > > >Full error:
> > > >
> > > > Wiping out workspace first.
> > > > java.io.IOException: remote file operation failed:
> > /home/jenkins/root/workspace/rackspace-regression-2GB-triggered at
> > hudson.remoting.Channel@7bc1e07d:slave32.cloud.gluster.org:
> > java.nio.file.AccessDeniedException:
> > /home/jenkins/root/workspace/rackspace-regression-2GB-triggered/slave32.cloud.gluster.org:/d
> > > > at hudson.FilePath.act(FilePath.java:986)
> > > > at hudson.FilePath.act(FilePath.java:968)
> > > > at hudson.FilePath.deleteContents(FilePath.java:1183)
> > > > ...

Re: [Gluster-infra] Jenkins slave32 seems broken?

2016-01-13 Thread Michael Scherer
On Wednesday, 13 January 2016 at 12:42 +0100, Niels de Vos wrote:
> On Wed, Jan 13, 2016 at 04:50:51PM +0530, Raghavendra Talur wrote:
> > On Wed, Jan 13, 2016 at 3:49 PM, Niels de Vos wrote:
> > 
> > > On Wed, Jan 13, 2016 at 10:35:42AM +0100, Xavier Hernandez wrote:
> > > > The same has happened to slave34.cloud.gluster.org. I've disabled it to
> > > > allow regressions to be run on other slaves.
> > > >
> > > > There are two files owned by root inside
> > > > /home/jenkins/root/workspace/rackspace-regression-2GB-triggered:
> > > >
> > > > -rwxr-xr-x  1 root root 10124 Jan  7 17:54 file_lock
> > > > drwxr-xr-x  3 root root  4096 Jan  7 18:31 slave34.cloud.gluster.org:
> > >
> > > Thanks!
> > >
> > > I've looked into this a little more now, and might have identified the
> > > problem.
> > >
> > > This one failed with an unrelated error:
> > >
> > >
> > > https://build.gluster.org/job/rackspace-regression-2GB-triggered/17413/console
> > >
> > >   ...
> > >   Building remotely on slave34.cloud.gluster.org
> > > (rackspace_regression_2gb) in workspace
> > > /home/jenkins/root/workspace/rackspace-regression-2GB-triggered
> > >> git rev-parse --is-inside-work-tree # timeout=10
> > >   Fetching changes from the remote Git repository
> > >> git config remote.origin.url git://review.gluster.org/glusterfs.git
> > > # timeout=10
> > >   Fetching upstream changes from git://review.gluster.org/glusterfs.git
> > >   ...
> > >
> > > The next run on slave34 failed because of the weird directory:
> > >
> > >
> > > https://build.gluster.org/job/rackspace-regression-2GB-triggered/17440/console
> > >
> > >   ...
> > >   Building remotely on slave34.cloud.gluster.org
> > > (rackspace_regression_2gb) in workspace
> > > /home/jenkins/root/workspace/rackspace-regression-2GB-triggered
> > >   Wiping out workspace first.
> > >   java.io.IOException: remote file operation failed:
> > > /home/jenkins/root/workspace/rackspace-regression-2GB-triggered at
> > > hudson.remoting.Channel@62ecdacb:slave34.cloud.gluster.org:
> > >   ...
> > >
> > > Note the "Wiping out workspace first." line. It comes from a recently
> > > added "Additional Behaviour" option in the regression job configuration.
> > > Did anyone add this on purpose, or did it appear automatically with a
> > > Jenkins update or something?
> > >
> > 
> > The three additional behaviours added to the regression configuration were
> > added by me; I simply copied whatever was in the smoke configuration page.
> > We can try removing this configuration line, but the tests weren't getting
> > started without it (we got it running after a restart, so this might be a
> > false requirement).
> 
> Ok, at least we know where it came from :) If we could put the Jenkins
> job XML files in a git repo, we could then follow the changes that are
> being made. Is this something you can get done? We can just export the
> XML and commit the changes to the repo whenever a change is needed; most
> importantly, we would have a history we can track.

There is Jenkins Job Builder for that. It requires EL 7, however, and
someone who knows how it works.
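
For the record, the basic workflow would be something like this
(untested; jenkins-jobs is the CLI that comes with Jenkins Job Builder,
and jobs/ would be a directory of YAML job definitions kept in git):

  pip install jenkins-job-builder
  # Dry run: render the job XML locally to check the definitions.
  jenkins-jobs test jobs/ -o output/
  # Push the rendered configs to the Jenkins master.
  jenkins-jobs update jobs/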

> > Technically, it is not a harmful configuration, and the wiki pages recommend
> > it. They say such errors occur only if files were created and left
> > open/locked by some tests, or were created with different permissions. We
> > still need to identify which tests are responsible for this.
> 
> Yes, I think it is a good thing to have. Unfortunately, some of the tests
> seem to create files as root in the workspace directory. We should correct
> those tests to create files only under the paths that are used for bricks
> or the installation. (Xavi found one of those tests in another email in
> this thread.)
> 
> Thanks,
> Niels
> 
> > 
> > 
> > >
> > > Niels
> > >
> > > >
> > > > Xavi
> > > >
> > > > On 12/01/16 12:06, Niels de Vos wrote:
> > > > >Hi,
> > > > >
> > > > >I've disabled slave32.cloud.gluster.org because it failed multiple
> > > > >regression tests with a weird error. After disabling slave32 and
> > > > >retriggering the failed run, the same job executed fine on a different
> > > > >slave.
> > > > >
> > > > >The affected directory is owned by root, so the jenkins user is not
> > > > >allowed to wipe it. Does anyone know how this could happen? The dirname
> > > > >is rather awkward too...
> > > > >
> > > > >
> > >   /home/jenkins/root/workspace/rackspace-regression-2GB-triggered/slave32.cloud.gluster.org:/d
> > > > >
> > > > >I think we can just remove that dir and the slave can be enabled again.
> > > > >Leaving the status as is for further investigation.
> > > > >
> > > > >Thanks,
> > > > >Niels
> > > > >
> > > > >
> > > > >Full error:
> > > > >
> > > > > Wiping out workspace first.
> > > > > java.io.IOException: remote file operation failed:
> > > /home/jenkins/root/workspace/rackspace-regression-2GB-triggered at
> > > hudson.remoting.Channel@7bc1e07d:slave32.cloud.gluster.org:
> > > ...

Re: [Gluster-infra] Jenkins slave32 seems broken?

2016-01-13 Thread Vijay Bellur

On 01/13/2016 06:20 AM, Raghavendra Talur wrote:



On Wed, Jan 13, 2016 at 3:49 PM, Niels de Vos wrote:

On Wed, Jan 13, 2016 at 10:35:42AM +0100, Xavier Hernandez wrote:
> The same has happened to slave34.cloud.gluster.org. I've disabled it to
> allow regressions to be run on other slaves.
>
> There are two files owned by root inside
> /home/jenkins/root/workspace/rackspace-regression-2GB-triggered:
>
> -rwxr-xr-x  1 root root 10124 Jan  7 17:54 file_lock
> drwxr-xr-x  3 root root  4096 Jan  7 18:31 slave34.cloud.gluster.org:

Thanks!

I've looked into this a little more now, and might have identified the
problem.

This one failed with an unrelated error:

  https://build.gluster.org/job/rackspace-regression-2GB-triggered/17413/console

  ...
  Building remotely on slave34.cloud.gluster.org (rackspace_regression_2gb) in
  workspace /home/jenkins/root/workspace/rackspace-regression-2GB-triggered
   > git rev-parse --is-inside-work-tree # timeout=10
  Fetching changes from the remote Git repository
   > git config remote.origin.url git://review.gluster.org/glusterfs.git # timeout=10
  Fetching upstream changes from git://review.gluster.org/glusterfs.git
  ...

The next run on slave34 failed because of the weird directory:

  https://build.gluster.org/job/rackspace-regression-2GB-triggered/17440/console

  ...
  Building remotely on slave34.cloud.gluster.org (rackspace_regression_2gb) in
  workspace /home/jenkins/root/workspace/rackspace-regression-2GB-triggered
  Wiping out workspace first.
  java.io.IOException: remote file operation failed:
  /home/jenkins/root/workspace/rackspace-regression-2GB-triggered at
  hudson.remoting.Channel@62ecdacb:slave34.cloud.gluster.org:
  ...

Note the "Wiping out workspace first." line. It comes from a recently
added "Additional Behaviour" option in the regression job configuration.
Did anyone add this on purpose, or did it appear automatically with a
Jenkins update or something?


The three additional behaviours added to the regression configuration were
added by me; I simply copied whatever was in the smoke configuration page.
We can try removing this configuration line, but the tests weren't getting
started without it (we got it running after a restart, so this might be a
false requirement).

Technically, it is not a harmful configuration, and the wiki pages recommend
it. They say such errors occur only if files were created and left
open/locked by some tests, or were created with different permissions. We
still need to identify which tests are responsible for this.



I have removed this option, as regression tests were continuously
failing. Please feel free to re-enable it as deemed necessary.


Thanks,
Vijay

___
Gluster-infra mailing list
Gluster-infra@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-infra