Re: [Gluster-devel] NetBSD regression tests not Initializing...

2015-07-14 Thread Niels de Vos
On Tue, Jul 07, 2015 at 06:04:44PM +0200, Niels de Vos wrote:
 On Tue, Jul 07, 2015 at 07:13:53PM +0530, Kaushal M wrote:
  I've taken this slave and one other offline and am rebooting it.
 
 Reminder that you do not need to take the system offline for rebooting.
 I normally follow these steps to get hung systems back functional:
 
 1. verify stuck job, NFS unmount related?
 2. open http://build.gluster.org/view/Infra/job/reboot-vm/build
 3. login on Jenkins
 4. start the reboot-vm job for the stuck system
 5. wait until the job has finished
 6. click the abort [x] link on the stuck job
 7. retrigger the job after aborting has been done (reload page)
 
 These hangs do not seem to happen on tests from the master branch
 anymore, only on release-3.7. I think this is a confirmation that the
 reference counting for auth-cache structures in gluster/nfs is a working
 solution.
 
 We should backport these changes:
 
 - nfs: add a gf_lock_t for the auth_cache-cache_dict
   http://review.gluster.org/11021
 
 - core: add gf_ref_t for common refcounting structures
   http://review.gluster.org/11022
   (already done through http://review.gluster.org/11421)
 
 - nfs: refcount each auth_cache_entry and related data_t
   http://review.gluster.org/11023
 
 - refcount: correct the documentation
   http://review.gluster.org/11328
 
 
 I'll try to send backports later this week (maybe Thursday?), unless
 someone else beats me to it. Please reply to this thread if you file a
 bug for this and send some backports.

The above backports have been posted. These should prevent the
Gluster/NFS crashes in the regression tests, and therefore prevent the
hanging of NetBSD on unmounting NFS (when the NFS-server died).

Please check these patches, and merge them when ready:

   
http://review.gluster.org/#/q/status:open+project:glusterfs+branch:release-3.7+topic:bug-1242515

Thanks,
Niels

 
 Thanks,
 Niels
 
  
  On Tue, Jul 7, 2015 at 6:44 PM, Kotresh Hiremath Ravishankar
  khire...@redhat.com wrote:
   Hi Emmanuel,
  
   We are seeing these issues again on nbslave7h.cloud.gluster.org
   http://build.gluster.org/job/rackspace-netbsd7-regression-triggered/7974/console
  
   Thanks and Regards,
   Kotresh H R
  
   - Original Message -
   From: Emmanuel Dreyfus m...@netbsd.org
   To: Kotresh Hiremath Ravishankar khire...@redhat.com, Gluster 
   Devel gluster-devel@gluster.org
   Sent: Sunday, July 5, 2015 12:52:23 AM
   Subject: Re: [Gluster-devel] NetBSD regression tests not Initializing...
  
   Kotresh Hiremath Ravishankar khire...@redhat.com wrote:
  
Any help is appreciated.
  
   nbslave72 was sick indeed: it refused SSH connections. I rebooted it and
   retriggered your change, but it went on another machine.
  
   --
   Emmanuel Dreyfus
   http://hcpnet.free.fr/pubz
   m...@netbsd.org
  
   ___
   Gluster-devel mailing list
   Gluster-devel@gluster.org
   http://www.gluster.org/mailman/listinfo/gluster-devel


Re: [Gluster-devel] NetBSD regression tests not Initializing...

2015-07-10 Thread Vijaikumar M

NetBSD tests are failing again:

http://build.gluster.org/job/rackspace-netbsd7-regression-triggered/8123/console

Triggered by Gerrit: http://review.gluster.org/11616 in silent mode.
Building remotely on nbslave74.cloud.gluster.org (netbsd7_regression) in
workspace /home/jenkins/root/workspace/rackspace-netbsd7-regression-triggered
  git rev-parse --is-inside-work-tree # timeout=10
Fetching changes from the remote Git repository
  git config remote.origin.url http://review.gluster.org/glusterfs.git #
  timeout=10
Fetching upstream changes from http://review.gluster.org/glusterfs.git
  git --version # timeout=10
  git -c core.askpass=true fetch --tags --progress
  http://review.gluster.org/glusterfs.git refs/changes/16/11616/1
ERROR: Error fetching remote repo 'origin'
ERROR: Error fetching remote repo 'origin'
Finished: FAILURE

Thanks,
Vijay




On Tuesday 07 July 2015 07:13 PM, Kaushal M wrote:

I've taken this slave and one other offline and am rebooting it.

On Tue, Jul 7, 2015 at 6:44 PM, Kotresh Hiremath Ravishankar
khire...@redhat.com wrote:

Hi Emmanuel,

We are seeing these issues again on nbslave7h.cloud.gluster.org
http://build.gluster.org/job/rackspace-netbsd7-regression-triggered/7974/console

Thanks and Regards,
Kotresh H R

- Original Message -

From: Emmanuel Dreyfus m...@netbsd.org
To: Kotresh Hiremath Ravishankar khire...@redhat.com, Gluster Devel 
gluster-devel@gluster.org
Sent: Sunday, July 5, 2015 12:52:23 AM
Subject: Re: [Gluster-devel] NetBSD regression tests not Initializing...

Kotresh Hiremath Ravishankar khire...@redhat.com wrote:


Any help is appreciated.

nbslave72 was sick indeed: it refused SSH connections. I rebooted it and
retriggered your change, but it went on another machine.

--
Emmanuel Dreyfus
http://hcpnet.free.fr/pubz
m...@netbsd.org




Re: [Gluster-devel] NetBSD regression tests not Initializing...

2015-07-10 Thread Emmanuel Dreyfus
Vijaikumar M vmall...@redhat.com wrote:

 NetBSD tests are failing again:
(...)
 ERROR: Error fetching remote repo 'origin'

Please reboot it. 

I am still working on the infamous NFS unmount kernel bug; I hope the
NetBSD slaves will behave better with the fix.

-- 
Emmanuel Dreyfus
http://hcpnet.free.fr/pubz
m...@netbsd.org


Re: [Gluster-devel] NetBSD regression tests not Initializing...

2015-07-07 Thread Kaushal M
I've taken this slave and one other offline and am rebooting it.

On Tue, Jul 7, 2015 at 6:44 PM, Kotresh Hiremath Ravishankar
khire...@redhat.com wrote:
 Hi Emmanuel,

 We are seeing these issues again on nbslave7h.cloud.gluster.org
 http://build.gluster.org/job/rackspace-netbsd7-regression-triggered/7974/console

 Thanks and Regards,
 Kotresh H R

 - Original Message -
 From: Emmanuel Dreyfus m...@netbsd.org
 To: Kotresh Hiremath Ravishankar khire...@redhat.com, Gluster Devel 
 gluster-devel@gluster.org
 Sent: Sunday, July 5, 2015 12:52:23 AM
 Subject: Re: [Gluster-devel] NetBSD regression tests not Initializing...

 Kotresh Hiremath Ravishankar khire...@redhat.com wrote:

  Any help is appreciated.

 nbslave72 was sick indeed: it refused SSH connections. I rebooted it and
 retriggered your change, but it went on another machine.

 --
 Emmanuel Dreyfus
 http://hcpnet.free.fr/pubz
 m...@netbsd.org



Re: [Gluster-devel] NetBSD regression tests not Initializing...

2015-07-05 Thread Kotresh Hiremath Ravishankar
Thanks Emmanuel.

Thanks and Regards,
Kotresh H R

- Original Message -
 From: Emmanuel Dreyfus m...@netbsd.org
 To: Kotresh Hiremath Ravishankar khire...@redhat.com, Gluster Devel 
 gluster-devel@gluster.org
 Sent: Sunday, July 5, 2015 12:52:23 AM
 Subject: Re: [Gluster-devel] NetBSD regression tests not Initializing...
 
 Kotresh Hiremath Ravishankar khire...@redhat.com wrote:
 
  Any help is appreciated.
 
 nbslave72 was sick indeed: it refused SSH connections. I rebooted it and
 retriggered your change, but it went on another machine.
 
 --
 Emmanuel Dreyfus
 http://hcpnet.free.fr/pubz
 m...@netbsd.org
 


Re: [Gluster-devel] NetBSD regression tests not Initializing...

2015-07-04 Thread Emmanuel Dreyfus
Kotresh Hiremath Ravishankar khire...@redhat.com wrote:

 Any help is appreciated.

nbslave72 was sick indeed: it refused SSH connections. I rebooted it and
retriggered your change, but it went on another machine.

-- 
Emmanuel Dreyfus
http://hcpnet.free.fr/pubz
m...@netbsd.org


[Gluster-devel] NetBSD regression tests not Initializing...

2015-07-03 Thread Kotresh Hiremath Ravishankar
Hi

NetBSD regressions are not initializing because of the following error, which
persists across multiple re-triggers.
I see the same error for quite a few patches.

http://review.gluster.org/#/c/11443/
Building remotely on nbslave72.cloud.gluster.org (netbsd7_regression) in 
workspace /home/jenkins/root/workspace/rackspace-netbsd7-regression-triggered
  git rev-parse --is-inside-work-tree # timeout=10
Fetching changes from the remote Git repository
  git config remote.origin.url http://review.gluster.org/glusterfs.git # 
  timeout=10
Fetching upstream changes from http://review.gluster.org/glusterfs.git
  git --version # timeout=10
  git -c core.askpass=true fetch --tags --progress 
  http://review.gluster.org/glusterfs.git refs/changes/43/11443/9
ERROR: Error fetching remote repo 'origin'
ERROR: Error fetching remote repo 'origin'
Finished: FAILURE

Any help is appreciated.

Thanks and Regards,
Kotresh H R



Re: [Gluster-devel] NetBSD regression tests hanging after ./tests/basic/mgmt_v3-locks.t

2015-06-18 Thread Vijay Bellur

On Tuesday 16 June 2015 02:19 AM, Emmanuel Dreyfus wrote:

Rajesh Joseph rjos...@redhat.com wrote:


Correct me if I am wrong, but I think interruptible is good with hard
mount. Which is good in real deployment scenario. Since we are talking
about test scripts, I thought soft mount along with timeout period can be
a good option to prevent hangs.


A soft mount means an I/O operation can time out and return failure.
An interruptible mount means you can kill a process undergoing I/O, which
is useful for cleanup routines.

Using both is like wearing a belt and suspenders, but given how likely we
are to hang, it does not hurt.



We again hit this problem [1]. Can we use soft mount with some retries 
and timeouts so that we don't need manual intervention to recover a hung VM?


Thanks,
Vijay

[1] 
http://build.gluster.org/job/rackspace-netbsd7-regression-triggered/6971/console 




Re: [Gluster-devel] NetBSD regression tests hanging after ./tests/basic/mgmt_v3-locks.t

2015-06-18 Thread Emmanuel Dreyfus
Vijay Bellur vbel...@redhat.com wrote:

 We again hit this problem [1]. Can we use soft mount with some retries
 and timeouts so that we don't need manual intervention to recover a hung VM?

Sure, but while we are at it, I advise a soft and interruptible mount (on
NetBSD, either mount -o soft,intr or mount -i -s)

-- 
Emmanuel Dreyfus
http://hcpnet.free.fr/pubz
m...@netbsd.org


Re: [Gluster-devel] NetBSD regression tests hanging after ./tests/basic/mgmt_v3-locks.t

2015-06-18 Thread Emmanuel Dreyfus
Emmanuel Dreyfus m...@netbsd.org wrote:

  We again hit this problem [1]. Can we use soft mount with some retries and
  timeouts so that we don't need manual intervention to recover a hung VM?
 
 Um, looking at the current test scripts, we already do it. 

A side note: it seems the hung case is always with dd(1). I have never
caught tests using quota.c undergoing the same failure.

The only tests that do NFS mount + dd(1) are:

tests/basic/ec/nfs.t
tests/basic/mount-nfs-auth.t
tests/bugs/glusterfs/bug-872923.t
tests/bugs/quota/bug-1153964.t

Perhaps it is time to add options to quota.c and use it everywhere? It
would be interesting to understand what makes dd(1) hang while quota.c
is fine, though.

-- 
Emmanuel Dreyfus
http://hcpnet.free.fr/pubz
m...@netbsd.org


Re: [Gluster-devel] NetBSD regression tests hanging after ./tests/basic/mgmt_v3-locks.t

2015-06-18 Thread Emmanuel Dreyfus
Emmanuel Dreyfus m...@netbsd.org wrote:

 This means the dd process getting stuck in tstile because glusterfsd
 died is probably a NetBSD kernel bug. I have to investigate. 

I think I found the culprit, but fixing this will need some discussions
on NetBSD lists:

dd waits on a vnode lock owned by the ioflush kernel thread, which is
responsible for the periodic fsync.

ioflush is stuck on the following backtrace:
cv_wait
genfs_do_putpages
genfs_putpages
VOP_PUTPAGES
nfs_flush
nfs_fsync
VOP_FSYNC
nfs_sync
sync_fsync

The cv_wait() call in genfs_do_putpages():
    /* Wait for output to complete. */
    if (!wasclean && !async && vp->v_numoutput != 0) {
        while (vp->v_numoutput != 0)
            cv_wait(&vp->v_cv, slock);
    }

cv_wait() is an uninterruptible wait with no timeout, which is obviously
wrong there. cv_timedwait_sig() would be better, but that means pulling NFS
mount options from a lower layer. That is not obvious on the architecture front.
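
To illustrate the difference between an unbounded wait and a bounded one,
here is a userland analogy using pthreads. It is only a sketch, not the
kernel change: struct pending, wait_for_output() and timedwait_demo() are
made-up names, and pthread_cond_timedwait() models only the timeout half of
cv_timedwait_sig(), not the signal half.

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <pthread.h>
#include <time.h>

/* Hypothetical userland stand-in for the vnode's pending-output state. */
struct pending {
    pthread_mutex_t lock;
    pthread_cond_t  cv;
    int             numoutput;   /* writes still in flight */
};

/*
 * Wait until numoutput reaches zero, but give up after timeout_ms.
 * Returns 0 if the output drained, ETIMEDOUT if the deadline passed.
 */
int wait_for_output(struct pending *p, long timeout_ms)
{
    struct timespec deadline;
    int rc = 0;

    /* pthread_cond_timedwait() takes an absolute deadline. */
    clock_gettime(CLOCK_REALTIME, &deadline);
    deadline.tv_sec  += timeout_ms / 1000;
    deadline.tv_nsec += (timeout_ms % 1000) * 1000000L;
    if (deadline.tv_nsec >= 1000000000L) {
        deadline.tv_sec++;
        deadline.tv_nsec -= 1000000000L;
    }

    pthread_mutex_lock(&p->lock);
    /* Same loop shape as the kernel code, but each wait is bounded. */
    while (p->numoutput != 0 && rc == 0)
        rc = pthread_cond_timedwait(&p->cv, &p->lock, &deadline);
    if (p->numoutput == 0)
        rc = 0;
    pthread_mutex_unlock(&p->lock);
    return rc;
}

/* Nobody ever signals the condition, like a dead NFS server. */
int timedwait_demo(int in_flight)
{
    struct pending p;
    int rc;

    pthread_mutex_init(&p.lock, NULL);
    pthread_cond_init(&p.cv, NULL);
    p.numoutput = in_flight;

    rc = wait_for_output(&p, 50);

    pthread_cond_destroy(&p.cv);
    pthread_mutex_destroy(&p.lock);
    return rc;
}
```

With one write in flight and nobody to complete it, timedwait_demo(1) comes
back with ETIMEDOUT after about 50 ms instead of blocking forever. The
caller then has to decide what to do with the error, which is where the
mount options would come in.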

-- 
Emmanuel Dreyfus
http://hcpnet.free.fr/pubz
m...@netbsd.org


Re: [Gluster-devel] NetBSD regression tests hanging after ./tests/basic/mgmt_v3-locks.t

2015-06-16 Thread Emmanuel Dreyfus
Rajesh Joseph rjos...@redhat.com wrote:

 Correct me if I am wrong, but I think interruptible is good with hard
 mount. Which is good in real deployment scenario. Since we are talking
 about test scripts, I thought soft mount along with timeout period can be
 a good option to prevent hangs.

A soft mount means an I/O operation can time out and return failure.
An interruptible mount means you can kill a process undergoing I/O, which
is useful for cleanup routines.

Using both is like wearing a belt and suspenders, but given how likely we
are to hang, it does not hurt.
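
The two behaviours can be sketched in userland with poll(2). This is an
analogy only, not NFS client code, and read_soft() and soft_demo() are
hypothetical names: a timeout that turns into an error stands in for a soft
mount, and EINTR from a signal stands in for an interruptible mount.

```c
#include <errno.h>
#include <poll.h>
#include <unistd.h>

/*
 * Read with a deadline. Returns the number of bytes read, or a negative
 * errno value: -ETIMEDOUT when nothing arrived in time ("soft"), -EINTR
 * when a signal interrupted the wait ("interruptible").
 */
long read_soft(int fd, void *buf, size_t len, int timeout_ms)
{
    struct pollfd pfd = { .fd = fd, .events = POLLIN };
    int n = poll(&pfd, 1, timeout_ms);

    if (n == 0)
        return -ETIMEDOUT;      /* give up and report failure */
    if (n < 0)
        return -errno;          /* EINTR lands here */

    ssize_t got = read(fd, buf, len);
    return got < 0 ? -errno : (long)got;
}

/* A pipe whose writer never writes plays the part of the dead server. */
int soft_demo(void)
{
    int fds[2];
    char c;
    long rc;

    if (pipe(fds) != 0)
        return -1;
    rc = read_soft(fds[0], &c, 1, 50);
    close(fds[0]);
    close(fds[1]);
    return rc == -ETIMEDOUT ? 0 : 1;
}
```

A plain blocking read(2) on the same pipe would never return, which is the
hard, non-interruptible case the test scripts want to avoid.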

-- 
Emmanuel Dreyfus
http://hcpnet.free.fr/pubz
m...@netbsd.org


Re: [Gluster-devel] NetBSD regression tests hanging after ./tests/basic/mgmt_v3-locks.t

2015-06-15 Thread Pranith Kumar Karampuri

Emmanuel,
   I am not sure of the feasibility, but I wanted to ask you: do
you think it is possible to error out operations on the mount
when the mount crashes, instead of hanging? That would prevent a lot of
manual intervention in the future.


Pranith.
On 06/15/2015 01:35 PM, Niels de Vos wrote:

Hi,

sometimes the NetBSD regression tests hang with messages like this:

 [12:29:07] ./tests/basic/mgmt_v3-locks.t
 ... ok 79867 ms
 No volumes present
 mount_nfs: can't access /patchy: Permission denied
 mount_nfs: can't access /patchy: Permission denied
 mount_nfs: can't access /patchy: Permission denied

Most (if not all) of these hangs are caused by a crashing Gluster/NFS
process. Once the Gluster/NFS server is not reachable anymore,
unmounting fails.

The only way to recover is to reboot the VM and retrigger the test. For
rebooting, the http://build.gluster.org/job/reboot-vm job can be used,
and retriggering works by clicking the retrigger link in the left menu
once the test has been marked as failed/aborted.

When logging in on the NetBSD system that hangs, you can verify with
these steps:

1. check if there is a /glusterfsd.core file
2. run gdb on the core:

 # cd /build/install
 # gdb --core=/glusterfsd.core sbin/glusterfs
 ...
 Program terminated with signal SIGSEGV, Segmentation fault.
 #0  0xb9b94f0b in auth_cache_lookup (cache=0xb9aa2310, fh=0xb9044bf8,
 host_addr=0xb900e400 104.130.205.187, timestamp=0xbf7fd900,
 can_write=0xbf7fd8fc)
 at
 
/home/jenkins/root/workspace/rackspace-netbsd7-regression-triggered/xlators/nfs/server/src/auth-cache.c:164
 164 *can_write = lookup_res->item->opts->rw;

3. verify the lookup_res structure:

 (gdb) p *lookup_res
 $1 = {timestamp = 1434284981, item = 0xb901e3b0}
 (gdb) p *lookup_res->item
 $2 = {name = 0xff00 <error: Cannot access memory at address
 0xff00>, opts = 0x}


A fix for this has been sent; it is currently waiting for an update to
the proposed reference counting:

   - http://review.gluster.org/11022
 core: add gf_ref_t for common refcounting structures
   - http://review.gluster.org/11023
 nfs: refcount each auth_cache_entry and related data_t

Thanks,
Niels


Re: [Gluster-devel] NetBSD regression tests hanging after ./tests/basic/mgmt_v3-locks.t

2015-06-15 Thread Kaushal M
The hang we observe is not something specific to Gluster. I've
observed this kind of hang when a filesystem which is in use goes
offline.
For example, I've accidentally shut down machines which were being used
for mounting NFS, which led to the client systems hanging completely
and requiring a hard reboot.

If there are ways to avoid these kinds of hangs when they eventually
occur, I'm all ears.

On Mon, Jun 15, 2015 at 4:38 PM, Pranith Kumar Karampuri
pkara...@redhat.com wrote:
 Emmanuel,
I am not sure of the feasibility but just wanted to ask you. Do you
 think there is a possibility to error out operations on the mount when mount
 crashes instead of hanging? That would prevent a lot of manual intervention
 even in future.

 Pranith.

 On 06/15/2015 01:35 PM, Niels de Vos wrote:

 Hi,

 sometimes the NetBSD regression tests hang with messages like this:

  [12:29:07] ./tests/basic/mgmt_v3-locks.t
  ... ok 79867 ms
  No volumes present
  mount_nfs: can't access /patchy: Permission denied
  mount_nfs: can't access /patchy: Permission denied
  mount_nfs: can't access /patchy: Permission denied

 Most (if not all) of these hangs are caused by a crashing Gluster/NFS
 process. Once the Gluster/NFS server is not reachable anymore,
 unmounting fails.

 The only way to recover is to reboot the VM and retrigger the test. For
 rebooting, the http://build.gluster.org/job/reboot-vm job can be used,
 and retriggering works by clicking the retrigger link in the left menu
 once the test has been marked as failed/aborted.

 When logging in on the NetBSD system that hangs, you can verify with
 these steps:

 1. check if there is a /glusterfsd.core file
 2. run gdb on the core:

  # cd /build/install
  # gdb --core=/glusterfsd.core sbin/glusterfs
  ...
  Program terminated with signal SIGSEGV, Segmentation fault.
  #0  0xb9b94f0b in auth_cache_lookup (cache=0xb9aa2310, fh=0xb9044bf8,
  host_addr=0xb900e400 104.130.205.187, timestamp=0xbf7fd900,
  can_write=0xbf7fd8fc)
  at

 /home/jenkins/root/workspace/rackspace-netbsd7-regression-triggered/xlators/nfs/server/src/auth-cache.c:164
  164 *can_write = lookup_res->item->opts->rw;

 3. verify the lookup_res structure:

  (gdb) p *lookup_res
  $1 = {timestamp = 1434284981, item = 0xb901e3b0}
  (gdb) p *lookup_res->item
  $2 = {name = 0xff00 <error: Cannot access memory at address
  0xff00>, opts = 0x}


 A fix for this has been sent; it is currently waiting for an update to
 the proposed reference counting:

- http://review.gluster.org/11022
  core: add gf_ref_t for common refcounting structures
- http://review.gluster.org/11023
  nfs: refcount each auth_cache_entry and related data_t

 Thanks,
 Niels


Re: [Gluster-devel] NetBSD regression tests hanging after ./tests/basic/mgmt_v3-locks.t

2015-06-15 Thread Emmanuel Dreyfus
On Mon, Jun 15, 2015 at 04:38:54PM +0530, Pranith Kumar Karampuri wrote:
 Emmanuel,
I am not sure of the feasibility but just wanted to ask you. Do you
 think there is a possibility to error out operations on the mount when mount
 crashes instead of hanging? That would prevent a lot of manual intervention
 even in future.

Your message is a bit contradictory: there are bits quoted about NFS mount, 
which is native, and bits about glusterfs mount. What information are
you looking for?

If we talk about a hanging mount, this is probably an NFS client waiting
for an NFS server that will never return. I already wrote how this can be
cleaned up by umount -f -R and the limitations of that approach.

If we talk about crashing mount then this is more likely to be a
native mount, for which you have information in the logs, don't you?

-- 
Emmanuel Dreyfus
m...@netbsd.org


Re: [Gluster-devel] NetBSD regression tests hanging after ./tests/basic/mgmt_v3-locks.t

2015-06-15 Thread Rajesh Joseph



On Monday 15 June 2015 05:21 PM, Kaushal M wrote:

The hang we observe is not something specific to Gluster. I've
observed this kind of hang when a filesystem which is in use goes
offline.
For example, I've accidentally shut down machines which were being used
for mounting NFS, which led to the client systems hanging completely
and requiring a hard reboot.

If there are ways to avoid these kinds of hangs when they eventually
occur, I'm all ears.


For these test cases can't we use the nfs soft mount option to prevent 
the hang?




On Mon, Jun 15, 2015 at 4:38 PM, Pranith Kumar Karampuri
pkara...@redhat.com wrote:

Emmanuel,
I am not sure of the feasibility but just wanted to ask you. Do you
think there is a possibility to error out operations on the mount when mount
crashes instead of hanging? That would prevent a lot of manual intervention
even in future.

Pranith.

On 06/15/2015 01:35 PM, Niels de Vos wrote:

Hi,

sometimes the NetBSD regression tests hang with messages like this:

  [12:29:07] ./tests/basic/mgmt_v3-locks.t
  ... ok 79867 ms
  No volumes present
  mount_nfs: can't access /patchy: Permission denied
  mount_nfs: can't access /patchy: Permission denied
  mount_nfs: can't access /patchy: Permission denied

Most (if not all) of these hangs are caused by a crashing Gluster/NFS
process. Once the Gluster/NFS server is not reachable anymore,
unmounting fails.

The only way to recover is to reboot the VM and retrigger the test. For
rebooting, the http://build.gluster.org/job/reboot-vm job can be used,
and retriggering works by clicking the retrigger link in the left menu
once the test has been marked as failed/aborted.

When logging in on the NetBSD system that hangs, you can verify with
these steps:

1. check if there is a /glusterfsd.core file
2. run gdb on the core:

  # cd /build/install
  # gdb --core=/glusterfsd.core sbin/glusterfs
  ...
  Program terminated with signal SIGSEGV, Segmentation fault.
  #0  0xb9b94f0b in auth_cache_lookup (cache=0xb9aa2310, fh=0xb9044bf8,
  host_addr=0xb900e400 104.130.205.187, timestamp=0xbf7fd900,
  can_write=0xbf7fd8fc)
  at

/home/jenkins/root/workspace/rackspace-netbsd7-regression-triggered/xlators/nfs/server/src/auth-cache.c:164
  164 *can_write = lookup_res->item->opts->rw;

3. verify the lookup_res structure:

  (gdb) p *lookup_res
  $1 = {timestamp = 1434284981, item = 0xb901e3b0}
  (gdb) p *lookup_res->item
  $2 = {name = 0xff00 <error: Cannot access memory at address
  0xff00>, opts = 0x}


A fix for this has been sent; it is currently waiting for an update to
the proposed reference counting:

- http://review.gluster.org/11022
  core: add gf_ref_t for common refcounting structures
- http://review.gluster.org/11023
  nfs: refcount each auth_cache_entry and related data_t

Thanks,
Niels


Re: [Gluster-devel] NetBSD regression tests hanging after ./tests/basic/mgmt_v3-locks.t

2015-06-15 Thread Emmanuel Dreyfus
On Mon, Jun 15, 2015 at 06:28:26PM +0530, Rajesh Joseph wrote:
 For these test cases can't we use the nfs soft mount option to prevent the
 hang?

soft mount will not be enough. I think you also need interruptible.

-- 
Emmanuel Dreyfus
m...@netbsd.org


Re: [Gluster-devel] NetBSD regression tests hanging after ./tests/basic/mgmt_v3-locks.t

2015-06-15 Thread Rajesh Joseph



On Monday 15 June 2015 06:34 PM, Emmanuel Dreyfus wrote:

On Mon, Jun 15, 2015 at 06:28:26PM +0530, Rajesh Joseph wrote:

For these test cases can't we use the nfs soft mount option to prevent the
hang?

soft mount will not be enough. I think you also need interruptible.


Correct me if I am wrong, but I think interruptible is good with a hard
mount, which is good in a real deployment scenario. Since we are talking
about test scripts, I thought a soft mount along with a timeout period can
be a good option to prevent hangs.




[Gluster-devel] NetBSD regression tests hanging after ./tests/basic/mgmt_v3-locks.t

2015-06-15 Thread Niels de Vos
Hi,

sometimes the NetBSD regression tests hang with messages like this:

[12:29:07] ./tests/basic/mgmt_v3-locks.t
... ok 79867 ms
No volumes present
mount_nfs: can't access /patchy: Permission denied
mount_nfs: can't access /patchy: Permission denied
mount_nfs: can't access /patchy: Permission denied

Most (if not all) of these hangs are caused by a crashing Gluster/NFS
process. Once the Gluster/NFS server is not reachable anymore,
unmounting fails.

The only way to recover is to reboot the VM and retrigger the test. For
rebooting, the http://build.gluster.org/job/reboot-vm job can be used,
and retriggering works by clicking the retrigger link in the left menu
once the test has been marked as failed/aborted.

When logging in on the NetBSD system that hangs, you can verify with
these steps:

1. check if there is a /glusterfsd.core file
2. run gdb on the core:

# cd /build/install
# gdb --core=/glusterfsd.core sbin/glusterfs
...
Program terminated with signal SIGSEGV, Segmentation fault.
#0  0xb9b94f0b in auth_cache_lookup (cache=0xb9aa2310, fh=0xb9044bf8,
host_addr=0xb900e400 "104.130.205.187", timestamp=0xbf7fd900,
can_write=0xbf7fd8fc)
at /home/jenkins/root/workspace/rackspace-netbsd7-regression-triggered/xlators/nfs/server/src/auth-cache.c:164
164 *can_write = lookup_res->item->opts->rw;

3. verify the lookup_res structure:

(gdb) p *lookup_res
$1 = {timestamp = 1434284981, item = 0xb901e3b0}
(gdb) p *lookup_res->item
$2 = {name = 0xff00 <error: Cannot access memory at address
0xff00>, opts = 0x}


A fix for this has been sent; it is currently waiting for an update to
the proposed reference counting:

  - http://review.gluster.org/11022
core: add gf_ref_t for common refcounting structures
  - http://review.gluster.org/11023
nfs: refcount each auth_cache_entry and related data_t
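
For readers unfamiliar with the idea, here is a minimal sketch of why
reference counting avoids this kind of crash. It is not the gf_ref_t code
from the changes listed above: struct entry, entry_get(), entry_put() and
refcount_demo() are hypothetical names, and the plain int counter is not
thread-safe, so a real implementation would need an atomic counter or a
lock.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical cache entry; rw stands in for item->opts->rw. */
struct entry {
    int refs;   /* number of holders, including the cache itself */
    int rw;
};

static int entries_freed = 0;   /* lets the demo observe when free() runs */

static struct entry *entry_new(int rw)
{
    struct entry *e = calloc(1, sizeof(*e));
    if (e == NULL)
        return NULL;
    e->refs = 1;    /* the creator (the cache) holds the first reference */
    e->rw = rw;
    return e;
}

static struct entry *entry_get(struct entry *e)
{
    assert(e->refs > 0);
    e->refs++;
    return e;
}

static void entry_put(struct entry *e)
{
    assert(e->refs > 0);
    if (--e->refs == 0) {
        entries_freed++;
        free(e);
    }
}

/* Returns 0 if the entry stayed valid after the cache dropped it. */
int refcount_demo(void)
{
    int before = entries_freed;
    struct entry *cached = entry_new(1);
    if (cached == NULL)
        return -1;

    struct entry *lookup = entry_get(cached);  /* lookup takes its own ref */
    entry_put(cached);                          /* cache evicts the entry */

    /* Without the extra reference this read would be a use-after-free. */
    int can_write = lookup->rw;
    int freed_early = entries_freed - before;   /* expected to be 0 */

    entry_put(lookup);                          /* last holder frees it */
    int freed_total = entries_freed - before;   /* expected to be 1 */

    return (can_write == 1 && freed_early == 0 && freed_total == 1) ? 0 : 1;
}
```

The entry is freed only when the last holder calls entry_put(), so a lookup
that still holds a reference never reads freed memory the way the gdb
session above shows.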

Thanks,
Niels


Re: [Gluster-devel] NetBSD regression tests: reviews required

2014-12-09 Thread Emmanuel Dreyfus
Vijay Bellur vbel...@redhat.com wrote:

 More the merrier :-).

Hi

On master, I still have this one pending to fix glustershd:
http://review.gluster.com/9071

Same fix on release-3.6
http://review.gluster.com/9084

While I am there, fixes done in master but pending for release-3.6:
http://review.gluster.com/9215 (easy buffer overrun fix)
http://review.gluster.com/9214 (reviewed +2)
-- 
Emmanuel Dreyfus
http://hcpnet.free.fr/pubz
m...@netbsd.org
___
Gluster-devel mailing list
Gluster-devel@gluster.org
http://supercolony.gluster.org/mailman/listinfo/gluster-devel


Re: [Gluster-devel] NetBSD regression tests: reviews required

2014-12-03 Thread Emmanuel Dreyfus
On Mon, Dec 01, 2014 at 05:49:54AM +0100, Emmanuel Dreyfus wrote:
 Here is the latest list of NetBSD fixes for regression tests:

Hi

This is a friendly reminder that I still have the following pending:

 http://review.gluster.com/9071
 http://review.gluster.com/9075
 http://review.gluster.com/9074
 http://review.gluster.com/9216  [2]
 http://review.gluster.com/9217
 http://review.gluster.com/9219
 http://review.gluster.com/9220
 
 [2] Here I fix the symptom rather than the cause. Hints are welcome to
 help fixing the cause, but perhaps the symptom fix could be merged as an
 interim solution so that glustershd stops crashing during the test.

-- 
Emmanuel Dreyfus
m...@netbsd.org


Re: [Gluster-devel] NetBSD regression tests: reviews required

2014-12-01 Thread Xavier Hernandez

On 12/01/2014 05:49 AM, Emmanuel Dreyfus wrote:

Vijay Bellur vbel...@redhat.com wrote:


And as the fix crop, I have a few others to share :-)

More the merrier :-).


Here is the latest list of NetBSD fixes for regression tests:
http://review.gluster.com/8982
http://review.gluster.com/9071
http://review.gluster.com/9075
http://review.gluster.com/9074
http://review.gluster.com/9212  [1]
http://review.gluster.com/9216  [2]
http://review.gluster.com/9217
http://review.gluster.com/9219
http://review.gluster.com/9220

[1] Krishnan Parthasarathi will probably want to improve the commit
message before merging.

[2] Here I fix the symptom rather than the cause. Hints are welcome to
help fixing the cause, but perhaps the symptom fix could be merged as an
interim solution so that glustershd stops crashing during the test.

The regression.sh script on nbslave71 and nbslave72 still disables two
tests that always fail:
./tests/basic/afr/entry-self-heal.t  - I am working on it
./tests/basic/ec/quota.t - Xavier Hernandez and Raghavendra Gowdappa
may have a word about it.


A temporary solution, if you need to implement this very soon, is to add a
sleep of a few seconds between the 'dd' and 'rm' commands in the quota.t
script. This prevents the crash on DHT and allows the test to pass.


I can do that if needed.

Xavi


Re: [Gluster-devel] NetBSD regression tests: reviews required

2014-12-01 Thread Emmanuel Dreyfus
On Mon, Dec 01, 2014 at 10:02:32AM +0100, Xavier Hernandez wrote:
 A temporary solution if you need to implement this very soon is to add a
 sleep of a few seconds between the 'dd' and 'rm' commands in the quota.t
 script. This prevents the crash on DHT and allows the test to pass.

Given that NetBSD regressions appear almost as fast as they are
fixed, I am in a hurry to have the triggered regression tests
operational, so that new regressions are not introduced.

Hence yes, I am in favor of temporary fixes so that tests can be run.

-- 
Emmanuel Dreyfus
m...@netbsd.org


Re: [Gluster-devel] NetBSD regression tests: reviews required

2014-12-01 Thread Xavier Hernandez

This patch should solve the crash and let the test finish successfully.

http://review.gluster.org/9222

Xavi

On 12/01/2014 10:26 AM, Emmanuel Dreyfus wrote:

On Mon, Dec 01, 2014 at 10:02:32AM +0100, Xavier Hernandez wrote:

A temporary solution if you need to implement this very soon is to add a
sleep of a few seconds between the 'dd' and 'rm' commands in the quota.t
script. This prevents the crash on DHT and allows the test to pass.


Given that NetBSD regressions appear almost as fast as they are
fixed, I am in a hurry to have the triggered regression tests
operational, so that new regressions are not introduced.

Hence yes, I am in favor of temporary fixes so that tests can be run.




Re: [Gluster-devel] NetBSD regression tests: reviews required

2014-12-01 Thread Raghavendra Gowdappa


- Original Message -
 From: Xavier Hernandez xhernan...@datalab.es
 To: Emmanuel Dreyfus m...@netbsd.org, Vijay Bellur 
 vbel...@redhat.com, Justin Clift jus...@gluster.org,
 Pranith Kumar Karampuri pkara...@redhat.com, Krishnan Parthasarathi 
 kpart...@redhat.com, Raghavendra
 Gowdappa rgowd...@redhat.com
 Cc: gluster-devel@gluster.org
 Sent: Monday, December 1, 2014 2:32:32 PM
 Subject: Re: [Gluster-devel] NetBSD regression tests: reviews required
 
 On 12/01/2014 05:49 AM, Emmanuel Dreyfus wrote:
  Vijay Bellur vbel...@redhat.com wrote:
 
  And as the fix crop, I have a few others to share :-)
  More the merrier :-).
 
  Here is the latest list of NetBSD fixes for regression tests:
  http://review.gluster.com/8982
  http://review.gluster.com/9071
  http://review.gluster.com/9075
  http://review.gluster.com/9074
  http://review.gluster.com/9212  [1]
  http://review.gluster.com/9216  [2]
  http://review.gluster.com/9217
  http://review.gluster.com/9219
  http://review.gluster.com/9220
 
  [1] Krishnan Parthasarathi will probably want to improve the commit
  message before merging.
 
  [2] Here I fix the symptom rather than the cause. Hints are welcome to
  help fixing the cause, but perhaps the symptom fix could be merged as an
  interim solution so that glustershd stops crashing during the test.
 
  The regression.sh script on nbslave71 and nbslave72 still disables two
  tests that always fail:
  ./tests/basic/afr/entry-self-heal.t  - I am working on it
  ./tests/basic/ec/quota.t - Xavier Hernandez and Raghavendra Gowdappa
  may have a word about it.
 
 A temporary solution if you need to implement this very soon is to add a
 sleep of a few seconds between the 'dd' and 'rm' commands in the quota.t
 script. This prevents the crash on DHT and allows the test to pass.
 
 I can do that if needed.

Go ahead :).

 
 Xavi
 


Re: [Gluster-devel] NetBSD regression tests: reviews required

2014-12-01 Thread Raghavendra G
On Mon, Dec 1, 2014 at 2:32 PM, Xavier Hernandez xhernan...@datalab.es
wrote:

 On 12/01/2014 05:49 AM, Emmanuel Dreyfus wrote:

 Vijay Bellur vbel...@redhat.com wrote:

  And as the fix crop, I have a few others to share :-)

 More the merrier :-).


 Here is the latest list of NetBSD fixes for regression tests:
 http://review.gluster.com/8982
 http://review.gluster.com/9071
 http://review.gluster.com/9075
 http://review.gluster.com/9074
 http://review.gluster.com/9212  [1]
 http://review.gluster.com/9216  [2]
 http://review.gluster.com/9217
 http://review.gluster.com/9219
 http://review.gluster.com/9220

 [1] Krishnan Parthasarathi will probably want to improve the commit
 message before merging.

 [2] Here I fix the symptom rather than the cause. Hints are welcome to
 help fixing the cause, but perhaps the symptom fix could be merged as an
 interim solution so that glustershd stops crashing during the test.

 The regression.sh script on nbslave71 and nbslave72 still disables two
 tests that always fail:
 ./tests/basic/afr/entry-self-heal.t  - I am working on it
 ./tests/basic/ec/quota.t - Xavier Hernandez and Raghavendra Gowdappa
 may have a word about it.


 A temporary solution if you need to implement this very soon is to add a
 sleep of a few seconds between the 'dd' and 'rm' commands in the quota.t
 script. This prevents the crash on DHT


Do you have the core/backtrace? I am curious to know what caused the crash.


 and allows the test to pass.

 I can do that if needed.

 Xavi





-- 
Raghavendra G


Re: [Gluster-devel] NetBSD regression tests: reviews required

2014-12-01 Thread Krutika Dhananjay
Emmanuel, 

I think Raghavendra is referring to the crash in tests/basic/ec/quota.t here 
(as opposed to the one in tests/basic/afr/self-heald.t for which I posted the 
explanation). 
-Krutika 

- Original Message -

 From: Emmanuel Dreyfus m...@netbsd.org
 To: Raghavendra G raghaven...@gluster.com, Xavier Hernandez
 xhernan...@datalab.es
 Cc: Gluster Devel gluster-devel@gluster.org
 Sent: Monday, December 1, 2014 6:04:57 PM
 Subject: Re: [Gluster-devel] NetBSD regression tests: reviews required

 Raghavendra G raghaven...@gluster.com wrote:

  Do you have the core/backtrace? I am curious to know what caused the crash.

 Krutika Dhananjay just posted the full explanation in
 1374342506.4067656.1417436203250.javamail.zim...@redhat.com
 --
 Emmanuel Dreyfus
 http://hcpnet.free.fr/pubz
 m...@netbsd.org


Re: [Gluster-devel] NetBSD regression tests: reviews required

2014-11-30 Thread Emmanuel Dreyfus
Vijay Bellur vbel...@redhat.com wrote:

  And as the fix crop, I have a few others to share :-)
 More the merrier :-).

Here is the latest list of NetBSD fixes for regression tests:
http://review.gluster.com/8982
http://review.gluster.com/9071
http://review.gluster.com/9075
http://review.gluster.com/9074
http://review.gluster.com/9212  [1]
http://review.gluster.com/9216  [2]
http://review.gluster.com/9217
http://review.gluster.com/9219
http://review.gluster.com/9220

[1] Krishnan Parthasarathi will probably want to improve the commit
message before merging.

[2] Here I fix the symptom rather than the cause. Hints are welcome to
help fixing the cause, but perhaps the symptom fix could be merged as an
interim solution so that glustershd stops crashing during the test.

The regression.sh script on nbslave71 and nbslave72 still disables two
tests that always fail:
./tests/basic/afr/entry-self-heal.t  - I am working on it
./tests/basic/ec/quota.t - Xavier Hernandez and Raghavendra Gowdappa
may have a word about it. 


-- 
Emmanuel Dreyfus
http://hcpnet.free.fr/pubz
m...@netbsd.org


Re: [Gluster-devel] NetBSD regression tests: reviews required

2014-11-28 Thread Justin Clift
On Sat, 22 Nov 2014 16:55:07 +0100
m...@netbsd.org (Emmanuel Dreyfus) wrote:
snip
 Some news on triggered NetBSD regression tests: we still have a few
 tests that always fail in basic. I tweaked the regression test
 launching script to skip them, so that we can get some useful
 results until they are fixed. But before enabling votes, we still
 need to merge two change sets to fix spurious failures:
 http://review.gluster.com/9071
 http://review.gluster.com/9075
 
snip 
 And while I am there, it would be nice if someone could review this
 one: http://review.gluster.com/9137

Hi all,

Does anyone have time to look over some (or all) of these three for
correctness?

We're trying to get the NetBSD side of things running 100%, and
waiting on these is blocking us. ;)

Regards and best wishes,

Justin Clift

-- 
GlusterFS - http://www.gluster.org

An open source, distributed file system scaling to several
petabytes, and handling thousands of clients.

My personal twitter: twitter.com/realjustinclift


Re: [Gluster-devel] NetBSD regression tests: reviews required

2014-11-28 Thread Vijay Bellur

On 11/28/2014 09:25 PM, Justin Clift wrote:

On Sat, 22 Nov 2014 16:55:07 +0100
m...@netbsd.org (Emmanuel Dreyfus) wrote:
snip

Some news on triggered NetBSD regression tests: we still have a few
tests that always fail in basic. I tweaked the regression test
launching script to skip them, so that we can get some useful
results until they are fixed. But before enabling votes, we still
need to merge two change sets to fix spurious failures:
http://review.gluster.com/9071
http://review.gluster.com/9075



Have dropped a note to Pranith and Raghavendra Bhat to review these. We 
should get this in soon.



snip

And while I am there, it would be nice if someone could review this
one: http://review.gluster.com/9137




Have Merged this.


Hi all,

Does anyone have time to look over some (or all) of these three for
correctness?

We're trying to get the NetBSD side of things running 100%, and
waiting on these is blocking us. ;)



Sorry about this delay. All patches should be in the repo in a bit.

Regards,
Vijay



Re: [Gluster-devel] NetBSD regression tests: reviews required

2014-11-28 Thread Emmanuel Dreyfus
On Fri, Nov 28, 2014 at 03:55:22PM +, Justin Clift wrote:
 We're trying to get the NetBSD side of things running 100%, and
 waiting on these is blocking us. ;)

And as the fix crop, I have a few others to share :-)

Currently I test with:
http://review.gluster.org/8074
http://review.gluster.org/8982
http://review.gluster.org/9075
http://review.gluster.org/9137
http://review.gluster.org/9171
http://review.gluster.org/9204
http://review.gluster.org/9212 (maybe not ready yet)
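
For anyone who wants to reproduce this setup, a sketch of how such changes
can be pulled from Gerrit. Gerrit publishes each patch set under
refs/changes/<last two digits of the change>/<change>/<patch set>. The
helper name gerrit_ref, the remote name origin and the patch set number 1
are assumptions, since the thread does not say which patch sets were
tested:

```shell
#!/bin/sh
# gerrit_ref CHANGE PATCHSET
# Print the Gerrit ref for a given change number and patch set, e.g.
# change 9075, patch set 1 -> refs/changes/75/9075/1
gerrit_ref() {
    change=$1
    patchset=$2
    printf 'refs/changes/%02d/%s/%s\n' "$((change % 100))" "$change" "$patchset"
}

# Example use inside a glusterfs checkout (left commented out because it
# needs network access and a configured remote; "origin" and the patch
# set number are placeholders):
#
#   git fetch origin "$(gerrit_ref 9075 1)" && git cherry-pick FETCH_HEAD
```

Computing the ref from the change number avoids typing the two-digit
shard by hand, which is easy to get wrong for changes such as 9204.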


-- 
Emmanuel Dreyfus
m...@netbsd.org


Re: [Gluster-devel] NetBSD regression tests: reviews required

2014-11-28 Thread Vijay Bellur

On 11/28/2014 09:44 PM, Emmanuel Dreyfus wrote:

On Fri, Nov 28, 2014 at 03:55:22PM +, Justin Clift wrote:

We're trying to get the NetBSD side of things running 100%, and
waiting on these is blocking us. ;)


And as the fix crop, I have a few others to share :-)



More the merrier :-).


Currently I test with:
http://review.gluster.org/8074


s/8074/9074/ ?

Pranith - If this happens to be 9074, how would we want to address this? 
Should we wait for glfsheal to start working in release-3.6 and mainline?



http://review.gluster.org/8982


Have triggered a regression run for this. Will merge upon successful 
completion of regression.



http://review.gluster.org/9075


Under review (along with 9071).


http://review.gluster.org/9137
http://review.gluster.org/9171
http://review.gluster.org/9204


All three patches are now in merged state.


http://review.gluster.org/9212 (maybe not ready yet)



Should be in once ready.

Cheers,
Vijay



[Gluster-devel] NetBSD regression tests: reviews required

2014-11-22 Thread Emmanuel Dreyfus
Hello

Some news on triggered NetBSD regression tests: we still have a few tests
that always fail in basic. I tweaked the regression test launching
script to skip them, so that we can get some useful results until they
are fixed. But before enabling votes, we still need to merge two change
sets to fix spurious failures:
http://review.gluster.com/9071
http://review.gluster.com/9075

The NetBSD triggered regression history can be seen here: starting at
build #79, we have two successes and two spurious failures:
http://build.gluster.org/job/rackspace-netbsd7-regression-triggered/

The skipped tests for now are:

- tests/basic/afr/self-heald.t
Anuradha Talur ata...@redhat.com is working on it

- tests/basic/ec/ec.t
- tests/basic/ec/self-heal.t
Xavier Hernandez xhernan...@datalab.es submitted this fix that needs
to be merged:
http://review.gluster.org/9151

- tests/basic/ec/quota.t
Still being investigated by Xavier.
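
A sketch of how such a skip list can be applied when launching the tests.
This is not the actual regression.sh from the slaves; the function names
should_skip and filter_tests are hypothetical, and the list simply
mirrors the tests named above:

```shell
#!/bin/sh
# Hypothetical skip list, one test path per line.
skip_list="tests/basic/afr/self-heald.t
tests/basic/ec/ec.t
tests/basic/ec/self-heal.t
tests/basic/ec/quota.t"

# should_skip PATH: succeed if PATH is exactly one of the skipped tests.
should_skip() {
    printf '%s\n' "$skip_list" | grep -qxF -- "$1"
}

# filter_tests: read test paths on stdin, print those that are not skipped.
filter_tests() {
    while IFS= read -r t; do
        should_skip "$t" || printf '%s\n' "$t"
    done
}
```

The filtered list can then be handed to whatever runs the tests, for
instance: find tests/basic -name '*.t' | sort | filter_tests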


And while I am there, it would be nice if someone could review this one:
http://review.gluster.com/9137

-- 
Emmanuel Dreyfus
http://hcpnet.free.fr/pubz
m...@netbsd.org


[Gluster-devel] NetBSD regression tests

2014-11-18 Thread Emmanuel Dreyfus
Hi

We now have NetBSD regression tests triggered on each commit. The
tests are restricted to tests/basic.

Unfortunately we have 3 tests that almost reliably fail. Help is welcome
to fix them (nbslave70.cloud.gluster.org is available for testing):
./tests/basic/afr/self-heald.t   (Wstat: 0 Tests: 83 Failed: 1)
  Failed test:  29
./tests/basic/ec/quota.t (Wstat: 0 Tests: 22 Failed: 3)
  Failed tests:  19-21
./tests/basic/ec/self-heal.t (Wstat: 0 Tests: 257 Failed: 1)
  Failed test:  246
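
To pull the failing files out of a summary like the one above, something
along these lines can help. It is only a sketch: it assumes the summary
format shown here (a path followed by "(Wstat: ... Failed: N)"), and the
function names failed_files and failed_count are made up for the example:

```shell
#!/bin/sh
# The summary as quoted in this message.
summary='./tests/basic/afr/self-heald.t   (Wstat: 0 Tests: 83 Failed: 1)
  Failed test:  29
./tests/basic/ec/quota.t (Wstat: 0 Tests: 22 Failed: 3)
  Failed tests:  19-21
./tests/basic/ec/self-heal.t (Wstat: 0 Tests: 257 Failed: 1)
  Failed test:  246'

# failed_files: print the path of every test file listed in the summary.
failed_files() {
    awk '/\(Wstat:/ { print $1 }'
}

# failed_count: sum the "Failed: N" counters over all listed files.
failed_count() {
    awk '/\(Wstat:/ { c = $NF; sub(/\)/, "", c); n += c } END { print n + 0 }'
}

printf '%s\n' "$summary" | failed_files
printf '%s\n' "$summary" | failed_count
```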

There are also a few spurious failures. Merging these changes would help:
http://review.gluster.org/8982
http://review.gluster.org/9071
http://review.gluster.org/9075

-- 
Emmanuel Dreyfus
m...@netbsd.org


Re: [Gluster-devel] NetBSD regression tests

2014-11-18 Thread Xavier Hernandez

Hi,

On 11/18/2014 09:44 AM, Emmanuel Dreyfus wrote:

Hi

We now have NetBSD regression tests triggered on each commit. The
tests are restricted to tests/basic

Unfortunately we have 3 tests that almost reliably fail. Help is welcome
to fix them (nbslave70.cloud.gluster.org is available for testing):
./tests/basic/afr/self-heald.t   (Wstat: 0 Tests: 83 Failed: 1)
   Failed test:  29
./tests/basic/ec/quota.t (Wstat: 0 Tests: 22 Failed: 3)
   Failed tests:  19-21
./tests/basic/ec/self-heal.t (Wstat: 0 Tests: 257 Failed: 1)
   Failed test:  246


This failure is solved by patches http://review.gluster.org/9117/ 
(already merged) and http://review.gluster.org/9133/ (I've just reviewed 
it).


I'm still working on quota.t failure.

Xavi


Re: [Gluster-devel] NetBSD regression tests

2014-11-18 Thread Emmanuel Dreyfus
On Tue, Nov 18, 2014 at 10:30:46AM +0100, Xavier Hernandez wrote:
 ./tests/basic/ec/quota.t (Wstat: 0 Tests: 22 Failed: 3)
Failed tests:  19-21
 ./tests/basic/ec/self-heal.t (Wstat: 0 Tests: 257 Failed: 1)
Failed test:  246
 
 This failure is solved by patches http://review.gluster.org/9117/ (already
 merged) and http://review.gluster.org/9133/ (I've just reviewed it).

Unfortunately http://review.gluster.org/9133 does not always fix test
246 of ./tests/basic/ec/self-heal.t. Running ls 2 here shows that
the result is not consistent: we do not have the same objects listed
on each try.
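
One way to check for this kind of inconsistency is to list the directory
several times and compare the results. A small sketch, assuming plain
'ls' on a directory of the mount; the function name listing_is_stable and
the default of 5 tries are hypothetical:

```shell
#!/bin/sh
# listing_is_stable DIR [TRIES]
# Succeed if TRIES consecutive listings of DIR return the same entries,
# fail as soon as one listing differs from the first.
listing_is_stable() {
    dir=$1
    tries=${2:-5}
    first=$(ls -A "$dir" | sort)
    i=1
    while [ "$i" -lt "$tries" ]; do
        current=$(ls -A "$dir" | sort)
        if [ "$current" != "$first" ]; then
            return 1
        fi
        i=$((i + 1))
    done
    return 0
}
```

On a healthy volume every listing should match; a failure here only shows
that the listings differ, not why.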


-- 
Emmanuel Dreyfus
m...@netbsd.org


Re: [Gluster-devel] NetBSD regression tests

2014-11-18 Thread Xavier Hernandez

On 11/18/2014 02:05 PM, Emmanuel Dreyfus wrote:

On Tue, Nov 18, 2014 at 10:30:46AM +0100, Xavier Hernandez wrote:

./tests/basic/ec/quota.t (Wstat: 0 Tests: 22 Failed: 3)
   Failed tests:  19-21
./tests/basic/ec/self-heal.t (Wstat: 0 Tests: 257 Failed: 1)
   Failed test:  246


This failure is solved by patches http://review.gluster.org/9117/ (already
merged) and http://review.gluster.org/9133/ (I've just reviewed it).


Unfortunately http://review.gluster.org/9133 does not always fix test
246 of ./tests/basic/ec/self-heal.t. Running ls 2 here shows that
the result is not consistent: we do not have the same objects listed
on each try.


Have you applied 9117 ? When I was doing tests, it was not applied.


Re: [Gluster-devel] NetBSD regression tests

2014-11-18 Thread Emmanuel Dreyfus
On Tue, Nov 18, 2014 at 02:16:24PM +0100, Xavier Hernandez wrote:
 Have you applied 9117 ? When I was doing tests, it was not applied.

Yes, I am testing on my own setup to avoid breaking your experiments on
quota.t. I will resync everything to master to see if it improves.

--  
Emmanuel Dreyfus
m...@netbsd.org


Re: [Gluster-devel] NetBSD regression tests

2014-11-18 Thread Emmanuel Dreyfus
On Tue, Nov 18, 2014 at 02:16:24PM +0100, Xavier Hernandez wrote:
 Have you applied 9117 ? When I was doing tests, it was not applied.

Yes, it still fails with latest master (and changes 8982, 9071, 9075, 9097,
9133 and 9137).

-- 
Emmanuel Dreyfus
m...@netbsd.org


[Gluster-devel] NetBSD regression tests: patches to merge

2014-10-31 Thread Emmanuel Dreyfus
Hi

I need this one to be merged so that I can set up pre-commit basic
regression tests on NetBSD for master:
http://review.gluster.org/8936

While there, my life would be simpler if that big one could
be merged: http://review.gluster.org/9009


-- 
Emmanuel Dreyfus
m...@netbsd.org