Re: [Gluster-devel] [Gluster-Maintainers] Release 3.10 spurious(?) regression failures in the past week
On 02/21/2017 12:42 PM, Atin Mukherjee wrote:
> On Tue, Feb 21, 2017 at 9:47 PM, Shyam wrote:
>> 3) ./tests/bugs/glusterd/bug-1303028-Rebalance-glusterd-rpc-connection-issue.t
>>    - *Milind/Hari*, request you take a look at this
>>    - This seems to have about 8 failures in the last week on master and release-3.10
>>    - The failure seems to stem from tier.rc:function rebalance_run_time (line 133)?
>>    - Logs follow, http://lists.gluster.org/pipermail/gluster-devel/2017-February/052137.html
>
> Hari did mention that he has identified the issue and will be sending a patch soon.

Awesome! I missed reading/processing that earlier mail, thanks.

_______________________________________________
Gluster-devel mailing list
Gluster-devel@gluster.org
http://lists.gluster.org/mailman/listinfo/gluster-devel
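The rebalance_run_time failure at tier.rc line 133 is consistent with bash arithmetic built by direct substitution of empty variables: when the tier daemon is down, the parsed status fields are empty strings, so the expression degenerates to bare operators. A hypothetical reconstruction; the defaulting helper below is illustrative, not Hari's actual patch:

```shell
#!/bin/bash
# Empty status fields substituted straight into $(( )) reproduce the
# reported error shape (shown commented out so this script still runs):
h=""; m=""; s=""
#   total=$(( $h * 3600 + $m * 60 + $s ))
#   -> "* 3600 +  * 60 + : syntax error: operand expected"

# Illustrative fix: default empty fields to 0 before computing.
rebalance_run_time_safe () {
        echo $(( ${1:-0} * 3600 + ${2:-0} * 60 + ${3:-0} ))
}

rebalance_run_time_safe "$h" "$m" "$s"   # prints 0
rebalance_run_time_safe 1 2 3            # prints 3723
```

With the guarded form the function always emits an integer, so downstream `[ ... ]` integer comparisons in the .t file no longer see an empty operand.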
Re: [Gluster-devel] [Gluster-Maintainers] Release 3.10 spurious(?) regression failures in the past week
Update from week of: (2017-02-13 to 2017-02-21)

This week we have 3 problems from fstat to report, as follows,

1) ./tests/features/lock_revocation.t
   - *Pranith*, request you take a look at this
   - This seems to be hanging on CentOS runs causing *aborted* test runs
   - Some of these test runs are,
     - https://build.gluster.org/job/centos6-regression/3256/console
     - https://build.gluster.org/job/centos6-regression/3196/console

2) tests/basic/quota-anon-fd-nfs.t
   - This had one spurious failure in 3.10
   - I think it is because of not checking if the NFS mount is available (which is anyway a good check to have in the test to avoid spurious failures)
   - I have filed and posted a fix for the same,
     - Bug: https://bugzilla.redhat.com/show_bug.cgi?id=1425515
     - Possible Fix: https://review.gluster.org/16701

3) ./tests/bugs/glusterd/bug-1303028-Rebalance-glusterd-rpc-connection-issue.t
   - *Milind/Hari*, request you take a look at this
   - This seems to have about 8 failures in the last week on master and release-3.10
   - The failure seems to stem from tier.rc:function rebalance_run_time (line 133)?
   - Logs follow,

02:36:38 [10:36:38] Running tests in file ./tests/bugs/glusterd/bug-1303028-Rebalance-glusterd-rpc-connection-issue.t
02:36:45 No volumes present
02:37:36 Tiering Migration Functionality: patchy: failed: Tier daemon is not running on volume patchy
02:37:36 ./tests/bugs/glusterd/../../tier.rc: line 133: * 3600 + * 60 + : syntax error: operand expected (error token is "* 3600 + * 60 + ")
02:37:36 ./tests/bugs/glusterd/bug-1303028-Rebalance-glusterd-rpc-connection-issue.t: line 23: [: : integer expression expected
02:37:41 Tiering Migration Functionality: patchy: failed: Tier daemon is not running on volume patchy
02:37:41 ./tests/bugs/glusterd/../../tier.rc: line 133: * 3600 + * 60 + : syntax error: operand expected (error token is "* 3600 + * 60 + ")
02:37:41 ./tests/bugs/glusterd/bug-1303028-Rebalance-glusterd-rpc-connection-issue.t: line 23: [: : integer expression expected
02:37:41 ./tests/bugs/glusterd/bug-1303028-Rebalance-glusterd-rpc-connection-issue.t: line 23: [: -: integer expression expected
02:37:41 ./tests/bugs/glusterd/bug-1303028-Rebalance-glusterd-rpc-connection-issue.t ..
...
02:37:41 ok 14, LINENUM:69
02:37:41 not ok 15 Got "1" instead of "0", LINENUM:70
02:37:41 FAILED COMMAND: 0 tier_daemon_check
02:37:41 not ok 16 Got "1" instead of "0", LINENUM:72
02:37:41 FAILED COMMAND: 0 non_zero_check
02:37:41 not ok 17 Got "1" instead of "0", LINENUM:75
02:37:41 FAILED COMMAND: 0 non_zero_check
02:37:41 not ok 18 Got "1" instead of "0", LINENUM:77
02:37:41 FAILED COMMAND: 0 non_zero_check
02:37:41 Failed 4/18 subtests

Shyam

On 02/15/2017 09:25 AM, Shyam wrote:

Update from week of: (2017-02-06 to 2017-02-13)

No major failures to report this week, things look fine from a regression suite failure stats perspective.

Do we have any updates on the older cores?
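Item 2 above suggests verifying that the NFS mount is actually available before the test exercises it. A minimal sketch of such a check, assuming /proc/mounts polling; the helper name and retry scheme are illustrative and not the code in https://review.gluster.org/16701:

```shell
#!/bin/bash
# Poll /proc/mounts until the given mount point appears, or give up
# after N tries (default 5, one second apart).  Returns 0 when the
# mount is present, 1 otherwise.
wait_for_mount () {
        local mnt="$1" tries="${2:-5}"
        while [ "$tries" -gt 0 ]; do
                grep -q " $mnt " /proc/mounts && return 0
                tries=$(( tries - 1 ))
                sleep 1
        done
        return 1
}
```

A test would then guard its NFS I/O with something like `wait_for_mount /mnt/nfs 30 || exit 1`, turning a race into a deterministic wait.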
Specifically,
- https://build.gluster.org/job/centos6-regression/3046/consoleText (./tests/basic/tier/tier.t -- tier rebalance)
- https://build.gluster.org/job/centos6-regression/2963/consoleFull (./tests/basic/volume-snapshot.t -- glusterd)

Shyam

On 02/06/2017 02:21 PM, Shyam wrote:

Update from week of: (2017-01-30 to 2017-02-06)

Failure stats and actions:

1) ./tests/basic/tier/tier.t
   Core dump needs attention
   https://build.gluster.org/job/centos6-regression/3046/consoleText
   Looks like the tier rebalance process has crashed (see below for the stack details)

2) ./tests/basic/ec/ec-background-heals.t
   Marked as bad in master, not in release-3.10. May cause unwanted failures in 3.10 and as a result marked this as bad in 3.10 as well.
   Commit: https://review.gluster.org/16549

3) ./tests/bitrot/bug-1373520.t
   Marked as bad in master, not in release-3.10. May cause unwanted failures in 3.10 and as a result marked this as bad in 3.10 as well.
   Commit: https://review.gluster.org/16549

Thanks,
Shyam

On 01/30/2017 03:00 PM, Shyam wrote:

Hi,

The following is a list of spurious(?) regression failures in the 3.10 branch last week (from fstat.gluster.org). Request component owners or other devs to take a look at the failures, and weed out real issues.

Regression failures 3.10:

Summary:
1) https://build.gluster.org/job/centos6-regression/2960/consoleFull
   ./tests/basic/ec/ec-background-heals.t
2) https://build.gluster.org/job/centos6-regression/2963/consoleFull
   ./tests/basic/volume-snapshot.t
3) https://build.gluster.org/job/netbsd7-regression/2694/consoleFull
   ./tests/basic/afr/self-heald.t
4) https://build.gluster.org/job/centos6-regression/2954/consoleFull
   ./tests/basic/tier/legacy-many.t
5) https://build.gluster.org/job/centos6-regression/2858/consoleFull
   ./tests/bugs/bitrot/bug-1245981.t
6) https://build.gluster.org/job/netbsd7-regression/2637/consoleFull
   ./tests/basic/afr/self-heal.t
7) https://build.gluster.org/job/netbsd7-regression/2624/consoleFull
   ./tests/encryption/crypt.t
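The updates above mark flaky tests as "bad" so that the regression harness skips them instead of failing runs. A minimal sketch of that pattern; the list mirrors the two tests marked bad above, but the variable and helper names are illustrative, not Gluster's actual run-tests.sh code:

```shell
#!/bin/bash
# Hypothetical known-bad-test filter: tests in BAD_TESTS are skipped,
# everything else runs.  Names are illustrative only.
BAD_TESTS="./tests/basic/ec/ec-background-heals.t
./tests/bitrot/bug-1373520.t"

is_bad_test () {
        local t
        for t in $BAD_TESTS; do
                [ "$t" = "$1" ] && return 0
        done
        return 1
}

for t in ./tests/bitrot/bug-1373520.t ./tests/basic/afr/self-heald.t; do
        if is_bad_test "$t"; then
                echo "SKIP $t"
        else
                echo "RUN  $t"
        fi
done
```

Keeping the skip list in one place makes "marked as bad in master, not in release-3.10" an easy one-line diff per branch.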
Re: [Gluster-devel] [Gluster-Maintainers] Release 3.10 spurious(?) regression failures in the past week
Update from week of: (2017-02-06 to 2017-02-13)

No major failures to report this week, things look fine from a regression suite failure stats perspective.

Do we have any updates on the older cores? Specifically,
- https://build.gluster.org/job/centos6-regression/3046/consoleText (./tests/basic/tier/tier.t -- tier rebalance)
- https://build.gluster.org/job/centos6-regression/2963/consoleFull (./tests/basic/volume-snapshot.t -- glusterd)

Shyam

On 02/06/2017 02:21 PM, Shyam wrote:

Update from week of: (2017-01-30 to 2017-02-06)

Failure stats and actions:

1) ./tests/basic/tier/tier.t
   Core dump needs attention
   https://build.gluster.org/job/centos6-regression/3046/consoleText
   Looks like the tier rebalance process has crashed (see below for the stack details)

2) ./tests/basic/ec/ec-background-heals.t
   Marked as bad in master, not in release-3.10. May cause unwanted failures in 3.10 and as a result marked this as bad in 3.10 as well.
   Commit: https://review.gluster.org/16549

3) ./tests/bitrot/bug-1373520.t
   Marked as bad in master, not in release-3.10. May cause unwanted failures in 3.10 and as a result marked this as bad in 3.10 as well.
   Commit: https://review.gluster.org/16549

Thanks,
Shyam

On 01/30/2017 03:00 PM, Shyam wrote:

Hi,

The following is a list of spurious(?) regression failures in the 3.10 branch last week (from fstat.gluster.org). Request component owners or other devs to take a look at the failures, and weed out real issues.
Regression failures 3.10:

Summary:
1) https://build.gluster.org/job/centos6-regression/2960/consoleFull
   ./tests/basic/ec/ec-background-heals.t
2) https://build.gluster.org/job/centos6-regression/2963/consoleFull
   ./tests/basic/volume-snapshot.t
3) https://build.gluster.org/job/netbsd7-regression/2694/consoleFull
   ./tests/basic/afr/self-heald.t
4) https://build.gluster.org/job/centos6-regression/2954/consoleFull
   ./tests/basic/tier/legacy-many.t
5) https://build.gluster.org/job/centos6-regression/2858/consoleFull
   ./tests/bugs/bitrot/bug-1245981.t
6) https://build.gluster.org/job/netbsd7-regression/2637/consoleFull
   ./tests/basic/afr/self-heal.t
7) https://build.gluster.org/job/netbsd7-regression/2624/consoleFull
   ./tests/encryption/crypt.t

Thanks,
Shyam

Core details from https://build.gluster.org/job/centos6-regression/3046/consoleText

Core was generated by `/build/install/sbin/glusterfs -s localhost --volfile-id tierd/patchy -p /var/li'.
Program terminated with signal 11, Segmentation fault.
#0  0x7ffb62c2c4c4 in __strchr_sse42 () from /lib64/libc.so.6

Thread 1 (Thread 0x7ffb5a169700 (LWP 467)):
#0  0x7ffb62c2c4c4 in __strchr_sse42 () from /lib64/libc.so.6
No symbol table info available.
#1  0x7ffb56b7789f in dht_filter_loc_subvol_key (this=0x7ffb50015930, loc=0x7ffb2c002de4, new_loc=0x7ffb2c413f80, subvol=0x7ffb2c413fc0)
    at /home/jenkins/root/workspace/centos6-regression/xlators/cluster/dht/src/dht-helper.c:307
        new_name = 0x0
        new_path = 0x0
        trav = 0x0
        key = '\000'
        ret = 0
#2  0x7ffb56bb2ce4 in dht_lookup (frame=0x7ffb4c00623c, this=0x7ffb50015930, loc=0x7ffb2c002de4, xattr_req=0x7ffb4c00949c)
    at /home/jenkins/root/workspace/centos6-regression/xlators/cluster/dht/src/dht-common.c:2494
        subvol = 0x0
        hashed_subvol = 0x0
        local = 0x7ffb4c00636c
        conf = 0x7ffb5003f380
        ret = -1
        op_errno = -1
        layout = 0x0
        i = 0
        call_cnt = 0
        new_loc = {path = 0x0, name = 0x0, inode = 0x0, parent = 0x0, gfid = '\000', pargfid = '\000'}
        __FUNCTION__ = "dht_lookup"
#3  0x7ffb63ff6f5c in syncop_lookup (subvol=0x7ffb50015930, loc=0x7ffb2c002de4, iatt=0x7ffb2c415af0, parent=0x0, xdata_in=0x7ffb4c00949c, xdata_out=0x7ffb2c415a50)
    at /home/jenkins/root/workspace/centos6-regression/libglusterfs/src/syncop.c:1223
        _new = 0x7ffb4c00623c
        old_THIS = 0x7ffb50019490
        tmp_cbk = 0x7ffb63ff69b3
        task = 0x7ffb2c009790
        frame = 0x7ffb2c001b3c
        args = {op_ret = 0, op_errno = 0, iatt1 = {ia_ino = 0, ia_gfid = '\000', ia_dev = 0, ia_type = IA_INVAL, ia_prot = {suid = 0 '\000', sgid = 0 '\000', sticky = 0 '\000', owner = {read = 0 '\000', write = 0 '\000', exec = 0 '\000'}, group = {read = 0 '\000', write = 0 '\000', exec = 0 '\000'}, other = {read = 0 '\000', write = 0 '\000', exec = 0 '\000'}}, ia_nlink = 0, ia_uid = 0, ia_gid = 0, ia_rdev = 0, ia_size = 0, ia_blksize = 0, ia_blocks = 0, ia_atime = 0, ia_atime_nsec = 0, ia_mtime = 0, ia_mtime_nsec = 0, ia_ctime = 0, ia_ctime_nsec = 0}, iatt2 = {ia_ino = 0, ia_gfid = '\000', ia_dev = 0, ia_type = IA_INVAL, ia_prot = {suid = 0 '\000', sgid = 0 '\000', sticky = 0 '\000', owner = {read = 0 '\000', write = 0 '\000', exec = 0 '\000'}, group = {read = 0 '\000', write = 0 '\000', exec = 0 '\000'}, other = {read = 0 '\000', write = 0 '\000', exec = 0 '\000'}}, ia_nlink = 0, ia_uid = 0, ia_gid = 0, ia_rdev = 0, ia_size = 0, ia_blksize = 0,
Re: [Gluster-devel] [Gluster-Maintainers] Release 3.10 spurious(?) regression failures in the past week
Update from week of: (2017-01-30 to 2017-02-06)

Failure stats and actions:

1) ./tests/basic/tier/tier.t
   Core dump needs attention
   https://build.gluster.org/job/centos6-regression/3046/consoleText
   Looks like the tier rebalance process has crashed (see below for the stack details)

2) ./tests/basic/ec/ec-background-heals.t
   Marked as bad in master, not in release-3.10. May cause unwanted failures in 3.10 and as a result marked this as bad in 3.10 as well.
   Commit: https://review.gluster.org/16549

3) ./tests/bitrot/bug-1373520.t
   Marked as bad in master, not in release-3.10. May cause unwanted failures in 3.10 and as a result marked this as bad in 3.10 as well.
   Commit: https://review.gluster.org/16549

Thanks,
Shyam

On 01/30/2017 03:00 PM, Shyam wrote:

Hi,

The following is a list of spurious(?) regression failures in the 3.10 branch last week (from fstat.gluster.org). Request component owners or other devs to take a look at the failures, and weed out real issues.

Regression failures 3.10:

Summary:
1) https://build.gluster.org/job/centos6-regression/2960/consoleFull
   ./tests/basic/ec/ec-background-heals.t
2) https://build.gluster.org/job/centos6-regression/2963/consoleFull
   ./tests/basic/volume-snapshot.t
3) https://build.gluster.org/job/netbsd7-regression/2694/consoleFull
   ./tests/basic/afr/self-heald.t
4) https://build.gluster.org/job/centos6-regression/2954/consoleFull
   ./tests/basic/tier/legacy-many.t
5) https://build.gluster.org/job/centos6-regression/2858/consoleFull
   ./tests/bugs/bitrot/bug-1245981.t
6) https://build.gluster.org/job/netbsd7-regression/2637/consoleFull
   ./tests/basic/afr/self-heal.t
7) https://build.gluster.org/job/netbsd7-regression/2624/consoleFull
   ./tests/encryption/crypt.t

Thanks,
Shyam

Core details from https://build.gluster.org/job/centos6-regression/3046/consoleText

Core was generated by `/build/install/sbin/glusterfs -s localhost --volfile-id tierd/patchy -p /var/li'.
Program terminated with signal 11, Segmentation fault.
#0  0x7ffb62c2c4c4 in __strchr_sse42 () from /lib64/libc.so.6

Thread 1 (Thread 0x7ffb5a169700 (LWP 467)):
#0  0x7ffb62c2c4c4 in __strchr_sse42 () from /lib64/libc.so.6
No symbol table info available.
#1  0x7ffb56b7789f in dht_filter_loc_subvol_key (this=0x7ffb50015930, loc=0x7ffb2c002de4, new_loc=0x7ffb2c413f80, subvol=0x7ffb2c413fc0)
    at /home/jenkins/root/workspace/centos6-regression/xlators/cluster/dht/src/dht-helper.c:307
        new_name = 0x0
        new_path = 0x0
        trav = 0x0
        key = '\000'
        ret = 0
#2  0x7ffb56bb2ce4 in dht_lookup (frame=0x7ffb4c00623c, this=0x7ffb50015930, loc=0x7ffb2c002de4, xattr_req=0x7ffb4c00949c)
    at /home/jenkins/root/workspace/centos6-regression/xlators/cluster/dht/src/dht-common.c:2494
        subvol = 0x0
        hashed_subvol = 0x0
        local = 0x7ffb4c00636c
        conf = 0x7ffb5003f380
        ret = -1
        op_errno = -1
        layout = 0x0
        i = 0
        call_cnt = 0
        new_loc = {path = 0x0, name = 0x0, inode = 0x0, parent = 0x0, gfid = '\000', pargfid = '\000'}
        __FUNCTION__ = "dht_lookup"
#3  0x7ffb63ff6f5c in syncop_lookup (subvol=0x7ffb50015930, loc=0x7ffb2c002de4, iatt=0x7ffb2c415af0, parent=0x0, xdata_in=0x7ffb4c00949c, xdata_out=0x7ffb2c415a50)
    at /home/jenkins/root/workspace/centos6-regression/libglusterfs/src/syncop.c:1223
        _new = 0x7ffb4c00623c
        old_THIS = 0x7ffb50019490
        tmp_cbk = 0x7ffb63ff69b3
        task = 0x7ffb2c009790
        frame = 0x7ffb2c001b3c
        args = {op_ret = 0, op_errno = 0, iatt1 = {ia_ino = 0, ia_gfid = '\000' <repeats 15 times>, ia_dev = 0, ia_type = IA_INVAL, ia_prot = {suid = 0 '\000', sgid = 0 '\000', sticky = 0 '\000', owner = {read = 0 '\000', write = 0 '\000', exec = 0 '\000'}, group = {read = 0 '\000', write = 0 '\000', exec = 0 '\000'}, other = {read = 0 '\000', write = 0 '\000', exec = 0 '\000'}}, ia_nlink = 0, ia_uid = 0, ia_gid = 0, ia_rdev = 0, ia_size = 0, ia_blksize = 0, ia_blocks = 0, ia_atime = 0, ia_atime_nsec = 0, ia_mtime = 0, ia_mtime_nsec = 0, ia_ctime = 0, ia_ctime_nsec = 0}, iatt2 = {ia_ino = 0, ia_gfid = '\000' <repeats 15 times>, ia_dev = 0, ia_type = IA_INVAL, ia_prot = {suid = 0 '\000', sgid = 0 '\000', sticky = 0 '\000', owner = {read = 0 '\000', write = 0 '\000', exec = 0 '\000'}, group = {read = 0 '\000', write = 0 '\000', exec = 0 '\000'}, other = {read = 0 '\000', write = 0 '\000', exec = 0 '\000'}}, ia_nlink = 0, ia_uid = 0, ia_gid = 0, ia_rdev = 0, ia_size = 0, ia_blksize = 0, ia_blocks = 0, ia_atime = 0, ia_atime_nsec = 0, ia_mtime = 0, ia_mtime_nsec = 0, ia_ctime = 0, ia_ctime_nsec = 0}, xattr = 0x0, statvfs_buf = {f_bsize = 0, f_frsize = 0, f_blocks = 0, f_bfree = 0, f_bavail = 0, f_files = 0, f_ffree = 0, f_favail = 0, f_fsid = 0, f_flag = 0, f_namemax = 0, __f_spare = {0, 0, 0, 0, 0, 0}}, vector = 0x0, count = 0, iobref = 0x0, buffer = 0x0, xdata = 0x0, flock = {l_type = 0, l_whence = 0, l_start = 0, l_len = 0,