Re: [Gluster-devel] [Gluster-Maintainers] Release 3.10 spurious(?) regression failures in the past week

2017-02-21 Thread Shyam

On 02/21/2017 12:42 PM, Atin Mukherjee wrote:



On Tue, Feb 21, 2017 at 9:47 PM, Shyam wrote:
3)
./tests/bugs/glusterd/bug-1303028-Rebalance-glusterd-rpc-connection-issue.t
- *Milind/Hari*, request you take a look at this
- This seems to have about 8 failures in the last week on master and
release-3.10
- The failure seems to stem from tier.rc:function rebalance_run_time
(line 133)?
- Logs follow,



http://lists.gluster.org/pipermail/gluster-devel/2017-February/052137.html

Hari did mention that he has identified the issue and will be sending a
patch soon.



Awesome! I missed reading/processing that earlier mail, thanks.
___
Gluster-devel mailing list
Gluster-devel@gluster.org
http://lists.gluster.org/mailman/listinfo/gluster-devel


Re: [Gluster-devel] [Gluster-Maintainers] Release 3.10 spurious(?) regression failures in the past week

2017-02-21 Thread Shyam

Update from week of: (2017-02-13 to 2017-02-21)

This week we have 3 problems to report from fstat, as follows:

1) ./tests/features/lock_revocation.t
- *Pranith*, request you take a look at this
- This seems to be hanging on CentOS runs causing *aborted* test runs
- Some of these test runs are,
  - https://build.gluster.org/job/centos6-regression/3256/console
  - https://build.gluster.org/job/centos6-regression/3196/console

2) tests/basic/quota-anon-fd-nfs.t
- This had one spurious failure in 3.10
- I think it is because the test does not check whether the NFS mount is 
available (which is anyway a good check to have in the test, to avoid 
spurious failures); a sketch of such a guard follows the links below

- I have filed and posted a fix for the same,
  - Bug: https://bugzilla.redhat.com/show_bug.cgi?id=1425515
  - Possible Fix: https://review.gluster.org/16701
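
For reference, a minimal sketch of that kind of guard in the style of our
.t tests (the helper names is_nfs_export_available and mount_nfs are assumed
from the test harness; the actual change is the one in the review above):

  # Wait until gNFS actually exports the volume before mounting it,
  # instead of racing the export and failing spuriously.
  EXPECT_WITHIN $NFS_EXPORT_TIMEOUT "1" is_nfs_export_available $V0
  TEST mount_nfs $H0:/$V0 $N0 nolock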

3) 
./tests/bugs/glusterd/bug-1303028-Rebalance-glusterd-rpc-connection-issue.t

- *Milind/Hari*, request you take a look at this
- This seems to have about 8 failures in the last week on master and 
release-3.10
- The failure seems to stem from tier.rc:function rebalance_run_time 
(line 133)? A sketch of the likely arithmetic issue follows the logs below.

- Logs follow,


  02:36:38 [10:36:38] Running tests in file 
./tests/bugs/glusterd/bug-1303028-Rebalance-glusterd-rpc-connection-issue.t

  02:36:45 No volumes present
  02:37:36 Tiering Migration Functionality: patchy: failed: Tier daemon 
is not running on volume patchy
  02:37:36 ./tests/bugs/glusterd/../../tier.rc: line 133: * 3600 +  * 
60 + : syntax error: operand expected (error token is "* 3600 +  * 60 + ")
  02:37:36 
./tests/bugs/glusterd/bug-1303028-Rebalance-glusterd-rpc-connection-issue.t: 
line 23: [: : integer expression expected
  02:37:41 Tiering Migration Functionality: patchy: failed: Tier daemon 
is not running on volume patchy
  02:37:41 ./tests/bugs/glusterd/../../tier.rc: line 133: * 3600 +  * 
60 + : syntax error: operand expected (error token is "* 3600 +  * 60 + ")
  02:37:41 
./tests/bugs/glusterd/bug-1303028-Rebalance-glusterd-rpc-connection-issue.t: 
line 23: [: : integer expression expected
  02:37:41 
./tests/bugs/glusterd/bug-1303028-Rebalance-glusterd-rpc-connection-issue.t: 
line 23: [: -: integer expression expected
  02:37:41 
./tests/bugs/glusterd/bug-1303028-Rebalance-glusterd-rpc-connection-issue.t 
..

  ...
  02:37:41 ok 14, LINENUM:69
  02:37:41 not ok 15 Got "1" instead of "0", LINENUM:70
  02:37:41 FAILED COMMAND: 0 tier_daemon_check
  02:37:41 not ok 16 Got "1" instead of "0", LINENUM:72
  02:37:41 FAILED COMMAND: 0 non_zero_check
  02:37:41 not ok 17 Got "1" instead of "0", LINENUM:75
  02:37:41 FAILED COMMAND: 0 non_zero_check
  02:37:41 not ok 18 Got "1" instead of "0", LINENUM:77
  02:37:41 FAILED COMMAND: 0 non_zero_check -
  02:37:41 Failed 4/18 subtests
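
The "operand expected" error above is what bash prints when the hour/minute/
second fields fed into an $(( ... )) expression are empty, which happens when
the rebalance/tier status output has no run time to parse (e.g. the tier
daemon is not running). A minimal sketch of the failure mode and a possible
guard (illustrative only, not the actual tier.rc code; the function and
variable names here are made up):

  # rebalance_run_time_sketch "0:01:23" -> prints 83
  # rebalance_run_time_sketch ""        -> prints 0 (instead of a syntax error)
  function rebalance_run_time_sketch ()
  {
      local runtime="$1"   # e.g. "0:01:23", or "" when the daemon is down
      local hh=$(echo "$runtime" | cut -d ':' -f1)
      local mm=$(echo "$runtime" | cut -d ':' -f2)
      local ss=$(echo "$runtime" | cut -d ':' -f3)

      # With runtime empty, hh/mm/ss are empty and
      # $(( hh * 3600 + mm * 60 + ss )) fails exactly as in the log.
      if [ -z "$hh" ] || [ -z "$mm" ] || [ -z "$ss" ]; then
          echo 0
          return 1
      fi

      # Force base-10 so fields like "08"/"09" are not parsed as octal.
      echo $(( 10#$hh * 3600 + 10#$mm * 60 + 10#$ss ))
  }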


Shyam

On 02/15/2017 09:25 AM, Shyam wrote:

Update from week of: (2017-02-06 to 2017-02-13)

No major failures to report this week, things look fine from a
regression suite failure stats perspective.

Do we have any updates on the older cores? Specifically,
  - https://build.gluster.org/job/centos6-regression/3046/consoleText
(./tests/basic/tier/tier.t -- tier rebalance)
  - https://build.gluster.org/job/centos6-regression/2963/consoleFull
(./tests/basic/volume-snapshot.t -- glusterd)

Shyam

On 02/06/2017 02:21 PM, Shyam wrote:

Update from week of: (2017-01-30 to 2017-02-06)

Failure stats and actions:

1) ./tests/basic/tier/tier.t
Core dump needs attention
https://build.gluster.org/job/centos6-regression/3046/consoleText

Looks like the tier rebalance process has crashed (see below for the
stack details)

2) ./tests/basic/ec/ec-background-heals.t
Marked as bad in master but not in release-3.10. Since it may cause unwanted
failures in 3.10, it has been marked as bad in 3.10 as well.

Commit: https://review.gluster.org/16549

3) ./tests/bitrot/bug-1373520.t
Marked as bad in master but not in release-3.10. Since it may cause unwanted
failures in 3.10, it has been marked as bad in 3.10 as well.

Commit: https://review.gluster.org/16549

Thanks,
Shyam

On 01/30/2017 03:00 PM, Shyam wrote:

Hi,

The following is a list of spurious(?) regression failures in the 3.10
branch last week (from fstat.gluster.org).

Request component owner or other devs to take a look at the failures,
and weed out real issues.

Regression failures 3.10:

Summary:
1) https://build.gluster.org/job/centos6-regression/2960/consoleFull
  ./tests/basic/ec/ec-background-heals.t

2) https://build.gluster.org/job/centos6-regression/2963/consoleFull
  
  ./tests/basic/volume-snapshot.t

3) https://build.gluster.org/job/netbsd7-regression/2694/consoleFull
  ./tests/basic/afr/self-heald.t

4) https://build.gluster.org/job/centos6-regression/2954/consoleFull
  ./tests/basic/tier/legacy-many.t

5) https://build.gluster.org/job/centos6-regression/2858/consoleFull
  ./tests/bugs/bitrot/bug-1245981.t

6) https://build.gluster.org/job/netbsd7-regression/2637/consoleFull
  ./tests/basic/afr/self-heal.t

7) https://build.gluster.org/job/netbsd7-regression/2624/consoleFull
  ./tests/encryption/crypt.t

Re: [Gluster-devel] [Gluster-Maintainers] Release 3.10 spurious(?) regression failures in the past week

2017-02-15 Thread Shyam

Update from week of: (2017-02-06 to 2017-02-13)

No major failures to report this week, things look fine from a 
regression suite failure stats perspective.


Do we have any updates on the older cores? Specifically,
  - https://build.gluster.org/job/centos6-regression/3046/consoleText 
(./tests/basic/tier/tier.t -- tier rebalance)
  - https://build.gluster.org/job/centos6-regression/2963/consoleFull 
(./tests/basic/volume-snapshot.t -- glusterd)


Shyam

On 02/06/2017 02:21 PM, Shyam wrote:

Update from week of: (2017-01-30 to 2017-02-06)

Failure stats and actions:

1) ./tests/basic/tier/tier.t
Core dump needs attention
https://build.gluster.org/job/centos6-regression/3046/consoleText

Looks like the tier rebalance process has crashed (see below for the
stack details)

2) ./tests/basic/ec/ec-background-heals.t
Marked as bad in master but not in release-3.10. Since it may cause unwanted
failures in 3.10, it has been marked as bad in 3.10 as well.

Commit: https://review.gluster.org/16549

3) ./tests/bitrot/bug-1373520.t
Marked as bad in master but not in release-3.10. Since it may cause unwanted
failures in 3.10, it has been marked as bad in 3.10 as well.

Commit: https://review.gluster.org/16549

Thanks,
Shyam

On 01/30/2017 03:00 PM, Shyam wrote:

Hi,

The following is a list of spurious(?) regression failures in the 3.10
branch last week (from fstat.gluster.org).

Request component owner or other devs to take a look at the failures,
and weed out real issues.

Regression failures 3.10:

Summary:
1) https://build.gluster.org/job/centos6-regression/2960/consoleFull
  ./tests/basic/ec/ec-background-heals.t

2) https://build.gluster.org/job/centos6-regression/2963/consoleFull
  
  ./tests/basic/volume-snapshot.t

3) https://build.gluster.org/job/netbsd7-regression/2694/consoleFull
  ./tests/basic/afr/self-heald.t

4) https://build.gluster.org/job/centos6-regression/2954/consoleFull
  ./tests/basic/tier/legacy-many.t

5) https://build.gluster.org/job/centos6-regression/2858/consoleFull
  ./tests/bugs/bitrot/bug-1245981.t

6) https://build.gluster.org/job/netbsd7-regression/2637/consoleFull
  ./tests/basic/afr/self-heal.t

7) https://build.gluster.org/job/netbsd7-regression/2624/consoleFull
  ./tests/encryption/crypt.t

Thanks,
Shyam


Core details from
https://build.gluster.org/job/centos6-regression/3046/consoleText

Core was generated by `/build/install/sbin/glusterfs -s localhost
--volfile-id tierd/patchy -p /var/li'.
Program terminated with signal 11, Segmentation fault.
#0  0x7ffb62c2c4c4 in __strchr_sse42 () from /lib64/libc.so.6

Thread 1 (Thread 0x7ffb5a169700 (LWP 467)):
#0  0x7ffb62c2c4c4 in __strchr_sse42 () from /lib64/libc.so.6
No symbol table info available.
#1  0x7ffb56b7789f in dht_filter_loc_subvol_key
(this=0x7ffb50015930, loc=0x7ffb2c002de4, new_loc=0x7ffb2c413f80,
subvol=0x7ffb2c413fc0) at
/home/jenkins/root/workspace/centos6-regression/xlators/cluster/dht/src/dht-helper.c:307

new_name = 0x0
new_path = 0x0
trav = 0x0
key = '\000' 
ret = 0
#2  0x7ffb56bb2ce4 in dht_lookup (frame=0x7ffb4c00623c,
this=0x7ffb50015930, loc=0x7ffb2c002de4, xattr_req=0x7ffb4c00949c) at
/home/jenkins/root/workspace/centos6-regression/xlators/cluster/dht/src/dht-common.c:2494

subvol = 0x0
hashed_subvol = 0x0
local = 0x7ffb4c00636c
conf = 0x7ffb5003f380
ret = -1
op_errno = -1
layout = 0x0
i = 0
call_cnt = 0
new_loc = {path = 0x0, name = 0x0, inode = 0x0, parent = 0x0,
gfid = '\000' , pargfid = '\000' }
__FUNCTION__ = "dht_lookup"
#3  0x7ffb63ff6f5c in syncop_lookup (subvol=0x7ffb50015930,
loc=0x7ffb2c002de4, iatt=0x7ffb2c415af0, parent=0x0,
xdata_in=0x7ffb4c00949c, xdata_out=0x7ffb2c415a50) at
/home/jenkins/root/workspace/centos6-regression/libglusterfs/src/syncop.c:1223

_new = 0x7ffb4c00623c
old_THIS = 0x7ffb50019490
tmp_cbk = 0x7ffb63ff69b3 
task = 0x7ffb2c009790
frame = 0x7ffb2c001b3c
args = {op_ret = 0, op_errno = 0, iatt1 = {ia_ino = 0, ia_gfid =
'\000' , ia_dev = 0, ia_type = IA_INVAL, ia_prot =
{suid = 0 '\000', sgid = 0 '\000', sticky = 0 '\000', owner = {read = 0
'\000', write = 0 '\000', exec = 0 '\000'}, group = {read = 0 '\000',
write = 0 '\000', exec = 0 '\000'}, other = {read = 0 '\000', write = 0
'\000', exec = 0 '\000'}}, ia_nlink = 0, ia_uid = 0, ia_gid = 0, ia_rdev
= 0, ia_size = 0, ia_blksize = 0, ia_blocks = 0, ia_atime = 0,
ia_atime_nsec = 0, ia_mtime = 0, ia_mtime_nsec = 0, ia_ctime = 0,
ia_ctime_nsec = 0}, iatt2 = {ia_ino = 0, ia_gfid = '\000' , ia_dev = 0, ia_type = IA_INVAL, ia_prot = {suid = 0 '\000', sgid
= 0 '\000', sticky = 0 '\000', owner = {read = 0 '\000', write = 0
'\000', exec = 0 '\000'}, group = {read = 0 '\000', write = 0 '\000',
exec = 0 '\000'}, other = {read = 0 '\000', write = 0 '\000', exec = 0
'\000'}}, ia_nlink = 0, ia_uid = 0, ia_gid = 0, ia_rdev = 0, ia_size =
0, ia_blksize = 0, 

Re: [Gluster-devel] [Gluster-Maintainers] Release 3.10 spurious(?) regression failures in the past week

2017-02-06 Thread Shyam

Update from week of: (2017-01-30 to 2017-02-06)

Failure stats and actions:

1) ./tests/basic/tier/tier.t
Core dump needs attention 
https://build.gluster.org/job/centos6-regression/3046/consoleText


Looks like the tier rebalance process has crashed (see below for the 
stack details)
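
For whoever picks this up, a minimal sketch of pulling the full backtrace out
of the core (the binary path is the one from the "Core was generated by" line
below; the core file path on the Jenkins slave is a placeholder):

  gdb /build/install/sbin/glusterfs /path/to/the/core \
      -ex 'set pagination off' \
      -ex 'thread apply all bt full' \
      -ex quit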


2) ./tests/basic/ec/ec-background-heals.t
Marked as bad in master but not in release-3.10. Since it may cause unwanted
failures in 3.10, it has been marked as bad in 3.10 as well.


Commit: https://review.gluster.org/16549

3) ./tests/bitrot/bug-1373520.t
Marked as bad in master but not in release-3.10. Since it may cause unwanted
failures in 3.10, it has been marked as bad in 3.10 as well.


Commit: https://review.gluster.org/16549

Thanks,
Shyam

On 01/30/2017 03:00 PM, Shyam wrote:

Hi,

The following is a list of spurious(?) regression failures in the 3.10
branch last week (from fstat.gluster.org).

Request component owner or other devs to take a look at the failures,
and weed out real issues.

Regression failures 3.10:

Summary:
1) https://build.gluster.org/job/centos6-regression/2960/consoleFull
  ./tests/basic/ec/ec-background-heals.t

2) https://build.gluster.org/job/centos6-regression/2963/consoleFull
  
  ./tests/basic/volume-snapshot.t

3) https://build.gluster.org/job/netbsd7-regression/2694/consoleFull
  ./tests/basic/afr/self-heald.t

4) https://build.gluster.org/job/centos6-regression/2954/consoleFull
  ./tests/basic/tier/legacy-many.t

5) https://build.gluster.org/job/centos6-regression/2858/consoleFull
  ./tests/bugs/bitrot/bug-1245981.t

6) https://build.gluster.org/job/netbsd7-regression/2637/consoleFull
  ./tests/basic/afr/self-heal.t

7) https://build.gluster.org/job/netbsd7-regression/2624/consoleFull
  ./tests/encryption/crypt.t

Thanks,
Shyam


Core details from 
https://build.gluster.org/job/centos6-regression/3046/consoleText


Core was generated by `/build/install/sbin/glusterfs -s localhost 
--volfile-id tierd/patchy -p /var/li'.

Program terminated with signal 11, Segmentation fault.
#0  0x7ffb62c2c4c4 in __strchr_sse42 () from /lib64/libc.so.6

Thread 1 (Thread 0x7ffb5a169700 (LWP 467)):
#0  0x7ffb62c2c4c4 in __strchr_sse42 () from /lib64/libc.so.6
No symbol table info available.
#1  0x7ffb56b7789f in dht_filter_loc_subvol_key 
(this=0x7ffb50015930, loc=0x7ffb2c002de4, new_loc=0x7ffb2c413f80, 
subvol=0x7ffb2c413fc0) at 
/home/jenkins/root/workspace/centos6-regression/xlators/cluster/dht/src/dht-helper.c:307

new_name = 0x0
new_path = 0x0
trav = 0x0
key = '\000' 
ret = 0
#2  0x7ffb56bb2ce4 in dht_lookup (frame=0x7ffb4c00623c, 
this=0x7ffb50015930, loc=0x7ffb2c002de4, xattr_req=0x7ffb4c00949c) at 
/home/jenkins/root/workspace/centos6-regression/xlators/cluster/dht/src/dht-common.c:2494

subvol = 0x0
hashed_subvol = 0x0
local = 0x7ffb4c00636c
conf = 0x7ffb5003f380
ret = -1
op_errno = -1
layout = 0x0
i = 0
call_cnt = 0
new_loc = {path = 0x0, name = 0x0, inode = 0x0, parent = 0x0, 
gfid = '\000' , pargfid = '\000' }

__FUNCTION__ = "dht_lookup"
#3  0x7ffb63ff6f5c in syncop_lookup (subvol=0x7ffb50015930, 
loc=0x7ffb2c002de4, iatt=0x7ffb2c415af0, parent=0x0, 
xdata_in=0x7ffb4c00949c, xdata_out=0x7ffb2c415a50) at 
/home/jenkins/root/workspace/centos6-regression/libglusterfs/src/syncop.c:1223

_new = 0x7ffb4c00623c
old_THIS = 0x7ffb50019490
tmp_cbk = 0x7ffb63ff69b3 
task = 0x7ffb2c009790
frame = 0x7ffb2c001b3c
args = {op_ret = 0, op_errno = 0, iatt1 = {ia_ino = 0, ia_gfid 
= '\000' , ia_dev = 0, ia_type = IA_INVAL, ia_prot = 
{suid = 0 '\000', sgid = 0 '\000', sticky = 0 '\000', owner = {read = 0 
'\000', write = 0 '\000', exec = 0 '\000'}, group = {read = 0 '\000', 
write = 0 '\000', exec = 0 '\000'}, other = {read = 0 '\000', write = 0 
'\000', exec = 0 '\000'}}, ia_nlink = 0, ia_uid = 0, ia_gid = 0, ia_rdev 
= 0, ia_size = 0, ia_blksize = 0, ia_blocks = 0, ia_atime = 0, 
ia_atime_nsec = 0, ia_mtime = 0, ia_mtime_nsec = 0, ia_ctime = 0, 
ia_ctime_nsec = 0}, iatt2 = {ia_ino = 0, ia_gfid = '\000' , ia_dev = 0, ia_type = IA_INVAL, ia_prot = {suid = 0 '\000', sgid 
= 0 '\000', sticky = 0 '\000', owner = {read = 0 '\000', write = 0 
'\000', exec = 0 '\000'}, group = {read = 0 '\000', write = 0 '\000', 
exec = 0 '\000'}, other = {read = 0 '\000', write = 0 '\000', exec = 0 
'\000'}}, ia_nlink = 0, ia_uid = 0, ia_gid = 0, ia_rdev = 0, ia_size = 
0, ia_blksize = 0, ia_blocks = 0, ia_atime = 0, ia_atime_nsec = 0, 
ia_mtime = 0, ia_mtime_nsec = 0, ia_ctime = 0, ia_ctime_nsec = 0}, xattr 
= 0x0, statvfs_buf = {f_bsize = 0, f_frsize = 0, f_blocks = 0, f_bfree = 
0, f_bavail = 0, f_files = 0, f_ffree = 0, f_favail = 0, f_fsid = 0, 
f_flag = 0, f_namemax = 0, __f_spare = {0, 0, 0, 0, 0, 0}}, vector = 
0x0, count = 0, iobref = 0x0, buffer = 0x0, xdata = 0x0, flock = {l_type 
= 0, l_whence = 0, l_start = 0, l_len = 0,