Re: [Gluster-devel] Test: ./tests/bugs/ec/bug-1236065.t

2018-08-07 Thread Mohit Agrawal
I have posted a patch https://review.gluster.org/#/c/20657/ and started
a brick-mux regression to validate the patch.

Thanks
Mohit Agrawal

On Wed, Aug 8, 2018 at 7:22 AM, Atin Mukherjee  wrote:

> +Mohit
>
> Requesting Mohit for help.
>
> On Wed, 8 Aug 2018 at 06:53, Shyam Ranganathan 
> wrote:
>
>> On 08/07/2018 07:37 PM, Shyam Ranganathan wrote:
>> > 5) Current test failures
>> > We still have the following tests failing and some without any RCA or
>> > attention, (If something is incorrect, write back).
>> >
>> > ./tests/bugs/ec/bug-1236065.t (Ashish)
>>
>> Ashish/Atin, the above test failed in run:
>> https://build.gluster.org/job/regression-on-demand-multiplex/172/consoleFull
>>
>> The above run is based on patchset 4 of
>> https://review.gluster.org/#/c/20637/4
>>
>> The logs look as below. As Ashish is unable to reproduce this, and all
>> failures are on line 78 with an outstanding heal count of 105, this run
>> may provide some possibilities for narrowing it down.
>>
>> The problem seems to be glustershd not connecting to one of the bricks
>> that is restarted, and hence failing to heal that brick. This also looks
>> like what Ravi RCAd for the test: ./tests/bugs/replicate/bug-1363721.t
>>
>> ==
>> Test times from: cat ./glusterd.log | grep TEST
>> [2018-08-06 20:56:28.177386]:++
>> G_LOG:./tests/bugs/ec/bug-1236065.t: TEST: 77 gluster --mode=script
>> --wignore volume heal patchy full ++
>> [2018-08-06 20:56:28.767209]:++
>> G_LOG:./tests/bugs/ec/bug-1236065.t: TEST: 78 ^0$ get_pending_heal_count
>> patchy ++
>> [2018-08-06 20:57:48.957136]:++
>> G_LOG:./tests/bugs/ec/bug-1236065.t: TEST: 80 rm -f 0.o 10.o 11.o 12.o
>> 13.o 14.o 15.o 16.o 17.o 18.o 19.o 1.o 2.o 3.o 4.o 5.o 6.o 7.o 8.o 9.o
>> ++
>> ==
>> Repeated connection failure to client-3 in glustershd.log:
>> [2018-08-06 20:56:30.218482] I [rpc-clnt.c:2087:rpc_clnt_reconfig]
>> 0-patchy-client-3: changing port to 49152 (from 0)
>> [2018-08-06 20:56:30.222738] W [MSGID: 114043]
>> [client-handshake.c:1061:client_setvolume_cbk] 0-patchy-client-3: failed
>> to set the volume [Resource temporarily unavailable]
>> [2018-08-06 20:56:30.222788] W [MSGID: 114007]
>> [client-handshake.c:1090:client_setvolume_cbk] 0-patchy-client-3: failed
>> to get 'process-uuid' from reply dict [Invalid argument]
>> [2018-08-06 20:56:30.222813] E [MSGID: 114044]
>> [client-handshake.c:1096:client_setvolume_cbk] 0-patchy-client-3:
>> SETVOLUME on remote-host failed: cleanup flag is set for xlator.  Try
>> again later [Resource temporarily unavailable]
>> [2018-08-06 20:56:30.222845] I [MSGID: 114051]
>> [client-handshake.c:1201:client_setvolume_cbk] 0-patchy-client-3:
>> sending CHILD_CONNECTING event
>> [2018-08-06 20:56:30.222919] I [MSGID: 114018]
>> [client.c:2255:client_rpc_notify] 0-patchy-client-3: disconnected from
>> patchy-client-3. Client process will keep trying to connect to glusterd
>> until brick's port is available
>> ==
>> Repeated connection messages close to above retries in
>> d-backends-patchy0.log:
>> [2018-08-06 20:56:38.530009] I [addr.c:55:compare_addr_and_update]
>> 0-/d/backends/patchy0: allowed = "*", received addr = "127.0.0.1"
>> [2018-08-06 20:56:38.530044] I [login.c:111:gf_auth] 0-auth/login:
>> allowed user names: 756f302a-66eb-4cc0-8f91-797183312f05
>> The message "I [MSGID: 101016] [glusterfs3.h:739:dict_to_xdr] 0-dict:
>> key 'trusted.ec.version' is would not be sent on wire in future [Invalid
>> argument]" repeated 6 times between [2018-08-06 20:56:37.931040] and
>>  [2018-08-06 20:56:37.933084]
>> [2018-08-06 20:56:38.530067] I [MSGID: 115029]
>> [server-handshake.c:786:server_setvolume] 0-patchy-server: accepted
>> client from
>> CTX_ID:cb3b4fed-62a4-4ad5-8b92-97838c651b22-GRAPH_ID:0-PID:10506-HOST:builder104.cloud.gluster.org-PC_NAME:patchy-client-0-RECON_NO:-0 (version: 4.2dev)
>> [2018-08-06 20:56:38.540499] I [addr.c:55:compare_addr_and_update]
>> 0-/d/backends/patchy1: allowed = "*", received addr = "127.0.0.1"
>> [2018-08-06 20:56:38.540533] I [login.c:111:gf_auth] 0-auth/login:
>> allowed user names: 756f302a-66eb-4cc0-8f91-797183312f05
>> [2018-08-06 20:56:38.540555] I [MSGID: 115029]
>> [server-handshake.c:786:server_setvolume] 0-patchy-server: accepted
>> client from
>> CTX_ID:cb3b4fed-62a4-4ad5-8b92-97838c651b22-GRAPH_ID:0-PID:10506-HOST:builder104.cloud.gluster.org-PC_NAME:patchy-client-1-RECON_NO:-0 (version: 4.2dev)
>> [2018-08-06 20:56:38.552442] I [addr.c:55:compare_addr_and_update]
>> 0-/d/backends/patchy2: allowed = "*", received addr = "127.0.0.1"
>> [2018-08-06 20:56:38.552472] I [login.c:111:gf_auth] 0-auth/login:
>> allowed user names: 756f302a-66eb-4cc0-8f91-797183312f05
>> [2018-08-06 20:56:38.552494] I [MSGID: 115029]
>> [server-handshake.c:786:server_setvolume] 0-patchy-server: 

Re: [Gluster-devel] Test: ./tests/bugs/ec/bug-1236065.t

2018-08-07 Thread Atin Mukherjee
+Mohit

Requesting Mohit for help.

On Wed, 8 Aug 2018 at 06:53, Shyam Ranganathan  wrote:

> On 08/07/2018 07:37 PM, Shyam Ranganathan wrote:
> > 5) Current test failures
> > We still have the following tests failing and some without any RCA or
> > attention, (If something is incorrect, write back).
> >
> > ./tests/bugs/ec/bug-1236065.t (Ashish)
>
> Ashish/Atin, the above test failed in run:
>
> https://build.gluster.org/job/regression-on-demand-multiplex/172/consoleFull
>
> The above run is based on patchset 4 of
> https://review.gluster.org/#/c/20637/4
>
> The logs look as below. As Ashish is unable to reproduce this, and all
> failures are on line 78 with an outstanding heal count of 105, this run
> may provide some possibilities for narrowing it down.
>
> The problem seems to be glustershd not connecting to one of the bricks
> that is restarted, and hence failing to heal that brick. This also looks
> like what Ravi RCAd for the test: ./tests/bugs/replicate/bug-1363721.t
>
> ==
> Test times from: cat ./glusterd.log | grep TEST
> [2018-08-06 20:56:28.177386]:++
> G_LOG:./tests/bugs/ec/bug-1236065.t: TEST: 77 gluster --mode=script
> --wignore volume heal patchy full ++
> [2018-08-06 20:56:28.767209]:++
> G_LOG:./tests/bugs/ec/bug-1236065.t: TEST: 78 ^0$ get_pending_heal_count
> patchy ++
> [2018-08-06 20:57:48.957136]:++
> G_LOG:./tests/bugs/ec/bug-1236065.t: TEST: 80 rm -f 0.o 10.o 11.o 12.o
> 13.o 14.o 15.o 16.o 17.o 18.o 19.o 1.o 2.o 3.o 4.o 5.o 6.o 7.o 8.o 9.o
> ++
> ==
> Repeated connection failure to client-3 in glustershd.log:
> [2018-08-06 20:56:30.218482] I [rpc-clnt.c:2087:rpc_clnt_reconfig]
> 0-patchy-client-3: changing port to 49152 (from 0)
> [2018-08-06 20:56:30.222738] W [MSGID: 114043]
> [client-handshake.c:1061:client_setvolume_cbk] 0-patchy-client-3: failed
> to set the volume [Resource temporarily unavailable]
> [2018-08-06 20:56:30.222788] W [MSGID: 114007]
> [client-handshake.c:1090:client_setvolume_cbk] 0-patchy-client-3: failed
> to get 'process-uuid' from reply dict [Invalid argument]
> [2018-08-06 20:56:30.222813] E [MSGID: 114044]
> [client-handshake.c:1096:client_setvolume_cbk] 0-patchy-client-3:
> SETVOLUME on remote-host failed: cleanup flag is set for xlator.  Try
> again later [Resource temporarily unavailable]
> [2018-08-06 20:56:30.222845] I [MSGID: 114051]
> [client-handshake.c:1201:client_setvolume_cbk] 0-patchy-client-3:
> sending CHILD_CONNECTING event
> [2018-08-06 20:56:30.222919] I [MSGID: 114018]
> [client.c:2255:client_rpc_notify] 0-patchy-client-3: disconnected from
> patchy-client-3. Client process will keep trying to connect to glusterd
> until brick's port is available
> ==
> Repeated connection messages close to above retries in
> d-backends-patchy0.log:
> [2018-08-06 20:56:38.530009] I [addr.c:55:compare_addr_and_update]
> 0-/d/backends/patchy0: allowed = "*", received addr = "127.0.0.1"
> [2018-08-06 20:56:38.530044] I [login.c:111:gf_auth] 0-auth/login:
> allowed user names: 756f302a-66eb-4cc0-8f91-797183312f05
> The message "I [MSGID: 101016] [glusterfs3.h:739:dict_to_xdr] 0-dict:
> key 'trusted.ec.version' is would not be sent on wire in future [Invalid
> argument]" repeated 6 times between [2018-08-06 20:56:37.931040] and
>  [2018-08-06 20:56:37.933084]
> [2018-08-06 20:56:38.530067] I [MSGID: 115029]
> [server-handshake.c:786:server_setvolume] 0-patchy-server: accepted
> client from
> CTX_ID:cb3b4fed-62a4-4ad5-8b92-97838c651b22-GRAPH_ID:0-PID:10506-HOST:builder104.cloud.gluster.org-PC_NAME:patchy-client-0-RECON_NO:-0 (version: 4.2dev)
> [2018-08-06 20:56:38.540499] I [addr.c:55:compare_addr_and_update]
> 0-/d/backends/patchy1: allowed = "*", received addr = "127.0.0.1"
> [2018-08-06 20:56:38.540533] I [login.c:111:gf_auth] 0-auth/login:
> allowed user names: 756f302a-66eb-4cc0-8f91-797183312f05
> [2018-08-06 20:56:38.540555] I [MSGID: 115029]
> [server-handshake.c:786:server_setvolume] 0-patchy-server: accepted
> client from
> CTX_ID:cb3b4fed-62a4-4ad5-8b92-97838c651b22-GRAPH_ID:0-PID:10506-HOST:builder104.cloud.gluster.org-PC_NAME:patchy-client-1-RECON_NO:-0 (version: 4.2dev)
> [2018-08-06 20:56:38.552442] I [addr.c:55:compare_addr_and_update]
> 0-/d/backends/patchy2: allowed = "*", received addr = "127.0.0.1"
> [2018-08-06 20:56:38.552472] I [login.c:111:gf_auth] 0-auth/login:
> allowed user names: 756f302a-66eb-4cc0-8f91-797183312f05
> [2018-08-06 20:56:38.552494] I [MSGID: 115029]
> [server-handshake.c:786:server_setvolume] 0-patchy-server: accepted
> client from
> CTX_ID:cb3b4fed-62a4-4ad5-8b92-97838c651b22-GRAPH_ID:0-PID:10506-HOST:builder104.cloud.gluster.org-PC_NAME:patchy-client-2-RECON_NO:-0 (version: 4.2dev)
> [2018-08-06 20:56:38.571671] I [addr.c:55:compare_addr_and_update]
> 0-/d/backends/patchy4: allowed = "*", received 

Re: [Gluster-devel] Test: ./tests/bugs/ec/bug-1236065.t

2018-08-07 Thread Shyam Ranganathan
On 08/07/2018 07:37 PM, Shyam Ranganathan wrote:
> 5) Current test failures
> We still have the following tests failing and some without any RCA or
> attention, (If something is incorrect, write back).
> 
> ./tests/bugs/ec/bug-1236065.t (Ashish)

Ashish/Atin, the above test failed in run:
https://build.gluster.org/job/regression-on-demand-multiplex/172/consoleFull

The above run is based on patchset 4 of
https://review.gluster.org/#/c/20637/4

The logs look as below. As Ashish is unable to reproduce this, and all
failures are on line 78 with an outstanding heal count of 105, this run
may provide some possibilities for narrowing it down.

The problem seems to be glustershd not connecting to one of the bricks
that is restarted, and hence failing to heal that brick. This also looks
like what Ravi RCAd for the test: ./tests/bugs/replicate/bug-1363721.t
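
For reference, the steps around the failing line, reconstructed from the
G_LOG entries below, look roughly like this (a sketch only; the actual
test may use EXPECT or a different timeout variable, and in the test
framework $CLI expands to "gluster --mode=script --wignore" and $V0 to
"patchy"):

TEST $CLI volume heal $V0 full                                # test line 77
EXPECT_WITHIN $HEAL_TIMEOUT "^0$" get_pending_heal_count $V0  # test line 78, stuck at 105
TEST rm -f 0.o 1.o 2.o ... 19.o                               # test line 80 (full list in the log below)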

==
Test times from: cat ./glusterd.log | grep TEST
[2018-08-06 20:56:28.177386]:++
G_LOG:./tests/bugs/ec/bug-1236065.t: TEST: 77 gluster --mode=script
--wignore volume heal patchy full ++
[2018-08-06 20:56:28.767209]:++
G_LOG:./tests/bugs/ec/bug-1236065.t: TEST: 78 ^0$ get_pending_heal_count
patchy ++
[2018-08-06 20:57:48.957136]:++
G_LOG:./tests/bugs/ec/bug-1236065.t: TEST: 80 rm -f 0.o 10.o 11.o 12.o
13.o 14.o 15.o 16.o 17.o 18.o 19.o 1.o 2.o 3.o 4.o 5.o 6.o 7.o 8.o 9.o
++
==
Repeated connection failure to client-3 in glustershd.log:
[2018-08-06 20:56:30.218482] I [rpc-clnt.c:2087:rpc_clnt_reconfig]
0-patchy-client-3: changing port to 49152 (from 0)
[2018-08-06 20:56:30.222738] W [MSGID: 114043]
[client-handshake.c:1061:client_setvolume_cbk] 0-patchy-client-3: failed
to set the volume [Resource temporarily unavailable]
[2018-08-06 20:56:30.222788] W [MSGID: 114007]
[client-handshake.c:1090:client_setvolume_cbk] 0-patchy-client-3: failed
to get 'process-uuid' from reply dict [Invalid argument]
[2018-08-06 20:56:30.222813] E [MSGID: 114044]
[client-handshake.c:1096:client_setvolume_cbk] 0-patchy-client-3:
SETVOLUME on remote-host failed: cleanup flag is set for xlator.  Try
again later [Resource temporarily unavailable]
[2018-08-06 20:56:30.222845] I [MSGID: 114051]
[client-handshake.c:1201:client_setvolume_cbk] 0-patchy-client-3:
sending CHILD_CONNECTING event
[2018-08-06 20:56:30.222919] I [MSGID: 114018]
[client.c:2255:client_rpc_notify] 0-patchy-client-3: disconnected from
patchy-client-3. Client process will keep trying to connect to glusterd
until brick's port is available
==
Repeated connection messages close to above retries in
d-backends-patchy0.log:
[2018-08-06 20:56:38.530009] I [addr.c:55:compare_addr_and_update]
0-/d/backends/patchy0: allowed = "*", received addr = "127.0.0.1"
[2018-08-06 20:56:38.530044] I [login.c:111:gf_auth] 0-auth/login:
allowed user names: 756f302a-66eb-4cc0-8f91-797183312f05
The message "I [MSGID: 101016] [glusterfs3.h:739:dict_to_xdr] 0-dict:
key 'trusted.ec.version' is would not be sent on wire in future [Invalid
argument]" repeated 6 times between [2018-08-06 20:56:37.931040] and
 [2018-08-06 20:56:37.933084]
[2018-08-06 20:56:38.530067] I [MSGID: 115029]
[server-handshake.c:786:server_setvolume] 0-patchy-server: accepted
client from
CTX_ID:cb3b4fed-62a4-4ad5-8b92-97838c651b22-GRAPH_ID:0-PID:10506-HOST:builder104.cloud.gluster.org-PC_NAME:patchy-client-0-RECON_NO:-0 (version: 4.2dev)
[2018-08-06 20:56:38.540499] I [addr.c:55:compare_addr_and_update]
0-/d/backends/patchy1: allowed = "*", received addr = "127.0.0.1"
[2018-08-06 20:56:38.540533] I [login.c:111:gf_auth] 0-auth/login:
allowed user names: 756f302a-66eb-4cc0-8f91-797183312f05
[2018-08-06 20:56:38.540555] I [MSGID: 115029]
[server-handshake.c:786:server_setvolume] 0-patchy-server: accepted
client from
CTX_ID:cb3b4fed-62a4-4ad5-8b92-97838c651b22-GRAPH_ID:0-PID:10506-HOST:builder104.cloud.gluster.org-PC_NAME:patchy-client-1-RECON_NO:-0 (version: 4.2dev)
[2018-08-06 20:56:38.552442] I [addr.c:55:compare_addr_and_update]
0-/d/backends/patchy2: allowed = "*", received addr = "127.0.0.1"
[2018-08-06 20:56:38.552472] I [login.c:111:gf_auth] 0-auth/login:
allowed user names: 756f302a-66eb-4cc0-8f91-797183312f05
[2018-08-06 20:56:38.552494] I [MSGID: 115029]
[server-handshake.c:786:server_setvolume] 0-patchy-server: accepted
client from
CTX_ID:cb3b4fed-62a4-4ad5-8b92-97838c651b22-GRAPH_ID:0-PID:10506-HOST:builder104.cloud.gluster.org-PC_NAME:patchy-client-2-RECON_NO:-0 (version: 4.2dev)
[2018-08-06 20:56:38.571671] I [addr.c:55:compare_addr_and_update]
0-/d/backends/patchy4: allowed = "*", received addr = "127.0.0.1"
[2018-08-06 20:56:38.571701] I [login.c:111:gf_auth] 0-auth/login:
allowed user names: 756f302a-66eb-4cc0-8f91-797183312f05
[2018-08-06 20:56:38.571723] I [MSGID: 115029]
[server-handshake.c:786:server_setvolume] 0-patchy-server: accepted
client from

Re: [Gluster-devel] Test: ./tests/bugs/distribute/bug-1042725.t

2018-08-07 Thread Shyam Ranganathan
On 08/07/2018 07:37 PM, Shyam Ranganathan wrote:
> 6) Tests that are addressed or are not occurring anymore are,
> 
> ./tests/bugs/distribute/bug-1042725.t

The above test fails, I think, because cleanup does not complete after
the previous test fails.

The failed runs are:
https://build.gluster.org/job/line-coverage/405/consoleFull
https://build.gluster.org/job/line-coverage/415/consoleFull

The logs are similar: test bug-1042725.t fails to start glusterd after
the previous test ./tests/bugs/core/multiplex-limit-issue-151.t has
timed out.

I am thinking we also need to increase the cleanup time on timed-out
tests from 5 seconds to 10 seconds to prevent these. Thoughts?

This timer:
https://github.com/gluster/glusterfs/blob/master/run-tests.sh#L16
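
A rough sketch of the kind of change I mean, assuming the timer at the
link above is the grace period handed to 'timeout -k' around each test
invocation (variable names are from my reading of run-tests.sh and may
differ; treat this as illustrative only, not the actual script):

# run-tests.sh (illustrative)
run_timeout=200      # per-test timeout that produced "timed out after 200 seconds"
kill_after_time=10   # cleanup grace period after a timeout; proposal: bump from 5 to 10
...
timeout -k ${kill_after_time} ${run_timeout} prove -vmfe '/bin/bash' "${t}"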

Logs look as follows:
16:24:48

16:24:48 [16:24:51] Running tests in file
./tests/bugs/core/multiplex-limit-issue-151.t
16:28:08 ./tests/bugs/core/multiplex-limit-issue-151.t timed out after
200 seconds
16:28:08 ./tests/bugs/core/multiplex-limit-issue-151.t: bad status 124
16:28:08
16:28:08*
16:28:08*   REGRESSION FAILED   *
16:28:08* Retrying failed tests in case *
16:28:08* we got some spurious failures *
16:28:08*
16:28:08
16:31:28 ./tests/bugs/core/multiplex-limit-issue-151.t timed out after
200 seconds
16:31:28 End of test ./tests/bugs/core/multiplex-limit-issue-151.t
16:31:28

16:31:28
16:31:28
16:31:28

16:31:28 [16:31:31] Running tests in file
./tests/bugs/distribute/bug-1042725.t
16:32:35 ./tests/bugs/distribute/bug-1042725.t ..
16:32:35 1..16
16:32:35 Terminated
16:32:35 not ok 1 , LINENUM:9
16:32:35 FAILED COMMAND: glusterd
___
Gluster-devel mailing list
Gluster-devel@gluster.org
https://lists.gluster.org/mailman/listinfo/gluster-devel


[Gluster-devel] Master branch lock down status

2018-08-07 Thread Shyam Ranganathan
This deserves a new beginning; threads on the other mail have gone deep enough.

NOTE: (5) below needs your attention; the rest is just process and data
on how to find failures.

1) We are running the tests using the patch [2].

2) Run details are extracted into a separate sheet in [3] named "Run
Failures"; use a search to find a failing test and the corresponding run
that it failed in.

3) Patches that fix issues can be found here [1]; if you think you have
a patch out there that is not in this list, shout out.

4) If you own a test case failure, update the spreadsheet [3] with your
name against the test, and also update other details as needed (as
comments, since edit rights to the sheet are restricted).

5) Current test failures
We still have the following tests failing, some without any RCA or
attention. (If something is incorrect, write back.)

./tests/bugs/replicate/bug-1290965-detect-bitrotten-objects.t (needs
attention)
./tests/00-geo-rep/georep-basic-dr-tarssh.t (Kotresh)
./tests/bugs/glusterd/add-brick-and-validate-replicated-volume-options.t
(Atin)
./tests/bugs/ec/bug-1236065.t (Ashish)
./tests/00-geo-rep/georep-basic-dr-rsync.t (Kotresh)
./tests/basic/ec/ec-1468261.t (needs attention)
./tests/basic/afr/add-brick-self-heal.t (needs attention)
./tests/basic/afr/granular-esh/replace-brick.t (needs attention)
./tests/bugs/core/multiplex-limit-issue-151.t (needs attention)
./tests/bugs/glusterd/validating-server-quorum.t (Atin)
./tests/bugs/replicate/bug-1363721.t (Ravi)

Here are some newer failures, mostly one-off, except for cores in
ec-5-2.t. All of the following need attention as they are new.

./tests/00-geo-rep/00-georep-verify-setup.t
./tests/basic/afr/gfid-mismatch-resolution-with-fav-child-policy.t
./tests/basic/stats-dump.t
./tests/bugs/bug-1110262.t
./tests/bugs/glusterd/mgmt-handshake-and-volume-sync-post-glusterd-restart.t
./tests/basic/ec/ec-data-heal.t
./tests/bugs/replicate/bug-1448804-check-quorum-type-values.t
./tests/bugs/snapshot/bug-1482023-snpashot-issue-with-other-processes-accessing-mounted-path.t
./tests/basic/ec/ec-5-2.t

6) Tests that are addressed or are not occurring anymore are,

./tests/bugs/glusterd/rebalance-operations-in-single-node.t
./tests/bugs/index/bug-1559004-EMLINK-handling.t
./tests/bugs/replicate/bug-1386188-sbrain-fav-child.t
./tests/bugs/replicate/bug-1433571-undo-pending-only-on-up-bricks.t
./tests/bitrot/bug-1373520.t
./tests/bugs/distribute/bug-1117851.t
./tests/bugs/glusterd/quorum-validation.t
./tests/bugs/distribute/bug-1042725.t
./tests/bugs/replicate/bug-1586020-mark-dirty-for-entry-txn-on-quorum-failure.t
./tests/bugs/quota/bug-1293601.t
./tests/bugs/bug-1368312.t
./tests/bugs/distribute/bug-1122443.t
./tests/bugs/core/bug-1432542-mpx-restart-crash.t

Shyam (and Atin)

On 08/05/2018 06:24 PM, Shyam Ranganathan wrote:
> Health on master as of the last nightly run [4] is still the same.
> 
> Potential patches that rectify the situation (as in [1]) are bunched in
> a patch [2] that Atin and myself have put through several regressions
> (mux, normal and line coverage) and these have also not passed.
> 
> Till we rectify the situation we are locking down master branch commit
> rights to the following people, Amar, Atin, Shyam, Vijay.
> 
> The intention is to stabilize master and not add more patches that may
> destabilize it.
> 
> Test cases that are tracked as failures and need action are present here
> [3].
> 
> @Nigel, request you to apply the commit rights change as you see this
> mail and let the list know regarding the same as well.
> 
> Thanks,
> Shyam
> 
> [1] Patches that address regression failures:
> https://review.gluster.org/#/q/starredby:srangana%2540redhat.com
> 
> [2] Bunched up patch against which regressions were run:
> https://review.gluster.org/#/c/20637
> 
> [3] Failing tests list:
> https://docs.google.com/spreadsheets/d/1IF9GhpKah4bto19RQLr0y_Kkw26E_-crKALHSaSjZMQ/edit?usp=sharing
> 
> [4] Nightly run dashboard: https://build.gluster.org/job/nightly-master/


Re: [Gluster-devel] Release 5: Master branch health report (Week of 30th July)

2018-08-07 Thread Yaniv Kaul
On Tue, Aug 7, 2018, 10:46 PM Shyam Ranganathan  wrote:

> On 08/07/2018 02:58 PM, Yaniv Kaul wrote:
> > The intention is to stabilize master and not add more patches that may
> > destabilize it.
> >
> >
> > https://review.gluster.org/#/c/20603/ has been merged.
> > As far as I can see, it has nothing to do with stabilization and should
> > be reverted.
>
> Posted this on the gerrit review as well:
>
> 
> 4.1 does not have nightly tests, those run on master only.
>

That should change of course. We cannot strive for stability otherwise,
AFAIK.


> Stability of master does not (will not), in the near term guarantee
> stability of release branches, unless patches that impact code already
> on release branches, get fixes on master and are back ported.
>
> Release branches get fixes back ported (as is normal), this fix and its
> merge should not impact current master stability in any way, and neither
> stability of 4.1 branch.
> 
>
> The current hold is on master, not on release branches. I agree that
> merging further code changes on release branches (for example geo-rep
> issues that are backported (see [1]), as there are tests that fail
> regularly on master), may further destabilize the release branch. This
> patch is not one of those.
>

Two issues I have with the merge:
1. It just makes comparing master branch to release branch harder. For
example, to understand if there's a test that fails on master but succeeds
on release branch, or vice versa.
2. It means we are not focused on stabilizing master branch.
Y.


> Merging patches on release branches are allowed by release owners only,
> and usual practice is keeping the backlog low (merging weekly) in these
> cases as per the dashboard [1].
>
> Allowing for the above 2 reasons this patch was found,
> - Not on master
> - Not stabilizing or destabilizing the release branch
> and hence was merged.
>
> If maintainers disagree I can revert the same.
>
> Shyam
>
> [1] Release 4.1 dashboard:
>
> https://review.gluster.org/#/projects/glusterfs,dashboards/dashboard:4-1-dashboard
>

Re: [Gluster-devel] Release 5: Master branch health report (Week of 30th July)

2018-08-07 Thread Shyam Ranganathan
On 08/07/2018 02:58 PM, Yaniv Kaul wrote:
> The intention is to stabilize master and not add more patches that may
> destabilize it.
> 
> 
> https://review.gluster.org/#/c/20603/ has been merged.
> As far as I can see, it has nothing to do with stabilization and should
> be reverted.

Posted this on the gerrit review as well:


4.1 does not have nightly tests, those run on master only.

Stability of master does not (and will not), in the near term, guarantee
stability of release branches, unless patches that impact code already
on release branches get fixes on master and are backported.

Release branches get fixes backported (as is normal); this fix and its
merge should not impact current master stability in any way, nor the
stability of the 4.1 branch.


The current hold is on master, not on release branches. I agree that
merging further code changes on release branches, for example the
geo-rep issues that are backported (see [1]) while there are tests that
fail regularly on master, may further destabilize the release branch.
This patch is not one of those.

Merging patches on release branches is allowed by release owners only,
and the usual practice is to keep the backlog low (merging weekly) in
these cases, as per the dashboard [1].

Allowing for the above two reasons, this patch was found to be:
- not on master
- not stabilizing or destabilizing the release branch
and hence was merged.

If maintainers disagree I can revert the same.

Shyam

[1] Release 4.1 dashboard:
https://review.gluster.org/#/projects/glusterfs,dashboards/dashboard:4-1-dashboard


Re: [Gluster-devel] Release 5: Master branch health report (Week of 30th July)

2018-08-07 Thread Yaniv Kaul
On Mon, Aug 6, 2018 at 1:24 AM, Shyam Ranganathan 
wrote:

> On 07/31/2018 07:16 AM, Shyam Ranganathan wrote:
> > On 07/30/2018 03:21 PM, Shyam Ranganathan wrote:
> >> On 07/24/2018 03:12 PM, Shyam Ranganathan wrote:
> >>> 1) master branch health checks (weekly, till branching)
> >>>   - Expect every Monday a status update on various tests runs
> >> See https://build.gluster.org/job/nightly-master/ for a report on
> >> various nightly and periodic jobs on master.
> > Thinking aloud, we may have to stop merges to master to get these test
> > failures addressed at the earliest and to continue maintaining them
> > GREEN for the health of the branch.
> >
> > I would give the above a week, before we lockdown the branch to fix the
> > failures.
> >
> > Let's try and get line-coverage and nightly regression tests addressed
> > this week (leaving mux-regression open), and if addressed not lock the
> > branch down.
> >
>
> Health on master as of the last nightly run [4] is still the same.
>
> Potential patches that rectify the situation (as in [1]) are bunched in
> a patch [2] that Atin and myself have put through several regressions
> (mux, normal and line coverage) and these have also not passed.
>
> Till we rectify the situation we are locking down master branch commit
> rights to the following people, Amar, Atin, Shyam, Vijay.
>
> The intention is to stabilize master and not add more patches that may
> destabilize it.
>

https://review.gluster.org/#/c/20603/ has been merged.
As far as I can see, it has nothing to do with stabilization and should be
reverted.
Y.


>
> Test cases that are tracked as failures and need action are present here
> [3].
>
> @Nigel, request you to apply the commit rights change as you see this
> mail and let the list know regarding the same as well.
>
> Thanks,
> Shyam
>
> [1] Patches that address regression failures:
> https://review.gluster.org/#/q/starredby:srangana%2540redhat.com
>
> [2] Bunched up patch against which regressions were run:
> https://review.gluster.org/#/c/20637
>
> [3] Failing tests list:
> https://docs.google.com/spreadsheets/d/1IF9GhpKah4bto19RQLr0y_Kkw26E_-crKALHSaSjZMQ/edit?usp=sharing
>
> [4] Nightly run dashboard: https://build.gluster.org/job/nightly-master/

[Gluster-devel] Coverity covscan for 2018-08-07-9e03c5fc (master branch)

2018-08-07 Thread staticanalysis


GlusterFS Coverity covscan results for the master branch are available from
http://download.gluster.org/pub/gluster/glusterfs/static-analysis/master/glusterfs-coverity/2018-08-07-9e03c5fc/

Coverity covscan results for other active branches are also available at
http://download.gluster.org/pub/gluster/glusterfs/static-analysis/



[Gluster-devel] Coverity cleanup drive

2018-08-07 Thread Sunny Kumar
Hello folks,

We are planning to reduce coverity errors to improve upstream stability.

To avoid duplicate effort and concerns like patch review, we suggest
the following rules:

1. Visit https://scan.coverity.com/projects/gluster-glusterfs and
request "Add me to project" to see all reported Coverity errors.

2. Pick any error and assign it to yourself in the Triage -> Owner
section; this lets others know you are the owner of that error and
avoids duplicate effort.
In Triage -> Ext. Reference you can post the patch link.
For example see -
https://scan6.coverity.com/reports.htm#v42401/p10714/fileInstanceId=84384726=25600457=727233=25600457-1

3. You should pick Coverity errors on a per-component/file basis and
fix all the errors reported for that component/file.
If a file has only a few defects, it is advisable to combine errors
from the same component. This will help the reviewers review the
patch quickly.
For example see -
https://review.gluster.org/#/c/20600/.

4. Please use BUG: 789278 for all Coverity-related patches and do not
forget to mention the Coverity link in the commit message (a rough
sketch of this follows after rule 5 below).
For example, see
https://review.gluster.org/#/c/20600/.

5. After the patch is merged, please add an entry for the Coverity ID
in the spreadsheet:
https://docs.google.com/spreadsheets/d/1qZNallBF30T2w_qi0wRxzqDn0KbcYP5zshiLrYu9XyQ/edit?usp=sharing
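
To illustrate rule 4, sending a fix could look roughly like the sketch
below, using the project's usual rfc.sh submission script. Everything
here is a placeholder (component, file, defect link); follow the example
reviews linked above for the exact conventions:

# Hypothetical example only; adjust to your component and defect
git commit -s -F - <<'EOF'
core: fix Coverity NULL_RETURNS defects in xlator.c

Coverity link: <defect link recorded under Triage -> Ext. Reference>

BUG: 789278
EOF
./rfc.sh   # post the change to Gerrit for review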

Yes! You can win some swag by participating in this effort if you
meet these criteria:

a. Triage the bugs properly.
b. Update the sheet.
c. Submit more than 17 Coverity fixes.
d. Most importantly, all the submitted patches should be merged by
25th Aug 2018.

Please feel free to let us know if you have any questions,
* Sunny - sunku...@redhat.com
* Karthik - ksubr...@redhat.com
* Bhumika -  bgo...@redhat.com

- Sunny


[Gluster-devel] Fwd: Gerrit downtime on Aug 8, 2018

2018-08-07 Thread Nigel Babu
Reminder, this upgrade is tomorrow.

-- Forwarded message -
From: Nigel Babu 
Date: Fri, Jul 27, 2018 at 5:28 PM
Subject: Gerrit downtime on Aug 8, 2018
To: gluster-devel 
Cc: gluster-infra , <
automated-test...@gluster.org>


Hello,

It's been a while since we upgraded Gerrit. We plan to do a full upgrade
and move to 2.15.3. Among other changes, this brings in the new PolyGerrit
interface, which introduces significant frontend changes. You can take a
look at how this would look on the staging site [1].

## Outage Window
0330 EDT to 0730 EDT
0730 UTC to 1130 UTC
1300 IST to 1700 IST

The actual time needed for the upgrade is about an hour, but we want to
keep a larger window open to roll back in the event of any problems during
the upgrade.

-- 
nigelb


-- 
nigelb