Re: [Gluster-devel] Test: ./tests/bugs/ec/bug-1236065.t
I have posted a patch, https://review.gluster.org/#/c/20657/, and started a brick-mux regression run to validate it.

Thanks,
Mohit Agrawal

On Wed, Aug 8, 2018 at 7:22 AM, Atin Mukherjee wrote:
> +Mohit
>
> Requesting Mohit for help.
>
> On Wed, 8 Aug 2018 at 06:53, Shyam Ranganathan wrote:
>> On 08/07/2018 07:37 PM, Shyam Ranganathan wrote:
>> > 5) Current test failures [...]
>> > ./tests/bugs/ec/bug-1236065.t (Ashish)
>>
>> Ashish/Atin, the above test failed in run:
>> https://build.gluster.org/job/regression-on-demand-multiplex/172/consoleFull
>>
>> The above run is based on patchset 4 of
>> https://review.gluster.org/#/c/20637/4
>>
>> The logs look as below. As Ashish is unable to reproduce this, and
>> all failures are on line 78 with a heal outstanding of 105, this run
>> may provide some possibilities for narrowing it down.
>>
>> The problem seems to be glustershd not connecting to one of the bricks
>> that is restarted, and hence failing to heal that brick. This also looks
>> like what Ravi RCAd for the test: ./tests/bugs/replicate/bug-1363721.t
>> [...]
Re: [Gluster-devel] Test: ./tests/bugs/ec/bug-1236065.t
+Mohit

Requesting Mohit for help.

On Wed, 8 Aug 2018 at 06:53, Shyam Ranganathan wrote:
> On 08/07/2018 07:37 PM, Shyam Ranganathan wrote:
> > 5) Current test failures [...]
> > ./tests/bugs/ec/bug-1236065.t (Ashish)
>
> Ashish/Atin, the above test failed in run:
> https://build.gluster.org/job/regression-on-demand-multiplex/172/consoleFull
>
> The above run is based on patchset 4 of
> https://review.gluster.org/#/c/20637/4
>
> The logs look as below. As Ashish is unable to reproduce this, and
> all failures are on line 78 with a heal outstanding of 105, this run
> may provide some possibilities for narrowing it down.
>
> The problem seems to be glustershd not connecting to one of the bricks
> that is restarted, and hence failing to heal that brick. This also looks
> like what Ravi RCAd for the test: ./tests/bugs/replicate/bug-1363721.t
> [...]
Re: [Gluster-devel] Test: ./tests/bugs/ec/bug-1236065.t
On 08/07/2018 07:37 PM, Shyam Ranganathan wrote:
> 5) Current test failures
> We still have the following tests failing, some without any RCA or
> attention. (If something is incorrect, write back.)
>
> ./tests/bugs/ec/bug-1236065.t (Ashish)

Ashish/Atin, the above test failed in run:
https://build.gluster.org/job/regression-on-demand-multiplex/172/consoleFull

The above run is based on patchset 4 of
https://review.gluster.org/#/c/20637/4

The logs look as below. As Ashish is unable to reproduce this, and all
failures are on line 78 with a heal outstanding of 105, this run may
provide some possibilities for narrowing it down.

The problem seems to be glustershd not connecting to one of the bricks
that is restarted, and hence failing to heal that brick. This also looks
like what Ravi RCAd for the test: ./tests/bugs/replicate/bug-1363721.t

==
Test times from: cat ./glusterd.log | grep TEST
[2018-08-06 20:56:28.177386]:++ G_LOG:./tests/bugs/ec/bug-1236065.t: TEST: 77 gluster --mode=script --wignore volume heal patchy full ++
[2018-08-06 20:56:28.767209]:++ G_LOG:./tests/bugs/ec/bug-1236065.t: TEST: 78 ^0$ get_pending_heal_count patchy ++
[2018-08-06 20:57:48.957136]:++ G_LOG:./tests/bugs/ec/bug-1236065.t: TEST: 80 rm -f 0.o 10.o 11.o 12.o 13.o 14.o 15.o 16.o 17.o 18.o 19.o 1.o 2.o 3.o 4.o 5.o 6.o 7.o 8.o 9.o ++
==
Repeated connection failure to client-3 in glustershd.log:
[2018-08-06 20:56:30.218482] I [rpc-clnt.c:2087:rpc_clnt_reconfig] 0-patchy-client-3: changing port to 49152 (from 0)
[2018-08-06 20:56:30.222738] W [MSGID: 114043] [client-handshake.c:1061:client_setvolume_cbk] 0-patchy-client-3: failed to set the volume [Resource temporarily unavailable]
[2018-08-06 20:56:30.222788] W [MSGID: 114007] [client-handshake.c:1090:client_setvolume_cbk] 0-patchy-client-3: failed to get 'process-uuid' from reply dict [Invalid argument]
[2018-08-06 20:56:30.222813] E [MSGID: 114044] [client-handshake.c:1096:client_setvolume_cbk] 0-patchy-client-3: SETVOLUME on remote-host failed: cleanup flag is set for xlator. Try again later [Resource temporarily unavailable]
[2018-08-06 20:56:30.222845] I [MSGID: 114051] [client-handshake.c:1201:client_setvolume_cbk] 0-patchy-client-3: sending CHILD_CONNECTING event
[2018-08-06 20:56:30.222919] I [MSGID: 114018] [client.c:2255:client_rpc_notify] 0-patchy-client-3: disconnected from patchy-client-3. Client process will keep trying to connect to glusterd until brick's port is available
==
Repeated connection messages close to the above retries in d-backends-patchy0.log:
[2018-08-06 20:56:38.530009] I [addr.c:55:compare_addr_and_update] 0-/d/backends/patchy0: allowed = "*", received addr = "127.0.0.1"
[2018-08-06 20:56:38.530044] I [login.c:111:gf_auth] 0-auth/login: allowed user names: 756f302a-66eb-4cc0-8f91-797183312f05
The message "I [MSGID: 101016] [glusterfs3.h:739:dict_to_xdr] 0-dict: key 'trusted.ec.version' is would not be sent on wire in future [Invalid argument]" repeated 6 times between [2018-08-06 20:56:37.931040] and [2018-08-06 20:56:37.933084]
[2018-08-06 20:56:38.530067] I [MSGID: 115029] [server-handshake.c:786:server_setvolume] 0-patchy-server: accepted client from CTX_ID:cb3b4fed-62a4-4ad5-8b92-97838c651b22-GRAPH_ID:0-PID:10506-HOST:builder104.cloud.gluster.org-PC_NAME:patchy-client-0-RECON_NO:-0 (version: 4.2dev)
[2018-08-06 20:56:38.540499] I [addr.c:55:compare_addr_and_update] 0-/d/backends/patchy1: allowed = "*", received addr = "127.0.0.1"
[2018-08-06 20:56:38.540533] I [login.c:111:gf_auth] 0-auth/login: allowed user names: 756f302a-66eb-4cc0-8f91-797183312f05
[2018-08-06 20:56:38.540555] I [MSGID: 115029] [server-handshake.c:786:server_setvolume] 0-patchy-server: accepted client from CTX_ID:cb3b4fed-62a4-4ad5-8b92-97838c651b22-GRAPH_ID:0-PID:10506-HOST:builder104.cloud.gluster.org-PC_NAME:patchy-client-1-RECON_NO:-0 (version: 4.2dev)
[2018-08-06 20:56:38.552442] I [addr.c:55:compare_addr_and_update] 0-/d/backends/patchy2: allowed = "*", received addr = "127.0.0.1"
[2018-08-06 20:56:38.552472] I [login.c:111:gf_auth] 0-auth/login: allowed user names: 756f302a-66eb-4cc0-8f91-797183312f05
[2018-08-06 20:56:38.552494] I [MSGID: 115029] [server-handshake.c:786:server_setvolume] 0-patchy-server: accepted client from CTX_ID:cb3b4fed-62a4-4ad5-8b92-97838c651b22-GRAPH_ID:0-PID:10506-HOST:builder104.cloud.gluster.org-PC_NAME:patchy-client-2-RECON_NO:-0 (version: 4.2dev)
[2018-08-06 20:56:38.571671] I [addr.c:55:compare_addr_and_update] 0-/d/backends/patchy4: allowed = "*", received addr = "127.0.0.1"
[2018-08-06 20:56:38.571701] I [login.c:111:gf_auth] 0-auth/login: allowed user names: 756f302a-66eb-4cc0-8f91-797183312f05
[2018-08-06 20:56:38.571723] I [MSGID: 115029] [server-handshake.c:786:server_setvolume] 0-patchy-server: accepted client from
Re: [Gluster-devel] Test: ./tests/bugs/distribute/bug-1042725.t
On 08/07/2018 07:37 PM, Shyam Ranganathan wrote:
> 6) Tests that are addressed or are not occurring anymore are,
>
> ./tests/bugs/distribute/bug-1042725.t

The above test fails, I think, due to cleanup not completing after the previous test's failure. The failed runs are:
https://build.gluster.org/job/line-coverage/405/consoleFull
https://build.gluster.org/job/line-coverage/415/consoleFull

The logs are similar: test bug-1042725.t fails to start glusterd, and the previous test ./tests/bugs/core/multiplex-limit-issue-151.t has timed out. I am thinking we also need to increase the cleanup time on timed-out tests, from 5 seconds to 10 seconds, to prevent these; thoughts? This timer:
https://github.com/gluster/glusterfs/blob/master/run-tests.sh#L16

Logs look as follows:
16:24:48
16:24:48 [16:24:51] Running tests in file ./tests/bugs/core/multiplex-limit-issue-151.t
16:28:08 ./tests/bugs/core/multiplex-limit-issue-151.t timed out after 200 seconds
16:28:08 ./tests/bugs/core/multiplex-limit-issue-151.t: bad status 124
16:28:08
16:28:08 *
16:28:08 * REGRESSION FAILED *
16:28:08 * Retrying failed tests in case *
16:28:08 * we got some spurious failures *
16:28:08 *
16:28:08
16:31:28 ./tests/bugs/core/multiplex-limit-issue-151.t timed out after 200 seconds
16:31:28 End of test ./tests/bugs/core/multiplex-limit-issue-151.t
16:31:28
16:31:28
16:31:28 [16:31:31] Running tests in file ./tests/bugs/distribute/bug-1042725.t
16:32:35 ./tests/bugs/distribute/bug-1042725.t ..
16:32:35 1..16
16:32:35 Terminated
16:32:35 not ok 1 , LINENUM:9
16:32:35 FAILED COMMAND: glusterd

___
Gluster-devel mailing list
Gluster-devel@gluster.org
https://lists.gluster.org/mailman/listinfo/gluster-devel
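[Editorial note: the timer change proposed above could look roughly like the following. This is a sketch under assumptions: the variable names and the use of timeout(1) are illustrative, not the actual run-tests.sh code.]

```shell
# Illustrative harness wrapper: TERM the test at the per-test timeout,
# then give it a grace period to clean up before KILLing it.  The 10s
# value reflects the proposal to raise the grace from 5s to 10s.
run_timeout=200      # per-test timeout (assumed name)
cleanup_grace=10     # proposed cleanup window, up from 5

run_one_test() {
    # timeout(1) sends SIGTERM after $run_timeout seconds, and SIGKILL
    # $cleanup_grace seconds later if the test is still shutting down.
    timeout --kill-after="$cleanup_grace" "$run_timeout" bash "$1"
}

# Quick demonstration with a trivial test script:
demo=$(mktemp)
printf 'exit 0\n' > "$demo"
run_one_test "$demo"
```

A longer grace gives cleanup (killing bricks and glusterd, unmounting) a chance to finish, so the next test's glusterd start does not race against leftover processes.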
[Gluster-devel] Master branch lock down status
Deserves a new beginning; threads on the other mail have gone deep enough. NOTE: (5) below needs your attention, the rest is just process and data on how to find failures.

1) We are running the tests using the patch [2].

2) Run details are extracted into a separate sheet in [3] named "Run Failures"; use a search to find a failing test and the corresponding run that it failed in.

3) Patches that are fixing issues can be found here [1]. If you think you have a patch out there that is not in this list, shout out.

4) If you own up a test case failure, update the spreadsheet [3] with your name against the test, and also update other details as needed (as comments, as edit rights to the sheet are restricted).

5) Current test failures
We still have the following tests failing, some without any RCA or attention. (If something is incorrect, write back.)

./tests/bugs/replicate/bug-1290965-detect-bitrotten-objects.t (needs attention)
./tests/00-geo-rep/georep-basic-dr-tarssh.t (Kotresh)
./tests/bugs/glusterd/add-brick-and-validate-replicated-volume-options.t (Atin)
./tests/bugs/ec/bug-1236065.t (Ashish)
./tests/00-geo-rep/georep-basic-dr-rsync.t (Kotresh)
./tests/basic/ec/ec-1468261.t (needs attention)
./tests/basic/afr/add-brick-self-heal.t (needs attention)
./tests/basic/afr/granular-esh/replace-brick.t (needs attention)
./tests/bugs/core/multiplex-limit-issue-151.t (needs attention)
./tests/bugs/glusterd/validating-server-quorum.t (Atin)
./tests/bugs/replicate/bug-1363721.t (Ravi)

Here are some newer failures, mostly one-off failures except the cores in ec-5-2.t. All of the following need attention as these are new:

./tests/00-geo-rep/00-georep-verify-setup.t
./tests/basic/afr/gfid-mismatch-resolution-with-fav-child-policy.t
./tests/basic/stats-dump.t
./tests/bugs/bug-1110262.t
./tests/bugs/glusterd/mgmt-handshake-and-volume-sync-post-glusterd-restart.t
./tests/basic/ec/ec-data-heal.t
./tests/bugs/replicate/bug-1448804-check-quorum-type-values.t
./tests/bugs/snapshot/bug-1482023-snpashot-issue-with-other-processes-accessing-mounted-path.t
./tests/basic/ec/ec-5-2.t

6) Tests that are addressed or are not occurring anymore are:

./tests/bugs/glusterd/rebalance-operations-in-single-node.t
./tests/bugs/index/bug-1559004-EMLINK-handling.t
./tests/bugs/replicate/bug-1386188-sbrain-fav-child.t
./tests/bugs/replicate/bug-1433571-undo-pending-only-on-up-bricks.t
./tests/bitrot/bug-1373520.t
./tests/bugs/distribute/bug-1117851.t
./tests/bugs/glusterd/quorum-validation.t
./tests/bugs/distribute/bug-1042725.t
./tests/bugs/replicate/bug-1586020-mark-dirty-for-entry-txn-on-quorum-failure.t
./tests/bugs/quota/bug-1293601.t
./tests/bugs/bug-1368312.t
./tests/bugs/distribute/bug-1122443.t
./tests/bugs/core/bug-1432542-mpx-restart-crash.t

Shyam (and Atin)

On 08/05/2018 06:24 PM, Shyam Ranganathan wrote:
> Health on master as of the last nightly run [4] is still the same.
>
> Potential patches that rectify the situation (as in [1]) are bunched in
> a patch [2] that Atin and myself have put through several regressions
> (mux, normal and line coverage) and these have also not passed.
>
> Till we rectify the situation we are locking down master branch commit
> rights to the following people: Amar, Atin, Shyam, Vijay.
>
> The intention is to stabilize master and not add more patches that may
> destabilize it.
>
> Test cases that are tracked as failures and need action are present here
> [3].
>
> @Nigel, request you to apply the commit rights change as you see this
> mail and let the list know regarding the same as well.
>
> Thanks,
> Shyam
>
> [1] Patches that address regression failures:
> https://review.gluster.org/#/q/starredby:srangana%2540redhat.com
>
> [2] Bunched up patch against which regressions were run:
> https://review.gluster.org/#/c/20637
>
> [3] Failing tests list:
> https://docs.google.com/spreadsheets/d/1IF9GhpKah4bto19RQLr0y_Kkw26E_-crKALHSaSjZMQ/edit?usp=sharing
>
> [4] Nightly run dashboard: https://build.gluster.org/job/nightly-master/
Re: [Gluster-devel] Release 5: Master branch health report (Week of 30th July)
On Tue, Aug 7, 2018, 10:46 PM Shyam Ranganathan wrote:
> On 08/07/2018 02:58 PM, Yaniv Kaul wrote:
> > The intention is to stabilize master and not add more patches that may
> > destabilize it.
> >
> > https://review.gluster.org/#/c/20603/ has been merged.
> > As far as I can see, it has nothing to do with stabilization and should
> > be reverted.
>
> Posted this on the gerrit review as well:
>
> 4.1 does not have nightly tests; those run on master only.

That should change of course. We cannot strive for stability otherwise, AFAIK.

> Stability of master does not (will not), in the near term, guarantee
> stability of release branches, unless patches that impact code already
> on release branches get fixes on master and are backported.
>
> Release branches get fixes backported (as is normal); this fix and its
> merge should not impact current master stability in any way, nor the
> stability of the 4.1 branch.
>
> The current hold is on master, not on release branches. I agree that
> merging further code changes on release branches (for example, geo-rep
> issues that are backported (see [1]), as there are tests that fail
> regularly on master) may further destabilize the release branch. This
> patch is not one of those.

Two issues I have with the merge:
1. It makes comparing the master branch to the release branch harder, for
example, to understand if there's a test that fails on master but succeeds
on the release branch, or vice versa.
2. It means we are not focused on stabilizing the master branch.
Y.

> Merging patches on release branches is allowed by release owners only,
> and the usual practice is keeping the backlog low (merging weekly) in
> these cases, as per the dashboard [1].
>
> Allowing for the above 2 reasons, this patch was found:
> - Not on master
> - Not stabilizing or destabilizing the release branch
> and hence was merged.
>
> If maintainers disagree I can revert the same.
>
> Shyam
>
> [1] Release 4.1 dashboard:
> https://review.gluster.org/#/projects/glusterfs,dashboards/dashboard:4-1-dashboard
Re: [Gluster-devel] Release 5: Master branch health report (Week of 30th July)
On 08/07/2018 02:58 PM, Yaniv Kaul wrote:
> > The intention is to stabilize master and not add more patches that may
> > destabilize it.
>
> https://review.gluster.org/#/c/20603/ has been merged.
> As far as I can see, it has nothing to do with stabilization and should
> be reverted.

Posted this on the gerrit review as well:

4.1 does not have nightly tests; those run on master only.

Stability of master does not (will not), in the near term, guarantee stability of release branches, unless patches that impact code already on release branches get fixes on master and are backported.

Release branches get fixes backported (as is normal); this fix and its merge should not impact current master stability in any way, nor the stability of the 4.1 branch.

The current hold is on master, not on release branches. I agree that merging further code changes on release branches (for example, geo-rep issues that are backported (see [1]), as there are tests that fail regularly on master) may further destabilize the release branch. This patch is not one of those.

Merging patches on release branches is allowed by release owners only, and the usual practice is keeping the backlog low (merging weekly) in these cases, as per the dashboard [1].

Allowing for the above 2 reasons, this patch was found:
- Not on master
- Not stabilizing or destabilizing the release branch
and hence was merged.

If maintainers disagree I can revert the same.

Shyam

[1] Release 4.1 dashboard:
https://review.gluster.org/#/projects/glusterfs,dashboards/dashboard:4-1-dashboard
Re: [Gluster-devel] Release 5: Master branch health report (Week of 30th July)
On Mon, Aug 6, 2018 at 1:24 AM, Shyam Ranganathan wrote:
> On 07/31/2018 07:16 AM, Shyam Ranganathan wrote:
> > On 07/30/2018 03:21 PM, Shyam Ranganathan wrote:
> >> On 07/24/2018 03:12 PM, Shyam Ranganathan wrote:
> >>> 1) master branch health checks (weekly, till branching)
> >>> - Expect every Monday a status update on various tests runs
> >> See https://build.gluster.org/job/nightly-master/ for a report on
> >> various nightly and periodic jobs on master.
> >
> > Thinking aloud, we may have to stop merges to master to get these test
> > failures addressed at the earliest and to continue maintaining them
> > GREEN for the health of the branch.
> >
> > I would give the above a week, before we lock down the branch to fix
> > the failures.
>
> Let's try and get line-coverage and nightly regression tests addressed
> this week (leaving mux-regression open), and if addressed, not lock the
> branch down.
>
> Health on master as of the last nightly run [4] is still the same.
>
> Potential patches that rectify the situation (as in [1]) are bunched in
> a patch [2] that Atin and myself have put through several regressions
> (mux, normal and line coverage) and these have also not passed.
>
> Till we rectify the situation we are locking down master branch commit
> rights to the following people: Amar, Atin, Shyam, Vijay.
>
> The intention is to stabilize master and not add more patches that may
> destabilize it.

https://review.gluster.org/#/c/20603/ has been merged. As far as I can see, it has nothing to do with stabilization and should be reverted.
Y.

> Test cases that are tracked as failures and need action are present here
> [3].
>
> @Nigel, request you to apply the commit rights change as you see this
> mail and let the list know regarding the same as well.
>
> Thanks,
> Shyam
>
> [1] Patches that address regression failures:
> https://review.gluster.org/#/q/starredby:srangana%2540redhat.com
>
> [2] Bunched up patch against which regressions were run:
> https://review.gluster.org/#/c/20637
>
> [3] Failing tests list:
> https://docs.google.com/spreadsheets/d/1IF9GhpKah4bto19RQLr0y_Kkw26E_-crKALHSaSjZMQ/edit?usp=sharing
>
> [4] Nightly run dashboard: https://build.gluster.org/job/nightly-master/
[Gluster-devel] Coverity covscan for 2018-08-07-9e03c5fc (master branch)
GlusterFS Coverity covscan results for the master branch are available from
http://download.gluster.org/pub/gluster/glusterfs/static-analysis/master/glusterfs-coverity/2018-08-07-9e03c5fc/

Coverity covscan results for other active branches are also available at
http://download.gluster.org/pub/gluster/glusterfs/static-analysis/
[Gluster-devel] Coverity cleanup drive
Hello folks,

We are planning to reduce Coverity errors to improve upstream stability. To avoid duplicate efforts and concerns like patch review, we suggest the following rules:

1. Visit https://scan.coverity.com/projects/gluster-glusterfs and request to "Add me to project" to see all reported Coverity errors.

2. Pick any error and assign it to yourself in the Triage -> Owner section; it will let others know you are the owner of that error and avoid duplicate efforts. In Triage -> Ext. Reference you can post the patch link. For example, see:
https://scan6.coverity.com/reports.htm#v42401/p10714/fileInstanceId=84384726=25600457=727233=25600457-1

3. You should pick Coverity errors on a per component/file basis and fix all the errors reported for that component/file. If a file has only a few defects, it is advisable to combine errors from the same component. This will help the reviewers to quickly review the patch. For example, see https://review.gluster.org/#/c/20600/.

4. Please use BUG: 789278 to send all Coverity-related patches, and do not forget to mention the Coverity link in the commit message. For example, see https://review.gluster.org/#/c/20600/.

5. After the patch is merged, please make an entry of the Coverity ID in the spreadsheet:
https://docs.google.com/spreadsheets/d/1qZNallBF30T2w_qi0wRxzqDn0KbcYP5zshiLrYu9XyQ/edit?usp=sharing

Yes! You can win some swag by participating in this effort if you meet these criteria:
a. Triage the bugs properly.
b. Update the sheet.
c. More than 17 Coverity fixes.
d. Most importantly, all the submitted patches should get merged by 25th Aug 2018.

Please feel free to let us know if you have any questions,
* Sunny - sunku...@redhat.com
* Karthik - ksubr...@redhat.com
* Bhumika - bgo...@redhat.com

- Sunny
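[Editorial note: as a hedged illustration of rule 4, a Coverity patch's commit message could be shaped as below. The component name, file, and defect types in the example are made up; only the BUG id (789278) comes from the rules above.]

```shell
# Create a throwaway repo and make a commit with the suggested message
# shape: subject, body mentioning the Coverity link, and the BUG footer.
repo=$(mktemp -d)
git -C "$repo" init -q
git -C "$repo" -c user.name=example -c user.email=example@example.com \
    commit -q --allow-empty \
    -m "core: fix Coverity defects in example-file.c (illustrative)" \
    -m "Mention the scan.coverity.com defect link here, per rule 4.

BUG: 789278"

# Inspect the resulting message:
git -C "$repo" log -1 --format=%B
```

Keeping the BUG footer and the defect link in the message lets reviewers and the tracker tie the patch back to the triaged CID.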
[Gluster-devel] Fwd: Gerrit downtime on Aug 8, 2018
Reminder, this upgrade is tomorrow.

-- Forwarded message -
From: Nigel Babu
Date: Fri, Jul 27, 2018 at 5:28 PM
Subject: Gerrit downtime on Aug 8, 2018
To: gluster-devel
Cc: gluster-infra, <automated-test...@gluster.org>

Hello,

It's been a while since we upgraded Gerrit. We plan to do a full upgrade and move to 2.15.3. Among other changes, this brings in the new PolyGerrit interface, which brings significant frontend changes. You can take a look at how this would look on the staging site[1].

## Outage Window

0330 EDT to 0730 EDT
0730 UTC to 1130 UTC
1300 IST to 1700 IST

The actual time needed for the upgrade is about an hour, but we want to keep a larger window open to roll back in the event of any problems during the upgrade.

--
nigelb