Re: [Gluster-devel] [release-4.0] FAILED ./tests/bugs/ec/bug-1236065.t

2018-03-28 Thread Milind Changire
This is now failing with brick-mux *on*:
https://build.gluster.org/job/centos7-regression/535/consoleFull

Patch: https://review.gluster.org/19786


On Tue, Mar 20, 2018 at 11:57 PM, Raghavendra Gowdappa wrote:

> Patch at https://review.gluster.org/19746
>
> On Tue, Mar 20, 2018 at 8:42 PM, Milind Changire wrote:
>
>> Jenkins Job: https://build.gluster.org/job/centos7-regression/405/consoleFull
>>
>> --
>> Milind
>>


-- 
Milind

Re: [Gluster-devel] replace-brick commit force fails in multi node cluster

2018-03-28 Thread Karthik Subrahmanya
Hey Atin,

This is happening because glusterd on the third node was brought down
before doing the replace brick.
Replace brick does a temporary mount to mark a pending xattr on the
source bricks, saying that the brick being replaced is the sink.
But in this case, since glusterd is down on the node of one of the
source bricks, the mount fails to get the port on which that brick is
listening, and so setting the "trusted.replace-brick" xattr fails there.
For a replica 3 volume to treat any fop as successful, AFR needs at
least a quorum (2 of 3) of successes, which the setxattr does not get
here. Hence the replace brick fails.
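
For anyone who wants to see this on disk: the pending markers are ordinary
extended attributes on the source bricks' backend, so they can be inspected
straight from the brick directory with getfattr. A minimal sketch, assuming
a hypothetical brick backend at /bricks/brick1/testvol (not a path from
this report):

  # Dump AFR's on-disk metadata from the brick root; after a successful
  # replace brick the surviving source bricks carry trusted.afr.* pending
  # counters marking the replaced brick as the sink to be healed.
  getfattr -d -m 'trusted.afr.' -e hex /bricks/brick1/testvol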

On the QE setup the replace brick would have succeeded only because of some
race between glusterd going down and the replace brick happening; otherwise
there is no way for the replace brick to succeed.
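
To spell the sequence out, the failure should also be reproducible outside
the test framework with the plain gluster CLI; a rough sketch (host names
and brick paths are placeholders, not taken from the original report):

  # On node 1 of a three-node cluster, create and start a replica 3 volume.
  gluster volume create testvol replica 3 \
      host1:/bricks/testvol1 host2:/bricks/testvol1 host3:/bricks/testvol1
  gluster volume start testvol

  # On node 3, stop glusterd so that one source brick's port can no longer
  # be resolved when replace brick makes its temporary mount.
  systemctl stop glusterd

  # Back on node 1: the commit is expected to fail with "Commit failed on
  # localhost" because the pending-xattr fop cannot meet quorum.
  gluster volume replace-brick testvol \
      host1:/bricks/testvol1 host1:/bricks/testvol1_new commit force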

Regards,
Karthik

On Tue, Mar 27, 2018 at 7:25 PM, Atin Mukherjee wrote:

> While writing a test for the patch fix of BZ
> https://bugzilla.redhat.com/show_bug.cgi?id=1560957 I just can't make my
> test case pass: a replace brick commit force always fails on a multi node
> cluster, and that's on the latest mainline code.
>
>
> *The fix is a one liner:*
> atin@dhcp35-96:~/codebase/upstream/glusterfs_master/glusterfs$ gd HEAD~1
> diff --git a/xlators/mgmt/glusterd/src/glusterd-utils.c
> b/xlators/mgmt/glusterd/src/glusterd-utils.c
> index af30756c9..24d813fbd 100644
> --- a/xlators/mgmt/glusterd/src/glusterd-utils.c
> +++ b/xlators/mgmt/glusterd/src/glusterd-utils.c
> @@ -5995,6 +5995,7 @@ glusterd_brick_start (glusterd_volinfo_t *volinfo,
>           * TBD: re-use RPC connection across bricks
>           */
>          if (is_brick_mx_enabled ()) {
> +                brickinfo->port_registered = _gf_true;
>                  ret = glusterd_get_sock_from_brick_pid (pid, socketpath,
>                                                          sizeof(socketpath));
>                  if (ret) {
>
> *The test does the following:*
>
> #!/bin/bash
>
> . $(dirname $0)/../../include.rc
> . $(dirname $0)/../../cluster.rc
> . $(dirname $0)/../../volume.rc
>
> cleanup;
>
> TEST launch_cluster 3;
>
> TEST $CLI_1 peer probe $H2;
> EXPECT_WITHIN $PROBE_TIMEOUT 1 peer_count
>
> TEST $CLI_1 peer probe $H3;
> EXPECT_WITHIN $PROBE_TIMEOUT 2 peer_count
>
> TEST $CLI_1 volume set all cluster.brick-multiplex on
>
> TEST $CLI_1 volume create $V0 replica 3 $H1:$B1/${V0}1 $H2:$B2/${V0}1 $H3:$B3/${V0}1
>
> TEST $CLI_1 volume start $V0
> EXPECT_WITHIN $PROCESS_UP_TIMEOUT "1" brick_up_status_1 $V0 $H1 $B1/${V0}1
> EXPECT_WITHIN $PROCESS_UP_TIMEOUT "1" brick_up_status_1 $V0 $H2 $B2/${V0}1
> EXPECT_WITHIN $PROCESS_UP_TIMEOUT "1" brick_up_status_1 $V0 $H3 $B3/${V0}1
>
> #bug-1560957 - replace brick followed by an add-brick in a brick mux setup
> #brings down one brick instance
>
> kill_glusterd 3
> EXPECT_WITHIN $PROBE_TIMEOUT 1 peer_count
>
> TEST $CLI_1 volume replace-brick $V0 $H1:$B1/${V0}1 $H1:$B1/${V0}1_new commit force
>
> *This is where the test always fails, saying "volume replace-brick: failed:
> Commit failed on localhost. Please check log file for details."*
>
> TEST $glusterd_3
> EXPECT_WITHIN $PROBE_TIMEOUT 2 peer_count
>
> TEST $CLI_1 volume add-brick $V0 replica 3 $H1:$B1/${V0}3 $H2:$B1/${V0}3 $H3:$B1/${V0}3 commit force
>
> EXPECT_WITHIN $PROCESS_UP_TIMEOUT "1" brick_up_status_1 $V0 $H3 $B1/${V0}3
> cleanup;
>
> Glusterd log from the 1st node:
> [2018-03-27 13:11:58.630845] E [MSGID: 106053]
> [glusterd-utils.c:13889:glusterd_handle_replicate_brick_ops] 0-management:
> Failed to set extended attribute trusted.replace-brick : Transport endpoint
> is not connected [Transport endpoint is not connected]
>
> Requesting some help/attention from the AFR folks.
>

[Gluster-devel] Coverity covscan for 2018-03-28-caa76bf8 (master branch)

2018-03-28 Thread staticanalysis
GlusterFS Coverity covscan results are available from
http://download.gluster.org/pub/gluster/glusterfs/static-analysis/master/glusterfs-coverity/2018-03-28-caa76bf8