Re: [Gluster-users] Question about trusted.afr

2017-03-01 Thread Karthik Subrahmanya
Hey Tamal,

Sorry for the delay. See my comments inline.

On Tue, Feb 28, 2017 at 10:29 AM, Tamal Saha  wrote:

> Hi,
> I am running a GlusterFS cluster in Kubernetes. This has a single 1x2
> volume. I am dealing with a split-brain situation. During debugging I
> noticed that the files in the backend bricks do not have the proper
> trusted.afr xattrs. Given this volume has 2 bricks, I should see
> files with the following 2 afr xattrs, I am guessing:
>
> trusted.afr.vol-client-0
> trusted.afr.vol-client-1
>
> But I see files with xattrs below:
>
> trusted.afr.vol-client-2
> trusted.afr.vol-client-3
> trusted.afr.vol-client-5
>
> As I run the bricks as pods in Kubernetes, I have from time to time added
> and removed bricks from this volume. Basically the pod IPs change when the
> pods get restarted. My questions are:
>
> 1. Am I seeing wrong trusted.afr?
>
AFAIK, the numbers 2, 3, and 5 you are seeing in trusted.afr are because you
had added bricks to your volume. This is because of the change [1].
I suspect that you are getting 3 brick IDs in the changelog, since it was a
1x3 volume after adding the brick to the volume.
You can check the brick IDs/names of the trusted.afr attributes in the
vol file ".tcp-fuse.vol" by grepping for "afr-pending-xattr".

> 2. When I remove-brick and add a new brick and then run heal full, does it
> reapply the trusted.afrs properly?
>
What do you mean by reapplying trusted.afr? Are you referring to changing
the values of attributes or changing the brick IDs?

[1]
https://github.com/gluster/glusterfs-specs/blob/master/done/GlusterFS%203.6/Persistent%20AFR%20Changelog%20xattributes.md

>
> Thanks,
> -Tamal
>
>
> ___
> Gluster-users mailing list
> Gluster-users@gluster.org
> http://lists.gluster.org/mailman/listinfo/gluster-users
>
___
Gluster-users mailing list
Gluster-users@gluster.org
http://lists.gluster.org/mailman/listinfo/gluster-users

Re: [Gluster-users] Replica 2 Quorum and arbiter

2017-04-19 Thread Karthik Subrahmanya
Hi,
Comments inline.

On Tue, Apr 18, 2017 at 1:11 AM, Mahdi Adnan 
wrote:

> Hi,
>
>
> We have a replica 2 volume and we have issue with setting proper quorum.
>
> The volumes used as datastore for vmware/ovirt, the current settings for
> the quorum are:
>
>
> cluster.quorum-type: auto
> cluster.server-quorum-type: server
> cluster.server-quorum-ratio: 51%
>
>
> Losing the first node, which hosts the first bricks, will take the storage
> domain in ovirt offline, but the FUSE mount point works fine "read/write"
>
This is not possible. When the client quorum is set to auto and the replica
count is 2, the first node should be up for read/write to happen from the
mount. Maybe you are missing something here.

> Losing the second node or any other node that hosts only the second bricks
> of the replication will not affect the ovirt storage domain, i.e. the 2nd
> or 4th nodes.
>
Since the server-quorum-ratio is set to 51%, this is also not possible.
Can you share the volume info here?

> As I understand, losing the first brick in a replica 2 volume will render
> the volume read-only
>
Yes, you are correct, losing the first brick will make the volume
read-only.

> , but how does the FUSE mount work in read/write?
> Also, can we add an arbiter node to the current replica 2 volume without
> losing data? If yes, does the re-balance bug "Bug 1440635" affect this
> process?
>
Yes, you can add an arbiter brick without losing data, and bug 1440635
will not affect that, since only the metadata needs to be replicated on the
arbiter brick.
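
A rough sketch of what the conversion could look like (the volume name and
the brick path on the arbiter node are placeholders, not taken from your
setup):

gluster volume add-brick <volname> replica 3 arbiter 1 arbiternode:/bricks/arbiter-brick

After the add-brick, self-heal populates the arbiter brick with the directory
structure and the file metadata.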

> And what happens if we set "cluster.quorum-type: none" and the first node
> goes offline?
>
If you set the quorum-type to none in a replica 2 volume, you will be able
to read/write even when only one brick is up.
For an arbiter volume, quorum-type is auto by default and that is the
recommended setting.
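
For reference, a sketch of how the client quorum setting could be checked and
changed (the volume name is a placeholder, and "volume get" needs a release
that supports it):

gluster volume get <volname> cluster.quorum-type
gluster volume set <volname> cluster.quorum-type none

Keep in mind that quorum-type none removes the client-side protection against
split-brains on a plain replica 2 volume.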

HTH,
Karthik

>
> Thank you.
>
> --
>
> Respectfully
> *Mahdi A. Mahdi*
>
>
> ___
> Gluster-users mailing list
> Gluster-users@gluster.org
> http://lists.gluster.org/mailman/listinfo/gluster-users
>
___
Gluster-users mailing list
Gluster-users@gluster.org
http://lists.gluster.org/mailman/listinfo/gluster-users

Re: [Gluster-users] GluserFS WORM hardlink

2017-07-07 Thread Karthik Subrahmanya
Hi,

If I have not misunderstood, you are saying that WORM is not allowing you to
create hard links for the files.
Answering based on that assumption.
If the volume-level or file-level WORM feature is enabled and the file is
in the WORM/WORM-Retained state,
then those files are immutable and hence we don't allow creating
hard links for those files.
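
For context, a minimal sketch of how the two flavours are enabled (the volume
name is a placeholder; please check the exact option names against your
gluster release):

gluster volume set <volname> features.worm on
gluster volume set <volname> features.worm-file-level on

The first option makes the whole volume WORM; the second enables the
per-file WORM/Retention state transitions described above.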

If this does not clarify your doubt, or if I misunderstood your question,
could you please elaborate?

Regards,
Karthik

On Fri, Jul 7, 2017 at 2:30 PM, 최두일  wrote:

> GlusterFS WORM hard links will not be created
>
> OS is CentOS7
>
> ___
> Gluster-users mailing list
> Gluster-users@gluster.org
> http://lists.gluster.org/mailman/listinfo/gluster-users
>
___
Gluster-users mailing list
Gluster-users@gluster.org
http://lists.gluster.org/mailman/listinfo/gluster-users

Re: [Gluster-users] GlusterFS WORM mode can't create hard link pliz ㅠ

2017-07-09 Thread Karthik Subrahmanya
On Mon, Jul 10, 2017 at 5:55 AM, 최두일  wrote:

> A read-only file system does not produce a hard link in GlusterFS
> WORM mode. Is it impossible?
>
It is not possible to create a hard link in WORM mode.
If a file is in WORM mode then even through hard links you cannot change the
data,
since both point to the same location on the disk.
Once a file becomes immutable, it is not recommended to change its contents
or metadata.

May I know the use case where you want the hard links to be allowed in the
WORM mode?

Regards,
Karthik

> OS is CentOS7
>
> ___
> Gluster-users mailing list
> Gluster-users@gluster.org
> http://lists.gluster.org/mailman/listinfo/gluster-users
>
___
Gluster-users mailing list
Gluster-users@gluster.org
http://lists.gluster.org/mailman/listinfo/gluster-users

Re: [Gluster-users] Restore a node in a replicating Gluster setup after data loss

2017-06-04 Thread Karthik Subrahmanya
Hey Niklaus,

Sorry for the delay. The *reset-brick* should do the trick for you.
You can have a look at [1] for more details.

[1] https://gluster.readthedocs.io/en/latest/release-notes/3.9.0/
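
A sketch of what that could look like for the brick that lost its data (the
volume name and brick path are taken from your mail; please double-check the
exact syntax against the release notes before running it, and repeat it for
each affected volume):

gluster volume reset-brick gv-tier1-vm-01 Server2:/var/data/lv-vm-01 start
gluster volume reset-brick gv-tier1-vm-01 Server2:/var/data/lv-vm-01 Server2:/var/data/lv-vm-01 commit force

The volume stays online on Server1 and Server3 while the brick on Server2 is
re-populated by self-heal.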

HTH,
Karthik

On Thu, Jun 1, 2017 at 12:28 PM, Niklaus Hofer <
niklaus.ho...@stepping-stone.ch> wrote:

> Hi
>
> We have a Replica 2 + Arbiter Gluster setup with 3 nodes Server1, Server2
> and Server3 where Server3 is the Arbiter node. There are several Gluster
> volumes ontop of that setup. They all look a bit like this:
>
> gluster volume info gv-tier1-vm-01
>
> [...]
> Number of Bricks: 1 x (2 + 1) = 3
> [...]
> Bricks:
> Brick1: Server1:/var/data/lv-vm-01
> Brick2: Server2:/var/data/lv-vm-01
> Brick3: Server3:/var/data/lv-vm-01/brick (arbiter)
> [...]
> cluster.data-self-heal-algorithm: full
> [...]
>
> We took down Server2 because we needed to do maintenance on this server's
> storage. During maintenance work, we ended up having to completely rebuild
> the storage on Server2. This means that "/var/data/lv-vm-01" on Server2 is
> now empty. However, all the Gluster metadata in "/var/lib/glusterd/" is
> still intact. Gluster has not been started on Server2.
>
> Here is what our sample gluster volume currently looks like on the still
> active nodes:
>
> gluster volume status gv-tier1-vm-01
>
> Status of volume: gv-tier1-vm-01
> Gluster process TCP Port  RDMA Port  Online
> Pid
> 
> --
> Brick Server1:/var/data/lv-vm-0149204 0  Y 22775
> Brick Server3:/var/data/lv-vm-01/brick  49161 0  Y 15334
> Self-heal Daemon on localhost   N/A   N/AY 19233
> Self-heal Daemon on Server3 N/A   N/AY 20839
>
>
> Now we would like to rebuild the data on Server2 from the still intact
> data on Server1. That is to say, we hope to start up Gluster on Server2 in
> such a way that it will sync the data from Server1 back. If at all
> possible, the Gluster cluster should stay up during this process and access
> to the Gluster volumes should not be interrupted.
>
> What is the correct / recommended way of doing this?
>
> Greetings
> Niklaus Hofer
> --
> stepping stone GmbH
> Neufeldstrasse 9
> CH-3012 Bern
>
> Telefon: +41 31 332 53 63
> www.stepping-stone.ch
> niklaus.ho...@stepping-stone.ch
> ___
> Gluster-users mailing list
> Gluster-users@gluster.org
> http://lists.gluster.org/mailman/listinfo/gluster-users
>
___
Gluster-users mailing list
Gluster-users@gluster.org
http://lists.gluster.org/mailman/listinfo/gluster-users

Re: [Gluster-users] Interesting split-brain...

2017-06-14 Thread Karthik Subrahmanya
Hi Ludwig,

There is no way to automatically resolve gfid split-brains with a type
mismatch. You have to do it manually by following the steps in [1].
In case of a type mismatch it is recommended to resolve it manually. But for
a plain gfid mismatch, in 3.11 we have a way to
resolve it by using the *favorite-child-policy*.
Since the file is not important, you can go ahead and delete it.

[1]
https://gluster.readthedocs.io/en/latest/Troubleshooting/split-brain/#fixing-directory-entry-split-brain
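
Purely as an illustration of the manual steps in [1], assuming you decide the
copy on brick 192.168.186.11 (where .zsh_history is an empty directory) is
the bad one; please verify the gfid and the paths from your own getfattr
output before removing anything:

On the bad brick (since that copy is a directory, its .glusterfs entry is a
symlink, which rm -rf also handles):
rm -rf /mnt/DATA/data/abc/.zsh_history
rm -rf /mnt/DATA/data/.glusterfs/de/e4/dee43407-139d-41f0-91d1-3e106a51f262

Then trigger a lookup from the client mount so the good copy gets healed (the
mount point path is a placeholder):
stat <mount-point>/abc/.zsh_history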

HTH,
Karthik

On Thu, Jun 15, 2017 at 8:23 AM, Ludwig Gamache 
wrote:

> I am new to gluster but already like it. I did a maintenance last week
> where I shut down both nodes (one after the other). I had many files that
> needed to be healed after that. Everything worked well, except for 1 file.
> It is in split-brain, with 2 different GFID. I read the documentation but
> it only covers the cases where the GFID is the same on both bricks. BTW, I
> am running Gluster 3.10.
>
> Here are some details...
>
> [root@NAS-01 .glusterfs]# gluster volume heal data01 info
>
> Brick 192.168.186.11:/mnt/DATA/data
>
> /abc/.zsh_history
>
> /abc - Is in split-brain
>
>
> Status: Connected
>
> Number of entries: 2
>
>
> Brick 192.168.186.12:/mnt/DATA/data
>
> /abc - Is in split-brain
>
>
> /abc/.zsh_history
>
> Status: Connected
>
> Number of entries: 2
>
> On brick 1:
>
> [root@NAS-01 abc]# ls -lart
>
> total 75
>
> drwxr-xr-x.  2 root  root  2 Jun  8 13:26 .zsh_history
>
> drwxr-xr-x.  3 12078 root  3 Jun 12 11:36 .
>
> drwxrwxrwt. 17 root  root 17 Jun 12 12:20 ..
>
> On brick 2:
>
> [root@DC-MTL-NAS-02 abc]# ls -lart
>
> total 66
>
> -rw-rw-r--.  2 12078 12078 1085 Jun 12 04:42 .zsh_history
>
> drwxr-xr-x.  2 12078 root 3 Jun 12 10:36 .
>
> drwxrwxrwt. 17 root  root17 Jun 12 11:20 ..
>
> Notice that on one brick, it is a file and on the other one it is a
> directory.
>
> On brick 1:
>
> [root@NAS-01 abc]# getfattr -d -m . -e hex /mnt/DATA/data/abc/.zsh_history
>
> getfattr: Removing leading '/' from absolute path names
>
> # file: mnt/DATA/data/abc/.zsh_history
>
> security.selinux=0x73797374656d5f753a6f626a6563
> 745f723a756e6c6162656c65645f743a733000
>
> trusted.afr.data01-client-0=0x
>
> trusted.afr.data01-client-1=0x0002
>
> trusted.gfid=0xdee43407139d41f091d13e106a51f262
>
> trusted.glusterfs.dht=0x0001
>
> On brick 2:
>
> root@NAS-02 abc]# getfattr -d -m . -e hex /mnt/DATA/data/abc/.zsh_history
>
> getfattr: Removing leading '/' from absolute path names
>
> # file: mnt/DATA/data/abc/.zsh_history
>
> security.selinux=0x73797374656d5f753a6f626a6563
> 745f723a756e6c6162656c65645f743a733000
>
> trusted.afr.data01-client-0=0x00170002
>
> trusted.afr.data01-client-1=0x
>
> trusted.bit-rot.version=0x060059397acd0005dadd
>
> trusted.gfid=0xa70ae9af887a4a37875f5c7c81ebc803
>
> Any recommendation on how to recover from that? BTW, the file is not
> important and I could easily get rid of it without impact. So, if this is
> an easy solution...
>
> Regards,
>
> --
> Ludwig Gamache
>
>
> ___
> Gluster-users mailing list
> Gluster-users@gluster.org
> http://lists.gluster.org/mailman/listinfo/gluster-users
>
___
Gluster-users mailing list
Gluster-users@gluster.org
http://lists.gluster.org/mailman/listinfo/gluster-users

Re: [Gluster-users] gfid entries in volume heal info that do not heal

2017-10-16 Thread Karthik Subrahmanya
Hi Matt,

The files might be in split-brain. Could you please send the outputs of
these?
gluster volume info <volname>
gluster volume heal <volname> info
And also the getfattr output of the files which are in the heal info output
from all the bricks of that replica pair.
getfattr -d -e hex -m . <file-path-on-brick>

Thanks &  Regards
Karthik

On 16-Oct-2017 8:16 PM, "Matt Waymack"  wrote:

Hi all,



I have a volume where the output of volume heal info shows several gfid
entries to be healed, but they’ve been there for weeks and have not
healed.  Any normal file that shows up on the heal info does get healed as
expected, but these gfid entries do not.  Is there any way to remove these
orphaned entries from the volume so they are no longer stuck in the heal
process?



Thank you!

___
Gluster-users mailing list
Gluster-users@gluster.org
http://lists.gluster.org/mailman/listinfo/gluster-users
___
Gluster-users mailing list
Gluster-users@gluster.org
http://lists.gluster.org/mailman/listinfo/gluster-users

Re: [Gluster-users] gfid entries in volume heal info that do not heal

2017-10-17 Thread Karthik Subrahmanya
Hi Matt,

Run these commands on all the bricks of the replica pair to get the attrs
set on the backend.

On the bricks of the first replica set:
getfattr -d -e hex -m . <brick-path>/.glusterfs/10/86/108694db-c039-4b7c-bd3d-ad6a15d811a2

On the fourth replica set:
getfattr -d -e hex -m . <brick-path>/.glusterfs/e0/c5/e0c56bf7-8bfe-46ca-bde1-e46b92d33df3

Also run "gluster volume heal <volname>" once and send the shd log.
And the output of "gluster volume heal <volname> info split-brain"

Regards,
Karthik

On Mon, Oct 16, 2017 at 9:51 PM, Matt Waymack <mwaym...@nsgdv.com> wrote:

> OK, so here’s my output of the volume info and the heal info. I have not
> yet tracked down physical location of these files, any tips to finding them
> would be appreciated, but I’m definitely just wanting them gone.  I forgot
> to mention earlier that the cluster is running 3.12 and was upgraded from
> 3.10; these files were likely stuck like this when it was on 3.10.
>
>
>
> [root@tpc-cent-glus1-081017 ~]# gluster volume info gv0
>
>
>
> Volume Name: gv0
>
> Type: Distributed-Replicate
>
> Volume ID: 8f07894d-e3ab-4a65-bda1-9d9dd46db007
>
> Status: Started
>
> Snapshot Count: 0
>
> Number of Bricks: 4 x (2 + 1) = 12
>
> Transport-type: tcp
>
> Bricks:
>
> Brick1: tpc-cent-glus1-081017:/exp/b1/gv0
>
> Brick2: tpc-cent-glus2-081017:/exp/b1/gv0
>
> Brick3: tpc-arbiter1-100617:/exp/b1/gv0 (arbiter)
>
> Brick4: tpc-cent-glus1-081017:/exp/b2/gv0
>
> Brick5: tpc-cent-glus2-081017:/exp/b2/gv0
>
> Brick6: tpc-arbiter1-100617:/exp/b2/gv0 (arbiter)
>
> Brick7: tpc-cent-glus1-081017:/exp/b3/gv0
>
> Brick8: tpc-cent-glus2-081017:/exp/b3/gv0
>
> Brick9: tpc-arbiter1-100617:/exp/b3/gv0 (arbiter)
>
> Brick10: tpc-cent-glus1-081017:/exp/b4/gv0
>
> Brick11: tpc-cent-glus2-081017:/exp/b4/gv0
>
> Brick12: tpc-arbiter1-100617:/exp/b4/gv0 (arbiter)
>
> Options Reconfigured:
>
> nfs.disable: on
>
> transport.address-family: inet
>
>
>
> [root@tpc-cent-glus1-081017 ~]# gluster volume heal gv0 info
>
> Brick tpc-cent-glus1-081017:/exp/b1/gv0
>
> 
>
> 
>
> 
>
> 
>
> 
>
>
>
> 
>
>
>
> Status: Connected
>
> Number of entries: 118
>
>
>
> Brick tpc-cent-glus2-081017:/exp/b1/gv0
>
> 
>
> 
>
> 
>
> 
>
> 
>
>
>
> 
>
>
>
> Status: Connected
>
> Number of entries: 118
>
>
>
> Brick tpc-arbiter1-100617:/exp/b1/gv0
>
> Status: Connected
>
> Number of entries: 0
>
>
>
> Brick tpc-cent-glus1-081017:/exp/b2/gv0
>
> Status: Connected
>
> Number of entries: 0
>
>
>
> Brick tpc-cent-glus2-081017:/exp/b2/gv0
>
> Status: Connected
>
> Number of entries: 0
>
>
>
> Brick tpc-arbiter1-100617:/exp/b2/gv0
>
> Status: Connected
>
> Number of entries: 0
>
>
>
> Brick tpc-cent-glus1-081017:/exp/b3/gv0
>
> Status: Connected
>
> Number of entries: 0
>
>
>
> Brick tpc-cent-glus2-081017:/exp/b3/gv0
>
> Status: Connected
>
> Number of entries: 0
>
>
>
> Brick tpc-arbiter1-100617:/exp/b3/gv0
>
> Status: Connected
>
> Number of entries: 0
>
>
>
> Brick tpc-cent-glus1-081017:/exp/b4/gv0
>
> 
>
> 
>
> 
>
> 
>
> 
>
> 
>
> 
>
> 
>
> 
>
> 
>
> 
>
> 
>
> 
>
> 
>
> 
>
> 
>
> 
>
> 
>
> 
>
> 
>
> 
>
> 
>
> 
>
> 
>
> Status: Connected
>
> Number of entries: 24
>
>
>
> Brick tpc-cent-glus2-081017:/exp/b4/gv0
>
> 
>
> 
>
> 
>
> 
>
> 
>
> 
>
> 
>
> 
>
> 
>
> 
>
> 
>
> 
>
> 
>
> 
>
> 
>
> 
>
> 
>
> 
>
> 
>
> 
>
> 
>
> 
>
> 
>
> 
>
> Status: Connected
>
> Number of entries: 24
>
>
>
> Brick tpc-arbiter1-100617:/exp/b4/gv0
>
> Status: Connected
>
> Number of entries: 0
>
>
>
> Thank you for your help!
>
>
>
> *From:* Karthik Subrahmanya [mailto:ksubr...@redhat.com]
> *Sent:* Monday, October 16, 2017 10:27 AM
> *To:* Matt Waymack <mwaym...@nsgdv.com>
> *Cc:* gluster-users <Gluster-users@gluster.org>
> *Subject:* Re: [Gluster-users] gfid entries in volume heal info that do
> not heal
>
>
>
> Hi Matt,
>
>
>
> The files might be in split brain. Could you please send the outputs of
> these?
>
> gluster volume info 
>
> gluster volume heal  info
>
> And also the getfattr output of the files which are in the heal info
> output from all the bricks of that replica pair.
>
> getfattr -d -e hex -m . 
>
>
>
> Thanks &  Regards
>
> Karthik
>
>
>
> On 16-Oct-2017 8:16 PM, "Matt Waymack" <mwaym...@nsgdv.com> wrote:
>
> Hi all,
>
>
>
> I have a volume where the output of volume heal info shows several gfid
> entries to be healed, but they’ve been there for weeks and have not
> healed.  Any normal file that shows up on the heal info does get healed as
> expected, but these gfid entries do not.  Is there any way to remove these
> orphaned entries from the volume so they are no longer stuck in the heal
> process?
>
>
>
> Thank you!
>
>
> ___
> Gluster-users mailing list
> Gluster-users@gluster.org
> http://lists.gluster.org/mailman/listinfo/gluster-users
>
>
>
___
Gluster-users mailing list
Gluster-users@gluster.org
http://lists.gluster.org/mailman/listinfo/gluster-users

Re: [Gluster-users] Adding bricks to an existing installation.

2017-09-26 Thread Karthik Subrahmanya
Hey Ludwig,

Yes, this configuration is fine. You can add the bricks and do the rebalance
after that.

FYI: Replica 2 volumes are prone to split-brains. Replica 3 or arbiter will
greatly reduce the possibility of ending up in split-brain.
If possible, consider using one of those configurations. For more
information, go through [1].

[1] https://gluster.readthedocs.io/en/latest/Administrator%20Guide/Split%20brain%20and%20ways%20to%20deal%20with%20it/
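
A rough sketch of the sequence you described (the host names follow your
description, the brick paths are placeholders):

gluster peer probe serverC
gluster volume add-brick <volname> serverA:/new/brick serverC:/new/brick1
gluster volume add-brick <volname> serverB:/new/brick serverC:/new/brick2
gluster volume rebalance <volname> start
gluster volume rebalance <volname> status

For a replica 2 volume the bricks have to be added in pairs, so each
add-brick above creates one new replicated subvolume, which matches the
layout you described.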

Thanks & Regards,
Karthik

On Tue, Sep 26, 2017 at 12:48 AM, Ludwig Gamache 
wrote:

> Sharding is not enabled.
>
> Ludwig
>
> On Mon, Sep 25, 2017 at 2:34 PM,  wrote:
>
>> Do you have sharding enabled ? If yes, don't do it.
>> If no I'll let someone who knows better answer you :)
>>
>> On Mon, Sep 25, 2017 at 02:27:13PM -0400, Ludwig Gamache wrote:
>> > All,
>> >
>> > We currently have a Gluster installation which is made of 2 servers.
>> Each
>> > server has 10 drives on ZFS. And I have a gluster mirror between these
>> 2.
>> >
>> > The current config looks like:
>> > SERVER A-BRICK 1 replicated to SERVER B-BRICK 1
>> >
>> > I now need to add more space and a third server. Before I do the
>> changes, I
>> > want to know if this is a supported config. By adding a third server, I
>> > simply want to distribute the load. I don't want to add extra
>> redundancy.
>> >
>> > In the end, I want to have the following done:
>> > Add a peer to the cluster
>> > Add 2 bricks to the cluster (one on server A and one on SERVER C) to the
>> > existing volume
>> > Add 2 bricks to the cluster (one on server B and one on SERVER C) to the
>> > existing volume
>> > After that, I need to rebalance all the data between the bricks...
>> >
>> > Is this config supported? Is there something I should be careful before
>> I
>> > do this? Should I do a rebalancing before I add the 3 set of disks?
>> >
>> > Regards,
>> >
>> >
>> > Ludwig
>>
>> > ___
>> > Gluster-users mailing list
>> > Gluster-users@gluster.org
>> > http://lists.gluster.org/mailman/listinfo/gluster-users
>>
>>
>> ___
>> Gluster-users mailing list
>> Gluster-users@gluster.org
>> http://lists.gluster.org/mailman/listinfo/gluster-users
>>
>
>
>
> --
> Ludwig Gamache
> IT Director - Element AI
> 4200 St-Laurent, suite 1200
> 514-704-0564
>
> ___
> Gluster-users mailing list
> Gluster-users@gluster.org
> http://lists.gluster.org/mailman/listinfo/gluster-users
>
___
Gluster-users mailing list
Gluster-users@gluster.org
http://lists.gluster.org/mailman/listinfo/gluster-users

Re: [Gluster-users] after hard reboot, split-brain happened, but nothing showed in gluster voluem heal info command !

2017-09-28 Thread Karthik Subrahmanya
On Thu, Sep 28, 2017 at 11:41 AM, Zhou, Cynthia (NSB - CN/Hangzhou) <
cynthia.z...@nokia-sbell.com> wrote:

> Hi,
>
> Thanks for reply!
>
> I’ve checked [1]. But the problem is that there is nothing shown in the
> command “gluster volume heal <volname> info”. So these split-brain entries
> could only be detected when an app tries to visit them.
>
> I can find the gfid mismatch for those in-split-brain entries from the
> mount log; however, nothing shows in the shd log, and the shd log does not
> know about those split-brain entries, because there is nothing in the
> indices/xattrop directory.
>
I guess it was there before, and then it got cleared by one of the heal
processes, either client side or server side. I wanted to check that by
examining the logs.
Which version of gluster are you running, by the way?

>
>
> The log is not available right now; when it is reproduced, I will provide
> it to you. Thanks!
>
Ok.

>
>
> Best regards,
> *Cynthia **(周琳)*
>
> MBB SM HETRAN SW3 MATRIX
>
> Storage
> Mobile: +86 (0)18657188311
>
>
>
> *From:* Karthik Subrahmanya [mailto:ksubr...@redhat.com]
> *Sent:* Thursday, September 28, 2017 2:02 PM
> *To:* Zhou, Cynthia (NSB - CN/Hangzhou) <cynthia.z...@nokia-sbell.com>
> *Cc:* Gluster-users@gluster.org; gluster-de...@gluster.org
> *Subject:* Re: [Gluster-users] after hard reboot, split-brain happened,
> but nothing showed in gluster voluem heal info command !
>
>
>
> Hi,
>
> To resolve the gfid split-brain you can follow the steps at [1].
>
> Since we don't have the pending markers set on the files, it is not
> showing in the heal info.
> To debug this issue, need some more data from you. Could you provide these
> things?
>
> 1. volume info
>
> 2. mount log
>
> 3. brick logs
>
> 4. shd log
>
>
>
> May I also know which version of gluster you are running? From the info
> you have provided it looks like an old version.
>
> If it is, then it would be great if you can upgrade to one of the latest
> supported releases.
>
>
> [1] http://docs.gluster.org/en/latest/Troubleshooting/split-
> brain/#fixing-directory-entry-split-brain
>
>
>
> Thanks & Regards,
>
> Karthik
>
> On Wed, Sep 27, 2017 at 9:42 AM, Zhou, Cynthia (NSB - CN/Hangzhou) <
> cynthia.z...@nokia-sbell.com> wrote:
>
>
>
> HI gluster experts,
>
>
>
> I meet a tough problem about the “split-brain” issue. Sometimes, after hard
> reboot, we will find some files in split-brain, however neither the file
> nor its parent directory nor anything else is shown in the command “gluster
> volume heal <volname> info”, and there is also no entry in the
> .glusterfs/indices/xattrop directory. Can you help to shed some light on
> this issue? Thanks!
>
>
>
>
>
>
>
> Following is some info from our env,
>
>
>
> *Checking from sn-0 client, nothing is shown in split-brain!*
>
>
>
> [root@sn-0:/mnt/bricks/services/brick/netserv/ethip]
>
> # gluster v heal services info
>
> Brick sn-0:/mnt/bricks/services/brick/
>
> Number of entries: 0
>
>
>
> Brick sn-1:/mnt/bricks/services/brick/
>
> Number of entries: 0
>
>
>
> [root@sn-0:/mnt/bricks/services/brick/netserv/ethip]
>
> [root@sn-0:/mnt/bricks/services/brick/netserv/ethip]
>
> # gluster v heal services info split-brain
>
> Gathering list of split brain entries on volume services has been
> successful
>
>
>
> Brick sn-0.local:/mnt/bricks/services/brick
>
> Number of entries: 0
>
>
>
> Brick sn-1.local:/mnt/bricks/services/brick
>
> Number of entries: 0
>
>
>
> [root@sn-0:/mnt/bricks/services/brick/netserv/ethip]
>
> # ls -l /mnt/services/netserv/ethip/
>
> ls: cannot access '/mnt/services/netserv/ethip/sn-2': Input/output error
>
> ls: cannot access '/mnt/services/netserv/ethip/mn-1': Input/output error
>
> total 3
>
> -rw-r--r-- 1 root root 144 Sep 26 20:35 as-0
>
> -rw-r--r-- 1 root root 144 Sep 26 20:35 as-1
>
> -rw-r--r-- 1 root root 145 Sep 26 20:35 as-2
>
> -rw-r--r-- 1 root root 237 Sep 26 20:36 mn-0
>
> -? ? ??  ?? mn-1
>
> -rw-r--r-- 1 root root  73 Sep 26 20:35 sn-0
>
> -rw-r--r-- 1 root root  73 Sep 26 20:35 sn-1
>
> -? ? ??  ?? sn-2
>
> [root@sn-0:/mnt/bricks/services/brick/netserv/ethip]
>
>
>
> *Checking from glusterfs server side, the gfid of mn-1 on sn-0 and sn-1 is
> different*
>
>
>
> *[SN-0]*
>
> [root@sn-0:/mnt/bricks/services/brick/.glusterfs/53/a3]
>
> # getfattr -m . -d -e hex /mnt/bricks/services/brick/netserv/ethip
>
> getfattr: Removing leading '/' from absolute path names
>
> # file: mnt/bricks/services/brick/netserv/ethip
>
> trusted.gfid=0

Re: [Gluster-users] after hard reboot, split-brain happened, but nothing showed in gluster voluem heal info command !

2017-09-28 Thread Karthik Subrahmanya
Hi,

To resolve the gfid split-brain you can follow the steps at [1].
Since we don't have the pending markers set on the files, it is not showing
up in the heal info.
To debug this issue, I need some more data from you. Could you provide these
things?
1. volume info
2. mount log
3. brick logs
4. shd log

May I also know which version of gluster you are running? From the info you
have provided it looks like an old version.
If it is, then it would be great if you can upgrade to one of the latest
supported releases.

[1]
http://docs.gluster.org/en/latest/Troubleshooting/split-brain/#fixing-directory-entry-split-brain

Thanks & Regards,
Karthik

On Wed, Sep 27, 2017 at 9:42 AM, Zhou, Cynthia (NSB - CN/Hangzhou) <
cynthia.z...@nokia-sbell.com> wrote:

>
> HI gluster experts,
>
> I meet a tough problem about the “split-brain” issue. Sometimes, after hard
> reboot, we will find some files in split-brain, however neither the file
> nor its parent directory nor anything else is shown in the command “gluster
> volume heal <volname> info”, and there is also no entry in the
> .glusterfs/indices/xattrop directory. Can you help to shed some light on
> this issue? Thanks!
>
>
>
> Following is some info from our env,
>
> *Checking from sn-0 client, nothing is shown in split-brain!*
>
> [root@sn-0:/mnt/bricks/services/brick/netserv/ethip]
> # gluster v heal services info
> Brick sn-0:/mnt/bricks/services/brick/
> Number of entries: 0
>
> Brick sn-1:/mnt/bricks/services/brick/
> Number of entries: 0
>
> [root@sn-0:/mnt/bricks/services/brick/netserv/ethip]
> [root@sn-0:/mnt/bricks/services/brick/netserv/ethip]
> # gluster v heal services info split-brain
> Gathering list of split brain entries on volume services has been
> successful
>
> Brick sn-0.local:/mnt/bricks/services/brick
> Number of entries: 0
>
> Brick sn-1.local:/mnt/bricks/services/brick
> Number of entries: 0
>
> [root@sn-0:/mnt/bricks/services/brick/netserv/ethip]
> # ls -l /mnt/services/netserv/ethip/
> ls: cannot access '/mnt/services/netserv/ethip/sn-2': Input/output error
> ls: cannot access '/mnt/services/netserv/ethip/mn-1': Input/output error
> total 3
> -rw-r--r-- 1 root root 144 Sep 26 20:35 as-0
> -rw-r--r-- 1 root root 144 Sep 26 20:35 as-1
> -rw-r--r-- 1 root root 145 Sep 26 20:35 as-2
> -rw-r--r-- 1 root root 237 Sep 26 20:36 mn-0
> -? ? ??  ?? mn-1
> -rw-r--r-- 1 root root  73 Sep 26 20:35 sn-0
> -rw-r--r-- 1 root root  73 Sep 26 20:35 sn-1
> -? ? ??  ?? sn-2
> [root@sn-0:/mnt/bricks/services/brick/netserv/ethip]
>
> *Checking from glusterfs server side, the gfid of mn-1 on sn-0 and sn-1 is
> different*
>
> *[SN-0]*
> [root@sn-0:/mnt/bricks/services/brick/.glusterfs/53/a3]
> # getfattr -m . -d -e hex /mnt/bricks/services/brick/netserv/ethip
> getfattr: Removing leading '/' from absolute path names
> # file: mnt/bricks/services/brick/netserv/ethip
> trusted.gfid=0xee71d19ac0f84f60b11eb42a083644e4
> trusted.glusterfs.dht=0x0001
>
> [root@sn-0:/mnt/bricks/services/brick/netserv/ethip]
> # getfattr -m . -d -e hex mn-1
> # file: mn-1
> trusted.afr.dirty=0x
> trusted.afr.services-client-0=0x
> trusted.afr.services-client-1=0x
> trusted.gfid=0x53a33f437464475486f31c4e44d83afd
> [root@sn-0:/mnt/bricks/services/brick/netserv/ethip]
> # stat mn-1
>   File: mn-1
>   Size: 237  Blocks: 16 IO Block: 4096   regular file
> Device: fd51h/64849d    Inode: 2536    Links: 2
> Access: (0644/-rw-r--r--)  Uid: (0/root)   Gid: (0/root)
> Access: 2017-09-26 20:30:25.67900 +0300
> Modify: 2017-09-26 20:30:24.60400 +0300
> Change: 2017-09-26 20:30:24.61000 +0300
> Birth: -
> [root@sn-0:/mnt/bricks/services/brick/.glusterfs/indices/xattrop]
> # ls
> xattrop-63f8bbcb-7fa6-4fc8-b721-675a05de0ab3
> [root@sn-0:/mnt/bricks/services/brick/.glusterfs/indices/xattrop]
>
> [root@sn-0:/mnt/bricks/services/brick/.glusterfs/53/a3]
> # ls
> 53a33f43-7464-4754-86f3-1c4e44d83afd
> [root@sn-0:/mnt/bricks/services/brick/.glusterfs/53/a3]
> # stat 53a33f43-7464-4754-86f3-1c4e44d83afd
>   File: 53a33f43-7464-4754-86f3-1c4e44d83afd
>   Size: 237  Blocks: 16 IO Block: 4096   regular file
> Device: fd51h/64849d    Inode: 2536    Links: 2
> Access: (0644/-rw-r--r--)  Uid: (0/root)   Gid: (0/root)
> Access: 2017-09-26 20:30:25.67900 +0300
> Modify: 2017-09-26 20:30:24.60400 +0300
> Change: 2017-09-26 20:30:24.61000 +0300
> Birth: -
>
> #
> *[SN-1]*
>
> [root@sn-1:/mnt/bricks/services/brick/.glusterfs/f7/f1]
> #  getfattr -m . -d -e hex /mnt/bricks/services/brick/netserv/ethip
> getfattr: Removing leading '/' from absolute path names
> # file: mnt/bricks/services/brick/netserv/ethip
> trusted.gfid=0xee71d19ac0f84f60b11eb42a083644e4
> trusted.glusterfs.dht=0x0001
>
> [root@sn-1:/mnt/bricks/services/brick/.glusterfs/f7/f1]
> *#*
> 

Re: [Gluster-users] after hard reboot, split-brain happened, but nothing showed in gluster voluem heal info command !

2017-09-28 Thread Karthik Subrahmanya
On Thu, Sep 28, 2017 at 12:11 PM, Zhou, Cynthia (NSB - CN/Hangzhou) <
cynthia.z...@nokia-sbell.com> wrote:

>
>
> The version I am using is glusterfs 3.6.9
>
This is a very old version which is EOL. If you can upgrade to any of the
supported versions (3.10 or 3.12), that would be great.
They have many new features, bug fixes & performance improvements. If you
can try to reproduce the issue on one of those, it would be
very helpful.

Regards,
Karthik

> Best regards,
> *Cynthia **(周琳)*
>
> MBB SM HETRAN SW3 MATRIX
>
> Storage
> Mobile: +86 (0)18657188311
>
>
>
> *From:* Karthik Subrahmanya [mailto:ksubr...@redhat.com]
> *Sent:* Thursday, September 28, 2017 2:37 PM
>
> *To:* Zhou, Cynthia (NSB - CN/Hangzhou) <cynthia.z...@nokia-sbell.com>
> *Cc:* Gluster-users@gluster.org; gluster-de...@gluster.org
> *Subject:* Re: [Gluster-users] after hard reboot, split-brain happened,
> but nothing showed in gluster voluem heal info command !
>
>
>
>
>
>
>
> On Thu, Sep 28, 2017 at 11:41 AM, Zhou, Cynthia (NSB - CN/Hangzhou) <
> cynthia.z...@nokia-sbell.com> wrote:
>
> Hi,
>
> Thanks for reply!
>
> I’ve checked [1]. But the problem is that there is nothing shown in the
> command “gluster volume heal <volname> info”. So these split-brain entries
> could only be detected when an app tries to visit them.
>
> I can find the gfid mismatch for those in-split-brain entries from the
> mount log; however, nothing shows in the shd log, and the shd log does not
> know about those split-brain entries, because there is nothing in the
> indices/xattrop directory.
>
> I guess it was there before, and then it got cleared by one of the heal
> processes, either client side or server side. I wanted to check that by
> examining the logs.
>
> Which version of gluster are you running, by the way?
>
>
>
> The log is not available right now; when it is reproduced, I will provide
> it to you. Thanks!
>
> Ok.
>
>
>
> Best regards,
> *Cynthia **(周琳)*
>
> MBB SM HETRAN SW3 MATRIX
>
> Storage
> Mobile: +86 (0)18657188311
>
>
>
> *From:* Karthik Subrahmanya [mailto:ksubr...@redhat.com]
> *Sent:* Thursday, September 28, 2017 2:02 PM
> *To:* Zhou, Cynthia (NSB - CN/Hangzhou) <cynthia.z...@nokia-sbell.com>
> *Cc:* Gluster-users@gluster.org; gluster-de...@gluster.org
> *Subject:* Re: [Gluster-users] after hard reboot, split-brain happened,
> but nothing showed in gluster voluem heal info command !
>
>
>
> Hi,
>
> To resolve the gfid split-brain you can follow the steps at [1].
>
> Since we don't have the pending markers set on the files, it is not
> showing in the heal info.
> To debug this issue, need some more data from you. Could you provide these
> things?
>
> 1. volume info
>
> 2. mount log
>
> 3. brick logs
>
> 4. shd log
>
>
>
> May I also know which version of gluster you are running? From the info
> you have provided it looks like an old version.
>
> If it is, then it would be great if you can upgrade to one of the latest
> supported releases.
>
>
> [1] http://docs.gluster.org/en/latest/Troubleshooting/split-
> brain/#fixing-directory-entry-split-brain
>
>
>
> Thanks & Regards,
>
> Karthik
>
> On Wed, Sep 27, 2017 at 9:42 AM, Zhou, Cynthia (NSB - CN/Hangzhou) <
> cynthia.z...@nokia-sbell.com> wrote:
>
>
>
> HI gluster experts,
>
>
>
> I meet a tough problem about the “split-brain” issue. Sometimes, after hard
> reboot, we will find some files in split-brain, however neither the file
> nor its parent directory nor anything else is shown in the command “gluster
> volume heal <volname> info”, and there is also no entry in the
> .glusterfs/indices/xattrop directory. Can you help to shed some light on
> this issue? Thanks!
>
>
>
>
>
>
>
> Following is some info from our env,
>
>
>
> *Checking from sn-0 client, nothing is shown in split-brain!*
>
>
>
> [root@sn-0:/mnt/bricks/services/brick/netserv/ethip]
>
> # gluster v heal services info
>
> Brick sn-0:/mnt/bricks/services/brick/
>
> Number of entries: 0
>
>
>
> Brick sn-1:/mnt/bricks/services/brick/
>
> Number of entries: 0
>
>
>
> [root@sn-0:/mnt/bricks/services/brick/netserv/ethip]
>
> [root@sn-0:/mnt/bricks/services/brick/netserv/ethip]
>
> # gluster v heal services info split-brain
>
> Gathering list of split brain entries on volume services has been
> successful
>
>
>
> Brick sn-0.local:/mnt/bricks/services/brick
>
> Number of entries: 0
>
>
>
> Brick sn-1.local:/mnt/bricks/services/brick
>
> Number of entries: 0
>
>
>
> [root@sn-0:/mnt/bricks/services/brick/netserv/ethip]
>
> # ls -l /mnt/services/netserv/ethip/
>
> ls: cannot acces

Re: [Gluster-users] Peer isolation while healing

2017-10-09 Thread Karthik Subrahmanya
Hi,

There is no way to isolate the healing peer. Healing happens from the good
brick to the bad brick.
I guess your replica bricks are on different peers. If you try to isolate
the healing peer, it will stop the healing process itself.

What is the error you are getting while writing? It would be helpful to
debug the issue if you can provide us the output of the following commands:
gluster volume info <volname>
gluster volume heal <volname> info
And also provide the client & heal logs.

Thanks & Regards,
Karthik

On Mon, Oct 9, 2017 at 3:02 PM, ML  wrote:

> Hi everyone,
>
> I've been using gluster for a few month now, on a simple 2 peers
> replicated infrastructure, 22Tb each.
>
> One of the peers was offline for 10 hours last week (raid resync
> after a disk crash), and while my gluster server was healing bricks, I
> could see some write errors on my gluster clients.
>
> I couldn't find a way to isolate my healing peer, in the documentation or
> anywhere else.
>
> Is there a way to avoid that ? Detach the peer while healing ? Some
> tunning on the client side maybe ?
>
> I'm using gluster 3.9 on debian 8.
>
> Thank you for your help.
>
> Quentin
>
> ___
> Gluster-users mailing list
> Gluster-users@gluster.org
> http://lists.gluster.org/mailman/listinfo/gluster-users
>
___
Gluster-users mailing list
Gluster-users@gluster.org
http://lists.gluster.org/mailman/listinfo/gluster-users

Re: [Gluster-users] gfid entries in volume heal info that do not heal

2017-10-18 Thread Karthik Subrahmanya
> Brick tpc-arbiter1-100617:/exp/b1/gv0
> Status: Connected
> Number of entries in split-brain: 0
>
> Brick tpc-cent-glus1-081017:/exp/b2/gv0
> Status: Connected
> Number of entries in split-brain: 0
>
> Brick tpc-cent-glus2-081017:/exp/b2/gv0
> Status: Connected
> Number of entries in split-brain: 0
>
> Brick tpc-arbiter1-100617:/exp/b2/gv0
> Status: Connected
> Number of entries in split-brain: 0
>
> Brick tpc-cent-glus1-081017:/exp/b3/gv0
> Status: Connected
> Number of entries in split-brain: 0
>
> Brick tpc-cent-glus2-081017:/exp/b3/gv0
> Status: Connected
> Number of entries in split-brain: 0
>
> Brick tpc-arbiter1-100617:/exp/b3/gv0
> Status: Connected
> Number of entries in split-brain: 0
>
> Brick tpc-cent-glus1-081017:/exp/b4/gv0
> Status: Connected
> Number of entries in split-brain: 0
>
> Brick tpc-cent-glus2-081017:/exp/b4/gv0
> Status: Connected
> Number of entries in split-brain: 0
>
> Brick tpc-arbiter1-100617:/exp/b4/gv0
> Status: Connected
> Number of entries in split-brain: 0
>
> -Matt
>
> From: Karthik Subrahmanya [mailto:ksubr...@redhat.com]
> Sent: Tuesday, October 17, 2017 1:26 AM
> To: Matt Waymack <mwaym...@nsgdv.com>
> Cc: gluster-users <Gluster-users@gluster.org>
> Subject: Re: [Gluster-users] gfid entries in volume heal info that do not
> heal
>
> Hi Matt,
>
> Run these commands on all the bricks of the replica pair to get the attrs
> set on the backend.
>
> On the bricks of the first replica set:
> getfattr -d -e hex -m . <brick-path>/.glusterfs/10/86/108694db-c039-4b7c-bd3d-ad6a15d811a2
> On the fourth replica set:
> getfattr -d -e hex -m . <brick-path>/.glusterfs/e0/c5/e0c56bf7-8bfe-46ca-bde1-e46b92d33df3
> Also run "gluster volume heal <volname>" once and send the shd log.
> And the output of "gluster volume heal <volname> info split-brain"
> Regards,
> Karthik
>
> On Mon, Oct 16, 2017 at 9:51 PM, Matt Waymack <mailto:mwaym...@nsgdv.com>
> wrote:
> OK, so here’s my output of the volume info and the heal info. I have not
> yet tracked down physical location of these files, any tips to finding them
> would be appreciated, but I’m definitely just wanting them gone.  I forgot
> to mention earlier that the cluster is running 3.12 and was upgraded from
> 3.10; these files were likely stuck like this when it was on 3.10.
>
> [root@tpc-cent-glus1-081017 ~]# gluster volume info gv0
>
> Volume Name: gv0
> Type: Distributed-Replicate
> Volume ID: 8f07894d-e3ab-4a65-bda1-9d9dd46db007
> Status: Started
> Snapshot Count: 0
> Number of Bricks: 4 x (2 + 1) = 12
> Transport-type: tcp
> Bricks:
> Brick1: tpc-cent-glus1-081017:/exp/b1/gv0
> Brick2: tpc-cent-glus2-081017:/exp/b1/gv0
> Brick3: tpc-arbiter1-100617:/exp/b1/gv0 (arbiter)
> Brick4: tpc-cent-glus1-081017:/exp/b2/gv0
> Brick5: tpc-cent-glus2-081017:/exp/b2/gv0
> Brick6: tpc-arbiter1-100617:/exp/b2/gv0 (arbiter)
> Brick7: tpc-cent-glus1-081017:/exp/b3/gv0
> Brick8: tpc-cent-glus2-081017:/exp/b3/gv0
> Brick9: tpc-arbiter1-100617:/exp/b3/gv0 (arbiter)
> Brick10: tpc-cent-glus1-081017:/exp/b4/gv0
> Brick11: tpc-cent-glus2-081017:/exp/b4/gv0
> Brick12: tpc-arbiter1-100617:/exp/b4/gv0 (arbiter)
> Options Reconfigured:
> nfs.disable: on
> transport.address-family: inet
>
> [root@tpc-cent-glus1-081017 ~]# gluster volume heal gv0 info
> Brick tpc-cent-glus1-081017:/exp/b1/gv0
> 
> 
> 
> 
> 
>
> 
>
> Status: Connected
> Number of entries: 118
>
> Brick tpc-cent-glus2-081017:/exp/b1/gv0
> 
> 
> 
> 
> 
>
> 
>
> Status: Connected
> Number of entries: 118
>
> Brick tpc-arbiter1-100617:/exp/b1/gv0
> Status: Connected
> Number of entries: 0
>
> Brick tpc-cent-glus1-081017:/exp/b2/gv0
> Status: Connected
> Number of entries: 0
>
> Brick tpc-cent-glus2-081017:/exp/b2/gv0
> Status: Connected
> Number of entries: 0
>
> Brick tpc-arbiter1-100617:/exp/b2/gv0
> Status: Connected
> Number of entries: 0
>
> Brick tpc-cent-glus1-081017:/exp/b3/gv0
> Status: Connected
> Number of entries: 0
>
> Brick tpc-cent-glus2-081017:/exp/b3/gv0
> Status: Connected
> Number of entries: 0
>
> Brick tpc-arbiter1-100617:/exp/b3/gv0
> Status: Connected
> Number of entries: 0
>
> Brick tpc-cent-glus1-081017:/exp/b4/gv0
> 
> 
> 
> 
> 
> 
> 
> 
> 
> 
> 
> 
> 
> 
> 
> 
> 
> 
> 
> 
> 
> 
> 
> 
> Status: Connected
> Number of entries: 24
>
> Brick tpc-cent-glus2-081017:/exp/b4/gv0
> 
> 
> 
> 
> 
> 
> 
> 
> 
> 
> 
> 
> 
> 
> 
> 
> 
> 
> 
> 
> 
> 
> 
> 
> Status: Connected
> Nu

Re: [Gluster-users] gfid entries in volume heal info that do not heal

2017-10-24 Thread Karthik Subrahmanya
Hi Jim,

Can you check whether the same hardlinks are present on both the bricks &
both of them have the link count 2?
If the link count is 2 then "find <brick-path> -samefile
<brick-path>/.glusterfs/<first two chars of gfid>/<next two chars of gfid>/<full gfid>"
should give you the file path.
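
For instance, a minimal sketch with made-up values (a brick at /bricks/brick1
and one gfid borrowed from earlier mails in this thread; substitute your own):

stat /bricks/brick1/.glusterfs/10/86/108694db-c039-4b7c-bd3d-ad6a15d811a2
find /bricks/brick1 -samefile /bricks/brick1/.glusterfs/10/86/108694db-c039-4b7c-bd3d-ad6a15d811a2

The stat output shows the link count; if it is 2, the find should print both
the .glusterfs entry and the real file path under the brick.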

Regards,
Karthik

On Tue, Oct 24, 2017 at 3:28 AM, Jim Kinney <jim.kin...@gmail.com> wrote:

> I'm not so lucky. ALL of mine show 2 links and none have the attr data
> that supplies the path to the original.
>
> I have the inode from stat. Looking now to dig out the path/filename from
> xfs_db on the specific inodes individually.
>
> Is the hash of the filename or /filename and if so relative to
> where? /, , ?
>
> On Mon, 2017-10-23 at 18:54 +, Matt Waymack wrote:
>
> In my case I was able to delete the hard links in the .glusterfs folders
> of the bricks and it seems to have done the trick, thanks!
>
>
>
> *From:* Karthik Subrahmanya [mailto:ksubr...@redhat.com]
> *Sent:* Monday, October 23, 2017 1:52 AM
> *To:* Jim Kinney <jim.kin...@gmail.com>; Matt Waymack <mwaym...@nsgdv.com>
> *Cc:* gluster-users <Gluster-users@gluster.org>
> *Subject:* Re: [Gluster-users] gfid entries in volume heal info that do
> not heal
>
>
>
> Hi Jim & Matt,
>
> Can you also check for the link count in the stat output of those hardlink
> entries in the .glusterfs folder on the bricks.
> If the link count is 1 on all the bricks for those entries, then they are
> orphaned entries and you can delete those hardlinks.
>
> To be on the safer side have a backup before deleting any of the entries.
>
> Regards,
>
> Karthik
>
>
>
> On Fri, Oct 20, 2017 at 3:18 AM, Jim Kinney <jim.kin...@gmail.com> wrote:
>
> I've been following this particular thread as I have a similar issue
> (RAID6 array failed out with 3 dead drives at once while a 12 TB load was
> being copied into one mounted space - what a mess)
>
>
>
> I have >700K GFID entries that have no path data:
>
> Example:
>
> getfattr -d -e hex -m . .glusterfs/00/00/a5ef-
> 5af7-401b-84b5-ff2a51c10421
>
> # file: .glusterfs/00/00/a5ef-5af7-401b-84b5-ff2a51c10421
>
> security.selinux=0x73797374656d5f753a6f626a6563
> 745f723a756e6c6162656c65645f743a733000
>
> trusted.bit-rot.version=0x020059b1b316000270e7
>
> trusted.gfid=0xa5ef5af7401b84b5ff2a51c10421
>
>
>
> [root@bmidata1 brick]# getfattr -d -n trusted.glusterfs.pathinfo -e hex
> -m . .glusterfs/00/00/a5ef-5af7-401b-84b5-ff2a51c10421
>
> .glusterfs/00/00/a5ef-5af7-401b-84b5-ff2a51c10421:
> trusted.glusterfs.pathinfo: No such attribute
>
>
>
> I had to totally rebuild the dead RAID array and did a copy from the live
> one before activating gluster on the rebuilt system. I accidentally copied
> over the .glusterfs folder from the working side
>
> (replica 2 only for now - adding arbiter node as soon as I can get this
> one cleaned up).
>
>
>
> I've run the methods from "http://docs.gluster.org/en/
> latest/Troubleshooting/gfid-to-path/" with no results using random GFIDs.
> A full systemic run using the script from method 3 crashes with "too many
> nested links" error (or something similar).
>
>
>
> When I run gluster volume heal volname info, I get 700K+ GFIDs. Oh.
> gluster 3.8.4 on Centos 7.3
>
>
>
> Should I just remove the contents of the .glusterfs folder on both and
> restart gluster and run a ls/stat on every file?
>
>
>
>
>
> When I run a heal, it no longer has a decreasing number of files to heal
> so that's an improvement over the last 2-3 weeks :-)
>
>
>
> On Tue, 2017-10-17 at 14:34 +, Matt Waymack wrote:
>
> Attached is the heal log for the volume as well as the shd log.
>
>
>
>
>
>
>
> Run these commands on all the bricks of the replica pair to get the attrs set 
> on the backend.
>
>
>
>
>
>
>
> [root@tpc-cent-glus1-081017 ~]# getfattr -d -e hex -m . 
> /exp/b1/gv0/.glusterfs/10/86/108694db-c039-4b7c-bd3d-ad6a15d811a2
>
> getfattr: Removing leading '/' from absolute path names
>
> # file: exp/b1/gv0/.glusterfs/10/86/108694db-c039-4b7c-bd3d-ad6a15d811a2
>
> security.selinux=0x73797374656d5f753a6f626a6563745f723a756e6c6162656c65645f743a733000
>
> trusted.afr.dirty=0x
>
> trusted.afr.gv0-client-2=0x0001
>
> trusted.gfid=0x108694dbc0394b7cbd3dad6a15d811a2
>
> trusted.gfid2path.9a2f5ada22eb9c45=0x38633262623330322d323466332d346463622d393630322d3839356136396461363131662f435f564f4c2d623030312d693637342d63642d63772e6d6435
>
>
>
> [root@tpc-cent-glus2-081017 ~]# getfat

Re: [Gluster-users] How to make sure self-heal backlog is empty ?

2017-12-19 Thread Karthik Subrahmanya
Hi,

Can you provide the
- volume info
- shd log
- mount log
of the volumes which are showing pending entries, to debug the issue.

Thanks & Regards,
Karthik

On Wed, Dec 20, 2017 at 3:11 AM, Matt Waymack  wrote:

> Mine also has a list of files that seemingly never heal.  They are usually
> isolated on my arbiter bricks, but not always.  I would also like to find
> an answer for this behavior.
>
> -Original Message-
> From: gluster-users-boun...@gluster.org [mailto:gluster-users-bounces@
> gluster.org] On Behalf Of Hoggins!
> Sent: Tuesday, December 19, 2017 12:26 PM
> To: gluster-users 
> Subject: [Gluster-users] How to make sure self-heal backlog is empty ?
>
> Hello list,
>
> I'm not sure what to look for here, not sure if what I'm seeing is the
> actual "backlog" (that we need to make sure is empty while performing a
> rolling upgrade before going to the next node), how can I tell, while
> reading this, if it's okay to reboot / upgrade my next node in the pool ?
> Here is what I do for checking :
>
> for i in `gluster volume list`; do gluster volume heal $i info; done
>
> And here is what I get :
>
> Brick ngluster-1.network.hoggins.fr:/export/brick/clem
> Status: Connected
> Number of entries: 0
>
> Brick ngluster-2.network.hoggins.fr:/export/brick/clem
> Status: Connected
> Number of entries: 0
>
> Brick ngluster-3.network.hoggins.fr:/export/brick/clem
> Status: Connected
> Number of entries: 0
>
> Brick ngluster-1.network.hoggins.fr:/export/brick/mailer
> Status: Connected
> Number of entries: 0
>
> Brick ngluster-2.network.hoggins.fr:/export/brick/mailer
> Status: Connected
> Number of entries: 0
>
> Brick ngluster-3.network.hoggins.fr:/export/brick/mailer
> 
> Status: Connected
> Number of entries: 1
>
> Brick ngluster-1.network.hoggins.fr:/export/brick/rom
> Status: Connected
> Number of entries: 0
>
> Brick ngluster-2.network.hoggins.fr:/export/brick/rom
> Status: Connected
> Number of entries: 0
>
> Brick ngluster-3.network.hoggins.fr:/export/brick/rom
> 
> Status: Connected
> Number of entries: 1
>
> Brick ngluster-1.network.hoggins.fr:/export/brick/thedude
> Status: Connected
> Number of entries: 0
>
> Brick ngluster-2.network.hoggins.fr:/export/brick/thedude
> 
> Status: Connected
> Number of entries: 1
>
> Brick ngluster-3.network.hoggins.fr:/export/brick/thedude
> Status: Connected
> Number of entries: 0
>
> Brick ngluster-1.network.hoggins.fr:/export/brick/web
> Status: Connected
> Number of entries: 0
>
> Brick ngluster-2.network.hoggins.fr:/export/brick/web
> 
> 
> 
> Status: Connected
> Number of entries: 3
>
> Brick ngluster-3.network.hoggins.fr:/export/brick/web
> 
> 
> 
> 
> 
> 
> 
> 
> 
> 
> 
> Status: Connected
> Number of entries: 11
>
>
> Should I be worrying with this never ending ?
>
> Thank you,
>
> Hoggins!
>
> ___
> Gluster-users mailing list
> Gluster-users@gluster.org
> http://lists.gluster.org/mailman/listinfo/gluster-users
___
Gluster-users mailing list
Gluster-users@gluster.org
http://lists.gluster.org/mailman/listinfo/gluster-users

Re: [Gluster-users] not healing one file

2017-10-25 Thread Karthik Subrahmanya
Hey Richard,

Could you share the following information please?
1. gluster volume info <volname>
2. getfattr output of that file from all the bricks
getfattr -d -e hex -m . <file-path-on-brick>
3. glustershd & glfsheal logs

Regards,
Karthik

On Thu, Oct 26, 2017 at 10:21 AM, Amar Tumballi  wrote:

> On a side note, try recently released health report tool, and see if it
> does diagnose any issues in setup. Currently you may have to run it in all
> the three machines.
>
>
>
> On 26-Oct-2017 6:50 AM, "Amar Tumballi"  wrote:
>
>> Thanks for this report. This week many of the developers are at Gluster
>> Summit in Prague, will be checking this and respond next week. Hope that's
>> fine.
>>
>> Thanks,
>> Amar
>>
>>
>> On 25-Oct-2017 3:07 PM, "Richard Neuboeck"  wrote:
>>
>>> Hi Gluster Gurus,
>>>
>>> I'm using a gluster volume as home for our users. The volume is
>>> replica 3, running on CentOS 7, gluster version 3.10
>>> (3.10.6-1.el7.x86_64). Clients are running Fedora 26 and also
>>> gluster 3.10 (3.10.6-3.fc26.x86_64).
>>>
>>> During the data backup I got an I/O error on one file. Manually
>>> checking for this file on a client confirms this:
>>>
>>> ls -l
>>> romanoch/.mozilla/firefox/vzzqqxrm.default-1396429081309/ses
>>> sionstore-backups/
>>> ls: cannot access
>>> 'romanoch/.mozilla/firefox/vzzqqxrm.default-1396429081309/se
>>> ssionstore-backups/recovery.baklz4':
>>> Input/output error
>>> total 2015
>>> -rw---. 1 romanoch tbi 998211 Sep 15 18:44 previous.js
>>> -rw---. 1 romanoch tbi  65222 Oct 17 17:57 previous.jsonlz4
>>> -rw---. 1 romanoch tbi 149161 Oct  1 13:46 recovery.bak
>>> -?? ? ???? recovery.baklz4
>>>
>>> Out of curiosity I checked all the bricks for this file. It's
>>> present there. Making a checksum shows that the file is different on
>>> one of the three replica servers.
>>>
>>> Querying healing information shows that the file should be healed:
>>> # gluster volume heal home info
>>> Brick sphere-six:/srv/gluster_home/brick
>>> /romanoch/.mozilla/firefox/vzzqqxrm.default-1396429081309/se
>>> ssionstore-backups/recovery.baklz4
>>>
>>> Status: Connected
>>> Number of entries: 1
>>>
>>> Brick sphere-five:/srv/gluster_home/brick
>>> /romanoch/.mozilla/firefox/vzzqqxrm.default-1396429081309/se
>>> ssionstore-backups/recovery.baklz4
>>>
>>> Status: Connected
>>> Number of entries: 1
>>>
>>> Brick sphere-four:/srv/gluster_home/brick
>>> Status: Connected
>>> Number of entries: 0
>>>
>>> Manually triggering heal doesn't report an error but also does not
>>> heal the file.
>>> # gluster volume heal home
>>> Launching heal operation to perform index self heal on volume home
>>> has been successful
>>>
>>> Same with a full heal
>>> # gluster volume heal home full
>>> Launching heal operation to perform full self heal on volume home
>>> has been successful
>>>
>>> According to the split brain query that's not the problem:
>>> # gluster volume heal home info split-brain
>>> Brick sphere-six:/srv/gluster_home/brick
>>> Status: Connected
>>> Number of entries in split-brain: 0
>>>
>>> Brick sphere-five:/srv/gluster_home/brick
>>> Status: Connected
>>> Number of entries in split-brain: 0
>>>
>>> Brick sphere-four:/srv/gluster_home/brick
>>> Status: Connected
>>> Number of entries in split-brain: 0
>>>
>>>
>>> I have no idea why this situation arose in the first place and also
>>> no idea as how to solve this problem. I would highly appreciate any
>>> helpful feedback I can get.
>>>
>>> The only mention in the logs matching this file is a rename operation:
>>> /var/log/glusterfs/bricks/srv-gluster_home-brick.log:[2017-10-23
>>> 09:19:11.561661] I [MSGID: 115061]
>>> [server-rpc-fops.c:1022:server_rename_cbk] 0-home-server: 5266153:
>>> RENAME
>>> /romanoch/.mozilla/firefox/vzzqqxrm.default-1396429081309/se
>>> ssionstore-backups/recovery.jsonlz4
>>> (48e9eea6-cda6-4e53-bb4a-72059debf4c2/recovery.jsonlz4) ->
>>> /romanoch/.mozilla/firefox/vzzqqxrm.default-1396429081309/se
>>> ssionstore-backups/recovery.baklz4
>>> (48e9eea6-cda6-4e53-bb4a-72059debf4c2/recovery.baklz4), client:
>>> romulus.tbi.univie.ac.at-11894-2017/10/18-07:06:07:206366-ho
>>> me-client-3-0-0,
>>> error-xlator: home-posix [No data available]
>>>
>>> I enabled directory quotas the same day this problem showed up but
>>> I'm not sure how quotas could have an effect like this (maybe unless
>>> the limit is reached but that's also not the case).
>>>
>>> Thanks again if anyone as an idea.
>>> Cheers
>>> Richard
>>> --
>>> /dev/null
>>>
>>>
>>> ___
>>> Gluster-users mailing list
>>> Gluster-users@gluster.org
>>> http://lists.gluster.org/mailman/listinfo/gluster-users
>>>
>>
> ___
> Gluster-users mailing list
> Gluster-users@gluster.org
> http://lists.gluster.org/mailman/listinfo/gluster-users
>
___
Gluster-users mailing 

Re: [Gluster-users] not healing one file

2017-10-26 Thread Karthik Subrahmanya
Hi Richard,

Thanks for the information. As you said, there is a gfid mismatch for the
file.
On brick-1 & brick-2 the gfids are the same & on brick-3 the gfid is
different.
This is not considered a split-brain because we have two good copies here.
Gluster 3.10 does not have a method to resolve this situation other than
manual intervention [1]. Basically what you need to do is remove the file and
the gfid hardlink from brick-3 (considering the brick-3 entry as bad). Then
when you do a lookup for the file from the mount, it will recreate the entry
on that brick.

From 3.12 we have methods to resolve this situation with the cli option [2]
and with favorite-child-policy [3]. For the time being you can use [1] to
resolve this, and if you can consider upgrading to 3.12, that would give you
options to handle these scenarios.

[1]
http://docs.gluster.org/en/latest/Troubleshooting/split-brain/#fixing-directory-entry-split-brain
[2] https://review.gluster.org/#/c/17485/
[3] https://review.gluster.org/#/c/16878/
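
To make [1] concrete for this file, a sketch of what the manual fix could
look like on sphere-four (brick-3), derived from the getfattr output quoted
below; please double-check the gfid and the paths before removing anything:

On sphere-four, remove the bad copy and its gfid hardlink:
rm /srv/gluster_home/brick/romanoch/.mozilla/firefox/vzzqqxrm.default-1396429081309/sessionstore-backups/recovery.baklz4
rm /srv/gluster_home/brick/.glusterfs/da/1c/da1c94b1-6435-44b1-8d5b-6f4654f60bf5

Then trigger a lookup from a client mount so the entry gets recreated and
healed from the good copies (the mount point is a placeholder):
stat <mount-point>/romanoch/.mozilla/firefox/vzzqqxrm.default-1396429081309/sessionstore-backups/recovery.baklz4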

HTH,
Karthik

On Thu, Oct 26, 2017 at 12:40 PM, Richard Neuboeck <h...@tbi.univie.ac.at>
wrote:

> Hi Karthik,
>
> thanks for taking a look at this. I'm not working with gluster long
> enough to make heads or tails out of the logs. The logs are attached to
> this mail and here is the other information:
>
> # gluster volume info home
>
> Volume Name: home
> Type: Replicate
> Volume ID: fe6218ae-f46b-42b3-a467-5fc6a36ad48a
> Status: Started
> Snapshot Count: 1
> Number of Bricks: 1 x 3 = 3
> Transport-type: tcp
> Bricks:
> Brick1: sphere-six:/srv/gluster_home/brick
> Brick2: sphere-five:/srv/gluster_home/brick
> Brick3: sphere-four:/srv/gluster_home/brick
> Options Reconfigured:
> features.barrier: disable
> cluster.quorum-type: auto
> cluster.server-quorum-type: server
> nfs.disable: on
> performance.readdir-ahead: on
> transport.address-family: inet
> features.cache-invalidation: on
> features.cache-invalidation-timeout: 600
> performance.stat-prefetch: on
> performance.cache-samba-metadata: on
> performance.cache-invalidation: on
> performance.md-cache-timeout: 600
> network.inode-lru-limit: 9
> performance.cache-size: 1GB
> performance.client-io-threads: on
> cluster.lookup-optimize: on
> cluster.readdir-optimize: on
> features.quota: on
> features.inode-quota: on
> features.quota-deem-statfs: on
> cluster.server-quorum-ratio: 51%
>
>
> [root@sphere-four ~]# getfattr -d -e hex -m .
> /srv/gluster_home/brick/romanoch/.mozilla/firefox/vzzqqxrm.default-
> 1396429081309/sessionstore-backups/recovery.baklz4
> getfattr: Removing leading '/' from absolute path names
> # file:
> srv/gluster_home/brick/romanoch/.mozilla/firefox/vzzqqxrm.default-
> 1396429081309/sessionstore-backups/recovery.baklz4
> security.selinux=0x73797374656d5f753a6f626a6563
> 745f723a756e6c6162656c65645f743a733000
> trusted.afr.dirty=0x
> trusted.bit-rot.version=0x020059df20a40006f989
> trusted.gfid=0xda1c94b1643544b18d5b6f4654f60bf5
> trusted.glusterfs.quota.48e9eea6-cda6-4e53-bb4a-72059debf4c2.contri.1=
> 0x9a01
> trusted.pgfid.48e9eea6-cda6-4e53-bb4a-72059debf4c2=0x0001
>
> [root@sphere-five ~]# getfattr -d -e hex -m .
> /srv/gluster_home/brick/romanoch/.mozilla/firefox/vzzqqxrm.default-
> 1396429081309/sessionstore-backups/recovery.baklz4
> getfattr: Removing leading '/' from absolute path names
> # file:
> srv/gluster_home/brick/romanoch/.mozilla/firefox/vzzqqxrm.default-
> 1396429081309/sessionstore-backups/recovery.baklz4
> security.selinux=0x73797374656d5f753a6f626a6563
> 745f723a756e6c6162656c65645f743a733000
> trusted.afr.dirty=0x
> trusted.afr.home-client-4=0x00010001
> trusted.bit-rot.version=0x020059df1f310006ce63
> trusted.gfid=0xea8ecfd195fd4e48b994fd0a2da226f9
> trusted.glusterfs.quota.48e9eea6-cda6-4e53-bb4a-72059debf4c2.contri.1=
> 0x9a01
> trusted.pgfid.48e9eea6-cda6-4e53-bb4a-72059debf4c2=0x0001
>
> [root@sphere-six ~]# getfattr -d -e hex -m .
> /srv/gluster_home/brick/romanoch/.mozilla/firefox/vzzqqxrm.default-
> 1396429081309/sessionstore-backups/recovery.baklz4
> getfattr: Removing leading '/' from absolute path names
> # file:
> srv/gluster_home/brick/romanoch/.mozilla/firefox/vzzqqxrm.default-
> 1396429081309/sessionstore-backups/recovery.baklz4
> security.selinux=0x73797374656d5f753a6f626a6563
> 745f723a756e6c6162656c65645f743a733000
> trusted.afr.dirty=0x
> trusted.afr.home-client-4=0x00010001
> trusted.bit-rot.version=0x020059df11cd000548ec
> trusted.gfid=0xea8ecfd195fd4e48b994fd0a2da226f9
> trusted.glusterfs.quota.48e9eea6-cda6-4

Re: [Gluster-users] gfid entries in volume heal info that do not heal

2017-10-25 Thread Karthik Subrahmanya
Hey Jim,

Please let me know what I understood is correct or not.
You have 14,734 GFIDs in .glusterfs which are only on the brick which was
up during the failure and it does not have any referenced file in that
brick.
The down brick has the file  inside the brick path, but not the GFID
hardlink in the .glusterfs folder.
Did I get it right?

Could you also give me these information as well?
1. What is the link count for those GFIDs in the up brick?
2. If the link count is 2 or more do you have a file path for those GFIDs
in the up brick? (use the find command)
3. Do you have the GFID hardlink or you have the file in the down brick?

Let me explain how the things work.
When a file gets created from the mount, the file will be created inside
the brick & a hardlink to that file will be created inside the .glusterfs
folder with its GFID.
So the link count for the file will be 2 (unless you create more hardlinks
manually). So after the failure happened I guess you should have both the
file & hardlink in the up brick,
and when you do the lookup on the mount for that file, it should create the
file & the hardlink on the brick which was down.

Regards,
Karthik

On Tue, Oct 24, 2017 at 10:29 PM, Jim Kinney <jim.kin...@gmail.com> wrote:

> I have 14,734 GFIDS that are different. All the different ones are only on
> the brick that was live during the outage and concurrent file copy-in. The
> brick that was down at that time has no GFIDs that are not also on the up
> brick.
>
> As the bricks are 10TB, the find is going to be a long running process.
> I'm running several finds at once with gnu parallel but it will still take
> some time. Can't bring the up machine offline as it's in use. At least I
> have 24 cores to work with.
>
> I've only tested with one GFID but the file it referenced _IS_ on the down
> machine even though it has no GFID in the .glusterfs structure.
>
> On Tue, 2017-10-24 at 12:35 +0530, Karthik Subrahmanya wrote:
>
> Hi Jim,
>
> Can you check whether the same hardlinks are present on both the bricks &
> both of them have the link count 2?
> If the link count is 2 then "find <brick path> -samefile
> <brick path>/.glusterfs/<first two chars of gfid>/<next two chars of gfid>/<full gfid>"
> should give you the file path.
>
> Regards,
> Karthik
>
> On Tue, Oct 24, 2017 at 3:28 AM, Jim Kinney <jim.kin...@gmail.com> wrote:
>
> I'm not so lucky. ALL of mine show 2 links and none have the attr data
> that supplies the path to the original.
>
> I have the inode from stat. Looking now to dig out the path/filename from
> xfs_db on the specific inodes individually.
>
> Is the hash of the filename or /filename and if so relative to
> where? /, , ?
>
> On Mon, 2017-10-23 at 18:54 +, Matt Waymack wrote:
>
> In my case I was able to delete the hard links in the .glusterfs folders
> of the bricks and it seems to have done the trick, thanks!
>
>
>
> *From:* Karthik Subrahmanya [mailto:ksubr...@redhat.com]
> *Sent:* Monday, October 23, 2017 1:52 AM
> *To:* Jim Kinney <jim.kin...@gmail.com>; Matt Waymack <mwaym...@nsgdv.com>
> *Cc:* gluster-users <Gluster-users@gluster.org>
> *Subject:* Re: [Gluster-users] gfid entries in volume heal info that do
> not heal
>
>
>
> Hi Jim & Matt,
>
> Can you also check for the link count in the stat output of those hardlink
> entries in the .glusterfs folder on the bricks.
> If the link count is 1 on all the bricks for those entries, then they are
> orphaned entries and you can delete those hardlinks.
>
> To be on the safer side have a backup before deleting any of the entries.
>
> Regards,
>
> Karthik
>
>
>
> On Fri, Oct 20, 2017 at 3:18 AM, Jim Kinney <jim.kin...@gmail.com> wrote:
>
> I've been following this particular thread as I have a similar issue
> (RAID6 array failed out with 3 dead drives at once while a 12 TB load was
> being copied into one mounted space - what a mess)
>
>
>
> I have >700K GFID entries that have no path data:
>
> Example:
>
> getfattr -d -e hex -m . .glusterfs/00/00/a5ef-5af7
> -401b-84b5-ff2a51c10421
>
> # file: .glusterfs/00/00/a5ef-5af7-401b-84b5-ff2a51c10421
>
> security.selinux=0x73797374656d5f753a6f626a6563745f723a756e6
> c6162656c65645f743a733000
>
> trusted.bit-rot.version=0x020059b1b316000270e7
>
> trusted.gfid=0xa5ef5af7401b84b5ff2a51c10421
>
>
>
> [root@bmidata1 brick]# getfattr -d -n trusted.glusterfs.pathinfo -e hex
> -m . .glusterfs/00/00/a5ef-5af7-401b-84b5-ff2a51c10421
>
> .glusterfs/00/00/a5ef-5af7-401b-84b5-ff2a51c10421:
> trusted.glusterfs.pathinfo: No such attribute
>
>
>
> I had to totally rebuild the dead RAID array and did a copy from the live
> one before activating gluster on the rebuilt system. I accid

Re: [Gluster-users] Gluster replicate 3 arbiter 1 in split brain. gluster cli seems unaware

2017-12-20 Thread Karthik Subrahmanya
Hey,

Can you give us the volume info output for this volume?
Why are you not able to get the xattrs from arbiter brick? It is the same
way as you do it on data bricks.
The changelog xattrs are named trusted.afr.virt_images-client-{1,2,3} in
the getxattr outputs you have provided.
Did you do a remove-brick and add-brick any time? Otherwise it will be
trusted.afr.virt_images-client-{0,1,2} usually.

To overcome this scenario you can do what Ben Turner had suggested. Select
the source copy and change the xattrs manually.
I am suspecting that it has hit the arbiter becoming source for data heal
bug. But to confirm that we need the xattrs on the arbiter brick also.
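
For example, on the arbiter node something like the following should work
(the brick path below is the one from the volume info in this thread):

getfattr -d -m . -e hex /data/virt_images/brick/fedora27.qcow2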

Regards,
Karthik


On Thu, Dec 21, 2017 at 9:55 AM, Ben Turner  wrote:

> Here is the process for resolving split brain on replica 2:
>
> https://access.redhat.com/documentation/en-US/Red_Hat_Storage/2.1/html/
> Administration_Guide/Recovering_from_File_Split-brain.html
>
> It should be pretty much the same for replica 3, you change the xattrs
> with something like:
>
> # setfattr -n trusted.afr.vol-client-0 -v 0x0001
> /gfs/brick-b/a
>
> When I try to decide which copy to use I normally run things like:
>
> # stat //pat/to/file
>
> Check out the access and change times of the file on the back end bricks.
> I normally pick the copy with the latest access / change times.  I'll also
> check:
>
> # md5sum //pat/to/file
>
> Compare the hashes of the file on both bricks to see if the data actually
> differs.  If the data is the same it makes choosing the proper replica
> easier.
>
> Any idea how you got in this situation?  Did you have a loss of NW
> connectivity?  I see you are using server side quorum, maybe check the logs
> for any loss of quorum?  I wonder if there was a loos of quorum and there
> was some sort of race condition hit:
>
> http://docs.gluster.org/en/latest/Administrator%20Guide/
> arbiter-volumes-and-quorum/#server-quorum-and-some-pitfalls
>
> "Unlike in client-quorum where the volume becomes read-only when quorum is
> lost, loss of server-quorum in a particular node makes glusterd kill the
> brick processes on that node (for the participating volumes) making even
> reads impossible."
>
> I wonder if the killing of brick processes could have led to some sort of
> race condition where writes were serviced on one brick / the arbiter and
> not the other?
>
> If you can find a reproducer for this please open a BZ with it, I have
> been seeing something similar(I think) but I haven't been able to run the
> issue down yet.
>
> -b
>
> - Original Message -
> > From: "Henrik Juul Pedersen" 
> > To: gluster-users@gluster.org
> > Cc: "Henrik Juul Pedersen" 
> > Sent: Wednesday, December 20, 2017 1:26:37 PM
> > Subject: [Gluster-users] Gluster replicate 3 arbiter 1 in split brain.
>   gluster cli seems unaware
> >
> > Hi,
> >
> > I have the following volume:
> >
> > Volume Name: virt_images
> > Type: Replicate
> > Volume ID: 9f3c8273-4d9d-4af2-a4e7-4cb4a51e3594
> > Status: Started
> > Snapshot Count: 2
> > Number of Bricks: 1 x (2 + 1) = 3
> > Transport-type: tcp
> > Bricks:
> > Brick1: virt3:/data/virt_images/brick
> > Brick2: virt2:/data/virt_images/brick
> > Brick3: printserver:/data/virt_images/brick (arbiter)
> > Options Reconfigured:
> > features.quota-deem-statfs: on
> > features.inode-quota: on
> > features.quota: on
> > features.barrier: disable
> > features.scrub: Active
> > features.bitrot: on
> > nfs.rpc-auth-allow: on
> > server.allow-insecure: on
> > user.cifs: off
> > features.shard: off
> > cluster.shd-wait-qlength: 1
> > cluster.locking-scheme: granular
> > cluster.data-self-heal-algorithm: full
> > cluster.server-quorum-type: server
> > cluster.quorum-type: auto
> > cluster.eager-lock: enable
> > network.remote-dio: enable
> > performance.low-prio-threads: 32
> > performance.io-cache: off
> > performance.read-ahead: off
> > performance.quick-read: off
> > nfs.disable: on
> > transport.address-family: inet
> > server.outstanding-rpc-limit: 512
> >
> > After a server reboot (brick 1) a single file has become unavailable:
> > # touch fedora27.qcow2
> > touch: setting times of 'fedora27.qcow2': Input/output error
> >
> > Looking at the split brain status from the client side cli:
> > # getfattr -n replica.split-brain-status fedora27.qcow2
> > # file: fedora27.qcow2
> > replica.split-brain-status="The file is not under data or metadata
> > split-brain"
> >
> > However, in the client side log, a split brain is mentioned:
> > [2017-12-20 18:05:23.570762] E [MSGID: 108008]
> > [afr-transaction.c:2629:afr_write_txn_refresh_done]
> > 0-virt_images-replicate-0: Failing SETATTR on gfid
> > 7a36937d-52fc-4b55-a932-99e2328f02ba: split-brain observed.
> > [Input/output error]
> > [2017-12-20 18:05:23.576046] W [MSGID: 108027]
> > [afr-common.c:2733:afr_discover_done] 0-virt_images-replicate-0: no
> > read subvols for /fedora27.qcow2
> > [2017-12-20 

Re: [Gluster-users] How to set up a 4 way gluster file system

2018-04-27 Thread Karthik Subrahmanya
Hi,

With replica 2 volumes one can easily end up in split-brains if there are
frequent disconnects and high IOs going on.
If you use replica 3 or arbiter volumes, it will guard you by using the
quorum mechanism giving you both consistency and availability.
But in replica 2 volumes, quorum does not make sense since it needs both
the nodes up to guarantee consistency, which costs availability.

If you can consider having a replica 3 or arbiter volume it would be
great. Otherwise you can go ahead and continue with the replica 2 volume
by selecting *y* at the warning message. It will create the replica 2
configuration as you wanted.
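
For example, with your four hosts a 2 x (2 + 1) layout could be created along
these lines (the arbiter brick paths are placeholders; arbiter bricks store
only metadata, so they can be much smaller than the data bricks):

gluster volume create gv0 replica 3 arbiter 1 \
  glusterp1:/bricks/brick1/gv0 glusterp2:/bricks/brick1/gv0 glusterp3:/bricks/arbiter/gv0 \
  glusterp3:/bricks/brick1/gv0 glusterp4:/bricks/brick1/gv0 glusterp1:/bricks/arbiter/gv0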

HTH,
Karthik

On Fri, Apr 27, 2018 at 10:56 AM, Thing  wrote:

> Hi,
>
> I have 4 servers each with 1TB of storage set as /dev/sdb1, I would like
> to set these up in a raid 10 which will? give me 2TB useable.  So Mirrored
> and concatenated?
>
> The command I am running is as per documents but I get a warning error,
> how do I get this to proceed please as the documents do not say.
>
> gluster volume create gv0 replica 2 glusterp1:/bricks/brick1/gv0
> glusterp2:/bricks/brick1/gv0 glusterp3:/bricks/brick1/gv0
> glusterp4:/bricks/brick1/gv0
> Replica 2 volumes are prone to split-brain. Use Arbiter or Replica 3 to
> avoid this. See: http://docs.gluster.org/en/latest/Administrator%20Guide/
> Split%20brain%20and%20ways%20to%20deal%20with%20it/.
> Do you still want to continue?
>  (y/n) n
>
> Usage:
> volume create  [stripe ] [replica  [arbiter
> ]] [disperse []] [disperse-data ] [redundancy ]
> [transport ] ?... [force]
>
> [root@glustep1 ~]#
>
> thanks
>
> ___
> Gluster-users mailing list
> Gluster-users@gluster.org
> http://lists.gluster.org/mailman/listinfo/gluster-users
>
___
Gluster-users mailing list
Gluster-users@gluster.org
http://lists.gluster.org/mailman/listinfo/gluster-users

Re: [Gluster-users] split brain? but where?

2018-05-22 Thread Karthik Subrahmanya
Hi,

Which version of gluster are you using?

You can find which file that is using the following command:
find <brick path> -samefile <brick path>/.glusterfs/<first two chars of gfid>/<next two chars of gfid>/<full gfid>

Please provide the getfattr output of the file which is in split brain.
The steps to recover from split-brain can be found here,
http://gluster.readthedocs.io/en/latest/Troubleshooting/resolving-splitbrain/
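
For example, with the gfid from your output and the brick mount shown in your
df listing (adjust if the brick directory is a subdirectory of that mount):

find /bricks/brick1 -samefile /bricks/brick1/.glusterfs/ea/fb/eafb8799-4e7a-4264-9213-26997c5a4693
getfattr -d -m . -e hex <path printed by the find command above>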

HTH,
Karthik

On Tue, May 22, 2018 at 4:03 AM, Joe Julian  wrote:

> How do I find what  "eafb8799-4e7a-4264-9213-26997c5a4693"  is?
>
> https://docs.gluster.org/en/v3/Troubleshooting/gfid-to-path/
>
> On May 21, 2018 3:22:01 PM PDT, Thing  wrote:
> >Hi,
> >
> >I seem to have a split brain issue, but I cannot figure out where this
> >is
> >and what it is, can someone help me pls,  I cant find what to fix here.
> >
> >==
> >root@salt-001:~# salt gluster* cmd.run 'df -h'
> >glusterp2.graywitch.co.nz:
> >Filesystem   Size  Used
> >Avail Use% Mounted on
> >/dev/mapper/centos-root   19G  3.4G
> > 16G  19% /
> >devtmpfs 3.8G 0
> >3.8G   0% /dev
> >tmpfs3.8G   12K
> >3.8G   1% /dev/shm
> >tmpfs3.8G  9.1M
> >3.8G   1% /run
> >tmpfs3.8G 0
> >3.8G   0% /sys/fs/cgroup
> >/dev/mapper/centos-tmp   3.8G   33M
> >3.7G   1% /tmp
> >/dev/mapper/centos-var19G  213M
> > 19G   2% /var
> >/dev/mapper/centos-home   47G   38M
> > 47G   1% /home
> >/dev/mapper/centos-data1 112G   33M
> >112G   1% /data1
> >/dev/mapper/centos-var_lib   9.4G  178M
> >9.2G   2% /var/lib
> >/dev/mapper/vg--gluster--prod--1--2-gluster--prod--1--2  932G  264G
> >668G  29% /bricks/brick1
> >/dev/sda1950M  235M
> >715M  25% /boot
> >tmpfs771M   12K
> >771M   1% /run/user/42
> >glusterp2:gv0/glusterp2/images   932G  273G
> >659G  30% /var/lib/libvirt/images
> >glusterp2:gv0932G  273G
> >659G  30% /isos
> >tmpfs771M   48K
> >771M   1% /run/user/1000
> >tmpfs771M 0
> >771M   0% /run/user/0
> >glusterp1.graywitch.co.nz:
> >   Filesystem Size  Used Avail Use%
> >Mounted on
> > /dev/mapper/centos-root 20G  3.5G   17G  18% /
> >   devtmpfs   3.8G 0  3.8G   0%
> >/dev
> >   tmpfs  3.8G   12K  3.8G   1%
> >/dev/shm
> >   tmpfs  3.8G  9.0M  3.8G   1%
> >/run
> >   tmpfs  3.8G 0  3.8G   0%
> >/sys/fs/cgroup
> >   /dev/sda1  969M  206M  713M  23%
> >/boot
> >   /dev/mapper/centos-home 50G  4.3G   46G   9%
> >/home
> >   /dev/mapper/centos-tmp 3.9G   33M  3.9G   1%
> >/tmp
> >   /dev/mapper/centos-data1   120G   36M  120G   1%
> >/data1
> >   /dev/mapper/vg--gluster--prod1-gluster--prod1  932G  260G  673G  28%
> >/bricks/brick1
> >   /dev/mapper/centos-var  20G  413M   20G   3%
> >/var
> >   /dev/mapper/centos00-var_lib   9.4G  179M  9.2G   2%
> >/var/lib
> >   tmpfs  771M  8.0K  771M   1%
> >/run/user/42
> >   glusterp1:gv0  932G  273G  659G  30%
> >/isos
> >   glusterp1:gv0/glusterp1/images 932G  273G  659G  30%
> >/var/lib/libvirt/images
> >glusterp3.graywitch.co.nz:
> >Filesystem   Size  Used
> >Avail Use% Mounted on
> >/dev/mapper/centos-root   20G  3.5G
> > 17G  18% /
> >devtmpfs 3.8G 0
> >3.8G   0% /dev
> >tmpfs3.8G   12K
> >3.8G   1% /dev/shm
> >tmpfs3.8G  9.0M
> >3.8G   1% /run
> >tmpfs3.8G 0
> >3.8G   0% /sys/fs/cgroup
> >/dev/sda1969M  206M
> >713M  23% /boot
> >/dev/mapper/centos-var20G  206M
> > 20G   2% /var
> >/dev/mapper/centos-tmp   3.9G   33M
> >3.9G   1% /tmp
> >/dev/mapper/centos-home 

Re: [Gluster-users] gfid entries in volume heal info that do not heal

2017-10-23 Thread Karthik Subrahmanya
722e6d6435
>
> [root@tpc-cent-glus2-081017 ~]# getfattr -d -e hex -m . 
> /exp/b4/gv0/.glusterfs/e0/c5/e0c56bf7-8bfe-46ca-bde1-e46b92d33df3
> getfattr: Removing leading '/' from absolute path names
> # file: exp/b4/gv0/.glusterfs/e0/c5/e0c56bf7-8bfe-46ca-bde1-e46b92d33df3
> security.selinux=0x73797374656d5f753a6f626a6563745f723a756e6c6162656c65645f743a733000
> trusted.afr.dirty=0x
> trusted.afr.gv0-client-11=0x0001
> trusted.gfid=0xe0c56bf78bfe46cabde1e46b92d33df3
> trusted.gfid2path.be3ba24c3ef95ff2=0x63323366353834652d353566652d343033382d393131622d3866373063656334616136662f435f564f4c2d623030332d69313331342d63642d636d2d63722e6d6435
>
> [root@tpc-arbiter1-100617 ~]# getfattr -d -e hex -m . 
> /exp/b4/gv0/.glusterfs/e0/c5/e0c56bf7-8bfe-46ca-bde1-e46b92d33df3
> getfattr: /exp/b4/gv0/.glusterfs/e0/c5/e0c56bf7-8bfe-46ca-bde1-e46b92d33df3: 
> No such file or directory
>
>
>
> And the output of "gluster volume heal  info split-brain"
>
>
>
> [root@tpc-cent-glus1-081017 ~]# gluster volume heal gv0 info split-brain
> Brick tpc-cent-glus1-081017:/exp/b1/gv0
> Status: Connected
> Number of entries in split-brain: 0
>
> Brick tpc-cent-glus2-081017:/exp/b1/gv0
> Status: Connected
> Number of entries in split-brain: 0
>
> Brick tpc-arbiter1-100617:/exp/b1/gv0
> Status: Connected
> Number of entries in split-brain: 0
>
> Brick tpc-cent-glus1-081017:/exp/b2/gv0
> Status: Connected
> Number of entries in split-brain: 0
>
> Brick tpc-cent-glus2-081017:/exp/b2/gv0
> Status: Connected
> Number of entries in split-brain: 0
>
> Brick tpc-arbiter1-100617:/exp/b2/gv0
> Status: Connected
> Number of entries in split-brain: 0
>
> Brick tpc-cent-glus1-081017:/exp/b3/gv0
> Status: Connected
> Number of entries in split-brain: 0
>
> Brick tpc-cent-glus2-081017:/exp/b3/gv0
> Status: Connected
> Number of entries in split-brain: 0
>
> Brick tpc-arbiter1-100617:/exp/b3/gv0
> Status: Connected
> Number of entries in split-brain: 0
>
> Brick tpc-cent-glus1-081017:/exp/b4/gv0
> Status: Connected
> Number of entries in split-brain: 0
>
> Brick tpc-cent-glus2-081017:/exp/b4/gv0
> Status: Connected
> Number of entries in split-brain: 0
>
> Brick tpc-arbiter1-100617:/exp/b4/gv0
> Status: Connected
> Number of entries in split-brain: 0
>
> -Matt
>
> From: Karthik Subrahmanya [mailto:ksubr...@redhat.com]
> Sent: Tuesday, October 17, 2017 1:26 AM
> To: Matt Waymack <mwaym...@nsgdv.com>
> Cc: gluster-users <Gluster-users@gluster.org>
> Subject: Re: [Gluster-users] gfid entries in volume heal info that do not heal
>
> Hi Matt,
>
> Run these commands on all the bricks of the replica pair to get the attrs set 
> on the backend.
>
> On the bricks of first replica set:
> getfattr -d -e hex -m . <brick path>/.glusterfs/10/86/108694db-c039-4b7c-bd3d-ad6a15d811a2
> On the fourth replica set:
> getfattr -d -e hex -m . <brick path>/.glusterfs/e0/c5/e0c56bf7-8bfe-46ca-bde1-e46b92d33df3
> Also run the "gluster volume heal <volname>" once and send the shd log.
> And the output of "gluster volume heal  info split-brain"
> Regards,
> Karthik
>
> On Mon, Oct 16, 2017 at 9:51 PM, Matt Waymack <mwaym...@nsgdv.com> wrote:
> OK, so here’s my output of the volume info and the heal info. I have not yet 
> tracked down physical location of these files, any tips to finding them would 
> be appreciated, but I’m definitely just wanting them gone.  I forgot to 
> mention earlier that the cluster is running 3.12 and was upgraded from 3.10; 
> these files were likely stuck like this when it was on 3.10.
>
> [root@tpc-cent-glus1-081017 ~]# gluster volume info gv0
>
> Volume Name: gv0
> Type: Distributed-Replicate
> Volume ID: 8f07894d-e3ab-4a65-bda1-9d9dd46db007
> Status: Started
> Snapshot Count: 0
> Number of Bricks: 4 x (2 + 1) = 12
> Transport-type: tcp
> Bricks:
> Brick1: tpc-cent-glus1-081017:/exp/b1/gv0
> Brick2: tpc-cent-glus2-081017:/exp/b1/gv0
> Brick3: tpc-arbiter1-100617:/exp/b1/gv0 (arbiter)
> Brick4: tpc-cent-glus1-081017:/exp/b2/gv0
> Brick5: tpc-cent-glus2-081017:/exp/b2/gv0
> Brick6: tpc-arbiter1-100617:/exp/b2/gv0 (arbiter)
> Brick7: tpc-cent-glus1-081017:/exp/b3/gv0
> Brick8: tpc-cent-glus2-081017:/exp/b3/gv0
> Brick9: tpc-arbiter1-100617:/exp/b3/gv0 (arbiter)
> Brick10: tpc-cent-glus1-081017:/exp/b4/gv0
> Brick11: tpc-cent-glus2-081017:/exp/b4/gv0
> Brick12: tpc-arbiter1-100617:/exp/b4/gv0 (arbiter)
> Options Reconfigured:
> nfs.disable: on
> transport.address-family: inet
>
> [root@tpc-cent-glus1-081017 ~]# gluster volume heal gv0 info
> Brick tpc-cent-glus1-081017:/exp/

Re: [Gluster-users] Gluster replicate 3 arbiter 1 in split brain. gluster cli seems unaware

2017-12-21 Thread Karthik Subrahmanya
Hi Henrik,

Thanks for providing the required outputs. See my replies inline.

On Thu, Dec 21, 2017 at 10:42 PM, Henrik Juul Pedersen <h...@liab.dk> wrote:

> Hi Karthik and Ben,
>
> I'll try and reply to you inline.
>
> On 21 December 2017 at 07:18, Karthik Subrahmanya <ksubr...@redhat.com>
> wrote:
> > Hey,
> >
> > Can you give us the volume info output for this volume?
>
> # gluster volume info virt_images
>
> Volume Name: virt_images
> Type: Replicate
> Volume ID: 9f3c8273-4d9d-4af2-a4e7-4cb4a51e3594
> Status: Started
> Snapshot Count: 2
> Number of Bricks: 1 x (2 + 1) = 3
> Transport-type: tcp
> Bricks:
> Brick1: virt3:/data/virt_images/brick
> Brick2: virt2:/data/virt_images/brick
> Brick3: printserver:/data/virt_images/brick (arbiter)
> Options Reconfigured:
> features.quota-deem-statfs: on
> features.inode-quota: on
> features.quota: on
> features.barrier: disable
> features.scrub: Active
> features.bitrot: on
> nfs.rpc-auth-allow: on
> server.allow-insecure: on
> user.cifs: off
> features.shard: off
> cluster.shd-wait-qlength: 1
> cluster.locking-scheme: granular
> cluster.data-self-heal-algorithm: full
> cluster.server-quorum-type: server
> cluster.quorum-type: auto
> cluster.eager-lock: enable
> network.remote-dio: enable
> performance.low-prio-threads: 32
> performance.io-cache: off
> performance.read-ahead: off
> performance.quick-read: off
> nfs.disable: on
> transport.address-family: inet
> server.outstanding-rpc-limit: 512
>
> > Why are you not able to get the xattrs from arbiter brick? It is the same
> > way as you do it on data bricks.
>
> Yes I must have confused myself yesterday somehow, here it is in full
> from all three bricks:
>
> Brick 1 (virt2): # getfattr -d -m . -e hex fedora27.qcow2
> # file: fedora27.qcow2
> trusted.afr.dirty=0x
> trusted.afr.virt_images-client-1=0x0228
> trusted.afr.virt_images-client-3=0x
> trusted.bit-rot.version=0x1d005a3aa0db000c6563
> trusted.gfid=0x7a36937d52fc4b55a93299e2328f02ba
> trusted.gfid2path.c076c6ac27a43012=0x30303030303030302d303030302d
> 303030302d303030302d3030303030303030303030312f6665646f726132372e71636f7732
> trusted.glusterfs.quota.----0001.contri.1=
> 0xa49eb001
> trusted.pgfid.----0001=0x0001
>
> Brick 2 (virt3): # getfattr -d -m . -e hex fedora27.qcow2
> # file: fedora27.qcow2
> trusted.afr.dirty=0x
> trusted.afr.virt_images-client-2=0x03ef
> trusted.afr.virt_images-client-3=0x
> trusted.bit-rot.version=0x19005a3a9f82000c382a
> trusted.gfid=0x7a36937d52fc4b55a93299e2328f02ba
> trusted.gfid2path.c076c6ac27a43012=0x30303030303030302d303030302d
> 303030302d303030302d3030303030303030303030312f6665646f726132372e71636f7732
> trusted.glusterfs.quota.----0001.contri.1=
> 0xa2fbe001
> trusted.pgfid.----0001=0x0001
>
> Brick 3 - arbiter (printserver): # getfattr -d -m . -e hex fedora27.qcow2
> # file: fedora27.qcow2
> trusted.afr.dirty=0x
> trusted.afr.virt_images-client-1=0x0228
> trusted.bit-rot.version=0x31005a39237200073206
> trusted.gfid=0x7a36937d52fc4b55a93299e2328f02ba
> trusted.gfid2path.c076c6ac27a43012=0x30303030303030302d303030302d
> 303030302d303030302d3030303030303030303030312f6665646f726132372e71636f7732
> trusted.glusterfs.quota.----0001.contri.1=
> 0x0001
> trusted.pgfid.----0001=0x0001
>
> I was expecting trusted.afr.virt_images-client-{1,2,3} on all bricks?
>
From AFR-V2 we do not have self-blaming attrs. So you will see a brick
blaming other bricks only.
For example brick1 can blame brick2 & brick3, not itself.

>
> > The changelog xattrs are named trusted.afr.virt_images-client-{1,2,3}
> in the
> > getxattr outputs you have provided.
> > Did you do a remove-brick and add-brick any time? Otherwise it will be
> > trusted.afr.virt_images-client-{0,1,2} usually.
>
> Yes, the bricks was moved around initially; brick 0 was re-created as
> brick 2, and the arbiter was added later on as well.
>
> >
> > To overcome this scenario you can do what Ben Turner had suggested.
> Select
> > the source copy and change the xattrs manually.
>
> I won't mind doing that, but again, the guides assume that I have
> trusted.afr.virt_images-client-{1,2,3} on all bricks, so I'm not sure

Re: [Gluster-users] Clear heal statistics

2018-01-07 Thread Karthik Subrahmanya
Hi,

I am not aware of any command which clears the historic heal statistics.
You can use the command "gluster volume start <volname> force" which will
restart the SHD and clear the statistics.

Regards,
Karthik

On Mon, Jan 8, 2018 at 3:23 AM, Gino Lisignoli  wrote:

> Is there any way to clear the historic statistic from the command "gluster
>  volume heal  statistics" ?
>
> It seems the command takes longer and longer to run each time it is used,
> to the point where it times out and no longer works.
>
> ___
> Gluster-users mailing list
> Gluster-users@gluster.org
> http://lists.gluster.org/mailman/listinfo/gluster-users
>
___
Gluster-users mailing list
Gluster-users@gluster.org
http://lists.gluster.org/mailman/listinfo/gluster-users

Re: [Gluster-users] Dir split brain resolution

2018-02-05 Thread Karthik Subrahmanya
Hi,

I am wondering why the other brick is not showing any entry in split brain
in the heal info split-brain output.
Can you give the output of stat & getfattr -d -m . -e hex
<file path on brick> from both the bricks.
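
In your case that would be something like the following, run on the brick of
both gluster0 and gluster1 (path taken from your heal info output):

stat /gluster/engine/brick/ad1f38d7-36df-4cee-a092-ab0ce1f98ce9/ha_agent
getfattr -d -m . -e hex /gluster/engine/brick/ad1f38d7-36df-4cee-a092-ab0ce1f98ce9/ha_agent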

Regards,
Karthik

On Mon, Feb 5, 2018 at 5:03 PM, Alex K  wrote:

> After stoping/starting the volume I have:
>
> gluster volume heal engine  info split-brain
> Brick gluster0:/gluster/engine/brick
> 
> Status: Connected
> Number of entries in split-brain: 1
>
> Brick gluster1:/gluster/engine/brick
> Status: Connected
> Number of entries in split-brain: 0
>
> gluster volume heal engine split-brain latest-mtime
> gfid:bb675ea6-0622-4852-9e59-27a4c93ac0f8
> Healing gfid:bb675ea6-0622-4852-9e59-27a4c93ac0f8 failed:Operation not
> permitted.
> Volume heal failed.
>
> I will appreciate any help.
> thanx,
> Alex
>
> On Mon, Feb 5, 2018 at 1:11 PM, Alex K  wrote:
>
>> Hi all,
>>
>> I have a split brain issue and have the following situation:
>>
>> gluster volume heal engine  info split-brain
>>
>> Brick gluster0:/gluster/engine/brick
>> /ad1f38d7-36df-4cee-a092-ab0ce1f98ce9/ha_agent
>> Status: Connected
>> Number of entries in split-brain: 1
>>
>> Brick gluster1:/gluster/engine/brick
>> Status: Connected
>> Number of entries in split-brain: 0
>>
>> cd ha_agent/
>> [root@v0 ha_agent]# ls -al
>> ls: cannot access hosted-engine.metadata: Input/output error
>> ls: cannot access hosted-engine.lockspace: Input/output error
>> total 8
>> drwxrwx--- 2 vdsm kvm 4096 Feb  5 10:52 .
>> drwxr-xr-x 5 vdsm kvm 4096 Jan 18 01:17 ..
>> l? ? ??  ?? hosted-engine.lockspace
>> l? ? ??  ?? hosted-engine.metadata
>>
>> I tried to delete the directory from one node but it gives Input/output
>> error.
>> How would one proceed to resolve this?
>>
>> Thanx,
>> Alex
>>
>>
>
> ___
> Gluster-users mailing list
> Gluster-users@gluster.org
> http://lists.gluster.org/mailman/listinfo/gluster-users
>
___
Gluster-users mailing list
Gluster-users@gluster.org
http://lists.gluster.org/mailman/listinfo/gluster-users

Re: [Gluster-users] Dir split brain resolution

2018-02-05 Thread Karthik Subrahmanya
On 05-Feb-2018 7:12 PM, "Alex K" <rightkickt...@gmail.com> wrote:

Hi Karthik,

I tried to delete one file at one node and that is probably the reason.
After several deletes seems that I deleted some files that shouldn't and
the ovirt engine hosted on this volume was not able to start.
Now I am setting up the engine from scratch...
In case I see this kind of split brain again I will get back before I start
deleting :)

Sure. Thanks for the update.

Regards,
Karthik



Alex


On Mon, Feb 5, 2018 at 2:34 PM, Karthik Subrahmanya <ksubr...@redhat.com>
wrote:

> Hi,
>
> I am wondering why the other brick is not showing any entry in split brain
> in the heal info split-brain output.
> Can you give the output of stat & getfattr -d -m . -e hex
> <file path on brick> from both the bricks.
>
> Regards,
> Karthik
>
> On Mon, Feb 5, 2018 at 5:03 PM, Alex K <rightkickt...@gmail.com> wrote:
>
>> After stoping/starting the volume I have:
>>
>> gluster volume heal engine  info split-brain
>> Brick gluster0:/gluster/engine/brick
>> 
>> Status: Connected
>> Number of entries in split-brain: 1
>>
>> Brick gluster1:/gluster/engine/brick
>> Status: Connected
>> Number of entries in split-brain: 0
>>
>> gluster volume heal engine split-brain latest-mtime
>> gfid:bb675ea6-0622-4852-9e59-27a4c93ac0f8
>> Healing gfid:bb675ea6-0622-4852-9e59-27a4c93ac0f8 failed:Operation not
>> permitted.
>> Volume heal failed.
>>
>> I will appreciate any help.
>> thanx,
>> Alex
>>
>> On Mon, Feb 5, 2018 at 1:11 PM, Alex K <rightkickt...@gmail.com> wrote:
>>
>>> Hi all,
>>>
>>> I have a split brain issue and have the following situation:
>>>
>>> gluster volume heal engine  info split-brain
>>>
>>> Brick gluster0:/gluster/engine/brick
>>> /ad1f38d7-36df-4cee-a092-ab0ce1f98ce9/ha_agent
>>> Status: Connected
>>> Number of entries in split-brain: 1
>>>
>>> Brick gluster1:/gluster/engine/brick
>>> Status: Connected
>>> Number of entries in split-brain: 0
>>>
>>> cd ha_agent/
>>> [root@v0 ha_agent]# ls -al
>>> ls: cannot access hosted-engine.metadata: Input/output error
>>> ls: cannot access hosted-engine.lockspace: Input/output error
>>> total 8
>>> drwxrwx--- 2 vdsm kvm 4096 Feb  5 10:52 .
>>> drwxr-xr-x 5 vdsm kvm 4096 Jan 18 01:17 ..
>>> l? ? ??  ?? hosted-engine.lockspace
>>> l? ? ??  ?? hosted-engine.metadata
>>>
>>> I tried to delete the directory from one node but it gives Input/output
>>> error.
>>> How would one proceed to resolve this?
>>>
>>> Thanx,
>>> Alex
>>>
>>>
>>
>> ___
>> Gluster-users mailing list
>> Gluster-users@gluster.org
>> http://lists.gluster.org/mailman/listinfo/gluster-users
>>
>
>
___
Gluster-users mailing list
Gluster-users@gluster.org
http://lists.gluster.org/mailman/listinfo/gluster-users

Re: [Gluster-users] self-heal trouble after changing arbiter brick

2018-02-09 Thread Karthik Subrahmanya
On Fri, Feb 9, 2018 at 3:23 PM, Seva Gluschenko <g...@webkontrol.ru> wrote:

> Hi Karthik,
>
> Thank you for your reply. The heal is still undergoing, as the
> /var/log/glusterfs/glustershd.log keeps growing, and there's a lot of
> pending entries in the heal info.
>
> The gluster version is 3.10.9 and 3.10.10 (the version update in
> progress). It doesn't have info summary [yet?], and the heal info is way
> too long to attach here. (It takes more than 20 minutes just to collect it,
> but the truth is, the cluster is quite heavily loaded, it handles roughly 8
> million reads and 100k writes daily.)
>
Since you have a huge number of files inside nested directories and high load
on the cluster, it might take some time to complete the heal. You don't
need to worry about the gfids you are seeing on the heal info output.
Heal info summary is supported from version 3.13.

>
> The heal info output is full of lines like this:
>
> ...
>
> Brick gv2:/data/glusterfs
> 
> 
> 
> 
> 
> 
> 
> 
> 
> ...
>
> And so forth. Out of 80k+ lines, less than just 200 are not related to
> gfids (and yes, number of gfids is well beyond 64999):
>
> # grep -c gfid heal-info.fpack
> 80578
>
> # grep -v gfid heal-info.myvol
> Brick gv0:/data/glusterfs
> Status: Connected
> Number of entries: 0
>
> Brick gv1:/data/glusterfs
> Status: Connected
> Number of entries: 0
>
> Brick gv4:/data/gv01-arbiter
> Status: Connected
> Number of entries: 0
>
> Brick gv2:/data/glusterfs
> /testset/13f/13f27c303b3cb5e23ee647d8285a4a6d.pack
> /testset/05c - Possibly undergoing heal
>
> /testset/b99 - Possibly undergoing heal
>
> /testset/dd7 - Possibly undergoing heal
>
> /testset/0b8 - Possibly undergoing heal
>
> /testset/f21 - Possibly undergoing heal
>
> ...
>
> And here is the getfattr output for a sample file:
>
> # getfattr -d -e hex -m . /data/glusterfs/testset/13f/
> 13f27c303b3cb5e23ee647d8285a4a6d.pack
> getfattr: Removing leading '/' from absolute path names
> # file: data/glusterfs/testset/13f/13f27c303b3cb5e23ee647d8285a4a6d.pack
> trusted.afr.dirty=0x
> trusted.afr.myvol-client-6=0x0001
> trusted.bit-rot.version=0x02005a0d2f650005bf97
> trusted.gfid=0xb42d966b77154de990ecd092201714fd
>
> I tried several files, and the output is pretty much the same, the gfid is
> the only difference.
>
> Could it be anything else I would provide to shed some light on this?
>
I wanted to check the getfattr output of a file and a directory which
belong to the second replica subvolume, from all the 3 bricks
Brick4: gv2:/data/glusterfs
Brick5: gv3:/data/glusterfs
Brick6: gv1:/data/gv23-arbiter (arbiter)
to see the direction of pending markers being set.

Regards,
Karthik

>
> --
> Best Regards,
>
> Seva Gluschenko
> CTO @ http://webkontrol.ru
>
>
> February 9, 2018 9:16 AM, "Karthik Subrahmanya" <ksubr...@redhat.com> wrote:
>
> Hey,
> Did the heal completed and you still have some entries pending heal?
> If yes then can you provide the following informations to debug the issue.
> 1. Which version of gluster you are running
> 2. gluster volume heal <volname> info summary or gluster volume heal
> <volname> info
> 3. getfattr -d -e hex -m . <file path on brick> output of any one of the
> files which is pending heal, from all the bricks
> Regards,
> Karthik
> On Thu, Feb 8, 2018 at 12:48 PM, Seva Gluschenko <g...@webkontrol.ru>
> wrote:
>
> Hi folks,
>
> I'm troubled moving an arbiter brick to another server because of I/O load
> issues. My setup is as follows:
>
> # gluster volume info
>
> Volume Name: myvol
> Type: Distributed-Replicate
> Volume ID: 43ba517a-ac09-461e-99da-a197759a7dc8
> Status: Started
> Snapshot Count: 0
> Number of Bricks: 3 x (2 + 1) = 9
> Transport-type: tcp
> Bricks:
> Brick1: gv0:/data/glusterfs
> Brick2: gv1:/data/glusterfs
> Brick3: gv4:/data/gv01-arbiter (arbiter)
> Brick4: gv2:/data/glusterfs
> Brick5: gv3:/data/glusterfs
> Brick6: gv1:/data/gv23-arbiter (arbiter)
> Brick7: gv4:/data/glusterfs
> Brick8: gv5:/data/glusterfs
> Brick9: pluto:/var/gv45-arbiter (arbiter)
> Options Reconfigured:
> nfs.disable: on
> transport.address-family: inet
> storage.owner-gid: 1000
> storage.owner-uid: 1000
> cluster.self-heal-daemon: enable
>
> The gv23-arbiter is the brick that was recently moved from other server
> (chronos) using the following command:
>
> # gluster volume replace-brick myvol chronos:/mnt/gv23-arbiter
> gv1:/data/gv23-arbiter commit force
> volume replace-brick: success: replace-brick commit force operation
> succes

Re: [Gluster-users] self-heal trouble after changing arbiter brick

2018-02-08 Thread Karthik Subrahmanya
Hey,

Did the heal complete and do you still have some entries pending heal?
If yes then can you provide the following information to debug the issue.
1. Which version of gluster you are running
2. gluster volume heal <volname> info summary or gluster volume heal
<volname> info
3. getfattr -d -e hex -m . <file path on brick> output of any one of the
files which is pending heal, from all the bricks

Regards,
Karthik

On Thu, Feb 8, 2018 at 12:48 PM, Seva Gluschenko  wrote:

> Hi folks,
>
> I'm troubled moving an arbiter brick to another server because of I/O load
> issues. My setup is as follows:
>
> # gluster volume info
>
> Volume Name: myvol
> Type: Distributed-Replicate
> Volume ID: 43ba517a-ac09-461e-99da-a197759a7dc8
> Status: Started
> Snapshot Count: 0
> Number of Bricks: 3 x (2 + 1) = 9
> Transport-type: tcp
> Bricks:
> Brick1: gv0:/data/glusterfs
> Brick2: gv1:/data/glusterfs
> Brick3: gv4:/data/gv01-arbiter (arbiter)
> Brick4: gv2:/data/glusterfs
> Brick5: gv3:/data/glusterfs
> Brick6: gv1:/data/gv23-arbiter (arbiter)
> Brick7: gv4:/data/glusterfs
> Brick8: gv5:/data/glusterfs
> Brick9: pluto:/var/gv45-arbiter (arbiter)
> Options Reconfigured:
> nfs.disable: on
> transport.address-family: inet
> storage.owner-gid: 1000
> storage.owner-uid: 1000
> cluster.self-heal-daemon: enable
>
> The gv23-arbiter is the brick that was recently moved from other server
> (chronos) using the following command:
>
> # gluster volume replace-brick myvol chronos:/mnt/gv23-arbiter
> gv1:/data/gv23-arbiter commit force
> volume replace-brick: success: replace-brick commit force operation
> successful
>
> It's not the first time I was moving an arbiter brick, and the heal-count
> was zero for all the bricks before the change, so I didn't expect much
> trouble then. What was probably wrong is that I then forced chronos out of
> cluster with gluster peer detach command. All since that, over the course
> of the last 3 days, I see this:
>
> # gluster volume heal myvol statistics heal-count
> Gathering count of entries to be healed on volume myvol has been successful
>
> Brick gv0:/data/glusterfs
> Number of entries: 0
>
> Brick gv1:/data/glusterfs
> Number of entries: 0
>
> Brick gv4:/data/gv01-arbiter
> Number of entries: 0
>
> Brick gv2:/data/glusterfs
> Number of entries: 64999
>
> Brick gv3:/data/glusterfs
> Number of entries: 64999
>
> Brick gv1:/data/gv23-arbiter
> Number of entries: 0
>
> Brick gv4:/data/glusterfs
> Number of entries: 0
>
> Brick gv5:/data/glusterfs
> Number of entries: 0
>
> Brick pluto:/var/gv45-arbiter
> Number of entries: 0
>
> According to the /var/log/glusterfs/glustershd.log, the self-healing is
> undergoing, so it might be worth just sit and wait, but I'm wondering why
> this 64999 heal-count persists (a limitation on counter? In fact, gv2 and
> gv3 bricks contain roughly 30 million files), and I feel bothered because
> of the following output:
>
> # gluster volume heal myvol info heal-failed
> Gathering list of heal failed entries on volume myvol has been
> unsuccessful on bricks that are down. Please check if all brick processes
> are running.
>
> I attached the chronos server back to the cluster, with no noticeable
> effect. Any comments and suggestions would be much appreciated.
>
> --
> Best Regards,
>
> Seva Gluschenko
> CTO @ http://webkontrol.ru
>
> ___
> Gluster-users mailing list
> Gluster-users@gluster.org
> http://lists.gluster.org/mailman/listinfo/gluster-users
>
___
Gluster-users mailing list
Gluster-users@gluster.org
http://lists.gluster.org/mailman/listinfo/gluster-users

Re: [Gluster-users] self-heal trouble after changing arbiter brick

2018-02-09 Thread Karthik Subrahmanya
On 09-Feb-2018 7:07 PM, "Seva Gluschenko" <g...@webkontrol.ru> wrote:

Hi Karthik,


Thank you very much, you made me much more relaxed. Below is getfattr
output for a file from all the bricks:

root@gv2 ~ # getfattr -d -e hex -m . /data/glusterfs/testset/306/
30677af808ad578916f54783904e6342.pack

getfattr: Removing leading '/' from absolute path names
# file: data/glusterfs/testset/306/30677af808ad578916f54783904e6342.pack
trusted.afr.dirty=0x
trusted.afr.myvol-client-6=0x00010001
trusted.bit-rot.version=0x02005a0d2f650005bf97
trusted.gfid=0xe46e9a655128456bba0d98568d432717

root@gv3 ~ # getfattr -d -e hex -m . /data/glusterfs/testset/306/
30677af808ad578916f54783904e6342.pack

getfattr: Removing leading '/' from absolute path names
# file: data/glusterfs/testset/306/30677af808ad578916f54783904e6342.pack
trusted.afr.dirty=0x
trusted.afr.myvol-client-6=0x00010001
trusted.bit-rot.version=0x02005a0d2f6900076620
trusted.gfid=0xe46e9a655128456bba0d98568d432717

root@gv1 ~ # getfattr -d -e hex -m . /data/gv23-arbiter/testset/306/
30677af808ad578916f54783904e6342.pack

getfattr: Removing leading '/' from absolute path names
# file: data/gv23-arbiter/testset/306/30677af808ad578916f54783904e6342.pack
trusted.gfid=0xe46e9a655128456bba0d98568d432717

Is it okay that only gfid info is available on the arbiter brick?

Yes it is fine. On both the data bricks you have the good copy and the
entry is also created on the arbiter brick.  The arbiter brick is being
blamed by both the data bricks as well. So heal is happening in the right
direction and it should complete in some time.

Best Regards,
Karthik



--
Best Regards,

Seva Gluschenko
CTO @ http://webkontrol.ru


February 9, 2018 2:01 PM, "Karthik Subrahmanya" <ksubr...@redhat.com> wrote:

On Fri, Feb 9, 2018 at 3:23 PM, Seva Gluschenko <g...@webkontrol.ru> wrote:

Hi Karthik,

Thank you for your reply. The heal is still undergoing, as the
/var/log/glusterfs/glustershd.log keeps growing, and there's a lot of
pending entries in the heal info.

The gluster version is 3.10.9 and 3.10.10 (the version update in progress).
It doesn't have info summary [yet?], and the heal info is way too long to
attach here. (It takes more than 20 minutes just to collect it, but the
truth is, the cluster is quite heavily loaded, it handles roughly 8 million
reads and 100k writes daily.)

Since you have huge number of files inside nested directories and high load
on the cluster, it might take some time to complete the heal. You don't
need to worry about the gfids you are seeing on the heal info output.
Heal info summary is supported from version 3.13.


The heal info output is full of lines like this:

...

Brick gv2:/data/glusterfs









...

And so forth. Out of 80k+ lines, less than just 200 are not related to
gfids (and yes, number of gfids is well beyond 64999):

# grep -c gfid heal-info.fpack
80578

# grep -v gfid heal-info.myvol
Brick gv0:/data/glusterfs
Status: Connected
Number of entries: 0

Brick gv1:/data/glusterfs
Status: Connected
Number of entries: 0

Brick gv4:/data/gv01-arbiter
Status: Connected
Number of entries: 0

Brick gv2:/data/glusterfs
/testset/13f/13f27c303b3cb5e23ee647d8285a4a6d.pack
/testset/05c - Possibly undergoing heal

/testset/b99 - Possibly undergoing heal

/testset/dd7 - Possibly undergoing heal

/testset/0b8 - Possibly undergoing heal

/testset/f21 - Possibly undergoing heal

...

And here is the getfattr output for a sample file:

# getfattr -d -e hex -m . /data/glusterfs/testset/13f/13
f27c303b3cb5e23ee647d8285a4a6d.pack
getfattr: Removing leading '/' from absolute path names
# file: data/glusterfs/testset/13f/13f27c303b3cb5e23ee647d8285a4a6d.pack
trusted.afr.dirty=0x
trusted.afr.myvol-client-6=0x0001
trusted.bit-rot.version=0x02005a0d2f650005bf97
trusted.gfid=0xb42d966b77154de990ecd092201714fd

I tried several files, and the output is pretty much the same, the gfid is
the only difference.

Could it be anything else I would provide to shed some light on this?

I wanted to check the getfattr output of a file and a directory which
belongs to the second replica sub volume from all the 3 bricks
Brick4: gv2:/data/glusterfs
Brick5: gv3:/data/glusterfs
Brick6: gv1:/data/gv23-arbiter (arbiter)
to see the direction of pending markers being set.
Regards,
Karthik


--
Best Regards,

Seva Gluschenko
CTO @ http://webkontrol.ru

February 9, 2018 9:16 AM, "Karthik Subrahmanya" <ksubr...@redhat.com> wrote:

Hey,
Did the heal completed and you still have some entries pending heal?
If yes then can you provide the following informations to debug the issue.
1. Which version of gluster you are running
2. gluster volume heal  info summa

Re: [Gluster-users] self-heal trouble after changing arbiter brick

2018-02-08 Thread Karthik Subrahmanya
On Fri, Feb 9, 2018 at 11:46 AM, Karthik Subrahmanya <ksubr...@redhat.com>
wrote:

> Hey,
>
> Did the heal completed and you still have some entries pending heal?
> If yes then can you provide the following informations to debug the issue.
> 1. Which version of gluster you are running
> 2. Output of gluster volume heal <volname> info summary or gluster volume
> heal <volname> info
> 3. getfattr -d -e hex -m . <file path on brick> output of any one of the
> file which is pending heal from all the bricks
>
> Regards,
> Karthik
>
> On Thu, Feb 8, 2018 at 12:48 PM, Seva Gluschenko <g...@webkontrol.ru>
> wrote:
>
>> Hi folks,
>>
>> I'm troubled moving an arbiter brick to another server because of I/O
>> load issues. My setup is as follows:
>>
>> # gluster volume info
>>
>> Volume Name: myvol
>> Type: Distributed-Replicate
>> Volume ID: 43ba517a-ac09-461e-99da-a197759a7dc8
>> Status: Started
>> Snapshot Count: 0
>> Number of Bricks: 3 x (2 + 1) = 9
>> Transport-type: tcp
>> Bricks:
>> Brick1: gv0:/data/glusterfs
>> Brick2: gv1:/data/glusterfs
>> Brick3: gv4:/data/gv01-arbiter (arbiter)
>> Brick4: gv2:/data/glusterfs
>> Brick5: gv3:/data/glusterfs
>> Brick6: gv1:/data/gv23-arbiter (arbiter)
>> Brick7: gv4:/data/glusterfs
>> Brick8: gv5:/data/glusterfs
>> Brick9: pluto:/var/gv45-arbiter (arbiter)
>> Options Reconfigured:
>> nfs.disable: on
>> transport.address-family: inet
>> storage.owner-gid: 1000
>> storage.owner-uid: 1000
>> cluster.self-heal-daemon: enable
>>
>> The gv23-arbiter is the brick that was recently moved from other server
>> (chronos) using the following command:
>>
>> # gluster volume replace-brick myvol chronos:/mnt/gv23-arbiter
>> gv1:/data/gv23-arbiter commit force
>> volume replace-brick: success: replace-brick commit force operation
>> successful
>>
>> It's not the first time I was moving an arbiter brick, and the heal-count
>> was zero for all the bricks before the change, so I didn't expect much
>> trouble then. What was probably wrong is that I then forced chronos out of
>> cluster with gluster peer detach command. All since that, over the course
>> of the last 3 days, I see this:
>>
>> # gluster volume heal myvol statistics heal-count
>> Gathering count of entries to be healed on volume myvol has been
>> successful
>>
>> Brick gv0:/data/glusterfs
>> Number of entries: 0
>>
>> Brick gv1:/data/glusterfs
>> Number of entries: 0
>>
>> Brick gv4:/data/gv01-arbiter
>> Number of entries: 0
>>
>> Brick gv2:/data/glusterfs
>> Number of entries: 64999
>>
>> Brick gv3:/data/glusterfs
>> Number of entries: 64999
>>
>> Brick gv1:/data/gv23-arbiter
>> Number of entries: 0
>>
>> Brick gv4:/data/glusterfs
>> Number of entries: 0
>>
>> Brick gv5:/data/glusterfs
>> Number of entries: 0
>>
>> Brick pluto:/var/gv45-arbiter
>> Number of entries: 0
>>
>> According to the /var/log/glusterfs/glustershd.log, the self-healing is
>> undergoing, so it might be worth just sit and wait, but I'm wondering why
>> this 64999 heal-count persists (a limitation on counter? In fact, gv2 and
>> gv3 bricks contain roughly 30 million files), and I feel bothered because
>> of the following output:
>>
>> # gluster volume heal myvol info heal-failed
>> Gathering list of heal failed entries on volume myvol has been
>> unsuccessful on bricks that are down. Please check if all brick processes
>> are running.
>>
>> I attached the chronos server back to the cluster, with no noticeable
>> effect. Any comments and suggestions would be much appreciated.
>>
>> --
>> Best Regards,
>>
>> Seva Gluschenko
>> CTO @ http://webkontrol.ru
>>
>> ___
>> Gluster-users mailing list
>> Gluster-users@gluster.org
>> http://lists.gluster.org/mailman/listinfo/gluster-users
>>
>
>
___
Gluster-users mailing list
Gluster-users@gluster.org
http://lists.gluster.org/mailman/listinfo/gluster-users

Re: [Gluster-users] How to fix an out-of-sync node?

2018-02-08 Thread Karthik Subrahmanya
Hi,

From the information you provided, I am guessing that you have a replica 3
volume configured.
In that case you can run "gluster volume heal <volname>" which should do
the trick for you.
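
With the volume name from your create command that would be, for example:

gluster volume heal myBrick          # trigger healing of the pending entries
gluster volume heal myBrick info     # check which entries are still pending
gluster volume heal myBrick full     # if nothing is listed but the bricks still differ, force a full crawl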

Regards,
Karthik

On Thu, Feb 8, 2018 at 6:16 AM, Frizz  wrote:

> I have a setup with 3 nodes running GlusterFS.
>
> gluster volume create myBrick replica 3 node01:/mnt/data/myBrick
> node02:/mnt/data/myBrick node03:/mnt/data/myBrick
>
> Unfortunately node1 seemed to stop syncing with the other nodes, but this
> was undetected for weeks!
>
> When I noticed it, I did a "service glusterd restart" on node1, hoping the
> three nodes would sync again.
>
> But this did not happen. Only the CPU load went up on all three nodes +
> the access time went up.
>
> When I look into the physical storage of the bricks, node1 is very
> different
> node01:/mnt/data/myBrick : 9GB data
> node02:/mnt/data/myBrick : 12GB data
> node03:/mnt/data/myBrick : 12GB data
>
> How do I sync data from the healthy nodes Node2/Node3 back to Node1?
>
> ___
> Gluster-users mailing list
> Gluster-users@gluster.org
> http://lists.gluster.org/mailman/listinfo/gluster-users
>
___
Gluster-users mailing list
Gluster-users@gluster.org
http://lists.gluster.org/mailman/listinfo/gluster-users

Re: [Gluster-users] Gluster replicate 3 arbiter 1 in split brain. gluster cli seems unaware

2017-12-22 Thread Karthik Subrahmanya
Hey Henrik,

Good to know that the issue got resolved. I will try to answer some of the
questions you have.
- The time taken to heal the file depends on its size. That's why you were
seeing some delay in getting everything back to normal in the heal info
output.
- You did not hit the split-brain situation. In split-brain all the bricks
will be blaming the other bricks. But in your case the third brick was not
blamed by any other brick.
- It was not able to heal the file because the arbiter cannot be the source
for data heal. The other two data bricks were blaming each other, so heal was
not able to decide on the source.
  This is the arbiter-becoming-source-for-data-heal issue. We are working on
the fix for this, and it will be shipped with the next release.
- Since it was not in split-brain, you were not able to see this in heal info
split-brain and not able to resolve this using the CLI for split-brain
resolution.
- You can use the heal command to perform syncing of data after brick
maintenance. Once the brick comes up, the heal will anyway be triggered
automatically.
- You can use the heal info command to monitor the status of heal.
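
For example, with your volume these would be:

gluster volume heal virt_images                          # trigger an index heal after brick maintenance
gluster volume heal virt_images info                     # entries still pending heal
gluster volume heal virt_images statistics heal-count    # per-brick count of pending entries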

Regards,
Karthik

On Fri, Dec 22, 2017 at 6:01 PM, Henrik Juul Pedersen <h...@liab.dk> wrote:

> Hi Karthik,
>
> Thanks for the info. Maybe the documentation should be updated to
> explain the different AFR versions, I know I was confused.
>
> Also, when looking at the changelogs from my three bricks before fixing:
>
> Brick 1:
> trusted.afr.virt_images-client-1=0x0228
> trusted.afr.virt_images-client-3=0x
>
> Brick 2:
> trusted.afr.virt_images-client-2=0x03ef
> trusted.afr.virt_images-client-3=0x
>
> Brick 3 (arbiter):
> trusted.afr.virt_images-client-1=0x0228
>
> I would think that the changelog for client 1 should win by majority
> vote? Or how does the self-healing process work?
> I assumed this as the correct version, and reset client 2 on brick 2:
> # setfattr -n trusted.afr.virt_images-client-2 -v
> 0x fedora27.qcow2
>
> I then did a directory listing, which might have started a heal, but
> heal statistics show (i also did a full heal):
> Starting time of crawl: Fri Dec 22 11:34:47 2017
>
> Ending time of crawl: Fri Dec 22 11:34:47 2017
>
> Type of crawl: INDEX
> No. of entries healed: 0
> No. of entries in split-brain: 0
> No. of heal failed entries: 1
>
> Starting time of crawl: Fri Dec 22 11:39:29 2017
>
> Ending time of crawl: Fri Dec 22 11:39:29 2017
>
> Type of crawl: FULL
> No. of entries healed: 0
> No. of entries in split-brain: 0
> No. of heal failed entries: 1
>
> I was immediately able to touch the file, so gluster was okay about
> it, however heal info still showed the file for a while:
> # gluster volume heal virt_images info
> Brick virt3:/data/virt_images/brick
> /fedora27.qcow2
> Status: Connected
> Number of entries: 1
>
> Brick virt2:/data/virt_images/brick
> /fedora27.qcow2
> Status: Connected
> Number of entries: 1
>
> Brick printserver:/data/virt_images/brick
> /fedora27.qcow2
> Status: Connected
> Number of entries: 1
>
>
>
> Now heal info shows 0 entries, and the two data bricks have the same
> md5sum, so it's back in sync.
>
>
>
> I have a few questions after all of this:
>
> 1) How can a split brain happen in a replica 3 arbiter 1 setup with
> both server- and client quorum enabled?
> 2) Why was it not able to self heal, when tro bricks seemed in sync
> with their changelogs?
> 3) Why could I not see the file in heal info split-brain?
> 4) Why could I not fix this through the cli split-brain resolution tool?
> 5) Is it possible to force a sync in a volume? Or maybe test sync
> status? It might be smart to be able to "flush" changes when taking a
> brick down for maintenance.
> 6) How am I supposed to monitor events like this? I have a gluster
> volume with ~500.000 files, I need to be able to guarantee data
> integrity and availability to the users.
> 7) Is glusterfs "production ready"? Because I find it hard to monitor
> and thus trust in these setups. Also performance with small / many
> files seems horrible at best - but that's for another discussion.
>
> Thanks for all of your help, Ill continue to try and tweak some
> performance out of this. :)
>
> Best regards,
> Henrik Juul Pedersen
> LIAB ApS
>
> On 22 December 2017 at 07:26, Karthik Subrahmanya <ksubr...@redhat.com>
> wrote:
> > Hi Henrik,
> >
> > Thanks for providing the required outputs. See my replies inline.
> >
> > On Thu, Dec 21, 2017 at 10:42 PM, Henrik Juul Pedersen <h...@liab.dk>
> wrote:
> 

Re: [Gluster-users] Quorum in distributed-replicate volume

2018-02-26 Thread Karthik Subrahmanya
On Mon, Feb 26, 2018 at 6:14 PM, Dave Sherohman <d...@sherohman.org> wrote:

> On Mon, Feb 26, 2018 at 05:45:27PM +0530, Karthik Subrahmanya wrote:
> > > "In a replica 2 volume... If we set the client-quorum option to
> > > auto, then the first brick must always be up, irrespective of the
> > > status of the second brick. If only the second brick is up, the
> > > subvolume becomes read-only."
> > >
> > By default client-quorum is "none" in replica 2 volume.
>
> I'm not sure where I saw the directions saying to set it, but I do have
> "cluster.quorum-type: auto" in my volume configuration.  (And I think
> that's client quorum, but feel free to correct me if I've misunderstood
> the docs.)
>
If it is "auto" then I think it is reconfigured. In replica 2 it will be
"none".

>
> > It applies to all the replica 2 volumes even if it has just 2 brick or
> more.
> > Total brick count in the volume doesn't matter for the quorum, what
> matters
> > is the number of bricks which are up in the particular replica subvol.
>
> Thanks for confirming that.
>
> > If I understood your configuration correctly it should look something
> like
> > this:
> > (Please correct me if I am wrong)
> > replica-1:  bricks 1 & 2
> > replica-2: bricks 3 & 4
> > replica-3: bricks 5 & 6
>
> Yes, that's correct.
>
> > Since quorum is per replica, if it is set to auto then it needs the first
> > brick of the particular replica subvol to be up to perform the fop.
> >
> > In replica 2 volumes you can end up in split-brains.
>
> How would that happen if bricks which are not in (cluster-wide) quorum
> refuse to accept writes?  I'm not seeing the reason for using individual
> subvolume quorums instead of full-volume quorum.
>
Split brains happen within the replica pair.
I will try to explain how you can end up in split-brain even with cluster
wide quorum:
Let's say you have a 6 brick (replica 2) volume and you always have at least
the quorum number of bricks up & running.
Bricks 1 & 2 are part of replica subvol-1
Bricks 3 & 4 are part of replica subvol-2
Bricks 5 & 6 are part of replica subvol-3

- Brick 1 goes down and a write comes on a file which is part of that
replica subvol-1
- Quorum is met since we have 5 out of 6 bricks are running
- Brick 2 says brick 1 is bad
- Brick 2 goes down and brick 1 comes up. Heal did not happen
- Write comes on the same file, quorum is met, and now brick 1 says brick 2
is bad
- When both the bricks 1 & 2 are up, both of them blame the other brick -
*split-brain*

>
> > It would be great if you can consider configuring an arbiter or
> > replica 3 volume.
>
> I can.  My bricks are 2x850G and 4x11T, so I can repurpose the small
> bricks as arbiters with minimal effect on capacity.  What would be the
> sequence of commands needed to:
>
> 1) Move all data off of bricks 1 & 2
> 2) Remove that replica from the cluster
> 3) Re-add those two bricks as arbiters
>
>
> (And did I miss any additional steps?)
>
> Unfortunately, I've been running a few months already with the current
> configuration and there are several virtual machines running off the
> existing volume, so I'll need to reconfigure it online if possible.
>
Without knowing the volume configuration it is difficult to suggest the
configuration change,
and since it is a live system you may end up in data unavailability or data
loss.
Can you give the output of "gluster volume info <volname>"
and which brick is of what size.
Note: The arbiter bricks need not be of bigger size.
[1] gives information about how you can provision the arbiter brick.

[1]
http://docs.gluster.org/en/latest/Administrator%20Guide/arbiter-volumes-and-quorum/#arbiter-bricks-sizing

Regards,
Karthik

>
> --
> Dave Sherohman
>
___
Gluster-users mailing list
Gluster-users@gluster.org
http://lists.gluster.org/mailman/listinfo/gluster-users

Re: [Gluster-users] promote arbiter to full replica

2018-07-15 Thread Karthik Subrahmanya
Hi,

Yes, you can do that; a concrete example sequence is sketched after these steps.
- Make sure "gluster volume heal <volname> info" shows zero entries.
- Remove the arbiter brick using the command "gluster volume remove-brick
<volname> replica 2 <arbiter-brick> force"
- Add a new brick of the same size as the other 2 data bricks using the
command "gluster volume add-brick <volname> replica 3 <new-brick>"
- heal info should become zero again after some time (depends on the amount of
data that needs to be replicated to the new brick)
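
As a rough sketch, assuming a volume named "myvol" and hypothetical host/brick
paths:

# gluster volume heal myvol info
# gluster volume remove-brick myvol replica 2 arbiterhost:/bricks/arbiter/myvol force
# gluster volume add-brick myvol replica 3 newhost:/bricks/data3/myvol
# gluster volume heal myvol info

The last heal info run is just to watch the new brick being populated; repeat
it until the entry count drops back to zero.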

HTH,
Karthik

On Mon, Jul 16, 2018 at 5:01 AM Laura Bailey  wrote:

> We have instructions for doing the reverse:
>
>
> https://access.redhat.com/documentation/en-us/red_hat_gluster_storage/3.3/html/administration_guide/creating_arbitrated_replicated_volumes#sect-Convert_Rep2_to_Arbiter
>
> But not for going from arbiter to a larger replica. If we get some
> instructions in this thread I can add them to the docs. :)
>
> Cheers,
> Laura B
>
> On Sunday, July 15, 2018, Joseph Wenninger  wrote:
>
>> Hi!
>>
>> I'm using glusterfs 4.1
>>
>> I have a volume with replicas 3 arbiter 1.
>>
>> Is it possible to promote the arbiter to a full replica so that the volume
>> has 3 full copies of all data.
>>
>> I didn't find any documentation about that, so I guess it is not possible
>> to do, although I hope I've missed something, so perhaps somebody could
>> hint me to the right documentation
>>
>>
>> Best regards
>> Joseph Wenninger
>> ___
>> Gluster-users mailing list
>> Gluster-users@gluster.org
>> https://lists.gluster.org/mailman/listinfo/gluster-users
>>
>
>
> --
> Laura Bailey
> Principal Technical Writer
> Customer Content Services BNE
>
> ___
> Gluster-users mailing list
> Gluster-users@gluster.org
> https://lists.gluster.org/mailman/listinfo/gluster-users
___
Gluster-users mailing list
Gluster-users@gluster.org
https://lists.gluster.org/mailman/listinfo/gluster-users

Re: [Gluster-users] Files not healing & missing their extended attributes - Help!

2018-07-04 Thread Karthik Subrahmanya
Hi,

From the logs you have pasted it looks like those files are in GFID
split-brain.
They should have the GFIDs assigned on both the data bricks but they will
be different.

Can you please paste the getfattr output of those files and their parent
from all the bricks again?
Which version of gluster you are using?

If you are using a version higher than or equal to 3.12, GFID split-brains
can be resolved using the methods (except method 4)
explained in the "Resolution of split-brain using gluster CLI" section in [1].
Also note that for GFID split-brain resolution using the CLI you have to pass
the name of the file as the argument and not the GFID.
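
For reference, the CLI methods from [1] take the file path as seen from the
mount point; the volume name and paths below are placeholders:

# gluster volume heal <volname> split-brain latest-mtime /path/to/file
# gluster volume heal <volname> split-brain bigger-file /path/to/file
# gluster volume heal <volname> split-brain source-brick <hostname>:<brickpath> /path/to/file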

If it is lower than 3.12 (please consider upgrading, since those versions are
EOL) you have to resolve it manually as explained in [2].

[1] https://docs.gluster.org/en/latest/Troubleshooting/resolving-splitbrain/
[2]
https://docs.gluster.org/en/latest/Troubleshooting/resolving-splitbrain/#dir-split-brain

Thanks & Regards,
Karthik

On Wed, Jul 4, 2018 at 1:59 AM Gambit15  wrote:

> On 1 July 2018 at 22:37, Ashish Pandey  wrote:
>
>>
>> The only problem at the moment is that arbiter brick offline. You should
>> only bother about completion of maintenance of arbiter brick ASAP.
>> Bring this brick UP, start FULL heal or index heal and the volume will be
>> in healthy state.
>>
>
> Doesn't the arbiter only resolve split-brain situations? None of the files
> that have been marked for healing are marked as in split-brain.
>
> The arbiter has now been brought back up, however the problem continues.
>
> I've found the following information in the client log:
>
> [2018-07-03 19:09:29.245089] W [MSGID: 108008]
> [afr-self-heal-name.c:354:afr_selfheal_name_gfid_mismatch_check]
> 0-engine-replicate-0: GFID mismatch for
> /hosted-engine.metadata
> 5e95ba8c-2f12-49bf-be2d-b4baf210d366 on engine-client-1 and
> b9cd7613-3b96-415d-a549-1dc788a4f94d on engine-client-0
> [2018-07-03 19:09:29.245585] W [fuse-bridge.c:471:fuse_entry_cbk]
> 0-glusterfs-fuse: 10430040: LOOKUP()
> /98495dbc-a29c-4893-b6a0-0aa70860d0c9/ha_agent/hosted-engine.metadata => -1
> (Input/output error)
> [2018-07-03 19:09:30.619000] W [MSGID: 108008]
> [afr-self-heal-name.c:354:afr_selfheal_name_gfid_mismatch_check]
> 0-engine-replicate-0: GFID mismatch for
> /hosted-engine.lockspace
> 8e86902a-c31c-4990-b0c5-0318807edb8f on engine-client-1 and
> e5899a4c-dc5d-487e-84b0-9bbc73133c25 on engine-client-0
> [2018-07-03 19:09:30.619360] W [fuse-bridge.c:471:fuse_entry_cbk]
> 0-glusterfs-fuse: 10430656: LOOKUP()
> /98495dbc-a29c-4893-b6a0-0aa70860d0c9/ha_agent/hosted-engine.lockspace =>
> -1 (Input/output error)
>
> As you can see from the logs I posted previously, neither of those two
> files, on either of the two servers, have any of gluster's extended
> attributes set.
>
> The arbiter doesn't have any record of the files in question, as they were
> created after it went offline.
>
> How do I fix this? Is it possible to locate the correct gfids somewhere &
> redefine them on the files manually?
>
> Cheers,
>  Doug
>
> --
>> *From: *"Gambit15" 
>> *To: *"Ashish Pandey" 
>> *Cc: *"gluster-users" 
>> *Sent: *Monday, July 2, 2018 1:45:01 AM
>> *Subject: *Re: [Gluster-users] Files not healing & missing their
>> extended attributes - Help!
>>
>>
>> Hi Ashish,
>>
>> The output is below. It's a rep 2+1 volume. The arbiter is offline for
>> maintenance at the moment, however quorum is met & no files are reported as
>> in split-brain (it hosts VMs, so files aren't accessed concurrently).
>>
>> ==
>> [root@v0 glusterfs]# gluster volume info engine
>>
>> Volume Name: engine
>> Type: Replicate
>> Volume ID: 279737d3-3e5a-4ee9-8d4a-97edcca42427
>> Status: Started
>> Snapshot Count: 0
>> Number of Bricks: 1 x (2 + 1) = 3
>> Transport-type: tcp
>> Bricks:
>> Brick1: s0:/gluster/engine/brick
>> Brick2: s1:/gluster/engine/brick
>> Brick3: s2:/gluster/engine/arbiter (arbiter)
>> Options Reconfigured:
>> nfs.disable: on
>> performance.readdir-ahead: on
>> transport.address-family: inet
>> performance.quick-read: off
>> performance.read-ahead: off
>> performance.io-cache: off
>> performance.stat-prefetch: off
>> cluster.eager-lock: enable
>> network.remote-dio: enable
>> cluster.quorum-type: auto
>> cluster.server-quorum-type: server
>> storage.owner-uid: 36
>> storage.owner-gid: 36
>> performance.low-prio-threads: 32
>>
>> ==
>>
>> [root@v0 glusterfs]# gluster volume heal engine info
>> Brick s0:/gluster/engine/brick
>> /__DIRECT_IO_TEST__
>> /98495dbc-a29c-4893-b6a0-0aa70860d0c9/ha_agent
>> /98495dbc-a29c-4893-b6a0-0aa70860d0c9
>> 
>> Status: Connected
>> Number of entries: 34
>>
>> Brick s1:/gluster/engine/brick
>> 
>> Status: Connected
>> Number of entries: 34
>>
>> Brick s2:/gluster/engine/arbiter
>> Status: Transport endpoint is not connected
>> Number of entries: -
>>
>> ==
>> === PEER V0 ===
>>
>> [root@v0 glusterfs]# getfattr -m . -d -e hex
>> 

Re: [Gluster-users] Split brain directory

2018-01-24 Thread Karthik Subrahmanya
Hey,

From the getfattr output you have provided, the directory is clearly not in
split-brain.
Only when all the bricks are blamed by the others is it called split-brain.
In your case only client-13, which is Brick14 in the volume info output, had
a pending entry heal on the directory.
That is the last replica subvol which consists of the bricks

Brick13: glusterserver03.mydomain.local:/bricks/video/brick3/safe
Brick14: glusterserver04.mydomain.local:/bricks/video/brick3/safe
Brick15: glusterserver05.mydomain.local:/bricks/video/brick3/safe (arbiter)

That pending entry got healed, either as part of the heal you ran or as part of
the self-heal crawl, and the pending xattrs got reset to all zeros.
Which file are you not able to access? Can you give the getfattr output of
that file, the shd log,
and the mount log from the client where you were not able to access the file?

Regards,
Karthik

On Wed, Jan 24, 2018 at 2:00 PM, Luca Gervasi 
wrote:

> Hello,
> I'm trying to fix an issue with a Directory Split on a gluster 3.10.3. The
> effect consist of a specific file in this splitted directory to randomly be
> unavailable on some clients.
> I have gathered all the information on this gist:
> https://gist.githubusercontent.com/lucagervasi/534e0024d349933eef44615fa8a5c374/raw/52ff8dd6a9cc8ba09b7f258aa85743d2854f9acc/splitinfo.txt
>
> I discovered the splitted directory by the extended attributes (lines
> 172,173, 291,292,
> trusted.afr.dirty=0x
> trusted.afr.vol-video-client-13=0x
> Seen on the bricks
> * /bricks/video/brick3/safe/video.mysite.it/htdocs/ su glusterserver05
> (lines 278 ro 294)
> * /bricks/video/brick3/safe/video.mysite.it/htdocs/ su glusterserver03
> (lines 159 to 175)
>
> Reading the documentation about afr extended attributes, this situation
> seems unclear (Docs from [1] and [2])
> as own changelog is 0, same as client-13 (glusterserver02.mydomain.
> local:/bricks/video/brick3/safe)
> as my understanding, such "dirty" attributes seems to indicate no split at
> all (feel free to correct me).
>
> Some days ago, I issued a "gluster volume heal vol-video full", which
> endend (probably) that day, leaving no info on /var/log/gluster/glustershd.log
> nor fixing this split.
> I tried to trigger a self heal using "stat" and "ls -l" over the splitted
> directory from a glusterfs mounted client directory, without having the bit
> set cleared.
> The volume heal info split-brain itself shows zero items to be healed
> (lines 388 to 446).
>
> All the clients mount this volume using glusterfs-fuse.
>
> I don't know what to do, please help.
>
> Thanks.
>
> Luca Gervasi
>
> References:
> [1] https://access.redhat.com/documentation/en-US/Red_Hat_Storage/2.1/html/Administration_Guide/Recovering_from_File_Split-brain.html
> [2] https://access.redhat.com/documentation/en-us/red_hat_gluster_storage/3.3/html/administration_guide/sect-managing_split-brain
>
> ___
> Gluster-users mailing list
> Gluster-users@gluster.org
> http://lists.gluster.org/mailman/listinfo/gluster-users
>
___
Gluster-users mailing list
Gluster-users@gluster.org
http://lists.gluster.org/mailman/listinfo/gluster-users

Re: [Gluster-users] Can't heal a volume: "Please check if all brick processes are running."

2018-03-13 Thread Karthik Subrahmanya
Hi Anatoliy,

The heal command is basically used to heal any mismatching contents between
replica copies of the files.
For the command "gluster volume heal <volname>" to succeed, you should have
the self-heal daemon running,
which is true only if your volume is of type replicate/disperse.
In your case you have a plain distribute volume, which does not store a
replica of any file.
So the volume heal command returns that error.

Regards,
Karthik

On Tue, Mar 13, 2018 at 7:53 PM, Anatoliy Dmytriyev 
wrote:

> Hi,
>
>
> Maybe someone can point me to a documentation or explain this? I can't
> find it myself.
> Do we have any other useful resources except doc.gluster.org? As I see
> many gluster options are not described there or there are no explanation
> what is doing...
>
>
>
> On 2018-03-12 15:58, Anatoliy Dmytriyev wrote:
>
>> Hello,
>>
>> We have a very fresh gluster 3.10.10 installation.
>> Our volume is created as distributed volume, 9 bricks 96TB in total
>> (87TB after 10% of gluster disk space reservation)
>>
>> For some reasons I can’t “heal” the volume:
>> # gluster volume heal gv0
>> Launching heal operation to perform index self heal on volume gv0 has
>> been unsuccessful on bricks that are down. Please check if all brick
>> processes are running.
>>
>> Which processes should be run on every brick for heal operation?
>>
>> # gluster volume status
>> Status of volume: gv0
>> Gluster process TCP Port  RDMA Port  Online
>> Pid
>> 
>> --
>> Brick cn01-ib:/gfs/gv0/brick1/brick 0 49152  Y
>>  70850
>> Brick cn02-ib:/gfs/gv0/brick1/brick 0 49152  Y
>>  102951
>> Brick cn03-ib:/gfs/gv0/brick1/brick 0 49152  Y
>>  57535
>> Brick cn04-ib:/gfs/gv0/brick1/brick 0 49152  Y
>>  56676
>> Brick cn05-ib:/gfs/gv0/brick1/brick 0 49152  Y
>>  56880
>> Brick cn06-ib:/gfs/gv0/brick1/brick 0 49152  Y
>>  56889
>> Brick cn07-ib:/gfs/gv0/brick1/brick 0 49152  Y
>>  56902
>> Brick cn08-ib:/gfs/gv0/brick1/brick 0 49152  Y
>>  94920
>> Brick cn09-ib:/gfs/gv0/brick1/brick 0 49152  Y
>>  56542
>>
>> Task Status of Volume gv0
>> 
>> --
>> There are no active volume tasks
>>
>>
>> # gluster volume info gv0
>> Volume Name: gv0
>> Type: Distribute
>> Volume ID: 8becaf78-cf2d-4991-93bf-f2446688154f
>> Status: Started
>> Snapshot Count: 0
>> Number of Bricks: 9
>> Transport-type: rdma
>> Bricks:
>> Brick1: cn01-ib:/gfs/gv0/brick1/brick
>> Brick2: cn02-ib:/gfs/gv0/brick1/brick
>> Brick3: cn03-ib:/gfs/gv0/brick1/brick
>> Brick4: cn04-ib:/gfs/gv0/brick1/brick
>> Brick5: cn05-ib:/gfs/gv0/brick1/brick
>> Brick6: cn06-ib:/gfs/gv0/brick1/brick
>> Brick7: cn07-ib:/gfs/gv0/brick1/brick
>> Brick8: cn08-ib:/gfs/gv0/brick1/brick
>> Brick9: cn09-ib:/gfs/gv0/brick1/brick
>> Options Reconfigured:
>> client.event-threads: 8
>> performance.parallel-readdir: on
>> performance.readdir-ahead: on
>> cluster.nufa: on
>> nfs.disable: on
>>
>
> --
> Best regards,
> Anatoliy
> ___
> Gluster-users mailing list
> Gluster-users@gluster.org
> http://lists.gluster.org/mailman/listinfo/gluster-users
>
___
Gluster-users mailing list
Gluster-users@gluster.org
http://lists.gluster.org/mailman/listinfo/gluster-users

Re: [Gluster-users] Can't heal a volume: "Please check if all brick processes are running."

2018-03-13 Thread Karthik Subrahmanya
On Wed, Mar 14, 2018 at 4:33 AM, Laura Bailey <lbai...@redhat.com> wrote:

> Can we add a smarter error message for this situation by checking volume
> type first?

Yes we can. I will do that.

Thanks,
Karthik

>
> Cheers,
> Laura B
>
>
> On Wednesday, March 14, 2018, Karthik Subrahmanya <ksubr...@redhat.com>
> wrote:
>
>> Hi Anatoliy,
>>
>> The heal command is basically used to heal any mismatching contents
>> between replica copies of the files.
>> For the command "gluster volume heal " to succeed, you should
>> have the self-heal-daemon running,
>> which is true only if your volume is of type replicate/disperse.
>> In your case you have a plain distribute volume where you do not store
>> the replica of any files.
>> So the volume heal will return you the error.
>>
>> Regards,
>> Karthik
>>
>> On Tue, Mar 13, 2018 at 7:53 PM, Anatoliy Dmytriyev <to...@tolid.eu.org>
>> wrote:
>>
>>> Hi,
>>>
>>>
>>> Maybe someone can point me to a documentation or explain this? I can't
>>> find it myself.
>>> Do we have any other useful resources except doc.gluster.org? As I see
>>> many gluster options are not described there or there are no explanation
>>> what is doing...
>>>
>>>
>>>
>>> On 2018-03-12 15:58, Anatoliy Dmytriyev wrote:
>>>
>>>> Hello,
>>>>
>>>> We have a very fresh gluster 3.10.10 installation.
>>>> Our volume is created as distributed volume, 9 bricks 96TB in total
>>>> (87TB after 10% of gluster disk space reservation)
>>>>
>>>> For some reasons I can’t “heal” the volume:
>>>> # gluster volume heal gv0
>>>> Launching heal operation to perform index self heal on volume gv0 has
>>>> been unsuccessful on bricks that are down. Please check if all brick
>>>> processes are running.
>>>>
>>>> Which processes should be run on every brick for heal operation?
>>>>
>>>> # gluster volume status
>>>> Status of volume: gv0
>>>> Gluster process TCP Port  RDMA Port
>>>> Online  Pid
>>>> 
>>>> --
>>>> Brick cn01-ib:/gfs/gv0/brick1/brick 0 49152  Y
>>>>  70850
>>>> Brick cn02-ib:/gfs/gv0/brick1/brick 0 49152  Y
>>>>  102951
>>>> Brick cn03-ib:/gfs/gv0/brick1/brick 0 49152  Y
>>>>  57535
>>>> Brick cn04-ib:/gfs/gv0/brick1/brick 0 49152  Y
>>>>  56676
>>>> Brick cn05-ib:/gfs/gv0/brick1/brick 0 49152  Y
>>>>  56880
>>>> Brick cn06-ib:/gfs/gv0/brick1/brick 0 49152  Y
>>>>  56889
>>>> Brick cn07-ib:/gfs/gv0/brick1/brick 0 49152  Y
>>>>  56902
>>>> Brick cn08-ib:/gfs/gv0/brick1/brick 0 49152  Y
>>>>  94920
>>>> Brick cn09-ib:/gfs/gv0/brick1/brick 0 49152  Y
>>>>  56542
>>>>
>>>> Task Status of Volume gv0
>>>> 
>>>> --
>>>> There are no active volume tasks
>>>>
>>>>
>>>> # gluster volume info gv0
>>>> Volume Name: gv0
>>>> Type: Distribute
>>>> Volume ID: 8becaf78-cf2d-4991-93bf-f2446688154f
>>>> Status: Started
>>>> Snapshot Count: 0
>>>> Number of Bricks: 9
>>>> Transport-type: rdma
>>>> Bricks:
>>>> Brick1: cn01-ib:/gfs/gv0/brick1/brick
>>>> Brick2: cn02-ib:/gfs/gv0/brick1/brick
>>>> Brick3: cn03-ib:/gfs/gv0/brick1/brick
>>>> Brick4: cn04-ib:/gfs/gv0/brick1/brick
>>>> Brick5: cn05-ib:/gfs/gv0/brick1/brick
>>>> Brick6: cn06-ib:/gfs/gv0/brick1/brick
>>>> Brick7: cn07-ib:/gfs/gv0/brick1/brick
>>>> Brick8: cn08-ib:/gfs/gv0/brick1/brick
>>>> Brick9: cn09-ib:/gfs/gv0/brick1/brick
>>>> Options Reconfigured:
>>>> client.event-threads: 8
>>>> performance.parallel-readdir: on
>>>> performance.readdir-ahead: on
>>>> cluster.nufa: on
>>>> nfs.disable: on
>>>>
>>>
>>> --
>>> Best regards,
>>> Anatoliy
>>> ___
>>> Gluster-users mailing list
>>> Gluster-users@gluster.org
>>> http://lists.gluster.org/mailman/listinfo/gluster-users
>>>
>>
>>
>
> --
> Laura Bailey
> Senior Technical Writer
> Customer Content Services BNE
>
>
___
Gluster-users mailing list
Gluster-users@gluster.org
http://lists.gluster.org/mailman/listinfo/gluster-users

Re: [Gluster-users] Can't heal a volume: "Please check if all brick processes are running."

2018-03-14 Thread Karthik Subrahmanya
On Wed, Mar 14, 2018 at 3:36 PM, Anatoliy Dmytriyev <to...@tolid.eu.org>
wrote:

> Hi Karthik,
>
>
> Thanks a lot for the explanation.
>
> Does it mean a distributed volume health can be checked only by "gluster
> volume status " command?
>
Yes. I am not aware of any other command which can give the status of a plain
distribute volume in the way the heal info command does for
replicate/disperse volumes.
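
For a plain distribute volume, the closest thing to a health check is the
status output, e.g. for the volume in this thread:

# gluster volume status gv0
# gluster volume status gv0 detail

The detail variant also shows free disk space and inode usage for every brick.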

> And one more question: cluster.min-free-disk is 10% by default. What kind
> of "side effects" can we face if this option will be reduced to, for
> example, 5%? Could you point to any best practice document(s)?
>
Yes you can decrease it to any value. There won't be any side effect.

Regards,
Karthik

>
> Regards,
>
> Anatoliy
>
>
>
>
>
> On 2018-03-13 16:46, Karthik Subrahmanya wrote:
>
> Hi Anatoliy,
>
> The heal command is basically used to heal any mismatching contents
> between replica copies of the files.
> For the command "gluster volume heal " to succeed, you should
> have the self-heal-daemon running,
> which is true only if your volume is of type replicate/disperse.
> In your case you have a plain distribute volume where you do not store the
> replica of any files.
> So the volume heal will return you the error.
>
> Regards,
> Karthik
>
> On Tue, Mar 13, 2018 at 7:53 PM, Anatoliy Dmytriyev <to...@tolid.eu.org>
> wrote:
>
>> Hi,
>>
>>
>> Maybe someone can point me to a documentation or explain this? I can't
>> find it myself.
>> Do we have any other useful resources except doc.gluster.org? As I see
>> many gluster options are not described there or there are no explanation
>> what is doing...
>>
>>
>>
>> On 2018-03-12 15:58, Anatoliy Dmytriyev wrote:
>>
>>> Hello,
>>>
>>> We have a very fresh gluster 3.10.10 installation.
>>> Our volume is created as distributed volume, 9 bricks 96TB in total
>>> (87TB after 10% of gluster disk space reservation)
>>>
>>> For some reasons I can't "heal" the volume:
>>> # gluster volume heal gv0
>>> Launching heal operation to perform index self heal on volume gv0 has
>>> been unsuccessful on bricks that are down. Please check if all brick
>>> processes are running.
>>>
>>> Which processes should be run on every brick for heal operation?
>>>
>>> # gluster volume status
>>> Status of volume: gv0
>>> Gluster process TCP Port  RDMA Port  Online
>>> Pid
>>> 
>>> --
>>> Brick cn01-ib:/gfs/gv0/brick1/brick 0 49152  Y
>>>  70850
>>> Brick cn02-ib:/gfs/gv0/brick1/brick 0 49152  Y
>>>  102951
>>> Brick cn03-ib:/gfs/gv0/brick1/brick 0 49152  Y
>>>  57535
>>> Brick cn04-ib:/gfs/gv0/brick1/brick 0 49152  Y
>>>  56676
>>> Brick cn05-ib:/gfs/gv0/brick1/brick 0 49152  Y
>>>  56880
>>> Brick cn06-ib:/gfs/gv0/brick1/brick 0 49152  Y
>>>  56889
>>> Brick cn07-ib:/gfs/gv0/brick1/brick 0 49152  Y
>>>  56902
>>> Brick cn08-ib:/gfs/gv0/brick1/brick 0 49152  Y
>>>  94920
>>> Brick cn09-ib:/gfs/gv0/brick1/brick 0 49152  Y
>>>  56542
>>>
>>> Task Status of Volume gv0
>>> 
>>> --
>>> There are no active volume tasks
>>>
>>>
>>> # gluster volume info gv0
>>> Volume Name: gv0
>>> Type: Distribute
>>> Volume ID: 8becaf78-cf2d-4991-93bf-f2446688154f
>>> Status: Started
>>> Snapshot Count: 0
>>> Number of Bricks: 9
>>> Transport-type: rdma
>>> Bricks:
>>> Brick1: cn01-ib:/gfs/gv0/brick1/brick
>>> Brick2: cn02-ib:/gfs/gv0/brick1/brick
>>> Brick3: cn03-ib:/gfs/gv0/brick1/brick
>>> Brick4: cn04-ib:/gfs/gv0/brick1/brick
>>> Brick5: cn05-ib:/gfs/gv0/brick1/brick
>>> Brick6: cn06-ib:/gfs/gv0/brick1/brick
>>> Brick7: cn07-ib:/gfs/gv0/brick1/brick
>>> Brick8: cn08-ib:/gfs/gv0/brick1/brick
>>> Brick9: cn09-ib:/gfs/gv0/brick1/brick
>>> Options Reconfigured:
>>> client.event-threads: 8
>>> performance.parallel-readdir: on
>>> performance.readdir-ahead: on
>>> cluster.nufa: on
>>> nfs.disable: on
>>
>>
>> --
>> Best regards,
>> Anatoliy
>> ___
>> Gluster-users mailing list
>> Gluster-users@gluster.org
>> http://lists.gluster.org/mailman/listinfo/gluster-users
>>
>
> --
> Best regards,
> Anatoliy
>
___
Gluster-users mailing list
Gluster-users@gluster.org
http://lists.gluster.org/mailman/listinfo/gluster-users

Re: [Gluster-users] Can't heal a volume: "Please check if all brick processes are running."

2018-03-14 Thread Karthik Subrahmanya
On Wed, Mar 14, 2018 at 5:42 PM, Karthik Subrahmanya <ksubr...@redhat.com>
wrote:

>
>
> On Wed, Mar 14, 2018 at 3:36 PM, Anatoliy Dmytriyev <to...@tolid.eu.org>
> wrote:
>
>> Hi Karthik,
>>
>>
>> Thanks a lot for the explanation.
>>
>> Does it mean a distributed volume health can be checked only by "gluster
>> volume status " command?
>>
> Yes. I am not aware of any other command which can give the status of
> plain distribute volume which is similar to the heal info command for
> replicate/disperse volumes.
>
>> And one more question: cluster.min-free-disk is 10% by default. What kind
>> of "side effects" can we face if this option will be reduced to, for
>> example, 5%? Could you point to any best practice document(s)?
>>
> Yes you can decrease it to any value. There won't be any side effect.
>
Small correction here: min-free-disk should ideally be set to a value larger
than the largest file size likely to be written. Decreasing it beyond a point
raises the likelihood of the brick getting full, which is a very bad state
to be in.
I will update you if I find a document which explains this in detail. Sorry for
the previous statement.
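
For example, to check and adjust it (volume name is a placeholder):

# gluster volume get <volname> cluster.min-free-disk
# gluster volume set <volname> cluster.min-free-disk 10%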

>
> Regards,
> Karthik
>
>>
>> Regards,
>>
>> Anatoliy
>>
>>
>>
>>
>>
>> On 2018-03-13 16:46, Karthik Subrahmanya wrote:
>>
>> Hi Anatoliy,
>>
>> The heal command is basically used to heal any mismatching contents
>> between replica copies of the files.
>> For the command "gluster volume heal " to succeed, you should
>> have the self-heal-daemon running,
>> which is true only if your volume is of type replicate/disperse.
>> In your case you have a plain distribute volume where you do not store
>> the replica of any files.
>> So the volume heal will return you the error.
>>
>> Regards,
>> Karthik
>>
>> On Tue, Mar 13, 2018 at 7:53 PM, Anatoliy Dmytriyev <to...@tolid.eu.org>
>> wrote:
>>
>>> Hi,
>>>
>>>
>>> Maybe someone can point me to a documentation or explain this? I can't
>>> find it myself.
>>> Do we have any other useful resources except doc.gluster.org? As I see
>>> many gluster options are not described there or there are no explanation
>>> what is doing...
>>>
>>>
>>>
>>> On 2018-03-12 15:58, Anatoliy Dmytriyev wrote:
>>>
>>>> Hello,
>>>>
>>>> We have a very fresh gluster 3.10.10 installation.
>>>> Our volume is created as distributed volume, 9 bricks 96TB in total
>>>> (87TB after 10% of gluster disk space reservation)
>>>>
>>>> For some reasons I can't "heal" the volume:
>>>> # gluster volume heal gv0
>>>> Launching heal operation to perform index self heal on volume gv0 has
>>>> been unsuccessful on bricks that are down. Please check if all brick
>>>> processes are running.
>>>>
>>>> Which processes should be run on every brick for heal operation?
>>>>
>>>> # gluster volume status
>>>> Status of volume: gv0
>>>> Gluster process TCP Port  RDMA Port
>>>> Online  Pid
>>>> 
>>>> --
>>>> Brick cn01-ib:/gfs/gv0/brick1/brick 0 49152  Y
>>>>  70850
>>>> Brick cn02-ib:/gfs/gv0/brick1/brick 0 49152  Y
>>>>  102951
>>>> Brick cn03-ib:/gfs/gv0/brick1/brick 0 49152  Y
>>>>  57535
>>>> Brick cn04-ib:/gfs/gv0/brick1/brick 0 49152  Y
>>>>  56676
>>>> Brick cn05-ib:/gfs/gv0/brick1/brick 0 49152  Y
>>>>  56880
>>>> Brick cn06-ib:/gfs/gv0/brick1/brick 0 49152  Y
>>>>  56889
>>>> Brick cn07-ib:/gfs/gv0/brick1/brick 0 49152  Y
>>>>  56902
>>>> Brick cn08-ib:/gfs/gv0/brick1/brick 0 49152  Y
>>>>  94920
>>>> Brick cn09-ib:/gfs/gv0/brick1/brick 0 49152  Y
>>>>  56542
>>>>
>>>> Task Status of Volume gv0
>>>> 
>>>> --
>>>> There are no active volume tasks
>>>>
>>>>
>>>> # gluster volume info gv0
>>>

Re: [Gluster-users] Turn off replication

2018-04-06 Thread Karthik Subrahmanya
Hi Jose,

By switching to a pure distribute volume you will lose availability if
something goes bad.

I am guessing you have an n x 2 volume.
If you want to preserve one copy of the data in each distribute subvolume, you
can do that by decreasing the replica count in the remove-brick operation.
If you have any inconsistencies, heal them first using the "gluster volume
heal <volname>" command and wait till the
"gluster volume heal <volname> info" output shows zero entries before removing
the bricks, so that you will have the correct data.
If you do not want to preserve the data then you can directly remove the
bricks.
Even after removing the bricks the data will be present in the backend of
the removed bricks. You have to manually erase them (both data and
.glusterfs folder).
See [1] for more details on remove-brick.

[1].
https://docs.gluster.org/en/latest/Administrator%20Guide/Managing%20Volumes/#shrinking-volumes
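
As a rough sketch of the sequence described above, with placeholder volume and
brick names (your exact bricks are worked out later in this thread):

# gluster volume heal <volname> info
# gluster volume remove-brick <volname> replica 1 <host>:<brick-from-subvol-1> <host>:<brick-from-subvol-2> force

and then, only on the bricks that were removed, wipe the backend, e.g.:

# rm -rf <removed-brick-path>/* <removed-brick-path>/.glusterfs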

HTH,
Karthik


On Thu, Apr 5, 2018 at 8:17 PM, Jose Sanchez  wrote:

>
> We have a Gluster setup with 2 nodes (distributed replication) and we
> would like to switch it to the distributed mode. I know the data is
> duplicated between those nodes, what is the proper way of switching it to a
> distributed, we would like to double or gain the storage space on our
> gluster storage node. what happens with the data, do i need to erase one of
> the nodes?
>
> Jose
>
>
> -
> Jose Sanchez
> Systems/Network Analyst
> Center of Advanced Research Computing
> 1601 Central Ave.
> MSC 01 1190
> Albuquerque, NM 87131-0001
> carc.unm.edu
> 575.636.4232
>
>
> ___
> Gluster-users mailing list
> Gluster-users@gluster.org
> http://lists.gluster.org/mailman/listinfo/gluster-users
>
___
Gluster-users mailing list
Gluster-users@gluster.org
http://lists.gluster.org/mailman/listinfo/gluster-users

Re: [Gluster-users] Turn off replication

2018-04-07 Thread Karthik Subrahmanya
Hi Jose,

Thanks for providing the volume info. You have 2 subvolumes. Data is
replicated within the bricks of each subvolume.
The first one consists of Node A's brick1 & Node B's brick1 and the second
one consists of Node A's brick2 and Node B's brick2.
You don't have the same data on all 4 bricks. Data is distributed
between these two subvolumes.
To remove the replica you can use the command
gluster volume remove-brick scratch replica 1 gluster02ib:/gdata/brick1/scratch gluster02ib:/gdata/brick2/scratch force
so that you will have one copy of the data from each of the two distribute
subvolumes.
Before doing this make sure the "gluster volume heal scratch info" output shows
zero entries, so that the copies you retain have the correct data.
After the remove-brick, erase the data from the backend of the removed bricks.
Then you can expand the volume by following the steps at [1].

[1]
https://docs.gluster.org/en/latest/Administrator%20Guide/Managing%20Volumes/#expanding-volumes

Regards,
Karthik

On Fri, Apr 6, 2018 at 11:39 PM, Jose Sanchez <joses...@carc.unm.edu> wrote:

> Hi Karthik
>
> this is our configuration,  is 2x2 =4 , they are all replicated , each
> brick has 14tb. we have 2 nodes A and B, each one with brick 1 and 2.
>
> Node A  (replicated A1 (14tb) and B1 (14tb) ) same with node B (Replicated
> A2 (14tb) and B2 (14tb)).
>
> Do you think we need to degrade the node first before removing it. i
> believe the same copy of data is on all 4 bricks, we would like to keep one
> of them, and add the other bricks as extra space
>
> Thanks for your help on this
>
> Jose
>
>
>
>
>
> [root@gluster01 ~]# gluster volume info scratch
>
>
> Volume Name: scratch
> Type: Distributed-Replicate
> Volume ID: 23f1e4b1-b8e0-46c3-874a-58b4728ea106
> Status: Started
> Snapshot Count: 0
> Number of Bricks: 2 x 2 = 4
> Transport-type: tcp,rdma
> Bricks:
> Brick1: gluster01ib:/gdata/brick1/scratch
> Brick2: gluster02ib:/gdata/brick1/scratch
> Brick3: gluster01ib:/gdata/brick2/scratch
> Brick4: gluster02ib:/gdata/brick2/scratch
> Options Reconfigured:
> performance.readdir-ahead: on
> nfs.disable: on
>
> [root@gluster01 ~]# gluster volume status all
> Status of volume: scratch
> Gluster process TCP Port  RDMA Port  Online
> Pid
> 
> --
> Brick gluster01ib:/gdata/brick1/scratch 49152 49153  Y
> 1743
> Brick gluster02ib:/gdata/brick1/scratch 49156 49157  Y
> 1732
> Brick gluster01ib:/gdata/brick2/scratch 49154 49155  Y
> 1738
> Brick gluster02ib:/gdata/brick2/scratch 49158 49159  Y
> 1733
> Self-heal Daemon on localhost   N/A   N/AY
> 1728
> Self-heal Daemon on gluster02ib N/A   N/AY
> 1726
>
>
> Task Status of Volume scratch
> 
> --
> There are no active volume tasks
>
> -
> Jose Sanchez
> Systems/Network Analyst 1
> Center of Advanced Research Computing
> 1601 Central Ave
> <https://maps.google.com/?q=1601+Central+Ave=gmail=g>.
> MSC 01 1190
> Albuquerque, NM 87131-0001
> carc.unm.edu
> 575.636.4232
>
> On Apr 6, 2018, at 3:49 AM, Karthik Subrahmanya <ksubr...@redhat.com>
> wrote:
>
> Hi Jose,
>
> By switching into pure distribute volume you will lose availability if
> something goes bad.
>
> I am guessing you have a nX2 volume.
> If you want to preserve one copy of the data in all the distributes, you
> can do that by decreasing the replica count in the remove-brick operation.
> If you have any inconsistency, heal them first using the "gluster volume
> heal " command and wait till the
> "gluster volume heal  info" output becomes zero, before removing
> the bricks, so that you will have the correct data.
> If you do not want to preserve the data then you can directly remove the
> bricks.
> Even after removing the bricks the data will be present in the backend of
> the removed bricks. You have to manually erase them (both data and
> .glusterfs folder).
> See [1] for more details on remove-brick.
>
> [1]. https://docs.gluster.org/en/latest/Administrator%
> 20Guide/Managing%20Volumes/#shrinking-volumes
>
> HTH,
> Karthik
>
>
> On Thu, Apr 5, 2018 at 8:17 PM, Jose Sanchez <joses...@carc.unm.edu>
> wrote:
>
>>
>> We have a Gluster setup with 2 nodes (distributed replication) and we
>> would like to switch it to the distributed mode. I know the data is
>> duplicated between those nodes, what is the proper way of switching it to a
>> distributed, we would like to 

Re: [Gluster-users] Turn off replication

2018-04-12 Thread Karthik Subrahmanya
On Wed, Apr 11, 2018 at 7:38 PM, Jose Sanchez <joses...@carc.unm.edu> wrote:

> Hi Karthik
>
> Looking at the information you have provided me, I would like to make sure
> that I’m running the right commands.
>
> 1.   gluster volume heal scratch info
>
If the count is non zero, trigger the heal and wait for heal info count to
become zero.

> 2. gluster volume remove-brick scratch *replica 1 *
> gluster02ib:/gdata/brick1/scratch gluster02ib:/gdata/brick2/scratch force
>
> 3. gluster volume add-brick *"#"* scratch gluster02ib:/gdata/brick1/scratch
> gluster02ib:/gdata/brick2/scratch
>
>
> Based on the configuration I have, Brick 1 from Node A and B are tide
> together and Brick 2 from Node A and B are also tide together. Looking at
> your remove command (step #2), it seems that you want me to remove Brick 1
> and 2 from Node B (gluster02ib). is that correct? I thought the data was
> distributed in bricks 1 between nodes A and B) and duplicated on Bricks 2
> (node A and B).
>
Data is duplicated between the bricks 1 of nodes A & B and between the bricks 2
of nodes A & B, and data is distributed across these two pairs.
You need not remove the bricks 1 & 2 from node B itself. The idea
here is to keep one copy from each of the replica pairs.

>
> Also when I add the bricks back to gluster, do I need to specify if it is
> distributed or replicated?? and Do i need a configuration #?? for example
> on your command (Step #2) you have “replica 1” when remove bricks, do I
> need to do the same when adding the nodes back ?
>
No. You just need to erase the data on those bricks and add those bricks
back to the volume. The previous remove-brick command will make the volume
plain distribute. Then simply adding the bricks without specifying any "#"
will expand the volume as a plain distribute volume.

>
> Im planning on moving with this changes in few days. At this point each
> brick has 14tb and adding bricks 1 from node A and B, i have a total of
> 28tb, After doing all the process, (removing and adding bricks) I should be
> able to see a total of 56Tb right ?
>
Yes, after all these steps you will have 56TB in total.
After adding the bricks, run a volume rebalance so that the data which was
present previously gets moved to the correct bricks (see the example below).
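
For example, after the add-brick the rebalance can be started and monitored
like this (using the volume name from this thread):

# gluster volume rebalance scratch start
# gluster volume rebalance scratch status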

HTH,
Karthik

>
> Thanks
>
> Jose
>
>
>
>
> -
> Jose Sanchez
> Systems/Network Analyst 1
> Center of Advanced Research Computing
> 1601 Central Ave
> <https://maps.google.com/?q=1601+Central+Ave=gmail=g>.
> MSC 01 1190
> Albuquerque, NM 87131-0001
> carc.unm.edu
> 575.636.4232
>
> On Apr 7, 2018, at 8:29 AM, Karthik Subrahmanya <ksubr...@redhat.com>
> wrote:
>
> Hi Jose,
>
> Thanks for providing the volume info. You have 2 subvolumes. Data is
> replicated within the bricks of that subvolumes.
> First one consisting of Node A's brick1 & Node B's brick1 and the second
> one consisting of Node A's brick2 and Node B's brick2.
> You don't have the same data on all the 4 bricks. Data are distributed
> between these two subvolumes.
> To remove the replica you can use the command
> gluster volume remove-brick scratch replica 1 gluster02ib:/gdata/brick1/
> scratch gluster02ib:/gdata/brick2/scratch force
> So you will have one copy of data present from both the distributes.
> Before doing this make sure "gluster volume heal scratch info" value is
> zero. So copies you retain will have the correct data.
> After the remove-brick erase the data from the backend.
> Then you can expand the volume by following the steps at [1].
>
> [1] https://docs.gluster.org/en/latest/Administrator%
> 20Guide/Managing%20Volumes/#expanding-volumes
>
> Regards,
> Karthik
>
> On Fri, Apr 6, 2018 at 11:39 PM, Jose Sanchez <joses...@carc.unm.edu>
> wrote:
>
>> Hi Karthik
>>
>> this is our configuration,  is 2x2 =4 , they are all replicated , each
>> brick has 14tb. we have 2 nodes A and B, each one with brick 1 and 2.
>>
>> Node A  (replicated A1 (14tb) and B1 (14tb) ) same with node B
>> (Replicated A2 (14tb) and B2 (14tb)).
>>
>> Do you think we need to degrade the node first before removing it. i
>> believe the same copy of data is on all 4 bricks, we would like to keep one
>> of them, and add the other bricks as extra space
>>
>> Thanks for your help on this
>>
>> Jose
>>
>>
>>
>>
>>
>> [root@gluster01 ~]# gluster volume info scratch
>>
>> Volume Name: scratch
>> Type: Distributed-Replicate
>> Volume ID: 23f1e4b1-b8e0-46c3-874a-58b4728ea106
>> Status: Started
>> Snapshot Count: 0
>> Number of Bri

Re: [Gluster-users] Quorum in distributed-replicate volume

2018-02-27 Thread Karthik Subrahmanya
On Tue, Feb 27, 2018 at 1:40 PM, Dave Sherohman <d...@sherohman.org> wrote:

> On Tue, Feb 27, 2018 at 12:00:29PM +0530, Karthik Subrahmanya wrote:
> > I will try to explain how you can end up in split-brain even with cluster
> > wide quorum:
>
> Yep, the explanation made sense.  I hadn't considered the possibility of
> alternating outages.  Thanks!
>
> > > > It would be great if you can consider configuring an arbiter or
> > > > replica 3 volume.
> > >
> > > I can.  My bricks are 2x850G and 4x11T, so I can repurpose the small
> > > bricks as arbiters with minimal effect on capacity.  What would be the
> > > sequence of commands needed to:
> > >
> > > 1) Move all data off of bricks 1 & 2
> > > 2) Remove that replica from the cluster
> > > 3) Re-add those two bricks as arbiters
> > >
> > > (And did I miss any additional steps?)
> > >
> > > Unfortunately, I've been running a few months already with the current
> > > configuration and there are several virtual machines running off the
> > > existing volume, so I'll need to reconfigure it online if possible.
> > >
> > Without knowing the volume configuration it is difficult to suggest the
> > configuration change,
> > and since it is a live system you may end up in data unavailability or
> data
> > loss.
> > Can you give the output of "gluster volume info "
> > and which brick is of what size.
>
> Volume Name: palantir
> Type: Distributed-Replicate
> Volume ID: 48379a50-3210-41b4-9a77-ae143c8bcac0
> Status: Started
> Snapshot Count: 0
> Number of Bricks: 3 x 2 = 6
> Transport-type: tcp
> Bricks:
> Brick1: saruman:/var/local/brick0/data
> Brick2: gandalf:/var/local/brick0/data
> Brick3: azathoth:/var/local/brick0/data
> Brick4: yog-sothoth:/var/local/brick0/data
> Brick5: cthulhu:/var/local/brick0/data
> Brick6: mordiggian:/var/local/brick0/data
> Options Reconfigured:
> features.scrub: Inactive
> features.bitrot: off
> transport.address-family: inet
> performance.readdir-ahead: on
> nfs.disable: on
> network.ping-timeout: 1013
> performance.quick-read: off
> performance.read-ahead: off
> performance.io-cache: off
> performance.stat-prefetch: off
> cluster.eager-lock: enable
> network.remote-dio: enable
> cluster.quorum-type: auto
> cluster.server-quorum-type: server
> features.shard: on
> cluster.data-self-heal-algorithm: full
> storage.owner-uid: 64055
> storage.owner-gid: 64055
>
>
> For brick sizes, saruman/gandalf have
>
> $ df -h /var/local/brick0
> Filesystem   Size  Used Avail Use% Mounted on
> /dev/mapper/gandalf-gluster  885G   55G  786G   7% /var/local/brick0
>
> and the other four have
>
> $ df -h /var/local/brick0
> Filesystem  Size  Used Avail Use% Mounted on
> /dev/sdb111T  254G   11T   3% /var/local/brick0
>

If you want to use the first two bricks as arbiters, then you need to be
aware of the following things:
- Your distribution count will be decreased to 2.
- Your data on the first subvol, i.e. replica subvol-1, will be
unavailable till it is copied to the other subvols
after removing the bricks from the cluster.

Since arbiter bricks need not be the same size as the data bricks, if you
can configure three more arbiter bricks
based on the guidelines in the doc [1], you can do it live and the
distribution count will also remain unchanged.

One more thing from the volume info: only the options which have been
reconfigured appear in the volume info output.
quorum-type is in that list, which means it was manually reconfigured.

[1]
http://docs.gluster.org/en/latest/Administrator%20Guide/arbiter-volumes-and-quorum/#arbiter-bricks-sizing

Regards,
Karthik

>
>
> --
> Dave Sherohman
>
___
Gluster-users mailing list
Gluster-users@gluster.org
http://lists.gluster.org/mailman/listinfo/gluster-users

Re: [Gluster-users] Scheduled AutoCommit Function for WORM Feature

2018-02-27 Thread Karthik Subrahmanya
Hi David,

Yes, it is a good-to-have feature, but AFAIK it is currently not on the
priority/focus list.
Anyone from the community who is interested in implementing this is most
welcome.
Otherwise you will need to wait some more time until it comes into focus.

Thanks & Regards,
Karthik

On Tue, Feb 27, 2018 at 3:52 PM, David Spisla  wrote:

> Hello Gluster Community,
>
> while reading that article:
> https://github.com/gluster/glusterfs-specs/blob/master/under_review/worm-compliance.md
>
> there seems to be an interesting feature planned for the WORM Xlator:
>
> *Scheduled Auto-commit*: Scan Triggered Using timeouts for untouched
> files. The next scheduled namespace scan will cause the transition. CTR DB
> via libgfdb can be used to find files that have not changed. This can be
> verified with stat of the file.
>
> Is this feature still in focus? It is very usefull I think. A client does
> not have to trigger a FOP to make a file WORM-Retained after the expiration
> of the autcommit-period.
>
> Regards
> David Spisla
>
> ___
> Gluster-users mailing list
> Gluster-users@gluster.org
> http://lists.gluster.org/mailman/listinfo/gluster-users
>
___
Gluster-users mailing list
Gluster-users@gluster.org
http://lists.gluster.org/mailman/listinfo/gluster-users

Re: [Gluster-users] Quorum in distributed-replicate volume

2018-02-27 Thread Karthik Subrahmanya
On Tue, Feb 27, 2018 at 5:35 PM, Dave Sherohman <d...@sherohman.org> wrote:

> On Tue, Feb 27, 2018 at 04:59:36PM +0530, Karthik Subrahmanya wrote:
> > > > Since arbiter bricks need not be of same size as the data bricks, if
> you
> > > > can configure three more arbiter bricks
> > > > based on the guidelines in the doc [1], you can do it live and you
> will
> > > > have the distribution count also unchanged.
> > >
> > > I can probably find one or more machines with a few hundred GB free
> > > which could be allocated for arbiter bricks if it would be sigificantly
> > > simpler and safer than repurposing the existing bricks (and I'm getting
> > > the impression that it probably would be).
> >
> > Yes it is the simpler and safer way of doing that.
> >
> > >   Does it particularly matter
> > > whether the arbiters are all on the same node or on three separate
> > > nodes?
> > >
> >  No it doesn't matter as long as the bricks of same replica subvol are
> not
> > on the same nodes.
>
> OK, great.  So basically just install the gluster server on the new
> node(s), do a peer probe to add them to the cluster, and then
>
> gluster volume create palantir replica 3 arbiter 1 [saruman brick]
> [gandalf brick] [arbiter 1] [azathoth brick] [yog-sothoth brick] [arbiter
> 2] [cthulhu brick] [mordiggian brick] [arbiter 3]
>
gluster volume add-brick <volname> replica 3 arbiter 1 <arbiter-brick-1> <arbiter-brick-2> <arbiter-brick-3>
is the command. It will convert the existing volume to an arbiter volume and
add the specified bricks as arbiter bricks to the existing subvols.
Once they are successfully added, self-heal should start automatically and
you can check the status of the heal using the command
gluster volume heal <volname> info
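
For your volume that could look something like the following; the arbiter host
names and brick paths here are only placeholders, and the three arbiter bricks
get attached to the three replica subvols in the order they are listed:

# gluster volume add-brick palantir replica 3 arbiter 1 \
    arb1:/var/local/arbiter0/data arb2:/var/local/arbiter0/data arb3:/var/local/arbiter0/data
# gluster volume heal palantir info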

Regards,
Karthik
___
Gluster-users mailing list
Gluster-users@gluster.org
http://lists.gluster.org/mailman/listinfo/gluster-users

Re: [Gluster-users] Quorum in distributed-replicate volume

2018-02-27 Thread Karthik Subrahmanya
On Tue, Feb 27, 2018 at 4:18 PM, Dave Sherohman <d...@sherohman.org> wrote:

> On Tue, Feb 27, 2018 at 03:20:25PM +0530, Karthik Subrahmanya wrote:
> > If you want to use the first two bricks as arbiter, then you need to be
> > aware of the following things:
> > - Your distribution count will be decreased to 2.
>
> What's the significance of this?  I'm trying to find documentation on
> distribution counts in gluster, but my google-fu is failing me.
>
More distribution, better load balancing.

>
> > - Your data on the first subvol i.e., replica subvol - 1 will be
> > unavailable till it is copied to the other subvols
> > after removing the bricks from the cluster.
>
> Hmm, ok.  I was sure I had seen a reference at some point to a command
> for migrating data off bricks to prepare them for removal.
>
> Is there an easy way to get a list of all files which are present on a
> given brick, then, so that I can see which data would be unavailable
> during this transfer?
>
The easiest way is by doing "ls" on the back end brick.
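
For example, on saruman or gandalf something like the following lists
everything on the brick while skipping gluster's internal .glusterfs directory
(brick path taken from your volume info):

# find /var/local/brick0/data -path '*/.glusterfs' -prune -o -print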

>
> > Since arbiter bricks need not be of same size as the data bricks, if you
> > can configure three more arbiter bricks
> > based on the guidelines in the doc [1], you can do it live and you will
> > have the distribution count also unchanged.
>
> I can probably find one or more machines with a few hundred GB free
> which could be allocated for arbiter bricks if it would be sigificantly
> simpler and safer than repurposing the existing bricks (and I'm getting
> the impression that it probably would be).

Yes it is the simpler and safer way of doing that.

>   Does it particularly matter
> whether the arbiters are all on the same node or on three separate
> nodes?
>
No, it doesn't matter, as long as the bricks of the same replica subvol are not
on the same node.

Regards,
Karthik

>
> --
> Dave Sherohman
>
___
Gluster-users mailing list
Gluster-users@gluster.org
http://lists.gluster.org/mailman/listinfo/gluster-users

Re: [Gluster-users] Quorum in distributed-replicate volume

2018-02-26 Thread Karthik Subrahmanya
Hi Dave,

On Mon, Feb 26, 2018 at 4:45 PM, Dave Sherohman  wrote:

> I've configured 6 bricks as distributed-replicated with replica 2,
> expecting that all active bricks would be usable so long as a quorum of
> at least 4 live bricks is maintained.
>
The client quorum is configured per replica subvolume and not for the
entire volume.
Since you have a distributed-replicated volume with replica 2, the data
will have 2 copies,
and taking quorum on the total number of bricks, as in your scenario,
would lead to split-brains.

>
> However, I have just found
>
> http://docs.gluster.org/en/latest/Administrator%20Guide/Split%20brain%20and%20ways%20to%20deal%20with%20it/
>
> Which states that "In a replica 2 volume... If we set the client-quorum
> option to auto, then the first brick must always be up, irrespective of
> the status of the second brick. If only the second brick is up, the
> subvolume becomes read-only."
>
By default client-quorum is "none" in a replica 2 volume.

>
> Does this apply only to a two-brick replica 2 volume or does it apply to
> all replica 2 volumes, even if they have, say, 6 bricks total?
>
It applies to all replica 2 volumes, whether they have just 2 bricks or more.
The total brick count in the volume doesn't matter for the quorum; what matters
is the number of bricks which are up in the particular replica subvol.

>
> If it does apply to distributed-replicated volumes with >2 bricks,
> what's the reasoning for it?  I would expect that, if the cluster splits
> into brick 1 by itself and bricks 2-3-4-5-6 still together, then brick 1
> will recognize that it doesn't have volume-wide quorum and reject
> writes, thus allowing brick 2 to remain authoritative and able to accept
> writes.
>
If I understood your configuration correctly it should look something like
this:
(Please correct me if I am wrong)
replica-1:  bricks 1 & 2
replica-2: bricks 3 & 4
replica-3: bricks 5 & 6
Since quorum is per replica, if it is set to auto then it needs the first
brick of the particular replica subvol to be up to perform the fop.

In replica 2 volumes you can end up in split-brains. It would be great if
you could consider configuring an arbiter or replica 3 volume.
You can find more details about their advantages over replica 2 volumes in
the same document.

Regards,
Karthik

>
> --
> Dave Sherohman
> ___
> Gluster-users mailing list
> Gluster-users@gluster.org
> http://lists.gluster.org/mailman/listinfo/gluster-users
>
___
Gluster-users mailing list
Gluster-users@gluster.org
http://lists.gluster.org/mailman/listinfo/gluster-users

Re: [Gluster-users] Transport endpoint is not connected : issue

2018-09-02 Thread Karthik Subrahmanya
Hey,

We need some more information to debug this.
I think you missed sending the output of 'gluster volume info <volname>'.
Can you also provide the brick, shd and glfsheal logs as well?
How many peers are present in the setup? You also mentioned that "one of
the file servers have two processes for each of the volumes instead of one
per volume"; which process are you talking about here?

Regards,
Karthik

On Sat, Sep 1, 2018 at 12:10 AM Johnson, Tim  wrote:

> Thanks for the reply.
>
>
>
>I have attached the gluster.log file from the host that it is happening
> to at this time.
>
> It does change which host it does this on.
>
>
>
> Thanks.
>
>
>
> *From: *Atin Mukherjee 
> *Date: *Friday, August 31, 2018 at 1:03 PM
> *To: *"Johnson, Tim" 
> *Cc: *Karthik Subrahmanya , Ravishankar N <
> ravishan...@redhat.com>, "gluster-users@gluster.org" <
> gluster-users@gluster.org>
> *Subject: *Re: [Gluster-users] Transport endpoint is not connected : issue
>
>
>
> Can you please pass all the gluster log files from the server where the
> transport end point not connected error is reported? As restarting glusterd
> didn’t solve this issue, I believe this isn’t a stale port problem but
> something else. Also please provide the output of ‘gluster v info ’
>
>
>
> (@cc Ravi, Karthik)
>
>
>
> On Fri, 31 Aug 2018 at 23:24, Johnson, Tim  wrote:
>
> Hello all,
>
>
>
>   We have a gluster replicate (with arbiter)  volumes that we are
> getting “Transport endpoint is not connected” with on a rotating basis
>  from each of the two file servers, and a third host that has the arbiter
> bricks on.
>
> This is happening when trying to run a heal on all the volumes on the
> gluster hosts   When I get the status of all the volumes all looks good.
>
> This behavior seems to be a foreshadowing of the gluster volumes
> becoming unresponsive to our vm cluster.  As well as one of the file
> servers have two processes for each of the volumes instead of one per
> volume. Eventually the affected file server
>
> will drop off the listed peers. Restarting glusterd/glusterfsd on the
> affected file server does not take care of the issue, we have to bring down
> both file
>
> Servers due to the volumes not being seen by the vm cluster after the
> errors start occurring. I had seen that there were bug reports about the
> “Transport endpoint is not connected” on earlier versions of Gluster
> however had thought that
>
> It had been addressed.
>
>  Dmesg did have some entries for “a possible syn flood on port *”
> which we changed the  sysctl to “net.ipv4.tcp_max_syn_backlog = 2048” which
> seemed to help the syn flood messages but not the underlying volume issues.
>
> I have put the versions of all the Gluster packages installed below as
> well as the   “Heal” and “Status” commands showing the volumes are
>
>
>
>This has just started happening but cannot definitively say if this
> started occurring after an update or not.
>
>
>
>
>
> Thanks for any assistance.
>
>
>
>
>
> Running Heal  :
>
>
>
> gluster volume heal ovirt_engine info
>
> Brick 1.rrc.local:/bricks/brick0/ovirt_engine
>
> Status: Connected
>
> Number of entries: 0
>
>
>
> Brick 3.rrc.local:/bricks/brick0/ovirt_engine
>
> Status: Transport endpoint is not connected
>
> Number of entries: -
>
>
>
> Brick *3.rrc.local:/bricks/arb-brick/ovirt_engine
>
> Status: Transport endpoint is not connected
>
> Number of entries: -
>
>
>
>
>
> Running status :
>
>
>
> gluster volume status ovirt_engine
>
> Status of volume: ovirt_engine
>
> Gluster process TCP Port  RDMA Port  Online
> Pid
>
>
> --
>
> Brick*.rrc.local:/bricks/brick0/ov
>
> irt_engine  49152 0  Y
> 5521
>
> Brick fs2-tier3.rrc.local:/bricks/brick0/ov
>
> irt_engine  49152 0  Y
> 6245
>
> Brick .rrc.local:/bricks/arb-b
>
> rick/ovirt_engine   49152 0  Y
> 3526
>
> Self-heal Daemon on localhost   N/A   N/AY
> 5509
>
> Self-heal Daemon on ***.rrc.local N/A   N/AY   6218
>
> Self-heal Daemon on ***.rrc.local   N/A   N/AY   3501
>
> Self-heal Daemon on .rrc.local N/A   N/AY   3657
>
> Self-heal Daemon on *.rrc.local   N/A   N/AY   3753
>
> Self-

Re: [Gluster-users] Transport endpoint is not connected : issue

2018-09-03 Thread Karthik Subrahmanya
On Mon, Sep 3, 2018 at 11:17 AM Karthik Subrahmanya 
wrote:

> Hey,
>
> We need some more information to debug this.
> I think you missed to send the output of 'gluster volume info '.
> Can you also provide the bricks, shd and glfsheal logs as well?
> In the setup how many peers are present? You also mentioned that "one of
> the file servers have two processes for each of the volumes instead of one
> per volume", which process are you talking about here?
>
Also provide the "ps aux | grep gluster" output.

>
> Regards,
> Karthik
>
> On Sat, Sep 1, 2018 at 12:10 AM Johnson, Tim  wrote:
>
>> Thanks for the reply.
>>
>>
>>
>>I have attached the gluster.log file from the host that it is
>> happening to at this time.
>>
>> It does change which host it does this on.
>>
>>
>>
>> Thanks.
>>
>>
>>
>> *From: *Atin Mukherjee 
>> *Date: *Friday, August 31, 2018 at 1:03 PM
>> *To: *"Johnson, Tim" 
>> *Cc: *Karthik Subrahmanya , Ravishankar N <
>> ravishan...@redhat.com>, "gluster-users@gluster.org" <
>> gluster-users@gluster.org>
>> *Subject: *Re: [Gluster-users] Transport endpoint is not connected :
>> issue
>>
>>
>>
>> Can you please pass all the gluster log files from the server where the
>> transport end point not connected error is reported? As restarting glusterd
>> didn’t solve this issue, I believe this isn’t a stale port problem but
>> something else. Also please provide the output of ‘gluster v info ’
>>
>>
>>
>> (@cc Ravi, Karthik)
>>
>>
>>
>> On Fri, 31 Aug 2018 at 23:24, Johnson, Tim  wrote:
>>
>> Hello all,
>>
>>
>>
>>   We have a gluster replicate (with arbiter)  volumes that we are
>> getting “Transport endpoint is not connected” with on a rotating basis
>>  from each of the two file servers, and a third host that has the arbiter
>> bricks on.
>>
>> This is happening when trying to run a heal on all the volumes on the
>> gluster hosts   When I get the status of all the volumes all looks good.
>>
>>This behavior seems to be a forshadowing of the gluster volumes
>> becoming unresponsive to our vm cluster.  As well as one of the file
>> servers have two processes for each of the volumes instead of one per
>> volume. Eventually the affected file server
>>
>> will drop off the listed peers. Restarting glusterd/glusterfsd on the
>> affected file server does not take care of the issue, we have to bring down
>> both file
>>
>> Servers due to the volumes not being seen by the vm cluster after the
>> errors start occurring. I had seen that there were bug reports about the
>> “Transport endpoint is not connected” on earlier versions of Gluster
>> however had thought that
>>
>> It had been addressed.
>>
>>  Dmesg did have some entries for “a possible syn flood on port *”
>> which we changed the  sysctl to “net.ipv4.tcp_max_syn_backlog = 2048” which
>> seemed to help the syn flood messages but not the underlying volume issues.
>>
>> I have put the versions of all the Gluster packages installed below
>> as well as the   “Heal” and “Status” commands showing the volumes are
>>
>>
>>
>>This has just started happening but cannot definitively say if
>> this started occurring after an update or not.
>>
>>
>>
>>
>>
>> Thanks for any assistance.
>>
>>
>>
>>
>>
>> Running Heal  :
>>
>>
>>
>> gluster volume heal ovirt_engine info
>>
>> Brick 1.rrc.local:/bricks/brick0/ovirt_engine
>>
>> Status: Connected
>>
>> Number of entries: 0
>>
>>
>>
>> Brick 3.rrc.local:/bricks/brick0/ovirt_engine
>>
>> Status: Transport endpoint is not connected
>>
>> Number of entries: -
>>
>>
>>
>> Brick *3.rrc.local:/bricks/arb-brick/ovirt_engine
>>
>> Status: Transport endpoint is not connected
>>
>> Number of entries: -
>>
>>
>>
>>
>>
>> Running status :
>>
>>
>>
>> gluster volume status ovirt_engine
>>
>> Status of volume: ovirt_engine
>>
>> Gluster process TCP Port  RDMA Port  Online
>> Pid
>>
>>
>> --
>>
>> Brick*.rrc.local:/bricks/brick0/ov
>>
>> irt_engine

Re: [Gluster-users] sometimes entry remains in "gluster v heal vol-name info" until visit it from mnt

2018-09-28 Thread Karthik Subrahmanya
Hey,

Please provide the glustershd logs from all the nodes, and the client log from
the node where you did the lookup on the file to resolve this issue.
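
On default installs these are typically found under /var/log/glusterfs/, for
example:

/var/log/glusterfs/glustershd.log              (self-heal daemon log, on each server node)
/var/log/glusterfs/bricks/<brick-path>.log     (brick logs, on each server node)
/var/log/glusterfs/<mount-point>.log           (fuse client log, on the client node)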

Regards,
Karthik

On Fri, Sep 28, 2018 at 5:27 PM Ravishankar N 
wrote:

> + gluster-users.
>
> Adding Karthik to see if he has some cycles to look into this.
>
> -Ravi
>
> On 09/28/2018 12:07 PM, Zhou, Cynthia (NSB - CN/Hangzhou) wrote:
>
> Hi, glusterfs expert
>
> When I test with glusterfs version 3.12.3 I quite often find that
> an entry remains in the gluster volume heal info
> output for a long time; *it does not disappear until you visit it from the
> mount point, is this normal*?
>
>
>
>
>
> [root@sn-0:/root]
>
> # gluster v heal services info
>
> Brick sn-0.local:/mnt/bricks/services/brick
>
> Status: Connected
>
> Number of entries: 0
>
>
>
> Brick sn-1.local:/mnt/bricks/services/brick
>
> Status: Connected
>
> Number of entries: 0
>
>
>
> Brick sn-2.local:/mnt/bricks/services/brick
>
> /fstest_88402c989256d6e39e50208c90c1e85d  //this entry remains in
> the output until you touch /mnt/services/
> fstest_88402c989256d6e39e50208c90c1e85d
>
> Status: Connected
>
> Number of entries: 1
>
>
>
> [root@sn-0:/root]
>
> # ssh sn-2.local
>
> Warning: Permanently added 'sn-2.local' (RSA) to the list of known hosts.
>
>
>
> USAGE OF THE ROOT ACCOUNT AND THE FULL BASH IS RECOMMENDED ONLY FOR
> LIMITED USE. PLEASE USE A NON-ROOT ACCOUNT AND THE SCLI SHELL (fsclish)
> AND/OR LIMITED BASH SHELL.
>
>
>
> Read /opt/nokia/share/security/readme_root.txt for more details.
>
>
>
> [root@sn-2:/root]
>
> # cd /mnt/bricks/services/brick/.glusterfs/indices/xattrop/
>
> [root@sn-2:/mnt/bricks/services/brick/.glusterfs/indices/xattrop]
>
> # ls
>
> 9138e315-efd6-46e0-8a3a-db535078c781
> xattrop-dfcd7e67-8c2d-4ef1-93e2-c180073c8d87
>
> [root@sn-2:/mnt/bricks/services/brick/.glusterfs/indices/xattrop]
>
> # getfattr -m . -d -e hex
> /mnt/bricks/services/brick/fstest_88402c989256d6e39e50208c90c1e85d/
>
> getfattr: Removing leading '/' from absolute path names
>
> # file: mnt/bricks/services/brick/fstest_88402c989256d6e39e50208c90c1e85d/
>
> trusted.afr.services-client-1=0x00010001
>
> trusted.gfid=0x9138e315efd646e08a3adb535078c781
>
> trusted.glusterfs.dht=0x0001
>
>
>
> [root@sn-2:/mnt/bricks/services/brick/.glusterfs/indices/xattrop]
>
> # getfattr -m . -d -e hex
> /mnt/bricks/services/brick/fstest_88402c989256d6e39e50208c90c1e85d/fstest_4cf1be62e0b12d3d65fac8eacb523ef3/
>
> getfattr: Removing leading '/' from absolute path names
>
> # file:
> mnt/bricks/services/brick/fstest_88402c989256d6e39e50208c90c1e85d/fstest_4cf1be62e0b12d3d65fac8eacb523ef3/
>
> trusted.gfid=0x0ccb5c1f96064e699f62fdc72cf036f5
>
>
>
>
>
>
>
> “fstest_88402c989256d6e39e50208c90c1e85d” is only seen from the sn-2 mount
> point and the sn-2 service brick; there is no such entry if you ls
> /mnt/services on sn-0 or sn-1.
>
> [root@sn-2:/mnt/bricks/services/brick/.glusterfs/indices/xattrop]
>
> # cd /mnt/services/
>
> [root@sn-2:/mnt/services]
>
> # ls
>
> backup   db
> fstest_88402c989256d6e39e50208c90c1e85d  LCM  NE3SAgent
> _nokrcpautoremoteuser  PM9  RCP_Backup  SS_AlLightProcessor  SymptomDataUpl
>
> commoncollector  EventCorrelationEngine  hypertracer
>  Log  netservODSptp  rcpha   SWM
>
> [root@sn-2:/mnt/services]
>
>
>
>
>
___
Gluster-users mailing list
Gluster-users@gluster.org
https://lists.gluster.org/mailman/listinfo/gluster-users

Re: [Gluster-users] cannot add server back to cluster after reinstallation

2019-03-27 Thread Karthik Subrahmanya
+Sanju Rakonde  & +Atin Mukherjee
 adding
glusterd folks who can help here.

On Wed, Mar 27, 2019 at 3:24 PM Riccardo Murri 
wrote:

> I managed to put the reinstalled server back into connected state with
> this procedure:
>
> 1. Run `for other_server in ...; do gluster peer probe $other_server;
> done` on the reinstalled server
> 2. Now all the peers on the reinstalled server show up as "Accepted
> Peer Request", which I fixed with the procedure outlined in the last
> paragraph of
> https://docs.gluster.org/en/v3/Troubleshooting/troubleshooting-glusterd/#debugging-glusterd
>
> Can anyone confirm that this is a good way to proceed and I won't be
> heading quickly towards corrupting volume data?
>
> Thanks,
> Riccardo
> ___
> Gluster-users mailing list
> Gluster-users@gluster.org
> https://lists.gluster.org/mailman/listinfo/gluster-users
>
___
Gluster-users mailing list
Gluster-users@gluster.org
https://lists.gluster.org/mailman/listinfo/gluster-users

Re: [Gluster-users] Heal flapping between Possibly undergoing heal and In split brain

2019-03-21 Thread Karthik Subrahmanya
Hi Milos,

Thanks for the logs and the getfattr output.
From the logs I can see that there are 6 entries under the
directory "/data/data-cluster/dms/final_archive" named
41be9ff5ec05c4b1c989c6053e709e59
5543982fab4b56060aa09f667a8ae617
a8b7f31775eebc8d1867e7f9de7b6eaf
c1d3f3c2d7ae90e891e671e2f20d5d4b
e5934699809a3b6dcfc5945f408b978b
e7cdc94f60d390812a5f9754885e119e
which are having gfid mismatch, so the heal is failing on this directory.

You can use the CLI option to resolve these files from gfid mismatch. You
can use any of the 3 methods available:
1. bigger-file
gluster volume heal <VOLNAME> split-brain bigger-file <FILE>

2. latest-mtime
gluster volume heal <VOLNAME> split-brain latest-mtime <FILE>

3. source-brick
gluster volume heal <VOLNAME> split-brain source-brick <HOSTNAME:BRICKNAME> <FILE>


where <FILE> must be absolute path w.r.t. the volume, starting with '/'.
If all those entries are directories then go for either
latest-mtime/source-brick option.
After you resolve all these gfid-mismatches, run the "gluster volume heal
<VOLNAME>" command. Then check the heal info and let me know the result.

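For example, a rough sketch assuming you decide to keep the copy on
storage3 as the good one (pick the brick or policy that matches your data,
and repeat for each of the six entries):

gluster volume heal storage2 split-brain source-brick storage3:/data/data-cluster /dms/final_archive/41be9ff5ec05c4b1c989c6053e709e59
gluster volume heal storage2
gluster volume heal storage2 info
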
Regards,
Karthik

On Thu, Mar 21, 2019 at 4:27 PM Milos Cuculovic  wrote:

> Sure, thank you for following up.
>
> About the commands, here is what I see:
>
> brick1:
> —
> sudo gluster volume heal storage2 info
> Brick storage3:/data/data-cluster
> 
> 
> /dms/final_archive - Possibly undergoing heal
>
> Status: Connected
> Number of entries: 3
>
> Brick storage4:/data/data-cluster
> 
> /dms/final_archive - Possibly undergoing heal
>
> Status: Connected
> Number of entries: 2
> —
> sudo getfattr -d -m . -e hex /data/data-cluster/dms/final_archive
> getfattr: Removing leading '/' from absolute path names
> # file: data/data-cluster/dms/final_archive
> trusted.afr.dirty=0x
> trusted.afr.storage2-client-1=0x0010
> trusted.gfid=0x16c6a1e2b3fe4851972b998980097a87
> trusted.glusterfs.dht=0x0001
> trusted.glusterfs.dht.mds=0x
> —
> stat /data/data-cluster/dms/final_archive
>   File: '/data/data-cluster/dms/final_archive'
>   Size: 3497984   Blocks: 8768   IO Block: 4096   directory
> Device: 807h/2055d Inode: 26427748396  Links: 72123
> Access: (0755/drwxr-xr-x)  Uid: (   33/www-data)   Gid: (   33/www-data)
> Access: 2018-10-09 04:22:40.514629044 +0200
> Modify: 2019-03-21 11:55:37.382278863 +0100
> Change: 2019-03-21 11:55:37.382278863 +0100
>  Birth: -
> —
> —
>
> brick2:
> —
> sudo gluster volume heal storage2 info
> Brick storage3:/data/data-cluster
> 
> 
> /dms/final_archive - Possibly undergoing heal
>
> Status: Connected
> Number of entries: 3
>
> Brick storage4:/data/data-cluster
> 
> /dms/final_archive - Possibly undergoing heal
>
> Status: Connected
> Number of entries: 2
> —
> sudo getfattr -d -m . -e hex /data/data-cluster/dms/final_archive
> getfattr: Removing leading '/' from absolute path names
> # file: data/data-cluster/dms/final_archive
> trusted.afr.dirty=0x
> trusted.afr.storage2-client-0=0x0001
> trusted.gfid=0x16c6a1e2b3fe4851972b998980097a87
> trusted.glusterfs.dht=0x0001
> trusted.glusterfs.dht.mds=0x
> —
> stat /data/data-cluster/dms/final_archive
>   File: '/data/data-cluster/dms/final_archive'
>   Size: 3497984   Blocks: 8760   IO Block: 4096   directory
> Device: 807h/2055d Inode: 13563551265  Links: 72124
> Access: (0755/drwxr-xr-x)  Uid: (   33/www-data)   Gid: (   33/www-data)
> Access: 2018-10-09 04:22:40.514629044 +0200
> Modify: 2019-03-21 11:55:46.382565124 +0100
> Change: 2019-03-21 11:55:46.382565124 +0100
>  Birth: -
> —
>
> Hope this helps.
>
> - Kindest regards,
>
> Milos Cuculovic
> IT Manager
>
> ---
> MDPI AG
> Postfach, CH-4020 Basel, Switzerland
> Office: St. Alban-Anlage 66, 4052 Basel, Switzerland
> Tel. +41 61 683 77 35
> Fax +41 61 302 89 18
> Email: cuculo...@mdpi.com 
> Skype: milos.cuculovic.mdpi
>
> Disclaimer: The information and files contained in this message
> are confidential and intended solely for the use of the individual or
> entity to whom they are addressed. If you have received this message in
> error, please notify me and delete this message from your system. You may
> not copy this message in its entirety or in part, or disclose its contents
> to anyone.
>
> On 21 Mar 2019, at 11:43, Karthik Subrahma

Re: [Gluster-users] Heal flapping between Possibly undergoing heal and In split brain

2019-03-21 Thread Karthik Subrahmanya
Can you give me the stat & getfattr output of all those 6 entries from both
the bricks and the glfsheal-<VOLNAME>.log file from the node where you run
this command?
Meanwhile can you also try running this with the source-brick option?

On Thu, Mar 21, 2019 at 5:22 PM Milos Cuculovic  wrote:

> Thank you Karthik,
>
> I have run this for all files (see example below) and it says the file is
> not in split-brain:
>
> sudo gluster volume heal storage2 split-brain latest-mtime
> /dms/final_archive/41be9ff5ec05c4b1c989c6053e709e59
> Healing /dms/final_archive/41be9ff5ec05c4b1c989c6053e709e59 failed: File
> not in split-brain.
> Volume heal failed.
>
>
> - Kindest regards,
>
> Milos Cuculovic
> IT Manager
>
> ---
> MDPI AG
> Postfach, CH-4020 Basel, Switzerland
> Office: St. Alban-Anlage 66, 4052 Basel, Switzerland
> Tel. +41 61 683 77 35
> Fax +41 61 302 89 18
> Email: cuculo...@mdpi.com 
> Skype: milos.cuculovic.mdpi
>
> Disclaimer: The information and files contained in this message
> are confidential and intended solely for the use of the individual or
> entity to whom they are addressed. If you have received this message in
> error, please notify me and delete this message from your system. You may
> not copy this message in its entirety or in part, or disclose its contents
> to anyone.
>
> On 21 Mar 2019, at 12:36, Karthik Subrahmanya  wrote:
>
> Hi Milos,
>
> Thanks for the logs and the getfattr output.
> From the logs I can see that there are 6 entries under the
> directory "/data/data-cluster/dms/final_archive" named
> 41be9ff5ec05c4b1c989c6053e709e59
> 5543982fab4b56060aa09f667a8ae617
> a8b7f31775eebc8d1867e7f9de7b6eaf
> c1d3f3c2d7ae90e891e671e2f20d5d4b
> e5934699809a3b6dcfc5945f408b978b
> e7cdc94f60d390812a5f9754885e119e
> which are having gfid mismatch, so the heal is failing on this directory.
>
> You can use the CLI option to resolve these files from gfid mismatch. You
> can use any of the 3 methods available:
> 1. bigger-file
> gluster volume heal <VOLNAME> split-brain bigger-file <FILE>
>
> 2. latest-mtime
> gluster volume heal <VOLNAME> split-brain latest-mtime <FILE>
>
> 3. source-brick
> gluster volume heal <VOLNAME> split-brain source-brick
> <HOSTNAME:BRICKNAME> <FILE>
>
> where <FILE> must be absolute path w.r.t. the volume, starting with '/'.
> If all those entries are directories then go for either
> latest-mtime/source-brick option.
> After you resolve all these gfid-mismatches, run the "gluster volume heal
> <VOLNAME>" command. Then check the heal info and let me know the result.
>
> Regards,
> Karthik
>
> On Thu, Mar 21, 2019 at 4:27 PM Milos Cuculovic 
> wrote:
>
>> Sure, thank you for following up.
>>
>> About the commands, here is what I see:
>>
>> brick1:
>> —
>> sudo gluster volume heal storage2 info
>> Brick storage3:/data/data-cluster
>> 
>> 
>> /dms/final_archive - Possibly undergoing heal
>>
>> Status: Connected
>> Number of entries: 3
>>
>> Brick storage4:/data/data-cluster
>> 
>> /dms/final_archive - Possibly undergoing heal
>>
>> Status: Connected
>> Number of entries: 2
>> —
>> sudo getfattr -d -m . -e hex /data/data-cluster/dms/final_archive
>> getfattr: Removing leading '/' from absolute path names
>> # file: data/data-cluster/dms/final_archive
>> trusted.afr.dirty=0x
>> trusted.afr.storage2-client-1=0x0010
>> trusted.gfid=0x16c6a1e2b3fe4851972b998980097a87
>> trusted.glusterfs.dht=0x0001
>> trusted.glusterfs.dht.mds=0x
>> —
>> stat /data/data-cluster/dms/final_archive
>>   File: '/data/data-cluster/dms/final_archive'
>>   Size: 3497984   Blocks: 8768   IO Block: 4096   directory
>> Device: 807h/2055d Inode: 26427748396  Links: 72123
>> Access: (0755/drwxr-xr-x)  Uid: (   33/www-data)   Gid: (   33/www-data)
>> Access: 2018-10-09 04:22:40.514629044 +0200
>> Modify: 2019-03-21 11:55:37.382278863 +0100
>> Change: 2019-03-21 11:55:37.382278863 +0100
>>  Birth: -
>> —
>> —
>>
>> brick2:
>> —
>> sudo gluster volume heal storage2 info
>> Brick storage3:/data/data-cluster
>> 
>> 
>> /dms/final_archive - Possibly undergoing heal
>>
>> Status: Connected
>> Number of entries: 3
>>
>> Brick storage4:/data/data-cluster
>> 
>> /dms/final_archive - Possibly undergoing heal
>>
>> Status: 

Re: [Gluster-users] Heal flapping between Possibly undergoing heal and In split brain

2019-03-21 Thread Karthik Subrahmanya
Hi,

Note: I guess the volume you are talking about is of type replica-2 (1x2).
Usually replica 2 volumes are prone to split-brain. If you can consider
converting them to arbiter or replica-3, they will handle most of the cases
which can lead to split-brains. For more information see [1].

Resolving the split-brain: [2] talks about how to interpret the heal info
output and different ways to resolve them using the CLI/manually/using the
favorite-child-policy.
If you are having entry split brain, and is a gfid split-brain (file/dir
having different gfids on the replica bricks) then you can use the CLI
option to resolve them. If a directory is in gfid split-brain in a
distributed-replicate volume and you are using the source-brick option
please make sure you use the brick of this subvolume, which has the same
gfid as that of the other distribute subvolume(s) where you have the
correct gfid, as the source.
If you are having a type mismatch then follow the steps in [3] to resolve
the split-brain.

[1]
https://docs.gluster.org/en/v3/Administrator%20Guide/Split%20brain%20and%20ways%20to%20deal%20with%20it/
[2] https://docs.gluster.org/en/latest/Troubleshooting/resolving-splitbrain/
[3]
https://docs.gluster.org/en/latest/Troubleshooting/resolving-splitbrain/#dir-split-brain
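
As an illustration of the favorite-child-policy route mentioned in [2], a
minimal sketch (assuming your volume is named storage2 and you are happy to
always trust the most recently modified copy) would be:

gluster volume set storage2 cluster.favorite-child-policy mtime
gluster volume heal storage2

With this set, data/metadata split-brains get resolved automatically using
the copy with the latest mtime; set it back to "none" if you prefer to keep
resolving them manually.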

HTH,
Karthik

On Thu, Mar 21, 2019 at 1:45 PM Milos Cuculovic  wrote:

> I was now able to catch the split brain log:
>
> sudo gluster volume heal storage2 info
> Brick storage3:/data/data-cluster
> 
> 
> /dms/final_archive - Is in split-brain
>
> Status: Connected
> Number of entries: 3
>
> Brick storage4:/data/data-cluster
> 
> /dms/final_archive - Is in split-brain
>
> Status: Connected
> Number of entries: 2
>
> Milos
>
> On 21 Mar 2019, at 09:07, Milos Cuculovic  wrote:
>
> Since 24h, after upgrading from 4.0 to 4.1.7 one of the servers, the heal
> shows this:
>
> sudo gluster volume heal storage2 info
> Brick storage3:/data/data-cluster
> 
> 
> /dms/final_archive - Possibly undergoing heal
>
> Status: Connected
> Number of entries: 3
>
> Brick storage4:/data/data-cluster
> 
> /dms/final_archive - Possibly undergoing heal
>
> Status: Connected
> Number of entries: 2
>
> The same files stay there. From time to time the status of the
> /dms/final_archive is in split brain at the following command shows:
>
> sudo gluster volume heal storage2 info split-brain
> Brick storage3:/data/data-cluster
> /dms/final_archive
> Status: Connected
> Number of entries in split-brain: 1
>
> Brick storage4:/data/data-cluster
> /dms/final_archive
> Status: Connected
> Number of entries in split-brain: 1
>
> How to know the file who is in split brain? The files in
> /dms/final_archive are not very important, fine to remove (ideally resolve
> the split brain) for the ones that differ.
>
> I can only see the directory and GFID. Any idea on how to resolve this
> situation as I would like to continue with the upgrade on the 2nd server,
> and for this the heal needs to be done with 0 entries in sudo gluster
> volume heal storage2 info
>
> Thank you in advance, Milos.
>
>
> ___
> Gluster-users mailing list
> Gluster-users@gluster.org
> https://lists.gluster.org/mailman/listinfo/gluster-users
___
Gluster-users mailing list
Gluster-users@gluster.org
https://lists.gluster.org/mailman/listinfo/gluster-users

Re: [Gluster-users] Heal flapping between Possibly undergoing heal and In split brain

2019-03-21 Thread Karthik Subrahmanya
>
> sudo stat
> /data/data-cluster/dms/final_archive/41be9ff5ec05c4b1c989c6053e709e59
>   File:
> '/data/data-cluster/dms/final_archive/41be9ff5ec05c4b1c989c6053e709e59'
>   Size: 33Blocks: 0  IO Block: 4096   directory
> Device: 807h/2055d Inode: 42232631305  Links: 3
> Access: (0755/drwxr-xr-x)  Uid: (   33/www-data)   Gid: (   33/www-data)
> Access: 2019-03-20 11:06:26.994047597 +0100
> Modify: 2019-03-20 11:28:28.294689870 +0100
> Change: 2019-03-21 13:01:03.078748131 +0100
>  Birth: -
>
> sudo stat
> /data/data-cluster/dms/final_archive/5543982fab4b56060aa09f667a8ae617
>   File:
> '/data/data-cluster/dms/final_archive/5543982fab4b56060aa09f667a8ae617'
>   Size: 33Blocks: 0  IO Block: 4096   directory
> Device: 807h/2055d Inode: 78589109305  Links: 3
> Access: (0755/drwxr-xr-x)  Uid: (   33/www-data)   Gid: (   33/www-data)
> Access: 2019-03-20 11:07:20.342140927 +0100
> Modify: 2019-03-20 11:28:28.318690015 +0100
> Change: 2019-03-21 13:01:03.134748477 +0100
>  Birth: -
>
> sudo stat
> /data/data-cluster/dms/final_archive/a8b7f31775eebc8d1867e7f9de7b6eaf
>   File:
> '/data/data-cluster/dms/final_archive/a8b7f31775eebc8d1867e7f9de7b6eaf'
>   Size: 33Blocks: 0  IO Block: 4096   directory
> Device: 807h/2055d Inode: 54972096517  Links: 3
> Access: (0755/drwxr-xr-x)  Uid: (   33/www-data)   Gid: (   33/www-data)
> Access: 2019-03-20 11:06:55.414097315 +0100
> Modify: 2019-03-20 11:28:28.362690281 +0100
> Change: 2019-03-21 13:01:03.162748650 +0100
>  Birth: -
>
> sudo stat
> /data/data-cluster/dms/final_archive/c1d3f3c2d7ae90e891e671e2f20d5d4b
>   File:
> '/data/data-cluster/dms/final_archive/c1d3f3c2d7ae90e891e671e2f20d5d4b'
>   Size: 33Blocks: 0  IO Block: 4096   directory
> Device: 807h/2055d Inode: 40821259275  Links: 3
> Access: (0755/drwxr-xr-x)  Uid: (   33/www-data)   Gid: (   33/www-data)
> Access: 2019-03-20 11:07:08.558120309 +0100
> Modify: 2019-03-20 11:28:14.226604869 +0100
> Change: 2019-03-21 13:01:03.194748848 +0100
>  Birth: -
>
> sudo stat
> /data/data-cluster/dms/final_archive/e5934699809a3b6dcfc5945f408b978b
>   File:
> '/data/data-cluster/dms/final_archive/e5934699809a3b6dcfc5945f408b978b'
>   Size: 33Blocks: 0  IO Block: 4096   directory
> Device: 807h/2055d Inode: 15876654Links: 3
> Access: (0755/drwxr-xr-x)  Uid: (   33/www-data)   Gid: (   33/www-data)
> Access: 2019-03-20 11:06:02.070003998 +0100
> Modify: 2019-03-20 11:28:28.458690861 +0100
> Change: 2019-03-21 13:01:03.282749392 +0100
>  Birth: -
>
> sudo stat
> /data/data-cluster/dms/final_archive/e7cdc94f60d390812a5f9754885e119e
>   File:
> '/data/data-cluster/dms/final_archive/e7cdc94f60d390812a5f9754885e119e'
>   Size: 33Blocks: 0  IO Block: 4096   directory
> Device: 807h/2055d Inode: 49408944650  Links: 3
> Access: (0755/drwxr-xr-x)  Uid: (   33/www-data)   Gid: (   33/www-data)
> Access: 2019-03-20 11:28:10.826584325 +0100
> Modify: 2019-03-20 11:28:10.834584374 +0100
> Change: 2019-03-20 14:06:07.940849268 +0100
>  Birth: -
> ————————
> 
>
> The file is from brick 2 that I upgraded and started the heal on.
>
>
> - Kindest regards,
>
> Milos Cuculovic
> IT Manager
>
> ---
> MDPI AG
> Postfach, CH-4020 Basel, Switzerland
> Office: St. Alban-Anlage 66, 4052 Basel, Switzerland
> Tel. +41 61 683 77 35
> Fax +41 61 302 89 18
> Email: cuculo...@mdpi.com 
> Skype: milos.cuculovic.mdpi
>
> Disclaimer: The information and files contained in this message
> are confidential and intended solely for the use of the individual or
> entity to whom they are addressed. If you have received this message in
> error, please notify me and delete this message from your system. You may
> not copy this message in its entirety or in part, or disclose its contents
> to anyone.
>
> On 21 Mar 2019, at 13:05, Karthik Subrahmanya  wrote:
>
> Can you give me the stat & getfattr output of all those 6 entries from
> both the bricks and the glfsheal-<VOLNAME>.log file from the node where you
> run this command?
> Meanwhile can you also try running this with the source-brick option?
>
> On Thu, Mar 21, 2019 at 5:22 PM Milos Cuculovic 
> wrote:
>
>> Thank you Karthik,
>>
>> I have run this for all files (see example below) and it says the file is
>> not in split-brain:
>>
>> sudo gluster volume heal storage2 split-brain latest-mtime
>> /dms/final_archive/41be9ff5ec05c4b1c989c6053e709e59
>> Healing /dms/final_archive/41be9ff5ec05c4b1

Re: [Gluster-users] Heal flapping between Possibly undergoing heal and In split brain

2019-03-21 Thread Karthik Subrahmanya
Can you attach the "glustershd.log" file which will be present under
"/var/log/glusterfs/" from both the nodes and the "stat" & "getfattr -d -m
. -e hex <file-path>" output of all the entries listed in the heal
info output from both the bricks?

On Thu, Mar 21, 2019 at 3:54 PM Milos Cuculovic  wrote:

> Thanks Karthik!
>
> I was trying to find some resolution methods from [2] but unfortunately
> none worked (I can explain what I tried if needed).
>
> I guess the volume you are talking about is of type replica-2 (1x2).
>
> That’s correct, aware of the arbiter solution but still didn’t took time
> to implement.
>
> From the info results I posted, how to know in which situation I am. No
> files are mentioned in spit brain, only directories. One brick has 3
> entries and one two entries.
>
> sudo gluster volume heal storage2 info
> [sudo] password for sshadmin:
> Brick storage3:/data/data-cluster
> 
> 
> /dms/final_archive - Possibly undergoing heal
>
> Status: Connected
> Number of entries: 3
>
> Brick storage4:/data/data-cluster
> 
> /dms/final_archive - Possibly undergoing heal
>
> Status: Connected
> Number of entries: 2
>
> - Kindest regards,
>
> Milos Cuculovic
> IT Manager
>
> ---
> MDPI AG
> Postfach, CH-4020 Basel, Switzerland
> Office: St. Alban-Anlage 66, 4052 Basel, Switzerland
> Tel. +41 61 683 77 35
> Fax +41 61 302 89 18
> Email: cuculo...@mdpi.com 
> Skype: milos.cuculovic.mdpi
>
> Disclaimer: The information and files contained in this message
> are confidential and intended solely for the use of the individual or
> entity to whom they are addressed. If you have received this message in
> error, please notify me and delete this message from your system. You may
> not copy this message in its entirety or in part, or disclose its contents
> to anyone.
>
> On 21 Mar 2019, at 10:27, Karthik Subrahmanya  wrote:
>
> Hi,
>
> Note: I guess the volume you are talking about is of type replica-2 (1x2).
> Usually replica 2 volumes are prone to split-brain. If you can consider
> converting them to arbiter or replica-3, they will handle most of the cases
> which can lead to split-brains. For more information see [1].
>
> Resolving the split-brain: [2] talks about how to interpret the heal info
> output and different ways to resolve them using the CLI/manually/using the
> favorite-child-policy.
> If you are having entry split brain, and is a gfid split-brain (file/dir
> having different gfids on the replica bricks) then you can use the CLI
> option to resolve them. If a directory is in gfid split-brain in a
> distributed-replicate volume and you are using the source-brick option
> please make sure you use the brick of this subvolume, which has the same
> gfid as that of the other distribute subvolume(s) where you have the
> correct gfid, as the source.
> If you are having a type mismatch then follow the steps in [3] to resolve
> the split-brain.
>
> [1]
> https://docs.gluster.org/en/v3/Administrator%20Guide/Split%20brain%20and%20ways%20to%20deal%20with%20it/
> [2]
> https://docs.gluster.org/en/latest/Troubleshooting/resolving-splitbrain/
> [3]
> https://docs.gluster.org/en/latest/Troubleshooting/resolving-splitbrain/#dir-split-brain
>
> HTH,
> Karthik
>
> On Thu, Mar 21, 2019 at 1:45 PM Milos Cuculovic 
> wrote:
>
>> I was now able to catch the split brain log:
>>
>> sudo gluster volume heal storage2 info
>> Brick storage3:/data/data-cluster
>> 
>> 
>> /dms/final_archive - Is in split-brain
>>
>> Status: Connected
>> Number of entries: 3
>>
>> Brick storage4:/data/data-cluster
>> 
>> /dms/final_archive - Is in split-brain
>>
>> Status: Connected
>> Number of entries: 2
>>
>> Milos
>>
>> On 21 Mar 2019, at 09:07, Milos Cuculovic  wrote:
>>
>> Since 24h, after upgrading from 4.0 to 4.1.7 one of the servers, the heal
>> shows this:
>>
>> sudo gluster volume heal storage2 info
>> Brick storage3:/data/data-cluster
>> 
>> 
>> /dms/final_archive - Possibly undergoing heal
>>
>> Status: Connected
>> Number of entries: 3
>>
>> Brick storage4:/data/data-cluster
>> 
>> /dms/final_archive - Possibly undergoing heal
>>
>> Status: Connected
>> Number of entries: 2
>>
>> The same files stay there. From time to time the status of the
>> /dms/final_archive is in split brain at the following command shows:
>>
>> sudo gluster volume heal storage2 info split-brain
>> Brick storage3:/data/data-cluster
>> /dms/final_archive
>> Status: Connected
>> N

Re: [Gluster-users] Replica 3 - how to replace failed node (peer)

2019-04-10 Thread Karthik Subrahmanya
Hi Strahil,

Thank you for sharing your experience with reset-brick option.
Since he is using the gluster version 3.7.6, we do not have the reset-brick
[1] option implemented there. It is introduced in 3.9.0. He has to go with
replace-brick with the force option if he wants to use the same path & name
for the new brick.
Yes, it is recommended to have the new brick to be of the same size as that
of the other bricks.

[1]
https://docs.gluster.org/en/latest/release-notes/3.9.0/#introducing-reset-brick-command

Regards,
Karthik

On Wed, Apr 10, 2019 at 10:31 PM Strahil  wrote:

> I have used reset-brick - but I have just changed the brick layout.
> You may give it a try, but I guess you need your new brick to have same
> amount of space (or more).
>
> Maybe someone more experienced should share a more sound solution.
>
> Best Regards,
> Strahil NikolovOn Apr 10, 2019 12:42, Martin Toth 
> wrote:
> >
> > Hi all,
> >
> > I am running replica 3 gluster with 3 bricks. One of my servers failed -
> all disks are showing errors and raid is in fault state.
> >
> > Type: Replicate
> > Volume ID: 41d5c283-3a74-4af8-a55d-924447bfa59a
> > Status: Started
> > Number of Bricks: 1 x 3 = 3
> > Transport-type: tcp
> > Bricks:
> > Brick1: node1.san:/tank/gluster/gv0imagestore/brick1
> > Brick2: node2.san:/tank/gluster/gv0imagestore/brick1 <— this brick is
> down
> > Brick3: node3.san:/tank/gluster/gv0imagestore/brick1
> >
> > So one of my bricks is totally failed (node2). It went down and all data
> are lost (failed raid on node2). Now I am running only two bricks on 2
> servers out from 3.
> > This is really critical problem for us, we can lost all data. I want to
> add new disks to node2, create new raid array on them and try to replace
> failed brick on this node.
> >
> > What is the procedure of replacing Brick2 on node2, can someone advice?
> I can’t find anything relevant in documentation.
> >
> > Thanks in advance,
> > Martin
> > ___
> > Gluster-users mailing list
> > Gluster-users@gluster.org
> > https://lists.gluster.org/mailman/listinfo/gluster-users
> ___
> Gluster-users mailing list
> Gluster-users@gluster.org
> https://lists.gluster.org/mailman/listinfo/gluster-users
___
Gluster-users mailing list
Gluster-users@gluster.org
https://lists.gluster.org/mailman/listinfo/gluster-users

Re: [Gluster-users] Is "replica 4 arbiter 1" allowed to tweak client-quorum?

2019-04-10 Thread Karthik Subrahmanya
Hi,

I guess you missed Ravishankar's reply [1] for this query, on your previous
thread.
[1] https://lists.gluster.org/pipermail/gluster-users/2019-April/036247.html

Regards,
Karthik

On Wed, Apr 10, 2019 at 8:59 PM Ingo Fischer  wrote:

> Hi All,
>
> I had a replica 2 cluster to host my VM images from my Proxmox cluster.
> I got a bit around split brain scenarios by using "nufa" to make sure
> the files are located on the host where the machine also runs normally.
> So in fact one replica could fail and I still had the VM working.
>
> But then I thought about doing better and decided to add a node to
> increase replica and I decided against arbiter approach. During this I
> also decided to go away from nufa to make it a more normal approach.
>
> But in fact by adding the third replica and removing nufa I'm not really
> better on availability - only split-brain-chance. I'm still at the point
> that only one node is allowed to fail because else the now active client
> quorum is no longer met and FS goes read only (which in fact is not
> really better then failing completely as it was before).
>
> So I thought about adding arbiter bricks as "kind of 4th replica (but
> without space needs) ... but then I read in docs that only "replica 3
> arbiter 1" is allowed as combination. Is this still true?
> If docs are true: Why arbiter is not allowed for higher replica counts?
> It would allow to improve on client quorum in my understanding.
>
> Thank you for your opinion and/or facts :-)
>
> Ingo
> ___
> Gluster-users mailing list
> Gluster-users@gluster.org
> https://lists.gluster.org/mailman/listinfo/gluster-users
>
___
Gluster-users mailing list
Gluster-users@gluster.org
https://lists.gluster.org/mailman/listinfo/gluster-users

Re: [Gluster-users] Replica 3 - how to replace failed node (peer)

2019-04-10 Thread Karthik Subrahmanya
On Thu, Apr 11, 2019 at 10:23 AM Karthik Subrahmanya 
wrote:

> Hi Strahil,
>
> Can you give us some more insights on
> - the volume configuration you were using?
> - why you wanted to replace your brick?
> - which brick(s) you tried replacing?
>
- if you remember the commands/steps that you followed, please give that as
well.

> - what problem(s) did you face?
>

> Regards,
> Karthik
>
> On Thu, Apr 11, 2019 at 10:14 AM Strahil  wrote:
>
>> Hi Karthnik,
>> I used only once the brick replace function when I wanted to change my
>> Arbiter (v3.12.15 in oVirt 4.2.7)  and it was a complete disaster.
>> Most probably I should have stopped the source arbiter before doing that,
>> but the docs didn't mention it.
>>
>> Thus I always use reset-brick, as it never let me down.
>>
>> Best Regards,
>> Strahil Nikolov
>> On Apr 11, 2019 07:34, Karthik Subrahmanya  wrote:
>>
>> Hi Strahil,
>>
>> Thank you for sharing your experience with reset-brick option.
>> Since he is using the gluster version 3.7.6, we do not have the
>> reset-brick [1] option implemented there. It is introduced in 3.9.0. He has
>> to go with replace-brick with the force option if he wants to use the same
>> path & name for the new brick.
>> Yes, it is recommended to have the new brick to be of the same size as
>> that of the other bricks.
>>
>> [1]
>> https://docs.gluster.org/en/latest/release-notes/3.9.0/#introducing-reset-brick-command
>>
>> Regards,
>> Karthik
>>
>> On Wed, Apr 10, 2019 at 10:31 PM Strahil  wrote:
>>
>> I have used reset-brick - but I have just changed the brick layout.
>> You may give it a try, but I guess you need your new brick to have same
>> amount of space (or more).
>>
>> Maybe someone more experienced should share a more sound solution.
>>
>> Best Regards,
>> Strahil NikolovOn Apr 10, 2019 12:42, Martin Toth 
>> wrote:
>> >
>> > Hi all,
>> >
>> > I am running replica 3 gluster with 3 bricks. One of my servers failed
>> - all disks are showing errors and raid is in fault state.
>> >
>> > Type: Replicate
>> > Volume ID: 41d5c283-3a74-4af8-a55d-924447bfa59a
>> > Status: Started
>> > Number of Bricks: 1 x 3 = 3
>> > Transport-type: tcp
>> > Bricks:
>> > Brick1: node1.san:/tank/gluster/gv0imagestore/brick1
>> > Brick2: node2.san:/tank/gluster/gv0imagestore/brick1 <— this brick is
>> down
>> > Brick3: node3.san:/tank/gluster/gv0imagestore/brick1
>> >
>> > So one of my bricks is totally failed (node2). It went down and all
>> data are lost (failed raid on node2). Now I am running only two bricks on 2
>> servers out from 3.
>> > This is really critical problem for us, we can lost all data. I want to
>> add new disks to node2, create new raid array on them and try to replace
>> failed brick on this node.
>> >
>> > What is the procedure of replacing Brick2 on node2, can someone advice?
>> I can’t find anything relevant in documentation.
>> >
>> > Thanks in advance,
>> > Martin
>> > ___
>> > Gluster-users mailing list
>> > Gluster-users@gluster.org
>> > https://lists.gluster.org/mailman/listinfo/gluster-users
>> ___
>> Gluster-users mailing list
>> Gluster-users@gluster.org
>> https://lists.gluster.org/mailman/listinfo/gluster-users
>>
>>
___
Gluster-users mailing list
Gluster-users@gluster.org
https://lists.gluster.org/mailman/listinfo/gluster-users

Re: [Gluster-users] Replica 3 - how to replace failed node (peer)

2019-04-10 Thread Karthik Subrahmanya
Hi Strahil,

Can you give us some more insights on
- the volume configuration you were using?
- why you wanted to replace your brick?
- which brick(s) you tried replacing?
- what problem(s) did you face?

Regards,
Karthik

On Thu, Apr 11, 2019 at 10:14 AM Strahil  wrote:

> Hi Karthnik,
> I used only once the brick replace function when I wanted to change my
> Arbiter (v3.12.15 in oVirt 4.2.7)  and it was a complete disaster.
> Most probably I should have stopped the source arbiter before doing that,
> but the docs didn't mention it.
>
> Thus I always use reset-brick, as it never let me down.
>
> Best Regards,
> Strahil Nikolov
> On Apr 11, 2019 07:34, Karthik Subrahmanya  wrote:
>
> Hi Strahil,
>
> Thank you for sharing your experience with reset-brick option.
> Since he is using the gluster version 3.7.6, we do not have the
> reset-brick [1] option implemented there. It is introduced in 3.9.0. He has
> to go with replace-brick with the force option if he wants to use the same
> path & name for the new brick.
> Yes, it is recommended to have the new brick to be of the same size as
> that of the other bricks.
>
> [1]
> https://docs.gluster.org/en/latest/release-notes/3.9.0/#introducing-reset-brick-command
>
> Regards,
> Karthik
>
> On Wed, Apr 10, 2019 at 10:31 PM Strahil  wrote:
>
> I have used reset-brick - but I have just changed the brick layout.
> You may give it a try, but I guess you need your new brick to have same
> amount of space (or more).
>
> Maybe someone more experienced should share a more sound solution.
>
> Best Regards,
> Strahil NikolovOn Apr 10, 2019 12:42, Martin Toth 
> wrote:
> >
> > Hi all,
> >
> > I am running replica 3 gluster with 3 bricks. One of my servers failed -
> all disks are showing errors and raid is in fault state.
> >
> > Type: Replicate
> > Volume ID: 41d5c283-3a74-4af8-a55d-924447bfa59a
> > Status: Started
> > Number of Bricks: 1 x 3 = 3
> > Transport-type: tcp
> > Bricks:
> > Brick1: node1.san:/tank/gluster/gv0imagestore/brick1
> > Brick2: node2.san:/tank/gluster/gv0imagestore/brick1 <— this brick is
> down
> > Brick3: node3.san:/tank/gluster/gv0imagestore/brick1
> >
> > So one of my bricks is totally failed (node2). It went down and all data
> are lost (failed raid on node2). Now I am running only two bricks on 2
> servers out from 3.
> > This is really critical problem for us, we can lost all data. I want to
> add new disks to node2, create new raid array on them and try to replace
> failed brick on this node.
> >
> > What is the procedure of replacing Brick2 on node2, can someone advice?
> I can’t find anything relevant in documentation.
> >
> > Thanks in advance,
> > Martin
> > ___
> > Gluster-users mailing list
> > Gluster-users@gluster.org
> > https://lists.gluster.org/mailman/listinfo/gluster-users
> ___
> Gluster-users mailing list
> Gluster-users@gluster.org
> https://lists.gluster.org/mailman/listinfo/gluster-users
>
>
___
Gluster-users mailing list
Gluster-users@gluster.org
https://lists.gluster.org/mailman/listinfo/gluster-users

Re: [Gluster-users] Replica 3 - how to replace failed node (peer)

2019-04-11 Thread Karthik Subrahmanya
On Thu, Apr 11, 2019 at 1:40 PM Strahil Nikolov 
wrote:

> Hi Karthik,
>
> - the volume configuration you were using?
> I used oVirt 4.2.6 Gluster Wizard, so I guess - we need to involve the
> oVirt devs here.
> - why you wanted to replace your brick?
> I have deployed the arbiter on another location as I thought I could deploy
> the Thin Arbiter (still waiting for the docs to be updated), but once I
> realized that GlusterD doesn't support Thin Arbiter, I had to build another
> machine for a local arbiter - thus a replacement was needed.
>
We are working on supporting Thin-arbiter with GlusterD. Once done, we will
update on the users list so that you can play with it and let us know your
experience.

> - which brick(s) you tried replacing?
> I was replacing the old arbiter with a new one
> - what problem(s) did you face?
> All oVirt VMs got paused due to I/O errors.
>
There could be many reasons for this. Without knowing the exact state of
the system at that time, I am afraid to make any comment on this.

>
> At the end, I have rebuild the whole setup and I never tried to replace
> the brick this way (used only reset-brick which didn't cause any issues).
>
> As I mentioned that was on v3.12, which is not the default for oVirt
> 4.3.x - so my guess is that it is OK now (current is v5.5).
>
I don't remember anyone complaining about this recently. This should work
in the latest releases.

>
> Just sharing my experience.
>
Highly appreciated.

Regards,
Karthik

>
> Best Regards,
> Strahil Nikolov
>
> On Thursday, 11 April 2019, 0:53:52 GMT-4, Karthik Subrahmanya <
> ksubr...@redhat.com> wrote:
>
>
> Hi Strahil,
>
> Can you give us some more insights on
> - the volume configuration you were using?
> - why you wanted to replace your brick?
> - which brick(s) you tried replacing?
> - what problem(s) did you face?
>
> Regards,
> Karthik
>
> On Thu, Apr 11, 2019 at 10:14 AM Strahil  wrote:
>
> Hi Karthnik,
> I used only once the brick replace function when I wanted to change my
> Arbiter (v3.12.15 in oVirt 4.2.7)  and it was a complete disaster.
> Most probably I should have stopped the source arbiter before doing that,
> but the docs didn't mention it.
>
> Thus I always use reset-brick, as it never let me down.
>
> Best Regards,
> Strahil Nikolov
> On Apr 11, 2019 07:34, Karthik Subrahmanya  wrote:
>
> Hi Strahil,
>
> Thank you for sharing your experience with reset-brick option.
> Since he is using the gluster version 3.7.6, we do not have the
> reset-brick [1] option implemented there. It is introduced in 3.9.0. He has
> to go with replace-brick with the force option if he wants to use the same
> path & name for the new brick.
> Yes, it is recommended to have the new brick to be of the same size as
> that of the other bricks.
>
> [1]
> https://docs.gluster.org/en/latest/release-notes/3.9.0/#introducing-reset-brick-command
>
> Regards,
> Karthik
>
> On Wed, Apr 10, 2019 at 10:31 PM Strahil  wrote:
>
> I have used reset-brick - but I have just changed the brick layout.
> You may give it a try, but I guess you need your new brick to have same
> amount of space (or more).
>
> Maybe someone more experienced should share a more sound solution.
>
> Best Regards,
> Strahil NikolovOn Apr 10, 2019 12:42, Martin Toth 
> wrote:
> >
> > Hi all,
> >
> > I am running replica 3 gluster with 3 bricks. One of my servers failed -
> all disks are showing errors and raid is in fault state.
> >
> > Type: Replicate
> > Volume ID: 41d5c283-3a74-4af8-a55d-924447bfa59a
> > Status: Started
> > Number of Bricks: 1 x 3 = 3
> > Transport-type: tcp
> > Bricks:
> > Brick1: node1.san:/tank/gluster/gv0imagestore/brick1
> > Brick2: node2.san:/tank/gluster/gv0imagestore/brick1 <— this brick is
> down
> > Brick3: node3.san:/tank/gluster/gv0imagestore/brick1
> >
> > So one of my bricks is totally failed (node2). It went down and all data
> are lost (failed raid on node2). Now I am running only two bricks on 2
> servers out from 3.
> > This is really critical problem for us, we can lost all data. I want to
> add new disks to node2, create new raid array on them and try to replace
> failed brick on this node.
> >
> > What is the procedure of replacing Brick2 on node2, can someone advice?
> I can’t find anything relevant in documentation.
> >
> > Thanks in advance,
> > Martin
> > ___
> > Gluster-users mailing list
> > Gluster-users@gluster.org
> > https://lists.gluster.org/mailman/listinfo/gluster-users
> ___
> Gluster-users mailing list
> Gluster-users@gluster.org
> https://lists.gluster.org/mailman/listinfo/gluster-users
>
>
___
Gluster-users mailing list
Gluster-users@gluster.org
https://lists.gluster.org/mailman/listinfo/gluster-users

Re: [Gluster-users] Replica 3 - how to replace failed node (peer)

2019-04-11 Thread Karthik Subrahmanya
On Thu, Apr 11, 2019 at 12:43 PM Martin Toth  wrote:

> Hi Karthik,
>
> more over, I would like to ask if there are some recommended
> settings/parameters for SHD in order to achieve good or fair I/O while
> volume will be healed when I will replace Brick (this should trigger
> healing process).
>
If I understand your concern correctly, you need to get fair I/O performance
for clients while healing takes place as part of the replace brick
operation. For this you can turn off the "data-self-heal" and
"metadata-self-heal" options until the heal completes on the new brick.
Turning off client side healing doesn't compromise data integrity and
consistency. During the read request from client, pending xattr is
evaluated for replica copies and read is only served from correct copy.
During writes, IO will continue on both the replicas, SHD will take care of
healing files.
After replacing the brick, we strongly recommend you to consider upgrading
your gluster to one of the maintained versions. We have many stability
related fixes there, which can handle some critical issues and corner cases
which you could hit during these kind of scenarios.
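
For reference, the option changes mentioned above would look roughly like
this (substitute your actual volume name; these are the standard
cluster.data-self-heal and cluster.metadata-self-heal options):

gluster volume set <volname> cluster.data-self-heal off
gluster volume set <volname> cluster.metadata-self-heal off
# ...replace the brick and wait for "gluster volume heal <volname> info"
# to show 0 entries, then re-enable:
gluster volume set <volname> cluster.data-self-heal on
gluster volume set <volname> cluster.metadata-self-heal on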

Regards,
Karthik

> I had some problems in past when healing was triggered, VM disks became
> unresponsive because healing took most of I/O. My volume containing only
> big files with VM disks.
>
> Thanks for suggestions.
> BR,
> Martin
>
> On 10 Apr 2019, at 12:38, Martin Toth  wrote:
>
> Thanks, this looks ok to me, I will reset brick because I don't have any
> data anymore on failed node so I can use same path / brick name.
>
> Is reseting brick dangerous command? Should I be worried about some
> possible failure that will impact remaining two nodes? I am running really
> old 3.7.6 but stable version.
>
> Thanks,
> BR!
>
> Martin
>
>
> On 10 Apr 2019, at 12:20, Karthik Subrahmanya  wrote:
>
> Hi Martin,
>
> After you add the new disks and creating raid array, you can run the
> following command to replace the old brick with new one:
>
> - If you are going to use a different name to the new brick you can run
> gluster volume replace-brick <VOLNAME> <OLD-BRICK> <NEW-BRICK> commit force
>
> - If you are planning to use the same name for the new brick as well then
> you can use
> gluster volume reset-brick <VOLNAME> <OLD-BRICK> <NEW-BRICK> commit force
> Here old-brick & new-brick's hostname &  path should be same.
>
> After replacing the brick, make sure the brick comes online using volume
> status.
> Heal should automatically start, you can check the heal status to see all
> the files gets replicated to the newly added brick. If it does not start
> automatically, you can manually start that by running gluster volume heal
> <VOLNAME>.
>
> HTH,
> Karthik
>
> On Wed, Apr 10, 2019 at 3:13 PM Martin Toth  wrote:
>
>> Hi all,
>>
>> I am running replica 3 gluster with 3 bricks. One of my servers failed -
>> all disks are showing errors and raid is in fault state.
>>
>> Type: Replicate
>> Volume ID: 41d5c283-3a74-4af8-a55d-924447bfa59a
>> Status: Started
>> Number of Bricks: 1 x 3 = 3
>> Transport-type: tcp
>> Bricks:
>> Brick1: node1.san:/tank/gluster/gv0imagestore/brick1
>> Brick2: node2.san:/tank/gluster/gv0imagestore/brick1 <— this brick is down
>> Brick3: node3.san:/tank/gluster/gv0imagestore/brick1
>>
>> So one of my bricks is totally failed (node2). It went down and all data
>> are lost (failed raid on node2). Now I am running only two bricks on 2
>> servers out from 3.
>> This is really critical problem for us, we can lost all data. I want to
>> add new disks to node2, create new raid array on them and try to replace
>> failed brick on this node.
>>
>> What is the procedure of replacing Brick2 on node2, can someone advice? I
>> can’t find anything relevant in documentation.
>>
>> Thanks in advance,
>> Martin
>> ___
>> Gluster-users mailing list
>> Gluster-users@gluster.org
>> https://lists.gluster.org/mailman/listinfo/gluster-users
>
>
>
>
___
Gluster-users mailing list
Gluster-users@gluster.org
https://lists.gluster.org/mailman/listinfo/gluster-users

Re: [Gluster-users] Replica 3 - how to replace failed node (peer)

2019-04-10 Thread Karthik Subrahmanya
Hi Martin,

After you add the new disks and creating raid array, you can run the
following command to replace the old brick with new one:

- If you are going to use a different name to the new brick you can run
gluster volume replace-brick <VOLNAME> <OLD-BRICK> <NEW-BRICK> commit force

- If you are planning to use the same name for the new brick as well then
you can use
gluster volume reset-brick <VOLNAME> <OLD-BRICK> <NEW-BRICK> commit force
Here old-brick & new-brick's hostname &  path should be same.

After replacing the brick, make sure the brick comes online using volume
status.
Heal should automatically start, you can check the heal status to see all
the files get replicated to the newly added brick. If it does not start
automatically, you can manually start that by running gluster volume heal
<VOLNAME>.
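
As a concrete sketch for your setup (the volume name below is only a guess
based on the brick path, so substitute your real volume name; the new brick
path is just an example):

gluster volume replace-brick gv0imagestore node2.san:/tank/gluster/gv0imagestore/brick1 node2.san:/tank/gluster/gv0imagestore/brick1-new commit force
gluster volume status gv0imagestore
gluster volume heal gv0imagestore
gluster volume heal gv0imagestore info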

HTH,
Karthik

On Wed, Apr 10, 2019 at 3:13 PM Martin Toth  wrote:

> Hi all,
>
> I am running replica 3 gluster with 3 bricks. One of my servers failed -
> all disks are showing errors and raid is in fault state.
>
> Type: Replicate
> Volume ID: 41d5c283-3a74-4af8-a55d-924447bfa59a
> Status: Started
> Number of Bricks: 1 x 3 = 3
> Transport-type: tcp
> Bricks:
> Brick1: node1.san:/tank/gluster/gv0imagestore/brick1
> Brick2: node2.san:/tank/gluster/gv0imagestore/brick1 <— this brick is down
> Brick3: node3.san:/tank/gluster/gv0imagestore/brick1
>
> So one of my bricks is totally failed (node2). It went down and all data
> are lost (failed raid on node2). Now I am running only two bricks on 2
> servers out from 3.
> This is really critical problem for us, we can lost all data. I want to
> add new disks to node2, create new raid array on them and try to replace
> failed brick on this node.
>
> What is the procedure of replacing Brick2 on node2, can someone advice? I
> can’t find anything relevant in documentation.
>
> Thanks in advance,
> Martin
> ___
> Gluster-users mailing list
> Gluster-users@gluster.org
> https://lists.gluster.org/mailman/listinfo/gluster-users
___
Gluster-users mailing list
Gluster-users@gluster.org
https://lists.gluster.org/mailman/listinfo/gluster-users

Re: [Gluster-users] Replica 3 - how to replace failed node (peer)

2019-04-11 Thread Karthik Subrahmanya
On Thu, Apr 11, 2019 at 6:38 PM Martin Toth  wrote:

> Hi Karthik,
>
> On Thu, Apr 11, 2019 at 12:43 PM Martin Toth  wrote:
>
>> Hi Karthik,
>>
>> more over, I would like to ask if there are some recommended
>> settings/parameters for SHD in order to achieve good or fair I/O while
>> volume will be healed when I will replace Brick (this should trigger
>> healing process).
>>
> If I understand you concern correctly, you need to get fair I/O
> performance for clients while healing takes place as part of  the replace
> brick operation. For this you can turn off the "data-self-heal" and
> "metadata-self-heal" options until the heal completes on the new brick.
>
>
> This is exactly what I mean. I am running VM disks on remaining 2 (out of
> 3 - one failed as mentioned) nodes and I need to ensure there will be fair
> I/O performance available on these two nodes while replace brick operation
> will heal volume.
> I will not run any VMs on node where replace brick operation will be
> running. So if I understand correctly, when I will set :
>
> # gluster volume set <VOLNAME> cluster.data-self-heal off
> # gluster volume set <VOLNAME> cluster.metadata-self-heal off
>
> this will tell Gluster clients (libgfapi and FUSE mount) not to read from
> node “where replace brick operation” is in place but from remaing two
> healthy nodes. Is this correct ? Thanks for clarification.
>
The reads will be served from one of the good bricks since the file will
either be not present on the replaced brick at the time of read or it will
be present but marked for heal if it is not already healed. If already
healed by SHD, then it could be served from the new brick as well, but
there won't be any problem in reading from there in that scenario.
By setting these two options whenever a read comes from client it will not
try to heal the file for data/metadata. Otherwise it would try to heal (if
not already healed by SHD) when the read comes on this, hence slowing down
the client.

>
> Turning off client side healing doesn't compromise data integrity and
> consistency. During the read request from client, pending xattr is
> evaluated for replica copies and read is only served from correct copy.
> During writes, IO will continue on both the replicas, SHD will take care of
> healing files.
> After replacing the brick, we strongly recommend you to consider upgrading
> your gluster to one of the maintained versions. We have many stability
> related fixes there, which can handle some critical issues and corner cases
> which you could hit during these kind of scenarios.
>
>
> This will be first priority in infrastructure after fixing this cluster
> back to fully functional replica3. I will upgrade to 3.12.x and then to
> version 5 or 6.
>
Sounds good.

If you are planning to have the same name for the new brick and if you get
the error like "Brick may be containing or be contained by an existing
brick" even after using the force option, try  using a different name. That
should work.

Regards,
Karthik

>
> BR,
> Martin
>
> Regards,
> Karthik
>
>> I had some problems in past when healing was triggered, VM disks became
>> unresponsive because healing took most of I/O. My volume containing only
>> big files with VM disks.
>>
>> Thanks for suggestions.
>> BR,
>> Martin
>>
>> On 10 Apr 2019, at 12:38, Martin Toth  wrote:
>>
>> Thanks, this looks ok to me, I will reset brick because I don't have any
>> data anymore on failed node so I can use same path / brick name.
>>
>> Is reseting brick dangerous command? Should I be worried about some
>> possible failure that will impact remaining two nodes? I am running really
>> old 3.7.6 but stable version.
>>
>> Thanks,
>> BR!
>>
>> Martin
>>
>>
>> On 10 Apr 2019, at 12:20, Karthik Subrahmanya 
>> wrote:
>>
>> Hi Martin,
>>
>> After you add the new disks and creating raid array, you can run the
>> following command to replace the old brick with new one:
>>
>> - If you are going to use a different name to the new brick you can run
>> gluster volume replace-brick <VOLNAME> <OLD-BRICK> <NEW-BRICK> commit
>> force
>>
>> - If you are planning to use the same name for the new brick as well then
>> you can use
>> gluster volume reset-brick <VOLNAME> <OLD-BRICK> <NEW-BRICK> commit force
>> Here old-brick & new-brick's hostname &  path should be same.
>>
>> After replacing the brick, make sure the brick comes online using volume
>> status.
>> Heal should automatically start, you can check the heal status to see all
>> the files gets replicated to the newly added brick. If it does not start
>> automatically, you can m

Re: [Gluster-users] Volume stuck unable to add a brick

2019-04-16 Thread Karthik Subrahmanya
Hi Boris,

Thank you for providing the logs.
The problem here is because of the "auth.allow: 127.0.0.1" setting on the
volume.
When you try to add a new brick to the volume internally replication module
will try to set some metadata on the existing bricks to mark pending heal
on the new brick, by creating a temporary mount. Because of the auth.allow
setting that mount gets permission errors as seen in the below logs,
leading to add-brick failure.

From data-gluster-dockervols.log-webserver9 :
[2019-04-15 14:00:34.226838] I [addr.c:55:compare_addr_and_update]
0-/data/gluster/dockervols: allowed = "127.0.0.1", received addr =
"192.168.200.147"
[2019-04-15 14:00:34.226895] E [MSGID: 115004]
[authenticate.c:224:gf_authenticate] 0-auth: no authentication module is
interested in accepting remote-client (null)
[2019-04-15 14:00:34.227129] E [MSGID: 115001]
[server-handshake.c:848:server_setvolume] 0-dockervols-server: Cannot
authenticate client from
webserver8.cast.org-55674-2019/04/15-14:00:20:495333-dockervols-client-2-0-0
3.12.2 [Permission denied]

From dockervols-add-brick-mount.log :
[2019-04-15 14:00:20.672033] W [MSGID: 114043]
[client-handshake.c:1109:client_setvolume_cbk] 0-dockervols-client-2:
failed to set the volume [Permission denied]
[2019-04-15 14:00:20.672102] W [MSGID: 114007]
[client-handshake.c:1138:client_setvolume_cbk] 0-dockervols-client-2:
failed to get 'process-uuid' from reply dict [Invalid argument]
[2019-04-15 14:00:20.672129] E [MSGID: 114044]
[client-handshake.c:1144:client_setvolume_cbk] 0-dockervols-client-2:
SETVOLUME on remote-host failed: Authentication failed [Permission denied]
[2019-04-15 14:00:20.672151] I [MSGID: 114049]
[client-handshake.c:1258:client_setvolume_cbk] 0-dockervols-client-2:
sending AUTH_FAILED event

This is a known issue and we are planning to fix this. For the time being
we have a workaround for this.
- Before you try adding the brick set the auth.allow option to default
i.e., "*" or you can do this by running "gluster v reset <VOLNAME>
auth.allow"
- Add the brick
- After it succeeds set back the auth.allow option to the previous value.
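
Putting the workaround together, it would look roughly like this (the brick
below is only an example based on the host/path seen in your logs; re-run
the exact add-brick command you were using before):

gluster volume reset dockervols auth.allow
gluster volume add-brick dockervols webserver8.cast.org:/data/gluster/dockervols force
gluster volume set dockervols auth.allow 127.0.0.1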

Regards,
Karthik

On Tue, Apr 16, 2019 at 5:20 PM Boris Goldowsky  wrote:

> OK, log files attached.
>
>
>
> Boris
>
>
>
>
>
> *From: *Karthik Subrahmanya 
> *Date: *Tuesday, April 16, 2019 at 2:52 AM
> *To: *Atin Mukherjee , Boris Goldowsky <
> bgoldow...@cast.org>
> *Cc: *Gluster-users 
> *Subject: *Re: [Gluster-users] Volume stuck unable to add a brick
>
>
>
>
>
>
>
> On Mon, Apr 15, 2019 at 9:43 PM Atin Mukherjee 
> wrote:
>
> +Karthik Subrahmanya 
>
>
>
> Didn't we we fix this problem recently? Failed to set extended attribute
> indicates that temp mount is failing and we don't have quorum number of
> bricks up.
>
>
>
> We had two fixes which handles two kind of add-brick scenarios.
>
> [1] Fails add-brick when increasing the replica count if any of the brick
> is down to avoid data loss. This can be overridden by using the force
> option.
>
> [2] Allow add-brick to set the extended attributes by the temp mount if
> the volume is already mounted (has clients).
>
>
>
> They are in version 3.12.2 so, patch [1] is present there. But since they
> are using the force option it should not have any problem even if they have
> any brick down. The error message they are getting is also different, so it
> is not because of any brick being down I guess.
>
> Patch [2] is not present in 3.12.2 and it is not the conversion from plain
> distribute to replicate volume. So the scenario is different here.
>
> It seems like they are hitting some other issue.
>
>
>
> @Boris,
>
> Can you attach the add-brick's temp mount log. The file name should look
> something like "dockervols-add-brick-mount.log". Can you also provide all
> the brick logs of that volume during that time.
>
>
>
> [1] https://review.gluster.org/#/c/glusterfs/+/16330/
>
> [2] https://review.gluster.org/#/c/glusterfs/+/21791/
>
>
>
> Regards,
>
> Karthik
>
>
>
> Boris - What's the gluster version are you using?
>
>
>
>
>
>
>
> On Mon, Apr 15, 2019 at 7:35 PM Boris Goldowsky 
> wrote:
>
> Atin, thank you for the reply.  Here are all of those pieces of
> information:
>
>
>
> [bgoldowsky@webserver9 ~]$ gluster --version
>
> glusterfs 3.12.2
>
> (same on all nodes)
>
>
>
> [bgoldowsky@webserver9 ~]$ sudo gluster peer status
>
> Number of Peers: 3
>
>
>
> Hostname: webserver11.cast.org
>
> Uuid: c2b147fd-cab4-4859-9922-db5730f8549d
>
> State: Peer in Cluster (Connected)
>
>
>
> Hostname: webserver1.cast.org
>
> Uuid: 4b918f65-2c9d-478e-8648-81d1d6526d4c
&g

Re: [Gluster-users] Volume stuck unable to add a brick

2019-04-16 Thread Karthik Subrahmanya
You're welcome!

On Tue 16 Apr, 2019, 7:12 PM Boris Goldowsky,  wrote:

> That worked!  Thank you SO much!
>
>
>
> Boris
>
>
>
>
>
> *From: *Karthik Subrahmanya 
> *Date: *Tuesday, April 16, 2019 at 8:20 AM
> *To: *Boris Goldowsky 
> *Cc: *Atin Mukherjee , Gluster-users <
> gluster-users@gluster.org>
> *Subject: *Re: [Gluster-users] Volume stuck unable to add a brick
>
>
>
> Hi Boris,
>
>
>
> Thank you for providing the logs.
>
> The problem here is because of the "auth.allow: 127.0.0.1" setting on the
> volume.
>
> When you try to add a new brick to the volume internally replication
> module will try to set some metadata on the existing bricks to mark pending
> heal on the new brick, by creating a temporary mount. Because of the
> auth.allow setting that mount gets permission errors as seen in the below
> logs, leading to add-brick failure.
>
>
>
> From data-gluster-dockervols.log-webserver9 :
>
> [2019-04-15 14:00:34.226838] I [addr.c:55:compare_addr_and_update]
> 0-/data/gluster/dockervols: allowed = "127.0.0.1", received addr =
> "192.168.200.147"
>
> [2019-04-15 14:00:34.226895] E [MSGID: 115004]
> [authenticate.c:224:gf_authenticate] 0-auth: no authentication module is
> interested in accepting remote-client (null)
>
> [2019-04-15 14:00:34.227129] E [MSGID: 115001]
> [server-handshake.c:848:server_setvolume] 0-dockervols-server: Cannot
> authenticate client from
> webserver8.cast.org-55674-2019/04/15-14:00:20:495333-dockervols-client-2-0-0
> 3.12.2 [Permission denied]
>
>
>
> From dockervols-add-brick-mount.log :
>
> [2019-04-15 14:00:20.672033] W [MSGID: 114043]
> [client-handshake.c:1109:client_setvolume_cbk] 0-dockervols-client-2:
> failed to set the volume [Permission denied]
>
> [2019-04-15 14:00:20.672102] W [MSGID: 114007]
> [client-handshake.c:1138:client_setvolume_cbk] 0-dockervols-client-2:
> failed to get 'process-uuid' from reply dict [Invalid argument]
>
> [2019-04-15 14:00:20.672129] E [MSGID: 114044]
> [client-handshake.c:1144:client_setvolume_cbk] 0-dockervols-client-2:
> SETVOLUME on remote-host failed: Authentication failed [Permission denied]
>
> [2019-04-15 14:00:20.672151] I [MSGID: 114049]
> [client-handshake.c:1258:client_setvolume_cbk] 0-dockervols-client-2:
> sending AUTH_FAILED event
>
>
>
> This is a known issue and we are planning to fix this. For the time being
> we have a workaround for this.
>
> - Before you try adding the brick, set the auth.allow option to its default,
> i.e. "*", or do this by running "gluster v reset <volname>
> auth.allow"
>
> - Add the brick
>
> - After it succeeds, set the auth.allow option back to the previous value
> (a concrete sketch of these steps follows just below).
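Putting those steps together with the volume name from this thread (the
add-brick line is only a guess at the command being attempted, namely growing
dockervols from replica 3 to replica 4 with webserver8's brick, so adjust it
to your actual command):

   gluster volume reset dockervols auth.allow    # or: gluster volume set dockervols auth.allow '*'
   gluster volume add-brick dockervols replica 4 webserver8:/data/gluster/dockervols force
   gluster volume set dockervols auth.allow 127.0.0.1   # restore the previous value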
>
>
>
> Regards,
>
> Karthik
>
>
>
> On Tue, Apr 16, 2019 at 5:20 PM Boris Goldowsky 
> wrote:
>
> OK, log files attached.
>
>
>
> Boris
>
>
>
>
>
> *From: *Karthik Subrahmanya 
> *Date: *Tuesday, April 16, 2019 at 2:52 AM
> *To: *Atin Mukherjee , Boris Goldowsky <
> bgoldow...@cast.org>
> *Cc: *Gluster-users 
> *Subject: *Re: [Gluster-users] Volume stuck unable to add a brick
>
>
>
>
>
>
>
> On Mon, Apr 15, 2019 at 9:43 PM Atin Mukherjee 
> wrote:
>
> +Karthik Subrahmanya 
>
>
>
> Didn't we fix this problem recently? Failed to set extended attribute
> indicates that temp mount is failing and we don't have quorum number of
> bricks up.
>
>
>
> We had two fixes which handle two kinds of add-brick scenarios.
>
> [1] Fails add-brick when increasing the replica count if any of the brick
> is down to avoid data loss. This can be overridden by using the force
> option.
>
> [2] Allow add-brick to set the extended attributes by the temp mount if
> the volume is already mounted (has clients).
>
>
>
> They are in version 3.12.2 so, patch [1] is present there. But since they
> are using the force option it should not have any problem even if they have
> any brick down. The error message they are getting is also different, so it
> is not because of any brick being down I guess.
>
> Patch [2] is not present in 3.12.2 and it is not the conversion from plain
> distribute to replicate volume. So the scenario is different here.
>
> It seems like they are hitting some other issue.
>
>
>
> @Boris,
>
> Can you attach the add-brick's temp mount log. The file name should look
> something like "dockervols-add-brick-mount.log". Can you also provide all
> the brick logs of that volume during that time.
>
>
>
> [1] https://review.gluster.org/#/c/glusterfs/+/16330/
>
[2] https://review.gluster.org/#/c/glusterfs/+/21791/

Re: [Gluster-users] adding thin arbiter

2019-04-22 Thread Karthik Subrahmanya
Hi,

Currently we do not have support for converting an existing volume to a
thin-arbiter volume. It is also not supported to replace the thin-arbiter
brick with a new one.
You can create a fresh thin-arbiter volume using the GD2 framework and play
around with that. Feel free to share your experience with thin-arbiter.
The GD1 CLIs are being implemented; we will keep this list posted as and
when they are ready to use.

Regards,
Karthik

On Fri, Apr 19, 2019 at 8:39 PM  wrote:

> Hi guys,
>
> On an existing volume, I have a volume with 3 replica. One of them is an
> arbiter. Is there a way to change the arbiter to a thin-arbiter? I tried
> removing the arbiter brick and adding it back, but the add-brick command
> doesn't take the --thin-arbiter option.
>
> xpk
> ___
> Gluster-users mailing list
> Gluster-users@gluster.org
> https://lists.gluster.org/mailman/listinfo/gluster-users
___
Gluster-users mailing list
Gluster-users@gluster.org
https://lists.gluster.org/mailman/listinfo/gluster-users

Re: [Gluster-users] [External] Replica 3 - how to replace failed node (peer)

2019-04-10 Thread Karthik Subrahmanya
Hi Martin,

The reset-brick command was introduced in 3.9.0 and is not present in 3.7.6.
You can use the same replace-brick command with the force option even
if you want to keep the same name for the brick being replaced.
3.7.6 was EOLed long ago; glusterfs-6 is the latest version, with lots of
improvements, bug fixes and new features. The release schedule can be found
at [1]. Upgrading to one of the maintained branches is highly recommended.

On Wed, Apr 10, 2019 at 4:14 PM Martin Toth  wrote:

> I’ve read this documentation but step 4 is really unclear to me. I don’t
> understand related mkdir/rmdir/setfattr and so on.
>
> Step 4:
>
> *Using the gluster volume fuse mount (In this example: /mnt/r2) set up
> metadata so that data will be synced to new brick (In this case it is
> from Server1:/home/gfs/r2_1 to Server1:/home/gfs/r2_5)*
>
> Why should I change trusted.non-existent-key on this volume?
> It is even more confusing because other howtos mentioned do not contain
> this step at all.
>
Those steps were needed in older releases to set some metadata on the
good bricks so that heal does not happen from the replaced brick to the good
bricks, which could lead to data loss. Since you are on 3.7.6, all these
steps are automated for you in that branch. You just need to run the
replace-brick command, which will take care of all of that.
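For example, a sketch with the brick layout from your volume (the volume name
is not shown in this thread and brick1_new is an assumed path for the new
raid-backed brick, so substitute your own values):

   gluster volume replace-brick <volname> \
       node2.san:/tank/gluster/gv0imagestore/brick1 \
       node2.san:/tank/gluster/gv0imagestore/brick1_new \
       commit force
   # self-heal then syncs the data from the good bricks; watch progress with
   gluster volume heal <volname> info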

[1] https://www.gluster.org/release-schedule/

Regards,
Karthik

>
> BR,
> Martin
>
> On 10 Apr 2019, at 11:54, Davide Obbi  wrote:
>
>
> https://docs.gluster.org/en/v3/Administrator%20Guide/Managing%20Volumes/#replace-faulty-brick
>
> On Wed, Apr 10, 2019 at 11:42 AM Martin Toth  wrote:
>
>> Hi all,
>>
>> I am running replica 3 gluster with 3 bricks. One of my servers failed -
>> all disks are showing errors and raid is in fault state.
>>
>> Type: Replicate
>> Volume ID: 41d5c283-3a74-4af8-a55d-924447bfa59a
>> Status: Started
>> Number of Bricks: 1 x 3 = 3
>> Transport-type: tcp
>> Bricks:
>> Brick1: node1.san:/tank/gluster/gv0imagestore/brick1
>> Brick2: node2.san:/tank/gluster/gv0imagestore/brick1 <— this brick is down
>> Brick3: node3.san:/tank/gluster/gv0imagestore/brick1
>>
>> So one of my bricks is totally failed (node2). It went down and all data
>> are lost (failed raid on node2). Now I am running only two bricks on 2
>> servers out from 3.
>> This is really critical problem for us, we can lost all data. I want to
>> add new disks to node2, create new raid array on them and try to replace
>> failed brick on this node.
>>
>> What is the procedure of replacing Brick2 on node2, can someone advice? I
>> can’t find anything relevant in documentation.
>>
>> Thanks in advance,
>> Martin
>> ___
>> Gluster-users mailing list
>> Gluster-users@gluster.org
>> https://lists.gluster.org/mailman/listinfo/gluster-users
>
>
>
> --
> Davide Obbi
> Senior System Administrator
>
> Booking.com B.V.
> Vijzelstraat 66-80 Amsterdam 1017HL Netherlands
> Direct +31207031558
> [image: Booking.com] 
> Empowering people to experience the world since 1996
> 43 languages, 214+ offices worldwide, 141,000+ global destinations, 29
> million reported listings
> Subsidiary of Booking Holdings Inc. (NASDAQ: BKNG)
>
>
> ___
> Gluster-users mailing list
> Gluster-users@gluster.org
> https://lists.gluster.org/mailman/listinfo/gluster-users
___
Gluster-users mailing list
Gluster-users@gluster.org
https://lists.gluster.org/mailman/listinfo/gluster-users

Re: [Gluster-users] Volume stuck unable to add a brick

2019-04-16 Thread Karthik Subrahmanya
On Mon, Apr 15, 2019 at 9:43 PM Atin Mukherjee 
wrote:

> +Karthik Subrahmanya 
>
> Didn't we fix this problem recently? Failed to set extended attribute
> indicates that temp mount is failing and we don't have quorum number of
> bricks up.
>

We had two fixes which handle two kinds of add-brick scenarios.
[1] Fails add-brick when increasing the replica count if any of the bricks
is down, to avoid data loss. This can be overridden by using the force
option.
[2] Allows add-brick to set the extended attributes via the temp mount if the
volume is already mounted (has clients).
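For context, the kind of add-brick that patch [1] guards against is one that
raises the replica count; the line below is only a guess at what is being
attempted here, based on the volume layouts shown later in this thread:

   # fails if any existing brick is down, unless 'force' is appended
   gluster volume add-brick dockervols replica 4 webserver8:/data/gluster/dockervols force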

They are on version 3.12.2, so patch [1] is present there. But since they
are using the force option, there should not be any problem even if they have
a brick down. The error message they are getting is also different, so I
guess it is not because of any brick being down.
Patch [2] is not present in 3.12.2, but this is not a conversion from plain
distribute to a replicate volume, so that scenario does not apply either.
It seems like they are hitting some other issue.

@Boris,
Can you attach the add-brick's temp mount log. The file name should look
something like "dockervols-add-brick-mount.log". Can you also provide all
the brick logs of that volume during that time.

[1] https://review.gluster.org/#/c/glusterfs/+/16330/
[2] https://review.gluster.org/#/c/glusterfs/+/21791/

Regards,
Karthik

>
> Boris - What gluster version are you using?
>
>
>
> On Mon, Apr 15, 2019 at 7:35 PM Boris Goldowsky 
> wrote:
>
>> Atin, thank you for the reply.  Here are all of those pieces of
>> information:
>>
>>
>>
>> [bgoldowsky@webserver9 ~]$ gluster --version
>>
>> glusterfs 3.12.2
>>
>> (same on all nodes)
>>
>>
>>
>> [bgoldowsky@webserver9 ~]$ sudo gluster peer status
>>
>> Number of Peers: 3
>>
>>
>>
>> Hostname: webserver11.cast.org
>>
>> Uuid: c2b147fd-cab4-4859-9922-db5730f8549d
>>
>> State: Peer in Cluster (Connected)
>>
>>
>>
>> Hostname: webserver1.cast.org
>>
>> Uuid: 4b918f65-2c9d-478e-8648-81d1d6526d4c
>>
>> State: Peer in Cluster (Connected)
>>
>> Other names:
>>
>> 192.168.200.131
>>
>> webserver1
>>
>>
>>
>> Hostname: webserver8.cast.org
>>
>> Uuid: be2f568b-61c5-4016-9264-083e4e6453a2
>>
>> State: Peer in Cluster (Connected)
>>
>> Other names:
>>
>> webserver8
>>
>>
>>
>> [bgoldowsky@webserver1 ~]$ sudo gluster v info
>>
>> Volume Name: dockervols
>>
>> Type: Replicate
>>
>> Volume ID: 6093a9c6-ec6c-463a-ad25-8c3e3305b98a
>>
>> Status: Started
>>
>> Snapshot Count: 0
>>
>> Number of Bricks: 1 x 3 = 3
>>
>> Transport-type: tcp
>>
>> Bricks:
>>
>> Brick1: webserver1:/data/gluster/dockervols
>>
>> Brick2: webserver11:/data/gluster/dockervols
>>
>> Brick3: webserver9:/data/gluster/dockervols
>>
>> Options Reconfigured:
>>
>> nfs.disable: on
>>
>> transport.address-family: inet
>>
>> auth.allow: 127.0.0.1
>>
>>
>>
>> Volume Name: testvol
>>
>> Type: Replicate
>>
>> Volume ID: 4d5f00f5-00ea-4dcf-babf-1a76eca55332
>>
>> Status: Started
>>
>> Snapshot Count: 0
>>
>> Number of Bricks: 1 x 4 = 4
>>
>> Transport-type: tcp
>>
>> Bricks:
>>
>> Brick1: webserver1:/data/gluster/testvol
>>
>> Brick2: webserver9:/data/gluster/testvol
>>
>> Brick3: webserver11:/data/gluster/testvol
>>
>> Brick4: webserver8:/data/gluster/testvol
>>
>> Options Reconfigured:
>>
>> transport.address-family: inet
>>
>> nfs.disable: on
>>
>>
>>
>> [bgoldowsky@webserver8 ~]$ sudo gluster v info
>>
>> Volume Name: dockervols
>>
>> Type: Replicate
>>
>> Volume ID: 6093a9c6-ec6c-463a-ad25-8c3e3305b98a
>>
>> Status: Started
>>
>> Snapshot Count: 0
>>
>> Number of Bricks: 1 x 3 = 3
>>
>> Transport-type: tcp
>>
>> Bricks:
>>
>> Brick1: webserver1:/data/gluster/dockervols
>>
>> Brick2: webserver11:/data/gluster/dockervols
>>
>> Brick3: webserver9:/data/gluster/dockervols
>>
>> Options Reconfigured:
>>
>> nfs.disable: on
>>
>> transport.address-family: inet
>>
>> auth.allow: 127.0.0.1
>>
>>
>>
>> Volume Name: testvol
>>
>> Type: Replicate
>>
>> Volume ID: 4d5f00f5-00ea-4dcf-babf-1a76eca55332

Re: [Gluster-users] Healing completely loss file on replica 3 volume

2019-12-01 Thread Karthik Subrahmanya
Hi Dmitry,

Answers inline.

On Fri, Nov 29, 2019 at 6:26 PM Dmitry Antipov  wrote:

> I'm trying to manually garbage data on bricks

First of all, changing data directly on the backend is neither recommended
nor supported. All operations need to be done from the client mount point.
Only a few special cases require changing anything about a file directly on
the backend.

> (when the volume is
> stopped) and then check whether healing is possible. For example:
>
> Start:
>
> # glusterd --debug
>
> Bricks (on EXT4 mounted with 'rw,realtime'):
>
> # mkdir /root/data0
> # mkdir /root/data1
> # mkdir /root/data2
>
> Volume:
>
> # gluster volume create gv0 replica 3 [local-ip]:/root/data0
> [local-ip]:/root/data1  [local-ip]:/root/data2 force
> volume create: gv0: success: please start the volume to access data
> # gluster volume start gv0
> volume start: gv0: success
>
> Mount:
>
> # mkdir /mnt/gv0
> # mount -t glusterfs [local-ip]:/gv0 /mnt/gv0
> WARNING: getfattr not found, certain checks will be skipped..
>
> Create file:
>
> # openssl rand 65536 > /mnt/gv0/64K
> # md5sum /mnt/gv0/64K
> ca53c9c1b6cd78f59a91cd1b0b866ed9 /mnt/gv0/64K
>
> Umount and down the volume:
>
> # umount /mnt/gv0
> # gluster volume stop gv0
> Stopping volume will make its data inaccessible. Do you want to continue?
> (y/n) y
> volume stop: gv0: success
>
> Check data on bricks:
>
> # md5sum /root/data[012]/64K
> ca53c9c1b6cd78f59a91cd1b0b866ed9  /root/data0/64K
> ca53c9c1b6cd78f59a91cd1b0b866ed9  /root/data1/64K
> ca53c9c1b6cd78f59a91cd1b0b866ed9  /root/data2/64K
>
> Seems OK. Then garbage all:
>
> # openssl rand 65536 > /root/data0/64K
> # openssl rand 65536 > /root/data1/64K
> # openssl rand 65536 > /root/data2/64K
> # md5sum /root/data[012]/64K
> c69096d15007578dab95d9940f89e167  /root/data0/64K
> b85292fb60f1a1d27f1b0e3bc6bfdfae  /root/data1/64K
> c2e90335cc2f600ddab5c53a992b2bb6  /root/data2/64K
>
> Restart the volume and start full heal:
>
> # gluster volume start gv0
> volume start: gv0: success
> # /usr/glusterfs/sbin/gluster volume heal gv0 full
> Launching heal operation to perform full self heal on volume gv0 has been
> successful
> Use heal info commands to check status.
>
> Finally:
>
> # gluster volume heal gv0 info summary
>
> Brick [local-ip]:/root/data0
> Status: Connected
> Total Number of entries: 0
> Number of entries in heal pending: 0
> Number of entries in split-brain: 0
> Number of entries possibly healing: 0
>
> Brick [local-ip]:/root/data1
> Status: Connected
> Total Number of entries: 0
> Number of entries in heal pending: 0
> Number of entries in split-brain: 0
> Number of entries possibly healing: 0
>
> Brick [local-ip]:/root/data2
> Status: Connected
> Total Number of entries: 0
> Number of entries in heal pending: 0
> Number of entries in split-brain: 0
> Number of entries possibly healing: 0
>
> Since all 3 copies are different from each other, majority voting is
> useless
> and data (IIUC) should be marked as split-brain at least. But I'm seeing
> just
> zeroes everywhere above. Why it is so?
>
Since the data was changed directly on the backend, gluster is not aware of
these changes. If changes done from the client mount fail on some bricks,
only those will be recognized and marked by gluster so that it can heal them
when possible. Since this is a replica 3 volume, if you end up in split-brain
while doing the operations on the mount point, then that would be a bug. As
far as this scenario is concerned, it is not a bug or an issue on the gluster
side.
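As an illustration of what gluster does track (a sketch using the brick paths
from your test): when a write from the mount fails on one brick, the
surviving bricks get non-zero trusted.afr.<volname>-client-* changelog xattrs
pointing at the bad copy, while edits made directly on the backend leave
those xattrs untouched, so self-heal and split-brain detection have nothing
to act on. You can check this with:

   # run as root on the brick host; look at the trusted.afr.* lines
   getfattr -d -m . -e hex /root/data0/64K
   getfattr -d -m . -e hex /root/data1/64K
   getfattr -d -m . -e hex /root/data2/64K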

HTH,
Karthik

>
> Thanks in advance,
> Dmitry
> 
>
> Community Meeting Calendar:
>
> APAC Schedule -
> Every 2nd and 4th Tuesday at 11:30 AM IST
> Bridge: https://bluejeans.com/441850968
>
> NA/EMEA Schedule -
> Every 1st and 3rd Tuesday at 01:00 PM EDT
> Bridge: https://bluejeans.com/441850968
>
> Gluster-users mailing list
> Gluster-users@gluster.org
> https://lists.gluster.org/mailman/listinfo/gluster-users
>
>


Community Meeting Calendar:

APAC Schedule -
Every 2nd and 4th Tuesday at 11:30 AM IST
Bridge: https://bluejeans.com/441850968

NA/EMEA Schedule -
Every 1st and 3rd Tuesday at 01:00 PM EDT
Bridge: https://bluejeans.com/441850968

Gluster-users mailing list
Gluster-users@gluster.org
https://lists.gluster.org/mailman/listinfo/gluster-users


Re: [Gluster-users] Healing entries get healed but there are constantly new entries appearing

2020-02-10 Thread Karthik Subrahmanya
Hi Ulrich,

Thank you for letting us know. Glad to hear that your system is back to
normal.

Regards,
Karthik

On Mon, Feb 10, 2020 at 9:51 PM Ulrich Pötter 
wrote:

> Hello Karthik,
>
> thank you very much. That was exactly the problem.
> Running the command (cat
> /.meta/graphs/active/-client-*/private | egrep -i
> 'connected') on the clients revealed that a few were not connected to all
> bricks.
> After restarting them, everything went back to normal.
>
> Regards,
> Ulrich
> Am 06.02.20 um 12:51 schrieb Karthik Subrahmanya:
>
> Hi Ulrich,
>
> From the problem statement, seems like the client(s) have lost connection
> with brick. Can you give the following information?
> - How many clients are there for this volume and which version they are in?
> - gluster volume info <volname> & gluster volume status <volname> outputs
> - Check whether all the clients are connected to all the bricks.
> If you are using the fuse clients give the output of the following from
> all the clients
> cat /.meta/graphs/active/-client-*/private | egrep
> -i 'connected'
> -If you are using non fuse clients generate the statedumps (
> https://docs.gluster.org/en/latest/Troubleshooting/statedump/) of each
> clients and give the output of
> grep -A 2 "xlator.protocol.client" /var/run/gluster/
> (If you have changed the statedump-path replace the path in the above
> command)
>
> Regards,
> Karthik
>
> On Thu, Feb 6, 2020 at 5:06 PM Ulrich Pötter 
> wrote:
>
>> Dear Gluster Users,
>>
>> we are running the following Gluster setup:
>> Replica 3 on 3 servers. Two are CentOs 7.6 with Gluster 6.5 and one was
>> upgraded to Centos 7.7 with Gluster 6.7.
>>
>> Since the upgrade to gluster 6.7 on one of the servers, we encountered
>> the following issue:
>> New healing entries appear and get healed, but soon afterwards new
>> healing entries appear.
>> The abovementioned problem started after we upgraded the server.
>> The healing issues do not only appear on the upgraded server, but on all
>> three.
>>
>> This does not seem to be a split brain issue as the output of the
>> command "gluster volume head  info split-brain" is "number of
>> entries in split-brain: 0"
>>
>> Has anyone else observed such behavior with different Gluster versions
>> in one replica setup?
>>
>> We hesitate with updating the other nodes, as we do not know if this
>> standard Gluster behaviour or if there is more to this problem.
>>
>> Can you help us?
>>
>> Thanks in advance,
>> Ulrich
>>
>> 
>>
>> Community Meeting Calendar:
>>
>> APAC Schedule -
>> Every 2nd and 4th Tuesday at 11:30 AM IST
>> Bridge: https://bluejeans.com/441850968
>>
>> NA/EMEA Schedule -
>> Every 1st and 3rd Tuesday at 01:00 PM EDT
>> Bridge: https://bluejeans.com/441850968
>>
>> Gluster-users mailing list
>> Gluster-users@gluster.org
>> https://lists.gluster.org/mailman/listinfo/gluster-users
>>
>>
>


Community Meeting Calendar:

APAC Schedule -
Every 2nd and 4th Tuesday at 11:30 AM IST
Bridge: https://bluejeans.com/441850968

NA/EMEA Schedule -
Every 1st and 3rd Tuesday at 01:00 PM EDT
Bridge: https://bluejeans.com/441850968

Gluster-users mailing list
Gluster-users@gluster.org
https://lists.gluster.org/mailman/listinfo/gluster-users


Re: [Gluster-users] Healing entries get healed but there are constantly new entries appearing

2020-02-06 Thread Karthik Subrahmanya
Hi Ulrich,

>From the problem statement, seems like the client(s) have lost connection
with brick. Can you give the following information?
- How many clients are there for this volume and which version they are in?
- gluster volume info <volname> & gluster volume status <volname> outputs
- Check whether all the clients are connected to all the bricks.
If you are using the fuse clients give the output of the following from all
the clients
cat <mount-point>/.meta/graphs/active/<volname>-client-*/private | egrep -i
'connected'
-If you are using non fuse clients generate the statedumps (
https://docs.gluster.org/en/latest/Troubleshooting/statedump/) of each
client and give the output of
grep -A 2 "xlator.protocol.client" /var/run/gluster/<statedump-file>
(If you have changed the statedump-path replace the path in the above
command)

Regards,
Karthik

On Thu, Feb 6, 2020 at 5:06 PM Ulrich Pötter 
wrote:

> Dear Gluster Users,
>
> we are running the following Gluster setup:
> Replica 3 on 3 servers. Two are CentOs 7.6 with Gluster 6.5 and one was
> upgraded to Centos 7.7 with Gluster 6.7.
>
> Since the upgrade to gluster 6.7 on one of the servers, we encountered
> the following issue:
> New healing entries appear and get healed, but soon afterwards new
> healing entries appear.
> The abovementioned problem started after we upgraded the server.
> The healing issues do not only appear on the upgraded server, but on all
> three.
>
> This does not seem to be a split brain issue as the output of the
> command "gluster volume head  info split-brain" is "number of
> entries in split-brain: 0"
>
> Has anyone else observed such behavior with different Gluster versions
> in one replica setup?
>
> We hesitate with updating the other nodes, as we do not know if this
> standard Gluster behaviour or if there is more to this problem.
>
> Can you help us?
>
> Thanks in advance,
> Ulrich
>
> 
>
> Community Meeting Calendar:
>
> APAC Schedule -
> Every 2nd and 4th Tuesday at 11:30 AM IST
> Bridge: https://bluejeans.com/441850968
>
> NA/EMEA Schedule -
> Every 1st and 3rd Tuesday at 01:00 PM EDT
> Bridge: https://bluejeans.com/441850968
>
> Gluster-users mailing list
> Gluster-users@gluster.org
> https://lists.gluster.org/mailman/listinfo/gluster-users
>
>


Community Meeting Calendar:

APAC Schedule -
Every 2nd and 4th Tuesday at 11:30 AM IST
Bridge: https://bluejeans.com/441850968

NA/EMEA Schedule -
Every 1st and 3rd Tuesday at 01:00 PM EDT
Bridge: https://bluejeans.com/441850968

Gluster-users mailing list
Gluster-users@gluster.org
https://lists.gluster.org/mailman/listinfo/gluster-users


Re: [Gluster-users] Gluster Heal Issue

2020-02-04 Thread Karthik Subrahmanya
Hi Chris,

Looking at the data provided (I hope the other entry is also a file and
not the parent of the file for which the stat & getfattr outputs are given),
it seems like the parent(s) of these entries are missing the entry-pending
markers on the good bricks, which are necessary to create these files on the
bad node. Can you try the following steps and let us know whether you have
any luck with this?

- Find the actual path of the files on one of the bricks where they exist,
using the below command:
find <brick-path> -samefile <brick-path>/.glusterfs/<gfid[0:2]>/<gfid[2:4]>/<gfid>
- Run a lookup on the files from a client mount point
- Run gluster volume heal <volname>
- Check the heal info to see whether these files get healed or not
(a concrete sketch of these steps follows below)
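A sketch of those steps with the gfid and brick path from your output (the
fuse mount path is a placeholder, use wherever the volume is mounted):

   # on node01 or node03, resolve the gfid to its real path on the brick
   find /gluster_bricks/ssd_storage/ssd_storage -samefile \
       /gluster_bricks/ssd_storage/ssd_storage/.glusterfs/a1/21/a121e4fb-0984-4e41-94d7-8f0c4f87f4b6
   # trigger a lookup from a client mount on the path found above
   stat <fuse-mount>/<relative-path-returned-by-find>
   # trigger and monitor the heal
   gluster volume heal ssd_storage
   gluster volume heal ssd_storage info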

Regards,
Karthik

On Sat, Feb 1, 2020 at 2:25 PM Christian Reiss 
wrote:

> Hey folks,
>
> in our production setup with 3 nodes (HCI) we took one host down
> (maintenance, stop gluster, poweroff via ssh/ovirt engine). Once it was
> up the gluster hat 2k healing entries that went down in a matter on 10
> minutes to 2.
>
> Those two give me a headache:
>
> [root@node03:~] # gluster vol heal ssd_storage info
> Brick node01:/gluster_bricks/ssd_storage/ssd_storage
> 
> 
> Status: Connected
> Number of entries: 2
>
> Brick node02:/gluster_bricks/ssd_storage/ssd_storage
> Status: Connected
> Number of entries: 0
>
> Brick node03:/gluster_bricks/ssd_storage/ssd_storage
> 
> 
> Status: Connected
> Number of entries: 2
>
> No paths, only gfid. We took down node2, so it does not have the file:
>
> [root@node01:~] # md5sum
>
> /gluster_bricks/ssd_storage/ssd_storage/.glusterfs/a1/21/a121e4fb-0984-4e41-94d7-8f0c4f87f4b6
> 75c4941683b7eabc223fc9d5f022a77c
>
> /gluster_bricks/ssd_storage/ssd_storage/.glusterfs/a1/21/a121e4fb-0984-4e41-94d7-8f0c4f87f4b6
>
> [root@node02:~] # md5sum
>
> /gluster_bricks/ssd_storage/ssd_storage/.glusterfs/a1/21/a121e4fb-0984-4e41-94d7-8f0c4f87f4b6
> md5sum:
> /gluster_bricks/ssd_storage/ssd_storage/.glusterfs/a1/21/a121e4fb-0984-4e41-94d7-8f0c4f87f4b6:
>
> No such file or directory
>
> [root@node03:~] # md5sum
>
> /gluster_bricks/ssd_storage/ssd_storage/.glusterfs/a1/21/a121e4fb-0984-4e41-94d7-8f0c4f87f4b6
> 75c4941683b7eabc223fc9d5f022a77c
>
> /gluster_bricks/ssd_storage/ssd_storage/.glusterfs/a1/21/a121e4fb-0984-4e41-94d7-8f0c4f87f4b6
>
> The other two files are md5-identical.
>
> These flags are identical, too:
>
> [root@node01:~] # getfattr -d -m . -e hex
>
> /gluster_bricks/ssd_storage/ssd_storage/.glusterfs/a1/21/a121e4fb-0984-4e41-94d7-8f0c4f87f4b6
> getfattr: Removing leading '/' from absolute path names
> # file:
>
> gluster_bricks/ssd_storage/ssd_storage/.glusterfs/a1/21/a121e4fb-0984-4e41-94d7-8f0c4f87f4b6
>
> security.selinux=0x73797374656d5f753a6f626a6563745f723a676c7573746572645f627269636b5f743a733000
> trusted.afr.dirty=0x
> trusted.afr.ssd_storage-client-1=0x004f0001
> trusted.gfid=0xa121e4fb09844e4194d78f0c4f87f4b6
>
> trusted.gfid2path.d4cf876a215b173f=0x62653331383633382d653861302d346336642d393737642d3761393337616138343830362f38366461303238392d663734662d343230302d393238342d3637386537626437363139352e31323030
>
> trusted.glusterfs.mdata=0x015e349b1e1139aa2a5e349b1e1139aa2a5e349949304a5eb2
>
> getfattr: Removing leading '/' from absolute path names
> # file:
>
> gluster_bricks/ssd_storage/ssd_storage/.glusterfs/a1/21/a121e4fb-0984-4e41-94d7-8f0c4f87f4b6
>
> security.selinux=0x73797374656d5f753a6f626a6563745f723a676c7573746572645f627269636b5f743a733000
> trusted.afr.dirty=0x
> trusted.afr.ssd_storage-client-1=0x004f0001
> trusted.gfid=0xa121e4fb09844e4194d78f0c4f87f4b6
>
> trusted.gfid2path.d4cf876a215b173f=0x62653331383633382d653861302d346336642d393737642d3761393337616138343830362f38366461303238392d663734662d343230302d393238342d3637386537626437363139352e31323030
>
> trusted.glusterfs.mdata=0x015e349b1e1139aa2a5e349b1e1139aa2a5e349949304a5eb2
>
>
> The only thing I can see is the different change times, really:
>
> [root@node01:~] # stat
>
> /gluster_bricks/ssd_storage/ssd_storage/.glusterfs/a1/21/a121e4fb-0984-4e41-94d7-8f0c4f87f4b6
>File:
>
> ‘/gluster_bricks/ssd_storage/ssd_storage/.glusterfs/a1/21/a121e4fb-0984-4e41-94d7-8f0c4f87f4b6’
>Size: 67108864   Blocks: 54576  IO Block: 4096   regular file
> Device: fd09h/64777dInode: 16152829909  Links: 2
> Access: (0660/-rw-rw)  Uid: (0/root)   Gid: (0/root)
> Context: system_u:object_r:glusterd_brick_t:s0
> Access: 2020-01-31 22:16:57.812620635 +0100
> Modify: 2020-02-01 07:19:24.183045141 +0100
> Change: 2020-02-01 07:19:24.186045203 +0100
>   Birth: -
>
> [root@node03:~] # stat
>
> /gluster_bricks/ssd_storage/ssd_storage/.glusterfs/a1/21/a121e4fb-0984-4e41-94d7-8f0c4f87f4b6
>File:
>
> ‘/gluster_bricks/ssd_storage/ssd_storage/.glusterfs/a1/21/a121e4fb-0984-4e41-94d7-8f0c4f87f4b6’
>Size: 67108864   Blocks: 54576  IO 

Re: [Gluster-users] set: failed: Quorum not met. Volume operation not allowed. SUCCESS

2020-08-27 Thread Karthik Subrahmanya
Hi,

You had server-quorum enabled, which could be the cause of the errors
you were getting in the first place. In the latest releases only
client-quorum is enabled and server-quorum is disabled by default.
Yes, the order matters in such cases.

Regards,
Karthik

On Fri, Aug 28, 2020 at 2:37 AM WK  wrote:
>
> So success!
>
> I don't know why, but when I set "server-quorum-type" to none FIRST it
> seemed to work without complaining about quorum.
>
> then quorum-type was able to be set to none as well
>
>gluster volume set VOL cluster.server-quorum-type none
>gluster volume set VOL cluster.quorum-type none
>
> Finally I used Karthik's remove-brick command and it worked this time
> and I am now copying off the needed image.
>
> So I guess order counts.
>
> Thanks.
>
> -wk
>
>
>
> On 8/27/2020 12:47 PM, WK wrote:
> > No Luck.  Same problem.
> >
> > I stopped the volume.
> >
> > I ran the remove-brick command. It warned about not being able to
> > migrate files from removed bricks and asked if I want to continue.
> >
> > when I say 'yes'
> >
> > Gluster responds with 'failed: Quorum not met Volume operation not
> > allowed'
> >
> >
> > -wk
> >
> > On 8/26/2020 9:28 PM, Karthik Subrahmanya wrote:
> >> Hi,
> >>
> >> Since your two nodes are scrapped and there is no chance that they
> >> will come back in later time, you can try reducing the replica count
> >> to 1 by removing the down bricks from the volume and then mounting the
> >> volume back to access the data which is available on the only up
> >> brick.
> >> The remove brick command looks like this:
> >>
> >> gluster volume remove-brick VOLNAME replica 1
> >> <hostname1>:/brick-path
> >> <hostname2>:/brick-path force
> >>
> >> Regards,
> >> Karthik
> >>
> >>
> >> On Thu, Aug 27, 2020 at 4:24 AM WK  wrote:
> >>> So we migrated a number of VMs from a small Gluster 2+1A volume to a
> >>> newer cluster.
> >>>
> >>> Then a few days later the client said he wanted an old forgotten
> >>> file that had been left behind on the the deprecated system.
> >>>
> >>> However the arbiter and one of the brick nodes had been scraped,
> >>> leaving only a single gluster node.
> >>>
> >>> The volume I need uses shards so I am not excited about having to
> >>> piece it back together.
> >>>
> >>> I powered it up the single node and tried to mount the volume and of
> >>> course it refused to mount due to quorum and gluster volume status
> >>> shows the volume offline
> >>>
> >>> In the past I had worked around this issue by disabling quorum, but
> >>> that was years ago, so I googled it and found list messages
> >>> suggesting the following:
> >>>
> >>>   gluster volume set VOL cluster.quorum-type none
> >>>   gluster volume set VOL cluster.server-quorum-type none
> >>>
> >>> However, the gluster 6.9 system refuses to accept those set commands
> >>> due to the quorum and spits out the set failed error.
> >>>
> >>> So in modern Gluster, what is the preferred method for starting and
> >>> mounting a  single node/volume that was once part of a actual 3 node
> >>> cluster?
> >>>
> >>> Thanks.
> >>>
> >>> -wk
> >>>
> >>>
> >>> 
> >>>
> >>>
> >>>
> >>> Community Meeting Calendar:
> >>>
> >>> Schedule -
> >>> Every 2nd and 4th Tuesday at 14:30 IST / 09:00 UTC
> >>> Bridge: https://bluejeans.com/441850968
> >>>
> >>> Gluster-users mailing list
> >>> Gluster-users@gluster.org
> >>> https://lists.gluster.org/mailman/listinfo/gluster-users
> > 
> >
> >
> >
> > Community Meeting Calendar:
> >
> > Schedule -
> > Every 2nd and 4th Tuesday at 14:30 IST / 09:00 UTC
> > Bridge: https://bluejeans.com/441850968
> >
> > Gluster-users mailing list
> > Gluster-users@gluster.org
> > https://lists.gluster.org/mailman/listinfo/gluster-users
>





Community Meeting Calendar:

Schedule -
Every 2nd and 4th Tuesday at 14:30 IST / 09:00 UTC
Bridge: https://bluejeans.com/441850968

Gluster-users mailing list
Gluster-users@gluster.org
https://lists.gluster.org/mailman/listinfo/gluster-users


Re: [Gluster-users] set: failed: Quorum not met. Volume operation not allowed.

2020-08-26 Thread Karthik Subrahmanya
Hi,

Since your two nodes are scrapped and there is no chance that they
will come back at a later time, you can try reducing the replica count
to 1 by removing the down bricks from the volume and then mounting the
volume back to access the data which is available on the only up
brick.
The remove-brick command looks like this:

gluster volume remove-brick VOLNAME replica 1
<hostname1>:/brick-path
<hostname2>:/brick-path force
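For example, if node1 holds the surviving brick and node2/node3 held the lost
data brick and the arbiter (hostnames and brick paths here are placeholders
for your actual ones):

   gluster volume remove-brick VOLNAME replica 1 \
       node2:/bricks/VOLNAME/brick node3:/bricks/VOLNAME/brick force
   # then mount the now replica-1 volume from the surviving node
   mkdir -p /mnt/recover
   mount -t glusterfs node1:/VOLNAME /mnt/recover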

Regards,
Karthik


On Thu, Aug 27, 2020 at 4:24 AM WK  wrote:
>
> So we migrated a number of VMs from a small Gluster 2+1A volume to a newer 
> cluster.
>
> Then a few days later the client said he wanted an old forgotten file that 
> had been left behind on the the deprecated system.
>
> However the arbiter and one of the brick nodes had been scraped, leaving only 
> a single gluster node.
>
> The volume I need uses shards so I am not excited about having to piece it 
> back together.
>
> I powered it up the single node and tried to mount the volume and of course 
> it refused to mount due to quorum and gluster volume status shows the volume 
> offline
>
> In the past I had worked around this issue by disabling quorum, but that was 
> years ago, so I googled it and found list messages suggesting the following:
>
>  gluster volume set VOL cluster.quorum-type none
>  gluster volume set VOL cluster.server-quorum-type none
>
> However, the gluster 6.9 system refuses to accept those set commands due to 
> the quorum and spits out the set failed error.
>
> So in modern Gluster, what is the preferred method for starting and mounting 
> a  single node/volume that was once part of a actual 3 node cluster?
>
> Thanks.
>
> -wk
>
>
> 
>
>
>
> Community Meeting Calendar:
>
> Schedule -
> Every 2nd and 4th Tuesday at 14:30 IST / 09:00 UTC
> Bridge: https://bluejeans.com/441850968
>
> Gluster-users mailing list
> Gluster-users@gluster.org
> https://lists.gluster.org/mailman/listinfo/gluster-users





Community Meeting Calendar:

Schedule -
Every 2nd and 4th Tuesday at 14:30 IST / 09:00 UTC
Bridge: https://bluejeans.com/441850968

Gluster-users mailing list
Gluster-users@gluster.org
https://lists.gluster.org/mailman/listinfo/gluster-users


Re: [Gluster-users] Gfid mismatch detected - but no split brain - how to solve?

2020-06-01 Thread Karthik Subrahmanya
Hi,

I am assuming that you are using one of the maintained versions of gluster.

GFID split-brains can be resolved using one of the methods in the
split-brain resolution CLI as explained in the section "3. Resolution of
split-brain using gluster CLI" of
https://docs.gluster.org/en/latest/Troubleshooting/resolving-splitbrain/.

The things to be noted while using this CLI to resolve GFID
split-brains are:
- You cannot use the GFID of the file as an argument with any of the CLI
options to resolve GFID split-brain. It should be the absolute path, as seen
from the mount point, of the file considered as the source.
- With the source-brick option there is no way to resolve all the GFID
split-brains in one shot by omitting the file path in the CLI, as can be done
while resolving data or metadata split-brain. For each file in GFID
split-brain, run the CLI with the policy you want to use.
- Resolving directory GFID split-brain using the CLI with the "source-brick"
option in a "distributed-replicated" volume needs to be done on each
subvolume explicitly if the file is in GFID split-brain on multiple
subvolumes. Since directories get created on all subvolumes, using one
particular brick as the source heals the directories only for that subvolume.
The other subvolumes must then be healed using the brick which has the same
GFID as the brick that was used as the source for the previous subvolume.
(An example command is sketched below.)
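As an example, to resolve one of the entries above by taking the copy on
swir-ring8 as the source (the last argument must be the file's absolute path
as seen from the mount point, which is only partially visible in the logs,
so <path-seen-from-mount> is a placeholder; the latest-mtime and bigger-file
policies are used the same way):

   gluster volume heal USER-HOME split-brain source-brick \
       swir-ring8:/__.aLocalStorages/0/0-GLUSTERs/0GLUSTER.USER-HOME <path-seen-from-mount>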
Regards,
Karthik

On Sat, May 30, 2020 at 3:39 AM lejeczek  wrote:

> hi Guys
>
> I'm seeing "Gfid mismatch detected" in the logs but no split
> brain indicated (4-way replica)
>
> Brick
> swir-ring8:/__.aLocalStorages/0/0-GLUSTERs/0GLUSTER.USER-HOME
> Status: Connected
> Total Number of entries: 22
> Number of entries in heal pending: 22
> Number of entries in split-brain: 0
> Number of entries possibly healing: 0
>
> Brick
> whale-ring8:/__.aLocalStorages/0/0-GLUSTERs/0GLUSTER.USER-HOME
> Status: Connected
> Total Number of entries: 22
> Number of entries in heal pending: 22
> Number of entries in split-brain: 0
> Number of entries possibly healing: 0
>
> Brick
> rider-ring8:/__.aLocalStorages/0/0-GLUSTERs/0GLUSTER.USER-HOME
> Status: Connected
> Total Number of entries: 0
> Number of entries in heal pending: 0
> Number of entries in split-brain: 0
> Number of entries possibly healing: 0
>
> Brick dzien:/__.aLocalStorages/0/0-GLUSTERs/0GLUSTER.USER-HOME
> Status: Connected
> Total Number of entries: 10
> Number of entries in heal pending: 10
> Number of entries in split-brain: 0
> Number of entries possibly healing: 0
>
> On swir-ring8:
> ...
> The message "E [MSGID: 108008]
> [afr-self-heal-common.c:384:afr_gfid_split_brain_source]
> 0-USER-HOME-replicate-0: Gfid mismatch detected for
> /lock_file>,
> 37b2456f-5216-4679-ac5c-4908b24f895a on USER-HOME-client-15
> and ba8f87ed-9bf3-404e-8d67-2631923e1645 on
> USER-HOME-client-13." repeated 2 times between [2020-05-29
> 21:47:49.034935] and [2020-05-29 21:47:49.079480]
> The message "E [MSGID: 108008]
> [afr-self-heal-common.c:384:afr_gfid_split_brain_source]
> 0-USER-HOME-replicate-0: Gfid mismatch detected for
> /t>,
> d7a4ed01-139b-4df3-8070-31bd620a6f15 on USER-HOME-client-15
> and d794b6ba-2a1d-4043-bb31-b98b22692763 on
> USER-HOME-client-13." repeated 2 times between [2020-05-29
> 21:47:49.126173] and [2020-05-29 21:47:49.155432]
> The message "E [MSGID: 108008]
> [afr-self-heal-common.c:384:afr_gfid_split_brain_source]
> 0-USER-HOME-replicate-0: Gfid mismatch detected for
> /Tables.docx>,
> 344febd8-c89c-4bf3-8ad8-6494c2189c43 on USER-HOME-client-15
> and 48d5b12b-03f4-46bf-bed1-9f8f88815615 on
> USER-HOME-client-13." repeated 2 times between [2020-05-29
> 21:47:49.194061] and [2020-05-29 21:47:49.239896]
> The message "E [MSGID: 108008]
> [afr-self-heal-entry.c:257:afr_selfheal_detect_gfid_and_type_mismatch]
> 0-USER-HOME-replicate-0: Skipping conservative merge on the
> file." repeated 8 times between [2020-05-29 21:47:49.037812]
> and [2020-05-29 21:47:49.240423]
> ...
>
> On whale-ring8:
> ...
> The message "E [MSGID: 108008]
> [afr-self-heal-common.c:384:afr_gfid_split_brain_source]
> 0-USER-HOME-replicate-0: Gfid mismatch detected for
> /pcs>,
> a83d0e5f-ef3a-40ab-be7b-784538d150be on USER-HOME-client-15
> and 89af3d31-81fa-4242-b8f7-0f49fd5fe57b on
> USER-HOME-client-13." repeated 2 times between [2020-05-29
> 21:45:46.152052] and [2020-05-29 21:45:46.422393]
> The message "E [MSGID: 108008]
> [afr-self-heal-common.c:384:afr_gfid_split_brain_source]
> 0-USER-HOME-replicate-0: Gfid mismatch detected for
> /history_database>,
> 81ebb0d5-264a-4eba-984a-e18673b43826 on USER-HOME-client-15
> and 2498a303-8937-43c3-939e-5e1d786b07fa on
> USER-HOME-client-13." repeated 2 times between [2020-05-29
> 21:45:46.167704] and [2020-05-29 21:45:46.437702]
> The message "E [MSGID: 108008]
> [afr-self-heal-common.c:384:afr_gfid_split_brain_source]
> 0-USER-HOME-replicate-0: Gfid mismatch detected for
> /client-state>,
> 

Re: [Gluster-users] File system very slow

2020-05-27 Thread Karthik Subrahmanya
Hi,

Please provide the following information to understand the setup and debug
this further:
- Which version of gluster are you using?
- 'gluster volume status atlassian' to confirm whether both bricks and the
self-heal daemons (shd) are up
- Complete output of 'gluster volume profile atlassian info' before running
'du' and during 'du'. Redirect this output to separate files and attach
them here (a sketch of this sequence is given after this list)
- Get the client-side profile as well by following
https://docs.gluster.org/en/latest/Administrator%20Guide/Performance%20Testing/
- 'gluster volume heal atlassian info' to check whether there are any
pending heals and whether client-side heal is contributing to this
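A sketch of how that profile capture could be done (the output file names are
arbitrary):

   gluster volume profile atlassian start
   gluster volume profile atlassian info > /tmp/profile_before_du.txt
   du -sh /data/shared &          # start the slow workload
   sleep 120
   gluster volume profile atlassian info incremental > /tmp/profile_during_du.txt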

Regards,
Karthik

On Wed, May 27, 2020 at 1:06 AM  wrote:

> I had a parsing error. It is Volume Name: atlassian
>
> On Tue, May 26, 2020 at 3:12 PM  wrote:
>
>> # gluster volume info
>>
>> Volume Name: myvol
>> Type: Replicate
>> Volume ID: cbdef65c-79ea-496e-b777-b6a2981b29cf
>> Status: Started
>> Snapshot Count: 0
>> Number of Bricks: 1 x 2 = 2
>> Transport-type: tcp
>> Bricks:
>> Brick1: node1:/data/foo/gluster
>> Brick2: node2:/data/foo/gluster
>> Options Reconfigured:
>> client.event-threads: 4
>> server.event-threads: 4
>> performance.stat-prefetch: on
>> network.inode-lru-limit: 16384
>> performance.md-cache-timeout: 1
>> performance.cache-invalidation: false
>> performance.cache-samba-metadata: false
>> features.cache-invalidation-timeout: 600
>> features.cache-invalidation: on
>> performance.io-thread-count: 16
>> performance.cache-refresh-timeout: 5
>> performance.write-behind-window-size: 5MB
>> performance.cache-size: 1GB
>> transport.address-family: inet
>> storage.fips-mode-rchecksum: on
>> nfs.disable: on
>> performance.client-io-threads: off
>> diagnostics.latency-measurement: on
>> diagnostics.count-fop-hits: on
>>
>> On Tue, May 26, 2020 at 3:06 PM Sunil Kumar Heggodu Gopala Acharya <
>> shegg...@redhat.com> wrote:
>>
>>> Hi,
>>>
>>> Please share the gluster volume information.
>>>
>>> # gluster vol info
>>>
>>>
>>> Regards,
>>>
>>> Sunil kumar Acharya
>>>
>>>
>>> On Wed, May 27, 2020 at 12:30 AM  wrote:
>>>
 I made the following changes for small file performance as suggested by
 http://blog.gluster.org/gluster-tiering-and-small-file-performance/

 I am still seeing du -sh /data/shared taking 39 minutes.

 Any other tuning I can do. Most of my files are 15K. Here is sample of
 small files with size and number of occurrences

 FileSize.# of occurrence
 

 1.1K 1122
 1.1M 1040
 1.2K 1281
 1.2M 1357
 1.3K 1149
 1.3M 1098
 1.4K 1119
 1.5K 1189
 1.6K 1036
 1.7K 1169
 11K 2157
 12K 2398
 13K 2402
 14K 2406*15K 2426*
 16K 2386
 17K 1986
 18K 2037
 19K 1829
 2.0K 1027
 2.1K 1048
 2.4K 1013
 20K 1585
 21K 1713
 22K 1590
 23K 1371
 24K 1428
 25K 1444
 26K 1391
 27K 1217
 28K 1485
 29K 1282
 30K 1303
 31K 1275
 32K 1296
 33K 1058
 36K 1023
 37K 1107
 39K 1092
 41K 1034
 42K 1187
 46K 1030





 On Mon, May 25, 2020 at 5:30 PM  wrote:

> time du -sh /data/shared
>
> 431G/data/shared
>
> real45m49.992s
> user0m20.043s
> sys2m32.456s
>
>
> gluster fs is extremely slow
>
> Any suggestions on what settings to change to improve it?
>
>
> --
> Asif Iqbal
> PGP Key: 0xE62693C5 KeyServer: pgp.mit.edu
> A: Because it messes up the order in which people normally read text.
> Q: Why is top-posting such a bad thing?
>
>

 --
 Asif Iqbal
 PGP Key: 0xE62693C5 KeyServer: pgp.mit.edu
 A: Because it messes up the order in which people normally read text.
 Q: Why is top-posting such a bad thing?

 



 Community Meeting Calendar:

 Schedule -
 Every 2nd and 4th Tuesday at 14:30 IST / 09:00 UTC
 Bridge: https://bluejeans.com/441850968

 Gluster-users mailing list
 Gluster-users@gluster.org
 https://lists.gluster.org/mailman/listinfo/gluster-users

>>>
>>
>> --
>> Asif Iqbal
>> PGP Key: 0xE62693C5 KeyServer: pgp.mit.edu
>> A: Because it messes up the order in which people normally read text.
>> Q: Why is top-posting such a bad thing?
>>
>>
>
> --
> Asif Iqbal
> PGP Key: 0xE62693C5 KeyServer: pgp.mit.edu
> A: Because it messes up the order in which people normally read text.
> Q: Why is top-posting such a bad thing?
>
> 
>
>
>
> Community Meeting Calendar:
>
> Schedule -
> Every 2nd and 4th Tuesday at 14:30 IST / 09:00 UTC
> Bridge: https://bluejeans.com/441850968
>
> Gluster-users mailing list
> Gluster-users@gluster.org
> https://lists.gluster.org/mailman/listinfo/gluster-users
>




Community Meeting Calendar:

Schedule -
Every 2nd and 4th Tuesday at 14:30 IST / 09:00 UTC
Bridge: https://bluejeans.com/441850968

Gluster-users mailing list
Gluster-users@gluster.org
https://lists.gluster.org/mailman/listinfo/gluster-users

Re: [Gluster-users] Issues with replicated gluster volume

2020-06-17 Thread Karthik Subrahmanya
Hi Ahemad,

Sorry for a lot of back and forth on this. But we might need a few more
details to find the actual cause here.
What version of gluster are you running on the server and client nodes?
Also provide the statedump [1] of the bricks and of the client process when
the hang is seen.

[1] https://docs.gluster.org/en/latest/Troubleshooting/statedump/

Regards,
Karthik

On Wed, Jun 17, 2020 at 9:25 AM ahemad_sh...@yahoo.com <
ahemad_sh...@yahoo.com> wrote:

> I have a 3 replica gluster volume created in 3 nodes and when one node is
> down due to some issue and the clients not able access volume. This was the
> issue. I have fixed the server and it is back. There was downtime at
> client. I just want to avoid the downtime since it is 3 replica.
>
> I am testing the high availability now by making one of the brick server
> rebooting or shut down manually. I just want to make volume accessible
> always by client. That is the reason we went for replica volume.
>
> So I just would like to know how to make the client volume high available
> even some VM or node which is having gluster volume goes down unexpectedly
> had down time of 10 hours.
>
>
>
> Glusterfsd service is used to stop which is disabled in my cluster and I
> see one more service running gluserd.
>
> Will starting glusterfsd service in all 3 replica nodes will help in
> achieving what I am trying.
>
> Hope I am clear.
>
> Thanks,
> Ahemad
>
>
>
> Thanks,
> Ahemad
>
>
>
> On Tue, Jun 16, 2020 at 23:12, Strahil Nikolov
>  wrote:
> In my cluster ,  the service is enabled and running.
>
> What actually  is your problem  ?
> When a gluster brick process dies unexpectedly - all fuse clients will be
> waiting for the timeout .
> The service glusterfsd is ensuring that during system shutdown ,  the
> brick procesees will be shutdown in such way that all native clients  won't
> 'hang' and wait for the timeout, but will directly choose  another brick.
>
> The same happens when you manually run the kill script  -  all gluster
> processes  shutdown and all clients are  redirected to another brick.
>
> Keep in mind that fuse mounts will  also be killed  both by the script and
> the glusterfsd service.
>
> Best Regards,
> Strahil Nikolov
>
> On 16 June 2020 at 19:48:32 GMT+03:00, ahemad shaik
> wrote:
> > Hi Strahil,
> >I have the gluster setup on centos 7 cluster.I see glusterfsd service
> >and it is in inactive state.
> >systemctl status glusterfsd.service● glusterfsd.service - GlusterFS
> >brick processes (stopping only)   Loaded: loaded
> >(/usr/lib/systemd/system/glusterfsd.service; disabled; vendor preset:
> >disabled)   Active: inactive (dead)
> >
> >so you mean starting this service in all the nodes where gluster
> >volumes are created, will solve the issue ?
> >
> >Thanks,Ahemad
> >
> >
> >On Tuesday, 16 June, 2020, 10:12:22 pm IST, Strahil Nikolov
> > wrote:
> >
> > Hi ahemad,
> >
> >the  script  kills  all gluster  processes,  so the clients won't wait
> >for the timeout before  switching to another node in the TSP.
> >
> >In CentOS/RHEL,  there  is a  systemd  service called
> >'glusterfsd.service' that  is taking care on shutdown to kill all
> >processes,  so clients won't hung.
> >
> >systemctl cat glusterfsd.service --no-pager
> ># /usr/lib/systemd/system/glusterfsd.service
> >[Unit]
> >Description=GlusterFS brick processes (stopping only)
> >After=network.target glusterd.service
> >
> >[Service]
> >Type=oneshot
> ># glusterd starts the glusterfsd processed on-demand
> ># /bin/true will mark this service as started, RemainAfterExit keeps it
> >active
> >ExecStart=/bin/true
> >RemainAfterExit=yes
> ># if there are no glusterfsd processes, a stop/reload should not give
> >an error
> >ExecStop=/bin/sh -c "/bin/killall --wait glusterfsd || /bin/true"
> >ExecReload=/bin/sh -c "/bin/killall -HUP glusterfsd || /bin/true"
> >
> >[Install]
> >WantedBy=multi-user.target
> >
> >Best Regards,
> >Strahil  Nikolov
> >
> >On 16 June 2020 at 18:41:59 GMT+03:00, ahemad shaik
> > wrote:
> >> Hi,
> >>I see there is a script file in below mentioned path in all nodes
> >using
> >>which gluster volume
> >>created./usr/share/glusterfs/scripts/stop-all-gluster-processes.sh
> >>I need to create a system service and when ever there is some server
> >>down, we need to call this script or we need to have it run always it
> >>will take care when some node is down to make sure that client will
> >not
> >>have any issues in accessing mount point ?
> >>can you please share any documentation on how to use this.That will be
> >>great help.
> >>Thanks,Ahemad
> >>
> >>
> >>
> >>
> >>On Tuesday, 16 June, 2020, 08:59:31 pm IST, Strahil Nikolov
> >> wrote:
> >>
> >> Hi Ahemad,
> >>
> >>You can simplify it  by creating a systemd service that  will  call
> >>the script.
> >>
> >>It was  already mentioned  in a previous thread  (with example),  so
> >>you can just use  it.
> >>
> >>Best Regards,
> >>Strahil  Nikolov
> >>
> >>На 16 юни 2020 г. 16:02:07 GMT+03:00, Hu Bert 
> >>написа:
> 

Re: [Gluster-users] Issues with replicated gluster volume

2020-06-17 Thread Karthik Subrahmanya
Hi Ahemad,

Glad to hear that your problem is resolved. Thanks Strahil and Hubert for
your suggestions.


On Wed, Jun 17, 2020 at 12:29 PM ahemad shaik 
wrote:

> Hi
>
> I tried starting and enabling the glusterfsd service suggested by Hubert
> and Strahil, and I see that it works: when one of the gluster nodes is not
> available the client is still able to access the mount point.
>
> Thanks so much Strahil , Hubert and Karthik on your suggestion and for the
> time.
>
> can you please help on making the data consistent on all nodes when we have
> some 5 hours of downtime on one of the servers? How do we achieve data
> consistency on all 3 nodes?
>
When the node/brick which was down comes back up, the gluster self-heal
daemon (glustershd) will automatically sync the data to the previously down
brick and make it consistent with the good copies. You can alternatively
run the index heal command "gluster volume heal <volname>" to trigger the
heal manually, and you can see the entries needing heal and the progress of
the heal by running "gluster volume heal <volname> info".
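For the volume in this thread, that would be, for example:

   gluster volume heal glustervol          # trigger an index heal manually
   gluster volume heal glustervol info     # list entries still pending heal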

HTH,
Karthik

>
> Any documentation on that end will be helpful.
>
> Thanks,
> Ahemad
>
> On Wednesday, 17 June, 2020, 12:03:06 pm IST, Karthik Subrahmanya <
> ksubr...@redhat.com> wrote:
>
>
> Hi Ahemad,
>
> Sorry for a lot of back and forth on this. But we might need a few more
> details to find the actual cause here.
> What version of gluster you are running on server and client nodes?
> Also provide the statedump [1] of the bricks and the client process when
> the hang is seen.
>
> [1] https://docs.gluster.org/en/latest/Troubleshooting/statedump/
>
> Regards,
> Karthik
>
> On Wed, Jun 17, 2020 at 9:25 AM ahemad_sh...@yahoo.com <
> ahemad_sh...@yahoo.com> wrote:
>
> I have a 3 replica gluster volume created in 3 nodes and when one node is
> down due to some issue and the clients not able access volume. This was the
> issue. I have fixed the server and it is back. There was downtime at
> client. I just want to avoid the downtime since it is 3 replica.
>
> I am testing the high availability now by making one of the brick server
> rebooting or shut down manually. I just want to make volume accessible
> always by client. That is the reason we went for replica volume.
>
> So I just would like to know how to make the client volume high available
> even some VM or node which is having gluster volume goes down unexpectedly
> had down time of 10 hours.
>
>
>
> Glusterfsd service is used to stop which is disabled in my cluster and I
> see one more service running gluserd.
>
> Will starting glusterfsd service in all 3 replica nodes will help in
> achieving what I am trying.
>
> Hope I am clear.
>
> Thanks,
> Ahemad
>
>
>
> Thanks,
> Ahemad
>
>
>
> On Tue, Jun 16, 2020 at 23:12, Strahil Nikolov
>  wrote:
> In my cluster ,  the service is enabled and running.
>
> What actually  is your problem  ?
> When a gluster brick process dies unexpectedly - all fuse clients will be
> waiting for the timeout .
> The service glusterfsd is ensuring that during system shutdown ,  the
> brick procesees will be shutdown in such way that all native clients  won't
> 'hang' and wait for the timeout, but will directly choose  another brick.
>
> The same happens when you manually run the kill script  -  all gluster
> processes  shutdown and all clients are  redirected to another brick.
>
> Keep in mind that fuse mounts will  also be killed  both by the script and
> the glusterfsd service.
>
> Best Regards,
> Strahil Nikolov
>
> On 16 June 2020 at 19:48:32 GMT+03:00, ahemad shaik
> wrote:
> > Hi Strahil,
> >I have the gluster setup on centos 7 cluster.I see glusterfsd service
> >and it is in inactive state.
> >systemctl status glusterfsd.service● glusterfsd.service - GlusterFS
> >brick processes (stopping only)   Loaded: loaded
> >(/usr/lib/systemd/system/glusterfsd.service; disabled; vendor preset:
> >disabled)   Active: inactive (dead)
> >
> >so you mean starting this service in all the nodes where gluster
> >volumes are created, will solve the issue ?
> >
> >Thanks,Ahemad
> >
> >
> >On Tuesday, 16 June, 2020, 10:12:22 pm IST, Strahil Nikolov
> > wrote:
> >
> > Hi ahemad,
> >
> >the  script  kills  all gluster  processes,  so the clients won't wait
> >for the timeout before  switching to another node in the TSP.
> >
> >In CentOS/RHEL,  there  is a  systemd  service called
> >'glusterfsd.service' that  is taking care on shutdown to kill all
> >processes,  so clients won't hung.
> >
> >systemctl cat glusterfsd.service --no-pager
> ># /u

Re: [Gluster-users] Issues with replicated gluster volume

2020-06-16 Thread Karthik Subrahmanya
> /lib64/libgfrpc.so.0(+0xd7e4)[0x7f4a6a1c27e4] (-->
> /lib64/libgfrpc.so.0(+0xd8fe)[0x7f4a6a1c28fe] (-->
> /lib64/libgfrpc.so.0(rpc_clnt_connection_cleanup+0x97)[0x7f4a6a1c3987] (-->
> /lib64/libgfrpc.so.0(+0xf518)[0x7f4a6a1c4518] ) 0-glustervol-client-2:
> forced unwinding frame type(GlusterFS 4.x v1) op(LOOKUP(27)) called at
> 2020-06-16 05:16:48.756212 (xid=0xb4)
> [2020-06-16 05:16:59.731876] E [rpc-clnt.c:346:saved_frames_unwind] (-->
> /lib64/libglusterfs.so.0(_gf_log_callingfn+0x13b)[0x7f4a6a41badb] (-->
> /lib64/libgfrpc.so.0(+0xd7e4)[0x7f4a6a1c27e4] (-->
> /lib64/libgfrpc.so.0(+0xd8fe)[0x7f4a6a1c28fe] (-->
> /lib64/libgfrpc.so.0(rpc_clnt_connection_cleanup+0x97)[0x7f4a6a1c3987] (-->
> /lib64/libgfrpc.so.0(+0xf518)[0x7f4a6a1c4518] ) 0-glustervol-client-2:
> forced unwinding frame type(GlusterFS 4.x v1) op(LOOKUP(27)) called at
> 2020-06-16 05:16:52.258940 (xid=0xb5)
> [2020-06-16 05:16:59.732060] E [rpc-clnt.c:346:saved_frames_unwind] (-->
> /lib64/libglusterfs.so.0(_gf_log_callingfn+0x13b)[0x7f4a6a41badb] (-->
> /lib64/libgfrpc.so.0(+0xd7e4)[0x7f4a6a1c27e4] (-->
> /lib64/libgfrpc.so.0(+0xd8fe)[0x7f4a6a1c28fe] (-->
> /lib64/libgfrpc.so.0(rpc_clnt_connection_cleanup+0x97)[0x7f4a6a1c3987] (-->
> /lib64/libgfrpc.so.0(+0xf518)[0x7f4a6a1c4518] ) 0-glustervol-client-2:
> forced unwinding frame type(GlusterFS 4.x v1) op(LOOKUP(27)) called at
> 2020-06-16 05:16:54.618301 (xid=0xb6)
> [2020-06-16 05:16:59.732246] E [rpc-clnt.c:346:saved_frames_unwind] (-->
> /lib64/libglusterfs.so.0(_gf_log_callingfn+0x13b)[0x7f4a6a41badb] (-->
> /lib64/libgfrpc.so.0(+0xd7e4)[0x7f4a6a1c27e4] (-->
> /lib64/libgfrpc.so.0(+0xd8fe)[0x7f4a6a1c28fe] (-->
> /lib64/libgfrpc.so.0(rpc_clnt_connection_cleanup+0x97)[0x7f4a6a1c3987] (-->
> /lib64/libgfrpc.so.0(+0xf518)[0x7f4a6a1c4518] ) 0-glustervol-client-2:
> forced unwinding frame type(GlusterFS 4.x v1) op(LOOKUP(27)) called at
> 2020-06-16 05:16:58.288790 (xid=0xb7)
> [2020-06-16 05:17:10.245302] I [rpc-clnt.c:2028:rpc_clnt_reconfig]
> 0-glustervol-client-2: changing port to 49152 (from 0)
> [2020-06-16 05:17:10.249896] I [MSGID: 114046]
> [client-handshake.c:1105:client_setvolume_cbk] 0-glustervol-client-2:
> Connected to glustervol-client-2, attached to remote volume '/data'.
>
> Thanks,
> Ahemad
>
> On Tuesday, 16 June, 2020, 10:10:16 am IST, Karthik Subrahmanya <
> ksubr...@redhat.com> wrote:
>
>
> Hi Ahemad,
>
> Please provide the following info:
> 1. gluster peer status
> 2. gluster volume info glustervol
> 3. gluster volume status glustervol
> 4. client log from node4 when you saw unavailability
>
> Regards,
> Karthik
>
> On Mon, Jun 15, 2020 at 11:07 PM ahemad shaik 
> wrote:
>
> Hi There,
>
> I have created 3 replica gluster volume with 3 bricks from 3 nodes.
>
> "gluster volume create glustervol replica 3 transport tcp node1:/data
> node2:/data node3:/data force"
>
> mounted on client node using below command.
>
> "mount -t glusterfs node4:/glustervol/mnt/"
>
> when any of the node (either node1,node2 or node3) goes down, gluster
> mount/volume (/mnt) not accessible at client (node4).
>
> purpose of replicated volume is high availability but not able to achieve
> it.
>
> Is it a bug or i am missing anything.
>
>
> Any suggestions will be great help!!!
>
> kindly suggest.
>
> Thanks,
> Ahemad
>
> 
>
>
>
> Community Meeting Calendar:
>
> Schedule -
> Every 2nd and 4th Tuesday at 14:30 IST / 09:00 UTC
> Bridge: https://bluejeans.com/441850968
>
> Gluster-users mailing list
> Gluster-users@gluster.org
> https://lists.gluster.org/mailman/listinfo/gluster-users
>
>




Community Meeting Calendar:

Schedule -
Every 2nd and 4th Tuesday at 14:30 IST / 09:00 UTC
Bridge: https://bluejeans.com/441850968

Gluster-users mailing list
Gluster-users@gluster.org
https://lists.gluster.org/mailman/listinfo/gluster-users


Re: [Gluster-users] Issues with replicated gluster volume

2020-06-16 Thread Karthik Subrahmanya
Hi Ahemad,

The logs don't seem to indicate anything specific, except for this message in
the glusterd logs, which I am not sure about; it may or may not be causing
problems:
[2020-06-16 07:19:27.418884] E [MSGID: 101097]
[xlator.c:218:xlator_volopt_dynload] 0-xlator: dlsym(xlator_api) missing:
/usr/lib64/glusterfs/7.5/rpc-transport/socket.so: undefined symbol:
xlator_api
@Sanju Rakonde  could you please let us know what this
error message means and whether this could be leading to this specific
problem?

@ahemad shaik  replica 3 volumes will still be available
when 1 of the 3 bricks is down. Could you also paste the exact error that you
are getting while accessing or doing any operations on the mount? Did you
try rebooting any other node and facing the same problem?
To check whether there is any issue with the client connections, can you give
the output of this command from node4?
cat <mount-point>/.meta/graphs/active/<volname>-client-*/private | egrep -i 'connected'
In your case the actual command would look like this:
cat /mnt/.meta/graphs/active/glustervol-client-*/private | egrep -i 'connected'
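
If it helps, the small loop below is a variant of the same check (using the
/mnt mount point and the glustervol volume name from above) that also prints
which client translator each "connected" line belongs to:

for f in /mnt/.meta/graphs/active/glustervol-client-*/private; do
    echo "$f"
    grep -i 'connected' "$f"
done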

Regards,
Karthik

On Tue, Jun 16, 2020 at 1:31 PM ahemad shaik  wrote:

> Hi Karthik,
>
> Please find attached logs.
>
> kindly suggest how to make the volume highly available.
>
> Thanks,
> Ahemad
>
>
>
> On Tuesday, 16 June, 2020, 12:09:10 pm IST, Karthik Subrahmanya <
> ksubr...@redhat.com> wrote:
>
>
> Hi,
>
> Thanks for the clarification.
> In that case, can you attach the complete glusterd, brick and mount logs from
> all the nodes from when this happened?
> Also paste the output that you are seeing when you try to access or do
> operations on the mount point.
>
> Regards,
> Karthik
>
> On Tue, Jun 16, 2020 at 11:55 AM ahemad shaik 
> wrote:
>
> Sorry, It was a typo.
>
> The exact command I have used is below.
>
> The volume is mounted on node4.
>
> ""mount -t glusterfs node1:/glustervol /mnt/ ""
>
>
> gluster Volume is created from node1,node2 and node3.
>
> ""gluster volume create glustervol replica 3 transport tcp node1:/data
> node2:/data node3:/data force""
>
> I have tried rebooting node3 to test high availability.
>
> I hope it is clear now.
>
> Please let me know if any questions.
>
> Thanks,
> Ahemad
>
>
>
> On Tuesday, 16 June, 2020, 11:45:48 am IST, Karthik Subrahmanya <
> ksubr...@redhat.com> wrote:
>
>
> Hi Ahemad,
>
> A quick question on the mount command that you have used
> "mount -t glusterfs node4:/glustervol/mnt/"
> Here you are specifying the hostname as node4 instead of node{1,2,3} which
> actually host the volume that you intend to mount. Is this a typo or did
> you paste the same command that you used for mounting?
> If it is the actual command that you have used, then node4 seems to have
> some old volume details which were not cleared properly and are being used
> while mounting. According to the peer info that you provided, only node1, 2
> & 3 are part of the list, so node4 is unaware of the volume that you want
> to mount, and this command is mounting a volume which is only visible to
> node4.
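>
> If that is the case, a quick way to confirm what node4 itself knows about
> (assuming glusterd is actually running on node4) is to run the standard
> commands below on node4 and compare the output with node1/2/3:
>
> gluster peer status
> gluster volume info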
>
> Regards,
> Karthik
>
> On Tue, Jun 16, 2020 at 11:11 AM ahemad shaik 
> wrote:
>
> Hi Karthik,
>
>
> Please find the info you asked for below. I see errors saying it is unable to
> connect to the port and a warning that the transport endpoint is not
> connected. Please find the complete logs below.
>
> kindly suggest.
>
> 1. gluster peer status
>
> gluster peer status
> Number of Peers: 2
>
> Hostname: node1
> Uuid: 0e679115-15ad-4a85-9d0a-9178471ef90
> State: Peer in Cluster (Connected)
>
> Hostname: node2
> Uuid: 785a7c5b-86d3-45b9-b371-7e66e7fa88e0
> State: Peer in Cluster (Connected)
>
>
> gluster pool list
> UUID                                    Hostname            State
> 0e679115-15ad-4a85-9d0a-9178471ef90     node1               Connected
> 785a7c5b-86d3-45b9-b371-7e66e7fa88e0    node2               Connected
> ec137af6-4845-4ebb-955a-fac1df9b7b6c    localhost (node3)   Connected
>
> 2. gluster volume info glustervol
>
> Volume Name: glustervol
> Type: Replicate
> Volume ID: 5422bb27-1863-47d5-b216-61751a01b759
> Status: Started
> Snapshot Count: 0
> Number of Bricks: 1 x 3 = 3
> Transport-type: tcp
> Bricks:
> Brick1: node1:/data
> Brick2: node2:/data
> Brick3: node3:/data
> Options Reconfigured:
> performance.client-io-threads: off
> nfs.disable: on
> storage.fips-mode-rchecksum: on
> transport.address-family: inet
>
> 3. gluster volume status glustervol
>
> gluster volume status glustervol
> Status of volume: glustervol
> Gluster process TCP Port  RDMA Port 

Re: [Gluster-users] Issues with replicated gluster volume

2020-06-15 Thread Karthik Subrahmanya
Hi Ahemad,

Please provide the following info:
1. gluster peer status
2. gluster volume info glustervol
3. gluster volume status glustervol
4. client log from node4 when you saw unavailability

Regards,
Karthik

On Mon, Jun 15, 2020 at 11:07 PM ahemad shaik 
wrote:

> Hi There,
>
> I have created 3 replica gluster volume with 3 bricks from 3 nodes.
>
> "gluster volume create glustervol replica 3 transport tcp node1:/data
> node2:/data node3:/data force"
>
> mounted on client node using below command.
>
> "mount -t glusterfs node4:/glustervol/mnt/"
>
> when any of the nodes (node1, node2 or node3) goes down, the gluster
> mount/volume (/mnt) is not accessible at the client (node4).
>
> The purpose of a replicated volume is high availability, but I am not able
> to achieve it.
>
> Is it a bug, or am I missing something?
>
>
> Any suggestions would be a great help!
>
> kindly suggest.
>
> Thanks,
> Ahemad
>
> 
>
>
>
> Community Meeting Calendar:
>
> Schedule -
> Every 2nd and 4th Tuesday at 14:30 IST / 09:00 UTC
> Bridge: https://bluejeans.com/441850968
>
> Gluster-users mailing list
> Gluster-users@gluster.org
> https://lists.gluster.org/mailman/listinfo/gluster-users
>




Community Meeting Calendar:

Schedule -
Every 2nd and 4th Tuesday at 14:30 IST / 09:00 UTC
Bridge: https://bluejeans.com/441850968

Gluster-users mailing list
Gluster-users@gluster.org
https://lists.gluster.org/mailman/listinfo/gluster-users


Re: [Gluster-users] Self-heal never finishes

2021-02-28 Thread Karthik Subrahmanya
Hey,

I think [1] should help you.
If you can't find anything matching your situation or can't resolve it with
any of the methods listed there, please open an issue for this at [2], with
the following information.
- volume info, volume status, heal info and shd logs from node-1 & arbiter.
- Output of "getfattr -d -e hex -m . <file-path-on-brick>" for a few of the
entries which are listed in the heal info output, from all the bricks.

[1]
https://uskarthik.blogspot.com/2020/02/entries-are-not-getting-healed-what-is_26.html
[2] https://github.com/gluster/glusterfs/issues
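
For reference, the getfattr check on one of the bricks could look like the
line below; the brick path here is only a placeholder, the real path is the
brick root plus the file path shown in the heal info output:

getfattr -d -e hex -m . /bricks/brick1/<file-path-from-heal-info>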

Regards,
Karthik

On Mon, Mar 1, 2021 at 8:12 AM Ben  wrote:

> I'm having a problem where once one of my volumes requires healing, it
> never finishes the process. I use a 3-node replica cluster (2 node +
> arbiter) as oVirt storage for virtual machines. I'm using Gluster version
> 8.3.
>
> When I patch my Gluster nodes, I try to keep the system online by
> rebooting them one at a time. However, I've found that once I reboot node
> 2, when it comes back up, self-heal will begin on both node 1 and the
> arbiter and never finish. I have let it run for weeks and still have
> entries in "gluster volume heal <volname> info". No heal entries are reported
> on the node that rebooted.
>
> I've set the volumes to the virt group (gluster volume set <volname> group
> virt) per the RHEV documentation, and the gluster nodes don't seem to be
> overly busy. I'm hoping someone can point me in the right direction --
> since the volumes never heal, I'm basically running on one node. Let me
> know what additional info will be helpful for troubleshooting, and thank
> you in advance.
> 
>
>
>
> Community Meeting Calendar:
>
> Schedule -
> Every 2nd and 4th Tuesday at 14:30 IST / 09:00 UTC
> Bridge: https://meet.google.com/cpu-eiue-hvk
> Gluster-users mailing list
> Gluster-users@gluster.org
> https://lists.gluster.org/mailman/listinfo/gluster-users
>




Community Meeting Calendar:

Schedule -
Every 2nd and 4th Tuesday at 14:30 IST / 09:00 UTC
Bridge: https://meet.google.com/cpu-eiue-hvk
Gluster-users mailing list
Gluster-users@gluster.org
https://lists.gluster.org/mailman/listinfo/gluster-users


Re: [Gluster-users] Arbiter

2022-02-08 Thread Karthik Subrahmanya
Hi Andre,

Striped volumes were deprecated long back, see [1] & [2]. It seems like you are
using a very old version. May I know which version of gluster you are
running, and could you share the gluster volume info please?
Release schedule and the maintained branches can be found at [3].


[1] https://docs.gluster.org/en/latest/release-notes/6.0/
[2] https://lists.gluster.org/pipermail/gluster-users/2018-July/034400.html
[3] https://www.gluster.org/release-schedule/
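
In the meantime, the version and the current volume layout can be collected
with the standard commands below, run on any of the nodes:

gluster --version
gluster volume info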

Regards,
Karthik

On Mon, Feb 7, 2022 at 9:43 PM Andre Probst  wrote:

> I have a striped and replicated volume with 4 nodes. How do I add an
> arbiter to this volume?
>
>
> --
> André Probst
> Consultor de Tecnologia
> 43 99617 8765
> 
>
>
>
> Community Meeting Calendar:
>
> Schedule -
> Every 2nd and 4th Tuesday at 14:30 IST / 09:00 UTC
> Bridge: https://meet.google.com/cpu-eiue-hvk
> Gluster-users mailing list
> Gluster-users@gluster.org
> https://lists.gluster.org/mailman/listinfo/gluster-users
>




Community Meeting Calendar:

Schedule -
Every 2nd and 4th Tuesday at 14:30 IST / 09:00 UTC
Bridge: https://meet.google.com/cpu-eiue-hvk
Gluster-users mailing list
Gluster-users@gluster.org
https://lists.gluster.org/mailman/listinfo/gluster-users


Re: [Gluster-users] Arbiter

2022-02-08 Thread Karthik Subrahmanya
On Tue, Feb 8, 2022 at 4:28 PM Gilberto Ferreira 
wrote:

> Forgive me if I am wrong, but AFAIK, arbiter is for a two-node
> configuration, isn't it?
>
Arbiter gives the same consistency as replica-3 with 3 nodes, without
the need to have a full-sized 3rd brick [1]. It stores the files and
their metadata but no data, and acts as a quorum brick to
avoid split-brains.
Since there are 4 nodes available here, and depending on the configuration of
the existing volumes (which is why I asked for the volume info), I was wondering
whether the arbiter brick can be hosted on one of those nodes itself, or whether
a new node is required.

[1]
https://docs.gluster.org/en/latest/Administrator-Guide/arbiter-volumes-and-quorum/
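
For reference, once the volume info is known, converting a plain replica 2
volume to arbiter is normally a single add-brick call of the form below; the
names are placeholders, and a distributed-replicate volume would need one
arbiter brick per replica subvolume:

gluster volume add-brick <volname> replica 3 arbiter 1 <arbiter-host>:<arbiter-brick-path>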

Regards,
Karthik

> ---
> Gilberto Nunes Ferreira
> (47) 99676-7530 - Whatsapp / Telegram
>
>
>
>
>
>
> Em ter., 8 de fev. de 2022 às 07:17, Karthik Subrahmanya <
> ksubr...@redhat.com> escreveu:
>
>> Hi Andre,
>>
>> Striped volumes were deprecated long back, see [1] & [2]. It seems like you
>> are using a very old version. May I know which version of gluster you are
>> running, and could you share the gluster volume info please?
>> Release schedule and the maintained branches can be found at [3].
>>
>>
>> [1] https://docs.gluster.org/en/latest/release-notes/6.0/
>> [2]
>> https://lists.gluster.org/pipermail/gluster-users/2018-July/034400.html
>> [3] https://www.gluster.org/release-schedule/
>>
>> Regards,
>> Karthik
>>
>> On Mon, Feb 7, 2022 at 9:43 PM Andre Probst 
>> wrote:
>>
>>> I have a striped and replicated volume with 4 nodes. How do I add an
>>> arbiter to this volume?
>>>
>>>
>>> --
>>> André Probst
>>> Consultor de Tecnologia
>>> 43 99617 8765
>>> 
>>>
>>>
>>>
>>> Community Meeting Calendar:
>>>
>>> Schedule -
>>> Every 2nd and 4th Tuesday at 14:30 IST / 09:00 UTC
>>> Bridge: https://meet.google.com/cpu-eiue-hvk
>>> Gluster-users mailing list
>>> Gluster-users@gluster.org
>>> https://lists.gluster.org/mailman/listinfo/gluster-users
>>>
>> 
>>
>>
>>
>> Community Meeting Calendar:
>>
>> Schedule -
>> Every 2nd and 4th Tuesday at 14:30 IST / 09:00 UTC
>> Bridge: https://meet.google.com/cpu-eiue-hvk
>> Gluster-users mailing list
>> Gluster-users@gluster.org
>> https://lists.gluster.org/mailman/listinfo/gluster-users
>>
>




Community Meeting Calendar:

Schedule -
Every 2nd and 4th Tuesday at 14:30 IST / 09:00 UTC
Bridge: https://meet.google.com/cpu-eiue-hvk
Gluster-users mailing list
Gluster-users@gluster.org
https://lists.gluster.org/mailman/listinfo/gluster-users