Re: [Gluster-users] Restore a node in a replicating Gluster setup after data loss

2017-06-04 Thread Karthik Subrahmanya
Hey Niklaus,

Sorry for the delay. The *reset-brick* command should do the trick for you.
You can have a look at [1] for more details.

[1] https://gluster.readthedocs.io/en/latest/release-notes/3.9.0/
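
In outline, and only as a rough sketch you should adapt to your setup (I am
reusing the gv-tier1-vm-01 volume name and the Server2 brick path from your
mail), the sequence would look something like this once glusterd is back up
on Server2:

  gluster volume reset-brick gv-tier1-vm-01 Server2:/var/data/lv-vm-01 start
  gluster volume reset-brick gv-tier1-vm-01 Server2:/var/data/lv-vm-01 \
      Server2:/var/data/lv-vm-01 commit force

After the commit, the self-heal daemon should sync the data from Server1 back
onto the empty brick while the other bricks stay online; you can watch the
heal progress with "gluster volume heal gv-tier1-vm-01 info".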

HTH,
Karthik

On Thu, Jun 1, 2017 at 12:28 PM, Niklaus Hofer <
niklaus.ho...@stepping-stone.ch> wrote:

> Hi
>
> We have a Replica 2 + Arbiter Gluster setup with 3 nodes Server1, Server2
> and Server3 where Server3 is the Arbiter node. There are several Gluster
> volumes on top of that setup. They all look a bit like this:
>
> gluster volume info gv-tier1-vm-01
>
> [...]
> Number of Bricks: 1 x (2 + 1) = 3
> [...]
> Bricks:
> Brick1: Server1:/var/data/lv-vm-01
> Brick2: Server2:/var/data/lv-vm-01
> Brick3: Server3:/var/data/lv-vm-01/brick (arbiter)
> [...]
> cluster.data-self-heal-algorithm: full
> [...]
>
> We took down Server2 because we needed to do maintenance on this server's
> storage. During maintenance work, we ended up having to completely rebuild
> the storage on Server2. This means that "/var/data/lv-vm-01" on Server2 is
> now empty. However, all the Gluster metadata in "/var/lib/glusterd/" is
> still intact. Gluster has not been started on Server2.
>
> Here is what our sample gluster volume currently looks like on the still
> active nodes:
>
> gluster volume status gv-tier1-vm-01
>
> Status of volume: gv-tier1-vm-01
> Gluster process                             TCP Port  RDMA Port  Online  Pid
> ------------------------------------------------------------------------------
> Brick Server1:/var/data/lv-vm-01            49204     0          Y       22775
> Brick Server3:/var/data/lv-vm-01/brick      49161     0          Y       15334
> Self-heal Daemon on localhost               N/A       N/A        Y       19233
> Self-heal Daemon on Server3                 N/A       N/A        Y       20839
>
>
> Now we would like to rebuild the data on Server2 from the still intact
> data on Server1. That is to say, we hope to start up Gluster on Server2 in
> such a way that it will sync the data from Server1 back. If at all
> possible, the Gluster cluster should stay up during this process and access
> to the Gluster volumes should not be interrupted.
>
> What is the correct / recommended way of doing this?
>
> Greetings
> Niklaus Hofer
> --
> stepping stone GmbH
> Neufeldstrasse 9
> CH-3012 Bern
>
> Telefon: +41 31 332 53 63
> www.stepping-stone.ch
> niklaus.ho...@stepping-stone.ch
> ___
> Gluster-users mailing list
> Gluster-users@gluster.org
> http://lists.gluster.org/mailman/listinfo/gluster-users
>
___
Gluster-users mailing list
Gluster-users@gluster.org
http://lists.gluster.org/mailman/listinfo/gluster-users

[Gluster-users] Gluster Monthly Newsletter, May 2017

2017-06-04 Thread Amye Scavarda
Gluster Monthly Newsletter, May 2017
Important happenings for Gluster for May:
---
3.11 Release!
Our 3.11 release is officially out!
https://blog.gluster.org/2017/05/announcing-gluster-3-11/

Note that this is a short-term supported release.
3.12 is underway with a feature freeze date of July 17, 2017.
---
Gluster Summit 2017!

Gluster Summit 2017 will be held in Prague, Czech Republic on October 27
and 28th. We'll be opening a call for papers for this instead of having an
application process to attend.
https://www.gluster.org/events/summit2017

---
Our weekly community meeting has changed: we'll be meeting every other week
instead of weekly, moving the time to 15:00 UTC, and our agenda is at:
https://bit.ly/gluster-community-meetings
We hope this means that more people can join us. Kaushal outlines the
changes on the mailing list:
http://lists.gluster.org/pipermail/gluster-devel/2017-January/051918.html
---

From Red Hat Summit:
Container-Native Storage for Modern Applications with OpenShift and Red Hat
Gluster Storage
http://bit.ly/2qpLVP0

Architecting and Performance-Tuning Efficient Gluster Storage Pools
http://bit.ly/2qpMgkK

Noteworthy threads from the mailing lists:
Announcing GlusterFS release 3.11.0 (Short Term Maintenance) - Shyam -
http://lists.gluster.org/pipermail/gluster-users/2017-May/031298.html
GlusterFS and Kafka - Christopher Schmidt -
http://lists.gluster.org/pipermail/gluster-users/2017-May/031185.html
gluster-block v0.2 is alive! - Prasanna Kalever -
http://lists.gluster.org/pipermail/gluster-users/2017-May/030933.html
GlusterFS removal from Openstack Cinder - Joe Julian
http://lists.gluster.org/pipermail/gluster-users/2017-May/031223.html
Release 3.12 and 4.0: Thoughts on scope - Shyam  -
http://lists.gluster.org/pipermail/gluster-devel/2017-May/052811.html
Reviews older than 90 days  - Amar Tumballi -
http://lists.gluster.org/pipermail/gluster-devel/2017-May/052844.html
[Proposal]: Changes to how we test and vote each patch  - Amar Tumballi -
http://lists.gluster.org/pipermail/gluster-devel/2017-May/052868.html
Volgen support for loading trace and io-stats translators at specific
points in the graph - Krutika Dhananjay -
http://lists.gluster.org/pipermail/gluster-devel/2017-May/052881.html
Backport for "Add back socket for polling of events immediately..." - Shyam
http://lists.gluster.org/pipermail/gluster-devel/2017-May/052887.html
[Proposal]: New branch (earlier: Changes to how we test and vote each
patch) -  Amar Tumballi -
http://lists.gluster.org/pipermail/gluster-devel/2017-May/052933.html


Gluster Top 5 Contributors in the last 30 days:
Krutika Dhananjay, Michael Scherer, Kaleb S. Keithley, Nigel Babu, Xavier
Hernandez

Upcoming CFPs:
Open Source Summit Europe –
http://events.linuxfoundation.org/events/open-source-summit-europe/program/cfp

July 8

-- 
Amye Scavarda | a...@redhat.com | Gluster Community Lead
___
Gluster-users mailing list
Gluster-users@gluster.org
http://lists.gluster.org/mailman/listinfo/gluster-users

Re: [Gluster-users] Rebalance + VM corruption - current status and request for feedback

2017-06-04 Thread Krutika Dhananjay
The fixes are already available in 3.10.2, 3.8.12 and 3.11.0
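
As a quick sanity check, and assuming the standard packages, you can confirm
what each side is actually running with something like:

  glusterfs --version    # on each client machine that mounts the volume
  gluster --version      # on the servers

Both should report one of the fixed releases before relying on the fix.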

-Krutika

On Sun, Jun 4, 2017 at 5:30 PM, Gandalf Corvotempesta <
gandalf.corvotempe...@gmail.com> wrote:

> Great news.
> Is this planned to be published in the next release?
>
> On 29 May 2017 at 3:27 PM, "Krutika Dhananjay"  wrote:
>
>> Thanks for that update. Very happy to hear it ran fine without any
>> issues. :)
>>
>> Yeah so you can ignore those 'No such file or directory' errors. They
>> represent a transient state where DHT in the client process is yet to
>> figure out the new location of the file.
>>
>> -Krutika
>>
>>
>> On Mon, May 29, 2017 at 6:51 PM, Mahdi Adnan 
>> wrote:
>>
>>> Hello,
>>>
>>>
>>> Yes, I forgot to upgrade the client as well.
>>>
>>> I did the upgrade and created a new volume with the same options as
>>> before, with one VM running and doing lots of IO. I started the rebalance
>>> with force, and after the process completed I rebooted the VM; it started
>>> normally without issues.
>>>
>>> I repeated the process and did another rebalance while the VM was running,
>>> and everything went fine.
>>>
>>> But the client logs are throwing lots of warning messages:
>>>
>>>
>>> [2017-05-29 13:14:59.416382] W [MSGID: 114031]
>>> [client-rpc-fops.c:2928:client3_3_lookup_cbk] 2-gfs_vol2-client-2:
>>> remote operation failed. Path: /50294ed6-db7a-418d-965f-9b44c
>>> 69a83fd/images/d59487fe-f3a9-4bad-a607-3a181c871711/aa01c3a0-5aa0-432d-82ad-d1f515f1d87f
>>> (93c403f5-c769-44b9-a087-dc51fc21412e) [No such file or directory]
>>> [2017-05-29 13:14:59.416427] W [MSGID: 114031]
>>> [client-rpc-fops.c:2928:client3_3_lookup_cbk] 2-gfs_vol2-client-3:
>>> remote operation failed. Path: /50294ed6-db7a-418d-965f-9b44c
>>> 69a83fd/images/d59487fe-f3a9-4bad-a607-3a181c871711/aa01c3a0-5aa0-432d-82ad-d1f515f1d87f
>>> (93c403f5-c769-44b9-a087-dc51fc21412e) [No such file or directory]
>>> [2017-05-29 13:14:59.808251] W [MSGID: 114031]
>>> [client-rpc-fops.c:2928:client3_3_lookup_cbk] 2-gfs_vol2-client-2:
>>> remote operation failed. Path: /50294ed6-db7a-418d-965f-9b44c
>>> 69a83fd/images/d59487fe-f3a9-4bad-a607-3a181c871711/aa01c3a0-5aa0-432d-82ad-d1f515f1d87f
>>> (93c403f5-c769-44b9-a087-dc51fc21412e) [No such file or directory]
>>> [2017-05-29 13:14:59.808287] W [MSGID: 114031]
>>> [client-rpc-fops.c:2928:client3_3_lookup_cbk] 2-gfs_vol2-client-3:
>>> remote operation failed. Path: /50294ed6-db7a-418d-965f-9b44c
>>> 69a83fd/images/d59487fe-f3a9-4bad-a607-3a181c871711/aa01c3a0-5aa0-432d-82ad-d1f515f1d87f
>>> (93c403f5-c769-44b9-a087-dc51fc21412e) [No such file or directory]
>>>
>>>
>>>
>>> Although the process went smoothly, I will run another extensive test
>>> tomorrow just to be sure.
>>>
>>> --
>>>
>>> Respectfully
>>> *Mahdi A. Mahdi*
>>>
>>> --
>>> *From:* Krutika Dhananjay 
>>> *Sent:* Monday, May 29, 2017 9:20:29 AM
>>>
>>> *To:* Mahdi Adnan
>>> *Cc:* gluster-user; Gandalf Corvotempesta; Lindsay Mathieson; Kevin
>>> Lemonnier
>>> *Subject:* Re: Rebalance + VM corruption - current status and request
>>> for feedback
>>>
>>> Hi,
>>>
>>> I took a look at your logs.
>>> It very much seems like an issue that is caused by a mismatch in
>>> glusterfs client and server packages.
>>> So your client (mount) seems to be still running 3.7.20, as confirmed by
>>> the occurrence of the following log message:
>>>
>>> [2017-05-26 08:58:23.647458] I [MSGID: 100030] [glusterfsd.c:2338:main]
>>> 0-/usr/sbin/glusterfs: Started running /usr/sbin/glusterfs version 3.7.20
>>> (args: /usr/sbin/glusterfs --volfile-server=s1 --volfile-server=s2
>>> --volfile-server=s3 --volfile-server=s4 --volfile-id=/testvol
>>> /rhev/data-center/mnt/glusterSD/s1:_testvol)
>>> [2017-05-26 08:58:40.901204] I [MSGID: 100030] [glusterfsd.c:2338:main]
>>> 0-/usr/sbin/glusterfs: Started running /usr/sbin/glusterfs version 3.7.20
>>> (args: /usr/sbin/glusterfs --volfile-server=s1 --volfile-server=s2
>>> --volfile-server=s3 --volfile-server=s4 --volfile-id=/testvol
>>> /rhev/data-center/mnt/glusterSD/s1:_testvol)
>>> [2017-05-26 08:58:48.923452] I [MSGID: 100030] [glusterfsd.c:2338:main]
>>> 0-/usr/sbin/glusterfs: Started running /usr/sbin/glusterfs version 3.7.20
>>> (args: /usr/sbin/glusterfs --volfile-server=s1 --volfile-server=s2
>>> --volfile-server=s3 --volfile-server=s4 --volfile-id=/testvol
>>> /rhev/data-center/mnt/glusterSD/s1:_testvol)
>>>
>>> whereas the servers have rightly been upgraded to 3.10.2, as seen in
>>> the rebalance log:
>>>
>>> [2017-05-26 09:36:36.075940] I [MSGID: 100030] [glusterfsd.c:2475:main]
>>> 0-/usr/sbin/glusterfs: Started running /usr/sbin/glusterfs version 3.10.2
>>> (args: /usr/sbin/glusterfs -s localhost --volfile-id rebalance/testvol
>>> --xlator-option *dht.use-readdirp=yes --xlator-option
>>> *dht.lookup-unhashed=yes --xlator-option *dht.assert-no-child-down=yes
>>> --xlator-option *replicate*.data-self-heal=off --xlator-option
>>> 

Re: [Gluster-users] Adding a new replica to the cluster

2017-06-04 Thread Atin Mukherjee
Apologies for the delay!

From the logs:

[2017-05-31 13:06:35.839780] E [MSGID: 106005] [glusterd-utils.c:4877:glusterd_brick_start] 0-management: Unable to start brick 156.125.102.103:/data/br0/vm0

[2017-05-31 13:06:35.839869] E [MSGID: 106074] [glusterd-brick-ops.c:2493:glusterd_op_add_brick] 0-glusterd: Unable to add bricks

It looks like the newly added brick on serv3 failed to come up. You'd have
to look at the corresponding brick log file on serv3 to see the reason. As a
workaround, does "gluster v start <volname> force" bring the brick up now?
As far as the glusterd configuration data is concerned, everything looks
fine; there is no inconsistency.
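
Something along these lines should tell you whether the force start helps
(just a sketch; replace <volname> with your actual volume name):

  gluster volume start <volname> force
  gluster volume status <volname>

If the brick on serv3 still shows 'N' under Online, the brick log on serv3
should have the reason.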

On Wed, May 31, 2017 at 6:53 PM, Merwan Ouddane  wrote:

> Hello,
>
>
> Here is a tar.gz with the logs, and the output of "gluster volume status
> && gluster volume info && gluster peer status" for each server.
>
>
> Thank you,
>
> Merwan
> --
> *From:* Atin Mukherjee 
> *Sent:* Wednesday, 31 May 2017 12:20:49
>
> *To:* Merwan Ouddane
> *Cc:* gluster-users@gluster.org
> *Subject:* Re: [Gluster-users] Adding a new replica to the cluster
>
>
> On Tue, May 30, 2017 at 2:14 PM, Merwan Ouddane  wrote:
>
>> Gluster peer status tells me that the peer is connected
>>
>>
>> Here is the log from the servers in the cluster (they are the same for
>> both of them):
>>
>> https://pastebin.com/4aM94PJ6
>>
>> The log from the server I'm trying to add to the cluster:
>>
>> https://pastebin.com/YGTcKRyw
>>
> I'm sorry, but these log files are for the brick and the client. What I am
> looking for is the glusterd log (ls -lrt /var/log/glusterfs/*glusterd*)
> from all the nodes. Also, please paste the output of gluster volume status,
> gluster volume info, and gluster peer status from all the nodes.
>
>>
>>
>> Merwan
>>
>> --
>> *From:* Atin Mukherjee 
>> *Sent:* Tuesday, 30 May 2017 04:51
>> *To:* Merwan Ouddane
>> *Cc:* gluster-users@gluster.org
>> *Subject:* Re: [Gluster-users] Adding a new replica to the cluster
>>
>>
>>
>> On Mon, May 29, 2017 at 9:52 PM, Merwan Ouddane  wrote:
>>
>>> Hello,
>>>
>>> I wanted to play around with Gluster, so I made a 2-node replicated
>>> cluster; then I wanted to add a third replica "on the fly".
>>>
>>> I managed to probe my third server from the cluster, but when I try to
>>> add the new brick to the volume, I get a "Request timed out" error.
>>>
>>> My command:
>>>   gluster volume add-brick vol0 replica 3 192.168.0.4:/data/br0/vol0
>>>
>>
>> This might happen if the originating glusterd lost its connection with the
>> other peers while this command was in progress. Can you check whether the
>> peer status is healthy with the gluster peer status command? If it is,
>> you'd need to share the glusterd log files from all the nodes with us to
>> analyse the issue further.
>>
>>
>>
>>>
>>> Does anyone have an idea ?
>>>
>>> Regards,
>>> Merwan
>>>
>>> ___
>>> Gluster-users mailing list
>>> Gluster-users@gluster.org
>>> http://lists.gluster.org/mailman/listinfo/gluster-users
>>>
>>
>>
>> ___
>> Gluster-users mailing list
>> Gluster-users@gluster.org
>> http://lists.gluster.org/mailman/listinfo/gluster-users
>>
>
>
___
Gluster-users mailing list
Gluster-users@gluster.org
http://lists.gluster.org/mailman/listinfo/gluster-users

Re: [Gluster-users] Rebalance + VM corruption - current status and request for feedback

2017-06-04 Thread Gandalf Corvotempesta
Great news.
Is this planned to be published in the next release?

On 29 May 2017 at 3:27 PM, "Krutika Dhananjay"  wrote:

> Thanks for that update. Very happy to hear it ran fine without any issues.
> :)
>
> Yeah so you can ignore those 'No such file or directory' errors. They
> represent a transient state where DHT in the client process is yet to
> figure out the new location of the file.
>
> -Krutika
>
>
> On Mon, May 29, 2017 at 6:51 PM, Mahdi Adnan 
> wrote:
>
>> Hello,
>>
>>
>> Yes, I forgot to upgrade the client as well.
>>
>> I did the upgrade and created a new volume with the same options as
>> before, with one VM running and doing lots of IO. I started the rebalance
>> with force, and after the process completed I rebooted the VM; it started
>> normally without issues.
>>
>> I repeated the process and did another rebalance while the VM was running,
>> and everything went fine.
>>
>> But the client logs are throwing lots of warning messages:
>>
>>
>> [2017-05-29 13:14:59.416382] W [MSGID: 114031]
>> [client-rpc-fops.c:2928:client3_3_lookup_cbk] 2-gfs_vol2-client-2:
>> remote operation failed. Path: /50294ed6-db7a-418d-965f-9b44c
>> 69a83fd/images/d59487fe-f3a9-4bad-a607-3a181c871711/aa01c3a0-5aa0-432d-82ad-d1f515f1d87f
>> (93c403f5-c769-44b9-a087-dc51fc21412e) [No such file or directory]
>> [2017-05-29 13:14:59.416427] W [MSGID: 114031]
>> [client-rpc-fops.c:2928:client3_3_lookup_cbk] 2-gfs_vol2-client-3:
>> remote operation failed. Path: /50294ed6-db7a-418d-965f-9b44c
>> 69a83fd/images/d59487fe-f3a9-4bad-a607-3a181c871711/aa01c3a0-5aa0-432d-82ad-d1f515f1d87f
>> (93c403f5-c769-44b9-a087-dc51fc21412e) [No such file or directory]
>> [2017-05-29 13:14:59.808251] W [MSGID: 114031]
>> [client-rpc-fops.c:2928:client3_3_lookup_cbk] 2-gfs_vol2-client-2:
>> remote operation failed. Path: /50294ed6-db7a-418d-965f-9b44c
>> 69a83fd/images/d59487fe-f3a9-4bad-a607-3a181c871711/aa01c3a0-5aa0-432d-82ad-d1f515f1d87f
>> (93c403f5-c769-44b9-a087-dc51fc21412e) [No such file or directory]
>> [2017-05-29 13:14:59.808287] W [MSGID: 114031]
>> [client-rpc-fops.c:2928:client3_3_lookup_cbk] 2-gfs_vol2-client-3:
>> remote operation failed. Path: /50294ed6-db7a-418d-965f-9b44c
>> 69a83fd/images/d59487fe-f3a9-4bad-a607-3a181c871711/aa01c3a0-5aa0-432d-82ad-d1f515f1d87f
>> (93c403f5-c769-44b9-a087-dc51fc21412e) [No such file or directory]
>>
>>
>>
>> Although the process went smoothly, I will run another extensive test
>> tomorrow just to be sure.
>>
>> --
>>
>> Respectfully
>> *Mahdi A. Mahdi*
>>
>> --
>> *From:* Krutika Dhananjay 
>> *Sent:* Monday, May 29, 2017 9:20:29 AM
>>
>> *To:* Mahdi Adnan
>> *Cc:* gluster-user; Gandalf Corvotempesta; Lindsay Mathieson; Kevin
>> Lemonnier
>> *Subject:* Re: Rebalance + VM corruption - current status and request
>> for feedback
>>
>> Hi,
>>
>> I took a look at your logs.
>> It very much seems like an issue that is caused by a mismatch in
>> glusterfs client and server packages.
>> So your client (mount) seems to be still running 3.7.20, as confirmed by
>> the occurrence of the following log message:
>>
>> [2017-05-26 08:58:23.647458] I [MSGID: 100030] [glusterfsd.c:2338:main]
>> 0-/usr/sbin/glusterfs: Started running /usr/sbin/glusterfs version 3.7.20
>> (args: /usr/sbin/glusterfs --volfile-server=s1 --volfile-server=s2
>> --volfile-server=s3 --volfile-server=s4 --volfile-id=/testvol
>> /rhev/data-center/mnt/glusterSD/s1:_testvol)
>> [2017-05-26 08:58:40.901204] I [MSGID: 100030] [glusterfsd.c:2338:main]
>> 0-/usr/sbin/glusterfs: Started running /usr/sbin/glusterfs version 3.7.20
>> (args: /usr/sbin/glusterfs --volfile-server=s1 --volfile-server=s2
>> --volfile-server=s3 --volfile-server=s4 --volfile-id=/testvol
>> /rhev/data-center/mnt/glusterSD/s1:_testvol)
>> [2017-05-26 08:58:48.923452] I [MSGID: 100030] [glusterfsd.c:2338:main]
>> 0-/usr/sbin/glusterfs: Started running /usr/sbin/glusterfs version 3.7.20
>> (args: /usr/sbin/glusterfs --volfile-server=s1 --volfile-server=s2
>> --volfile-server=s3 --volfile-server=s4 --volfile-id=/testvol
>> /rhev/data-center/mnt/glusterSD/s1:_testvol)
>>
>> whereas the servers have rightly been upgraded to 3.10.2, as seen in
>> the rebalance log:
>>
>> [2017-05-26 09:36:36.075940] I [MSGID: 100030] [glusterfsd.c:2475:main]
>> 0-/usr/sbin/glusterfs: Started running /usr/sbin/glusterfs version 3.10.2
>> (args: /usr/sbin/glusterfs -s localhost --volfile-id rebalance/testvol
>> --xlator-option *dht.use-readdirp=yes --xlator-option
>> *dht.lookup-unhashed=yes --xlator-option *dht.assert-no-child-down=yes
>> --xlator-option *replicate*.data-self-heal=off --xlator-option
>> *replicate*.metadata-self-heal=off --xlator-option
>> *replicate*.entry-self-heal=off --xlator-option *dht.readdir-optimize=on
>> --xlator-option *dht.rebalance-cmd=5 --xlator-option
>> *dht.node-uuid=7c0bf49e-1ede-47b1-b9a5-bfde6e60f07b --xlator-option
>> *dht.commit-hash=3376396580