[Gluster-users] Corruption dangers growing the bricks of a dist-rep volume w/ sharding, on v3.8.8?

2018-12-04 Thread Gambit15
Hi Guys,
 I've got a distributed replicated 2+1 (arbiter) volume with sharding
enabled, running 3.8.8, for VM hosting, and I need to expand it before I
leave over the holiday break.

Each server's brick is mounted on its own LV, so my plan is the following
for each server, one by one (a rough command sketch follows the list):

1. Take the peer offline
2. Add new disks & expand the brick's LV
3. Bring the peer back up
4. Heal the volume, before continuing to the next peer
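
For reference, roughly what I intend to run on each peer once the new disks
are in. Device, VG, LV and volume names below are just examples, and I'm
assuming XFS bricks, which can be grown while mounted:

# grow the brick's LV onto the new disk (names are examples)
pvcreate /dev/sdX
vgextend vg_bricks /dev/sdX
lvextend -l +100%FREE /dev/vg_bricks/lv_brick
xfs_growfs /gluster/data/brick        # assuming an XFS brick filesystem

# once the peer is back and the brick is online, from any peer:
gluster volume heal data info         # wait for 0 pending entries before the next peer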

My question is the following: I know there are reports of the data on
sharded volumes getting corrupted when expanding those volumes. Is that
only the case when expanding a volume by adding new bricks, or can that
also happen if the bricks' LVs are expanded, without making any changes to
the underlying layout of the volume's bricks?

Upgrading the Gluster version isn't an option for the moment, that will
have to wait until next year when I upgrade the entire environment.

Many thanks,
 Doug
___
Gluster-users mailing list
Gluster-users@gluster.org
https://lists.gluster.org/mailman/listinfo/gluster-users

[Gluster-users] "gfid differs on subvolume"

2018-12-04 Thread Gambit15
Hi Guys,
 I've got a distributed replica 2+1 (rep 3 arbiter 1) cluster, and it
appears a shard has been assigned different GFIDs on each replica set.

===
[2018-11-29 10:05:12.035422] W [MSGID: 109009]
[dht-common.c:2148:dht_lookup_linkfile_cbk] 0-data-novo-dht:
/.shard/a46fd27c-5aa6-4fc8-b8e1-c097065e7096.1846: gfid different on data
file on data-novo-replicate-1, gfid local =
00000000-0000-0000-0000-000000000000, gfid node =
492d52d6-e3d1-4ed4-918d-9cdab7a135e0
[2018-11-29 10:05:12.036120] W [MSGID: 109009]
[dht-common.c:1887:dht_lookup_everywhere_cbk] 0-data-novo-dht:
/.shard/a46fd27c-5aa6-4fc8-b8e1-c097065e7096.1846: gfid differs on
subvolume data-novo-replicate-1, gfid local =
c7f6cc63-ae40-4d1a-aa6f-fe97f7912036, gfid node =
492d52d6-e3d1-4ed4-918d-9cdab7a135e0
[2018-11-29 10:05:12.036159] E [MSGID: 133010]
[shard.c:1582:shard_common_lookup_shards_cbk] 0-data-novo-shard: Lookup on
shard 1846 failed. Base file gfid = a46fd27c-5aa6-4fc8-b8e1-c097065e7096
[Stale file handle]
[2018-11-29 10:05:12.036184] W [fuse-bridge.c:2228:fuse_readv_cbk]
0-glusterfs-fuse: 6916126: READ => -1
gfid=a46fd27c-5aa6-4fc8-b8e1-c097065e7096 fd=0x7f12fa57f06c (Stale file
handle)
===

==== FIRST REPLICA SET (v0, v1 + arbiter on v2) ====

v0:~$ ls -l
/gluster/data-novo/brick/.shard/a46fd27c-5aa6-4fc8-b8e1-c097065e7096.1846
---------T. 2 root root 0 Oct 17 11:28
/gluster/data-novo/brick/.shard/a46fd27c-5aa6-4fc8-b8e1-c097065e7096.1846

v0:~$ getfattr -d -m . -e hex
/gluster/data-novo/brick/.shard/a46fd27c-5aa6-4fc8-b8e1-c097065e7096.1846
# file:
gluster/data-novo/brick/.shard/a46fd27c-5aa6-4fc8-b8e1-c097065e7096.1846
security.selinux=0x73797374656d5f753a6f626a6563745f723a756e6c6162656c65645f743a733000
trusted.gfid=0xc7f6cc63ae404d1aaa6ffe97f7912036
trusted.glusterfs.dht.linkto=0x646174612d6e6f766f2d7265706c69636174652d3100

v1:~$ ls -l
/gluster/data-novo/brick/.shard/a46fd27c-5aa6-4fc8-b8e1-c097065e7096.1846
---------T. 2 root root 0 Oct 17 11:28
/gluster/data-novo/brick/.shard/a46fd27c-5aa6-4fc8-b8e1-c097065e7096.1846

v1:~$ getfattr -d -m . -e hex
/gluster/data-novo/brick/.shard/a46fd27c-5aa6-4fc8-b8e1-c097065e7096.1846
# file:
gluster/data-novo/brick/.shard/a46fd27c-5aa6-4fc8-b8e1-c097065e7096.1846
security.selinux=0x73797374656d5f753a6f626a6563745f723a756e6c6162656c65645f743a733000
trusted.gfid=0xc7f6cc63ae404d1aaa6ffe97f7912036
trusted.glusterfs.dht.linkto=0x646174612d6e6f766f2d7265706c69636174652d3100

v2:~$ ls -l
/gluster/data-novo/arbiter/.shard/a46fd27c-5aa6-4fc8-b8e1-c097065e7096.1846
---------T. 2 root root 0 Oct 17 11:28
/gluster/data-novo/arbiter/.shard/a46fd27c-5aa6-4fc8-b8e1-c097065e7096.1846

v2:~$ getfattr -d -m . -e hex
/gluster/data-novo/arbiter/.shard/a46fd27c-5aa6-4fc8-b8e1-c097065e7096.1846
# file:
gluster/data-novo/arbiter/.shard/a46fd27c-5aa6-4fc8-b8e1-c097065e7096.1846
security.selinux=0x73797374656d5f753a6f626a6563745f723a756e6c6162656c65645f743a733000
trusted.gfid=0xc7f6cc63ae404d1aaa6ffe97f7912036
trusted.glusterfs.dht.linkto=0x646174612d6e6f766f2d7265706c69636174652d3100

==== SECOND REPLICA SET (v2, v3 + arbiter on v0) ====

v2:~$ ls -l
/gluster/data-novo/brick/.shard/a46fd27c-5aa6-4fc8-b8e1-c097065e7096.1846
-rw-rw. 2 root root 536870912 Nov 27 14:15
/gluster/data-novo/brick/.shard/a46fd27c-5aa6-4fc8-b8e1-c097065e7096.1846

v2:~$ getfattr -d -m . -e hex
/gluster/data-novo/brick/.shard/a46fd27c-5aa6-4fc8-b8e1-c097065e7096.1846
# file:
gluster/data-novo/brick/.shard/a46fd27c-5aa6-4fc8-b8e1-c097065e7096.1846
security.selinux=0x73797374656d5f753a6f626a6563745f723a756e6c6162656c65645f743a733000
trusted.afr.dirty=0x
trusted.bit-rot.version=0x02005b4a379750d5
trusted.gfid=0x492d52d6e3d14ed4918d9cdab7a135e0

v3:~$ ls -l
/gluster/data-novo/brick/.shard/a46fd27c-5aa6-4fc8-b8e1-c097065e7096.1846
-rw-rw. 2 root root 536870912 Nov 27 14:15
/gluster/data-novo/brick/.shard/a46fd27c-5aa6-4fc8-b8e1-c097065e7096.1846

v3:~$ getfattr -d -m . -e hex
/gluster/data-novo/brick/.shard/a46fd27c-5aa6-4fc8-b8e1-c097065e7096.1846
# file:
gluster/data-novo/brick/.shard/a46fd27c-5aa6-4fc8-b8e1-c097065e7096.1846
security.selinux=0x73797374656d5f753a6f626a6563745f723a756e6c6162656c65645f743a733000
trusted.afr.dirty=0x
trusted.bit-rot.version=0x04005bda0242000c7cdd
trusted.gfid=0x492d52d6e3d14ed4918d9cdab7a135e0

v0:~$ ls -l
/gluster/data-novo/arbiter/.shard/a46fd27c-5aa6-4fc8-b8e1-c097065e7096.1846
-rw-rw. 2 root root 0 Oct 17 11:28
/gluster/data-novo/arbiter/.shard/a46fd27c-5aa6-4fc8-b8e1-c097065e7096.1846

v0:~$ getfattr -d -m . -e hex
/gluster/data-novo/arbiter/.shard/a46fd27c-5aa6-4fc8-b8e1-c097065e7096.1846
# file:
gluster/data-novo/arbiter/.shard/a46fd27c-5aa6-4fc8-b8e1-c097065e7096.1846
security.selinux=0x73797374656d5f753a6f626a6563745f723a756e6c6162656c65645f743a733000
trusted.afr.dirty=0x
trusted.bit-rot.version=0x03005bda02410001b61a
trusted.gfid=0x492d52d6e3d14ed4918d9cdab7a135e0

=
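
If it helps frame the question: the copy on the first replica set looks like
a stale DHT linkto file (zero bytes, mode ---------T, with a
trusted.glusterfs.dht.linkto xattr pointing at data-novo-replicate-1), while
the real shard lives on the second replica set. My understanding of the
usual cleanup, which I'd like confirmed before touching anything, is roughly:

# on each brick of the first replica set (v0, v1, and the arbiter path on v2)
B=/gluster/data-novo/brick            # /gluster/data-novo/arbiter on v2
rm -f $B/.glusterfs/c7/f6/c7f6cc63-ae40-4d1a-aa6f-fe97f7912036
rm -f $B/.shard/a46fd27c-5aa6-4fc8-b8e1-c097065e7096.1846
# then let the next lookup of that shard (e.g. the VM re-reading that region)
# recreate a linkto file with the correct gfid

Is that the right approach here, or is something else going on?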

[Gluster-users] Corresponding op-version for each release?

2018-11-26 Thread Gambit15
Hey,
 The op-version for each release doesn't seem to be documented anywhere,
not even in the release notes. Does anyone know where this information can
be found?

In this case, I've just upgraded from 3.8 to 3.12 and need to update my
pool's compatibility version, however I'm sure it'd be useful for the
community to have a comprehensive list somewhere...
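
For reference, what I plan to run once I've confirmed the right value. The
commands are standard; the 31200 figure is my assumption for 3.12 and should
be checked against cluster.max-op-version (available since 3.10) first:

gluster volume get all cluster.op-version       # current pool op-version
gluster volume get all cluster.max-op-version   # highest the installed version supports
gluster volume set all cluster.op-version 31200 # value assumed for 3.12, confirm first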

Regards,
___
Gluster-users mailing list
Gluster-users@gluster.org
https://lists.gluster.org/mailman/listinfo/gluster-users

Re: [Gluster-users] Files not healing & missing their extended attributes - Help!

2018-07-04 Thread Gambit15
s the GFID's reported in the client log. I did however find the
symlinked files in a .glusterfs directory under the parent directory's GFID.

[root@v0 .glusterfs]# ls -l
/gluster/engine/brick/.glusterfs/db/9a/db9afb92-d2bc-49ed-8e34-dcd437ba7be2/
total 0
lrwxrwxrwx. 2 vdsm kvm 132 Jun 30 14:55 hosted-engine.lockspace ->
/var/run/vdsm/storage/98495dbc-a29c-4893-b6a0-0aa70860d0c9/2502aff4-6c67-4643-b681-99f2c87e793d/03919182-6be2-4cbc-aea2-b9d68422a800
lrwxrwxrwx. 2 vdsm kvm 132 Jun 30 14:55 hosted-engine.metadata ->
/var/run/vdsm/storage/98495dbc-a29c-4893-b6a0-0aa70860d0c9/99510501-6bdc-485a-98e8-c2f82ff8d519/71fa7e6c-cdfb-4da8-9164-2404b518d0ee


So if I delete those two symlinks & the files they point to, on one of the
two bricks, will that resolve the GFID split-brain? Is that the right approach?
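
In other words, something along these lines, using the GFIDs from the client
log below and picking one brick (say s0/engine-client-0) to lose its copies.
This is only a sketch of what I have in mind, not something I've run:

# on the brick chosen to discard its copies (paths from my setup)
B=/gluster/engine/brick
D=98495dbc-a29c-4893-b6a0-0aa70860d0c9/ha_agent
rm -f $B/$D/hosted-engine.metadata
rm -f $B/.glusterfs/b9/cd/b9cd7613-3b96-415d-a549-1dc788a4f94d   # its gfid handle, if present
rm -f $B/$D/hosted-engine.lockspace
rm -f $B/.glusterfs/e5/89/e5899a4c-dc5d-487e-84b0-9bbc73133c25
gluster volume heal engine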


> Thanks & Regards,
> Karthik
>
> On Wed, Jul 4, 2018 at 1:59 AM Gambit15  wrote:
>
>> On 1 July 2018 at 22:37, Ashish Pandey  wrote:
>>
>>>
>>> The only problem at the moment is that the arbiter brick is offline. You
>>> should only be concerned with completing the arbiter brick's maintenance ASAP.
>>> Bring this brick UP, start a FULL heal or index heal, and the volume will
>>> be in a healthy state.
>>>
>>
>> Doesn't the arbiter only resolve split-brain situations? None of the
>> files that have been marked for healing are marked as in split-brain.
>>
>> The arbiter has now been brought back up, however the problem continues.
>>
>> I've found the following information in the client log:
>>
>> [2018-07-03 19:09:29.245089] W [MSGID: 108008]
>> [afr-self-heal-name.c:354:afr_selfheal_name_gfid_mismatch_check]
>> 0-engine-replicate-0: GFID mismatch for > dcd437ba7be2>/hosted-engine.metadata 5e95ba8c-2f12-49bf-be2d-b4baf210d366
>> on engine-client-1 and b9cd7613-3b96-415d-a549-1dc788a4f94d on
>> engine-client-0
>> [2018-07-03 19:09:29.245585] W [fuse-bridge.c:471:fuse_entry_cbk]
>> 0-glusterfs-fuse: 10430040: LOOKUP() /98495dbc-a29c-4893-b6a0-
>> 0aa70860d0c9/ha_agent/hosted-engine.metadata => -1 (Input/output error)
>> [2018-07-03 19:09:30.619000] W [MSGID: 108008]
>> [afr-self-heal-name.c:354:afr_selfheal_name_gfid_mismatch_check]
>> 0-engine-replicate-0: GFID mismatch for > dcd437ba7be2>/hosted-engine.lockspace 8e86902a-c31c-4990-b0c5-0318807edb8f
>> on engine-client-1 and e5899a4c-dc5d-487e-84b0-9bbc73133c25 on
>> engine-client-0
>> [2018-07-03 19:09:30.619360] W [fuse-bridge.c:471:fuse_entry_cbk]
>> 0-glusterfs-fuse: 10430656: LOOKUP() /98495dbc-a29c-4893-b6a0-
>> 0aa70860d0c9/ha_agent/hosted-engine.lockspace => -1 (Input/output error)
>>
>> As you can see from the logs I posted previously, neither of those two
>> files, on either of the two servers, have any of gluster's extended
>> attributes set.
>>
>> The arbiter doesn't have any record of the files in question, as they
>> were created after it went offline.
>>
>> How do I fix this? Is it possible to locate the correct gfids somewhere &
>> redefine them on the files manually?
>>
>> Cheers,
>>  Doug
>>
>> --
>>> *From: *"Gambit15" 
>>> *To: *"Ashish Pandey" 
>>> *Cc: *"gluster-users" 
>>> *Sent: *Monday, July 2, 2018 1:45:01 AM
>>> *Subject: *Re: [Gluster-users] Files not healing & missing their
>>> extended attributes - Help!
>>>
>>>
>>> Hi Ashish,
>>>
>>> The output is below. It's a rep 2+1 volume. The arbiter is offline for
>>> maintenance at the moment, however quorum is met & no files are reported as
>>> in split-brain (it hosts VMs, so files aren't accessed concurrently).
>>>
>>> ==
>>> [root@v0 glusterfs]# gluster volume info engine
>>>
>>> Volume Name: engine
>>> Type: Replicate
>>> Volume ID: 279737d3-3e5a-4ee9-8d4a-97edcca42427
>>> Status: Started
>>> Snapshot Count: 0
>>> Number of Bricks: 1 x (2 + 1) = 3
>>> Transport-type: tcp
>>> Bricks:
>>> Brick1: s0:/gluster/engine/brick
>>> Brick2: s1:/gluster/engine/brick
>>> Brick3: s2:/gluster/engine/arbiter (arbiter)
>>> Options Reconfigured:
>>> nfs.disable: on
>>> performance.readdir-ahead: on
>>> transport.address-family: inet
>>> performance.quick-read: off
>>> performance.read-ahead: off
>>> performance.io-cache: off
>>> performance.stat-prefetch: off
>>> cluster.eager-lock: enable
>>> network.remote-dio: enable
>>>

Re: [Gluster-users] Files not healing & missing their extended attributes - Help!

2018-07-04 Thread Gambit15
On 3 July 2018 at 23:37, Vlad Kopylov  wrote:

> might be too late but sort of simple always working solution for such
> cases is rebuilding .glusterfs
>
> kill it and query attr for all files again, it will recreate .glusterfs on
> all bricks
>
> something like mentioned here
> https://lists.gluster.org/pipermail/gluster-users/2018-January/033352.html
>

Is my problem with .glusterfs though? I'd be super cautious removing the
entire directory unless I'm sure that's the solution...
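
For my own notes, my reading of that thread boils down to something like the
following, per brick; completely unverified by me, hence the caution (the
mount point path is made up):

# per brick, as I understand the linked procedure (unverified)
systemctl stop glusterd                # or stop just this brick's glusterfsd
mv /gluster/engine/brick/.glusterfs /gluster/engine/brick/.glusterfs.bak
systemctl start glusterd
gluster volume start engine force      # make sure the brick process is back
# then, from a FUSE mount, stat everything so the gfid handles get recreated
find /mnt/engine -exec stat {} + > /dev/null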

Cheers,


> On Tue, Jul 3, 2018 at 4:27 PM, Gambit15  wrote:
>
>> On 1 July 2018 at 22:37, Ashish Pandey  wrote:
>>
>>>
>>> The only problem at the moment is that the arbiter brick is offline. You
>>> should only be concerned with completing the arbiter brick's maintenance ASAP.
>>> Bring this brick UP, start a FULL heal or index heal, and the volume will
>>> be in a healthy state.
>>>
>>
>> Doesn't the arbiter only resolve split-brain situations? None of the
>> files that have been marked for healing are marked as in split-brain.
>>
>> The arbiter has now been brought back up, however the problem continues.
>>
>> I've found the following information in the client log:
>>
>> [2018-07-03 19:09:29.245089] W [MSGID: 108008]
>> [afr-self-heal-name.c:354:afr_selfheal_name_gfid_mismatch_check]
>> 0-engine-replicate-0: GFID mismatch for > dcd437ba7be2>/hosted-engine.metadata 5e95ba8c-2f12-49bf-be2d-b4baf210d366
>> on engine-client-1 and b9cd7613-3b96-415d-a549-1dc788a4f94d on
>> engine-client-0
>> [2018-07-03 19:09:29.245585] W [fuse-bridge.c:471:fuse_entry_cbk]
>> 0-glusterfs-fuse: 10430040: LOOKUP() /98495dbc-a29c-4893-b6a0-0aa70
>> 860d0c9/ha_agent/hosted-engine.metadata => -1 (Input/output error)
>> [2018-07-03 19:09:30.619000] W [MSGID: 108008]
>> [afr-self-heal-name.c:354:afr_selfheal_name_gfid_mismatch_check]
>> 0-engine-replicate-0: GFID mismatch for > dcd437ba7be2>/hosted-engine.lockspace 8e86902a-c31c-4990-b0c5-0318807edb8f
>> on engine-client-1 and e5899a4c-dc5d-487e-84b0-9bbc73133c25 on
>> engine-client-0
>> [2018-07-03 19:09:30.619360] W [fuse-bridge.c:471:fuse_entry_cbk]
>> 0-glusterfs-fuse: 10430656: LOOKUP() /98495dbc-a29c-4893-b6a0-0aa70
>> 860d0c9/ha_agent/hosted-engine.lockspace => -1 (Input/output error)
>>
>> As you can see from the logs I posted previously, neither of those two
>> files, on either of the two servers, have any of gluster's extended
>> attributes set.
>>
>> The arbiter doesn't have any record of the files in question, as they
>> were created after it went offline.
>>
>> How do I fix this? Is it possible to locate the correct gfids somewhere &
>> redefine them on the files manually?
>>
>> Cheers,
>>  Doug
>>
>> --
>>> *From: *"Gambit15" 
>>> *To: *"Ashish Pandey" 
>>> *Cc: *"gluster-users" 
>>> *Sent: *Monday, July 2, 2018 1:45:01 AM
>>> *Subject: *Re: [Gluster-users] Files not healing & missing their
>>> extended attributes - Help!
>>>
>>>
>>> Hi Ashish,
>>>
>>> The output is below. It's a rep 2+1 volume. The arbiter is offline for
>>> maintenance at the moment, however quorum is met & no files are reported as
>>> in split-brain (it hosts VMs, so files aren't accessed concurrently).
>>>
>>> ==
>>> [root@v0 glusterfs]# gluster volume info engine
>>>
>>> Volume Name: engine
>>> Type: Replicate
>>> Volume ID: 279737d3-3e5a-4ee9-8d4a-97edcca42427
>>> Status: Started
>>> Snapshot Count: 0
>>> Number of Bricks: 1 x (2 + 1) = 3
>>> Transport-type: tcp
>>> Bricks:
>>> Brick1: s0:/gluster/engine/brick
>>> Brick2: s1:/gluster/engine/brick
>>> Brick3: s2:/gluster/engine/arbiter (arbiter)
>>> Options Reconfigured:
>>> nfs.disable: on
>>> performance.readdir-ahead: on
>>> transport.address-family: inet
>>> performance.quick-read: off
>>> performance.read-ahead: off
>>> performance.io-cache: off
>>> performance.stat-prefetch: off
>>> cluster.eager-lock: enable
>>> network.remote-dio: enable
>>> cluster.quorum-type: auto
>>> cluster.server-quorum-type: server
>>> storage.owner-uid: 36
>>> storage.owner-gid: 36
>>> performance.low-prio-threads: 32
>>>
>>> ==
>>>
>>> [root@v0 glusterfs]# gluster volume heal engine 

Re: [Gluster-users] Files not healing & missing their extended attributes - Help!

2018-07-03 Thread Gambit15
On 1 July 2018 at 22:37, Ashish Pandey  wrote:

>
> The only problem at the moment is that the arbiter brick is offline. You
> should only be concerned with completing the arbiter brick's maintenance ASAP.
> Bring this brick UP, start a FULL heal or index heal, and the volume will be
> in a healthy state.
>

Doesn't the arbiter only resolve split-brain situations? None of the files
that have been marked for healing are marked as in split-brain.

The arbiter has now been brought back up, however the problem continues.

I've found the following information in the client log:

[2018-07-03 19:09:29.245089] W [MSGID: 108008]
[afr-self-heal-name.c:354:afr_selfheal_name_gfid_mismatch_check]
0-engine-replicate-0: GFID mismatch for
<gfid:db9afb92-d2bc-49ed-8e34-dcd437ba7be2>/hosted-engine.metadata
5e95ba8c-2f12-49bf-be2d-b4baf210d366 on engine-client-1 and
b9cd7613-3b96-415d-a549-1dc788a4f94d on engine-client-0
[2018-07-03 19:09:29.245585] W [fuse-bridge.c:471:fuse_entry_cbk]
0-glusterfs-fuse: 10430040: LOOKUP()
/98495dbc-a29c-4893-b6a0-0aa70860d0c9/ha_agent/hosted-engine.metadata => -1
(Input/output error)
[2018-07-03 19:09:30.619000] W [MSGID: 108008]
[afr-self-heal-name.c:354:afr_selfheal_name_gfid_mismatch_check]
0-engine-replicate-0: GFID mismatch for
<gfid:db9afb92-d2bc-49ed-8e34-dcd437ba7be2>/hosted-engine.lockspace
8e86902a-c31c-4990-b0c5-0318807edb8f on engine-client-1 and
e5899a4c-dc5d-487e-84b0-9bbc73133c25 on engine-client-0
[2018-07-03 19:09:30.619360] W [fuse-bridge.c:471:fuse_entry_cbk]
0-glusterfs-fuse: 10430656: LOOKUP()
/98495dbc-a29c-4893-b6a0-0aa70860d0c9/ha_agent/hosted-engine.lockspace =>
-1 (Input/output error)

As you can see from the logs I posted previously, neither of those two
files, on either of the two servers, have any of gluster's extended
attributes set.

The arbiter doesn't have any record of the files in question, as they were
created after it went offline.

How do I fix this? Is it possible to locate the correct gfids somewhere &
redefine them on the files manually?
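
To be clearer about what I mean by "manually": something like the following,
purely illustrative, stamping the gfid the log reports on engine-client-1
onto s0's copy. I'd obviously want confirmation that this is sane before
touching anything:

F=/gluster/engine/brick/98495dbc-a29c-4893-b6a0-0aa70860d0c9/ha_agent/hosted-engine.metadata
getfattr -h -d -m trusted -e hex $F     # currently shows no trusted.* xattrs
setfattr -h -n trusted.gfid -v 0x5e95ba8c2f1249bfbe2db4baf210d366 $F
# ...though presumably the matching .glusterfs/5e/95/... handle would also
# need to exist for the heal to work, which is why I'm asking first.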

Cheers,
 Doug

--
> *From: *"Gambit15" 
> *To: *"Ashish Pandey" 
> *Cc: *"gluster-users" 
> *Sent: *Monday, July 2, 2018 1:45:01 AM
> *Subject: *Re: [Gluster-users] Files not healing & missing their extended
> attributes - Help!
>
>
> Hi Ashish,
>
> The output is below. It's a rep 2+1 volume. The arbiter is offline for
> maintenance at the moment, however quorum is met & no files are reported as
> in split-brain (it hosts VMs, so files aren't accessed concurrently).
>
> ==
> [root@v0 glusterfs]# gluster volume info engine
>
> Volume Name: engine
> Type: Replicate
> Volume ID: 279737d3-3e5a-4ee9-8d4a-97edcca42427
> Status: Started
> Snapshot Count: 0
> Number of Bricks: 1 x (2 + 1) = 3
> Transport-type: tcp
> Bricks:
> Brick1: s0:/gluster/engine/brick
> Brick2: s1:/gluster/engine/brick
> Brick3: s2:/gluster/engine/arbiter (arbiter)
> Options Reconfigured:
> nfs.disable: on
> performance.readdir-ahead: on
> transport.address-family: inet
> performance.quick-read: off
> performance.read-ahead: off
> performance.io-cache: off
> performance.stat-prefetch: off
> cluster.eager-lock: enable
> network.remote-dio: enable
> cluster.quorum-type: auto
> cluster.server-quorum-type: server
> storage.owner-uid: 36
> storage.owner-gid: 36
> performance.low-prio-threads: 32
>
> ==
>
> [root@v0 glusterfs]# gluster volume heal engine info
> Brick s0:/gluster/engine/brick
> /__DIRECT_IO_TEST__
> /98495dbc-a29c-4893-b6a0-0aa70860d0c9/ha_agent
> /98495dbc-a29c-4893-b6a0-0aa70860d0c9
> 
> Status: Connected
> Number of entries: 34
>
> Brick s1:/gluster/engine/brick
> 
> Status: Connected
> Number of entries: 34
>
> Brick s2:/gluster/engine/arbiter
> Status: Transport endpoint is not connected
> Number of entries: -
>
> ==
> === PEER V0 ===
>
> [root@v0 glusterfs]# getfattr -m . -d -e hex /gluster/engine/brick/
> 98495dbc-a29c-4893-b6a0-0aa70860d0c9/ha_agent
> getfattr: Removing leading '/' from absolute path names
> # file: gluster/engine/brick/98495dbc-a29c-4893-b6a0-0aa70860d0c9/ha_agent
> security.selinux=0x73797374656d5f753a6f626a6563
> 745f723a756e6c6162656c65645f743a733000
> trusted.afr.dirty=0x
> trusted.afr.engine-client-2=0x24e8
> trusted.gfid=0xdb9afb92d2bc49ed8e34dcd437ba7be2
> trusted.glusterfs.dht=0x0001
>
> [root@v0 glusterfs]# getfattr -m . -d -e hex /gluster/engine/brick/
> 98495dbc-a29c-4893-b6a0-0aa70860d0c9/ha_agent/*
> getfattr: Removing leading '/' from absolute path names
> # file: gluster/engine/brick/98495dbc-a29c-4893-b6a0-0aa70860d0c9/
> ha_agent/hosted-engine.lockspace
> security.selinux=

Re: [Gluster-users] Files not healing & missing their extended attributes - Help!

2018-07-01 Thread Gambit15
  status
> 7 - glustershd.log output just after you run full heal or index heal
>
> 
> Ashish
>
> --
> *From: *"Gambit15" 
> *To: *"gluster-users" 
> *Sent: *Sunday, July 1, 2018 11:50:16 PM
> *Subject: *[Gluster-users] Files not healing & missing their
> extendedattributes - Help!
>
>
> Hi Guys,
>  I had to restart our datacenter yesterday, but since doing so a number of
> the files on my gluster share have been stuck, marked as healing. After no
> signs of progress, I manually set off a full heal last night, but after
> 24hrs, nothing's happened.
>
> The gluster logs all look normal, and there're no messages about failed
> connections or heal processes kicking off.
>
> I checked the listed files' extended attributes on their bricks today, and
> they only show the selinux attribute. There's none of the trusted.*
> attributes I'd expect.
> The healthy files on the bricks do have their extended attributes though.
>
> I'm guessing that perhaps the files somehow lost their attributes, and
> gluster is no longer able to work out what to do with them? It's not logged
> any errors, warnings, or anything else out of the normal though, so I've no
> idea what the problem is or how to resolve it.
>
> I've got 16 hours to get this sorted before the start of work, Monday.
> Help!
>
> ___
> Gluster-users mailing list
> Gluster-users@gluster.org
> http://lists.gluster.org/mailman/listinfo/gluster-users
>
>
___
Gluster-users mailing list
Gluster-users@gluster.org
http://lists.gluster.org/mailman/listinfo/gluster-users

[Gluster-users] Files not healing & missing their extended attributes - Help!

2018-07-01 Thread Gambit15
Hi Guys,
 I had to restart our datacenter yesterday, but since doing so a number of
the files on my gluster share have been stuck, marked as healing. After no
signs of progress, I manually set off a full heal last night, but after
24hrs, nothing's happened.

The gluster logs all look normal, and there're no messages about failed
connections or heal processes kicking off.

I checked the listed files' extended attributes on their bricks today, and
they only show the selinux attribute. There's none of the trusted.*
attributes I'd expect.
The healthy files on the bricks do have their extended attributes though.

I'm guessing that perhaps the files somehow lost their attributes, and
gluster is no longer able to work out what to do with them? It's not logged
any errors, warnings, or anything else out of the normal though, so I've no
idea what the problem is or how to resolve it.

I've got 16 hours to get this sorted before the start of work, Monday. Help!
___
Gluster-users mailing list
Gluster-users@gluster.org
http://lists.gluster.org/mailman/listinfo/gluster-users

Re: [Gluster-users] Arbiter node as VM

2017-06-29 Thread Gambit15
As long as the VM isn't hosted on one of the two Gluster nodes, that's
perfectly fine. One of my smaller clusters uses the same setup.

As for your other questions, as long as it supports Unix file permissions,
Gluster doesn't care what filesystem you use. Mix & match as you wish. Just
try to keep matching Gluster versions across your nodes.
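
When you do add it, converting the existing replica 2 volume should be a
single add-brick, along these lines (volume name and brick path below are
made up, and it's worth double-checking that 3.8.11 supports the
replica-2-to-arbiter conversion):

gluster peer probe arbiter-node
gluster volume add-brick myvol replica 3 arbiter 1 arbiter-node:/data/glusterfs/myvol/arbiter
gluster volume heal myvol info    # the arbiter only stores metadata, so this settles quickly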


On 29 June 2017 at 16:10, mabi  wrote:

> Hello,
>
> I have a replica 2 GlusterFS 3.8.11 cluster on 2 Debian 8 physical servers
> using ZFS as filesystem. Now in order to avoid a split-brain situation I
> would like to add a third node as arbiter.
>
> Regarding the arbiter node I have a few questions:
> - can the arbiter node be a virtual machine? (I am planning to use Xen as
> hypervisor)
> - can I use ext4 as file system on my arbiter? or does it need to be ZFS
> as the two other nodes?
> - or should I use XFS with LVM thin provisioning here, as mentioned in the documentation?
> - is it OK that my arbiter runs Debian 9 (Linux kernel v4) and my other
> two nodes run Debian 8 (kernel v3)?
> - what about thin provisioning of my volume on the arbiter node (
> https://gluster.readthedocs.io/en/latest/Administrator%20Guide/Setting%20Up%20Volumes/) is this required? On my two other nodes
> I do not use any thin provisioning neither LVM but simply ZFS.
>
> Thanks in advance for your input.
>
> Best regards,
> Mabi
>
> ___
> Gluster-users mailing list
> Gluster-users@gluster.org
> http://lists.gluster.org/mailman/listinfo/gluster-users
>
___
Gluster-users mailing list
Gluster-users@gluster.org
http://lists.gluster.org/mailman/listinfo/gluster-users

[Gluster-users] Quorum lost with single failed peer in rep 3...?

2017-05-22 Thread Gambit15
Hey guys,
 I use a replica 3 arbiter 1 setup for hosting VMs, and have just had an
issue where taking one of the non-arbiter peers offline caused gluster to
complain of lost quorum & pause the volumes.

The two "full" peers host the VMs and data, and the arbiter is a VM on a
neighbouring cluster.

Before taking the peer offline, I migrated all VMs from it, verified there
were no running heal processes, and that all peers were connected.

Quorum is configured with the following...
cluster.server-quorum-type: server
cluster.quorum-type: auto

As I understand it, quorum auto means 51%, so quorum should be maintained
if any one of the peers fails. There have been a couple of occasions where
the arbiter went offline, but quorum was maintained as expected &
everything continued to function.
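
For reference, this is how I'd verify the effective quorum settings (my
understanding is that server quorum defaults to requiring more than 50% of
the peers when no ratio is set explicitly):

gluster volume get data cluster.quorum-type
gluster volume get data cluster.server-quorum-type
gluster volume get data cluster.server-quorum-ratio   # global option; defaults to >50% when unset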

When the volumes were paused, I connected to the remaining node to see what
was going on. "gluster peer status" reported the offline node as
disconnected & the arbiter as connected, as expected. All "gluster volume"
commands hung.
When the offline node was rebooted, quorum returned & all services resumed.

From the logs (pasted below), it appears the primary node & the arbiter
disconnected from each other around the time the secondary node went
offline, although that's contrary to what was reported by "gluster peer
status".

s0, 10.123.123.10: Full peer
s1, 10.123.123.11: Full peer (taken offline)
s2, 10.123.123.12: Arbiter


= s0 =

[2017-05-22 18:24:20.854775] I [MSGID: 106163]
[glusterd-handshake.c:1271:__glusterd_mgmt_hndsk_versions_ack]
0-management: using the op-version 30800
[2017-05-22 18:31:30.398272] E [rpc-clnt.c:200:call_bail] 0-management:
bailing out frame type(Peer mgmt) op(--(2)) xid = 0x7ab6 sent = 2017-05-22
18:21:20.549877. timeout = 600 for 10.123.123.12:24007
[2017-05-22 18:35:20.420878] E [rpc-clnt.c:200:call_bail] 0-management:
bailing out frame type(glusterd mgmt) op(--(3)) xid = 0x7ab7 sent =
2017-05-22 18:25:11.187323. timeout = 600 for 10.123.123.12:24007
[2017-05-22 18:35:20.420943] E [MSGID: 106153]
[glusterd-syncop.c:113:gd_collate_errors] 0-glusterd: Staging failed on s2.
Please check log file for details.
[2017-05-22 18:35:20.421103] I [socket.c:3465:socket_submit_reply]
0-socket.management: not connected (priv->connected = -1)
[2017-05-22 18:35:20.421126] E [rpcsvc.c:1325:rpcsvc_submit_generic]
0-rpc-service: failed to submit message (XID: 0x1, Program: GlusterD svc
cli, ProgVers: 2, Proc: 27) to rpc-transport (socket.management)
[2017-05-22 18:35:20.421145] E [MSGID: 106430]
[glusterd-utils.c:470:glusterd_submit_reply] 0-glusterd: Reply submission
failed
[2017-05-22 18:36:24.732098] W [socket.c:590:__socket_rwv] 0-management:
readv on 10.123.123.11:24007 failed (No data available)
[2017-05-22 18:36:24.732214] I [MSGID: 106004]
[glusterd-handler.c:5219:__glusterd_peer_rpc_notify] 0-management: Peer
 (), in state , has disconnected from glusterd.
[2017-05-22 18:36:24.732293] W
[glusterd-locks.c:675:glusterd_mgmt_v3_unlock]
(-->/usr/lib64/glusterfs/3.8.5/xlator/mgmt/glusterd.so(+0x1de5c)
[0x7f17ae400e5c] -->/usr/lib64/glusterfs/3.8.5/xlator/
mgmt/glusterd.so(+0x27a08) [0x7f17ae40aa08]
-->/usr/lib64/glusterfs/3.8.5/xlator/mgmt/glusterd.so(+0xd07fa)
[0x7f17ae4b37fa] ) 0-management: Lock for vol data not held
[2017-05-22 18:36:24.732303] W [MSGID: 106118]
[glusterd-handler.c:5241:__glusterd_peer_rpc_notify] 0-management: Lock not
released for data
[2017-05-22 18:36:24.732323] W
[glusterd-locks.c:675:glusterd_mgmt_v3_unlock]
(-->/usr/lib64/glusterfs/3.8.5/xlator/mgmt/glusterd.so(+0x1de5c)
[0x7f17ae400e5c] -->/usr/lib64/glusterfs/3.8.5/xlator/
mgmt/glusterd.so(+0x27a08) [0x7f17ae40aa08]
-->/usr/lib64/glusterfs/3.8.5/xlator/mgmt/glusterd.so(+0xd07fa)
[0x7f17ae4b37fa] ) 0-management: Lock for vol engine not held
[2017-05-22 18:36:24.732330] W [MSGID: 106118]
[glusterd-handler.c:5241:__glusterd_peer_rpc_notify] 0-management: Lock not
released for engine
[2017-05-22 18:36:24.732350] W
[glusterd-locks.c:675:glusterd_mgmt_v3_unlock]
(-->/usr/lib64/glusterfs/3.8.5/xlator/mgmt/glusterd.so(+0x1de5c)
[0x7f17ae400e5c]
-->/usr/lib64/glusterfs/3.8.5/xlator/mgmt/glusterd.so(+0x27a08)
[0x7f17ae40aa08]
-->/usr/lib64/glusterfs/3.8.5/xlator/mgmt/glusterd.so(+0xd07fa)
[0x7f17ae4b37fa] ) 0-management: Lock for vol iso not held
[2017-05-22 18:36:24.732382] W [MSGID: 106118]
[glusterd-handler.c:5241:__glusterd_peer_rpc_notify] 0-management: Lock not
released for iso
[2017-05-22 18:36:24.732405] C [MSGID: 106002]
[glusterd-server-quorum.c:346:glusterd_do_volume_quorum_action]
0-management: Server quorum lost for volume data. Stopping local bricks.
[2017-05-22 18:36:24.740516] C [MSGID: 106002]
[glusterd-server-quorum.c:346:glusterd_do_volume_quorum_action]
0-management: Server quorum lost for volume engine. Stopping local bricks.
[2017-05-22 18:36:24.742215] C [MSGID: 106002]
[glusterd-server-quorum.c:346:glusterd_do_volume_quorum_action]
0-management: Server quorum lost for volume iso. Stopping local bricks.

Re: [Gluster-users] adding arbiter

2017-03-31 Thread Gambit15
As I understand it, only new files will be sharded, but simply renaming or
moving them may be enough in that case.
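
If it turns out a rename isn't enough, my understanding is that forcing a
full rewrite of each file is what actually shards it, along the lines of the
sketch below (the option names are real, the copy-and-rename dance is just my
reading of Laura's suggestion, and the VM should be powered off first):

gluster volume set myvol features.shard on
gluster volume set myvol features.shard-block-size 64MB
# rewrite the image in full so it gets laid out as shards
cp /mnt/myvol/images/vm1.img /mnt/myvol/images/vm1.img.sharded
mv /mnt/myvol/images/vm1.img.sharded /mnt/myvol/images/vm1.img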

I'm interested in the arbiter/sharding bug you've mentioned. Could you
provide any more details or a link?

Cheers,
 D

On 30 March 2017 at 20:25, Laura Bailey  wrote:

> I can't answer all of these, but I think the only way to shard existing
> files is to create a new volume with sharding enabled and copy the files
> over into it.
>
> Cheers,
> Laura B
>
>
> On Friday, March 31, 2017, Alessandro Briosi  wrote:
>
>> Hi I need some advice.
>>
>> I'm currently on 3.8.10 and would like to know the following:
>>
>> 1. If I add an arbiter to an existing volume should I also run a
>> rebalance?
>> 2. If I had sharding enabled would adding the arbiter trigger the
>> corruption bug?
>> 3. What's the procedure to enable sharding on an existing volume so that
>> it shards already existing files?
>> 4. Suppose I have sharding disabled, then add an arbiter brick, then
>> enable sharding and execute the procedure for point 3, would this still
>> trigger the corruption bug?
>>
>> Thanks,
>> Alessandro
>>
>> ___
>> Gluster-users mailing list
>> Gluster-users@gluster.org
>> http://lists.gluster.org/mailman/listinfo/gluster-users
>>
>
>
> --
> Laura Bailey
> Senior Technical Writer
> Customer Content Services BNE
>
>
> ___
> Gluster-users mailing list
> Gluster-users@gluster.org
> http://lists.gluster.org/mailman/listinfo/gluster-users
>
___
Gluster-users mailing list
Gluster-users@gluster.org
http://lists.gluster.org/mailman/listinfo/gluster-users

Re: [Gluster-users] Gluster 3.8.10 rebalance VMs corruption

2017-03-28 Thread Gambit15
On 19 March 2017 at 07:25, Mahdi Adnan  wrote:

> Thank you for your email mate.
>
> Yes, im aware of this but, to save costs i chose replica 2, this cluster
> is all flash.
>

For what it's worth, arbiters FTW!

https://gluster.readthedocs.io/en/latest/Administrator%20Guide/arbiter-volumes-and-quorum/


> In version 3.7.x i had issues with ping timeout, if one hosts went down
> for few seconds the whole cluster hangs and become unavailable, to avoid
> this i adjusted the ping timeout to 5 seconds.
>
> As for choosing Ganesha over gfapi, VMWare does not support Gluster (FUSE
> or gfapi) im stuck with NFS for this volume.
>
> The other volume is mounted using gfapi in oVirt cluster.
>
>
>
>
>
> --
>
> Respectfully
> *Mahdi A. Mahdi*
>
> --
> *From:* Krutika Dhananjay 
> *Sent:* Sunday, March 19, 2017 2:01:49 PM
>
> *To:* Mahdi Adnan
> *Cc:* gluster-users@gluster.org
> *Subject:* Re: [Gluster-users] Gluster 3.8.10 rebalance VMs corruption
>
> While I'm still going through the logs, just wanted to point out a couple
> of things:
>
> 1. It is recommended that you use 3-way replication (replica count 3) for
> VM store use case
> 2. network.ping-timeout at 5 seconds is way too low. Please change it to
> 30.
>
> Is there any specific reason for using NFS-Ganesha over gfapi/FUSE?
>
> Will get back with anything else I might find or more questions if I have
> any.
>
> -Krutika
>
> On Sun, Mar 19, 2017 at 2:36 PM, Mahdi Adnan 
> wrote:
>
>> Thanks mate,
>>
>> Kindly, check the attachment.
>>
>>
>>
>> --
>>
>> Respectfully
>> *Mahdi A. Mahdi*
>>
>> --
>> *From:* Krutika Dhananjay 
>> *Sent:* Sunday, March 19, 2017 10:00:22 AM
>>
>> *To:* Mahdi Adnan
>> *Cc:* gluster-users@gluster.org
>> *Subject:* Re: [Gluster-users] Gluster 3.8.10 rebalance VMs corruption
>>
>> In that case could you share the ganesha-gfapi logs?
>>
>> -Krutika
>>
>> On Sun, Mar 19, 2017 at 12:13 PM, Mahdi Adnan 
>> wrote:
>>
>>> I have two volumes, one is mounted using libgfapi for ovirt mount, the
>>> other one is exported via NFS-Ganesha for VMWare which is the one im
>>> testing now.
>>>
>>>
>>>
>>> --
>>>
>>> Respectfully
>>> *Mahdi A. Mahdi*
>>>
>>> --
>>> *From:* Krutika Dhananjay 
>>> *Sent:* Sunday, March 19, 2017 8:02:19 AM
>>>
>>> *To:* Mahdi Adnan
>>> *Cc:* gluster-users@gluster.org
>>> *Subject:* Re: [Gluster-users] Gluster 3.8.10 rebalance VMs corruption
>>>
>>>
>>>
>>> On Sat, Mar 18, 2017 at 10:36 PM, Mahdi Adnan 
>>> wrote:
>>>
 Kindly, check the attached new log file, i dont know if it's helpful or
 not but, i couldn't find the log with the name you just described.

>>> No. Are you using FUSE or libgfapi for accessing the volume? Or is it
>>> NFS?
>>>
>>> -Krutika
>>>


 --

 Respectfully
 *Mahdi A. Mahdi*

 --
 *From:* Krutika Dhananjay 
 *Sent:* Saturday, March 18, 2017 6:10:40 PM

 *To:* Mahdi Adnan
 *Cc:* gluster-users@gluster.org
 *Subject:* Re: [Gluster-users] Gluster 3.8.10 rebalance VMs corruption

 mnt-disk11-vmware2.log seems like a brick log. Could you attach the
 fuse mount logs? It should be right under /var/log/glusterfs/ directory
 named after the mount point name, only hyphenated.

 -Krutika

 On Sat, Mar 18, 2017 at 7:27 PM, Mahdi Adnan 
 wrote:

> Hello Krutika,
>
>
> Kindly, check the attached logs.
>
>
>
> --
>
> Respectfully
> *Mahdi A. Mahdi*
>
> --
> *From:* Krutika Dhananjay 
> *Sent:* Saturday, March 18, 2017 3:29:03 PM
> *To:* Mahdi Adnan
> *Cc:* gluster-users@gluster.org
> *Subject:* Re: [Gluster-users] Gluster 3.8.10 rebalance VMs corruption
>
> Hi Mahdi,
>
> Could you attach mount, brick and rebalance logs?
>
> -Krutika
>
> On Sat, Mar 18, 2017 at 12:14 AM, Mahdi Adnan  > wrote:
>
>> Hi,
>>
>> I have upgraded to Gluster 3.8.10 today and ran the add-brick
>> procedure in a volume contains few VMs.
>> After the completion of rebalance, i have rebooted the VMs, some of
>> ran just fine, and others just crashed.
>> Windows boot to recovery mode and Linux throw xfs errors and does not
>> boot.
>> I ran the test again and it happened just as the first one, but i
>> have noticed only VMs doing disk IOs are affected by this bug.
>> The VMs in power off mode started fine and even md5 of the disk file
>> did not change after the rebalance.
>>
>> anyone else can confirm this ?
>>
>>
>> Volume info:
>>
>> Volume Name: vmware2
>> Type: Distributed-Replicate
>> Volume ID: 02328d46-a285-4533-aa3a-fb9bfeb688bf
>> Status: Started
>> Snapshot Count: 0
>> Number of Bricks: 22 x 2 = 44
>> Transport-type: tcp
>> Bricks:
>> Brick1: gluster01:/mnt/disk1/vmware2
>> 

Re: [Gluster-users] Backups

2017-03-23 Thread Gambit15
If the entire gluster volume failed, I'd wipe it, set up a fresh master
volume & then copy the VM DR images onto the new volume. To restart each VM
after it's been restored, I'd set up a script to connect to the hypervisor's
API.

Of course, at the level you're speaking of, it could take a fair amount of
time before the last VM is restored.
As long as you've followed a naming standard, you could easily script in a
restore queue based on service priority.

If you need something quicker than that, then you've got little choice but
to go down the HA-with-a-big-fat-pipe route.

On 23 Mar 2017 18:46, "Gandalf Corvotempesta" <
gandalf.corvotempe...@gmail.com> wrote:

> The problem is not how to backup, but how to restore.
> How do you restore a whole cluster made of thousands of VMs ?
>
> If you move all VMs to a shared storage like gluster, you should
> consider how to recover everything from the gluster failure.
> If you had a bunch of VMs on each server with local disks, you'd only have
> to recover the VMs affected by a single server failure,
> but moving everything to a shared storage means to be prepared for a
> disaster, where you *must* restore everything or hundreds of TB.
>
> 2017-03-23 23:07 GMT+01:00 Gambit15 :
> > Don't snapshot the entire gluster volume, keep a rolling routine for
> > snapshotting the individual VMs & rsync those.
> > As already mentioned, you need to "itemize" the backups - trying to
> manage
> > backups for the whole volume as a single unit is just crazy!
> >
> > Also, for long term backups, maintaining just the core data of each VM is
> > far more manageable.
> >
> > I settled on oVirt for our platform, and do the following...
> >
> > A cronjob regularly snapshots & clones each VM, whose image is then
> rsynced
> > to our backup storage;
> > The backup server snapshots the VM's image backup volume to maintain
> > history/versioning;
> > These full images are only maintained for 30 days, for DR purposes;
> > A separate routine rsyncs the VM's core data to its own data backup
> volume,
> > which is snapshotted & maintained for 10 years;
> >
> > This could be made more efficient by using guestfish to extract the core
> > data from backup image, instead of basically rsyncing the data across the
> > network twice.
> >
> > That active storage layer uses Gluster on top of XFS & LVM. The backup
> > storage layer uses a mirrored storage unit running ZFS on FreeNAS.
> > This of course doesn't allow for HA in the case of the entire cloud
> failing.
> > For that we'd use geo-rep & a big fat pipe.
> >
> > D
> >
> > On 23 March 2017 at 16:29, Gandalf Corvotempesta
> >  wrote:
> >>
> >> Yes but the biggest issue is how to recover
> >> You'll need to recover the whole storage not a single snapshot and this
> >> can last for days
> >>
> >> On 23 Mar 2017 9:24 PM, "Alvin Starr"  wrote:
> >>>
> >>> For volume backups you need something like snapshots.
> >>>
> >>> If you take a snapshot A of a live volume L that snapshot stays at that
> >>> moment in time and you can rsync that to another system or use
> something
> >>> like deltacp.pl to copy it.
> >>>
> >>> The usual process is to delete the snapshot once its copied and than
> >>> repeat the process again when the next backup is required.
> >>>
> >>> That process does require rsync/deltacp to read the complete volume on
> >>> both systems which can take a long time.
> >>>
> >>> I was kicking around the idea to try and handle snapshot deltas better.
> >>>
> >>> The idea is that you could take your initial snapshot A then sync that
> >>> snapshot to your backup system.
> >>>
> >>> At a later point you could take another snapshot B.
> >>>
> >>> Because snapshots contain the copies of the original data at the time
> of
> >>> the snapshot and unmodified data points to the Live volume it is
> possible to
> >>> tell what blocks of data have changed since the snapshot was taken.
> >>>
> >>> Now that you have a second snapshot you can in essence perform a diff
> on
> >>> the A and B snapshots to get only the blocks that changed up to the
> time
> >>> that B was taken.
> >>>
> >>> These blocks could be copied to the backup image and you should have a
> >>> clone of the B snapshot.
> 

Re: [Gluster-users] Backups

2017-03-23 Thread Gambit15
Don't snapshot the entire gluster volume, keep a rolling routine for
snapshotting the individual VMs & rsync those.
As already mentioned, you need to "itemize" the backups - trying to manage
backups for the whole volume as a single unit is just crazy!

Also, for long term backups, maintaining just the core data of each VM is
far more manageable.

I settled on oVirt for our platform, and do the following...

   - A cronjob regularly snapshots & clones each VM, whose image is then
   rsynced to our backup storage;
   - The backup server snapshots the VM's *image* backup volume to maintain
   history/versioning;
   - These full images are only maintained for 30 days, for DR purposes;
   - A separate routine rsyncs the VM's core data to its own *data* backup
   volume, which is snapshotted & maintained for 10 years;
  - This could be made more efficient by using guestfish to extract the
   core data from the backup image, instead of basically rsyncing the data
   across the network twice.
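
In rough shell terms, the per-VM loop looks something like this (names and
paths are invented for illustration; the real thing drives the
snapshot/clone step through the oVirt API):

#!/bin/bash
BACKUP=backup1:/tank/vm-images
for VM in $(cat /etc/backup/vm-list); do
    # 1. snapshot + clone the VM via the oVirt API (details omitted)
    # 2. push the cloned image to the backup box
    rsync -a --inplace /exports/clones/$VM.img $BACKUP/$VM/
    # 3. snapshot the per-VM ZFS dataset on the backup box for versioning
    ssh backup1 zfs snapshot tank/vm-images/$VM@$(date +%Y%m%d)
done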

That *active* storage layer uses Gluster on top of XFS & LVM. The *backup*
storage layer uses a mirrored storage unit running ZFS on FreeNAS.
This of course doesn't allow for HA in the case of the entire cloud
failing. For that we'd use geo-rep & a big fat pipe.

D

On 23 March 2017 at 16:29, Gandalf Corvotempesta <
gandalf.corvotempe...@gmail.com> wrote:

> Yes but the biggest issue is how to recover
> You'll need to recover the whole storage not a single snapshot and this
> can last for days
>
> On 23 Mar 2017 9:24 PM, "Alvin Starr"  wrote:
>
>> For volume backups you need something like snapshots.
>>
>> If you take a snapshot A of a live volume L that snapshot stays at that
>> moment in time and you can rsync that to another system or use something
>> like deltacp.pl to copy it.
>>
>> The usual process is to delete the snapshot once its copied and than
>> repeat the process again when the next backup is required.
>>
>> That process does require rsync/deltacp to read the complete volume on
>> both systems which can take a long time.
>>
>> I was kicking around the idea to try and handle snapshot deltas better.
>>
>> The idea is that you could take your initial snapshot A then sync that
>> snapshot to your backup system.
>>
>> At a later point you could take another snapshot B.
>>
>> Because snapshots contain the copies of the original data at the time of
>> the snapshot and unmodified data points to the Live volume it is possible
>> to tell what blocks of data have changed since the snapshot was taken.
>>
>> Now that you have a second snapshot you can in essence perform a diff on
>> the A and B snapshots to get only the blocks that changed up to the time
>> that B was taken.
>>
>> These blocks could be copied to the backup image and you should have a
>> clone of the B snapshot.
>>
>> You would not have to read the whole volume image but just the changed
>> blocks dramatically improving the speed of the backup.
>>
>> At this point you can delete the A snapshot and promote the B snapshot to
>> be the A snapshot for the next backup round.
>>
>> On 03/23/2017 03:53 PM, Gandalf Corvotempesta wrote:
>>
>> Are the backups consistent?
>> What happens if the header on shard0 is synced referring to some data on
>> shard450, and by the time rsync parses shard450 that data has been changed
>> by subsequent writes?
>>
>> The header would be backed up out of sync with respect to the rest of the image
>>
>> On 23 Mar 2017 8:48 PM, "Joe Julian"  wrote:
>>
>>> The rsync protocol only passes blocks that have actually changed. Raw
>>> changes fewer bits. You're right, though, that it still has to check the
>>> entire file for those changes.
>>>
>>> On 03/23/17 12:47, Gandalf Corvotempesta wrote:
>>>
>>> Raw or qcow doesn't change anything about the backup.
>>> Georep always have to sync the whole file
>>>
>>> Additionally, raw images has much less features than qcow
>>>
>>> On 23 Mar 2017 8:40 PM, "Joe Julian"  wrote:
>>>
 I always use raw images. And yes, sharding would also be good.

 On 03/23/17 12:36, Gandalf Corvotempesta wrote:

 Georep expose to another problem:
 When using gluster as storage for VM, the VM file is saved as qcow.
 Changes are inside the qcow, thus rsync has to sync the whole file every
 time

 A little workaround would be sharding, as rsync has to sync only the
 changed shards, but I don't think this is a good solution

 On 23 Mar 2017 8:33 PM, "Joe Julian"  wrote:

> In many cases, a full backup set is just not feasible. Georep to the
> same or different DC may be an option if the bandwidth can keep up with 
> the
> change set. If not, maybe breaking the data up into smaller more 
> manageable
> volumes where you only keep a smaller set of critical data and just back
> that up. Perhaps an object store (swift?) might handle fault tolerance
> distribution better for some workloads.
>
> There's no one right answer.
>
> On 03/23

Re: [Gluster-users] Failed snapshot clone leaving undeletable orphaned volume on a single peer

2017-02-21 Thread Gambit15
Hi Avra,

On 21 February 2017 at 03:22, Avra Sengupta  wrote:

> Hi D,
>
> We tried reproducing the issue with a similar setup but were unable to do
> so. We are still investigating it.
>
> I have another follow-up question. You said that the repo exists only in
> s0? If that was the case, then bringing glusterd down on s0 only, deleteing
> the repo and starting glusterd once again would have removed it. The fact
> that the repo is restored as soon as glusterd restarts on s0, means that
> some other node(s) in the cluster also has that repo and is passing that
> information to the glusterd in s0 during handshake. Could you please
> confirm if any other node apart from s0 has the particular
> repo(/var/lib/glusterd/vols/data-teste) or not. Thanks.
>

I'll point out that this isn't a recurring issue. It's the first time this
has happened, and it's not happened since. If it wasn't for the orphaned
volume, I wouldn't even have requested support.

Huh, so, I've just rescanned all of the nodes, and the volume is now
appearing on all. That's very odd, as the volume was "created" on Weds 15th
& until the end of the 17th it was still only appearing on s0 (both in the
volume list & in the vols directory).
Grepping the etc-glusterfs-glusterd.vol logs, the first mention of the
volume after the failures I posted previously is the following...


[2017-02-17 15:46:17.199193] W [rpcsvc.c:265:rpcsvc_program_actor]
0-rpc-service: RPC program not available (req 1298437 330) for
10.123.123.102:49008
[2017-02-17 15:46:17.199216] E [rpcsvc.c:560:rpcsvc_check_and_reply_error]
0-rpcsvc: rpc actor failed to complete successfully
[2017-02-17 22:20:58.525036] I [MSGID: 106004]
[glusterd-handler.c:5219:__glusterd_peer_rpc_notify] 0-management: Peer
 (<978c228a-86f8-48dc-89c1-c63914eaa9a4>), in state , has disconnected from glusterd.
[2017-02-17 22:20:58.525128] W
[glusterd-locks.c:675:glusterd_mgmt_v3_unlock]
(-->/usr/lib64/glusterfs/3.8.8/xlator/mgmt/glusterd.so(+0x1deac)
[0x7f2a85517eac] -->/usr/lib64/glusterfs/3.8.8/xlator/
mgmt/glusterd.so(+0x27a58) [0x7f2a85521a58]
-->/usr/lib64/glusterfs/3.8.8/xlator/mgmt/glusterd.so(+0xd09da)
[0x7f2a855ca9da] ) 0-management: Lock for vol data not held
[2017-02-17 22:20:58.525144] W [MSGID: 106118]
[glusterd-handler.c:5241:__glusterd_peer_rpc_notify] 0-management: Lock not
released for data
[2017-02-17 22:20:58.525171] W
[glusterd-locks.c:675:glusterd_mgmt_v3_unlock]
(-->/usr/lib64/glusterfs/3.8.8/xlator/mgmt/glusterd.so(+0x1deac)
[0x7f2a85517eac] -->/usr/lib64/glusterfs/3.8.8/xlator/
mgmt/glusterd.so(+0x27a58) [0x7f2a85521a58]
-->/usr/lib64/glusterfs/3.8.8/xlator/mgmt/glusterd.so(+0xd09da)
[0x7f2a855ca9da] ) 0-management: Lock for vol data-novo not held
[2017-02-17 22:20:58.525182] W [MSGID: 106118]
[glusterd-handler.c:5241:__glusterd_peer_rpc_notify] 0-management: Lock not
released for data-novo
[2017-02-17 22:20:58.525205] W
[glusterd-locks.c:675:glusterd_mgmt_v3_unlock]
(-->/usr/lib64/glusterfs/3.8.8/xlator/mgmt/glusterd.so(+0x1deac)
[0x7f2a85517eac]
-->/usr/lib64/glusterfs/3.8.8/xlator/mgmt/glusterd.so(+0x27a58)
[0x7f2a85521a58]
-->/usr/lib64/glusterfs/3.8.8/xlator/mgmt/glusterd.so(+0xd09da)
[0x7f2a855ca9da] ) 0-management: Lock for vol data-teste not held
[2017-02-17 22:20:58.525235] W [MSGID: 106118]
[glusterd-handler.c:5241:__glusterd_peer_rpc_notify] 0-management: Lock not
released for data-teste
[2017-02-17 22:20:58.525261] W
[glusterd-locks.c:675:glusterd_mgmt_v3_unlock]
(-->/usr/lib64/glusterfs/3.8.8/xlator/mgmt/glusterd.so(+0x1deac)
[0x7f2a85517eac]
-->/usr/lib64/glusterfs/3.8.8/xlator/mgmt/glusterd.so(+0x27a58)
[0x7f2a85521a58]
-->/usr/lib64/glusterfs/3.8.8/xlator/mgmt/glusterd.so(+0xd09da)
[0x7f2a855ca9da] ) 0-management: Lock for vol data-teste2 not held
[2017-02-17 22:20:58.525272] W [MSGID: 106118]
[glusterd-handler.c:5241:__glusterd_peer_rpc_notify] 0-management: Lock not
released for data-teste2


That's 58 hours between the volume's failed creation & its first sign of
life...??

At the time when it was only appearing on s0, I tried stopping glusterd on
multiple occasions & deleting the volume's directory within vols, but it
always returned as soon as I restarted glusterd.
I did this with the help of Joe on IRC at the time, and he was also stumped
(he suggested that the data was possibly still being held in memory
somewhere), so I'm quite sure this wasn't simply an oversight on my part.

Anyway, many thanks for the help, and I'd be happy to provide any logs if
desired, however whilst knowing what happened & why might be useful, all
now seems to have resolved itself.

Cheers,
 Doug


>
> Regards,
> Avra
>
>
> On 02/20/2017 06:51 PM, Gambit15 wrote:
>
> Hi Avra,
>
> On 20 February 2017 at 02:51, Avra Sengupta  wrote:
>
>> Hi D,
>>
>> It seems y

Re: [Gluster-users] Failed snapshot clone leaving undeletable orphaned volume on a single peer

2017-02-20 Thread Gambit15
Hi Avra,

On 20 February 2017 at 02:51, Avra Sengupta  wrote:

> Hi D,
>
> It seems you tried to take a clone of a snapshot, when that snapshot was
> not activated.
>

Correct. As per my commands, I then noticed the issue, checked the
snapshot's status & activated it. I included this in my command history
just to clear up any doubts from the logs.

However in this scenario, the cloned volume should not be in an
> inconsistent state. I will try to reproduce this and see if it's a bug.
> Meanwhile could you please answer the following queries:
> 1. How many nodes were in the cluster.
>

There are 4 nodes in a (2+1)x2 setup.
s0 replicates to s1, with an arbiter on s2, and s2 replicates to s3, with
an arbiter on s0.

2. How many bricks does the snapshot data-bck_GMT-2017.02.09-14.15.43 have?
>

6 bricks, including the 2 arbiters.


> 3. Was the snapshot clone command issued from a node which did not have
> any bricks for the snapshot data-bck_GMT-2017.02.09-14.15.43
>

All commands were issued from s0. All volumes have bricks on every node in
the cluster.


> 4. I see you tried to delete the new cloned volume. Did the new cloned
> volume land in this state after failure to create the clone or failure to
> delete the clone
>

I noticed there was something wrong as soon as I created the clone. The
clone command completed, however I was then unable to do anything with it
because the clone didn't exist on s1-s3.


>
> If you want to remove the half baked volume from the cluster please
> proceed with the following steps.
> 1. bring down glusterd on all nodes by running the following command on
> all nodes
> $ systemctl stop glusterd.
> Verify that the glusterd is down on all nodes by running the following
> command on all nodes
> $ systemctl status glusterd.
> 2. delete the following repo from all the nodes (whichever nodes it exists)
> /var/lib/glusterd/vols/data-teste
>

The repo only exists on s0, but stopping glusterd on only s0 & deleting
the directory didn't work, the directory was restored as soon as glusterd
was restarted. I haven't yet tried stopping glusterd on *all* nodes before
doing this, although I'll need to plan for that, as it'll take the entire
cluster off the air.
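
For when I do get the window, my plan is essentially your steps run across
the whole cluster (volume name as above):

# on every node, during the maintenance window
systemctl stop glusterd
systemctl status glusterd            # confirm it's really down everywhere
# only on the node(s) where the stale repo exists
rm -rf /var/lib/glusterd/vols/data-teste
# then, again on every node
systemctl start glusterd
gluster volume list                  # data-teste should no longer be listed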

Thanks for the reply,
 Doug


> Regards,
> Avra
>
>
> On 02/16/2017 08:01 PM, Gambit15 wrote:
>
> Hey guys,
>  I tried to create a new volume from a cloned snapshot yesterday, however
> something went wrong during the process & I'm now stuck with the new volume
> being created on the server I ran the commands on (s0), but not on the rest
> of the peers. I'm unable to delete this new volume from the server, as it
> doesn't exist on the peers.
>
> What do I do?
> Any insights into what may have gone wrong?
>
> CentOS 7.3.1611
> Gluster 3.8.8
>
> The command history & extract from etc-glusterfs-glusterd.vol.log are
> included below.
>
> gluster volume list
> gluster snapshot list
> gluster snapshot clone data-teste data-bck_GMT-2017.02.09-14.15.43
> gluster volume status data-teste
> gluster volume delete data-teste
> gluster snapshot create teste data
> gluster snapshot clone data-teste teste_GMT-2017.02.15-12.44.04
> gluster snapshot status
> gluster snapshot activate teste_GMT-2017.02.15-12.44.04
> gluster snapshot clone data-teste teste_GMT-2017.02.15-12.44.04
>
>
> [2017-02-15 12:43:21.667403] I [MSGID: 106499] 
> [glusterd-handler.c:4349:__glusterd_handle_status_volume]
> 0-management: Received status volume req for volume data-teste
> [2017-02-15 12:43:21.682530] E [MSGID: 106301] 
> [glusterd-syncop.c:1297:gd_stage_op_phase]
> 0-management: Staging of operation 'Volume Status' failed on localhost :
> Volume data-teste is not started
> [2017-02-15 12:43:43.633031] I [MSGID: 106495] 
> [glusterd-handler.c:3128:__glusterd_handle_getwd]
> 0-glusterd: Received getwd req
> [2017-02-15 12:43:43.640597] I [run.c:191:runner_log]
> (-->/usr/lib64/glusterfs/3.8.8/xlator/mgmt/glusterd.so(+0xcc4b2)
> [0x7ffb396a14b2] 
> -->/usr/lib64/glusterfs/3.8.8/xlator/mgmt/glusterd.so(+0xcbf65)
> [0x7ffb396a0f65] -->/lib64/libglusterfs.so.0(runner_log+0x115)
> [0x7ffb44ec31c5] ) 0-management: Ran script: /var/lib/glusterd/hooks/1/
> delete/post/S57glusterfind-delete-post --volname=data-teste
> [2017-02-15 13:05:20.103423] E [MSGID: 106122] [glusterd-snapshot.c:2397:
> glusterd_snapshot_clone_prevalidate] 0-management: Failed to pre validate
> [2017-02-15 13:05:20.103464] E [MSGID: 106443] [glusterd-snapshot.c:2413:
> glusterd_snapshot_clone_prevalidate] 0-management: One or more bricks are
> not running. Please run snapshot status command to see brick status.
> Please start the stopped brick and then issue snaps

Re: [Gluster-users] 90 Brick/Server suggestions?

2017-02-17 Thread Gambit15
>
> RAID is not an option, JBOD with EC will be used.
>

Any particular reason for this, other than maximising space by avoiding two
layers of RAID/redundancy?
Local RAID would be far simpler & quicker for replacing failed drives, and
it would greatly reduce the number of bricks & load on Gluster.

We use RAID volumes for our bricks, and the benefits of simplified
management far outweigh the costs of a little lost capacity.

D
___
Gluster-users mailing list
Gluster-users@gluster.org
http://lists.gluster.org/mailman/listinfo/gluster-users

[Gluster-users] Failed snapshot clone leaving undeletable orphaned volume on a single peer

2017-02-16 Thread Gambit15
Hey guys,
 I tried to create a new volume from a cloned snapshot yesterday, however
something went wrong during the process & I'm now stuck with the new volume
being created on the server I ran the commands on (s0), but not on the rest
of the peers. I'm unable to delete this new volume from the server, as it
doesn't exist on the peers.

What do I do?
Any insights into what may have gone wrong?

CentOS 7.3.1611
Gluster 3.8.8

The command history & extract from etc-glusterfs-glusterd.vol.log are
included below.

gluster volume list
gluster snapshot list
gluster snapshot clone data-teste data-bck_GMT-2017.02.09-14.15.43
gluster volume status data-teste
gluster volume delete data-teste
gluster snapshot create teste data
gluster snapshot clone data-teste teste_GMT-2017.02.15-12.44.04
gluster snapshot status
gluster snapshot activate teste_GMT-2017.02.15-12.44.04
gluster snapshot clone data-teste teste_GMT-2017.02.15-12.44.04


[2017-02-15 12:43:21.667403] I [MSGID: 106499]
[glusterd-handler.c:4349:__glusterd_handle_status_volume] 0-management:
Received status volume req for volume data-teste
[2017-02-15 12:43:21.682530] E [MSGID: 106301]
[glusterd-syncop.c:1297:gd_stage_op_phase] 0-management: Staging of
operation 'Volume Status' failed on localhost : Volume data-teste is not
started
[2017-02-15 12:43:43.633031] I [MSGID: 106495]
[glusterd-handler.c:3128:__glusterd_handle_getwd] 0-glusterd: Received
getwd req
[2017-02-15 12:43:43.640597] I [run.c:191:runner_log]
(-->/usr/lib64/glusterfs/3.8.8/xlator/mgmt/glusterd.so(+0xcc4b2)
[0x7ffb396a14b2]
-->/usr/lib64/glusterfs/3.8.8/xlator/mgmt/glusterd.so(+0xcbf65)
[0x7ffb396a0f65] -->/lib64/libglusterfs.so.0(runner_log+0x115)
[0x7ffb44ec31c5] ) 0-management: Ran script:
/var/lib/glusterd/hooks/1/delete/post/S57glusterfind-delete-post
--volname=data-teste
[2017-02-15 13:05:20.103423] E [MSGID: 106122]
[glusterd-snapshot.c:2397:glusterd_snapshot_clone_prevalidate]
0-management: Failed to pre validate
[2017-02-15 13:05:20.103464] E [MSGID: 106443]
[glusterd-snapshot.c:2413:glusterd_snapshot_clone_prevalidate]
0-management: One or more bricks are not running. Please run snapshot
status command to see brick status.
Please start the stopped brick and then issue snapshot clone command
[2017-02-15 13:05:20.103481] W [MSGID: 106443]
[glusterd-snapshot.c:8563:glusterd_snapshot_prevalidate] 0-management:
Snapshot clone pre-validation failed
[2017-02-15 13:05:20.103492] W [MSGID: 106122]
[glusterd-mgmt.c:167:gd_mgmt_v3_pre_validate_fn] 0-management: Snapshot
Prevalidate Failed
[2017-02-15 13:05:20.103503] E [MSGID: 106122]
[glusterd-mgmt.c:884:glusterd_mgmt_v3_pre_validate] 0-management: Pre
Validation failed for operation Snapshot on local node
[2017-02-15 13:05:20.103514] E [MSGID: 106122]
[glusterd-mgmt.c:2243:glusterd_mgmt_v3_initiate_snap_phases] 0-management:
Pre Validation Failed
[2017-02-15 13:05:20.103531] E [MSGID: 106027]
[glusterd-snapshot.c:8118:glusterd_snapshot_clone_postvalidate]
0-management: unable to find clone data-teste volinfo
[2017-02-15 13:05:20.103542] W [MSGID: 106444]
[glusterd-snapshot.c:9063:glusterd_snapshot_postvalidate] 0-management:
Snapshot create post-validation failed
[2017-02-15 13:05:20.103561] W [MSGID: 106121]
[glusterd-mgmt.c:351:gd_mgmt_v3_post_validate_fn] 0-management:
postvalidate operation failed
[2017-02-15 13:05:20.103572] E [MSGID: 106121]
[glusterd-mgmt.c:1660:glusterd_mgmt_v3_post_validate] 0-management: Post
Validation failed for operation Snapshot on local node
[2017-02-15 13:05:20.103582] E [MSGID: 106122]
[glusterd-mgmt.c:2363:glusterd_mgmt_v3_initiate_snap_phases] 0-management:
Post Validation Failed
[2017-02-15 13:11:15.862858] W [MSGID: 106057]
[glusterd-snapshot-utils.c:410:glusterd_snap_volinfo_find] 0-management:
Snap volume
c3ceae3889484e96ab8bed69593cf6d3.s0.run-gluster-snaps-c3ceae3889484e96ab8bed69593cf6d3-brick1-data-brick
not found [Invalid argument]
[2017-02-15 13:11:16.314759] I [MSGID: 106143]
[glusterd-pmap.c:250:pmap_registry_bind] 0-pmap: adding brick
/run/gluster/snaps/c3ceae3889484e96ab8bed69593cf6d3/brick1/data/brick on
port 49452
[2017-02-15 13:11:16.316090] I [rpc-clnt.c:1046:rpc_clnt_connection_init]
0-management: setting frame-timeout to 600
[2017-02-15 13:11:16.348867] W [MSGID: 106057]
[glusterd-snapshot-utils.c:410:glusterd_snap_volinfo_find] 0-management:
Snap volume
c3ceae3889484e96ab8bed69593cf6d3.s0.run-gluster-snaps-c3ceae3889484e96ab8bed69593cf6d3-brick6-data-arbiter
not found [Invalid argument]
[2017-02-15 13:11:16.558878] I [MSGID: 106143]
[glusterd-pmap.c:250:pmap_registry_bind] 0-pmap: adding brick
/run/gluster/snaps/c3ceae3889484e96ab8bed69593cf6d3/brick6/data/arbiter on
port 49453
[2017-02-15 13:11:16.559883] I [rpc-clnt.c:1046:rpc_clnt_connection_init]
0-management: setting frame-timeout to 600
[2017-02-15 13:11:23.279721] E [MSGID: 106030]
[glusterd-snapshot.c:4736:glusterd_take_lvm_snapshot] 0-management: taking
snapshot of the brick
(/run/gluster/snaps/c3ceae3889484e96ab8bed69593c

[Gluster-users] Optimal shard size & self-heal algorithm for VM hosting?

2017-02-15 Thread Gambit15
Hey guys,
 I keep seeing different recommendations for the best shard sizes for VM
images, from 64MB to 512MB.

What's the benefit of smaller v larger shards?
I'm guessing smaller shards are quicker to heal, but larger shards will
provide better sequential I/O for single clients? Anything else?

I also usually see "cluster.data-self-heal-algorithm: full" is generally
recommended in these cases. Why not "diff"? Is it simply to reduce CPU load
when there's plenty of excess network capacity?

Thanks in advance,
Doug
___
Gluster-users mailing list
Gluster-users@gluster.org
http://lists.gluster.org/mailman/listinfo/gluster-users

Re: [Gluster-users] gluster quorum settings

2017-02-09 Thread Gambit15
Hi Bap,

On 6 February 2017 at 07:27, pasawwa  wrote:

> Hello,
>
> we just created 3 node gluster ( replica 3 arbiter 1 ) and get "systemctl
> status glusterd" message:
>
> n1.test.net etc-glusterfs-glusterd.vol[1458]: [2017-02-03
> 17:56:24.691334] C [MSGID: 106003] [glusterd-server-quorum.c:341:
> glusterd_do_volume_quorum_action] 0-management: Server quorum regained
> for volume TESTp1. Starting local bricks.
>
> How can we set up gluster quorum params to eliminate this warning, and to
> avoid split-brain while remaining writeable if any single node goes down?
>
> current settings:
>
> server.event-threads: 8
> client.event-threads: 8
> performance.io-thread-count: 20
> performance.readdir-ahead: on
> performance.quick-read: off
> performance.read-ahead: off
> performance.io-cache: off
> performance.stat-prefetch: off
> cluster.eager-lock: enable
> network.remote-dio: enable
> cluster.quorum-type: auto  # we are not sure this is 100%
> successful against split-brain ( when updating nodes, e.g. )
> cluster.server-quorum-type: server  # it looks to be OK
> features.shard: on
> cluster.data-self-heal-algorithm: diff
> storage.owner-uid: 36
> storage.owner-gid: 36
> server.allow-insecure: on
> network.ping-timeout: 10
>

For a rep 3 setup, those default quorum configurations should allow you to
maintain writes & avoid split-brain should any single node fail.
To automate the healing process, I'd also add these to the list (example
commands below):

cluster.entry-self-heal: on
cluster.metadata-self-heal: on
cluster.data-self-heal: on
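
These are ordinary volume options, so a minimal sketch of setting them
(assuming the volume is TESTp1, as in your log) would be:

gluster volume set TESTp1 cluster.entry-self-heal on
gluster volume set TESTp1 cluster.metadata-self-heal on
gluster volume set TESTp1 cluster.data-self-heal on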



>
> https://gluster.readthedocs.io/en/latest/Administrator%
> 20Guide/arbiter-volumes-and-quorum/
>
> regrads
> Bap.
>
> ___
> Gluster-users mailing list
> Gluster-users@gluster.org
> http://lists.gluster.org/mailman/listinfo/gluster-users
>
___
Gluster-users mailing list
Gluster-users@gluster.org
http://lists.gluster.org/mailman/listinfo/gluster-users

Re: [Gluster-users] Some questions about GlusterFS and Prod Environment

2017-02-09 Thread Gambit15
Hi Riccardo,

On 3 February 2017 at 07:06, Riccardo Filippone  wrote:

> Good morning guys,
>
> we are going to deploy a new production infrastructure.
>
> in order to share some folders through our app servers (Tomcat 8), I want
> to create a GlusterFS storage area.
>
> I made some tests in localhost (using some VMs), but at the moment only
> with Apache2.
>
> My configuration is a mirroring replica (1:1).
>
> *My questions are:*
>
> 1) Considering 2 Frontends and 2 Gluster FS Nodes, the best practice is to
> write a different fstab entry for each node?
>

I don't see why you'd need that. The only differences you might have would
be if you're mounting different volumes on different servers.
You could homogenise that even further by simply using *localhost* instead
of the server's FQDN.


> Frontend1 (mount the first gluster node, and the second one as backup
> volume):
> gluster1.droplet.com:/app_volume /var/www/html glusterfs
> defaults,_netdev,backupvolfile-server=gluster2.droplet.com 0 0
>
> Frontend2 /etc/fstab (mount the second gluster node, and the first one as
> backup volume):
> gluster2.droplet.com:/app_volume /var/www/html glusterfs
> defaults,_netdev,backupvolfile-server=gluster1.droplet.com 0 0
>

On the more recent versions (3.7+ at least), if you're mounting through
fuse you no longer need to define backup servers, gluster deals with that
internally. One mount option I would add would be "relatime".
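
Putting those two points together, the fstab line from your message would end
up looking something like this (just a sketch, I haven't tested this exact
line):

gluster1.droplet.com:/app_volume /var/www/html glusterfs defaults,_netdev,relatime 0 0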


> 2) I want to back up the volume files every day. I can run a zip command
> directly from the gluster node, or do I need to mount it on the backup server
> and then run the command? Is there any other good solution to store a backup?
>

I don't see that it'd make much difference. If it's lots of small files though,
there could be a performance hit. If it did become an issue, it might be
more efficient to export snapshots (assuming your bricks are backed by LVM;
ZFS would make it even easier).
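
As a rough sketch of the snapshot route (this assumes the bricks sit on
thinly-provisioned LVM, which gluster snapshots require, and uses your volume
name app_volume):

gluster snapshot create nightly app_volume no-timestamp
gluster snapshot activate nightly
# the snapshot's bricks then appear under /run/gluster/snaps/ and can be
# backed up or rsynced from there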



> 3) Can I share WebApps folder between Tomcat servers? Is there any known
> issue? (for example putting a .war into the WebApps folder, I think I'll
> generate some errors due to tomcat war deploying? Anyone have experiences
> with tomcat and glusterfs shared folders?)
>
> 4) Can I use a load balancer software to mount GlusterFS volumes? If YES,
> is there any benefits?
>

 Again, no need. The replica translator does that for you.



> Thank you in advance for your answers,
>
> best regards, have a nice day,
>
> Riccardo
>
> ___
> Gluster-users mailing list
> Gluster-users@gluster.org
> http://lists.gluster.org/mailman/listinfo/gluster-users
>

Doug
___
Gluster-users mailing list
Gluster-users@gluster.org
http://lists.gluster.org/mailman/listinfo/gluster-users

[Gluster-users] Can't start cloned volume: "Volume id mismatch"

2017-02-08 Thread Gambit15
Hey guys,
 Any ideas?

[root@v0 ~]# gluster volume start data2
volume start: data2: failed: Volume id mismatch for brick
s0:/run/gluster/snaps/data2/brick1/data/brick. Expected volume id
d8b0a411-70d9-454d-b5fb-7d7ca424adf2, volume id
a7eae608-f1c4-44fd-a6aa-5b9c19e13565
found

[root@v0 ~]# gluster volume info data2 | grep Volume\ ID
Volume ID: d8b0a411-70d9-454d-b5fb-7d7ca424adf2

[root@v0 ~]# gluster snapshot info data-bck_GMT-2017.02.07-14.30.28
Snapshot  : data-bck_GMT-2017.02.07-14.30.28
Snap UUID : 911ef04b-b922-4611-91a1-9abc29fd2360
Created   : 2017-02-07 14:30:28
Snap Volumes:

Snap Volume Name  : a7eae608f1c444fda6aa5b9c19e13565
Origin Volume name: data
Snaps taken for data  : 1
Snaps available for data  : 255
Status: Started


Cheers,
___
Gluster-users mailing list
Gluster-users@gluster.org
http://lists.gluster.org/mailman/listinfo/gluster-users

Re: [Gluster-users] Distributed volumes

2017-02-08 Thread Gambit15
Distributed is the default, and a replicated volume also distributes as long
as the number of bricks is a multiple of the replica count and at least twice
it (i.e. more than one replica set).

For example...

gluster volume create gv0 server1:/data/brick1/gv0 server2:/data/brick1/gv0
server3:/data/brick1/gv0
...distributes files across all three servers/bricks.

gluster volume create gv0 replica 3 server1:/data/brick1/gv0
server2:/data/brick1/gv0 server3:/data/brick1/gv0
...creates a volume across the servers/bricks with 3 replicas. As there are
only 3 servers/bricks, there is nothing to distribute to.

gluster volume create gv0 replica 2 server1:/data/brick1/gv0
server2:/data/brick1/gv0 server3:/data/brick1/gv0 server4:/data/brick1/gv0
...groups the bricks in the order they're listed, so server1 & server2 form
one replica pair and server3 & server4 form the other, with files distributed
across the two pairs. (This is purely an example though, as replica 2 is
vulnerable to split-brain).
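
If you did want server1/server3 and server2/server4 to be the pairs, you'd
simply list the bricks in that order. Either way, you can confirm how the
bricks were grouped afterwards with:

gluster volume info gv0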

D

On 8 February 2017 at 06:12, Dave Fan  wrote:

> Hi,
>
> I'm new to Gluster so a very basic question. Are all volumes distributed
> by default? Is there a switch to turn this feature on/off?
>
> I ask this because in an intro to Gluster I saw "Replicated Volume" and
> "Distributed Replicated Volume". Is the first type, "Replicated Volume",
> not distributed?
>
> Many thanks,
> Dave
>
> ___
> Gluster-users mailing list
> Gluster-users@gluster.org
> http://lists.gluster.org/mailman/listinfo/gluster-users
>
___
Gluster-users mailing list
Gluster-users@gluster.org
http://lists.gluster.org/mailman/listinfo/gluster-users

Re: [Gluster-users] Swap space requirements

2017-02-07 Thread Gambit15
Gluster doesn't "require" swap any more than any other service, and with
the price of RAM today, many admins could even consider removing swap
altogether.
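
If you do keep a swap partition, the usual compromise is just to make the
kernel reluctant to use it, e.g.:

sysctl vm.swappiness=10
# plus the matching "vm.swappiness = 10" line in /etc/sysctl.conf to persist it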

D

On 7 February 2017 at 10:56, Mark Connor  wrote:

> I am planning in deploying about 18 bricks of about 50 TB bricks each
> spanning 8-10 servers. My servers are high end servers with 128gb each. I
> have searched and cannot find any detail on swap partition requirements for
> the latest gluster server.  Can anyone offer me some advice?
>
>
> ___
> Gluster-users mailing list
> Gluster-users@gluster.org
> http://lists.gluster.org/mailman/listinfo/gluster-users
>
___
Gluster-users mailing list
Gluster-users@gluster.org
http://lists.gluster.org/mailman/listinfo/gluster-users

Re: [Gluster-users] Quick performance check?

2017-02-03 Thread Gambit15
On 3 February 2017 at 11:09, Momonth  wrote:

> Hi,
>
> I ran some benchmarking on SSD enabled servers, 10Gb connected, see
> the file attached.
>
> I'm still looking at GlusterFS as a persistent storage for containers,
> and it's clear it's not going to compete with local file system
> performance.
>

Well that's kind of a given; with the standard rep 3 you're effectively doing
a three-way mirror (RAID 1 style) across the network. However, depending on
your use case & setup, you *can* get performance boosts akin to RAID 10
setups, multiplied by the number of nodes/bricks in the cluster.

http://blog.gluster.org/category/performance/
https://s3.amazonaws.com/aws001/guided_trek/Performance_in_a_Gluster_Systemv6F.pdf

I couldn't find the particular doc, but I've seen some ludicrous
throughputs from configs using multiple nodes running SSDs in RAID 10 and
peering over InfiniBand.

D
___
Gluster-users mailing list
Gluster-users@gluster.org
http://lists.gluster.org/mailman/listinfo/gluster-users

Re: [Gluster-users] Quick performance check?

2017-02-03 Thread Gambit15
Hi Alex,
 I don't use Gluster for storing large amounts of small files, however from
what I've read, that does appear to be its big Achilles heel.
Personally, if you're not looking to scale out to a lot more servers, I'd
go with Ceph or DRBD. Gluster's best features are in its scalability.
Also, it's worth pointing out that in any setup, you've got to be careful
with 2 node configurations as they're highly vulnerable to split-brain
scenarios.

Given the relatively small size of your data, caching tweaks & an arbiter
may well save you here, however I don't use enough of its caching features
to be able to give advice on it.

D

On 3 February 2017 at 08:28, Alex Sudakar  wrote:

> Hi.  I'm looking for a clustered filesystem for a very simple
> scenario.  I've set up Gluster but my tests have shown quite a
> performance penalty when compared to using a local XFS filesystem.
> This no doubt reflects the reality of moving to a proper distributed
> filesystem, but I'd like to quickly check that I haven't missed
> something obvious that might improve performance.
>
> I plan to have two Amazon AWS EC2 instances (virtual machines) both
> accessing the same filesystem for read/writes.  Access will be almost
> entirely reads, with the occasional modification, deletion or creation
> of files.  Ideally I wanted all those reads going straight to the
> local XFS filesystem and just the writes incurring a distributed
> performance penalty.  :-)
>
> So I've set up two VMs with Centos 7.2 and Gluster 3.8.8, each machine
> running as a combined Gluster server and client.  One brick on each
> machine, one volume in a 1 x 2 replica configuration.
>
> Everything works, it's just the performance penalty which is a surprise.
> :-)
>
> My test directory has 9,066 files and directories; 7,987 actual files.
> Total size is 63MB data, 85MB allocated; an average size of 8KB data
> per file.  The brick's files have a total of 117MB allocated, with the
> extra 32MB working out pretty much to be exactly the sum of the extra
> 4KB extents that would have been allocated for the XFS attributes per
> file - the VMs were installed with the default 256 byte inode size for
> the local filesystem, and from what I've read Gluster will force the
> filesystem to allocate an extent for its attributes.  'xfs_bmap' on a
> few files shows this is the case.
>
> A simple 'cat' of every file when laid out in 'native' directories on
> the XFS filesystem takes about 3 seconds.  A cat of all the files in
> the brick's directory on the same filesystem takes about 6.4 seconds,
> which I figure is due to the extra I/O for the inode metadata extents
> (although not quite certain; the additional extents added about 40%
> extra to the disk block allocation, so I'm unsure as to why the time
> increase was 100%).
>
> Doing the same test through the glusterfs mount takes about 25
> seconds; roughly four times longer than reading those same files
> directly from the brick itself.
>
> It took 30 seconds until I applied the 'md-cache' settings (for those
> variables that still exist in 3.8.8) mentioned in this very helpful
> article:
>
>   http://blog.gluster.org/category/performance/
>
> So use of the md-cache in a 'cold run' shaved off 5 seconds - due to
> common directory LOOKUP operations being cached I guess.
>
> Output of a 'volume info' is as follows:
>
> Volume Name: g1
> Type: Replicate
> Volume ID: bac6cd70-ca0d-4173-9122-644051444fe5
> Status: Started
> Snapshot Count: 0
> Number of Bricks: 1 x 2 = 2
> Transport-type: tcp
> Bricks:
> Brick1: serverA:/data/brick1
> Brick2: serverC:/data/brick1
> Options Reconfigured:
> transport.address-family: inet
> performance.readdir-ahead: on
> nfs.disable: on
> cluster.self-heal-daemon: enable
> features.cache-invalidation: on
> features.cache-invalidation-timeout: 600
> performance.stat-prefetch: on
> performance.md-cache-timeout: 60
> network.inode-lru-limit: 9
>
> The article suggests a value of 600 for
> features.cache-invalidation-timeout but my Gluster version only
> permits a maximum value of 60.
>
> Network speed between the two VMs is about 120 MBytes/sec - the two
> VMs inhabit the same Amazon Virtual Private Cloud - so I don't think
> bandwidth is a factor.
>
> The 400% slowdown is no doubt the penalty incurred in moving to a
> proper distributed filesystem.  That article and other web pages I've
> read all say that each open of a file results in synchronous LOOKUP
> operations on all the replicas, so I'm guessing it just takes that
> much time for everything to happen before a file can be opened.
> Gluster profiling shows that there are 11,198 LOOKUP operations on the
> test cat of the 7,987 files.
>
> As a Gluster newbie I'd appreciate some quick advice if possible -
>
> 1.  Is this sort of performance hit - on directories of small files -
> typical for such a simple Gluster configuration?
>
> 2.  Is there anything I can do to speed things up?  :-)
>
> 3.  Repeating the 'cat' test immediately after the first

Re: [Gluster-users] Location of the gluster client log with libgfapi?

2017-01-27 Thread Gambit15
On 27 January 2017 at 19:05, Kevin Lemonnier  wrote:

> > Basically, every now & then I notice random VHD images popping up in the
> > heal queue, and they're almost always in pairs, "healing" the same file
> on
> > 2 of the 3 replicate bricks.
> > That already strikes me as odd, as if a file is "dirty" on more than one
> > brick, surely that's a split-brain scenario? (nothing logged in "info
> > split-brain" though)
>
> I don't think that's a problem, they do tend to show the heal on every
> brick
> but the one being healed .. I think the sources show the file to heal, not
> the
> dirty one.
> At least that's what I noticed on my clusters.
>
> >
> > Anyway, these heal processes always hang around for a couple of hours,
> even
> > when it's just metadata on an arbiter brick.
> > That doesn't make sense to me, an arbiter shouldn't take more than a
> couple
> > of seconds to heal!?
>
> Sorry, no idea on that, I never used arbiter setups.
>

If it's actually showing the source files that are being healed *from*, not
*to*, that'd make sense. Although it's a counter-intuitive way of
displaying things & is completely contrary to all of the documentation (as
described by readthedocs.gluster.io, Red Hat & Rackspace).



> >
> > I spoke with Joe on IRC, and he suggested I'd find more info in the
> > client's logs...
>
> Well it'd be good to know why they need healing, for sure.
> I don't know of any way to get that on the gluster side, you need to
> find a way on oVirt to redirect the output of the qemu process somewhere.
> That's where you'll find the libgfapi logs.
> Never used oVirt so I can't really help on that :/
>

Well you've given me somewhere to start from at least.

Appreciated!

D
___
Gluster-users mailing list
Gluster-users@gluster.org
http://lists.gluster.org/mailman/listinfo/gluster-users

[Gluster-users] Location of the gluster client log with libgfapi?

2017-01-27 Thread Gambit15
Hey guys,
 Would anyone be able to tell me the name/location of the gluster client
log when mounting through libgfapi?

Cheers,
 D
___
Gluster-users mailing list
Gluster-users@gluster.org
http://lists.gluster.org/mailman/listinfo/gluster-users

Re: [Gluster-users] Cluster not healing

2017-01-23 Thread Gambit15
Have you verified that Gluster has marked the files as split-brain?

gluster volume heal <VOLNAME> info split-brain

If you're fairly confident about which files are correct, you can automate
the split-brain healing procedure.

From the manual...

> volume heal <VOLNAME> split-brain bigger-file <FILE>
>   Performs healing of <FILE> which is in split-brain by
> choosing the bigger file in the replica as source.
>
> volume heal <VOLNAME> split-brain source-brick <HOSTNAME:BRICKNAME>
>   Selects <HOSTNAME:BRICKNAME> as the source for all the files
> that are in split-brain in that replica and heals them.
>
> volume heal <VOLNAME> split-brain source-brick
> <HOSTNAME:BRICKNAME> <FILE>
>   Selects the split-brained <FILE> present in
> <HOSTNAME:BRICKNAME> as source and completes heal.
>
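
For example, if you decide the copy on your first brick is the good one, the
command looks roughly like this (volume & brick names taken from your output;
the file path is a placeholder for the path as seen from the mount point):

gluster volume heal storage01 split-brain source-brick \
    storage02:/storage/sdc/brick_storage01 \
    /path/to/file-as-seen-from-the-mount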

D

On 23 January 2017 at 16:28, James Wilkins  wrote:

> Hello,
>
> I have a couple of gluster clusters - setup with distributed/replicated
> volumes that have starting incrementing the heal-count from statistics -
> and for some files returning input/output error when attempting to access
> said files from a fuse mount.
>
> If i take one volume, from one cluster as an example:
>
> gluster volume heal storage01 statistics info
> 
> Brick storage02.:/storage/sdc/brick_storage01
> Number of entries: 595
> 
>
> And then proceed to look at one of these files (have found 2 copies - one
> on each server / brick)
>
> First brick:
>
> # getfattr -m . -d -e hex  /storage/sdc/brick_storage01/
> projects/183-57c559ea4d60e-canary-test--node02/wordpress285-data/html/wp-
> content/themes/twentyfourteen/single.php
> getfattr: Removing leading '/' from absolute path names
> # file: storage/sdc/brick_storage01/projects/183-57c559ea4d60e-
> canary-test--node02/wordpress285-data/html/wp-
> content/themes/twentyfourteen/single.php
> security.selinux=0x73797374656d5f753a6f626a6563
> 745f723a676c7573746572645f627269636b5f743a733000
> trusted.afr.dirty=0x
> trusted.afr.storage01-client-0=0x00020001
> trusted.bit-rot.version=0x02005874e2cd459d
> trusted.gfid=0xda4253be1c2647b7b6ec5c045d61d216
> trusted.glusterfs.quota.c9764826-596a-4886-9bc0-60ee9b3fce44.contri.1=
> 0x0601
> trusted.pgfid.c9764826-596a-4886-9bc0-60ee9b3fce44=0x0001
>
> Second Brick:
>
> # getfattr -m . -d -e hex /storage/sdc/brick_storage01/
> projects/183-57c559ea4d60e-canary-test--node02/wordpress285-data/html/wp-
> content/themes/twentyfourteen/single.php
> getfattr: Removing leading '/' from absolute path names
> # file: storage/sdc/brick_storage01/projects/183-57c559ea4d60e-
> canary-test--node02/wordpress285-data/html/wp-
> content/themes/twentyfourteen/single.php
> security.selinux=0x73797374656d5f753a6f626a6563
> 745f723a676c7573746572645f627269636b5f743a733000
> trusted.afr.dirty=0x
> trusted.bit-rot.version=0x020057868423000d6332
> trusted.gfid=0x14f74b04679345289dbd3290a3665cbc
> trusted.glusterfs.quota.47e007ee-6f91-4187-81f8-90a393deba2b.contri.1=
> 0x0601
> trusted.pgfid.47e007ee-6f91-4187-81f8-90a393deba2b=0x0001
>
>
>
> I can see that only the first brick has the appropriate trusted.afr.
> tag - e.g. in this case
>
> trusted.afr.storage01-client-0=0x00020001
>
> Files are same size under stat - just the access/modify/change dates are
> different.
>
> My first question is - reading https://gluster.readthedocs.io/en/latest/
> Troubleshooting/split-brain/ this suggests that I should have this field
> on both copies of the files - or am I mis-reading?
>
> Secondly - am I correct that each one of these entries will require manual
> fixing?  (I have approx 6K files/directories in this state over two
> clusters - which appears like an awful lot of manual fixing)
>
> I've checked gluster volume info  and all appropriate
> services/self-heal daemon are running.  We've even tried a full heal/heal
> and iterating over parts of the filesystem in question with find / stat /
> md5sum.
>
> Any input appreciated.
>
> Cheers,
>
>
>
>
>
> ___
> Gluster-users mailing list
> Gluster-users@gluster.org
> http://lists.gluster.org/mailman/listinfo/gluster-users
>
___
Gluster-users mailing list
Gluster-users@gluster.org
http://lists.gluster.org/mailman/listinfo/gluster-users

Re: [Gluster-users] High-availability with KVM?

2017-01-20 Thread Gambit15
On 20 January 2017 at 19:26, Lindsay Mathieson 
wrote:

> This I think, highlights one of glusters few weaknesses - the
> inflexibility of brick layout. It would be really nice if you could just
> arbitrarily add bricks to distributed-replicate volumes and have files be
> evenly distributed among them as a whole. This would work particularly well
> with sharded volumes. Unfortunately I suspect this would need some sort of
> meta server.
>

Setups with arbiters especially. I know it's a relatively new feature
still, however all of the tools for managing & manipulating dist-rep
layouts are geared for standard bricks. It's very easy to accidentally
convert your arbiters into full rep 3 bricks.
Luckily, in most situations where you want to grow your cluster from a basic
setup, you'll be converting to true rep 3 anyway.

D
___
Gluster-users mailing list
Gluster-users@gluster.org
http://lists.gluster.org/mailman/listinfo/gluster-users

Re: [Gluster-users] High-availability with KVM?

2017-01-20 Thread Gambit15
>
> Type: Distributed-Replicate
> Number of Bricks: 2 x 2 = 4
>

With that setup, you risk losing quorum whenever a single node goes down.
Brick 1 replicates to brick 2, and brick 3 replicates to brick 4. If one
brick in a pair goes down, that pair's client quorum falls to 50%, which can
leave the affected subvolume read-only under the default settings.

If you've only got 4 servers to play with, I suggest you move to
replication 3 arbiter 1. Put the arbiter for servers 1 & 2 on server 3, and
the arbiter for servers 3 & 4 on server 1.
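
A sketch of that layout (the brick paths here are just placeholders):

gluster volume create gv0 replica 3 arbiter 1 \
    server1:/bricks/gv0 server2:/bricks/gv0 server3:/bricks/gv0-arbiter \
    server3:/bricks/gv0 server4:/bricks/gv0 server1:/bricks/gv0-arbiter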

https://gluster.readthedocs.io/en/latest/Administrator%20Guide/arbiter-volumes-and-quorum/
___
Gluster-users mailing list
Gluster-users@gluster.org
http://lists.gluster.org/mailman/listinfo/gluster-users

Re: [Gluster-users] High-availability with KVM?

2017-01-20 Thread Gambit15
As long as there are enough nodes to satisfy quorum, the volumes should
remain R/W. Have you tried writing to the volume during this period?
Anything in your logs?

Doug

On 20 January 2017 at 17:06, Ziemowit Pierzycki 
wrote:

> Hi,
>
> I have a 4-node gluster with distributed-replicated volumes that serve
> out VM images.  The purpose of using gluster was to provide highly
> available storage for our virtualization platform.  When performing
> maintenance on one of the gluster nodes, I noticed the VM storage
> becomes unavailable for the duration of the node being shutdown and
> while the heal is running.  Each VM qemu process connects to gluster
> directly and needs to be restarted in order to bring up the VM again.
>
> So, is the setup wrong or does gluster not provide high availability?
>
> Thanks,
>
> Ziemowit
> ___
> Gluster-users mailing list
> Gluster-users@gluster.org
> http://lists.gluster.org/mailman/listinfo/gluster-users
>
___
Gluster-users mailing list
Gluster-users@gluster.org
http://lists.gluster.org/mailman/listinfo/gluster-users

Re: [Gluster-users] Convert to Shard - Setting Guidance

2017-01-20 Thread Gambit15
> If your images easily fit within the bricks, why do you need sharding in
>> the first place? It adds an extra layer of complexity & removes the cool
>> feature of having entire files on each brick, making DR & things a lot
>> easier.
>
>
> Because healing with large VM images completes orders of magnitude faster
> and consumes far less bandwidth/cpu/disk IO
>

Isn't that only the case with full, non-granular heals? And according to
the docs, "full" forces a resync of the entire volume, not just the
out-of-sync files. Seems a bit overkill for anything other than a brick
replacement...


One question - how do you plan to convert the VM's?
>
> - setup a new volume and copy the VM images to that?
>
> - or change the shard setting inplace? (I don't think that would work)
>
AFAIK, if you enable sharding on a volume, it will only apply to new files.
An easier option might be to enable sharding & then clone the VMs to create
new VHDs.
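
i.e. something along these lines (the volume name is a placeholder; the block
size is the one from the original post):

gluster volume set <volname> features.shard enable
gluster volume set <volname> features.shard-block-size 512MB
# existing images stay whole; files written from this point on are sharded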

Doug
___
Gluster-users mailing list
Gluster-users@gluster.org
http://lists.gluster.org/mailman/listinfo/gluster-users

Re: [Gluster-users] Convert to Shard - Setting Guidance

2017-01-20 Thread Gambit15
>
> > data-self-heal-algorithm full
>
> There was a bug in the default algo, at least for VM hosting,
> not that long ago. Not sure if it was fixed but I know we were
> told here to use full instead, I'm guessing that's why he's using it too.
>

Huh, not heard of that. Do you have any useful links I could read up on? My
GoogleFu only returned a few posts from 5 years ago...



> > If your images easily fit within the bricks, why do you need sharding in
> > the first place? It adds an extra layer of complexity & removes the cool
> > feature of having entire files on each brick, making DR & things a lot
> > easier.
>
> Because healing a VM disk without sharding freezes it for the duration of
> the heal,
> possibly hours depending on the size. That's just not acceptable. Unless
> that's
> related to the bug in the heal algo and it's been fixed ? Not sure
>

I've only been using Gluster since 3.7, last year, but I've not had any of
those issues. Looking back, it's had granular locking & healing for a good
many versions now.
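
If you want to check whether granular entry heals are enabled on your volumes,
the option is exposed like any other (this is from memory, so do verify it
against your version's docs):

gluster volume get <volname> cluster.granular-entry-heal
gluster volume set <volname> cluster.granular-entry-heal on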

Doug
___
Gluster-users mailing list
Gluster-users@gluster.org
http://lists.gluster.org/mailman/listinfo/gluster-users

Re: [Gluster-users] Convert to Shard - Setting Guidance

2017-01-20 Thread Gambit15
It's a given, but test it well before going into production. People have
occasionally had problems with corruption when converting to shards.
In my initial tests, enabling sharding took our I/O down to 15Kbps, from
300Mbps without it.

> data-self-heal-algorithm full
>
That could be painful. Any particular reason you've chosen full?

>
> All Bricks 1TB SSD
> Image Sizes – Up to 300GB
>
If your images easily fit within the bricks, why do you need sharding in
the first place? It adds an extra layer of complexity & removes the cool
feature of having entire files on each brick, making DR & things a lot
easier.

Doug

On 20 January 2017 at 00:11, Gustave Dahl  wrote:

> I am looking for guidance on the recommend settings as I convert to
> shards.  I have read most of the list back through last year and I think
> the conclusion I came to was to keep it simple.
>
>
>
> One: It may take months to convert my current VM images to shards, do you
> see any issues with this?  My priority is to make sure future images are
> distributed as shards.
>
> Two:  Settings, my intent is to set it as follows based on guidance on the
> Redhat site and what I have been reading here.  Do these look okay?
> Additional suggestions?
>
>
>
> Modified Settings
>
> =
>
> features.shard enable
>
> features.shard-block-size 512MB
>
> data-self-heal-algorithm full
>
>
>
> Current Hardware
>
> =
>
> Hyper-converged.  VM’s running Gluster Nodes
>
> Currently across three servers.  Distributed-Replicate  - All Bricks 1TB
> SSD
>
> Network - 10GB Connections
>
> Image Sizes – Up to 300GB
>
>
>
> Current Gluster Version
>
> ===
>
> 3.8.4
>
>
>
> Current Settings
>
> =
>
> Type: Distributed-Replicate
>
> Number of Bricks: 4 x 3 = 12
>
> Transport-type: tcp
>
> Options Reconfigured:
>
> cluster.server-quorum-type: server
>
> cluster.quorum-type: auto
>
> network.remote-dio: enable
>
> cluster.eager-lock: enable
>
> performance.stat-prefetch: off
>
> performance.io-cache: off
>
> performance.read-ahead: off
>
> performance.quick-read: off
>
> server.allow-insecure: on
>
> performance.readdir-ahead: on
>
> performance.cache-size: 1GB
>
> performance.io-thread-count: 64
>
> nfs.disable: on
>
> ___
> Gluster-users mailing list
> Gluster-users@gluster.org
> http://lists.gluster.org/mailman/listinfo/gluster-users
>
___
Gluster-users mailing list
Gluster-users@gluster.org
http://lists.gluster.org/mailman/listinfo/gluster-users

Re: [Gluster-users] Migration path from native Gluster-NFS towards NFS-Ganesha

2017-01-16 Thread Gambit15
Why are you using NFS for using Gluster with oVirt? oVirt is natively able
to mount Gluster volumes via FUSE, which is *far* more efficient!
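
For reference, the FUSE mount oVirt sets up for a GlusterFS storage domain is
essentially just this (hostname/volume/mountpoint are placeholders):

mount -t glusterfs gluster1:/data /mnt/data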

Doug

On 12 January 2017 at 18:36, Giuseppe Ragusa 
wrote:

> Hi all,
>
> In light of the future removal of native Gluster-NFS (and also because of
> a worrying bug that causes NFS crashes, see https://bugzilla.redhat.com/
> show_bug.cgi?id=1381970 then http://www.gluster.org/
> pipermail/gluster-users/2016-November/029333.html and recently
> http://www.gluster.org/pipermail/gluster-users/2017-January/029632.html )
> I'm planning to move towards NFS-Ganesha.
>
> I have a couple of questions for which I could not find answers on the
> available docs (sorry if I missed something):
>
> 1) Is it possible (and advisable, in production too) today (3.8.x) to
> configure a GlusterFS based cluster to use NFS-Ganesha (as NFS v3/v4
> solution) and Samba (as CIFS solution) both controlled by CTDB as a highly
> available *and* load balanced (multiple IPs with DNS round-robin, not
> active/passive) storage solution? (note: I mean *without* using a full
> Pacemaker+Corosync stack)
>
> 2) If the answer to the above question is "yes", is the above above
> mentioned solution capable of coexisting with oVirt in an hyperconverged
> setup (assuming replica 3 etc. etc.)?
>
> Many thanks in advance to anyone who can answer the above and/or point me
> to any relevant resources/docs.
>
> Best regards,
> Giuseppe
>
> ___
> Gluster-users mailing list
> Gluster-users@gluster.org
> http://www.gluster.org/mailman/listinfo/gluster-users
>
___
Gluster-users mailing list
Gluster-users@gluster.org
http://lists.gluster.org/mailman/listinfo/gluster-users

Re: [Gluster-users] geo replication as backup

2016-11-24 Thread Gambit15
I've got a couple of geo-diverse high-capacity ZFS storage boxes for this
exact purpose. Geo-rep rsyncs to the boxes & regular snapshots are taken of
the ZFS volumes. Works flawlessly & allows us to traverse & restore
specific versions of individual files in seconds/minutes.
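
For the record, the moving parts are roughly the following (names are
placeholders, and the ssh/pem setup steps are covered in the geo-replication
section of the admin guide):

gluster volume geo-replication mastervol backuphost::backupvol create push-pem
gluster volume geo-replication mastervol backuphost::backupvol start
# ...and on the ZFS box, periodic snapshots of the target dataset, e.g.:
zfs snapshot tank/backupvol@daily-$(date +%F)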

On 21 November 2016 at 13:32, Gandalf Corvotempesta <
gandalf.corvotempe...@gmail.com> wrote:

> 2016-11-21 15:48 GMT+01:00 Aravinda :
> > When you set checkpoint, you can watch the status of checkpoint
> completion
> > using geo-rep status. Once checkpoint complete, it is guaranteed that
> > everything created before Checkpoint Time is synced to slave.(Note: It
> only
> > ensures that all the creates/updates done before checkpoint time but
> Geo-rep
> > may sync the files which are created/modified after Checkpoint time)
> >
> > Read more about Checkpoint here
> > http://gluster.readthedocs.io/en/latest/Administrator%
> 20Guide/Geo%20Replication/#checkpoint
>
> Thank you.
> So, can I assume this is the official (and only) way to properly
> backup a Gluster storage ?
> I'm saying "only" way because it would be impossible to backup a multi
> terabyte storage with any other software.
> ___
> Gluster-users mailing list
> Gluster-users@gluster.org
> http://www.gluster.org/mailman/listinfo/gluster-users
>
___
Gluster-users mailing list
Gluster-users@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-users

Re: [Gluster-users] Improving IOPS

2016-11-03 Thread Gambit15
There are lots of factors involved. Can you describe your setup & use case
a little more?

Doug

On 2 November 2016 at 00:09, Lindsay Mathieson 
wrote:

> And after having posted about the dangers of premature optimisation ...
> any suggestion for improving IOPS? as per earlier suggestions I tried
> setting server.event-threads and client.event-threads to 4, but it made no
> real difference.
>
>
> nb: the limiting factor on my cluster is the network (2 * 1G).
>
>
> --
> Lindsay Mathieson
>
> ___
> Gluster-users mailing list
> Gluster-users@gluster.org
> http://www.gluster.org/mailman/listinfo/gluster-users
>
___
Gluster-users mailing list
Gluster-users@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-users

Re: [Gluster-users] What application workloads are too slow for you on gluster?

2016-09-26 Thread Gambit15
No problems with web hosting here, including loads of busy Wordpress sites
& the like. However you need to tune your filesystems correctly.
In our case, we've got webserver VMs running on top of a Gluster layer with
the following configurations...

   - Swap either disabled or strictly minimised with *vm.swappiness=10*
   - All filesystems are mounted with *relatime*
   - All logging is exported to an external server
   - Memcache on each server
   - nginx reverse proxy cache (Squid & Varnish'd also do the job)
   - APC PHP cache

Joe's even written a page about optimisations.

In all, the filesystem is only touched when a user uploads a file.
Everything else pretty much runs in memory.

This problem isn't so much Gluster's fault, as people trying to use a
distributed filesystem/volume like a local disk.

On 24 September 2016 at 20:36, Pranith Kumar Karampuri 
wrote:

>
>
> On Sat, Sep 24, 2016 at 8:59 PM, Kevin Lemonnier 
> wrote:
>
>> On Sat, Sep 24, 2016 at 07:48:53PM +0530, Pranith Kumar Karampuri wrote:
>> >hi,
>> >  I want to get a sense of the kinds of applications you tried
>> >out on gluster but you had to find other alternatives because gluster
>> >didn't perform well enough or the solution would become too expensive if
>> >you move to all SSD kind of setup.
>>
>> Hi,
>>
>> Web Hosting is what comes to mind for me. Applications like prestashop,
>> wordpress,
>> some custom apps ... I know that I try to use DRBD as much as I can for
>> that since
>> GlusterFS makes the sites just way too slow to use, I tried both fuse and
>> NFS (not
>> ganesha since I'm on debian everytime though, don't know if that matters).
>> Using things like OPCache and moving the application's cache outside of
>> the volume
>> are helping a lot but that brings a whole loads of other problems you
>> can't always
>> deal with, so most of the time I just don't use gluster for that.
>>
>> Last time I really had to use gluster to host a web app I ended up
>> installing a VM
>> with a disk stored on glusterfs and configuring a simple NFS server, that
>> was way
>> faster than mounting a gluster volume directly on the web servers. At
>> least that
>> proves VM hosting works pretty well now though !
>>
>> Now I can't try tiering, unfortunately I don't have the option of having
>> hardware for
>> that, but maybe that would indeed solve it if it makes looking up lots of
>> tiny files
>> quicker.
>>
>
> I guess website hosting could be small-file workload related? Tiering may
> not help here until we reduce network round trips. I was wondering if
> anyone has any write intensive workload that gluster couldn't keep up with
> and they had to move to SSDs to make sure gluster works fine with it.
>
>
>>
>> --
>> Kevin Lemonnier
>> PGP Fingerprint : 89A5 2283 04A0 E6E9 0111
>>
>> ___
>> Gluster-users mailing list
>> Gluster-users@gluster.org
>> http://www.gluster.org/mailman/listinfo/gluster-users
>>
>
>
>
> --
> Pranith
>
> ___
> Gluster-users mailing list
> Gluster-users@gluster.org
> http://www.gluster.org/mailman/listinfo/gluster-users
>
___
Gluster-users mailing list
Gluster-users@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-users

Re: [Gluster-users] So what are people using for 10G nics

2016-08-26 Thread Gambit15
Switch wise, have a look at the HP FlexFabric 5700-32XGT-8XG-2QSFP+ & Cisco
SG550XG-24T.

For what it's worth, you can minimise your bandwidth whilst maintaining
quorum if you use arbiters.

https://gluster.readthedocs.io/en/latest/Administrator%
20Guide/arbiter-volumes-and-quorum/

On 26 August 2016 at 16:04, WK  wrote:

> Prices seem to be dropping online at NewEgg etc and going from 2 nodes to
> 3 nodes for a quorum implies a lot more traffic than would be comfortable
> with 1G.
>
> Any NIC/Switch recommendations for RH/Cent 7.x and Ubuntu 16?
>
>
> -wk
>
> ___
> Gluster-users mailing list
> Gluster-users@gluster.org
> http://www.gluster.org/mailman/listinfo/gluster-users
>
___
Gluster-users mailing list
Gluster-users@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-users