Re: [Gluster-users] trashcan on dist. repl. volume with geo-replication

2018-03-12 Thread Kotresh Hiremath Ravishankar
Hi Dietmar,

I am trying to understand the problem and have a few questions.

1. Is the trashcan enabled only on the master volume?
2. Is the 'rm -rf' done on the master volume synced to the slave?
3. If the trashcan is disabled, does the issue go away?
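
A quick way to run that test, just a sketch (this only flips the volume option,
assuming the volume name mvol1 from your output below):

gluster volume set mvol1 features.trash off
# ... repeat the rm -rf and the ls on the trashcan ...
gluster volume set mvol1 features.trash on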

The geo-rep error just says that it failed to create the directory
"Oracle_VM_VirtualBox_Extension" on the slave.
Usually this would be because of a gfid mismatch, but I don't see that in your
case. So I am a little more interested
in the present state of the geo-rep. Is it still throwing the same errors and the
same failure to sync the same directory? If
so, does the parent 'test1/b1' exist on the slave?
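
For example, something like this on one of the slave brick nodes, just a sketch,
assuming the slave brick path mirrors the master's /brick1/mvol1 layout (adjust
to your setup):

ls -ld /brick1/mvol1/test1/b1 /brick1/mvol1/.trashcan/test1/b1
getfattr -n trusted.gfid -e hex /brick1/mvol1/.trashcan/test1/b1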

And doing an ls on the trashcan should not affect geo-rep. Is there an easy
reproducer for this?


Thanks,
Kotresh HR

On Mon, Mar 12, 2018 at 10:13 PM, Dietmar Putz 
wrote:

> Hello,
>
> in regard to
> https://bugzilla.redhat.com/show_bug.cgi?id=1434066
> I have been faced with another issue when using the trashcan feature on a
> dist. repl. volume running a geo-replication (gfs 3.12.6 on Ubuntu 16.04.4).
> E.g. removing an entire directory with subfolders :
> tron@gl-node1:/myvol-1/test1/b1$ rm -rf *
>
> afterwards listing files in the trashcan :
> tron@gl-node1:/myvol-1/test1$ ls -la /myvol-1/.trashcan/test1/b1/
>
> leads to an outage of the geo-replication.
> error on master-01 and master-02 :
>
> [2018-03-12 13:37:14.827204] I [master(/brick1/mvol1):1385:crawl]
> _GMaster: slave's time stime=(1520861818, 0)
> [2018-03-12 13:37:14.835535] E [master(/brick1/mvol1):784:log_failures]
> _GMaster: ENTRY FAILED  data=({'uid': 0, 'gfid':
> 'c38f75e3-194a-4d22-9094-50ac8f8756e7', 'gid': 0, 'mode': 16877, 'entry':
> '.gfid/5531bd64-ac50-462b-943e-c0bf1c52f52c/Oracle_VM_VirtualBox_Extension',
> 'op': 'MKDIR'}, 2, {'gfid_mismatch': False, 'dst': False})
> [2018-03-12 13:37:14.835911] E 
> [syncdutils(/brick1/mvol1):299:log_raise_exception]
> : The above directory failed to sync. Please fix it to proceed further.
>
>
> Both gfids of the directories as shown in the log :
> brick1/mvol1/.trashcan/test1/b1 0x5531bd64ac50462b943ec0bf1c52f52c
> brick1/mvol1/.trashcan/test1/b1/Oracle_VM_VirtualBox_Extension
> 0xc38f75e3194a4d22909450ac8f8756e7
>
> The shown directory contains just one file which is stored on gl-node3 and
> gl-node4 while node1 and node2 are in geo-replication error.
> Since the filesize limitation of the trashcan is obsolete I'm really
> interested in using the trashcan feature but I'm concerned it will interrupt
> the geo-replication entirely.
> Has anybody else been faced with this situation... any hints,
> workarounds... ?
>
> best regards
> Dietmar Putz
>
>
> root@gl-node1:~/tmp# gluster volume info mvol1
>
> Volume Name: mvol1
> Type: Distributed-Replicate
> Volume ID: a1c74931-568c-4f40-8573-dd344553e557
> Status: Started
> Snapshot Count: 0
> Number of Bricks: 2 x 2 = 4
> Transport-type: tcp
> Bricks:
> Brick1: gl-node1-int:/brick1/mvol1
> Brick2: gl-node2-int:/brick1/mvol1
> Brick3: gl-node3-int:/brick1/mvol1
> Brick4: gl-node4-int:/brick1/mvol1
> Options Reconfigured:
> changelog.changelog: on
> geo-replication.ignore-pid-check: on
> geo-replication.indexing: on
> features.trash-max-filesize: 2GB
> features.trash: on
> transport.address-family: inet
> nfs.disable: on
> performance.client-io-threads: off
>
> root@gl-node1:/myvol-1/test1# gluster volume geo-replication mvol1 gl-node5-int::mvol1 config
> special_sync_mode: partial
> gluster_log_file: /var/log/glusterfs/geo-replication/mvol1/ssh%3A%2F%2Froot%40192.168.178.65%3Agluster%3A%2F%2F127.0.0.1%3Amvol1.gluster.log
> ssh_command: ssh -oPasswordAuthentication=no -oStrictHostKeyChecking=no -i /var/lib/glusterd/geo-replication/secret.pem
> change_detector: changelog
> use_meta_volume: true
> session_owner: a1c74931-568c-4f40-8573-dd344553e557
> state_file: /var/lib/glusterd/geo-replication/mvol1_gl-node5-int_mvol1/monitor.status
> gluster_params: aux-gfid-mount acl
> remote_gsyncd: /nonexistent/gsyncd
> working_dir: /var/lib/misc/glusterfsd/mvol1/ssh%3A%2F%2Froot%40192.168.178.65%3Agluster%3A%2F%2F127.0.0.1%3Amvol1
> state_detail_file: /var/lib/glusterd/geo-replication/mvol1_gl-node5-int_mvol1/ssh%3A%2F%2Froot%40192.168.178.65%3Agluster%3A%2F%2F127.0.0.1%3Amvol1-detail.status
> gluster_command_dir: /usr/sbin/
> pid_file: /var/lib/glusterd/geo-replication/mvol1_gl-node5-int_mvol1/monitor.pid
> georep_session_working_dir: /var/lib/glusterd/geo-replication/mvol1_gl-node5-int_mvol1/
> ssh_command_tar: ssh -oPasswordAuthentication=no -oStrictHostKeyChecking=no -i /var/lib/glusterd/geo-replication/tar_ssh.pem
> master.stime_xattr_name: trusted.glusterfs.a1c74931-568c-4f40-8573-dd344553e557.d62bda3a-1396-492a-ad99-7c6238d93c6a.stime
> changelog_log_file: /var/log/glusterfs/geo-replication/mvol1/ssh%3A%2F%2Froot%40192.168.178.65%3Agluster%3A%2F%2F127.0.0.1%3Amvol1-changes.log
> socketdir: /var/run/gluster
> volume_id: a1c74931-568c-4f40-8573-dd344553e557
> ignore_deletes: false
> 

[Gluster-users] trashcan on dist. repl. volume with geo-replication

2018-03-12 Thread Dietmar Putz

Hello,

in regard to
https://bugzilla.redhat.com/show_bug.cgi?id=1434066
I have been faced with another issue when using the trashcan feature on a
dist. repl. volume running a geo-replication (gfs 3.12.6 on Ubuntu 16.04.4).

E.g. removing an entire directory with subfolders :
tron@gl-node1:/myvol-1/test1/b1$ rm -rf *

afterwards listing files in the trashcan :
tron@gl-node1:/myvol-1/test1$ ls -la /myvol-1/.trashcan/test1/b1/

leads to an outage of the geo-replication.
error on master-01 and master-02 :

[2018-03-12 13:37:14.827204] I [master(/brick1/mvol1):1385:crawl] 
_GMaster: slave's time stime=(1520861818, 0)
[2018-03-12 13:37:14.835535] E [master(/brick1/mvol1):784:log_failures] 
_GMaster: ENTRY FAILED    data=({'uid': 0, 'gfid': 
'c38f75e3-194a-4d22-9094-50ac8f8756e7', 'gid': 0, 'mode': 16877, 
'entry': 
'.gfid/5531bd64-ac50-462b-943e-c0bf1c52f52c/Oracle_VM_VirtualBox_Extension', 
'op': 'MKDIR'}, 2, {'gfid_mismatch': False, 'dst': False})
[2018-03-12 13:37:14.835911] E 
[syncdutils(/brick1/mvol1):299:log_raise_exception] : The above 
directory failed to sync. Please fix it to proceed further.



Both gfids of the directories as shown in the log :
brick1/mvol1/.trashcan/test1/b1 0x5531bd64ac50462b943ec0bf1c52f52c
brick1/mvol1/.trashcan/test1/b1/Oracle_VM_VirtualBox_Extension 
0xc38f75e3194a4d22909450ac8f8756e7
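
For verification, gfids like these can be read straight from the brick
directories with getfattr (run on a brick node, not on the client mount), e.g.
a sketch using the brick paths above:

getfattr -n trusted.gfid -e hex /brick1/mvol1/.trashcan/test1/b1
getfattr -n trusted.gfid -e hex /brick1/mvol1/.trashcan/test1/b1/Oracle_VM_VirtualBox_Extension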


The shown directory contains just one file which is stored on gl-node3
and gl-node4 while node1 and node2 are in geo-replication error.
Since the filesize limitation of the trashcan is obsolete I'm really
interested in using the trashcan feature but I'm concerned it will
interrupt the geo-replication entirely.
Has anybody else been faced with this situation... any hints,
workarounds... ?


best regards
Dietmar Putz


root@gl-node1:~/tmp# gluster volume info mvol1

Volume Name: mvol1
Type: Distributed-Replicate
Volume ID: a1c74931-568c-4f40-8573-dd344553e557
Status: Started
Snapshot Count: 0
Number of Bricks: 2 x 2 = 4
Transport-type: tcp
Bricks:
Brick1: gl-node1-int:/brick1/mvol1
Brick2: gl-node2-int:/brick1/mvol1
Brick3: gl-node3-int:/brick1/mvol1
Brick4: gl-node4-int:/brick1/mvol1
Options Reconfigured:
changelog.changelog: on
geo-replication.ignore-pid-check: on
geo-replication.indexing: on
features.trash-max-filesize: 2GB
features.trash: on
transport.address-family: inet
nfs.disable: on
performance.client-io-threads: off

root@gl-node1:/myvol-1/test1# gluster volume geo-replication mvol1 
gl-node5-int::mvol1 config

special_sync_mode: partial
gluster_log_file: 
/var/log/glusterfs/geo-replication/mvol1/ssh%3A%2F%2Froot%40192.168.178.65%3Agluster%3A%2F%2F127.0.0.1%3Amvol1.gluster.log
ssh_command: ssh -oPasswordAuthentication=no -oStrictHostKeyChecking=no 
-i /var/lib/glusterd/geo-replication/secret.pem

change_detector: changelog
use_meta_volume: true
session_owner: a1c74931-568c-4f40-8573-dd344553e557
state_file: 
/var/lib/glusterd/geo-replication/mvol1_gl-node5-int_mvol1/monitor.status

gluster_params: aux-gfid-mount acl
remote_gsyncd: /nonexistent/gsyncd
working_dir: 
/var/lib/misc/glusterfsd/mvol1/ssh%3A%2F%2Froot%40192.168.178.65%3Agluster%3A%2F%2F127.0.0.1%3Amvol1
state_detail_file: 
/var/lib/glusterd/geo-replication/mvol1_gl-node5-int_mvol1/ssh%3A%2F%2Froot%40192.168.178.65%3Agluster%3A%2F%2F127.0.0.1%3Amvol1-detail.status

gluster_command_dir: /usr/sbin/
pid_file: 
/var/lib/glusterd/geo-replication/mvol1_gl-node5-int_mvol1/monitor.pid
georep_session_working_dir: 
/var/lib/glusterd/geo-replication/mvol1_gl-node5-int_mvol1/
ssh_command_tar: ssh -oPasswordAuthentication=no 
-oStrictHostKeyChecking=no -i /var/lib/glusterd/geo-replication/tar_ssh.pem
master.stime_xattr_name: 
trusted.glusterfs.a1c74931-568c-4f40-8573-dd344553e557.d62bda3a-1396-492a-ad99-7c6238d93c6a.stime
changelog_log_file: 
/var/log/glusterfs/geo-replication/mvol1/ssh%3A%2F%2Froot%40192.168.178.65%3Agluster%3A%2F%2F127.0.0.1%3Amvol1-changes.log

socketdir: /var/run/gluster
volume_id: a1c74931-568c-4f40-8573-dd344553e557
ignore_deletes: false
state_socket_unencoded: 
/var/lib/glusterd/geo-replication/mvol1_gl-node5-int_mvol1/ssh%3A%2F%2Froot%40192.168.178.65%3Agluster%3A%2F%2F127.0.0.1%3Amvol1.socket
log_file: 
/var/log/glusterfs/geo-replication/mvol1/ssh%3A%2F%2Froot%40192.168.178.65%3Agluster%3A%2F%2F127.0.0.1%3Amvol1.log

access_mount: true
root@gl-node1:/myvol-1/test1#

--

___
Gluster-users mailing list
Gluster-users@gluster.org
http://lists.gluster.org/mailman/listinfo/gluster-users

[Gluster-users] Can't heal a volume: "Please check if all brick processes are running."

2018-03-12 Thread Anatoliy Dmytriyev

Hello,

We have a very fresh gluster 3.10.10 installation.
Our volume is created as a distributed volume, 9 bricks, 96TB in total
(87TB after the 10% gluster disk space reservation).


For some reason I can’t “heal” the volume:
# gluster volume heal gv0
Launching heal operation to perform index self heal on volume gv0 has 
been unsuccessful on bricks that are down. Please check if all brick 
processes are running.


Which processes should be running on every brick for the heal operation?
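
For reference, the brick-side processes can be checked on each node with
something like this (a rough sketch; exact names can vary by version):

ps -ef | grep glusterfsd   # one brick process per local brick
ps -ef | grep glustershd   # self-heal daemon, normally only present for replicate/disperse volumes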

# gluster volume status
Status of volume: gv0
Gluster process                             TCP Port  RDMA Port  Online  Pid
------------------------------------------------------------------------------
Brick cn01-ib:/gfs/gv0/brick1/brick         0         49152      Y       70850
Brick cn02-ib:/gfs/gv0/brick1/brick         0         49152      Y       102951
Brick cn03-ib:/gfs/gv0/brick1/brick         0         49152      Y       57535
Brick cn04-ib:/gfs/gv0/brick1/brick         0         49152      Y       56676
Brick cn05-ib:/gfs/gv0/brick1/brick         0         49152      Y       56880
Brick cn06-ib:/gfs/gv0/brick1/brick         0         49152      Y       56889
Brick cn07-ib:/gfs/gv0/brick1/brick         0         49152      Y       56902
Brick cn08-ib:/gfs/gv0/brick1/brick         0         49152      Y       94920
Brick cn09-ib:/gfs/gv0/brick1/brick         0         49152      Y       56542


Task Status of Volume gv0
--
There are no active volume tasks


# gluster volume info gv0
Volume Name: gv0
Type: Distribute
Volume ID: 8becaf78-cf2d-4991-93bf-f2446688154f
Status: Started
Snapshot Count: 0
Number of Bricks: 9
Transport-type: rdma
Bricks:
Brick1: cn01-ib:/gfs/gv0/brick1/brick
Brick2: cn02-ib:/gfs/gv0/brick1/brick
Brick3: cn03-ib:/gfs/gv0/brick1/brick
Brick4: cn04-ib:/gfs/gv0/brick1/brick
Brick5: cn05-ib:/gfs/gv0/brick1/brick
Brick6: cn06-ib:/gfs/gv0/brick1/brick
Brick7: cn07-ib:/gfs/gv0/brick1/brick
Brick8: cn08-ib:/gfs/gv0/brick1/brick
Brick9: cn09-ib:/gfs/gv0/brick1/brick
Options Reconfigured:
client.event-threads: 8
performance.parallel-readdir: on
performance.readdir-ahead: on
cluster.nufa: on
nfs.disable: on


--
Best regards,
Anatoliy
___
Gluster-users mailing list
Gluster-users@gluster.org
http://lists.gluster.org/mailman/listinfo/gluster-users

Re: [Gluster-users] Expected performance for WORM scenario

2018-03-12 Thread Nithya Balachandran
Hi,

Can you send us the following details:
1. gluster volume info
2. What client are you using to run this?
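
Something like the following would do for both, just a sketch:

gluster volume info                    # on one of the servers
mount | grep -i -e gluster -e nfs      # on the client, to see how the volume is mounted
glusterfs --version                    # client version, if it is a FUSE mount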

Thanks,
Nithya

On 12 March 2018 at 18:16, Andreas Ericsson 
wrote:

> Heya fellas.
>
> I've been struggling quite a lot to get glusterfs to perform even
> half-decently with a write-intensive workload. Test numbers are from gluster
> 3.10.7.
>
> We store a bunch of small files in a doubly-tiered sha1 hash fanout
> directory structure. The directories themselves aren't overly full. Most of
> the data we write to gluster is "write once, read probably never", so 99%
> of all operations are of the write variety.
>
> The network between servers is sound. 10gb network cards run over a 10gb
> (doh) switch. iperf reports 9.86Gbit/sec. ping reports a latency of 0.1 -
> 0.2 ms. There is no firewall, no packet inspection and no nothing between
> the servers, and the 10gb switch is the only path between the two machines,
> so traffic isn't going over some 2mbit wifi by accident.
>
> Our main storage has always been really slow (write speed of roughly
> 1.5MiB/s), but I had long attributed that to the extremely slow disks we
> use to back it, so now that we're expanding I set up a new gluster cluster
> with state-of-the-art NVMe SSD drives to boost performance. However,
> performance only hopped up to around 2.1MiB/s. Perplexed, I tried it first
> with a 3-node cluster using 2GB ramdrives, which got me up to 2.4MiB/s. My
> last resort was to use a single node running on ramdisk, just to 100%
> exclude any network shenanigans, but the write performance stayed at an
> absolutely abysmal 3MiB/s.
>
> Writing straight to (the same) ramdisk gives me "normal" ramdisk speed (I
> don't actually remember the numbers, but my test that took 2 minutes with
> gluster completed before I had time to blink). Writing straight to the
> backing SSD drives gives me a throughput of 96MiB/sec.
>
> The test itself writes 8494 files that I simply took randomly from our
> production environment, comprising a total of 63.4MiB (so average file size
> is just under 8k; most are actually close to 4k though, with the occasional
> 2-or-so MB file in there).
>
> I have googled and read a *lot* of performance-tuning guides, but the
> 3MiB/sec on single-node ramdisk seems to be far beyond the crippling one
> can cause by misconfiguration of a single system.
>
> With this in mind: what sort of write performance can one reasonably hope
> to get with gluster? Assume a 3-node cluster running on top of (small)
> ramdisks on a fast and stable network. Is it just a bad fit for our
> workload?
>
> /Andreas
>
> ___
> Gluster-users mailing list
> Gluster-users@gluster.org
> http://lists.gluster.org/mailman/listinfo/gluster-users
>
___
Gluster-users mailing list
Gluster-users@gluster.org
http://lists.gluster.org/mailman/listinfo/gluster-users

Re: [Gluster-users] Expected performance for WORM scenario

2018-03-12 Thread Ondrej Valousek
Hi,
Gluster will never perform well for small files.
I believe there is nothing you can do about this.
Ondrej

From: gluster-users-boun...@gluster.org 
[mailto:gluster-users-boun...@gluster.org] On Behalf Of Andreas Ericsson
Sent: Monday, March 12, 2018 1:47 PM
To: Gluster-users@gluster.org
Subject: [Gluster-users] Expected performance for WORM scenario

Heya fellas.

I've been struggling quite a lot to get glusterfs to perform even half-decently
with a write-intensive workload. Test numbers are from gluster 3.10.7.

We store a bunch of small files in a doubly-tiered sha1 hash fanout directory 
structure. The directories themselves aren't overly full. Most of the data we 
write to gluster is "write once, read probably never", so 99% of all operations 
are of the write variety.

The network between servers is sound. 10gb network cards run over a 10gb (doh) 
switch. iperf reports 9.86Gbit/sec. ping reports a latency of 0.1 - 0.2 ms. 
There is no firewall, no packet inspection and no nothing between the servers, 
and the 10gb switch is the only path between the two machines, so traffic isn't 
going over some 2mbit wifi by accident.

Our main storage has always been really slow (write speed of roughly 1.5MiB/s), 
but I had long attributed that to the extremely slow disks we use to back it, 
so now that we're expanding I set up a new gluster cluster with
state-of-the-art NVMe SSD drives to boost performance. However, performance only hopped up
to around 2.1MiB/s. Perplexed, I tried it first with a 3-node cluster using 2GB 
ramdrives, which got me up to 2.4MiB/s. My last resort was to use a single node 
running on ramdisk, just to 100% exclude any network shenanigans, but the write 
performance stayed at an absolutely abysmal 3MiB/s.

Writing straight to (the same) ramdisk gives me "normal" ramdisk speed (I don't 
actually remember the numbers, but my test that took 2 minutes with gluster 
completed before I had time to blink). Writing straight to the backing SSD 
drives gives me a throughput of 96MiB/sec.

The test itself writes 8494 files that I simply took randomly from our
production environment, comprising a total of 63.4MiB (so average file size is
just under 8k; most are actually close to 4k though, with the occasional
2-or-so MB file in there).

I have googled and read a *lot* of performance-tuning guides, but the 3MiB/sec 
on single-node ramdisk seems to be far beyond the crippling one can cause by 
misconfiguration of a single system.

With this in mind: what sort of write performance can one reasonably hope to
get with gluster? Assume a 3-node cluster running on top of (small) ramdisks on 
a fast and stable network. Is it just a bad fit for our workload?

/Andreas

-

The information contained in this e-mail and in any attachments is confidential 
and is designated solely for the attention of the intended recipient(s). If you 
are not an intended recipient, you must not use, disclose, copy, distribute or 
retain this e-mail or any part thereof. If you have received this e-mail in 
error, please notify the sender by return e-mail and delete all copies of this 
e-mail from your computer system(s). Please direct any additional queries to: 
communicati...@s3group.com. Thank You. Silicon and Software Systems Limited (S3 
Group). Registered in Ireland no. 378073. Registered Office: South County 
Business Park, Leopardstown, Dublin 18.
___
Gluster-users mailing list
Gluster-users@gluster.org
http://lists.gluster.org/mailman/listinfo/gluster-users

[Gluster-users] Expected performance for WORM scenario

2018-03-12 Thread Andreas Ericsson
Heya fellas.

I've been struggling quite a lot to get glusterfs to perform even
half-decently with a write-intensive workload. Test numbers are from gluster
3.10.7.

We store a bunch of small files in a doubly-tiered sha1 hash fanout
directory structure. The directories themselves aren't overly full. Most of
the data we write to gluster is "write once, read probably never", so 99%
of all operations are of the write variety.

The network between servers is sound. 10gb network cards run over a 10gb
(doh) switch. iperf reports 9.86Gbit/sec. ping reports a latency of 0.1 -
0.2 ms. There is no firewall, no packet inspection and no nothing between
the servers, and the 10gb switch is the only path between the two machines,
so traffic isn't going over some 2mbit wifi by accident.

Our main storage has always been really slow (write speed of roughly
1.5MiB/s), but I had long attributed that to the extremely slow disks we
use to back it, so now that we're expanding I set up a new gluster cluster
with state-of-the-art NVMe SSD drives to boost performance. However,
performance only hopped up to around 2.1MiB/s. Perplexed, I tried it first
with a 3-node cluster using 2GB ramdrives, which got me up to 2.4MiB/s. My
last resort was to use a single node running on ramdisk, just to 100%
exclude any network shenanigans, but the write performance stayed at an
absolutely abysmal 3MiB/s.

Writing straight to (the same) ramdisk gives me "normal" ramdisk speed (I
don't actually remember the numbers, but my test that took 2 minutes with
gluster completed before I had time to blink). Writing straight to the
backing SSD drives gives me a throughput of 96MiB/sec.

The test itself writes 8494 files that I simply took randomly from our
production environment, comprising a total of 63.4MiB (so average file size
is just under 8k; most are actually close to 4k though, with the occasional
2-or-so MB file in there).
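
In case it helps to reproduce: the test is roughly equivalent to the loop below.
This is only a sketch, not the actual script; it assumes the volume is mounted
at /mnt/gluster and just writes random ~8k files into a two-level sha1 fanout:

time for i in $(seq 1 8494); do
  name=$(head -c 8192 /dev/urandom | sha1sum | cut -c1-40)
  dir=/mnt/gluster/${name:0:2}/${name:2:2}
  mkdir -p "$dir"
  head -c 8192 /dev/urandom > "$dir/$name"
done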

I have googled and read a *lot* of performance-tuning guides, but the
3MiB/sec on single-node ramdisk seems to be far beyond the crippling one
can cause by misconfiguration of a single system.

With this in mind: what sort of write performance can one reasonably hope
to get with gluster? Assume a 3-node cluster running on top of (small)
ramdisks on a fast and stable network. Is it just a bad fit for our
workload?

/Andreas
___
Gluster-users mailing list
Gluster-users@gluster.org
http://lists.gluster.org/mailman/listinfo/gluster-users