Re: [Gluster-users] Issues with Geo-replication (GlusterFS 6.3 on Ubuntu 18.04)

2019-10-10 Thread Alexander Iliev

Hi all,

I ended up reinstalling the nodes with CentOS 7.5 and GlusterFS 6.5 
(installed from the SIG).


Now when I try to create a replication session I get the following:

> # gluster volume geo-replication store1 ::store2 create push-pem
> Unable to mount and fetch slave volume details. Please check the log:
> /var/log/glusterfs/geo-replication/gverify-slavemnt.log
> geo-replication command failed

You can find the contents of gverify-slavemnt.log below, but the initial 
error seems to be:


> [2019-10-10 22:07:51.578519] E [fuse-bridge.c:5211:fuse_first_lookup]
> 0-fuse: first lookup on root failed (Transport endpoint is not connected)


I only found [this](https://bugzilla.redhat.com/show_bug.cgi?id=1659824) 
bug report, which doesn't seem to help. The issue reported there is a failure 
to mount a volume on a GlusterFS client, but in my case I need 
geo-replication, which implies that the client (the geo-replication master) 
is on a different network.
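
For what it's worth, the mount step that gverify.sh performs can be 
reproduced manually along these lines (a sketch only; <slavehost> is a 
placeholder for one of the slave-side nodes reachable from this master):

  # mkdir -p /mnt/slave-test
  # glusterfs --volfile-server=<slavehost> --volfile-id=store2 /mnt/slave-test
  # ls /mnt/slave-test && umount /mnt/slave-test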


Any help will be appreciated.

Thanks!

gverify-slavemnt.log:

> [2019-10-10 22:07:40.571256] I [MSGID: 100030] 
[glusterfsd.c:2847:main] 0-glusterfs: Started running glusterfs version 
6.5 (args: glusterfs --xlator-option=*dht.lookup-unhashed=off 
--volfile-server  --volfile-id store2 -l 
/var/log/glusterfs/geo-replication/gverify-slavemnt.log 
/tmp/gverify.sh.5nFlRh)
> [2019-10-10 22:07:40.575438] I [glusterfsd.c:2556:daemonize] 
0-glusterfs: Pid of current running process is 6021
> [2019-10-10 22:07:40.584282] I [MSGID: 101190] 
[event-epoll.c:680:event_dispatch_epoll_worker] 0-epoll: Started thread 
with index 0
> [2019-10-10 22:07:40.584299] I [MSGID: 101190] 
[event-epoll.c:680:event_dispatch_epoll_worker] 0-epoll: Started thread 
with index 1
> [2019-10-10 22:07:40.928094] I [MSGID: 114020] [client.c:2393:notify] 
0-store2-client-0: parent translators are ready, attempting connect on 
transport
> [2019-10-10 22:07:40.931121] I [MSGID: 114020] [client.c:2393:notify] 
0-store2-client-1: parent translators are ready, attempting connect on 
transport
> [2019-10-10 22:07:40.933976] I [MSGID: 114020] [client.c:2393:notify] 
0-store2-client-2: parent translators are ready, attempting connect on 
transport

> Final graph:
> 
> +------------------------------------------------------------------------------+

>   1: volume store2-client-0
>   2: type protocol/client
>   3: option ping-timeout 42
>   4: option remote-host 172.31.36.11
>   5: option remote-subvolume /data/gfs/store1/1/brick-store2
>   6: option transport-type socket
>   7: option transport.address-family inet
>   8: option transport.socket.ssl-enabled off
>   9: option transport.tcp-user-timeout 0
>  10: option transport.socket.keepalive-time 20
>  11: option transport.socket.keepalive-interval 2
>  12: option transport.socket.keepalive-count 9
>  13: option send-gids true
>  14: end-volume
>  15:
>  16: volume store2-client-1
>  17: type protocol/client
>  18: option ping-timeout 42
>  19: option remote-host 172.31.36.12
>  20: option remote-subvolume /data/gfs/store1/1/brick-store2
>  21: option transport-type socket
>  22: option transport.address-family inet
>  23: option transport.socket.ssl-enabled off
>  24: option transport.tcp-user-timeout 0
>  25: option transport.socket.keepalive-time 20
>  26: option transport.socket.keepalive-interval 2
>  27: option transport.socket.keepalive-count 9
>  28: option send-gids true
>  29: end-volume
>  30:
>  31: volume store2-client-2
>  32: type protocol/client
>  33: option ping-timeout 42
>  34: option remote-host 172.31.36.13
>  35: option remote-subvolume /data/gfs/store1/1/brick-store2
>  36: option transport-type socket
>  37: option transport.address-family inet
>  38: option transport.socket.ssl-enabled off
>  39: option transport.tcp-user-timeout 0
>  40: option transport.socket.keepalive-time 20
>  41: option transport.socket.keepalive-interval 2
>  42: option transport.socket.keepalive-count 9
>  43: option send-gids true
>  44: end-volume
>  45:
>  46: volume store2-replicate-0
>  47: type cluster/replicate
>  48: option afr-pending-xattr store2-client-0,store2-client-1,store2-client-2
>  49: option use-compound-fops off
>  50: subvolumes store2-client-0 store2-client-1 store2-client-2
>  51: end-volume
>  52:
>  53: volume store2-dht
>  54: type cluster/distribute
>  55: option lookup-unhashed off
>  56: option lock-migration off
>  57: option force-migration off
>  58: subvolumes store2-replicate-0
>  59: end-volume
>  60:
>  61: volume store2-write-behind
>  62: type performance/write-behind
>  63: subvolumes store2-dht
>  64: end-volume
>  65:
>  66: volume store2-read-ahead
>  67: type performance/read-ahead
>  68: subvolumes store2-write-behind
>  69: end-volume
>  70:
>  71: volume store2-readdir-ahead
>  72: type performance/readdir-ahead
>  

[Gluster-users] duplicate files on client side

2019-10-10 Thread Gerrit Giehl
Hi @all,

for the past few hours we have had a problem with duplicate files in one 
directory (and its subdirectories) of our volume.
It looks like the error occurs only on the client side (on the server side the 
folder structure looks fine) and that only files (not directories) are affected.

---
$ ls -l /glusterfs/gv0/affected-directory/
total 132
[...]
-rw-r--r--  1 user group 19878 May 16  2017 Error_404.html
-rw-r--r--  1 user group 19878 May 16  2017 Error_404.html
[…]
-rw-rw-r--  1 user group    91 Mar 10  2017 robots.txt
-rw-rw-r--  1 user group    91 Mar 10  2017 robots.txt
[…]
---

an remount didn’t fix the problem but after replacing the mount directive 
everything works fine, but I don’t no why (and if there are hidden side 
effects) and `$ gluster volume heal gv0 info` shows also still an error.
So I also started a full heal request via `$ gluster volume heal gv0 full` 
(hopefully it will fix the whole thing).
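
For reference, a sketch of the heal-related commands involved here (assuming 
the volume name gv0):

---
$ gluster volume heal gv0 info               # entries still pending heal
$ gluster volume heal gv0 info split-brain   # entries in split-brain, if any
$ gluster volume heal gv0 full               # trigger a full crawl/heal
$ gluster volume heal gv0 statistics         # progress of the heal crawls
---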

---
old
---
host1:/gv0  /glusterfs/gv0  glusterfs  defaults,_netdev,log-level=WARNING  0 0
---

---
new
---
/etc/glusterfs/datastore-vg0.vol  /glusterfs/gv0  glusterfs  defaults,_netdev,log-level=WARNING  0 0
---
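
Another variant I am aware of (not tested here) keeps the server-generated 
volfile but adds fallback volfile servers; host2/host3 are simply other peers 
from the pool listed below:

---
host1:/gv0  /glusterfs/gv0  glusterfs  defaults,_netdev,backup-volfile-servers=host2:host3,log-level=WARNING  0 0
---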

Does anyone have an idea why this happened? Thanks in advance.

Best regards,
Gerrit




---
$ cat /etc/glusterfs/datastore-vg0.vol

volume remote1
  type protocol/client
  option transport-type tcp
  option remote-host host1
  option remote-subvolume /export/xvda3/brick
end-volume

volume remote2
  type protocol/client
  option transport-type tcp
  option remote-host host2
  option remote-subvolume /export/xvda3/brick
end-volume

volume remote3
  type protocol/client
  option transport-type tcp
  option remote-host host3
  option remote-subvolume /export/xvda3/brick
end-volume

volume remote4
  type protocol/client
  option transport-type tcp
  option remote-host host4
  option remote-subvolume /export/xvda3/brick
end-volume

volume replicate
  type cluster/replicate
  subvolumes remote1 remote2 remote3 remote4
end-volume

volume writebehind
  type performance/write-behind
  #option window-size 1MB
  subvolumes replicate
end-volume

volume cache
  type performance/io-cache
  option cache-size 512MB
  subvolumes writebehind
end-volume
---


---
Server - glusterfs 3.8.7
---
$ sudo gluster volume info

Volume Name: gv0
Type: Distributed-Replicate
Volume ID: xxx
Status: Started
Snapshot Count: 0
Number of Bricks: 2 x 2 = 4
Transport-type: tcp
Bricks:
Brick1: host1:/export/xvda3/brick
Brick2: host2:/export/xvda3/brick
Brick3: host3:/export/xvda3/brick
Brick4: host4:/export/xvda3/brick
Options Reconfigured:
transport.address-family: inet
performance.readdir-ahead: on
nfs.disable: off
performance.cache-size: 512MB
performance.client-io-threads: on
performance.io-thread-count: 64
performance.read-ahead: off
performance.cache-refresh-timeout: 1
---


---
$ gluster volume heal gv0 info
Brick host1:/export/xvda3/brick
Status: Connected
Number of entries: 0

Brick host2:/export/xvda3/brick
Status: Connected
Number of entries: 0

Brick host3:/export/xvda3/brick
Status: Connected
Number of entries: 0

Brick host4:/export/xvda3/brick
Status: Connected
Number of entries: 1
---



---
$ gluster volume heal gv0 statistics

[...]

Ending time of crawl: Thu Oct 10 17:30:54 2019

Type of crawl: INDEX
No. of entries healed: 0
No. of entries in split-brain: 0
No. of heal failed entries: 2

Starting time of crawl: Thu Oct 10 17:40:54 2019

Ending time of crawl: Thu Oct 10 17:40:55 2019

Type of crawl: INDEX
No. of entries healed: 0
No. of entries in split-brain: 0
No. of heal failed entries: 1

[...]

Starting time of crawl: Thu Oct 10 17:28:01 2019

Crawl is in progress
Type of crawl: FULL
No. of entries healed: 0
No. of entries in split-brain: 0
No. of heal failed entries: 0
---





[Gluster-users] [Gluster-devel] [Gluster-Maintainers] GlusterFS - 7.0RC3 - Test day (14th Oct 2019)

2019-10-10 Thread Rinku Kothiya
Hi,

Release-7 RC3 packages are built. We are planning to have a test day on
14-Oct-2019 and request your participation. Do post on the lists any
testing done and feedback for the same.

Packages for Fedora 29, Fedora 30, and RHEL 8 are at
https://download.gluster.org/pub/gluster/glusterfs/qa-releases/7.0rc3/

Packages for CentOS 7 are at:
https://buildlogs.centos.org/centos/7/storage/x86_64/gluster-7/

Packages are signed. The public key is at
https://download.gluster.org/pub/gluster/glusterfs/6/rsa.pub
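
(A sketch of how a downloaded package could be checked against that key; the
file names below are placeholders:)

# rpm --import rsa.pub
# rpm -K glusterfs-*.rpm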

Regards
Rinku




Re: [Gluster-users] Message repeated over and over after upgrade from 4.1 to 5.3: W [dict.c:761:dict_ref] (-->/usr/lib64/glusterfs/5.3/xlator/performance/quick-read.so(+0x7329) [0x7fd966fcd329] -->/us

2019-10-10 Thread lejeczek
On 30/01/2019 20:26, Artem Russakovskii wrote:
> I found a similar issue
> here: https://bugzilla.redhat.com/show_bug.cgi?id=1313567. There's a
> comment from 3 days ago from someone else with 5.3 who started seeing
> the spam.
>
> Here's the command that repeats over and over:
> [2019-01-30 20:23:24.481581] W [dict.c:761:dict_ref]
> (-->/usr/lib64/glusterfs/5.3/xlator/performance/quick-read.so(+0x7329)
> [0x7fd966fcd329]
> -->/usr/lib64/glusterfs/5.3/xlator/performance/io-cache.so(+0xaaf5)
> [0x7fd9671deaf5] -->/usr/lib64/libglusterfs.so.0(dict_ref+0x58)
> [0x7fd9731ea218] ) 2-dict: dict is NULL [Invalid argument]
>
> Is there any fix for this issue?
>
> Thanks.
>
> Sincerely,
> Artem
>
> --
> Founder, Android Police, APK Mirror, Illogical Robot LLC
> beerpla.net | +ArtemRussakovskii | @ArtemR

I get no crashes, but with 6.5 I see these:

...

[2019-10-10 15:52:08.528208] I [io-stats.c:4027:fini] 0-USER-HOME:
io-stats translator unloaded
[2019-10-10 15:52:10.441283] E [MSGID: 101046]
[dht-common.c:11247:dht_pt_fgetxattr_cbk] 0-USER-HOME-dht: dict is null
[2019-10-10 15:52:10.441387] E [MSGID: 101046]
[dht-common.c:11248:dht_pt_fgetxattr_cbk] 0-USER-HOME-dht: dict is null
[2019-10-10 15:52:10.555957] E [MSGID: 108006]
[afr-common.c:5318:__afr_handle_child_down_event]
0-USER-HOME-replicate-0: All subvolumes are down. Going offline until at
least one of them comes back up.
[2019-10-10 15:52:10.557136] I [io-stats.c:4027:fini] 0-USER-HOME:
io-stats translator unloaded
The message "E [MSGID: 101046] [dht-common.c:11220:dht_pt_getxattr_cbk]
0-USER-HOME-dht: dict is null" repeated 8 times between [2019-10-10
15:52:07.263547] and [2019-10-10 15:52:07.649220]
The message "E [MSGID: 101046] [dht-common.c:11221:dht_pt_getxattr_cbk]
0-USER-HOME-dht: dict is null" repeated 8 times between [2019-10-10
15:52:07.263620] and [2019-10-10 15:52:07.649223]
[2019-10-10 15:56:11.291652] E [MSGID: 101046]
[dht-common.c:11247:dht_pt_fgetxattr_cbk] 0-USER-HOME-dht: dict is null
[2019-10-10 15:56:11.291742] E [MSGID: 101046]
[dht-common.c:11248:dht_pt_fgetxattr_cbk] 0-USER-HOME-dht: dict is null
[2019-10-10 15:56:11.974495] E [MSGID: 101046]
[dht-common.c:11220:dht_pt_getxattr_cbk] 0-USER-HOME-dht: dict is null
[2019-10-10 15:56:11.974568] E [MSGID: 101046]
[dht-common.c:11221:dht_pt_getxattr_cbk] 0-USER-HOME-dht: dict is null
The message "E [MSGID: 101046] [dht-common.c:11220:dht_pt_getxattr_cbk]
0-USER-HOME-dht: dict is null" repeated 8 times between [2019-10-10
15:56:11.974495] and [2019-10-10 15:56:23.911313]
The message "E [MSGID: 101046] [dht-common.c:11221:dht_pt_getxattr_cbk]
0-USER-HOME-dht: dict is null" repeated 8 times between [2019-10-10
15:56:11.974568] and [2019-10-10 15:56:23.911316]

...

 
And in case it might have something to do with the above log errors:
interestingly, if quotas are in use on paths in the volume, then Windows
shares (Samba) refuse to copy in new data, claiming that 0 bytes are
free (which is false).
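
For context, this is roughly what I am comparing (a sketch; USER-HOME is the
volume named in the logs above, and the mount path is a placeholder):

# gluster volume quota USER-HOME list   # usage/limits as gluster reports them
# df -h /mnt/user-home                  # free space as the client (and Samba) sees it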






[Gluster-users] Gluster slow reads

2019-10-10 Thread adi



Hi,

I have a very small dispersed volume (4+2) on 2 servers with 3 bricks on 
each of them. The volume is mounted via the fuse client on another Linux 
server.


The volume worked well for a few months in this setup. However, in the 
last few days I have seen very slow read speeds (reads from the volume to 
the gluster client via the mount point). By slow I mean 3-4 MB/sec on a 
gigabit link. I don't have small files stored on gluster; the smallest 
file is around 30-40 MB. The networking between the client and the 
bricks is fine (all of them are connected to the same switch, no errors, 
and some iptraf tests directly between the client and the gluster servers 
look good). On the same client I have mounted another 2 gluster 
volumes from other servers, and both of them are fine.
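
For reference, a rough sketch of how I measure the read speed (file names 
and paths are placeholders):

 # on the client, through the fuse mount:
 dd if=/mnt/gluster4-vol/some-file of=/dev/null bs=1M
 # on a server, directly from a brick (only an encoded fragment on a
 # dispersed volume, so this is just a per-disk baseline):
 dd if=/export/sdb1/brick/some-file of=/dev/null bs=1M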


My 'gluster volume info' details are the following:

Volume Name: gluster4-vol
Type: Disperse
Volume ID: fa464bb9-b034-4fce-a56e-7ac157432d59
Status: Started
Snapshot Count: 0
Number of Bricks: 1 x (4 + 2) = 6
Transport-type: tcp
Bricks:
Brick1: gluster4:/export/sdb1/brick
Brick2: gluster5:/export/sdb1/brick
Brick3: gluster4:/export/sdc1/brick
Brick4: gluster5:/export/sdc1/brick
Brick5: gluster4:/export/sdd1/brick
Brick6: gluster5:/export/sdd1/brick
Options Reconfigured:
network.ping-timeout: 60
performance.client-io-threads: on
performance.io-thread-count: 32
cluster.readdir-optimize: on
performance.cache-size: 1GB
client.event-threads: 10
server.event-threads: 10
cluster.lookup-optimize: on
server.allow-insecure: on
storage.reserve: 0
transport.address-family: inet
nfs.disable: on


In the gluster client debug logs I noticed the following lines:

 [2019-10-10 13:49:58.571879] D [MSGID: 0] 
[io-threads.c:356:iot_schedule] 0-gluster4-vol-io-threads: OPEN 
scheduled as fast fop
[2019-10-10 13:49:58.571964] D [MSGID: 0] 
[io-threads.c:356:iot_schedule] 0-gluster4-vol-io-threads: FSTAT 
scheduled as fast fop
[2019-10-10 13:49:58.572058] D [MSGID: 0] 
[io-threads.c:356:iot_schedule] 0-gluster4-vol-io-threads: FLUSH 
scheduled as normal fop
[2019-10-10 13:49:58.572728] D [MSGID: 0] 
[io-threads.c:356:iot_schedule] 0-gluster4-vol-io-threads: OPEN 
scheduled as fast fop
[2019-10-10 13:49:58.576275] D [MSGID: 0] 
[io-threads.c:356:iot_schedule] 0-gluster4-vol-io-threads: FSTAT 
scheduled as fast fop
[2019-10-10 13:50:07.837069] D [logging.c:1952:_gf_msg_internal] 
0-logging-infra: Buffer overflow of a buffer whose size limit is 5. 
About to flush least recently used log message to disk
The message "D [MSGID: 0] [io-threads.c:356:iot_schedule] 
0-gluster4-vol-io-threads: READ scheduled as slow fop" repeated 285 
times between [2019-10-10 13:49:58.357922] and [2019-10-10 
13:50:07.837047]
[2019-10-10 13:50:07.837068] D [MSGID: 0] 
[io-threads.c:356:iot_schedule] 0-gluster4-vol-io-threads: FLUSH 
scheduled as normal fop
[2019-10-10 13:50:07.837165] D [MSGID: 0] 
[io-threads.c:356:iot_schedule] 0-gluster4-vol-io-threads: READ 
scheduled as slow fop



What exactly does "READ scheduled as slow fop" mean? Can I schedule 
READs as a normal or fast fop like the other operations?


I'm using this gluster volume only for reading, so I don't care about 
writes right now.
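
For reference, the io-threads tunables that look related (a sketch; I have 
not verified whether the READ classification itself can be changed, only 
that the per-priority thread counts are tunable):

 gluster volume get gluster4-vol all | grep prio-threads    # threads per priority queue
 gluster volume set gluster4-vol performance.low-prio-threads 32
 gluster volume profile gluster4-vol start
 gluster volume profile gluster4-vol info                   # per-fop latency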



 Thanks.





Re: [Gluster-users] volume's geo-replication.* features - needed if no georepl ?

2019-10-10 Thread Aravinda Vishwanathapura Krishna Murthy
These options are not required if Geo-replication is not used.
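
For reference, the current values can be checked per volume like this (a 
sketch, assuming a volume named gv0):

# gluster volume get gv0 geo-replication.indexing
# gluster volume get gv0 geo-replication.ignore-pid-check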

On Wed, Oct 9, 2019 at 1:11 PM lejeczek  wrote:

> hi everyone,
>
> are those options needed to be 'on' if cluster does not use georepl?
>
> > geo-replication.indexing            on
> > geo-replication.indexing            on
> > geo-replication.ignore-pid-check    on
> > geo-replication.ignore-pid-check    on
>
> And are there any (negative) ramifications of such a case?
>
> many thanks, L.
>


-- 
regards
Aravinda VK

