Re: [Gluster-users] Gluster slow reads

2019-10-15 Thread Ashish Pandey
Hi, 

I am keeping Raghvendra in the loop and hope he can comment on the "READ being 
scheduled as slow fop" question. 
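
For context, to the best of my understanding: the io-threads translator assigns 
every fop type to a fixed priority queue, and the debug message names those 
queues "fast", "normal", "slow" and "least". READ and WRITE land in the "slow" 
(low-priority) queue by design, so the message itself is expected. As far as I 
know the fop-to-queue mapping is not configurable, but the number of threads 
serving each queue is. A hedged sketch, assuming the standard io-threads 
tunables are available in this release:

```
# a sketch, not a verified fix: give the low-priority queue,
# where READs are scheduled, more worker threads (default 16, I believe)
gluster volume set gluster4-vol performance.low-prio-threads 32
```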

Other than that, I would request you to provide the following information to 
debug this issue. 

1 - Profile information of the volume (see the command sketch after this 
list). You can find the steps here: 
https://docs.gluster.org/en/latest/Administrator%20Guide/Monitoring%20Workload/ 

2 - Output of gluster volume status <volname> 

3 - Output of gluster volume heal <volname> info 

4 - Check the size of each brick to see whether it has filled up. 

5 - Gluster version on client and server. 

6 - Host OS version on client and server. 
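
For convenience, a sketch of the commands behind items 1-3, using the volume 
name from the report below:

```
# item 1: per-fop latency and throughput stats
gluster volume profile gluster4-vol start
# ...reproduce the slow reads, then dump the stats:
gluster volume profile gluster4-vol info

# items 2 and 3:
gluster volume status gluster4-vol
gluster volume heal gluster4-vol info
```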

--- 
Ashish 

- Original Message -

From: a...@telecompartners.ro 
To: gluster-users@gluster.org 
Sent: Thursday, October 10, 2019 8:08:24 PM 
Subject: [Gluster-users] Gluster slow reads 


Hi, 

I have a very small distributed dispersed volume with 4+2 and 2 servers 
with 3 bricks on each of them. The volume is mounted via the FUSE client on 
another Linux server. 

The volume worked well for a few months in this setup. However, in the 
last few days I have seen very slow read speeds (reads from the volume to the 
gluster client via the mount point). By slow I mean 3-4 MB/s on a 
gigabit link. I don't have small files stored on gluster; the smallest 
file is around 30-40 MB. The networking between the client and the 
bricks is fine (all of them are connected to the same switch, there are no 
errors, and some iptraf tests run directly between the client and the gluster 
servers look good). On the same client I have two other gluster volumes 
mounted from other servers, and both of them are OK. 

My 'gluster volume info' details are the following: 

Volume Name: gluster4-vol 
Type: Disperse 
Volume ID: fa464bb9-b034-4fce-a56e-7ac157432d59 
Status: Started 
Snapshot Count: 0 
Number of Bricks: 1 x (4 + 2) = 6 
Transport-type: tcp 
Bricks: 
Brick1: gluster4:/export/sdb1/brick 
Brick2: gluster5:/export/sdb1/brick 
Brick3: gluster4:/export/sdc1/brick 
Brick4: gluster5:/export/sdc1/brick 
Brick5: gluster4:/export/sdd1/brick 
Brick6: gluster5:/export/sdd1/brick 
Options Reconfigured: 
network.ping-timeout: 60 
performance.client-io-threads: on 
performance.io-thread-count: 32 
cluster.readdir-optimize: on 
performance.cache-size: 1GB 
client.event-threads: 10 
server.event-threads: 10 
cluster.lookup-optimize: on 
server.allow-insecure: on 
storage.reserve: 0 
transport.address-family: inet 
nfs.disable: on 
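
As a side note while debugging, the effective value of every volume option 
(including defaulted ones) can be listed with volume get:

```
# dump all effective options and pick out the io-threads related ones
gluster volume get gluster4-vol all | grep -E 'io-thread|prio'
```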


In the gluster client debug logs I noticed the following lines: 

[2019-10-10 13:49:58.571879] D [MSGID: 0] 
[io-threads.c:356:iot_schedule] 0-gluster4-vol-io-threads: OPEN 
scheduled as fast fop 
[2019-10-10 13:49:58.571964] D [MSGID: 0] 
[io-threads.c:356:iot_schedule] 0-gluster4-vol-io-threads: FSTAT 
scheduled as fast fop 
[2019-10-10 13:49:58.572058] D [MSGID: 0] 
[io-threads.c:356:iot_schedule] 0-gluster4-vol-io-threads: FLUSH 
scheduled as normal fop 
[2019-10-10 13:49:58.572728] D [MSGID: 0] 
[io-threads.c:356:iot_schedule] 0-gluster4-vol-io-threads: OPEN 
scheduled as fast fop 
[2019-10-10 13:49:58.576275] D [MSGID: 0] 
[io-threads.c:356:iot_schedule] 0-gluster4-vol-io-threads: FSTAT 
scheduled as fast fop 
[2019-10-10 13:50:07.837069] D [logging.c:1952:_gf_msg_internal] 
0-logging-infra: Buffer overflow of a buffer whose size limit is 5. 
About to flush least recently used log message to disk 
The message "D [MSGID: 0] [io-threads.c:356:iot_schedule] 
0-gluster4-vol-io-threads: READ scheduled as slow fop" repeated 285 
times between [2019-10-10 13:49:58.357922] and [2019-10-10 
13:50:07.837047] 
[2019-10-10 13:50:07.837068] D [MSGID: 0] 
[io-threads.c:356:iot_schedule] 0-gluster4-vol-io-threads: FLUSH 
scheduled as normal fop 
[2019-10-10 13:50:07.837165] D [MSGID: 0] 
[io-threads.c:356:iot_schedule] 0-gluster4-vol-io-threads: READ 
scheduled as slow fop 


What exactly does "READ scheduled as slow fop" mean? Can I schedule 
READs as normal or fast fops, like the other operations? 

I'm using this gluster volume only for reading, so I don't care about 
writes right now. 


Thanks. 

 




Community Meeting Calendar:

APAC Schedule -
Every 2nd and 4th Tuesday at 11:30 AM IST
Bridge: https://bluejeans.com/118564314

NA/EMEA Schedule -
Every 1st and 3rd Tuesday at 01:00 PM EDT
Bridge: https://bluejeans.com/118564314

Gluster-users mailing list
Gluster-users@gluster.org
https://lists.gluster.org/mailman/listinfo/gluster-users


Re: [Gluster-users] Issues with Geo-replication (GlusterFS 6.3 on Ubuntu 18.04)

2019-10-15 Thread Aravinda Vishwanathapura Krishna Murthy
Hi Alexander,

Please check the status of the volume. It looks like the slave volume mount is
failing because bricks are down or not reachable. If the volume status shows
all bricks are up, then try mounting the slave volume using the mount command.

```
masternode$ mkdir /mnt/vol
masternode$ mount -t glusterfs <slavehost>:<slavevol> /mnt/vol
```
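
If the mount fails, a quick pre-check sketch (volume name store2 taken from the
logs below):

```
# on a slave node: every brick should show Y in the Online column
slavenode$ gluster volume status store2
```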

On Fri, Oct 11, 2019 at 4:03 AM Alexander Iliev wrote:

> Hi all,
>
> I ended up reinstalling the nodes with CentOS 7.5 and GlusterFS 6.5
> (installed from the SIG).
>
> Now when I try to create a replication session I get the following:
>
>  > # gluster volume geo-replication store1 ::store2 create
> push-pem
>  > Unable to mount and fetch slave volume details. Please check the log:
> /var/log/glusterfs/geo-replication/gverify-slavemnt.log
>  > geo-replication command failed
>
> You can find the contents of gverify-slavemnt.log below, but the initial
> error seems to be:
>
>  > [2019-10-10 22:07:51.578519] E [fuse-bridge.c:5211:fuse_first_lookup]
> 0-fuse: first lookup on root failed (Transport endpoint is not connected)
>
> I only found [this](https://bugzilla.redhat.com/show_bug.cgi?id=1659824)
> bug report, which doesn't seem to help. The reported issue is a failure to
> mount a volume on a GlusterFS client, but in my case I need
> geo-replication, which implies the client (geo-replication master) being
> on a different network.
>
> Any help will be appreciated.
>
> Thanks!
>
> gverify-slavemnt.log:
>
>  > [2019-10-10 22:07:40.571256] I [MSGID: 100030]
> [glusterfsd.c:2847:main] 0-glusterfs: Started running glusterfs version
> 6.5 (args: glusterfs --xlator-option=*dht.lookup-unhashed=off
> --volfile-server  --volfile-id store2 -l
> /var/log/glusterfs/geo-replication/gverify-slavemnt.log
> /tmp/gverify.sh.5nFlRh)
>  > [2019-10-10 22:07:40.575438] I [glusterfsd.c:2556:daemonize]
> 0-glusterfs: Pid of current running process is 6021
>  > [2019-10-10 22:07:40.584282] I [MSGID: 101190]
> [event-epoll.c:680:event_dispatch_epoll_worker] 0-epoll: Started thread
> with index 0
>  > [2019-10-10 22:07:40.584299] I [MSGID: 101190]
> [event-epoll.c:680:event_dispatch_epoll_worker] 0-epoll: Started thread
> with index 1
>  > [2019-10-10 22:07:40.928094] I [MSGID: 114020] [client.c:2393:notify]
> 0-store2-client-0: parent translators are ready, attempting connect on
> transport
>  > [2019-10-10 22:07:40.931121] I [MSGID: 114020] [client.c:2393:notify]
> 0-store2-client-1: parent translators are ready, attempting connect on
> transport
>  > [2019-10-10 22:07:40.933976] I [MSGID: 114020] [client.c:2393:notify]
> 0-store2-client-2: parent translators are ready, attempting connect on
> transport
>  > Final graph:
>  >
>
> +--+
>  >   1: volume store2-client-0
>  >   2: type protocol/client
>  >   3: option ping-timeout 42
>  >   4: option remote-host 172.31.36.11
>  >   5: option remote-subvolume /data/gfs/store1/1/brick-store2
>  >   6: option transport-type socket
>  >   7: option transport.address-family inet
>  >   8: option transport.socket.ssl-enabled off
>  >   9: option transport.tcp-user-timeout 0
>  >  10: option transport.socket.keepalive-time 20
>  >  11: option transport.socket.keepalive-interval 2
>  >  12: option transport.socket.keepalive-count 9
>  >  13: option send-gids true
>  >  14: end-volume
>  >  15:
>  >  16: volume store2-client-1
>  >  17: type protocol/client
>  >  18: option ping-timeout 42
>  >  19: option remote-host 172.31.36.12
>  >  20: option remote-subvolume /data/gfs/store1/1/brick-store2
>  >  21: option transport-type socket
>  >  22: option transport.address-family inet
>  >  23: option transport.socket.ssl-enabled off
>  >  24: option transport.tcp-user-timeout 0
>  >  25: option transport.socket.keepalive-time 20
>  >  26: option transport.socket.keepalive-interval 2
>  >  27: option transport.socket.keepalive-count 9
>  >  28: option send-gids true
>  >  29: end-volume
>  >  30:
>  >  31: volume store2-client-2
>  >  32: type protocol/client
>  >  33: option ping-timeout 42
>  >  34: option remote-host 172.31.36.13
>  >  35: option remote-subvolume /data/gfs/store1/1/brick-store2
>  >  36: option transport-type socket
>  >  37: option transport.address-family inet
>  >  38: option transport.socket.ssl-enabled off
>  >  39: option transport.tcp-user-timeout 0
>  >  40: option transport.socket.keepalive-time 20
>  >  41: option transport.socket.keepalive-interval 2
>  >  42: option transport.socket.keepalive-count 9
>  >  43: option send-gids true
>  >  44: end-volume
>  >  45:
>  >  46: volume store2-replicate-0
>  >  47: type cluster/replicate
>  >  48: option afr-pending-xattr
> store2-client-0,store2-client-1,store2-client-2
>  >  49: option use-compound-fops off
>  >  50: subvolumes store2-client-0 store2-client-1 store2-client-2
>  >  51: 

Re: [Gluster-users] Client Handling of Elastic Clusters

2019-10-15 Thread Amar Tumballi
Hi Timothy,

Thanks for this report. This seems to be a genuine issue. I don't think we
have a solution for it right now, other than, as a hack, pointing the
'serverA' hostname at serverD's (or another new server's) IP in /etc/hosts on
that particular client.
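
A minimal sketch of that hack; the address is hypothetical, substitute
serverD's real IP:

```
# /etc/hosts on the client: make the stale volfile-server name
# resolve to a server that is still in the pool
10.0.0.4    serverA
```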

In the meantime, it would be great if you could copy-paste this into an issue
(https://github.com/gluster/glusterfs/issues/new) so that it can be tracked.

Regards,
Amar

On Wed, Oct 16, 2019 at 12:35 AM Timothy Orme wrote:

> Hello,
>
> I'm trying to set up an elastic gluster cluster and am running into a few
> odd edge cases that I'm unsure how to address.  I'll try to walk through
> the setup as best I can.
>
> If I have a replica 3 distributed-replicated volume, with 2 replicated
> volumes to start:
>
> MyVolume
>Replica 1
>   serverA
>   serverB
>   serverC
>Replica 2
>   serverD
>   serverE
>   serverF
>
> And the client mounts the volume with serverA as the primary volfile
> server, and B & C as the backups.
>
> Then, if I perform a scale down event, it selects the first replica volume
> as the one to remove.  So I end up with a configuration like:
>
> MyVolume
>Replica 2
>   serverD
>   serverE
>   serverF
>
> Everything rebalances and works great.  However, at this point, the client
> has lost any connection with a volfile server.  It knows about D, E, and F,
> so my data is all fine, but it can no longer retrieve a volfile.  In the
> logs I see:
>
> [2019-10-15 17:21:59.232819] I [glusterfsd-mgmt.c:2463:mgmt_rpc_notify]
> 0-glusterfsd-mgmt: Exhausted all volfile servers
>
> This becomes problematic when I try and scale back up, and add a
> replicated volume back in:
>
> MyVolume
>Replica 2
>   serverD
>   serverE
>   serverF
>Replica 3
>   serverG
>   serverH
>   serverI
>
> And then rebalance the volume.  Now, I have all my data present, but the
> client only knows about D,E,F, so when I run an `ls` on a directory, only
> about half of the files are returned, since the other half live on G,H,I
> which the client doesn't know about.  The data is still there, but it would
> require a re-mount at one of the new servers.
>
> My question, then: is there a way to have a more dynamic set of volfile
> servers? What would be great is if there were a way to tell the mount to
> fall back on the servers returned in the volfile itself in case the primary
> one goes away.
>
> If there's not an easy way to do this, is there a flag on the mount helper
> that can cause the mount to die or error out in the event that it is unable
> to retrieve volfiles?  The problem now is that it sort of silently fails
> and returns incomplete file listings, which for my use cases can cause
> improper processing of that data.  I'd rather have it hard-error than
> silently provide bad results, obviously.
>
> Hope that makes sense, if you need further clarity please let me know.
>
> Thanks,
> Tim
>
>




[Gluster-users] Client Handling of Elastic Clusters

2019-10-15 Thread Timothy Orme
Hello,

I'm trying to set up an elastic gluster cluster and am running into a few odd 
edge cases that I'm unsure how to address.  I'll try to walk through the setup 
as best I can.

If I have a replica 3 distributed-replicated volume, with 2 replicated volumes 
to start:

MyVolume
   Replica 1
  serverA
  serverB
  serverC
   Replica 2
  serverD
  serverE
  serverF

And the client mounts the volume with serverA as the primary volfile server, 
and B & C as the backups.
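
For reference, a sketch of what such a mount looks like using the
backup-volfile-servers mount option (mount point hypothetical); note the
fallback list is fixed at mount time, which is part of the problem described
below:

```
# serverA serves the volfile; B and C are fallbacks for fetching it
mount -t glusterfs -o backup-volfile-servers=serverB:serverC \
      serverA:/MyVolume /mnt/myvolume
```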

Then, if I perform a scale down event, it selects the first replica volume as 
the one to remove.  So I end up with a configuration like:

MyVolume
   Replica 2
  serverD
  serverE
  serverF

Everything rebalances and works great.  However, at this point, the client has 
lost any connection with a volfile server.  It knows about D, E, and F, so my 
data is all fine, but it can no longer retrieve a volfile.  In the logs I see:

[2019-10-15 17:21:59.232819] I [glusterfsd-mgmt.c:2463:mgmt_rpc_notify] 
0-glusterfsd-mgmt: Exhausted all volfile servers

This becomes problematic when I try and scale back up, and add a replicated 
volume back in:

MyVolume
   Replica 2
  serverD
  serverE
  serverF
   Replica 3
  serverG
  serverH
  serverI

And then rebalance the volume.  Now, I have all my data present, but the client 
only knows about D,E,F, so when I run an `ls` on a directory, only about half 
of the files are returned, since the other half live on G,H,I which the client 
doesn't know about.  The data is still there, but it would require a re-mount 
at one of the new servers.

My question, then: is there a way to have a more dynamic set of volfile servers? 
What would be great is if there were a way to tell the mount to fall back on the 
servers returned in the volfile itself in case the primary one goes away.

If there's not an easy way to do this, is there a flag on the mount helper that 
can cause the mount to die or error out in the event that it is unable to 
retrieve volfiles?  The problem now is that it sort of silently fails and 
returns incomplete file listings, which for my use cases can cause improper 
processing of that data.  I'd rather have it hard-error than silently provide 
bad results, obviously.

Hope that makes sense, if you need further clarity please let me know.

Thanks,
Tim






Re: [Gluster-users] determine filename from shard?

2019-10-15 Thread WK

great, I'll dig into it when I get a chance and report back.

-wk

On 10/14/2019 10:05 AM, Amar Tumballi wrote:

Awesome, thanks!

Then I hope this commit is included in the build:
https://github.com/gluster/glusterfs/commit/ab2558a3e7a1b2de2d63a3812ab4ed58d10d8619


What it means is: if you just list the xattrs of the file, you can get 
the file name in (pgfid/basename) format. If there are no duplicate 
filenames, then it will be easy. Otherwise, you would need to use an 
aux-gfid-mount to get the details, as described in the patch.


1. mount -t glusterfs -o aux-gfid-mount 127.0.0.2:testvol /mnt/aux_mount
2. and then try doing the operation as described in the commit message above
   on /mnt/aux_mount/.gfid/
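
A hedged sketch of what the xattr listing can look like on a brick; the brick
path and gfid below are purely illustrative:

```
# gfid2path xattrs on the backend file carry the pgfid/basename value
getfattr -d -m 'trusted.gfid2path' -e text \
    /data/brick/.glusterfs/c1/23/c1234567-89ab-cdef-0123-456789abcdef
```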


Regards,
Amar


On Mon, Oct 14, 2019 at 9:13 PM WK wrote:


6.5 on Ubuntu 18


On 10/13/2019 10:14 PM, Amar Tumballi wrote:

Which version of GlusterFS are you using?

There are a few methods, like the 'gfid2path' option, to fetch the details of
the file path from a gfid in versions above v6.0.




On Mon, Oct 14, 2019 at 8:14 AM wkmail <wkm...@bneit.com> wrote:

We recently were trying to track down a high-load situation on a KVM node in a
cluster running gluster (replica 2 + arbiter).

iotop on the affected node showed that gluster was involved, with abnormally
heavy writes. There wasn't any obvious activity on the network coming into the
VMs, so it was something internal to one of them, and we had to keep looking.

We then used 'gluster volume top', which showed that a couple of shards were
being pounded, but we didn't see a way to immediately associate those shards
with their VM file.

We eventually figured out the problem VM using other methods and resolved the
issue, but we would still like to know if there is a script or recipe to
determine what file a shard may belong to, as that would have sped up the
resolution.

-wk
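
For shard names specifically, a hedged recipe (brick path and gfid
hypothetical): a shard is named <base-file-gfid>.<block-number> under .shard
on the brick, and for regular files the .glusterfs entry for a gfid is a
hardlink to the file, so find -samefile can recover the original path:

```
# say 'volume top' showed shard 0b2d9c4f-1f9c-4a2e-9a3d-2f4b7c8d9e10.7;
# the part before the trailing .7 is the gfid of the base VM image
find /data/brick -samefile \
    /data/brick/.glusterfs/0b/2d/0b2d9c4f-1f9c-4a2e-9a3d-2f4b7c8d9e10 \
    -not -path '*/.glusterfs/*'
```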











Re: [Gluster-users] one peer flooded with - 0-glusterfs: connection attempt on 127.0.0.1:24007 failed, (Invalid argument)

2019-10-15 Thread lejeczek

  
  
On 14/10/2019 08:33, Sanju Rakonde wrote:


  
please check the contents of the /var/lib/glusterd/peers/ directory; it
should not have any information regarding the localhost. Please check the
uuid of the local node in the /var/lib/glusterd/glusterd.info file and
figure out whether you have a file with this uuid at
/var/lib/glusterd/peers/*. If you find any such file, please delete it and
restart glusterd on that node.
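
A small sketch of that check, assuming the stock glusterd file layout:

```
# the local node's uuid
grep ^UUID /var/lib/glusterd/glusterd.info
# no file under peers/ should be named with or contain that uuid
grep -rl "$(awk -F= '/^UUID/{print $2}' /var/lib/glusterd/glusterd.info)" \
    /var/lib/glusterd/peers/
# if one turns up: delete it, then restart glusterd
systemctl restart glusterd
```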
  
  
On Fri, Oct 11, 2019 at 3:15 PM lejeczek wrote:

hi guys,
  
  as per the subject.
  
One thing I'd like to mention first is that Samba runs on that peer/node. The
other two peers do not show this in their logs.

In the gluster log for the volume I get plenty of:
  
...
[2019-10-11 09:40:40.768647] E [socket.c:3498:socket_connect] 0-glusterfs: connection attempt on 127.0.0.1:24007 failed, (Invalid argument)
[2019-10-11 09:40:43.777129] E [socket.c:3498:socket_connect] 0-glusterfs: connection attempt on 127.0.0.1:24007 failed, (Invalid argument)
[2019-10-11 09:40:46.785522] E [socket.c:3498:socket_connect] 0-glusterfs: connection attempt on 127.0.0.1:24007 failed, (Invalid argument)
[2019-10-11 09:40:49.794393] E [socket.c:3498:socket_connect] 0-glusterfs: connection attempt on 127.0.0.1:24007 failed, (Invalid argument)
[2019-10-11 09:40:52.805158] E [socket.c:3498:socket_connect] 0-glusterfs: connection attempt on 127.0.0.1:24007 failed, (Invalid argument)
[2019-10-11 09:40:55.817603] E [socket.c:3498:socket_connect] 0-glusterfs: connection attempt on 127.0.0.1:24007 failed, (Invalid argument)
[2019-10-11 09:40:58.826136] E [socket.c:3498:socket_connect] 0-glusterfs: connection attempt on 127.0.0.1:24007 failed, (Invalid argument)
[2019-10-11 09:41:01.836104] E [socket.c:3498:socket_connect] 0-glusterfs: connection attempt on 127.0.0.1:24007 failed, (Invalid argument)
[2019-10-11 09:41:04.842676] E [socket.c:3498:socket_connect] 0-glusterfs: connection attempt on 127.0.0.1:24007 failed, (Invalid argument)
...
  
The cluster runs on CentOS 7, and the gluster version is 6.5.

glusterd.vol is the same on all three peers:
  
volume management
    type mgmt/glusterd
    option working-directory /var/lib/glusterd
    option transport-type socket,rdma
    option transport.socket.keepalive-time 10
    option transport.socket.keepalive-interval 2
    option transport.socket.read-fail-log off
    option transport.socket.listen-port 24007
    option transport.rdma.listen-port 24008
    option ping-timeout 0
    option event-threads 1
#   option lock-timer 180
#   option transport.address-family inet6
#   option base-port 49152
    option max-port 60999
end-volume
  
Any thoughts & suggestions are much appreciated.
  
  Many thanks, L.
  
  
  

  
  
  
  
--
Thanks,
Sanju

okay, @devel - this might be worth more investigation & should be easy to
reproduce.

These errors will show up if a user is accessing some files (in my case the
user's home is fuse-mounted; it was just a shell session, though - and maybe
this is specific to the scenario - one with the 'screen' program run upon
login) while automount is restarted (and possibly while the gluster volume or
gluster itself is re/started).

I spotted it here:
$ sudo journalctl -lf -o cat -u autofs

...
setautomntent: lookup(sss): setautomntent: No such file or directory
mounted indirect on /misc with timeout 300, freq 75 seconds
ghosting enabled
mounted indirect on /net with timeout 300, freq 75 seconds
ghosting enabled
mounted indirect on /0-AL