Re: [Gluster-users] Issues in AFR and self healing
On 08/10/2018 11:25 PM, Pablo Schandin wrote:
> Hello everyone!
>
> I'm having some trouble with something, but I'm not quite sure what it is yet. I'm running GlusterFS 3.12.6 on Ubuntu 16.04. I have two servers (nodes) in the cluster in replica mode. Each server has 2 bricks. As the servers are KVM hosts running several VMs, one brick has some VMs locally defined in it and the second brick is the replica of the corresponding brick on the other server. It has data, but no actual writing is being done to it except for the replication.
>
>                  Server 1                                  Server 2
> Volume 1 (gv1):  Brick 1 defined VMs (read/write)      ->  Brick 1 replicated qcow2 files
> Volume 2 (gv2):  Brick 2 replicated qcow2 files        <-  Brick 2 defined VMs (read/write)
>
> So, the main issue arose when I got a Nagios alarm that warned about a file listed to be healed, which then disappeared. I came to find out that every 5 minutes the self-heal daemon triggers the healing and this fixes it. But looking at the logs I have a lot of entries in the glustershd.log file like this:
>
> [2018-08-09 14:23:37.689403] I [MSGID: 108026] [afr-self-heal-common.c:1656:afr_log_selfheal] 0-gv1-replicate-0: Completed data selfheal on 407bd97b-e76c-4f81-8f59-7dae11507b0c. sources=[0] sinks=1
> [2018-08-09 14:44:37.933143] I [MSGID: 108026] [afr-self-heal-common.c:1656:afr_log_selfheal] 0-gv2-replicate-0: Completed data selfheal on 73713556-5b63-4f91-b83d-d7d82fee111f. sources=[0] sinks=1
>
> The qcow2 files are being healed several times a day (up to 30 times on occasion). As I understand it, this means that a data heal occurred on the files with gfids 407b... and 7371..., from source to sink. Local server to replica server? Is it OK for the shd to heal files in the replicated brick that supposedly has no writing on it besides the mirroring? How does that work?

In AFR, for writes, there is no notion of local/remote brick. No matter from which client you write to the volume, it gets sent to both bricks, i.e. the replication is synchronous and real-time.

> How does AFR replication work? The file with gfid 7371... is the qcow2 root disk of an ownCloud server with 17GB of data. It does not seem to be that big to be a bottleneck of some sort, I think.
>
> Also, I was investigating the directory tree in brick/.glusterfs/indices and I noticed that both in xattrop and dirty I always have a file created named xattrop-xx and dirty-xx. I read that the xattrop file is like a parent file or handle to reference other files created there as hardlinks with gfid names for the shd to heal. Is it the same case for the ones in the dirty dir?

Yes, before the write, the gfid gets captured inside dirty on all bricks. If the write is successful, it gets removed. In addition, if the write fails on one brick, the other brick will capture the gfid inside xattrop.

> Any help will be greatly appreciated. Thanks!

If frequent heals are triggered, it could mean there are frequent network disconnects from the clients to the bricks as writes happen. You can check the mount logs and see if that is the case and investigate possible network issues.

HTH,
Ravi

> Pablo.
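As a quick way to see this in practice, pending heals and the AFR changelog xattrs can be inspected on either server; this is only a sketch, and the brick path below is just an example, not taken from the thread:

# list entries the self-heal daemon still needs to heal
gluster volume heal gv1 info
# on a brick, dump the trusted.afr.* xattrs that mark pending changes for the other replica
getfattr -d -m . -e hex /data/brick1/gv1/images/vm-disk.qcow2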
[Gluster-users] Issues in AFR and self healing
Hello everyone!

I'm having some trouble with something, but I'm not quite sure what it is yet. I'm running GlusterFS 3.12.6 on Ubuntu 16.04. I have two servers (nodes) in the cluster in replica mode. Each server has 2 bricks. As the servers are KVM hosts running several VMs, one brick has some VMs locally defined in it and the second brick is the replica of the corresponding brick on the other server. It has data, but no actual writing is being done to it except for the replication.

                 Server 1                                  Server 2
Volume 1 (gv1):  Brick 1 defined VMs (read/write)      ->  Brick 1 replicated qcow2 files
Volume 2 (gv2):  Brick 2 replicated qcow2 files        <-  Brick 2 defined VMs (read/write)

So, the main issue arose when I got a Nagios alarm that warned about a file listed to be healed, which then disappeared. I came to find out that every 5 minutes the self-heal daemon triggers the healing and this fixes it. But looking at the logs I have a lot of entries in the glustershd.log file like this:

[2018-08-09 14:23:37.689403] I [MSGID: 108026] [afr-self-heal-common.c:1656:afr_log_selfheal] 0-gv1-replicate-0: Completed data selfheal on 407bd97b-e76c-4f81-8f59-7dae11507b0c. sources=[0] sinks=1
[2018-08-09 14:44:37.933143] I [MSGID: 108026] [afr-self-heal-common.c:1656:afr_log_selfheal] 0-gv2-replicate-0: Completed data selfheal on 73713556-5b63-4f91-b83d-d7d82fee111f. sources=[0] sinks=1

The qcow2 files are being healed several times a day (up to 30 times on occasion). As I understand it, this means that a data heal occurred on the files with gfids 407b... and 7371..., from source to sink. Local server to replica server? Is it OK for the shd to heal files in the replicated brick that supposedly has no writing on it besides the mirroring? How does that work? How does AFR replication work?

The file with gfid 7371... is the qcow2 root disk of an ownCloud server with 17GB of data. It does not seem to be that big to be a bottleneck of some sort, I think.

Also, I was investigating the directory tree in brick/.glusterfs/indices and I noticed that both in xattrop and dirty I always have a file created named xattrop-xx and dirty-xx. I read that the xattrop file is like a parent file or handle to reference other files created there as hardlinks with gfid names for the shd to heal. Is it the same case for the ones in the dirty dir?

Any help will be greatly appreciated. Thanks!

Pablo.
Re: [Gluster-users] ganesha.nfsd process dies when copying files
On Aug 10, 2018 15:39, "Kaleb S. KEITHLEY" wrote:

On 08/10/2018 09:23 AM, Karli Sjöberg wrote:
> On Fri, 2018-08-10 at 21:23 +0800, Pui Edylie wrote:
>> Hi Karli,
>>
>> Storhaug works with glusterfs 4.1.2 and latest nfs-ganesha.
>>
>> I just installed them last weekend ... they are working very well :)
>
> Okay, awesome!
>
> Is there any documentation on how to do that?

https://github.com/gluster/storhaug/wiki

-- Kaleb

Thank you very much Kaleb, that's exactly what I was after! I will redo my cluster and try this approach instead!

/K
Re: [Gluster-users] blocking process on FUSE mount in directory which is using quota
On 9 August 2018 at 19:54, mabi wrote:

> Thanks for the documentation. On my client using FUSE mount I found the
> PID by using ps (output below):
>
> root       456     1  4 14:17 ?  00:05:15 /usr/sbin/glusterfs
> --volfile-server=gfs1a --volfile-id=myvol-private /mnt/myvol-private
>
> Then I ran the following command
>
> sudo kill -USR1 456
>
> but now I can't find where the files are stored. Are these supposed to be
> stored on the client directly? I checked /var/run/gluster and
> /var/log/gluster but could not see anything and /var/log/gluster does not
> even exist on the client.

They are usually created in /var/run/gluster. You will need to create the directory on the client if it does not exist.

> ‐‐‐ Original Message ‐‐‐
> On August 9, 2018 3:59 PM, Raghavendra Gowdappa wrote:
>
> On Thu, Aug 9, 2018 at 6:47 PM, mabi wrote:
>
>> Hi Nithya,
>>
>> Thanks for the fast answer. Here is the additional info:
>>
>> 1. gluster volume info
>>
>> Volume Name: myvol-private
>> Type: Replicate
>> Volume ID: e7a40a1b-45c9-4d3c-bb19-0c59b4eceec5
>> Status: Started
>> Snapshot Count: 0
>> Number of Bricks: 1 x (2 + 1) = 3
>> Transport-type: tcp
>> Bricks:
>> Brick1: gfs1a:/data/myvol-private/brick
>> Brick2: gfs1b:/data/myvol-private/brick
>> Brick3: gfs1c:/srv/glusterfs/myvol-private/brick (arbiter)
>> Options Reconfigured:
>> features.default-soft-limit: 95%
>> transport.address-family: inet
>> features.quota-deem-statfs: on
>> features.inode-quota: on
>> features.quota: on
>> nfs.disable: on
>> performance.readdir-ahead: on
>> client.event-threads: 4
>> server.event-threads: 4
>> auth.allow: 192.168.100.92
>>
>> 2. Sorry, I have no clue how to take a "statedump" of a process on Linux.
>> Which command should I use for that? And which process would you like, the
>> blocked process (for example "ls")?
>
> Statedumps are gluster specific. Please refer to
> https://docs.gluster.org/en/v3/Troubleshooting/statedump/ for
> instructions.
>
>> Regards,
>> M.
>>
>> ‐‐‐ Original Message ‐‐‐
>> On August 9, 2018 3:10 PM, Nithya Balachandran wrote:
>>
>> Hi,
>>
>> Please provide the following:
>>
>> 1. gluster volume info
>> 2. statedump of the fuse process when it hangs
>>
>> Thanks,
>> Nithya
>>
>> On 9 August 2018 at 18:24, mabi wrote:
>>
>>> Hello,
>>>
>>> I recently upgraded my GlusterFS replica 2+1 (arbiter) to version
>>> 3.12.12 and now I see a weird behaviour on my client (using FUSE mount)
>>> where I have processes (PHP 5.6 FPM) trying to access a specific directory
>>> and then the process blocks. I can't kill the process either, not even with
>>> kill -9. I need to reboot the machine in order to get rid of these blocked
>>> processes.
>>>
>>> This directory has one particularity compared to the other directories:
>>> it has reached its quota soft-limit, as you can see here in the
>>> output of gluster volume quota list:
>>>
>>> Path         Hard-limit   Soft-limit    Used     Available   Soft-limit exceeded?   Hard-limit exceeded?
>>> ---------------------------------------------------------------------------------------------------------
>>> /directory   100.0GB      80%(80.0GB)   90.5GB   9.5GB       Yes                    No
>>>
>>> That does not mean that it is the quota's fault, but it might be a hint
>>> where to start looking... And by the way, can someone explain to me what
>>> the soft-limit does? Or does it not do anything special?
>>>
>>> Here is the Linux stack trace of a blocking process on that directory, which
>>> happened with a simple "ls -la":
>>>
>>> [Thu Aug 9 14:21:07 2018] INFO: task ls:2272 blocked for more than 120
>>> seconds.
>>> [Thu Aug 9 14:21:07 2018] Not tainted 3.16.0-4-amd64 #1
>>> [Thu Aug 9 14:21:07 2018] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
>>> [Thu Aug 9 14:21:07 2018] ls D 88017ef93200 0 2272 2268 0x0004
>>> [Thu Aug 9 14:21:07 2018] 88017653f490 0286 00013200 880174d7bfd8
>>> [Thu Aug 9 14:21:07 2018] 00013200 88017653f490 8800eeb3d5f0 8800fefac800
>>> [Thu Aug 9 14:21:07 2018] 880174d7bbe0 8800eeb3d6d0 8800fefac800 8800ffe1e1c0
>>> [Thu Aug 9 14:21:07 2018] Call Trace:
>>> [Thu Aug 9 14:21:07 2018] [] ? __fuse_request_send+0xbd/0x270 [fuse]
>>> [Thu Aug 9 14:21:07 2018] [] ? prepare_to_wait_event+0xf0/0xf0
>>> [Thu Aug 9 14:21:07 2018] [] ? fuse_dentry_revalidate+0x181/0x300 [fuse]
>>> [Thu Aug 9 14:21:07 2018] [] ? lookup_fast+0x25e/0x2b0
>>> [Thu Aug 9 14:21:07 2018] [] ? path_lookupat+0x155/0x780
>>> [Thu Aug 9 14:21:07 2018] [] ? kmem_cache_alloc+0x75/0x480
>>> [Thu Aug 9 14:21:07 2018] [] ? fuse_getxattr+0xe9/0x150 [fuse]
>>> [Thu Aug 9 14:21:07 2018] [] ? filename_lookup+0x26/0xc0
>>> [Thu Aug 9 14:21:07 2018]
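Pulling the statedump pointers above together, the sequence on the client looks roughly like this; the PID comes from the ps output quoted earlier, and the dump file name pattern is an assumption to verify locally:

client# mkdir -p /var/run/gluster              # create the dump directory if it does not exist
client# kill -USR1 456                         # 456 = PID of the glusterfs FUSE client process
client# ls -l /var/run/gluster/glusterdump.*   # dump file naming is an assumption; check the directory contents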
Re: [Gluster-users] ganesha.nfsd process dies when copying files
Hi Karli,

The following are my notes, which I gathered from Google searches, the storhaug wiki and more Google searches ... I might have missed certain steps, and this is based on CentOS 7.

Install CentOS 7.x, then:

yum update -y

I have disabled both firewalld and SELinux.

In our setup we are using an LSI RAID card (RAID10) and present the virtual drive partition as /dev/sdb.

Create LVM so that we could utilise the snapshot feature of gluster:

pvcreate --dataalignment 256k /dev/sdb
vgcreate --physicalextentsize 256K gfs_vg /dev/sdb

Set the volume to use all the space with -l 100%FREE:

lvcreate --thinpool gfs_vg/thin_pool -l 100%FREE --chunksize 256K --poolmetadatasize 15G --zero n

We use the XFS file system for our glusterfs:

mkfs.xfs -i size=512 /dev/gfs_vg/thin_pool

Add the following into /etc/fstab with mount point /brick1683 (you could change the name accordingly):

/dev/gfs_vg/thin_pool /brick1683 xfs defaults 1 2

Enable the gluster 4.1 repo:

vi /etc/yum.repos.d/Gluster.repo
[gluster41]
name=Gluster 4.1
baseurl=http://mirror.centos.org/centos/7/storage/$basearch/gluster-4.1/
gpgcheck=0
enabled=1

Install gluster 4.1:

yum install -y centos-release-gluster

Once we have done the above steps on our 3 nodes, log in to one of the nodes and issue the following:

gluster volume create gv0 replica 3 192.168.0.1:/brick1683/gv0 192.168.0.2:/brick1684/gv0 192.168.0.3:/brick1685/gv0

Setting up HA for NFS-Ganesha using CTDB

Install the storhaug package on all participating nodes, using the appropriate command for your system:

yum -y install storhaug-nfs

Note: this will install all the dependencies, e.g. ctdb, nfs-ganesha-gluster, glusterfs, and their related dependencies.

Create a passwordless ssh key and copy it to all participating nodes. On one of the participating nodes (Fedora, RHEL, CentOS):

node1% ssh-keygen -f /etc/sysconfig/storhaug.d/secret.pem

or (Debian, Ubuntu):

node1% ssh-keygen -f /etc/default/storhaug.d/secret.pem

When prompted for a password, press the Enter key.

Copy the public key to all the nodes (Fedora, RHEL, CentOS):

node1% ssh-copy-id -i /etc/sysconfig/storhaug.d/secret.pem.pub root@node1
node1% ssh-copy-id -i /etc/sysconfig/storhaug.d/secret.pem.pub root@node2
node1% ssh-copy-id -i /etc/sysconfig/storhaug.d/secret.pem.pub root@node3
...

You can confirm that it works with (Fedora, RHEL, CentOS):

node1% ssh -oPasswordAuthentication=no -oStrictHostKeyChecking=no -i /etc/sysconfig/storhaug.d/secret.pem root@node1

Populate /etc/ctdb/nodes and /etc/ctdb/public_addresses. Select one node as your lead node, e.g. node1. On the lead node, create/edit /etc/ctdb/nodes and populate it with the (fixed) IP addresses of the participating nodes. It should look like this:

192.168.122.81
192.168.122.82
192.168.122.83
192.168.122.84

On the lead node, create/edit /etc/ctdb/public_addresses and populate it with the floating IP addresses (a.k.a. VIPs) for the participating nodes. These must be different than the IP addresses in /etc/ctdb/nodes. It should look like this:

192.168.122.85 eth0
192.168.122.86 eth0
192.168.122.87 eth0
192.168.122.88 eth0

Edit /etc/ctdb/ctdbd.conf. Ensure that the line CTDB_MANAGES_NFS=yes exists. If not, add it or change it from no to yes.
Add or change the following lines:

CTDB_RECOVERY_LOCK=/run/gluster/shared_storage/.ctdb/reclock
CTDB_NFS_CALLOUT=/etc/ctdb/nfs-ganesha-callout
CTDB_NFS_STATE_FS_TYPE=glusterfs
CTDB_NFS_STATE_MNT=/run/gluster/shared_storage
CTDB_NFS_SKIP_SHARE_CHECK=yes
NFS_HOSTNAME=localhost

Create a bare minimum /etc/ganesha/ganesha.conf file. On the lead node:

node1% touch /etc/ganesha/ganesha.conf

or

node1% echo "### NFS-Ganesha.config" > /etc/ganesha/ganesha.conf

Note: you can edit this later to set global configuration options.

Create a trusted storage pool and start the gluster shared-storage volume. On all the participating nodes:

node1% systemctl start glusterd
node2% systemctl start glusterd
node3% systemctl start glusterd
...

On the lead node, peer probe the other nodes:

node1% gluster peer probe node2
node1% gluster peer probe node3
...

Optional: on one of the other nodes, peer probe node1:

node2% gluster peer probe node1

Enable the gluster shared-storage volume:

node1% gluster volume set all cluster.enable-shared-storage enable

This takes a few moments. When done, check that the gluster_shared_storage volume is mounted at /run/gluster/shared_storage on all the nodes.

Start the ctdbd and ganesha.nfsd daemons. On the lead node:

node1% storhaug setup

You can watch the ctdb log (/var/log/ctdb.log) and the ganesha log (/var/log/ganesha/ganesha.log) to monitor their progress. From this point on you may enter storhaug commands from any of the participating nodes.

Export a gluster volume. Create a gluster volume:

node1% gluster volume create myvol replica 2 node1:/bricks/vol/myvol node2:/bricks/vol/myvol node3:/bricks/vol/myvol node4:/bricks/vol/myvol
...

Start the gluster
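The notes cut off at the export step; for reference, a minimal per-volume export block for nfs-ganesha's gluster FSAL looks roughly like the sketch below. Treat it as an illustration only: the Export_Id, paths and volume name are assumptions, and the file storhaug actually writes it to may differ.

EXPORT {
    Export_Id = 2;              # any unique, non-zero id
    Path = "/myvol";            # gluster volume being exported
    Pseudo = "/myvol";          # NFSv4 pseudo-fs path that clients mount
    Access_Type = RW;
    Squash = No_root_squash;
    FSAL {
        Name = GLUSTER;         # talk to the volume via libgfapi
        Hostname = "localhost";
        Volume = "myvol";
    }
}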
Re: [Gluster-users] ganesha.nfsd process dies when copying files
On 08/10/2018 09:23 AM, Karli Sjöberg wrote: > On Fri, 2018-08-10 at 21:23 +0800, Pui Edylie wrote: >> Hi Karli, >> >> Storhaug works with glusterfs 4.1.2 and latest nfs-ganesha. >> >> I just installed them last weekend ... they are working very well :) > > Okay, awesome! > > Is there any documentation on how to do that? > https://github.com/gluster/storhaug/wiki -- Kaleb
Re: [Gluster-users] ganesha.nfsd process dies when copying files
On 08/10/2018 09:08 AM, Karli Sjöberg wrote: > On Fri, 2018-08-10 at 08:39 -0400, Kaleb S. KEITHLEY wrote: >> On 08/10/2018 08:08 AM, Karli Sjöberg wrote: >>> Hey all! >>> ... >>> >>> glusterfs-client-xlators-3.10.12-1.el7.x86_64 >>> glusterfs-api-3.10.12-1.el7.x86_64 >>> nfs-ganesha-2.4.5-1.el7.x86_64 >>> centos-release-gluster310-1.0-1.el7.centos.noarch >>> glusterfs-3.10.12-1.el7.x86_64 >>> glusterfs-cli-3.10.12-1.el7.x86_64 >>> nfs-ganesha-gluster-2.4.5-1.el7.x86_64 >>> glusterfs-server-3.10.12-1.el7.x86_64 >>> glusterfs-libs-3.10.12-1.el7.x86_64 >>> glusterfs-fuse-3.10.12-1.el7.x86_64 >>> glusterfs-ganesha-3.10.12-1.el7.x86_64 >>> >> >> For nfs-ganesha problems you'd really be better served by posting to >> support@ or de...@lists.nfs-ganesha.org. >> >> Both glusterfs-3.10 and nfs-ganesha-2.4 are really old. glusterfs- >> 3.10 >> is even officially EOL. Ganesha isn't really organized enough to >> have >> done anything as bold as officially declaring 2.4 as having reached >> EOL. >> >> The nfs-ganesha devs are currently working on 2.7; maintaining and >> supporting 2.6, and less so 2.5, is pretty much at the limit of what >> they might be willing to help debug. >> >> I strongly encourage you to update to a more recent version of both >> glusterfs and nfs-ganesha. glusterfs-4.1 and nfs-ganesha-2.6 would >> be >> ideal. Then if you still have problems you're much more likely to get >> help. >> > > Hi, thank you for your answer, but it raises even more questions about > any potential production deployment. > > Actually, I knew that the versions are old, but it seems to me that you > are contradicting yourself: > > https://lists.gluster.org/pipermail/gluster-users/2017-July/031753.html > > "After 3.10 you'd need to use storhaug Which doesn't work > (yet). > I don't recall the context of that email. But I did also send this email https://lists.gluster.org/pipermail/gluster-devel/2018-June/054896.html announcing the availability of (working) storhaug-1.0. There are packages in Fedora, CentOS Storage SIG, gluster's Launchpad PPA, gluster's OBS, and for Debian on download.gluster.org. -- Kaleb
Re: [Gluster-users] ganesha.nfsd process dies when copying files
On Fri, 2018-08-10 at 21:23 +0800, Pui Edylie wrote: > Hi Karli, > > Storhaug works with glusterfs 4.1.2 and latest nfs-ganesha. > > I just installed them last weekend ... they are working very well :) Okay, awesome! Is there any documentation on how to do that? /K > > Cheers, > Edy > > On 8/10/2018 9:08 PM, Karli Sjöberg wrote: > > On Fri, 2018-08-10 at 08:39 -0400, Kaleb S. KEITHLEY wrote: > > > On 08/10/2018 08:08 AM, Karli Sjöberg wrote: > > > > Hey all! > > > > ... > > > > > > > > glusterfs-client-xlators-3.10.12-1.el7.x86_64 > > > > glusterfs-api-3.10.12-1.el7.x86_64 > > > > nfs-ganesha-2.4.5-1.el7.x86_64 > > > > centos-release-gluster310-1.0-1.el7.centos.noarch > > > > glusterfs-3.10.12-1.el7.x86_64 > > > > glusterfs-cli-3.10.12-1.el7.x86_64 > > > > nfs-ganesha-gluster-2.4.5-1.el7.x86_64 > > > > glusterfs-server-3.10.12-1.el7.x86_64 > > > > glusterfs-libs-3.10.12-1.el7.x86_64 > > > > glusterfs-fuse-3.10.12-1.el7.x86_64 > > > > glusterfs-ganesha-3.10.12-1.el7.x86_64 > > > > > > > > > > For nfs-ganesha problems you'd really be better served by posting > > > to > > > support@ or de...@lists.nfs-ganesha.org. > > > > > > Both glusterfs-3.10 and nfs-ganesha-2.4 are really old. > > > glusterfs- > > > 3.10 > > > is even officially EOL. Ganesha isn't really organized enough to > > > have > > > done anything as bold as officially declaring 2.4 as having > > > reached > > > EOL. > > > > > > The nfs-ganesha devs are currently working on 2.7; maintaining > > > and > > > supporting 2.6, and less so 2.5, is pretty much at the limit of > > > what > > > they might be willing to help debug. > > > > > > I strongly encourage you to update to a more recent version of > > > both > > > glusterfs and nfs-ganesha. glusterfs-4.1 and nfs-ganesha-2.6 > > > would > > > be > > > ideal. Then if you still have problems you're much more likely to > > > get > > > help. > > > > > > > Hi, thank you for your answer, but it raises even more questions > > about > > any potential production deployment. > > > > Actually, I knew that the versions are old, but it seems to me that > > you > > are contradicting yourself: > > > > https://lists.gluster.org/pipermail/gluster-users/2017-July/031753. > > html > > > > "After 3.10 you'd need to use storhaug Which doesn't work > > (yet). > > > > You need to use 3.10 for now." > > > > So how is that supposed to work? > > > > Is there documentation for how to get there? > > > > Thanks in advance! > > > > /K
Re: [Gluster-users] ganesha.nfsd process dies when copying files
Hi Karli, Storhaug works with glusterfs 4.1.2 and latest nfs-ganesha. I just installed them last weekend ... they are working very well :) Cheers, Edy On 8/10/2018 9:08 PM, Karli Sjöberg wrote: On Fri, 2018-08-10 at 08:39 -0400, Kaleb S. KEITHLEY wrote: On 08/10/2018 08:08 AM, Karli Sjöberg wrote: Hey all! ... glusterfs-client-xlators-3.10.12-1.el7.x86_64 glusterfs-api-3.10.12-1.el7.x86_64 nfs-ganesha-2.4.5-1.el7.x86_64 centos-release-gluster310-1.0-1.el7.centos.noarch glusterfs-3.10.12-1.el7.x86_64 glusterfs-cli-3.10.12-1.el7.x86_64 nfs-ganesha-gluster-2.4.5-1.el7.x86_64 glusterfs-server-3.10.12-1.el7.x86_64 glusterfs-libs-3.10.12-1.el7.x86_64 glusterfs-fuse-3.10.12-1.el7.x86_64 glusterfs-ganesha-3.10.12-1.el7.x86_64 For nfs-ganesha problems you'd really be better served by posting to support@ or de...@lists.nfs-ganesha.org. Both glusterfs-3.10 and nfs-ganesha-2.4 are really old. glusterfs- 3.10 is even officially EOL. Ganesha isn't really organized enough to have done anything as bold as officially declaring 2.4 as having reached EOL. The nfs-ganesha devs are currently working on 2.7; maintaining and supporting 2.6, and less so 2.5, is pretty much at the limit of what they might be willing to help debug. I strongly encourage you to update to a more recent version of both glusterfs and nfs-ganesha. glusterfs-4.1 and nfs-ganesha-2.6 would be ideal. Then if you still have problems you're much more likely to get help. Hi, thank you for your answer, but it raises even more questions about any potential production deployment. Actually, I knew that the versions are old, but it seems to me that you are contradicting yourself: https://lists.gluster.org/pipermail/gluster-users/2017-July/031753.html "After 3.10 you'd need to use storhaug Which doesn't work (yet). You need to use 3.10 for now." So how is that supposed to work? Is there documentation for how to get there? Thanks in advance! /K
Re: [Gluster-users] ganesha.nfsd process dies when copying files
On Fri, 2018-08-10 at 08:39 -0400, Kaleb S. KEITHLEY wrote: > On 08/10/2018 08:08 AM, Karli Sjöberg wrote: > > Hey all! > > ... > > > > glusterfs-client-xlators-3.10.12-1.el7.x86_64 > > glusterfs-api-3.10.12-1.el7.x86_64 > > nfs-ganesha-2.4.5-1.el7.x86_64 > > centos-release-gluster310-1.0-1.el7.centos.noarch > > glusterfs-3.10.12-1.el7.x86_64 > > glusterfs-cli-3.10.12-1.el7.x86_64 > > nfs-ganesha-gluster-2.4.5-1.el7.x86_64 > > glusterfs-server-3.10.12-1.el7.x86_64 > > glusterfs-libs-3.10.12-1.el7.x86_64 > > glusterfs-fuse-3.10.12-1.el7.x86_64 > > glusterfs-ganesha-3.10.12-1.el7.x86_64 > > > > For nfs-ganesha problems you'd really be better served by posting to > support@ or de...@lists.nfs-ganesha.org. > > Both glusterfs-3.10 and nfs-ganesha-2.4 are really old. glusterfs- > 3.10 > is even officially EOL. Ganesha isn't really organized enough to > have > done anything as bold as officially declaring 2.4 as having reached > EOL. > > The nfs-ganesha devs are currently working on 2.7; maintaining and > supporting 2.6, and less so 2.5, is pretty much at the limit of what > they might be willing to help debug. > > I strongly encourage you to update to a more recent version of both > glusterfs and nfs-ganesha. glusterfs-4.1 and nfs-ganesha-2.6 would > be > ideal. Then if you still have problems you're much more likely to get > help. > Hi, thank you for your answer, but it raises even more questions about any potential production deployment. Actually, I knew that the versions are old, but it seems to me that you are contradicting yourself: https://lists.gluster.org/pipermail/gluster-users/2017-July/031753.html "After 3.10 you'd need to use storhaug Which doesn't work (yet). You need to use 3.10 for now." So how is that supposed to work? Is there documentation for how to get there? Thanks in advance! /K
Re: [Gluster-users] ganesha.nfsd process dies when copying files
On 08/10/2018 08:08 AM, Karli Sjöberg wrote: > Hey all! > ... > > glusterfs-client-xlators-3.10.12-1.el7.x86_64 > glusterfs-api-3.10.12-1.el7.x86_64 > nfs-ganesha-2.4.5-1.el7.x86_64 > centos-release-gluster310-1.0-1.el7.centos.noarch > glusterfs-3.10.12-1.el7.x86_64 > glusterfs-cli-3.10.12-1.el7.x86_64 > nfs-ganesha-gluster-2.4.5-1.el7.x86_64 > glusterfs-server-3.10.12-1.el7.x86_64 > glusterfs-libs-3.10.12-1.el7.x86_64 > glusterfs-fuse-3.10.12-1.el7.x86_64 > glusterfs-ganesha-3.10.12-1.el7.x86_64 > For nfs-ganesha problems you'd really be better served by posting to support@ or de...@lists.nfs-ganesha.org. Both glusterfs-3.10 and nfs-ganesha-2.4 are really old. glusterfs-3.10 is even officially EOL. Ganesha isn't really organized enough to have done anything as bold as officially declaring 2.4 as having reached EOL. The nfs-ganesha devs are currently working on 2.7; maintaining and supporting 2.6, and less so 2.5, is pretty much at the limit of what they might be willing to help debug. I strongly encourage you to update to a more recent version of both glusterfs and nfs-ganesha. glusterfs-4.1 and nfs-ganesha-2.6 would be ideal. Then if you still have problems you're much more likely to get help. -- Kaleb
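For reference, on CentOS 7 that upgrade is roughly a matter of swapping the Storage SIG release package and updating; the 4.1 package name below is an assumption from memory, so check what the SIG actually ships first:

# remove the old 3.10 SIG repo and pull in the 4.1 one (package name may differ)
yum remove -y centos-release-gluster310
yum install -y centos-release-gluster41
yum update -y 'glusterfs*' 'nfs-ganesha*'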
[Gluster-users] ganesha.nfsd process dies when copying files
Hey all!

I am playing around on my computer with setting up a virtual mini-cluster of five VMs:

1x router
1x client
3x Gluster/NFS-Ganesha servers

The router is pfSense, the client is Xubuntu 18.04 and the servers are CentOS 7.5. I set up the cluster using 'gdeploy' with configuration snippets taken from the oVirt/Cockpit HCI setup and another snippet for setting up the NFS-Ganesha part of it. The configuration is successful apart from some minor details I debugged, but I'm fairly sure I haven't made any obvious misses.

All of the VMs are registered in pfSense's DNS, as well as the VIPs for the NFS-Ganesha nodes, which works great, and the client has no issues with resolving any of the names.

hv01.localdomain    192.168.1.101
hv02.localdomain    192.168.1.102
hv03.localdomain    192.168.1.103
hv01v.localdomain   192.168.1.110
hv02v.localdomain   192.168.1.111
hv03v.localdomain   192.168.1.112

The cluster status is HEALTHY according to '/usr/libexec/ganesha/ganesha-ha.sh' before I start my tests:

client# mount -t nfs -o vers=4.1 hv01v.localdomain:/data /mnt
client# dd if=/dev/urandom of=/var/tmp/test.bin bs=1M count=1024
client# while true; do rsync /var/tmp/test.bin /mnt/; rm -f /mnt/test.bin; done

Then after a while, the 'nfs-ganesha' service unexpectedly dies and doesn't restart by itself. The copy loop gets picked up after a while on 'hv02', and history repeats itself until all of the nodes' 'nfs-ganesha' services are dead. With normal logs activated, the dead node says nothing before dying (sudden heart attack syndrome), so no clues there, and the remaining ones only say they've taken over... Right now I'm running with FULL_DEBUG, which makes testing very difficult since the throughput is down to a crawl. Nothing strange about that, it just takes a lot more time to provoke.

Please don't hesitate to ask for more information in case there's something else you'd like me to share! I'm hoping someone recognizes this behaviour and knows what I'm doing wrong :)

glusterfs-client-xlators-3.10.12-1.el7.x86_64
glusterfs-api-3.10.12-1.el7.x86_64
nfs-ganesha-2.4.5-1.el7.x86_64
centos-release-gluster310-1.0-1.el7.centos.noarch
glusterfs-3.10.12-1.el7.x86_64
glusterfs-cli-3.10.12-1.el7.x86_64
nfs-ganesha-gluster-2.4.5-1.el7.x86_64
glusterfs-server-3.10.12-1.el7.x86_64
glusterfs-libs-3.10.12-1.el7.x86_64
glusterfs-fuse-3.10.12-1.el7.x86_64
glusterfs-ganesha-3.10.12-1.el7.x86_64

Thanks in advance!

/K
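A rough sketch for narrowing down the crash once it happens again, assuming the stock systemd unit name (nfs-ganesha) and that systemd-coredump is set up to capture cores:

hv01# journalctl -u nfs-ganesha -b        # unit messages from the current boot
hv01# coredumpctl list ganesha.nfsd       # any captured cores from the crashed daemon
hv01# coredumpctl info ganesha.nfsd       # backtrace of the most recent core, if one exists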
#gdeploy configuration generated by cockpit-gluster plugin

[hosts]
hv01.localdomain
hv02.localdomain
hv03.localdomain

[yum]
action=install
repolist=
gpgcheck=no
update=no
packages=glusterfs-server,glusterfs-api,glusterfs-ganesha,nfs-ganesha,nfs-ganesha-gluster,policycoreutils-python,device-mapper-multipath,corosync,pacemaker,pcs

[script1:hv01.localdomain]
action=execute
ignore_script_errors=no
file=/usr/share/gdeploy/scripts/grafton-sanity-check.sh -d vdb -h hv01.localdomain,hv02.localdomain,hv03.localdomain

[script1:hv02.localdomain]
action=execute
ignore_script_errors=no
file=/usr/share/gdeploy/scripts/grafton-sanity-check.sh -d vdb -h hv01.localdomain,hv02.localdomain,hv03.localdomain

[script1:hv03.localdomain]
action=execute
ignore_script_errors=no
file=/usr/share/gdeploy/scripts/grafton-sanity-check.sh -d vdb -h hv01.localdomain,hv02.localdomain,hv03.localdomain

[disktype]
jbod

[diskcount]
12

[stripesize]
256

[service1]
action=enable
service=chronyd

[service2]
action=restart
service=chronyd

[script3]
action=execute
file=/usr/share/gdeploy/scripts/blacklist_all_disks.sh
ignore_script_errors=no

[pv1:hv01.localdomain]
action=create
devices=vdb
ignore_pv_errors=no

[pv1:hv02.localdomain]
action=create
devices=vdb
ignore_pv_errors=no

[pv1:hv03.localdomain]
action=create
devices=vdb
ignore_pv_errors=no

[vg1:hv01.localdomain]
action=create
vgname=gluster_vg_vdb
pvname=vdb
ignore_vg_errors=no

[vg1:hv02.localdomain]
action=create
vgname=gluster_vg_vdb
pvname=vdb
ignore_vg_errors=no

[vg1:hv03.localdomain]
action=create
vgname=gluster_vg_vdb
pvname=vdb
ignore_vg_errors=no

[lv1:hv01.localdomain]
action=create
poolname=gluster_thinpool_vdb
ignore_lv_errors=no
vgname=gluster_vg_vdb
lvtype=thinpool
size=450GB
poolmetadatasize=3GB

[lv2:hv02.localdomain]
action=create
poolname=gluster_thinpool_vdb
ignore_lv_errors=no
vgname=gluster_vg_vdb
lvtype=thinpool
size=450GB
poolmetadatasize=3GB

[lv3:hv03.localdomain]
action=create
poolname=gluster_thinpool_vdb
ignore_lv_errors=no
vgname=gluster_vg_vdb
lvtype=thinpool
size=45GB
poolmetadatasize=1GB

[lv4:hv01.localdomain]
action=create
lvname=gluster_lv_data
ignore_lv_errors=no
vgname=gluster_vg_vdb
mount=/gluster_bricks/data
lvtype=thinlv
poolname=gluster_thinpool_vdb
virtualsize=450GB

[lv5:hv02.localdomain]
action=create
lvname=gluster_lv_data
ignore_lv_errors=no
vgname=gluster_vg_vdb
mount=/gluster_bricks/data
lvtype=thinlv
poolname=gluster_thinpool_vdb
virtualsize=450GB

[lv6:hv03.localdomain]
action=create
lvname=gluster_lv_data
Re: [Gluster-users] Gluster Outreachy
Hi all, *Gentle reminder!* The doc[1] for adding project ideas for Outreachy will be open for editing till August 20th. Please feel free to add your project ideas :). [1]: https://docs.google.com/document/d/16yKKDD2Dd6Ag0tssrdoFPojKsF16QI5-j7cUHcR5Pq4/edit?usp=sharing Thanks, Bhumika On Wed, Jul 4, 2018 at 4:51 PM, Bhumika Goyal wrote: > Hi all, > > Gnome has been working on an initiative known as Outreachy[1] since 2010. > Outreachy is a three months remote internship program. It aims to increase > the participation of women and members from under-represented groups in > open source. This program is held twice in a year. During the internship > period, interns contribute to a project under the guidance of one or more > mentors. > > For the next round(Dec 2018- March 2019) we are planning to apply projects > from Gluster. We would like you to propose projects ideas or/and come > forward as mentors/volunteers. > Please feel free to add project ideas in this doc[2]. The doc[2] will be > open for editing till July end. > > [1]: https://www.outreachy.org/ > [2]: https://docs.google.com/document/d/16yKKDD2Dd6Ag0tssrdoFPojK > sF16QI5-j7cUHcR5Pq4/edit?usp=sharing > > Outreachy timeline: > Pre-Application Period - Late August to early September > Application Period - Early September to mid-October > Internship Period - December to March > > Thanks, > Bhumika