Re: [Gluster-users] Issues in AFR and self healing

2018-08-10 Thread Ravishankar N



On 08/10/2018 11:25 PM, Pablo Schandin wrote:


Hello everyone!

I'm having some trouble with something, but I'm not quite sure with 
what yet. I'm running GlusterFS 3.12.6 on Ubuntu 16.04. I have two 
servers (nodes) in the cluster in replica mode. Each server has 2 
bricks. As the servers are KVM hosts running several VMs, one brick has 
some VMs locally defined on it and the second brick is the replica from 
the other server. It has data, but no actual writing is done to it 
except for the replication.


                Server 1                                     Server 2
Volume 1 (gv1): Brick 1, defined VMs (read/write)   --->     Brick 1, replicated qcow2 files
Volume 2 (gv2): Brick 2, replicated qcow2 files     <---     Brick 2, defined VMs (read/write)


So, the main issue arose when I got a Nagios alarm warning about a 
file listed to be healed, which then disappeared. I came to find out 
that every 5 minutes the self-heal daemon triggers the healing and 
this fixes it. But looking at the logs I have a lot of entries in the 
glustershd.log file like this:


[2018-08-09 14:23:37.689403] I [MSGID: 108026] 
[afr-self-heal-common.c:1656:afr_log_selfheal] 0-gv1-replicate-0: 
Completed data selfheal on 407bd97b-e76c-4f81-8f59-7dae11507b0c. 
sources=[0]  sinks=1
[2018-08-09 14:44:37.933143] I [MSGID: 108026] 
[afr-self-heal-common.c:1656:afr_log_selfheal] 0-gv2-replicate-0: 
Completed data selfheal on 73713556-5b63-4f91-b83d-d7d82fee111f. 
sources=[0]  sinks=1


The qcow2 files are being healed several times a day (up to 30 times 
on occasion). As I understand it, this means that a data heal occurred 
on the files with gfids 407b... and 7371..., from source to sink. Local 
server to replica server? Is it OK for the shd to heal files on the 
replicated brick that supposedly has no writing on it besides the 
mirroring? How does that work?


In AFR, for writes, there is no notion of a local/remote brick. No matter 
which client you write to the volume from, the write gets sent to both 
bricks, i.e. the replication is synchronous and real-time.
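
If you want to double-check whether anything is pending heal at a given 
moment, a quick (read-only) check from any node would be something like:

# gluster volume heal gv1 info
# gluster volume heal gv1 statistics heal-count

These only list entries currently captured in the index directories, so 
a zero count right after a completed heal is expected.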


How does AFR replication work? The file with gfid 7371... is the qcow2 
root disk of an ownCloud server with 17GB of data. It does not seem big 
enough to be a bottleneck of some sort, I think.


Also, I was investigating the directory tree in 
brick/.glusterfs/indices and I noticed that in both xattrop and dirty 
I always have a file named xattrop-xx and dirty-xx. I read 
that the xattrop file is like a parent file or handle used to 
reference other files created there as hard links with gfid names for 
the shd to heal. Is it the same for the ones in the dirty dir?


Yes, before the write, the gfid gets captured inside dirty on all 
bricks. If the write is successful, it gets removed. In addition, if the 
write fails on one brick, the other brick will capture the gfid inside 
xattrop.
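
If you want to see this on disk, you can list the index directories and 
look at the AFR changelog xattrs of the file itself; the brick and file 
paths below are just placeholders for your actual paths:

# ls /path/to/brick/.glusterfs/indices/dirty/
# ls /path/to/brick/.glusterfs/indices/xattrop/
# getfattr -d -m . -e hex /path/to/brick/images/vm1.qcow2

Non-zero trusted.afr.<volname>-client-* values on one brick indicate 
pending heals towards the other brick.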


Any help will be greatly appreciated. Thanks!

If frequent heals are triggered, it could mean there are frequent 
network disconnects from the clients to the bricks as writes happen. You 
can check the mount logs and see if that is the case and investigate 
possible network issues.
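
For example, on the client (the fuse mount log name is derived from the 
mount point, so the file name below is only an illustration):

# grep -iE "disconnected from|connected to" \
    /var/log/glusterfs/var-lib-libvirt-images.log | tail -n 20

Frequent disconnect/connect pairs around the times of those heals would 
point towards a network or ping-timeout problem.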


HTH,
Ravi


Pablo.






[Gluster-users] Issues in AFR and self healing

2018-08-10 Thread Pablo Schandin

Hello everyone!

I'm having some trouble with something, but I'm not quite sure with 
what yet. I'm running GlusterFS 3.12.6 on Ubuntu 16.04. I have two 
servers (nodes) in the cluster in replica mode. Each server has 2 
bricks. As the servers are KVM hosts running several VMs, one brick has 
some VMs locally defined on it and the second brick is the replica from 
the other server. It has data, but no actual writing is done to it 
except for the replication.


                Server 1                                     Server 2
Volume 1 (gv1): Brick 1, defined VMs (read/write)   --->     Brick 1, replicated qcow2 files
Volume 2 (gv2): Brick 2, replicated qcow2 files     <---     Brick 2, defined VMs (read/write)


So, the main issue arose when I got a Nagios alarm warning about a 
file listed to be healed, which then disappeared. I came to find out 
that every 5 minutes the self-heal daemon triggers the healing and 
this fixes it. But looking at the logs I have a lot of entries in the 
glustershd.log file like this:


[2018-08-09 14:23:37.689403] I [MSGID: 108026] 
[afr-self-heal-common.c:1656:afr_log_selfheal] 0-gv1-replicate-0: 
Completed data selfheal on 407bd97b-e76c-4f81-8f59-7dae11507b0c. 
sources=[0]  sinks=1
[2018-08-09 14:44:37.933143] I [MSGID: 108026] 
[afr-self-heal-common.c:1656:afr_log_selfheal] 0-gv2-replicate-0: 
Completed data selfheal on 73713556-5b63-4f91-b83d-d7d82fee111f. 
sources=[0]  sinks=1


The qcow2 files are being healed several times a day (up to 30 times 
on occasion). As I understand it, this means that a data heal occurred 
on the files with gfids 407b... and 7371..., from source to sink. Local 
server to replica server? Is it OK for the shd to heal files on the 
replicated brick that supposedly has no writing on it besides the 
mirroring? How does that work?


How does AFR replication work? The file with gfid 7371... is the qcow2 
root disk of an ownCloud server with 17GB of data. It does not seem big 
enough to be a bottleneck of some sort, I think.


Also, I was investigating the directory tree in 
brick/.glusterfs/indices and I noticed that in both xattrop and dirty 
I always have a file named xattrop-xx and dirty-xx. I read 
that the xattrop file is like a parent file or handle used to 
reference other files created there as hard links with gfid names for 
the shd to heal. Is it the same for the ones in the dirty dir?


Any help will be greatly appreciated. Thanks!

Pablo.






Re: [Gluster-users] ganesha.nfsd process dies when copying files

2018-08-10 Thread Karli Sjöberg
On Aug 10, 2018 15:39, "Kaleb S. KEITHLEY" wrote:
> On 08/10/2018 09:23 AM, Karli Sjöberg wrote:
> > On Fri, 2018-08-10 at 21:23 +0800, Pui Edylie wrote:
> >> Hi Karli,
> >>
> >> Storhaug works with glusterfs 4.1.2 and latest nfs-ganesha.
> >>
> >> I just installed them last weekend ... they are working very well :)
> >
> > Okay, awesome!
> >
> > Is there any documentation on how to do that?
>
> https://github.com/gluster/storhaug/wiki
>
> --
> Kaleb

Thank you very much Kaleb, that's exactly what I was after! I will redo
my cluster and try this approach instead!

/K

Re: [Gluster-users] blocking process on FUSE mount in directory which is using quota

2018-08-10 Thread Nithya Balachandran
On 9 August 2018 at 19:54, mabi  wrote:

> Thanks for the documentation. On my client using FUSE mount I found the
> PID by using ps (output below):
>
> root   456 1  4 14:17 ?00:05:15 /usr/sbin/glusterfs
> --volfile-server=gfs1a --volfile-id=myvol-private /mnt/myvol-private
>
> Then I ran the following command
>
> sudo kill -USR1 456
>
> but now I can't find where the files are stored. Are these supposed to be
> stored on the client directly? I checked /var/run/gluster and
> /var/log/gluster but could not see anything and /var/log/gluster does not
> even exist on the client.
>

They are usually created in /var/run/gluster. You will need to create the
directory on the client if it does not exist.
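
For example, with the PID from your ps output:

# mkdir -p /var/run/gluster
# kill -USR1 456
# ls -l /var/run/gluster/

The dump should show up there with a name along the lines of
glusterdump.<pid>.dump.<timestamp>.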



‐‐‐ Original Message ‐‐‐
> On August 9, 2018 3:59 PM, Raghavendra Gowdappa 
> wrote:
>
>
>
> On Thu, Aug 9, 2018 at 6:47 PM, mabi  wrote:
>
>> Hi Nithya,
>>
>> Thanks for the fast answer. Here the additional info:
>>
>> 1. gluster volume info
>>
>> Volume Name: myvol-private
>> Type: Replicate
>> Volume ID: e7a40a1b-45c9-4d3c-bb19-0c59b4eceec5
>> Status: Started
>> Snapshot Count: 0
>> Number of Bricks: 1 x (2 + 1) = 3
>> Transport-type: tcp
>> Bricks:
>> Brick1: gfs1a:/data/myvol-private/brick
>> Brick2: gfs1b:/data/myvol-private/brick
>> Brick3: gfs1c:/srv/glusterfs/myvol-private/brick (arbiter)
>> Options Reconfigured:
>> features.default-soft-limit: 95%
>> transport.address-family: inet
>> features.quota-deem-statfs: on
>> features.inode-quota: on
>> features.quota: on
>> nfs.disable: on
>> performance.readdir-ahead: on
>> client.event-threads: 4
>> server.event-threads: 4
>> auth.allow: 192.168.100.92
>>
>>
>>
>> 2. Sorry I have no clue how to take a "statedump" of a process on Linux.
>> Which command should I use for that? and which process would you like, the
>> blocked process (for example "ls")?
>>
>
> Statedumps are gluster specific. Please refer to
> https://docs.gluster.org/en/v3/Troubleshooting/statedump/ for
> instructions.
>
>
>>
>> Regards,
>> M.
>>
>> ‐‐‐ Original Message ‐‐‐
>> On August 9, 2018 3:10 PM, Nithya Balachandran 
>> wrote:
>>
>> Hi,
>>
>> Please provide the following:
>>
>>1. gluster volume info
>>2. statedump of the fuse process when it hangs
>>
>>
>> Thanks,
>> Nithya
>>
>> On 9 August 2018 at 18:24, mabi  wrote:
>>
>>> Hello,
>>>
>>> I recently upgraded my GlusterFS replica 2+1 (arbiter) to version
>>> 3.12.12 and now I see a weird behaviour on my client (using FUSE mount)
>>> where I have processes (PHP 5.6 FPM) trying to access a specific directory
>>> and then the process blocks. I can't kill the process either, not even with
>>> kill -9. I need to reboot the machine in order to get rid of these blocked
>>> processes.
>>>
>>> This directory has one particularity compared to the other directories:
>>> it has reached its quota soft-limit, as you can see here in the
>>> output of gluster volume quota list:
>>>
>>>   Path         Hard-limit   Soft-limit    Used     Available  Soft-limit exceeded?  Hard-limit exceeded?
>>>   ----------------------------------------------------------------------------------------------------
>>>   /directory   100.0GB      80%(80.0GB)   90.5GB   9.5GB      Yes                   No
>>>
>>> That does not mean that it is the quota's fault, but it might be a hint
>>> about where to start looking... And by the way, can someone explain to
>>> me what the soft-limit does? Or does it not do anything special?
>>>
>>> Here is the Linux stack of a blocked process on that directory, which
>>> happened with a simple "ls -la":
>>>
>>> [Thu Aug  9 14:21:07 2018] INFO: task ls:2272 blocked for more than 120
>>> seconds.
>>> [Thu Aug  9 14:21:07 2018]   Not tainted 3.16.0-4-amd64 #1
>>> [Thu Aug  9 14:21:07 2018] "echo 0 > 
>>> /proc/sys/kernel/hung_task_timeout_secs"
>>> disables this message.
>>> [Thu Aug  9 14:21:07 2018] ls  D 88017ef93200 0
>>> 2272   2268 0x0004
>>> [Thu Aug  9 14:21:07 2018]  88017653f490 0286
>>> 00013200 880174d7bfd8
>>> [Thu Aug  9 14:21:07 2018]  00013200 88017653f490
>>> 8800eeb3d5f0 8800fefac800
>>> [Thu Aug  9 14:21:07 2018]  880174d7bbe0 8800eeb3d6d0
>>> 8800fefac800 8800ffe1e1c0
>>> [Thu Aug  9 14:21:07 2018] Call Trace:
>>> [Thu Aug  9 14:21:07 2018]  [] ?
>>> __fuse_request_send+0xbd/0x270 [fuse]
>>> [Thu Aug  9 14:21:07 2018]  [] ?
>>> prepare_to_wait_event+0xf0/0xf0
>>> [Thu Aug  9 14:21:07 2018]  [] ?
>>> fuse_dentry_revalidate+0x181/0x300 [fuse]
>>> [Thu Aug  9 14:21:07 2018]  [] ?
>>> lookup_fast+0x25e/0x2b0
>>> [Thu Aug  9 14:21:07 2018]  [] ?
>>> path_lookupat+0x155/0x780
>>> [Thu Aug  9 14:21:07 2018]  [] ?
>>> kmem_cache_alloc+0x75/0x480
>>> [Thu Aug  9 14:21:07 2018]  [] ?
>>> fuse_getxattr+0xe9/0x150 [fuse]
>>> [Thu Aug  9 14:21:07 2018]  [] ?
>>> filename_lookup+0x26/0xc0
>>> [Thu Aug  9 14:21:07 2018] 

Re: [Gluster-users] ganesha.nfsd process dies when copying files

2018-08-10 Thread Pui Edylie

Hi Karli,

The following are my notes, which I gathered from Google searches, the 
storhaug wiki and more Google searches ... I might have missed certain 
steps, and this is based on CentOS 7.


install centos 7.x
yum update -y

I have disabled both firewalld and SELinux

In our setup we are using an LSI RAID card with RAID10 and present the 
virtual drive partition as /dev/sdb


Create LVM so that we can utilise the snapshot feature of gluster

pvcreate --dataalignment 256k /dev/sdb
vgcreate --physicalextentsize 256K gfs_vg /dev/sdb

set the thin pool to use all the space with -l 100%FREE
lvcreate --thinpool gfs_vg/thin_pool -l 100%FREE  --chunksize 256K 
--poolmetadatasize 15G --zero n


we use the XFS file system for our glusterfs bricks
mkfs.xfs -i size=512 /dev/gfs_vg/thin_pool

Add the following to /etc/fstab with mount point /brick1683 (you 
could change the name accordingly)

/dev/gfs_vg/thin_pool /brick1683 xfs    defaults 1 2
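
One thing worth double-checking here: a thin pool itself normally cannot 
be formatted or mounted directly; usually a thin LV is carved out of the 
pool first, and that is what gets the XFS filesystem and the fstab entry. 
Roughly (size and LV name are only an example):

lvcreate -V 500G -T gfs_vg/thin_pool -n gfs_lv
mkfs.xfs -i size=512 /dev/gfs_vg/gfs_lv

and then /dev/gfs_vg/gfs_lv would be the device mounted on /brick1683.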

Enable gluster 4.1 repo

vi /etc/yum.repos.d/Gluster.repo

[gluster41]
name=Gluster 4.1
baseurl=http://mirror.centos.org/centos/7/storage/$basearch/gluster-4.1/
gpgcheck=0
enabled=1

install gluster 4.1

yum install -y centos-release-gluster
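
Note that centos-release-gluster only pulls in the repository 
definition, so the actual server packages presumably still need to be 
installed on each node, e.g.:

yum install -y glusterfs-server

(glusterd itself gets started further down in these notes.)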

Once we have done the above steps on all 3 nodes, log in to one of the 
nodes and issue the following


gluster volume create gv0 replica 3 192.168.0.1:/brick1683/gv0 
192.168.0.2:/brick1684/gv0 192.168.0.3:/brick1685/gv0
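
The volume then still needs to be started before it can be mounted or 
exported, e.g.:

gluster volume start gv0
gluster volume info gv0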



Setting up HA for NFS-Ganesha using CTDB

install the storhaug package on all participating nodes
Install the storhaug package on all nodes using the appropriate command 
for your system:


yum -y install storhaug-nfs

Note: this will install all the dependencies, e.g. ctdb, 
nfs-ganesha-gluster, glusterfs, and their related dependencies.


Create a passwordless ssh key and copy it to all participating nodes
On one of the participating nodes (Fedora, RHEL, CentOS):
node1% ssh-keygen -f /etc/sysconfig/storhaug.d/secret.pem
or (Debian, Ubuntu):
node1% ssh-keygen -f /etc/default/storhaug.d/secret.pem
When prompted for a password, press the Enter key.

Copy the public key to all the nodes (Fedora, RHEL, CentOS):
node1% ssh-copy-id -i /etc/sysconfig/storhaug.d/secret.pem.pub root@node1
node1% ssh-copy-id -i /etc/sysconfig/storhaug.d/secret.pem.pub root@node2
node1% ssh-copy-id -i /etc/sysconfig/storhaug.d/secret.pem.pub root@node3

...

You can confirm that it works with (Fedora, RHEL, CentOS):
node1% ssh -oPasswordAuthentication=no -oStrictHostKeyChecking=no -i 
/etc/sysconfig/storhaug.d/secret.pem root@node1



populate /etc/ctdb/nodes and /etc/ctdb/public_addresses
Select one node as your lead node, e.g. node1. On the lead node, 
create/edit /etc/ctdb/nodes and populate it with the (fixed) IP 
addresses of the participating nodes. It should look like this:

192.168.122.81
192.168.122.82
192.168.122.83
192.168.122.84

On the lead node, create/edit /etc/ctdb/public_addresses and populate it 
with the floating IP addresses (a.k.a. VIPs) for the participating 
nodes. These must be different than the IP addresses in /etc/ctdb/nodes. 
It should look like this:

192.168.122.85 eth0
192.168.122.86 eth0
192.168.122.87 eth0
192.168.122.88 eth0

edit /etc/ctdb/ctdbd.conf
Ensure that the line CTDB_MANAGES_NFS=yes exists. If not, add it or 
change it from no to yes. Add or change the following lines:

CTDB_RECOVERY_LOCK=/run/gluster/shared_storage/.ctdb/reclock
CTDB_NFS_CALLOUT=/etc/ctdb/nfs-ganesha-callout
CTDB_NFS_STATE_FS_TYPE=glusterfs
CTDB_NFS_STATE_MNT=/run/gluster/shared_storage
CTDB_NFS_SKIP_SHARE_CHECK=yes
NFS_HOSTNAME=localhost

create a bare minimum /etc/ganesha/ganesha.conf file
On the lead node:
node1% touch /etc/ganesha/ganesha.conf
or
node1% echo "### NFS-Ganesha.config" > /etc/ganesha/ganesha.conf

Note: you can edit this later to set global configuration options.

create a trusted storage pool and start the gluster shared-storage volume
On all the participating nodes:
node1% systemctl start glusterd
node2% systemctl start glusterd
node3% systemctl start glusterd
...

On the lead node, peer probe the other nodes:
node1% gluster peer probe node2
node1% gluster peer probe node3
...

Optional: on one of the other nodes, peer probe node1:
node2% gluster peer probe node1

Enable the gluster shared-storage volume:
node1% gluster volume set all cluster.enable-shared-storage enable
This takes a few moments. When done, check that the 
gluster_shared_storage volume is mounted at /run/gluster/shared_storage 
on all the nodes.
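
A quick way to verify on each node, for example:

df -h /run/gluster/shared_storage
gluster volume status gluster_shared_storage

(the shared volume is created with the name gluster_shared_storage).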


start the ctdbd and ganesha.nfsd daemons
On the lead node:
node1% storhaug setup
You can watch the ctdb (/var/log/ctdb.log) and ganesha log 
(/var/log/ganesha/ganesha.log) to monitor their progress. From this 
point on you may enter storhaug commands from any of the participating 
nodes.
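
For example, to follow both logs at once:

tail -f /var/log/ctdb.log /var/log/ganesha/ganesha.log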


export a gluster volume
Create a gluster volume
node1% gluster volume create myvol replica 2 node1:/bricks/vol/myvol 
node2:/bricks/vol/myvol node3:/bricks/vol/myvol node4:/bricks/vol/myvol ...


Start the gluster 

Re: [Gluster-users] ganesha.nfsd process dies when copying files

2018-08-10 Thread Kaleb S. KEITHLEY
On 08/10/2018 09:23 AM, Karli Sjöberg wrote:
> On Fri, 2018-08-10 at 21:23 +0800, Pui Edylie wrote:
>> Hi Karli,
>>
>> Storhaug works with glusterfs 4.1.2 and latest nfs-ganesha.
>>
>> I just installed them last weekend ... they are working very well :)
> 
> Okay, awesome!
> 
> Is there any documentation on how to do that?
> 

https://github.com/gluster/storhaug/wiki

-- 

Kaleb




Re: [Gluster-users] ganesha.nfsd process dies when copying files

2018-08-10 Thread Kaleb S. KEITHLEY
On 08/10/2018 09:08 AM, Karli Sjöberg wrote:
> On Fri, 2018-08-10 at 08:39 -0400, Kaleb S. KEITHLEY wrote:
>> On 08/10/2018 08:08 AM, Karli Sjöberg wrote:
>>> Hey all!
>>> ...
>>>
>>> glusterfs-client-xlators-3.10.12-1.el7.x86_64
>>> glusterfs-api-3.10.12-1.el7.x86_64
>>> nfs-ganesha-2.4.5-1.el7.x86_64
>>> centos-release-gluster310-1.0-1.el7.centos.noarch
>>> glusterfs-3.10.12-1.el7.x86_64
>>> glusterfs-cli-3.10.12-1.el7.x86_64
>>> nfs-ganesha-gluster-2.4.5-1.el7.x86_64
>>> glusterfs-server-3.10.12-1.el7.x86_64
>>> glusterfs-libs-3.10.12-1.el7.x86_64
>>> glusterfs-fuse-3.10.12-1.el7.x86_64
>>> glusterfs-ganesha-3.10.12-1.el7.x86_64
>>>
>>
>> For nfs-ganesha problems you'd really be better served by posting to
>> support@ or de...@lists.nfs-ganesha.org.
>>
>> Both glusterfs-3.10 and nfs-ganesha-2.4 are really old. glusterfs-
>> 3.10
>> is even officially EOL. Ganesha isn't really organized  enough to
>> have
>> done anything as bold as officially declaring 2.4 as having reached
>> EOL.
>>
>> The nfs-ganesha devs are currently working on 2.7; maintaining and
>> supporting 2.6, and less so 2.5, is pretty much at the limit of what
>> they might be willing to help debug.
>>
>> I strongly encourage you to update to a more recent version of both
>> glusterfs and nfs-ganesha.  glusterfs-4.1 and nfs-ganesha-2.6 would
>> be
>> ideal. Then if you still have problems you're much more likely to get
>> help.
>>
> 
> Hi, thank you for your answer, but it raises even more questions about
> any potential production deployment.
> 
> Actually, I knew that the versions are old, but it seems to me that you
> are contradicting yourself:
> 
> https://lists.gluster.org/pipermail/gluster-users/2017-July/031753.html
> 
> "After 3.10 you'd need to use storhaug Which doesn't work
> (yet).
> 

I don't recall the context of that email.

But I did also send this email
https://lists.gluster.org/pipermail/gluster-devel/2018-June/054896.html
announcing the availability of (working) storhaug-1.0.

There are packages in Fedora, CentOS Storage SIG, gluster's Launchpad
PPA, gluster's OBS, and for Debian on download.gluster.org.

-- 

Kaleb




Re: [Gluster-users] ganesha.nfsd process dies when copying files

2018-08-10 Thread Karli Sjöberg
On Fri, 2018-08-10 at 21:23 +0800, Pui Edylie wrote:
> Hi Karli,
> 
> Storhaug works with glusterfs 4.1.2 and latest nfs-ganesha.
> 
> I just installed them last weekend ... they are working very well :)

Okay, awesome!

Is there any documentation on how to do that?

/K

> 
> Cheers,
> Edy
> 
> On 8/10/2018 9:08 PM, Karli Sjöberg wrote:
> > On Fri, 2018-08-10 at 08:39 -0400, Kaleb S. KEITHLEY wrote:
> > > On 08/10/2018 08:08 AM, Karli Sjöberg wrote:
> > > > Hey all!
> > > > ...
> > > > 
> > > > glusterfs-client-xlators-3.10.12-1.el7.x86_64
> > > > glusterfs-api-3.10.12-1.el7.x86_64
> > > > nfs-ganesha-2.4.5-1.el7.x86_64
> > > > centos-release-gluster310-1.0-1.el7.centos.noarch
> > > > glusterfs-3.10.12-1.el7.x86_64
> > > > glusterfs-cli-3.10.12-1.el7.x86_64
> > > > nfs-ganesha-gluster-2.4.5-1.el7.x86_64
> > > > glusterfs-server-3.10.12-1.el7.x86_64
> > > > glusterfs-libs-3.10.12-1.el7.x86_64
> > > > glusterfs-fuse-3.10.12-1.el7.x86_64
> > > > glusterfs-ganesha-3.10.12-1.el7.x86_64
> > > > 
> > > 
> > > For nfs-ganesha problems you'd really be better served by posting
> > > to
> > > support@ or de...@lists.nfs-ganesha.org.
> > > 
> > > Both glusterfs-3.10 and nfs-ganesha-2.4 are really old.
> > > glusterfs-
> > > 3.10
> > > is even officially EOL. Ganesha isn't really organized  enough to
> > > have
> > > done anything as bold as officially declaring 2.4 as having
> > > reached
> > > EOL.
> > > 
> > > The nfs-ganesha devs are currently working on 2.7; maintaining
> > > and
> > > supporting 2.6, and less so 2.5, is pretty much at the limit of
> > > what
> > > they might be willing to help debug.
> > > 
> > > I strongly encourage you to update to a more recent version of
> > > both
> > > glusterfs and nfs-ganesha.  glusterfs-4.1 and nfs-ganesha-2.6
> > > would
> > > be
> > > ideal. Then if you still have problems you're much more likely to
> > > get
> > > help.
> > > 
> > 
> > Hi, thank you for your answer, but it raises even more questions
> > about
> > any potential production deployment.
> > 
> > Actually, I knew that the versions are old, but it seems to me that
> > you
> > are contradicting yourself:
> > 
> > https://lists.gluster.org/pipermail/gluster-users/2017-July/031753.
> > html
> > 
> > "After 3.10 you'd need to use storhaug Which doesn't work
> > (yet).
> > 
> > You need to use 3.10 for now."
> > 
> > So how is that supposed to work?
> > 
> > Is there documentation for how to get there?
> > 
> > Thanks in advance!
> > 
> > /K
> > 
> > 


Re: [Gluster-users] ganesha.nfsd process dies when copying files

2018-08-10 Thread Pui Edylie

Hi Karli,

Storhaug works with glusterfs 4.1.2 and latest nfs-ganesha.


I just installed them last weekend ... they are working very well :)

Cheers,
Edy


On 8/10/2018 9:08 PM, Karli Sjöberg wrote:

On Fri, 2018-08-10 at 08:39 -0400, Kaleb S. KEITHLEY wrote:

On 08/10/2018 08:08 AM, Karli Sjöberg wrote:

Hey all!
...

glusterfs-client-xlators-3.10.12-1.el7.x86_64
glusterfs-api-3.10.12-1.el7.x86_64
nfs-ganesha-2.4.5-1.el7.x86_64
centos-release-gluster310-1.0-1.el7.centos.noarch
glusterfs-3.10.12-1.el7.x86_64
glusterfs-cli-3.10.12-1.el7.x86_64
nfs-ganesha-gluster-2.4.5-1.el7.x86_64
glusterfs-server-3.10.12-1.el7.x86_64
glusterfs-libs-3.10.12-1.el7.x86_64
glusterfs-fuse-3.10.12-1.el7.x86_64
glusterfs-ganesha-3.10.12-1.el7.x86_64


For nfs-ganesha problems you'd really be better served by posting to
support@ or de...@lists.nfs-ganesha.org.

Both glusterfs-3.10 and nfs-ganesha-2.4 are really old. glusterfs-
3.10
is even officially EOL. Ganesha isn't really organized  enough to
have
done anything as bold as officially declaring 2.4 as having reached
EOL.

The nfs-ganesha devs are currently working on 2.7; maintaining and
supporting 2.6, and less so 2.5, is pretty much at the limit of what
they might be willing to help debug.

I strongly encourage you to update to a more recent version of both
glusterfs and nfs-ganesha.  glusterfs-4.1 and nfs-ganesha-2.6 would
be
ideal. Then if you still have problems you're much more likely to get
help.


Hi, thank you for your answer, but it raises even more questions about
any potential production deployment.

Actually, I knew that the versions are old, but it seems to me that you
are contradicting yourself:

https://lists.gluster.org/pipermail/gluster-users/2017-July/031753.html

"After 3.10 you'd need to use storhaug Which doesn't work
(yet).

You need to use 3.10 for now."

So how is that supposed to work?

Is there documentation for how to get there?

Thanks in advance!

/K



Re: [Gluster-users] ganesha.nfsd process dies when copying files

2018-08-10 Thread Karli Sjöberg
On Fri, 2018-08-10 at 08:39 -0400, Kaleb S. KEITHLEY wrote:
> On 08/10/2018 08:08 AM, Karli Sjöberg wrote:
> > Hey all!
> > ...
> > 
> > glusterfs-client-xlators-3.10.12-1.el7.x86_64
> > glusterfs-api-3.10.12-1.el7.x86_64
> > nfs-ganesha-2.4.5-1.el7.x86_64
> > centos-release-gluster310-1.0-1.el7.centos.noarch
> > glusterfs-3.10.12-1.el7.x86_64
> > glusterfs-cli-3.10.12-1.el7.x86_64
> > nfs-ganesha-gluster-2.4.5-1.el7.x86_64
> > glusterfs-server-3.10.12-1.el7.x86_64
> > glusterfs-libs-3.10.12-1.el7.x86_64
> > glusterfs-fuse-3.10.12-1.el7.x86_64
> > glusterfs-ganesha-3.10.12-1.el7.x86_64
> > 
> 
> For nfs-ganesha problems you'd really be better served by posting to
> support@ or de...@lists.nfs-ganesha.org.
> 
> Both glusterfs-3.10 and nfs-ganesha-2.4 are really old. glusterfs-
> 3.10
> is even officially EOL. Ganesha isn't really organized  enough to
> have
> done anything as bold as officially declaring 2.4 as having reached
> EOL.
> 
> The nfs-ganesha devs are currently working on 2.7; maintaining and
> supporting 2.6, and less so 2.5, is pretty much at the limit of what
> they might be willing to help debug.
> 
> I strongly encourage you to update to a more recent version of both
> glusterfs and nfs-ganesha.  glusterfs-4.1 and nfs-ganesha-2.6 would
> be
> ideal. Then if you still have problems you're much more likely to get
> help.
> 

Hi, thank you for your answer, but it raises even more questions about
any potential production deployment.

Actually, I knew that the versions are old, but it seems to me that you
are contradicting yourself:

https://lists.gluster.org/pipermail/gluster-users/2017-July/031753.html

"After 3.10 you'd need to use storhaug Which doesn't work
(yet).

You need to use 3.10 for now."

So how is that supposed to work?

Is there documentation for how to get there?

Thanks in advance!

/K


Re: [Gluster-users] ganesha.nfsd process dies when copying files

2018-08-10 Thread Kaleb S. KEITHLEY
On 08/10/2018 08:08 AM, Karli Sjöberg wrote:
> Hey all!
> ...
> 
> glusterfs-client-xlators-3.10.12-1.el7.x86_64
> glusterfs-api-3.10.12-1.el7.x86_64
> nfs-ganesha-2.4.5-1.el7.x86_64
> centos-release-gluster310-1.0-1.el7.centos.noarch
> glusterfs-3.10.12-1.el7.x86_64
> glusterfs-cli-3.10.12-1.el7.x86_64
> nfs-ganesha-gluster-2.4.5-1.el7.x86_64
> glusterfs-server-3.10.12-1.el7.x86_64
> glusterfs-libs-3.10.12-1.el7.x86_64
> glusterfs-fuse-3.10.12-1.el7.x86_64
> glusterfs-ganesha-3.10.12-1.el7.x86_64
> 

For nfs-ganesha problems you'd really be better served by posting to
support@ or de...@lists.nfs-ganesha.org.

Both glusterfs-3.10 and nfs-ganesha-2.4 are really old. glusterfs-3.10
is even officially EOL. Ganesha isn't really organized  enough to have
done anything as bold as officially declaring 2.4 as having reached EOL.

The nfs-ganesha devs are currently working on 2.7; maintaining and
supporting 2.6, and less so 2.5, is pretty much at the limit of what
they might be willing to help debug.

I strongly encourage you to update to a more recent version of both
glusterfs and nfs-ganesha.  glusterfs-4.1 and nfs-ganesha-2.6 would be
ideal. Then if you still have problems you're much more likely to get help.

-- 

Kaleb

[Gluster-users] ganesha.nfsd process dies when copying files

2018-08-10 Thread Karli Sjöberg
Hey all!

I am playing around on my computer with setting up a virtual 
mini-cluster of five VMs:

1x router
1x client
3x Gluster/NFS-Ganesha servers

The router is pfSense, the client is Xubuntu 18.04 and the servers are
CentOS 7.5.

I set up the cluster using 'gdeploy' with configuration snippets taken
from oVirt/Cockpit HCI setup and another snippet for setting up the
NFS-Ganesha part of it. The configuration is successful apart from some
minor details I debugged but I'm fairly sure I haven't made any obvious
misses.

All of the VMs are registered in pfSense's DNS, as well as the VIPs 
for the NFS-Ganesha nodes, which works great, and the client has no 
issues resolving any of the names.

hv01.localdomain    192.168.1.101
hv02.localdomain    192.168.1.102
hv03.localdomain    192.168.1.103
hv01v.localdomain   192.168.1.110
hv02v.localdomain   192.168.1.111
hv03v.localdomain   192.168.1.112

The cluster status is HEALTHY according to
'/usr/libexec/ganesha/ganesha-ha.sh' before I start my tests:

client# mount -t nfs -o vers=4.1 hv01v.localdomain:/data /mnt
client# dd if=/dev/urandom of=/var/tmp/test.bin bs=1M count=1024
client# while true; do rsync /var/tmp/test.bin /mnt/; rm -f
/mnt/test.bin; done

Then after a while, the 'nfs-ganesha' service unexpectedly dies and 
doesn't restart by itself. The copy loop gets picked up after a while 
on 'hv02', and history repeats itself until all of the nodes' 
'nfs-ganesha' services are dead.

With normal logs activated, the dead node says nothing before dying 
(sudden heart attack syndrome), so no clues there, and the remaining 
ones only say they've taken over...

Right now I'm running with FULL_DEBUG, which makes testing very 
difficult since the throughput is down to a crawl. Nothing strange 
about that, it just takes a lot more time to provoke.
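
For reference, this is roughly what I can look at on a node right after 
its ganesha.nfsd dies (default units and log locations assumed):

# systemctl status nfs-ganesha
# journalctl -u nfs-ganesha -e
# tail -n 200 /var/log/ganesha/ganesha.log

plus, where abrt or systemd-coredump is configured, checking for a core 
of ganesha.nfsd to get a backtrace out of.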

Please don't hesitate to ask for more information in case there's
something else you'd like me to share!

I'm hoping someone recognizes this behaviour and knows what I'm doing
wrong:)

glusterfs-client-xlators-3.10.12-1.el7.x86_64
glusterfs-api-3.10.12-1.el7.x86_64
nfs-ganesha-2.4.5-1.el7.x86_64
centos-release-gluster310-1.0-1.el7.centos.noarch
glusterfs-3.10.12-1.el7.x86_64
glusterfs-cli-3.10.12-1.el7.x86_64
nfs-ganesha-gluster-2.4.5-1.el7.x86_64
glusterfs-server-3.10.12-1.el7.x86_64
glusterfs-libs-3.10.12-1.el7.x86_64
glusterfs-fuse-3.10.12-1.el7.x86_64
glusterfs-ganesha-3.10.12-1.el7.x86_64

Thanks in advance!

/K

#gdeploy configuration generated by cockpit-gluster plugin
[hosts]
hv01.localdomain
hv02.localdomain
hv03.localdomain

[yum]
action=install
repolist=
gpgcheck=no
update=no
packages=glusterfs-server,glusterfs-api,glusterfs-ganesha,nfs-ganesha,nfs-ganesha-gluster,policycoreutils-python,device-mapper-multipath,corosync,pacemaker,pcs

[script1:hv01.localdomain]
action=execute
ignore_script_errors=no
file=/usr/share/gdeploy/scripts/grafton-sanity-check.sh -d vdb -h 
hv01.localdomain,hv02.localdomain,hv03.localdomain

[script1:hv02.localdomain]
action=execute
ignore_script_errors=no
file=/usr/share/gdeploy/scripts/grafton-sanity-check.sh -d vdb -h 
hv01.localdomain,hv02.localdomain,hv03.localdomain

[script1:hv03.localdomain]
action=execute
ignore_script_errors=no
file=/usr/share/gdeploy/scripts/grafton-sanity-check.sh -d vdb -h 
hv01.localdomain,hv02.localdomain,hv03.localdomain

[disktype]
jbod

[diskcount]
12

[stripesize]
256

[service1]
action=enable
service=chronyd

[service2]
action=restart
service=chronyd

[script3]
action=execute
file=/usr/share/gdeploy/scripts/blacklist_all_disks.sh
ignore_script_errors=no

[pv1:hv01.localdomain]
action=create
devices=vdb
ignore_pv_errors=no

[pv1:hv02.localdomain]
action=create
devices=vdb
ignore_pv_errors=no

[pv1:hv03.localdomain]
action=create
devices=vdb
ignore_pv_errors=no

[vg1:hv01.localdomain]
action=create
vgname=gluster_vg_vdb
pvname=vdb
ignore_vg_errors=no

[vg1:hv02.localdomain]
action=create
vgname=gluster_vg_vdb
pvname=vdb
ignore_vg_errors=no

[vg1:hv03.localdomain]
action=create
vgname=gluster_vg_vdb
pvname=vdb
ignore_vg_errors=no

[lv1:hv01.localdomain]
action=create
poolname=gluster_thinpool_vdb
ignore_lv_errors=no
vgname=gluster_vg_vdb
lvtype=thinpool
size=450GB
poolmetadatasize=3GB

[lv2:hv02.localdomain]
action=create
poolname=gluster_thinpool_vdb
ignore_lv_errors=no
vgname=gluster_vg_vdb
lvtype=thinpool
size=450GB
poolmetadatasize=3GB

[lv3:hv03.localdomain]
action=create
poolname=gluster_thinpool_vdb
ignore_lv_errors=no
vgname=gluster_vg_vdb
lvtype=thinpool
size=45GB
poolmetadatasize=1GB

[lv4:hv01.localdomain]
action=create
lvname=gluster_lv_data
ignore_lv_errors=no
vgname=gluster_vg_vdb
mount=/gluster_bricks/data
lvtype=thinlv
poolname=gluster_thinpool_vdb
virtualsize=450GB

[lv5:hv02.localdomain]
action=create
lvname=gluster_lv_data
ignore_lv_errors=no
vgname=gluster_vg_vdb
mount=/gluster_bricks/data
lvtype=thinlv
poolname=gluster_thinpool_vdb
virtualsize=450GB

[lv6:hv03.localdomain]
action=create
lvname=gluster_lv_data

Re: [Gluster-users] Gluster Outreachy

2018-08-10 Thread Bhumika Goyal
Hi all,

*Gentle reminder!*

The doc[1] for adding project ideas for Outreachy will be open for editing
till August 20th. Please feel free to add your project ideas :).
[1]:
https://docs.google.com/document/d/16yKKDD2Dd6Ag0tssrdoFPojKsF16QI5-j7cUHcR5Pq4/edit?usp=sharing

Thanks,
Bhumika



On Wed, Jul 4, 2018 at 4:51 PM, Bhumika Goyal  wrote:

> Hi all,
>
> Gnome has been working on an initiative known as Outreachy[1] since 2010.
> Outreachy is a three months remote internship program. It aims to increase
> the participation of women and members from under-represented groups in
> open source. This program is held twice in a year. During the internship
> period, interns contribute to a project under the guidance of one or more
> mentors.
>
> For the next round (Dec 2018 - March 2019) we are planning to apply with
> projects from Gluster. We would like you to propose project ideas and/or
> come forward as mentors/volunteers.
> Please feel free to add project ideas in this doc[2]. The doc[2] will be
> open for editing till July end.
>
> [1]: https://www.outreachy.org/
> [2]: https://docs.google.com/document/d/16yKKDD2Dd6Ag0tssrdoFPojKsF16QI5-j7cUHcR5Pq4/edit?usp=sharing
>
> Outreachy timeline:
> Pre-Application Period - Late August to early September
> Application Period - Early September to mid-October
> Internship Period -  December to March
>
> Thanks,
> Bhumika
>