[Gluster-users] Gluster failure disk is full

2018-05-17 Thread Thing
Hi,

I have a 3-way Gluster 4 setup. I had a "mishap" of some sort and lost node no2. There was, and still is, 660 GB spare on no1 and no3, but no2 is so full that I cannot mount it at boot, nor manually. I was only creating an 80 GB VM, so I have no idea what happened or how to fix it.
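
For what it is worth, a rough sketch of how the fill level of each brick can be checked (the volume name and brick path below are placeholders, not my actual layout):

    # free space on the brick filesystem as seen by the OS
    df -h /path/to/brick

    # capacity, free space and inode counts per brick as reported by Gluster
    gluster volume status <volname> detail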

Re: [Gluster-users] New 3.12.7 possible split-brain on replica 3

2018-05-17 Thread mabi
Hi Ravi,

Please find below the answers to your questions:

1) I have never touched the cluster.quorum-type option. Currently it is set as follows for this volume (see also the short query sketch below the output):

Option                 Value
------                 -----
cluster.quorum-type    none
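
For reference, the output above is what the volume-get command reports; a minimal sketch, assuming the volume name myvol-private from the brick logs further down:

    gluster volume get myvol-private cluster.quorum-type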

2) The .shareKey files are not supposed to be empty. They should be 512 bytes in size and contain binary data (a PGP secret sub-key). I am not in a position to say why this particular file is only 0 bytes in this specific case, nor whether it is the fault of the software (Nextcloud) or of GlusterFS. I can only say that I have another file server, a plain NFS server with another Nextcloud installation, and there I have never seen any 0-byte .shareKey files being created.
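
As an aside, the pattern Ravi describes in the quoted mail below (a file pre-allocated with the keep-size option but never written to, so stat shows Size 0 with non-zero Blocks) can be reproduced on a scratch file; a minimal sketch, the path being just an example:

    # allocate 512 bytes of blocks but keep the apparent file size at 0
    fallocate --keep-size --length 512 /tmp/example.shareKey

    # stat now reports Size: 0 with non-zero Blocks, i.e. a 'regular empty file'
    stat /tmp/example.shareKey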

3) It seems to be quite random. I am not the person who uses the Nextcloud software, so I can't say what it was doing at that specific time, but I would guess it was uploading or moving files around. Basically, I use GlusterFS to store the files/data of the Nextcloud web application, where I have it mounted using a FUSE mount (mount -t glusterfs); an example invocation is sketched just below.
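
The mount itself is nothing special, roughly like this (the hostname and mount point here are made up, only the volume name matches the logs below):

    mount -t glusterfs node1.example.com:/myvol-private /mnt/nextcloud-data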

Regarding the logs, I have attached the mount log file from the client, and below are the relevant log entries from the brick log files of all 3 nodes. Let me know if you need any other log files. Also, if you know of any "log file sanitizer tool" which can replace sensitive file names with random file names in log files, I would like to use it, as right now I have to do that manually (roughly along the lines of the sed sketch below).
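
A very rough sketch of what I mean by sanitizing (the /files/ pattern and the log file name are only examples and would need adjusting):

    # replace everything after .../files/ in a path with a fixed placeholder
    sed -E 's|(/files/)[^ ]+|\1REDACTED|g' brick.log > brick-sanitized.log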
NODE 1 brick log:

[2018-05-15 06:54:20.176679] E [MSGID: 113015] [posix.c:1211:posix_opendir] 0-myvol-private-posix: opendir failed on /data/myvol-private/brick/cloud/data/admin/files_encryption/keys/files/dir/dir/anotherdir/dir/OC_DEFAULT_MODULE [No such file or directory]

NODE 2 brick log:

[2018-05-15 06:54:20.176415] E [MSGID: 113015] [posix.c:1211:posix_opendir] 0-myvol-private-posix: opendir failed on /data/myvol-private/brick/cloud/data/admin/files_encryption/keys/files/dir/dir/anotherdir/dir/OC_DEFAULT_MODULE [No such file or directory]

NODE 3 (arbiter) brick log:

[2018-05-15 06:54:19.898981] W [MSGID: 113103] [posix.c:285:posix_lookup] 0-myvol-private-posix: Found stale gfid handle /srv/glusterfs/myvol-private/brick/.glusterfs/f0/65/f065a5e7-ac06-445f-add0-83acf8ce4155, removing it. [Stale file handle]
[2018-05-15 06:54:20.056196] W [MSGID: 113103] [posix.c:285:posix_lookup] 0-myvol-private-posix: Found stale gfid handle /srv/glusterfs/myvol-private/brick/.glusterfs/8f/a1/8fa15dbd-cd5c-4900-b889-0fe7fce46a13, removing it. [Stale file handle]
[2018-05-15 06:54:20.172823] I [MSGID: 115056] [server-rpc-fops.c:485:server_rmdir_cbk] 0-myvol-private-server: 14740125: RMDIR /cloud/data/admin/files_encryption/keys/files/dir/dir/anotherdir/dir/OC_DEFAULT_MODULE (f065a5e7-ac06-445f-add0-83acf8ce4155/OC_DEFAULT_MODULE), client: nextcloud.domain.com-7972-2018/05/10-20:31:46:163206-myvol-private-client-2-0-0, error-xlator: myvol-private-posix [Directory not empty]
[2018-05-15 06:54:20.190911] I [MSGID: 115056] [server-rpc-fops.c:485:server_rmdir_cbk] 0-myvol-private-server: 14740141: RMDIR /cloud/data/admin/files_encryption/keys/files/dir/dir/anotherdir/dir (72a1613e-2ac0-48bd-8ace-f2f723f3796c/2016.03.15 AVB_Photovoltaik-Versicherung 2013.pdf), client: nextcloud.domain.com-7972-2018/05/10-20:31:46:163206-myvol-private-client-2-0-0, error-xlator: myvol-private-posix [Directory not empty]
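
For completeness, the entries that AFR currently flags for healing, and those it considers to be in split-brain, can be listed with the standard heal-info commands (volume name assumed to be myvol-private, as in the logs above):

    gluster volume heal myvol-private info
    gluster volume heal myvol-private info split-brain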


Best regards,
Mabi

‐‐‐ Original Message ‐‐‐

On May 17, 2018 7:00 AM, Ravishankar N  wrote:

> Hi mabi,
> 
> Some questions:
> 
> - Did you by any chance change the cluster.quorum-type option from the
>   default values?
> 
> - Is filename.shareKey supposed to be an empty file? Looks like the file
>   was fallocated with the keep-size option but never written to. (On the 2
>   data bricks, stat output shows Size = 0, but non-zero Blocks, and yet a
>   'regular empty file'.)
> 
> - Do you have some sort of a reproducer / steps that you perform when the
>   issue occurs? Please also share the logs from all 3 nodes and the client(s).
> 
> Thanks,
> 
> Ravi
> 
> On 05/15/2018 05:26 PM, mabi wrote:
> 
> > Thank you Ravi for your fast answer. As requested you will find below the 
> > "stat" and "getfattr" of one of the files and its parent directory from all 
> > three nodes of my cluster.
> > 
> > NODE 1:
> > 
> > File: ‘/data/myvolume-private/brick/dir1/dir2/dir3/dir4/dir5/dir6/dir7/dir8/dir9/dir10/OC_DEFAULT_MODULE/filename.shareKey’
> > Size: 0 Blocks: 38 IO Block: 131072 regular empty file
> > Device: 23h/35d Inode: 744413 Links: 2
> > Access: (0644/-rw-r--r--) Uid: (20936/ UNKNOWN) Gid: (20936/ UNKNOWN)
> > Access: 2018-05-15 08:54:20.296048887 +0200
> > Modify: 2018-05-15 08:54:20.296048887 +0200
> > Change: 2018-05-15 08:54:20.340048505 +0200
> > Birth: -
> > 
> > File: