Re: [Gluster-users] self-heal not working

2017-08-21 Thread Ravishankar N

Explore the following (a rough sketch of the matching commands follows this list):

- Launch index heal and look at the glustershd logs of all bricks for possible errors.

- See if the glustershd on each node is connected to all bricks.

- If not, try to restart shd with `volume start force`.

- Launch index heal again and retry.

- Try debugging the shd log by temporarily setting client-log-level to DEBUG.
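
A rough sketch of the matching commands (<volname> is a placeholder; the log-level
option I am referring to is diagnostics.client-log-level):

gluster volume heal <volname>                # launch index heal
gluster volume heal <volname> info           # entries still pending heal
gluster volume status <volname>              # is the Self-heal Daemon online on every node?
gluster volume start <volname> force         # restarts shd (and any offline bricks) without touching data
gluster volume set <volname> diagnostics.client-log-level DEBUG    # verbose shd/client logs
gluster volume set <volname> diagnostics.client-log-level INFO     # revert when done

The shd log itself is glustershd.log under /var/log/glusterfs/ on each node.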


On 08/22/2017 03:19 AM, mabi wrote:

Sure, it doesn't look like a split brain based on the output:

Brick node1.domain.tld:/data/myvolume/brick
Status: Connected
Number of entries in split-brain: 0

Brick node2.domain.tld:/data/myvolume/brick
Status: Connected
Number of entries in split-brain: 0

Brick node3.domain.tld:/srv/glusterfs/myvolume/brick
Status: Connected
Number of entries in split-brain: 0





 Original Message 
Subject: Re: [Gluster-users] self-heal not working
Local Time: August 21, 2017 11:35 PM
UTC Time: August 21, 2017 9:35 PM
From: btur...@redhat.com
To: mabi 
Gluster Users 

Can you also provide:

gluster v heal <volname> info split-brain

If it is split brain just delete the incorrect file from the brick 
and run heal again. I haven't tried this with arbiter but I assume 
the process is the same.


-b

- Original Message -
> From: "mabi" 
> To: "Ben Turner" 
> Cc: "Gluster Users" 
> Sent: Monday, August 21, 2017 4:55:59 PM
> Subject: Re: [Gluster-users] self-heal not working
>
> Hi Ben,
>
> So it is really a 0 kBytes file everywhere (all nodes including the arbiter
> and from the client).
> Here below you will find the output you requested. Hopefully that will help
> to find out why this specific file is not healing... Let me know if you need
> any more information. Btw node3 is my arbiter node.
>
> NODE1:
>
> STAT:
> File: ‘/data/myvolume/brick/data/appdata_ocpom4nckwru/preview/1344699/64-64-crop.png’
> Size: 0 Blocks: 38 IO Block: 131072 regular empty file
> Device: 24h/36d Inode: 10033884 Links: 2
> Access: (0644/-rw-r--r--) Uid: ( 33/www-data) Gid: ( 33/www-data)
> Access: 2017-08-14 17:04:55.530681000 +0200
> Modify: 2017-08-14 17:11:46.407404779 +0200
> Change: 2017-08-14 17:11:46.407404779 +0200
> Birth: -
>
> GETFATTR:
> trusted.afr.dirty=0sAQAA
> trusted.bit-rot.version=0sAgBZhuknAAlJAg==
> trusted.gfid=0sGYXiM9XuTj6lGs8LX58q6g==
> trusted.glusterfs.d99af2fa-439b-4a21-bf3a-38f3849f87ec.xtime=0sWZG9sgAGOyo=
>
> NODE2:
>
> STAT:
> File: ‘/data/myvolume/brick/data/appdata_ocpom4nckwru/preview/1344699/64-64-crop.png’
> Size: 0 Blocks: 38 IO Block: 131072 regular empty file
> Device: 26h/38d Inode: 10031330 Links: 2
> Access: (0644/-rw-r--r--) Uid: ( 33/www-data) Gid: ( 33/www-data)
> Access: 2017-08-14 17:04:55.530681000 +0200
> Modify: 2017-08-14 17:11:46.403704181 +0200
> Change: 2017-08-14 17:11:46.403704181 +0200
> Birth: -
>
> GETFATTR:
> trusted.afr.dirty=0sAQAA
> trusted.bit-rot.version=0sAgBZhu6wAA8Hpw==
> trusted.gfid=0sGYXiM9XuTj6lGs8LX58q6g==
> trusted.glusterfs.d99af2fa-439b-4a21-bf3a-38f3849f87ec.xtime=0sWZG9sgAGOVE=
>
> NODE3:
> STAT:
> File: /srv/glusterfs/myvolume/brick/data/appdata_ocpom4nckwru/preview/1344699/64-64-crop.png
> Size: 0 Blocks: 0 IO Block: 4096 regular empty file
> Device: ca11h/51729d Inode: 405208959 Links: 2
> Access: (0644/-rw-r--r--) Uid: ( 33/www-data) Gid: ( 33/www-data)
> Access: 2017-08-14 17:04:55.530681000 +0200
> Modify: 2017-08-14 17:04:55.530681000 +0200
> Change: 2017-08-14 17:11:46.604380051 +0200
> Birth: -
>
> GETFATTR:
> trusted.afr.dirty=0sAQAA
> trusted.bit-rot.version=0sAgBZe6ejAAKPAg==
> trusted.gfid=0sGYXiM9XuTj6lGs8LX58q6g==
> trusted.glusterfs.d99af2fa-439b-4a21-bf3a-38f3849f87ec.xtime=0sWZG9sgAGOc4=
>
> CLIENT GLUSTER MOUNT:
> STAT:
> File: "/mnt/myvolume/data/appdata_ocpom4nckwru/preview/1344699/64-64-crop.png"
> Size: 0 Blocks: 0 IO Block: 131072 regular empty file
> Device: 1eh/30d Inode: 11897049013408443114 Links: 1
> Access: (0644/-rw-r--r--) Uid: ( 33/www-data) Gid: ( 33/www-data)
> Access: 2017-08-14 17:04:55.530681000 +0200
> Modify: 2017-08-14 17:11:46.407404779 +0200
> Change: 2017-08-14 17:11:46.407404779 +0200
> Birth: -
>
> >  Original Message 
> > Subject: Re: [Gluster-users] self-heal not working
> > Local Time: August 21, 2017 9:34 PM
> > UTC Time: August 21, 2017 7:34 PM
> > From: btur...@redhat.com
> > To: mabi 
> > Gluster Users 
> >
> > - Original Message -
> >> From: "mabi" 
> >> To: "Gluster Users" 
> >> Sent: Monday, August 21, 2017 9:28:24 AM
> >> Subject: [Gluster-users] self-heal not working
> >>
> >> Hi,
> >>
> >> I have a replica 2 with arbiter GlusterFS 3.8.11 cluster and there is
> >> currently one file listed to be healed as you can see below but never gets
> >> healed by the self-heal daemon:
> >>
> >> Brick 

Re: [Gluster-users] self-heal not working

2017-08-21 Thread mabi
Sure, it doesn't look like a split brain based on the output:

Brick node1.domain.tld:/data/myvolume/brick
Status: Connected
Number of entries in split-brain: 0

Brick node2.domain.tld:/data/myvolume/brick
Status: Connected
Number of entries in split-brain: 0

Brick node3.domain.tld:/srv/glusterfs/myvolume/brick
Status: Connected
Number of entries in split-brain: 0

>  Original Message 
> Subject: Re: [Gluster-users] self-heal not working
> Local Time: August 21, 2017 11:35 PM
> UTC Time: August 21, 2017 9:35 PM
> From: btur...@redhat.com
> To: mabi 
> Gluster Users 
>
> Can you also provide:
>
> gluster v heal <volname> info split-brain
>
> If it is split brain just delete the incorrect file from the brick and run 
> heal again. I haven't tried this with arbiter but I assume the process is the 
> same.
>
> -b
>
> - Original Message -
>> From: "mabi" 
>> To: "Ben Turner" 
>> Cc: "Gluster Users" 
>> Sent: Monday, August 21, 2017 4:55:59 PM
>> Subject: Re: [Gluster-users] self-heal not working
>>
>> Hi Ben,
>>
>> So it is really a 0 kBytes file everywhere (all nodes including the arbiter
>> and from the client).
>> Here below you will find the output you requested. Hopefully that will help
>> to find out why this specific file is not healing... Let me know if you need
>> any more information. Btw node3 is my arbiter node.
>>
>> NODE1:
>>
>> STAT:
>> File:
>> ‘/data/myvolume/brick/data/appdata_ocpom4nckwru/preview/1344699/64-64-crop.png’
>> Size: 0 Blocks: 38 IO Block: 131072 regular empty file
>> Device: 24h/36d Inode: 10033884 Links: 2
>> Access: (0644/-rw-r--r--) Uid: ( 33/www-data) Gid: ( 33/www-data)
>> Access: 2017-08-14 17:04:55.530681000 +0200
>> Modify: 2017-08-14 17:11:46.407404779 +0200
>> Change: 2017-08-14 17:11:46.407404779 +0200
>> Birth: -
>>
>> GETFATTR:
>> trusted.afr.dirty=0sAQAA
>> trusted.bit-rot.version=0sAgBZhuknAAlJAg==
>> trusted.gfid=0sGYXiM9XuTj6lGs8LX58q6g==
>> trusted.glusterfs.d99af2fa-439b-4a21-bf3a-38f3849f87ec.xtime=0sWZG9sgAGOyo=
>>
>> NODE2:
>>
>> STAT:
>> File:
>> ‘/data/myvolume/brick/data/appdata_ocpom4nckwru/preview/1344699/64-64-crop.png’
>> Size: 0 Blocks: 38 IO Block: 131072 regular empty file
>> Device: 26h/38d Inode: 10031330 Links: 2
>> Access: (0644/-rw-r--r--) Uid: ( 33/www-data) Gid: ( 33/www-data)
>> Access: 2017-08-14 17:04:55.530681000 +0200
>> Modify: 2017-08-14 17:11:46.403704181 +0200
>> Change: 2017-08-14 17:11:46.403704181 +0200
>> Birth: -
>>
>> GETFATTR:
>> trusted.afr.dirty=0sAQAA
>> trusted.bit-rot.version=0sAgBZhu6wAA8Hpw==
>> trusted.gfid=0sGYXiM9XuTj6lGs8LX58q6g==
>> trusted.glusterfs.d99af2fa-439b-4a21-bf3a-38f3849f87ec.xtime=0sWZG9sgAGOVE=
>>
>> NODE3:
>> STAT:
>> File:
>> /srv/glusterfs/myvolume/brick/data/appdata_ocpom4nckwru/preview/1344699/64-64-crop.png
>> Size: 0 Blocks: 0 IO Block: 4096 regular empty file
>> Device: ca11h/51729d Inode: 405208959 Links: 2
>> Access: (0644/-rw-r--r--) Uid: ( 33/www-data) Gid: ( 33/www-data)
>> Access: 2017-08-14 17:04:55.530681000 +0200
>> Modify: 2017-08-14 17:04:55.530681000 +0200
>> Change: 2017-08-14 17:11:46.604380051 +0200
>> Birth: -
>>
>> GETFATTR:
>> trusted.afr.dirty=0sAQAA
>> trusted.bit-rot.version=0sAgBZe6ejAAKPAg==
>> trusted.gfid=0sGYXiM9XuTj6lGs8LX58q6g==
>> trusted.glusterfs.d99af2fa-439b-4a21-bf3a-38f3849f87ec.xtime=0sWZG9sgAGOc4=
>>
>> CLIENT GLUSTER MOUNT:
>> STAT:
>> File:
>> "/mnt/myvolume/data/appdata_ocpom4nckwru/preview/1344699/64-64-crop.png"
>> Size: 0 Blocks: 0 IO Block: 131072 regular empty file
>> Device: 1eh/30d Inode: 11897049013408443114 Links: 1
>> Access: (0644/-rw-r--r--) Uid: ( 33/www-data) Gid: ( 33/www-data)
>> Access: 2017-08-14 17:04:55.530681000 +0200
>> Modify: 2017-08-14 17:11:46.407404779 +0200
>> Change: 2017-08-14 17:11:46.407404779 +0200
>> Birth: -
>>
>> >  Original Message 
>> > Subject: Re: [Gluster-users] self-heal not working
>> > Local Time: August 21, 2017 9:34 PM
>> > UTC Time: August 21, 2017 7:34 PM
>> > From: btur...@redhat.com
>> > To: mabi 
>> > Gluster Users 
>> >
>> > - Original Message -
>> >> From: "mabi" 
>> >> To: "Gluster Users" 
>> >> Sent: Monday, August 21, 2017 9:28:24 AM
>> >> Subject: [Gluster-users] self-heal not working
>> >>
>> >> Hi,
>> >>
>> >> I have a replica 2 with arbiter GlusterFS 3.8.11 cluster and there is
>> >> currently one file listed to be healed as you can see below but never gets
>> >> healed by the self-heal daemon:
>> >>
>> >> Brick node1.domain.tld:/data/myvolume/brick
>> >> /data/appdata_ocpom4nckwru/preview/1344699/64-64-crop.png
>> >> Status: Connected
>> >> Number of entries: 1
>> >>
>> >> Brick node2.domain.tld:/data/myvolume/brick
>> >> /data/appdata_ocpom4nckwru/preview/1344699/64-64-crop.png
>> >> 

Re: [Gluster-users] self-heal not working

2017-08-21 Thread Ben Turner
Can you also provide:

gluster v heal <volname> info split-brain

If it is split brain just delete the incorrect file from the brick and run heal 
again.  I haven't tried this with arbiter but I assume the process is the same.
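
For what it's worth, a rough sketch of that brick-side cleanup with the paths from
this thread -- treat it as an assumption-laden outline, not a recipe (on 3.8 the CLI
can alternatively resolve split-brain with
"gluster volume heal <volname> split-brain source-brick <host>:<brick> <file>"):

# only if "heal info split-brain" really lists the file, and only on the brick
# holding the bad copy -- never through the client mount:
rm /data/myvolume/brick/data/appdata_ocpom4nckwru/preview/1344699/64-64-crop.png
# also remove its gfid hard link, .glusterfs/<first 2 hex chars>/<next 2>/<full gfid>,
# where the gfid comes from the file's trusted.gfid xattr:
rm /data/myvolume/brick/.glusterfs/<xx>/<yy>/<gfid>
# then trigger heal again:
gluster volume heal myvolume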

-b

- Original Message -
> From: "mabi" 
> To: "Ben Turner" 
> Cc: "Gluster Users" 
> Sent: Monday, August 21, 2017 4:55:59 PM
> Subject: Re: [Gluster-users] self-heal not working
> 
> Hi Ben,
> 
> So it is really a 0 kBytes file everywhere (all nodes including the arbiter
> and from the client).
> Here below you will find the output you requested. Hopefully that will help
> to find out why this specific file is not healing... Let me know if you need
> any more information. Btw node3 is my arbiter node.
> 
> NODE1:
> 
> STAT:
>   File: ‘/data/myvolume/brick/data/appdata_ocpom4nckwru/preview/1344699/64-64-crop.png’
>   Size: 0 Blocks: 38 IO Block: 131072 regular empty file
> Device: 24h/36d Inode: 10033884Links: 2
> Access: (0644/-rw-r--r--)  Uid: (   33/www-data)   Gid: (   33/www-data)
> Access: 2017-08-14 17:04:55.530681000 +0200
> Modify: 2017-08-14 17:11:46.407404779 +0200
> Change: 2017-08-14 17:11:46.407404779 +0200
> Birth: -
> 
> GETFATTR:
> trusted.afr.dirty=0sAQAA
> trusted.bit-rot.version=0sAgBZhuknAAlJAg==
> trusted.gfid=0sGYXiM9XuTj6lGs8LX58q6g==
> trusted.glusterfs.d99af2fa-439b-4a21-bf3a-38f3849f87ec.xtime=0sWZG9sgAGOyo=
> 
> NODE2:
> 
> STAT:
>   File: ‘/data/myvolume/brick/data/appdata_ocpom4nckwru/preview/1344699/64-64-crop.png’
>   Size: 0 Blocks: 38 IO Block: 131072 regular empty file
> Device: 26h/38d Inode: 10031330Links: 2
> Access: (0644/-rw-r--r--)  Uid: (   33/www-data)   Gid: (   33/www-data)
> Access: 2017-08-14 17:04:55.530681000 +0200
> Modify: 2017-08-14 17:11:46.403704181 +0200
> Change: 2017-08-14 17:11:46.403704181 +0200
> Birth: -
> 
> GETFATTR:
> trusted.afr.dirty=0sAQAA
> trusted.bit-rot.version=0sAgBZhu6wAA8Hpw==
> trusted.gfid=0sGYXiM9XuTj6lGs8LX58q6g==
> trusted.glusterfs.d99af2fa-439b-4a21-bf3a-38f3849f87ec.xtime=0sWZG9sgAGOVE=
> 
> NODE3:
> STAT:
>   File: /srv/glusterfs/myvolume/brick/data/appdata_ocpom4nckwru/preview/1344699/64-64-crop.png
>   Size: 0 Blocks: 0  IO Block: 4096   regular empty file
> Device: ca11h/51729d Inode: 405208959   Links: 2
> Access: (0644/-rw-r--r--)  Uid: (   33/www-data)   Gid: (   33/www-data)
> Access: 2017-08-14 17:04:55.530681000 +0200
> Modify: 2017-08-14 17:04:55.530681000 +0200
> Change: 2017-08-14 17:11:46.604380051 +0200
> Birth: -
> 
> GETFATTR:
> trusted.afr.dirty=0sAQAA
> trusted.bit-rot.version=0sAgBZe6ejAAKPAg==
> trusted.gfid=0sGYXiM9XuTj6lGs8LX58q6g==
> trusted.glusterfs.d99af2fa-439b-4a21-bf3a-38f3849f87ec.xtime=0sWZG9sgAGOc4=
> 
> CLIENT GLUSTER MOUNT:
> STAT:
>   File: '/mnt/myvolume/data/appdata_ocpom4nckwru/preview/1344699/64-64-crop.png'
>   Size: 0 Blocks: 0  IO Block: 131072 regular empty file
> Device: 1eh/30d Inode: 11897049013408443114  Links: 1
> Access: (0644/-rw-r--r--)  Uid: (   33/www-data)   Gid: (   33/www-data)
> Access: 2017-08-14 17:04:55.530681000 +0200
> Modify: 2017-08-14 17:11:46.407404779 +0200
> Change: 2017-08-14 17:11:46.407404779 +0200
> Birth: -
> 
> >  Original Message 
> > Subject: Re: [Gluster-users] self-heal not working
> > Local Time: August 21, 2017 9:34 PM
> > UTC Time: August 21, 2017 7:34 PM
> > From: btur...@redhat.com
> > To: mabi 
> > Gluster Users 
> >
> > - Original Message -
> >> From: "mabi" 
> >> To: "Gluster Users" 
> >> Sent: Monday, August 21, 2017 9:28:24 AM
> >> Subject: [Gluster-users] self-heal not working
> >>
> >> Hi,
> >>
> >> I have a replica 2 with arbiter GlusterFS 3.8.11 cluster and there is
> >> currently one file listed to be healed as you can see below but never gets
> >> healed by the self-heal daemon:
> >>
> >> Brick node1.domain.tld:/data/myvolume/brick
> >> /data/appdata_ocpom4nckwru/preview/1344699/64-64-crop.png
> >> Status: Connected
> >> Number of entries: 1
> >>
> >> Brick node2.domain.tld:/data/myvolume/brick
> >> /data/appdata_ocpom4nckwru/preview/1344699/64-64-crop.png
> >> Status: Connected
> >> Number of entries: 1
> >>
> >> Brick node3.domain.tld:/srv/glusterfs/myvolume/brick
> >> /data/appdata_ocpom4nckwru/preview/1344699/64-64-crop.png
> >> Status: Connected
> >> Number of entries: 1
> >>
> >> As once recommended on this mailing list I have mounted that glusterfs
> >> volume
> >> temporarily through fuse/glusterfs and ran a "stat" on that file which is
> >> listed above but nothing happened.
> >>
> >> The file itself is available on all 3 nodes/bricks but on the last node it
> >> has a different date. By the way this file is 0 kBytes big. Is that maybe
> >> the reason why the 

Re: [Gluster-users] self-heal not working

2017-08-21 Thread mabi
Hi Ben,

So it is really a 0 kBytes file everywhere (all nodes including the arbiter and 
from the client).
Here below you will find the output you requested. Hopefully that will help to 
find out why this specific file is not healing... Let me know if you need any 
more information. Btw node3 is my arbiter node.

NODE1:

STAT:
  File: 
‘/data/myvolume/brick/data/appdata_ocpom4nckwru/preview/1344699/64-64-crop.png’
  Size: 0 Blocks: 38 IO Block: 131072 regular empty file
Device: 24h/36d Inode: 10033884Links: 2
Access: (0644/-rw-r--r--)  Uid: (   33/www-data)   Gid: (   33/www-data)
Access: 2017-08-14 17:04:55.530681000 +0200
Modify: 2017-08-14 17:11:46.407404779 +0200
Change: 2017-08-14 17:11:46.407404779 +0200
Birth: -

GETFATTR:
trusted.afr.dirty=0sAQAA
trusted.bit-rot.version=0sAgBZhuknAAlJAg==
trusted.gfid=0sGYXiM9XuTj6lGs8LX58q6g==
trusted.glusterfs.d99af2fa-439b-4a21-bf3a-38f3849f87ec.xtime=0sWZG9sgAGOyo=

NODE2:

STAT:
  File: 
‘/data/myvolume/brick/data/appdata_ocpom4nckwru/preview/1344699/64-64-crop.png’
  Size: 0 Blocks: 38 IO Block: 131072 regular empty file
Device: 26h/38d Inode: 10031330Links: 2
Access: (0644/-rw-r--r--)  Uid: (   33/www-data)   Gid: (   33/www-data)
Access: 2017-08-14 17:04:55.530681000 +0200
Modify: 2017-08-14 17:11:46.403704181 +0200
Change: 2017-08-14 17:11:46.403704181 +0200
Birth: -

GETFATTR:
trusted.afr.dirty=0sAQAA
trusted.bit-rot.version=0sAgBZhu6wAA8Hpw==
trusted.gfid=0sGYXiM9XuTj6lGs8LX58q6g==
trusted.glusterfs.d99af2fa-439b-4a21-bf3a-38f3849f87ec.xtime=0sWZG9sgAGOVE=

NODE3:
STAT:
  File: 
/srv/glusterfs/myvolume/brick/data/appdata_ocpom4nckwru/preview/1344699/64-64-crop.png
  Size: 0 Blocks: 0  IO Block: 4096   regular empty file
Device: ca11h/51729d Inode: 405208959   Links: 2
Access: (0644/-rw-r--r--)  Uid: (   33/www-data)   Gid: (   33/www-data)
Access: 2017-08-14 17:04:55.530681000 +0200
Modify: 2017-08-14 17:04:55.530681000 +0200
Change: 2017-08-14 17:11:46.604380051 +0200
Birth: -

GETFATTR:
trusted.afr.dirty=0sAQAA
trusted.bit-rot.version=0sAgBZe6ejAAKPAg==
trusted.gfid=0sGYXiM9XuTj6lGs8LX58q6g==
trusted.glusterfs.d99af2fa-439b-4a21-bf3a-38f3849f87ec.xtime=0sWZG9sgAGOc4=

CLIENT GLUSTER MOUNT:
STAT:
  File: '/mnt/myvolume/data/appdata_ocpom4nckwru/preview/1344699/64-64-crop.png'
  Size: 0 Blocks: 0  IO Block: 131072 regular empty file
Device: 1eh/30d Inode: 11897049013408443114  Links: 1
Access: (0644/-rw-r--r--)  Uid: (   33/www-data)   Gid: (   33/www-data)
Access: 2017-08-14 17:04:55.530681000 +0200
Modify: 2017-08-14 17:11:46.407404779 +0200
Change: 2017-08-14 17:11:46.407404779 +0200
Birth: -

>  Original Message 
> Subject: Re: [Gluster-users] self-heal not working
> Local Time: August 21, 2017 9:34 PM
> UTC Time: August 21, 2017 7:34 PM
> From: btur...@redhat.com
> To: mabi 
> Gluster Users 
>
> - Original Message -
>> From: "mabi" 
>> To: "Gluster Users" 
>> Sent: Monday, August 21, 2017 9:28:24 AM
>> Subject: [Gluster-users] self-heal not working
>>
>> Hi,
>>
>> I have a replica 2 with arbiter GlusterFS 3.8.11 cluster and there is
>> currently one file listed to be healed as you can see below but never gets
>> healed by the self-heal daemon:
>>
>> Brick node1.domain.tld:/data/myvolume/brick
>> /data/appdata_ocpom4nckwru/preview/1344699/64-64-crop.png
>> Status: Connected
>> Number of entries: 1
>>
>> Brick node2.domain.tld:/data/myvolume/brick
>> /data/appdata_ocpom4nckwru/preview/1344699/64-64-crop.png
>> Status: Connected
>> Number of entries: 1
>>
>> Brick node3.domain.tld:/srv/glusterfs/myvolume/brick
>> /data/appdata_ocpom4nckwru/preview/1344699/64-64-crop.png
>> Status: Connected
>> Number of entries: 1
>>
>> As once recommended on this mailing list I have mounted that glusterfs volume
>> temporarily through fuse/glusterfs and ran a "stat" on that file which is
>> listed above but nothing happened.
>>
>> The file itself is available on all 3 nodes/bricks but on the last node it
>> has a different date. By the way this file is 0 kBytes big. Is that maybe
>> the reason why the self-heal does not work?
>
> Is the file actually 0 bytes or is it just 0 bytes on the arbiter (0 bytes are 
> expected on the arbiter, it just stores metadata)? Can you send us the output 
> from stat on all 3 nodes:
>
> $ stat <file on the brick>
> $ getfattr -d -m - <file on the brick>
> $ stat <file on the client mount>
>
> Let's see what things look like on the back end; it should tell us why healing 
> is failing.
>
> -b
>
>>
>> And how can I now make this file heal?
>>
>> Thanks,
>> Mabi
>>
>>
>>
>>
>> ___
>> Gluster-users mailing list
>> Gluster-users@gluster.org
>> http://lists.gluster.org/mailman/listinfo/gluster-users
___
Gluster-users mailing list
Gluster-users@gluster.org

Re: [Gluster-users] self-heal not working

2017-08-21 Thread Ben Turner
- Original Message -
> From: "mabi" 
> To: "Gluster Users" 
> Sent: Monday, August 21, 2017 9:28:24 AM
> Subject: [Gluster-users] self-heal not working
> 
> Hi,
> 
> I have a replica 2 with arbiter GlusterFS 3.8.11 cluster and there is
> currently one file listed to be healed as you can see below but never gets
> healed by the self-heal daemon:
> 
> Brick node1.domain.tld:/data/myvolume/brick
> /data/appdata_ocpom4nckwru/preview/1344699/64-64-crop.png
> Status: Connected
> Number of entries: 1
> 
> Brick node2.domain.tld:/data/myvolume/brick
> /data/appdata_ocpom4nckwru/preview/1344699/64-64-crop.png
> Status: Connected
> Number of entries: 1
> 
> Brick node3.domain.tld:/srv/glusterfs/myvolume/brick
> /data/appdata_ocpom4nckwru/preview/1344699/64-64-crop.png
> Status: Connected
> Number of entries: 1
> 
> As once recommended on this mailing list I have mounted that glusterfs volume
> temporarily through fuse/glusterfs and ran a "stat" on that file which is
> listed above but nothing happened.
> 
> The file itself is available on all 3 nodes/bricks but on the last node it
> has a different date. By the way this file is 0 kBytes big. Is that maybe
> the reason why the self-heal does not work?

Is the file actually 0 bytes or is it just 0 bytes on the arbiter (0 bytes are 
expected on the arbiter, it just stores metadata)?  Can you send us the output 
from stat on all 3 nodes:

$ stat <file on the brick>
$ getfattr -d -m - <file on the brick>
$ stat <file on the client mount>

Let's see what things look like on the back end; it should tell us why healing 
is failing.
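
Concretely, with the brick and client-mount paths that appear in this thread, that
would be something like the following (adding "-e hex" is just my preference, to
see the raw xattr values):

stat /data/myvolume/brick/data/appdata_ocpom4nckwru/preview/1344699/64-64-crop.png
getfattr -d -m . -e hex /data/myvolume/brick/data/appdata_ocpom4nckwru/preview/1344699/64-64-crop.png
stat /mnt/myvolume/data/appdata_ocpom4nckwru/preview/1344699/64-64-crop.png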

-b

> 
> And how can I now make this file heal?
> 
> Thanks,
> Mabi
> 
> 
> 
> 
> ___
> Gluster-users mailing list
> Gluster-users@gluster.org
> http://lists.gluster.org/mailman/listinfo/gluster-users
___
Gluster-users mailing list
Gluster-users@gluster.org
http://lists.gluster.org/mailman/listinfo/gluster-users


[Gluster-users] Brick count limit in a volume

2017-08-21 Thread Serkan Çoban
Hi,
Gluster version is 3.10.5. I am trying to create a 5500 brick volume,
but getting an error stating that  bricks is the limit. Is this a
known limit? Can I change this with an option?

Thanks,
Serkan
___
Gluster-users mailing list
Gluster-users@gluster.org
http://lists.gluster.org/mailman/listinfo/gluster-users


Re: [Gluster-users] Glusterd not working with systemd in redhat 7

2017-08-21 Thread Atin Mukherjee
The log doesn't indicate that glusterd didn't come up.
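
To double-check on the node itself (assuming a systemd-managed glusterd, as on
RHEL 7 or Ubuntu 16.04), something along these lines should confirm whether the
daemon actually came up:

systemctl status glusterd       # unit state plus the last few journal lines
journalctl -u glusterd -b       # glusterd unit log for the current boot
gluster peer status             # only answers if the local glusterd is running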

@Mohit - could you provide your input on the flooding of EPOLLERR entries
observed here?

On Mon, Aug 21, 2017 at 6:59 PM, Cesar da Silva 
wrote:

> Hi!
>
> Please see below. Note that web1.dasilva.network is the address of the
> local machine where one of the bricks is installed and that tries to mount.
>
> [2017-08-20 20:30:40.359236] I [MSGID: 100030] [glusterfsd.c:2476:main]
> 0-/usr/sbin/glusterd: Started running /usr/sbin/glusterd version 3.11.2
> (args: /usr/sbin/glusterd -p /var/run/glusterd.pid)
> [2017-08-20 20:30:40.973249] I [MSGID: 106478] [glusterd.c:1422:init]
> 0-management: Maximum allowed open file descriptors set to 65536
> [2017-08-20 20:30:40.973303] I [MSGID: 106479] [glusterd.c:1469:init]
> 0-management: Using /var/lib/glusterd as working directory
> [2017-08-20 20:30:41.489229] W [MSGID: 103071] 
> [rdma.c:4591:__gf_rdma_ctx_create]
> 0-rpc-transport/rdma: rdma_cm event channel creation failed [Enheten finns
> inte]
> [2017-08-20 20:30:41.489263] W [MSGID: 103055] [rdma.c:4898:init]
> 0-rdma.management: Failed to initialize IB Device
> [2017-08-20 20:30:41.489270] W [rpc-transport.c:350:rpc_transport_load]
> 0-rpc-transport: 'rdma' initialization failed
> [2017-08-20 20:30:41.489308] W [rpcsvc.c:1660:rpcsvc_create_listener]
> 0-rpc-service: cannot create listener, initing the transport failed
> [2017-08-20 20:30:41.489318] E [MSGID: 106243] [glusterd.c:1693:init]
> 0-management: creation of 1 listeners failed, continuing with succeeded
> transport
> [2017-08-20 20:30:57.917320] I [MSGID: 106513] 
> [glusterd-store.c:2193:glusterd_restore_op_version]
> 0-glusterd: retrieved op-version: 31100
> [2017-08-20 20:31:01.785150] I [MSGID: 106498] [glusterd-handler.c:3602:
> glusterd_friend_add_from_peerinfo] 0-management: connect returned 0
> [2017-08-20 20:31:01.827584] I [MSGID: 106498] [glusterd-handler.c:3602:
> glusterd_friend_add_from_peerinfo] 0-management: connect returned 0
> [2017-08-20 20:31:01.827639] W [MSGID: 106062] [glusterd-handler.c:3399:
> glusterd_transport_inet_options_build] 0-glusterd: Failed to get
> tcp-user-timeout
> [2017-08-20 20:31:01.827678] I [rpc-clnt.c:1059:rpc_clnt_connection_init]
> 0-management: setting frame-timeout to 600
> [2017-08-20 20:31:01.827752] W [MSGID: 101002] [options.c:954:xl_opt_validate]
> 0-management: option 'address-family' is deprecated, preferred is
> 'transport.address-family', continuing with correction
> [2017-08-20 20:31:01.828546] W [MSGID: 106062] [glusterd-handler.c:3399:
> glusterd_transport_inet_options_build] 0-glusterd: Failed to get
> tcp-user-timeout
> [2017-08-20 20:31:01.828568] I [rpc-clnt.c:1059:rpc_clnt_connection_init]
> 0-management: setting frame-timeout to 600
> [2017-08-20 20:31:01.828623] W [MSGID: 101002] [options.c:954:xl_opt_validate]
> 0-management: option 'address-family' is deprecated, preferred is
> 'transport.address-family', continuing with correction
> [2017-08-20 20:31:01.881962] I [MSGID: 106544]
> [glusterd.c:158:glusterd_uuid_init] 0-management: retrieved UUID:
> b5718e49-db55-4f01-8839-01b3b257b8b2
> Final graph:
> +---
> ---+
>   1: volume management
>   2: type mgmt/glusterd
>   3: option rpc-auth.auth-glusterfs on
>   4: option rpc-auth.auth-unix on
>   5: option rpc-auth.auth-null on
>   6: option rpc-auth-allow-insecure on
>   7: option transport.socket.listen-backlog 128
>   8: option event-threads 1
>   9: option ping-timeout 0
>  10: option transport.socket.read-fail-log off
>  11: option transport.socket.keepalive-interval 2
>  12: option transport.socket.keepalive-time 10
>  13: option transport-type rdma
>  14: option working-directory /var/lib/glusterd
>  15: end-volume
>  16:
> +---
> ---+
> [2017-08-20 20:31:01.888009] I [MSGID: 101190] 
> [event-epoll.c:602:event_dispatch_epoll_worker]
> 0-epoll: Started thread with index 1
> [2017-08-20 20:31:01.891406] W [rpcsvc.c:265:rpcsvc_program_actor]
> 0-rpc-service: RPC program not available (req 1298437 330) for
> 192.168.1.63:49148
> [2017-08-20 20:31:01.891429] E [rpcsvc.c:557:rpcsvc_check_and_reply_error]
> 0-rpcsvc: rpc actor failed to complete successfully
> [2017-08-20 20:31:02.746317] I [MSGID: 106493] 
> [glusterd-rpc-ops.c:485:__glusterd_friend_add_cbk]
> 0-glusterd: Received ACC from uuid: 8f66df4a-e286-4c63-9b0b-257c1ccd08b0,
> host: web3.dasilva.network, port: 0
> [2017-08-20 20:31:02.890721] I [rpc-clnt.c:1059:rpc_clnt_connection_init]
> 0-nfs: setting frame-timeout to 600
> [2017-08-20 20:31:02.958990] I [MSGID: 106132] 
> [glusterd-proc-mgmt.c:83:glusterd_proc_stop]
> 0-management: nfs already stopped
> [2017-08-20 20:31:02.959047] I [MSGID: 106568] 
> [glusterd-svc-mgmt.c:228:glusterd_svc_stop]
> 0-management: nfs service is stopped
> [2017-08-20 

Re: [Gluster-users] Glusterd not working with systemd in redhat 7

2017-08-21 Thread Cesar da Silva
Hi!

Please see below. Note that web1.dasilva.network is the address of the
local machine where one of the bricks is installed and that tries to mount.

[2017-08-20 20:30:40.359236] I [MSGID: 100030] [glusterfsd.c:2476:main]
0-/usr/sbin/glusterd: Started running /usr/sbin/glusterd version 3.11.2
(args: /usr/sbin/glusterd -p /var/run/glusterd.pid)
[2017-08-20 20:30:40.973249] I [MSGID: 106478] [glusterd.c:1422:init]
0-management: Maximum allowed open file descriptors set to 65536
[2017-08-20 20:30:40.973303] I [MSGID: 106479] [glusterd.c:1469:init]
0-management: Using /var/lib/glusterd as working directory
[2017-08-20 20:30:41.489229] W [MSGID: 103071]
[rdma.c:4591:__gf_rdma_ctx_create] 0-rpc-transport/rdma: rdma_cm event
channel creation failed [Enheten finns inte]
[2017-08-20 20:30:41.489263] W [MSGID: 103055] [rdma.c:4898:init]
0-rdma.management: Failed to initialize IB Device
[2017-08-20 20:30:41.489270] W [rpc-transport.c:350:rpc_transport_load]
0-rpc-transport: 'rdma' initialization failed
[2017-08-20 20:30:41.489308] W [rpcsvc.c:1660:rpcsvc_create_listener]
0-rpc-service: cannot create listener, initing the transport failed
[2017-08-20 20:30:41.489318] E [MSGID: 106243] [glusterd.c:1693:init]
0-management: creation of 1 listeners failed, continuing with succeeded
transport
[2017-08-20 20:30:57.917320] I [MSGID: 106513]
[glusterd-store.c:2193:glusterd_restore_op_version] 0-glusterd: retrieved
op-version: 31100
[2017-08-20 20:31:01.785150] I [MSGID: 106498]
[glusterd-handler.c:3602:glusterd_friend_add_from_peerinfo] 0-management:
connect returned 0
[2017-08-20 20:31:01.827584] I [MSGID: 106498]
[glusterd-handler.c:3602:glusterd_friend_add_from_peerinfo] 0-management:
connect returned 0
[2017-08-20 20:31:01.827639] W [MSGID: 106062]
[glusterd-handler.c:3399:glusterd_transport_inet_options_build] 0-glusterd:
Failed to get tcp-user-timeout
[2017-08-20 20:31:01.827678] I [rpc-clnt.c:1059:rpc_clnt_connection_init]
0-management: setting frame-timeout to 600
[2017-08-20 20:31:01.827752] W [MSGID: 101002]
[options.c:954:xl_opt_validate] 0-management: option 'address-family' is
deprecated, preferred is 'transport.address-family', continuing with
correction
[2017-08-20 20:31:01.828546] W [MSGID: 106062]
[glusterd-handler.c:3399:glusterd_transport_inet_options_build] 0-glusterd:
Failed to get tcp-user-timeout
[2017-08-20 20:31:01.828568] I [rpc-clnt.c:1059:rpc_clnt_connection_init]
0-management: setting frame-timeout to 600
[2017-08-20 20:31:01.828623] W [MSGID: 101002]
[options.c:954:xl_opt_validate] 0-management: option 'address-family' is
deprecated, preferred is 'transport.address-family', continuing with
correction
[2017-08-20 20:31:01.881962] I [MSGID: 106544]
[glusterd.c:158:glusterd_uuid_init] 0-management: retrieved UUID:
b5718e49-db55-4f01-8839-01b3b257b8b2
Final graph:
+--+
  1: volume management
  2: type mgmt/glusterd
  3: option rpc-auth.auth-glusterfs on
  4: option rpc-auth.auth-unix on
  5: option rpc-auth.auth-null on
  6: option rpc-auth-allow-insecure on
  7: option transport.socket.listen-backlog 128
  8: option event-threads 1
  9: option ping-timeout 0
 10: option transport.socket.read-fail-log off
 11: option transport.socket.keepalive-interval 2
 12: option transport.socket.keepalive-time 10
 13: option transport-type rdma
 14: option working-directory /var/lib/glusterd
 15: end-volume
 16:
+--+
[2017-08-20 20:31:01.888009] I [MSGID: 101190]
[event-epoll.c:602:event_dispatch_epoll_worker] 0-epoll: Started thread
with index 1
[2017-08-20 20:31:01.891406] W [rpcsvc.c:265:rpcsvc_program_actor]
0-rpc-service: RPC program not available (req 1298437 330) for
192.168.1.63:49148
[2017-08-20 20:31:01.891429] E [rpcsvc.c:557:rpcsvc_check_and_reply_error]
0-rpcsvc: rpc actor failed to complete successfully
[2017-08-20 20:31:02.746317] I [MSGID: 106493]
[glusterd-rpc-ops.c:485:__glusterd_friend_add_cbk] 0-glusterd: Received ACC
from uuid: 8f66df4a-e286-4c63-9b0b-257c1ccd08b0, host:
web3.dasilva.network, port: 0
[2017-08-20 20:31:02.890721] I [rpc-clnt.c:1059:rpc_clnt_connection_init]
0-nfs: setting frame-timeout to 600
[2017-08-20 20:31:02.958990] I [MSGID: 106132]
[glusterd-proc-mgmt.c:83:glusterd_proc_stop] 0-management: nfs already
stopped
[2017-08-20 20:31:02.959047] I [MSGID: 106568]
[glusterd-svc-mgmt.c:228:glusterd_svc_stop] 0-management: nfs service is
stopped
[2017-08-20 20:31:02.987001] I [rpc-clnt.c:1059:rpc_clnt_connection_init]
0-glustershd: setting frame-timeout to 600
[2017-08-20 20:31:03.022959] I [MSGID: 106132]
[glusterd-proc-mgmt.c:83:glusterd_proc_stop] 0-management: glustershd
already stopped
[2017-08-20 20:31:03.023014] I [MSGID: 106568]
[glusterd-svc-mgmt.c:228:glusterd_svc_stop] 0-management: glustershd
service is stopped
[2017-08-20 20:31:03.023084] I [MSGID: 106567]

[Gluster-users] self-heal not working

2017-08-21 Thread mabi
Hi,

I have a replica 2 with arbiter GlusterFS 3.8.11 cluster and there is 
currently one file listed to be healed as you can see below but never gets 
healed by the self-heal daemon:

Brick node1.domain.tld:/data/myvolume/brick
/data/appdata_ocpom4nckwru/preview/1344699/64-64-crop.png
Status: Connected
Number of entries: 1

Brick node2.domain.tld:/data/myvolume/brick
/data/appdata_ocpom4nckwru/preview/1344699/64-64-crop.png
Status: Connected
Number of entries: 1

Brick node3.domain.tld:/srv/glusterfs/myvolume/brick
/data/appdata_ocpom4nckwru/preview/1344699/64-64-crop.png
Status: Connected
Number of entries: 1

As once recommended on this mailing list, I have mounted that glusterfs volume 
temporarily through fuse/glusterfs and ran a "stat" on the file listed above, 
but nothing happened.
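
Roughly, that manual trigger looked like this (server and mount point as used
elsewhere in this thread):

mount -t glusterfs node1.domain.tld:/myvolume /mnt/myvolume
stat /mnt/myvolume/data/appdata_ocpom4nckwru/preview/1344699/64-64-crop.png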

The file itself is available on all 3 nodes/bricks but on the last node it has 
a different date. By the way this file is 0 kBytes big. Is that maybe the 
reason why the self-heal does not work?

And how can I now make this file heal?

Thanks,
Mabi
___
Gluster-users mailing list
Gluster-users@gluster.org
http://lists.gluster.org/mailman/listinfo/gluster-users

Re: [Gluster-users] Glusterd not working with systemd in redhat 7

2017-08-21 Thread Atin Mukherjee
On Mon, Aug 21, 2017 at 2:49 AM, Cesar da Silva 
wrote:

> Hi!
> I am having the same issue but I am running Ubuntu v16.04.
> It does not mount during boot, but works if I mount it manually. I am
> running the Gluster-server on the same machines (3 machines).
> Here is the /etc/fstab file
>
> /dev/sdb1 /data/gluster ext4 defaults 0 0
>
> web1.dasilva.network:/www /mnt/glusterfs/www glusterfs
> defaults,_netdev,log-level=debug,log-file=/var/log/gluster.log 0 0
> web1.dasilva.network:/etc /mnt/glusterfs/etc glusterfs
> defaults,_netdev,log-level=debug,log-file=/var/log/gluster.log 0 0
>
> Here is the logfile /var/log/gluster.log
>

Could you point us to the glusterd log file content?


> --8<--start of log file  --
> [2017-08-20 20:30:39.638989] I [MSGID: 100030] [glusterfsd.c:2476:main]
> 0-/usr/sbin/glusterfs: Started running /usr/sbin/glusterfs version 3.11.2
> (args: /usr/sbin/glusterfs --log-level=DEBUG --log-file=/var/log/gluster.log
> --volfile-server=web1.dasilva.network --volfile-id=/www
> /mnt/glusterfs/www)
> [2017-08-20 20:30:39.639024] I [MSGID: 100030] [glusterfsd.c:2476:main]
> 0-/usr/sbin/glusterfs: Started running /usr/sbin/glusterfs version 3.11.2
> (args: /usr/sbin/glusterfs --log-level=DEBUG --log-file=/var/log/gluster.log
> --volfile-server=web1.dasilva.network --volfile-id=/etc
> /mnt/glusterfs/etc)
> [2017-08-20 20:30:39.926333] D [MSGID: 0] 
> [glusterfsd.c:442:set_fuse_mount_options]
> 0-glusterfsd: fopen-keep-cache mode 2
> [2017-08-20 20:30:39.926376] D [MSGID: 0] 
> [glusterfsd.c:506:set_fuse_mount_options]
> 0-glusterfsd: fuse direct io type 2
> [2017-08-20 20:30:39.926389] D [MSGID: 0] 
> [glusterfsd.c:530:set_fuse_mount_options]
> 0-glusterfsd: fuse no-root-squash mode 0
> [2017-08-20 20:30:39.926486] D [MSGID: 0] 
> [options.c:1224:xlator_option_init_double]
> 0-fuse: option negative-timeout using set value 0.00
> [2017-08-20 20:30:39.926532] D [MSGID: 0] 
> [glusterfsd.c:442:set_fuse_mount_options]
> 0-glusterfsd: fopen-keep-cache mode 2
> [2017-08-20 20:30:39.926557] D [MSGID: 0] 
> [glusterfsd.c:506:set_fuse_mount_options]
> 0-glusterfsd: fuse direct io type 2
> [2017-08-20 20:30:39.926569] D [MSGID: 0] 
> [glusterfsd.c:530:set_fuse_mount_options]
> 0-glusterfsd: fuse no-root-squash mode 0
> [2017-08-20 20:30:39.926568] D [MSGID: 0] 
> [options.c:1221:xlator_option_init_bool]
> 0-fuse: option no-root-squash using set value disable
> [2017-08-20 20:30:39.926649] D [MSGID: 0] 
> [options.c:1224:xlator_option_init_double]
> 0-fuse: option negative-timeout using set value 0.00
> [2017-08-20 20:30:39.926726] D [MSGID: 0] 
> [options.c:1221:xlator_option_init_bool]
> 0-fuse: option no-root-squash using set value disable
> [2017-08-20 20:30:39.927105] D [logging.c:1790:__gf_log_inject_timer_event]
> 0-logging-infra: Starting timer now. Timeout = 120, current buf size = 5
> [2017-08-20 20:30:39.927359] D [rpc-clnt.c:1062:rpc_clnt_connection_init]
> 0-glusterfs: defaulting frame-timeout to 30mins
> [2017-08-20 20:30:39.927373] D [rpc-clnt.c:1076:rpc_clnt_connection_init]
> 0-glusterfs: disable ping-timeout
> [2017-08-20 20:30:39.927385] D [rpc-transport.c:279:rpc_transport_load]
> 0-rpc-transport: attempt to load file /usr/lib/x86_64-linux-gnu/
> glusterfs/3.11.2/rpc-transport/socket.so
> [2017-08-20 20:30:39.927774] D [logging.c:1790:__gf_log_inject_timer_event]
> 0-logging-infra: Starting timer now. Timeout = 120, current buf size = 5
> [2017-08-20 20:30:39.927994] D [rpc-clnt.c:1062:rpc_clnt_connection_init]
> 0-glusterfs: defaulting frame-timeout to 30mins
> [2017-08-20 20:30:39.928008] D [rpc-clnt.c:1076:rpc_clnt_connection_init]
> 0-glusterfs: disable ping-timeout
> [2017-08-20 20:30:39.928019] D [rpc-transport.c:279:rpc_transport_load]
> 0-rpc-transport: attempt to load file /usr/lib/x86_64-linux-gnu/
> glusterfs/3.11.2/rpc-transport/socket.so
> [2017-08-20 20:30:40.380267] W [MSGID: 101002] [options.c:954:xl_opt_validate]
> 0-glusterfs: option 'address-family' is deprecated, preferred is
> 'transport.address-family', continuing with correction
> [2017-08-20 20:30:40.380391] D [socket.c:4163:socket_init] 0-glusterfs:
> Configued transport.tcp-user-timeout=0
> [2017-08-20 20:30:40.380403] D [socket.c:4181:socket_init] 0-glusterfs:
> Reconfigued transport.keepalivecnt=9
> [2017-08-20 20:30:40.380410] D [socket.c:4264:socket_init] 0-glusterfs:
> SSL support on the I/O path is NOT enabled
> [2017-08-20 20:30:40.380417] D [socket.c:4267:socket_init] 0-glusterfs:
> SSL support for glusterd is NOT enabled
> [2017-08-20 20:30:40.380423] D [socket.c:4284:socket_init] 0-glusterfs:
> using system polling thread
> [2017-08-20 20:30:40.380432] D [rpc-clnt.c:1581:rpcclnt_cbk_program_register]
> 0-glusterfs: New program registered: GlusterFS Callback, Num: 52743234,
> Ver: 1
> [2017-08-20 20:30:40.382224] W [MSGID: 101002] [options.c:954:xl_opt_validate]
> 0-glusterfs: option 'address-family' is deprecated, preferred is
> 

Re: [Gluster-users] gverify.sh purpose

2017-08-21 Thread mabi
Thanks Saravana for your quick answer. I was wondering because I have an issue 
where on my master geo-rep cluster I run a self-compiled version of GlusterFS 
from git, while on my slave geo-replication node I run an official release. As 
such the versions do not match and the gverify.sh script fails. I posted a mail 
to the gluster-devel mailing list yesterday, as I think this has more to do 
with compilation.

>  Original Message 
> Subject: Re: [Gluster-users] gverify.sh purpose
> Local Time: August 21, 2017 10:39 AM
> UTC Time: August 21, 2017 8:39 AM
> From: sarum...@redhat.com
> To: mabi , Gluster Users 
>
> On Saturday 19 August 2017 02:05 AM, mabi wrote:
>
>> Hi,
>>
>> When creating a geo-replication session, is the gverify.sh script actually 
>> used or run?
>
> Yes, It is executed as part of geo-replication session creation.
>
>> or is gverify.sh just an ad-hoc command to test manually whether creating a 
>> geo-replication session would succeed?
>
> No need to run separately
>
> ~
> Saravana
___
Gluster-users mailing list
Gluster-users@gluster.org
http://lists.gluster.org/mailman/listinfo/gluster-users

Re: [Gluster-users] gverify.sh purpose

2017-08-21 Thread Saravanakumar Arumugam



On Saturday 19 August 2017 02:05 AM, mabi wrote:

Hi,

When creating a geo-replication session, is the gverify.sh script actually 
used or run?


Yes, it is executed as part of geo-replication session creation.

or is gverify.sh just an ad-hoc command to test manually whether creating a 
geo-replication session would succeed?




No need to run it separately.
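
For context, gverify.sh is invoked behind the scenes by the session-creation
command, i.e. something like the following (volume and host names are
placeholders); as far as I understand it validates the slave side (reachability,
free space, gluster version) before the session is created:

gluster volume geo-replication <mastervol> <slavehost>::<slavevol> create push-pem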


~
Saravana


___
Gluster-users mailing list
Gluster-users@gluster.org
http://lists.gluster.org/mailman/listinfo/gluster-users