Re: [Gluster-users] [Gluster-devel] Query!

2016-05-22 Thread Atin Mukherjee
Yes, you'd need to construct everything from scratch in this case.

-Atin
Sent from one plus one
On 23-May-2016 8:33 AM, "ABHISHEK PALIWAL"  wrote:

> Hi Atin,
>
> Thanks for your reply. But if we fall into this situation, what is the
> solution to recover from here? I have already removed /var/log/glusterd from
> one peer; do I need to remove /var/log/glusterd from both of the peers?
>
> Regards,
> Abhishek
>
> On Fri, May 20, 2016 at 8:39 PM, Atin Mukherjee <
> atin.mukherje...@gmail.com> wrote:
>
>> -Atin
>> Sent from one plus one
>> On 20-May-2016 5:34 PM, "ABHISHEK PALIWAL" 
>> wrote:
>> >
>> > Actually we have some other files related to system initial
>> configuration for that we
>> > need to format the volume where these bricks are also created and after
>> this we are
>> > facing some abnormal behavior in gluster and some failure logs like
>> volume ID mismatch something.
>> >
>> > That is why I am asking this is the right way to format volume where
>> bricks are created.
>>
>> No, certainly not. If you format your brick, you lose the data as well as
>> all the extended attributes. In this case your volume is bound to behave
>> abnormally.
>>
>> >
>> > and also is there any link between /var/lib/glusterd and xattr stored
>> in .glusterfs directory at brick path.
>> >
>> > Regards,
>> > Abhishek
>> >
>> > On Fri, May 20, 2016 at 5:25 PM, Atin Mukherjee 
>> wrote:
>> >>
>> >> And most importantly why would you do that? What's your use case
>> Abhishek?
>> >>
>> >> On 05/20/2016 05:03 PM, Lindsay Mathieson wrote:
>> >> > On 20/05/2016 8:37 PM, ABHISHEK PALIWAL wrote:
>> >> >> I am not getting any failure and after restart the glusterd when I
>> run
>> >> >> volume info command it creates the brick directory
>> >> >> as well as .glusterfs (xattrs).
>> >> >>
>> >> >> but some time even after restart the glusterd, volume info command
>> >> >> showing no volume present.
>> >> >>
>> >> >> Could you please tell me why this unpredictable problem is
>> occurring.
>> >> >>
>> >> >
>> >> > Because as stated earlier you erase all the information about the
>> >> > brick?  How is this unpredictable?
>> >> >
>> >> >
>> >> > If you want to delete and recreate a brick you should have used the
>> >> > remove-brick/add-brick commands.
>> >> >
>> >> > --
>> >> > Lindsay Mathieson
>> >> >
>> >> >
>> >> >
>> >> > ___
>> >> > Gluster-users mailing list
>> >> > Gluster-users@gluster.org
>> >> > http://www.gluster.org/mailman/listinfo/gluster-users
>> >> >
>> >
>> >
>> >
>> >
>> > --
>> >
>> >
>> >
>> >
>> > Regards
>> > Abhishek Paliwal
>> >
>> > ___
>> > Gluster-users mailing list
>> > Gluster-users@gluster.org
>> > http://www.gluster.org/mailman/listinfo/gluster-users
>>
>> -Atin
>> Sent from one plus one
>>
>
>
>
> --
>
>
>
>
> Regards
> Abhishek Paliwal
>
___
Gluster-users mailing list
Gluster-users@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-users

Re: [Gluster-users] [Gluster-devel] Query!

2016-05-22 Thread ABHISHEK PALIWAL
Hi Atin,

Thanks for your reply. But if we fall into this situation, what is the
solution to recover from here? I have already removed /var/log/glusterd from
one peer; do I need to remove /var/log/glusterd from both of the peers?

Regards,
Abhishek

On Fri, May 20, 2016 at 8:39 PM, Atin Mukherjee 
wrote:

> -Atin
> Sent from one plus one
> On 20-May-2016 5:34 PM, "ABHISHEK PALIWAL" 
> wrote:
> >
> > Actually we have some other files related to system initial
> configuration for that we
> > need to format the volume where these bricks are also created and after
> this we are
> > facing some abnormal behavior in gluster and some failure logs like
> volume ID mismatch something.
> >
> > That is why I am asking this is the right way to format volume where
> bricks are created.
>
> No, certainly not. If you format your brick, you lose the data as well as
> all the extended attributes. In this case your volume is bound to behave
> abnormally.
>
> >
> > and also is there any link between /var/lib/glusterd and xattr stored in
> .glusterfs directory at brick path.
> >
> > Regards,
> > Abhishek
> >
> > On Fri, May 20, 2016 at 5:25 PM, Atin Mukherjee 
> wrote:
> >>
> >> And most importantly why would you do that? What's your use case
> Abhishek?
> >>
> >> On 05/20/2016 05:03 PM, Lindsay Mathieson wrote:
> >> > On 20/05/2016 8:37 PM, ABHISHEK PALIWAL wrote:
> >> >> I am not getting any failure and after restart the glusterd when I
> run
> >> >> volume info command it creates the brick directory
> >> >> as well as .glusterfs (xattrs).
> >> >>
> >> >> but some time even after restart the glusterd, volume info command
> >> >> showing no volume present.
> >> >>
> >> >> Could you please tell me why this unpredictable problem is occurring.
> >> >>
> >> >
> >> > Because as stated earlier you erase all the information about the
> >> > brick?  How is this unpredictable?
> >> >
> >> >
> >> > If you want to delete and recreate a brick you should have used the
> >> > remove-brick/add-brick commands.
> >> >
> >> > --
> >> > Lindsay Mathieson
> >> >
> >> >
> >> >
> >> > ___
> >> > Gluster-users mailing list
> >> > Gluster-users@gluster.org
> >> > http://www.gluster.org/mailman/listinfo/gluster-users
> >> >
> >
> >
> >
> >
> > --
> >
> >
> >
> >
> > Regards
> > Abhishek Paliwal
> >
> > ___
> > Gluster-users mailing list
> > Gluster-users@gluster.org
> > http://www.gluster.org/mailman/listinfo/gluster-users
>
> -Atin
> Sent from one plus one
>



-- 




Regards
Abhishek Paliwal
___
Gluster-users mailing list
Gluster-users@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-users

Re: [Gluster-users] [Gluster-devel] Query!

2016-05-20 Thread Atin Mukherjee
-Atin
Sent from one plus one
On 20-May-2016 5:34 PM, "ABHISHEK PALIWAL"  wrote:
>
> Actually, we have some other files related to the system's initial
> configuration, and for those we need to format the volume where these
> bricks are also created. After this we are facing some abnormal behavior in
> gluster and some failure logs, like a volume ID mismatch.
>
> That is why I am asking whether this is the right way, to format the volume
> where the bricks are created.

No, certainly not. If you format your brick, you lose the data as well as
all the extended attributes. In this case your volume is bound to behave
abnormally.
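
As a rough way to see the mismatch described above (a sketch, assuming the
volume name and brick path that appear elsewhere in this archive): glusterd
keeps its copy of the volume UUID in /var/lib/glusterd/vols/<volname>/info,
while the brick root carries it in the trusted.glusterfs.volume-id xattr, and
the two are expected to match. Formatting the brick wipes the xattr side of
that pair, which is one way to end up with "volume ID mismatch" logs.

  grep volume-id /var/lib/glusterd/vols/c_glusterfs/info
  getfattr -n trusted.glusterfs.volume-id -e hex /opt/lvmdir/c2/brick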
>
> Also, is there any link between /var/lib/glusterd and the xattrs stored in
> the .glusterfs directory at the brick path?
>
> Regards,
> Abhishek
>
> On Fri, May 20, 2016 at 5:25 PM, Atin Mukherjee 
wrote:
>>
>> And most importantly why would you do that? What's your use case
Abhishek?
>>
>> On 05/20/2016 05:03 PM, Lindsay Mathieson wrote:
>> > On 20/05/2016 8:37 PM, ABHISHEK PALIWAL wrote:
>> >> I am not getting any failure and after restart the glusterd when I run
>> >> volume info command it creates the brick directory
>> >> as well as .glusterfs (xattrs).
>> >>
>> >> but some time even after restart the glusterd, volume info command
>> >> showing no volume present.
>> >>
>> >> Could you please tell me why this unpredictable problem is occurring.
>> >>
>> >
>> > Because as stated earlier you erase all the information about the
>> > brick?  How is this unpredictable?
>> >
>> >
>> > If you want to delete and recreate a brick you should have used the
>> > remove-brick/add-brick commands.
>> >
>> > --
>> > Lindsay Mathieson
>> >
>> >
>> >
>> > ___
>> > Gluster-users mailing list
>> > Gluster-users@gluster.org
>> > http://www.gluster.org/mailman/listinfo/gluster-users
>> >
>
>
>
>
> --
>
>
>
>
> Regards
> Abhishek Paliwal
>
> ___
> Gluster-users mailing list
> Gluster-users@gluster.org
> http://www.gluster.org/mailman/listinfo/gluster-users

-Atin
Sent from one plus one
___
Gluster-users mailing list
Gluster-users@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-users

Re: [Gluster-users] [Gluster-devel] Query!

2016-05-20 Thread ABHISHEK PALIWAL
Actually, we have some other files related to the system's initial
configuration, and for those we need to format the volume where these bricks
are also created. After this we are facing some abnormal behavior in gluster
and some failure logs, like a volume ID mismatch.

That is why I am asking whether this is the right way, to format the volume
where the bricks are created.

Also, is there any link between /var/lib/glusterd and the xattrs stored in
the .glusterfs directory at the brick path?

Regards,
Abhishek

On Fri, May 20, 2016 at 5:25 PM, Atin Mukherjee  wrote:

> And most importantly why would you do that? What's your use case Abhishek?
>
> On 05/20/2016 05:03 PM, Lindsay Mathieson wrote:
> > On 20/05/2016 8:37 PM, ABHISHEK PALIWAL wrote:
> >> I am not getting any failure and after restart the glusterd when I run
> >> volume info command it creates the brick directory
> >> as well as .glusterfs (xattrs).
> >>
> >> but some time even after restart the glusterd, volume info command
> >> showing no volume present.
> >>
> >> Could you please tell me why this unpredictable problem is occurring.
> >>
> >
> > Because as stated earlier you erase all the information about the
> > brick?  How is this unpredictable?
> >
> >
> > If you want to delete and recreate a brick you should have used the
> > remove-brick/add-brick commands.
> >
> > --
> > Lindsay Mathieson
> >
> >
> >
> > ___
> > Gluster-users mailing list
> > Gluster-users@gluster.org
> > http://www.gluster.org/mailman/listinfo/gluster-users
> >
>



-- 




Regards
Abhishek Paliwal
___
Gluster-users mailing list
Gluster-users@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-users

Re: [Gluster-users] [Gluster-devel] Query!

2016-05-20 Thread Atin Mukherjee
And most importantly why would you do that? What's your use case Abhishek?

On 05/20/2016 05:03 PM, Lindsay Mathieson wrote:
> On 20/05/2016 8:37 PM, ABHISHEK PALIWAL wrote:
>> I am not getting any failure and after restart the glusterd when I run
>> volume info command it creates the brick directory
>> as well as .glusterfs (xattrs).
>>
>> but some time even after restart the glusterd, volume info command
>> showing no volume present.
>>
>> Could you please tell me why this unpredictable problem is occurring.
>>
> 
> Because as stated earlier you erase all the information about the
> brick?  How is this unpredictable?
> 
> 
> If you want to delete and recreate a brick you should have used the
> remove-brick/add-brick commands.
> 
> -- 
> Lindsay Mathieson
> 
> 
> 
> ___
> Gluster-users mailing list
> Gluster-users@gluster.org
> http://www.gluster.org/mailman/listinfo/gluster-users
> 
___
Gluster-users mailing list
Gluster-users@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-users


Re: [Gluster-users] [Gluster-devel] Query!

2016-05-20 Thread Lindsay Mathieson

On 20/05/2016 8:37 PM, ABHISHEK PALIWAL wrote:
I am not getting any failure, and after restarting glusterd, when I run the
volume info command it creates the brick directory as well as .glusterfs
(xattrs).

But sometimes, even after restarting glusterd, the volume info command shows
no volume present.

Could you please tell me why this unpredictable problem is occurring.



Because, as stated earlier, you erased all the information about the
brick?  How is this unpredictable?



If you want to delete and recreate a brick you should have used the 
remove-brick/add-brick commands.
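
A minimal sketch of that flow for a 1 x 2 replica volume (volume and brick
names taken from elsewhere in this archive; exact options can differ between
gluster releases, and here the brick on 10.32.1.144 is assumed to be the one
being replaced):

  gluster volume remove-brick c_glusterfs replica 1 10.32.1.144:/opt/lvmdir/c2/brick force
  # recreate or reformat the brick directory on that node, then re-add it
  gluster volume add-brick c_glusterfs replica 2 10.32.1.144:/opt/lvmdir/c2/brick
  # trigger a full heal onto the re-added brick
  gluster volume heal c_glusterfs full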


--
Lindsay Mathieson

___
Gluster-users mailing list
Gluster-users@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-users

Re: [Gluster-users] [Gluster-devel] Query!

2016-05-20 Thread ABHISHEK PALIWAL
I am not getting any failure, and after restarting glusterd, when I run the
volume info command it creates the brick directory
as well as .glusterfs (xattrs).

But sometimes, even after restarting glusterd, the volume info command shows
no volume present.

Could you please tell me why this unpredictable problem is occurring.

Regards,
Abhishek

On Fri, May 20, 2016 at 3:50 PM, Kaushal M  wrote:

> This would erase the xattrs set on the brick root (volume-id), which
> identify it as a brick. Brick processes will fail to start when this
> xattr isn't present.
>
>
> On Fri, May 20, 2016 at 3:42 PM, ABHISHEK PALIWAL
>  wrote:
> > Hi
> >
> > What will happen if we format the volume where the bricks of a replicated
> > gluster volume are created and restart glusterd on both nodes?
> >
> > Will it work fine, or in this case do we need to remove the
> > /var/lib/glusterd directory as well?
> >
> > --
> > Regards
> > Abhishek Paliwal
> >
> > ___
> > Gluster-devel mailing list
> > gluster-de...@gluster.org
> > http://www.gluster.org/mailman/listinfo/gluster-devel
>



-- 




Regards
Abhishek Paliwal
___
Gluster-users mailing list
Gluster-users@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-users

Re: [Gluster-users] [Gluster-devel] Query!

2016-05-20 Thread Kaushal M
This would erase the xattrs set on the brick root (volume-id), which
identify it as a brick. Brick processes will fail to start when this
xattr isn't present.
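
For reference, a sketch of how that xattr can be checked, and put back when
the brick contents are otherwise intact (the UUID must be the one reported by
'gluster volume info'; the example below reuses the volume UUID and brick path
that appear elsewhere in this archive). On a freshly formatted brick this is
not enough, since the data is gone; remove-brick/add-brick is the usual route
there.

  getfattr -n trusted.glusterfs.volume-id -e hex /opt/lvmdir/c2/brick
  setfattr -n trusted.glusterfs.volume-id \
    -v 0xc6a61455d37848bfad407a3ce897fc9c /opt/lvmdir/c2/brick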


On Fri, May 20, 2016 at 3:42 PM, ABHISHEK PALIWAL
 wrote:
> Hi
>
> What will happen if we format the volume where the bricks of a replicated
> gluster volume are created and restart glusterd on both nodes?
>
> Will it work fine, or in this case do we need to remove the /var/lib/glusterd
> directory as well?
>
> --
> Regards
> Abhishek Paliwal
>
> ___
> Gluster-devel mailing list
> gluster-de...@gluster.org
> http://www.gluster.org/mailman/listinfo/gluster-devel
___
Gluster-users mailing list
Gluster-users@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-users


Re: [Gluster-users] [Gluster-devel] Query on healing process

2016-03-14 Thread ABHISHEK PALIWAL
Then how can I resolve this issue?

On Mon, Mar 14, 2016 at 1:37 PM, Ravishankar N 
wrote:

> On 03/14/2016 10:36 AM, ABHISHEK PALIWAL wrote:
>
> Hi Ravishankar,
>
> I just want to inform you that this file has some properties that differ from
> other files: it has a fixed size, and when there is no space left in the file,
> the next data starts wrapping from the top of the file.
>
> That is, in this file we are wrapping the data as well.
>
> So, I just want to know whether this feature of the file will affect gluster's
> ability to identify the split-brain or the xattr attributes?
>
> Hi,
> No it shouldn't matter at what offset the writes happen. The xattrs only
> track that the write was  missed  (and therefore a pending heal),
> irrespective of (offset, length).
> Ravi
>
>
>
> Regards,
> Abhishek
>
> On Fri, Mar 4, 2016 at 7:00 PM, ABHISHEK PALIWAL <
> abhishpali...@gmail.com> wrote:
>
>>
>>
>> On Fri, Mar 4, 2016 at 6:36 PM, Ravishankar N < 
>> ravishan...@redhat.com> wrote:
>>
>>> On 03/04/2016 06:23 PM, ABHISHEK PALIWAL wrote:
>>>
>>>
 Ok, just to confirm, glusterd  and other brick processes are running
 after this node rebooted?
 When you run the above command, you need to check
 /var/log/glusterfs/glfsheal-volname.log logs errros. Setting
 client-log-level to DEBUG would give you a more verbose message

 Yes, glusterd and other brick processes running fine. I have check the
>>> /var/log/glusterfs/glfsheal-volname.log file without the log-level= DEBUG.
>>> Here is the logs from that file
>>>
>>> [2016-03-02 13:51:39.059440] I [MSGID: 101190]
>>> [event-epoll.c:632:event_dispatch_epoll_worker] 0-epoll: Started thread
>>> with index 1
>>> [2016-03-02 13:51:39.072172] W [MSGID: 101012]
>>> [common-utils.c:2776:gf_get_reserved_ports] 0-glusterfs: could not open the
>>> file /proc/sys/net/ipv4/ip_local_reserved_ports for getting reserved ports
>>> info [No such file or directory]
>>> [2016-03-02 13:51:39.072228] W [MSGID: 101081]
>>> [common-utils.c:2810:gf_process_reserved_ports] 0-glusterfs: Not able to
>>> get reserved ports, hence there is a possibility that glusterfs may consume
>>> reserved port
>>> [2016-03-02 13:51:39.072583] E [socket.c:2278:socket_connect_finish]
>>> 0-gfapi: connection to 127.0.0.1:24007 failed (Connection refused)
>>>
>>>
>>> Not sure why ^^ occurs. You could try flushing iptables (iptables -F),
>>> restart glusterd and run the heal info command again .
>>>
>>
>> No hint from the logs? I'll try your suggestion.
>>
>>>
>>> [2016-03-02 13:51:39.072663] E [MSGID: 104024]
>>> [glfs-mgmt.c:738:mgmt_rpc_notify] 0-glfs-mgmt: failed to connect with
>>> remote-host: localhost (Transport endpoint is not connected) [Transport
>>> endpoint is not connected]
>>> [2016-03-02 13:51:39.072700] I [MSGID: 104025]
>>> [glfs-mgmt.c:744:mgmt_rpc_notify] 0-glfs-mgmt: Exhausted all volfile
>>> servers [Transport endpoint is not connected]
>>>
 # gluster volume heal c_glusterfs info split-brain
 c_glusterfs: Not able to fetch volfile from glusterd
 Volume heal failed.




 And based on the your observation I understood that this is not the
 problem of split-brain but *is there any way through which can find
 out the file which is not in split-brain as well as not in sync?*


 `gluster volume heal c_glusterfs info split-brain`  should give you
 files that need heal.

>>>
>>> Sorry  I meant 'gluster volume heal c_glusterfs info' should give you
>>> the files that need heal and 'gluster volume heal c_glusterfs info
>>> split-brain' the list of files in split-brain.
>>> The commands are detailed in
>>> https://github.com/gluster/glusterfs-specs/blob/master/done/Features/heal-info-and-split-brain-resolution.md
>>>
>>
>> Yes, I have tried this as well It is also giving Number of entries : 0
>> means no healing is required but the file
>> /opt/lvmdir/c2/brick/logfiles/availability/CELLO_AVAILABILITY2_LOG.xml
>> is not in sync both of brick showing the different version of this file.
>>
>> You can see it in the getfattr command outcome as well.
>>
>>
>> # getfattr -m . -d -e hex
>> /opt/lvmdir/c2/brick/logfiles/availability/CELLO_AVAILABILITY2_LOG.xml
>> getfattr: Removing leading '/' from absolute path names
>> # file:
>> opt/lvmdir/c2/brick/logfiles/availability/CELLO_AVAILABILITY2_LOG.xml
>> trusted.afr.c_glusterfs-client-0=0x
>> trusted.afr.c_glusterfs-client-2=0x
>> trusted.afr.c_glusterfs-client-4=0x
>> trusted.afr.c_glusterfs-client-6=0x
>> trusted.afr.c_glusterfs-client-8=*0x0006** //because
>> client8 is the latest client in our case and starting 8 digits *
>>
>> *0006are saying like there is something in changelog data. *
>> trusted.afr.dirty=0x
>> trusted.bit-rot.version=0x001356d86c0c000217fd
>> trusted.gfid=0x9f5e354ecfda40149ddce7d5ffe760ae
>>

Re: [Gluster-users] [Gluster-devel] Query on healing process

2016-03-14 Thread Ravishankar N

On 03/14/2016 10:36 AM, ABHISHEK PALIWAL wrote:

Hi Ravishankar,

I just want to inform you that this file has some properties that differ
from other files: it has a fixed size, and when there is no space left in
the file, the next data starts wrapping from the top of the file.

That is, in this file we are wrapping the data as well.

So, I just want to know whether this feature of the file will affect gluster's
ability to identify the split-brain or the xattr attributes?

Hi,
No it shouldn't matter at what offset the writes happen. The xattrs only 
track that the write was  missed  (and therefore a pending heal), 
irrespective of (offset, length).

Ravi



Regards,
Abhishek

On Fri, Mar 4, 2016 at 7:00 PM, ABHISHEK PALIWAL 
mailto:abhishpali...@gmail.com>> wrote:




On Fri, Mar 4, 2016 at 6:36 PM, Ravishankar N
mailto:ravishan...@redhat.com>> wrote:

On 03/04/2016 06:23 PM, ABHISHEK PALIWAL wrote:



Ok, just to confirm, glusterd  and other brick processes
are running after this node rebooted?
When you run the above command, you need to check
/var/log/glusterfs/glfsheal-volname.log logs errros.
Setting client-log-level to DEBUG would give you a more
verbose message

Yes, glusterd and other brick processes running fine. I have
check the /var/log/glusterfs/glfsheal-volname.log file
without the log-level= DEBUG. Here is the logs from that file

[2016-03-02 13:51:39.059440] I [MSGID: 101190]
[event-epoll.c:632:event_dispatch_epoll_worker] 0-epoll:
Started thread with index 1
[2016-03-02 13:51:39.072172] W [MSGID: 101012]
[common-utils.c:2776:gf_get_reserved_ports] 0-glusterfs:
could not open the file
/proc/sys/net/ipv4/ip_local_reserved_ports for getting
reserved ports info [No such file or directory]
[2016-03-02 13:51:39.072228] W [MSGID: 101081]
[common-utils.c:2810:gf_process_reserved_ports] 0-glusterfs:
Not able to get reserved ports, hence there is a possibility
that glusterfs may consume reserved port
[2016-03-02 13:51:39.072583] E
[socket.c:2278:socket_connect_finish] 0-gfapi: connection to
127.0.0.1:24007  failed (Connection
refused)


Not sure why ^^ occurs. You could try flushing iptables
(iptables -F), restart glusterd and run the heal info command
again .


No hint from the logs? I'll try your suggestion.



[2016-03-02 13:51:39.072663] E [MSGID: 104024]
[glfs-mgmt.c:738:mgmt_rpc_notify] 0-glfs-mgmt: failed to
connect with remote-host: localhost (Transport endpoint is
not connected) [Transport endpoint is not connected]
[2016-03-02 13:51:39.072700] I [MSGID: 104025]
[glfs-mgmt.c:744:mgmt_rpc_notify] 0-glfs-mgmt: Exhausted all
volfile servers [Transport endpoint is not connected]


# gluster volume heal c_glusterfs info split-brain
c_glusterfs: Not able to fetch volfile from glusterd
Volume heal failed.





And based on the your observation I understood that this
is not the problem of split-brain but *is there any way
through which can find out the file which is not in
split-brain as well as not in sync?*


`gluster volume heal c_glusterfs info split-brain` 
should give you files that need heal.




Sorry  I meant 'gluster volume heal c_glusterfs info' should
give you the files that need heal and 'gluster volume heal
c_glusterfs info split-brain' the list of files in split-brain.
The commands are detailed in

https://github.com/gluster/glusterfs-specs/blob/master/done/Features/heal-info-and-split-brain-resolution.md


Yes, I have tried this as well. It is also giving "Number of entries:
0", which means no healing is required, but the file
/opt/lvmdir/c2/brick/logfiles/availability/CELLO_AVAILABILITY2_LOG.xml
is not in sync; both bricks are showing a different version of this
file.

You can see it in the getfattr command outcome as well.


# getfattr -m . -d -e hex
/opt/lvmdir/c2/brick/logfiles/availability/CELLO_AVAILABILITY2_LOG.xml

getfattr: Removing leading '/' from absolute path names
# file:
opt/lvmdir/c2/brick/logfiles/availability/CELLO_AVAILABILITY2_LOG.xml
trusted.afr.c_glusterfs-client-0=0x
trusted.afr.c_glusterfs-client-2=0x
trusted.afr.c_glusterfs-client-4=0x
trusted.afr.c_glusterfs-client-6=0x
trusted.afr.c_glusterfs-client-8=0x0006   // because client-8 is the latest
client in our case, and the starting 8 digits (0006) are saying there is
something in the changelog data
trusted.afr.dirty=0x00

Re: [Gluster-users] [Gluster-devel] Query on healing process

2016-03-13 Thread ABHISHEK PALIWAL
Hi Ravishankar,

I just want to inform you that this file has some properties that differ from
other files: it has a fixed size, and when there is no space left in the file,
the next data starts wrapping from the top of the file.

That is, in this file we are wrapping the data as well.

So, I just want to know whether this feature of the file will affect gluster's
ability to identify the split-brain or the xattr attributes?

Regards,
Abhishek

On Fri, Mar 4, 2016 at 7:00 PM, ABHISHEK PALIWAL 
wrote:

>
>
> On Fri, Mar 4, 2016 at 6:36 PM, Ravishankar N 
> wrote:
>
>> On 03/04/2016 06:23 PM, ABHISHEK PALIWAL wrote:
>>
>>
>>> Ok, just to confirm, glusterd  and other brick processes are running
>>> after this node rebooted?
>>> When you run the above command, you need to check
>>> /var/log/glusterfs/glfsheal-volname.log logs errros. Setting
>>> client-log-level to DEBUG would give you a more verbose message
>>>
>>> Yes, glusterd and other brick processes running fine. I have check the
>> /var/log/glusterfs/glfsheal-volname.log file without the log-level= DEBUG.
>> Here is the logs from that file
>>
>> [2016-03-02 13:51:39.059440] I [MSGID: 101190]
>> [event-epoll.c:632:event_dispatch_epoll_worker] 0-epoll: Started thread
>> with index 1
>> [2016-03-02 13:51:39.072172] W [MSGID: 101012]
>> [common-utils.c:2776:gf_get_reserved_ports] 0-glusterfs: could not open the
>> file /proc/sys/net/ipv4/ip_local_reserved_ports for getting reserved ports
>> info [No such file or directory]
>> [2016-03-02 13:51:39.072228] W [MSGID: 101081]
>> [common-utils.c:2810:gf_process_reserved_ports] 0-glusterfs: Not able to
>> get reserved ports, hence there is a possibility that glusterfs may consume
>> reserved port
>> [2016-03-02 13:51:39.072583] E [socket.c:2278:socket_connect_finish]
>> 0-gfapi: connection to 127.0.0.1:24007 failed (Connection refused)
>>
>>
>> Not sure why ^^ occurs. You could try flushing iptables (iptables -F),
>> restart glusterd and run the heal info command again .
>>
>
> No hint from the logs? I'll try your suggestion.
>
>>
>> [2016-03-02 13:51:39.072663] E [MSGID: 104024]
>> [glfs-mgmt.c:738:mgmt_rpc_notify] 0-glfs-mgmt: failed to connect with
>> remote-host: localhost (Transport endpoint is not connected) [Transport
>> endpoint is not connected]
>> [2016-03-02 13:51:39.072700] I [MSGID: 104025]
>> [glfs-mgmt.c:744:mgmt_rpc_notify] 0-glfs-mgmt: Exhausted all volfile
>> servers [Transport endpoint is not connected]
>>
>>> # gluster volume heal c_glusterfs info split-brain
>>> c_glusterfs: Not able to fetch volfile from glusterd
>>> Volume heal failed.
>>>
>>>
>>>
>>>
>>> And based on the your observation I understood that this is not the
>>> problem of split-brain but *is there any way through which can find out
>>> the file which is not in split-brain as well as not in sync?*
>>>
>>>
>>> `gluster volume heal c_glusterfs info split-brain`  should give you
>>> files that need heal.
>>>
>>
>> Sorry  I meant 'gluster volume heal c_glusterfs info' should give you
>> the files that need heal and 'gluster volume heal c_glusterfs info
>> split-brain' the list of files in split-brain.
>> The commands are detailed in
>> https://github.com/gluster/glusterfs-specs/blob/master/done/Features/heal-info-and-split-brain-resolution.md
>>
>
> Yes, I have tried this as well. It is also giving "Number of entries: 0",
> which means no healing is required, but the file
> /opt/lvmdir/c2/brick/logfiles/availability/CELLO_AVAILABILITY2_LOG.xml is
> not in sync; both bricks are showing a different version of this file.
>
> You can see it in the getfattr command outcome as well.
>
>
> # getfattr -m . -d -e hex
> /opt/lvmdir/c2/brick/logfiles/availability/CELLO_AVAILABILITY2_LOG.xml
> getfattr: Removing leading '/' from absolute path names
> # file:
> opt/lvmdir/c2/brick/logfiles/availability/CELLO_AVAILABILITY2_LOG.xml
> trusted.afr.c_glusterfs-client-0=0x
> trusted.afr.c_glusterfs-client-2=0x
> trusted.afr.c_glusterfs-client-4=0x
> trusted.afr.c_glusterfs-client-6=0x
> trusted.afr.c_glusterfs-client-8=0x0006   // because client-8 is the latest
> client in our case, and the starting 8 digits (0006) are saying there is
> something in the changelog data
> trusted.afr.dirty=0x
> trusted.bit-rot.version=0x001356d86c0c000217fd
> trusted.gfid=0x9f5e354ecfda40149ddce7d5ffe760ae
>
> # lhsh 002500 getfattr -m . -d -e hex
> /opt/lvmdir/c2/brick/logfiles/availability/CELLO_AVAILABILITY2_LOG.xml
> getfattr: Removing leading '/' from absolute path names
> # file:
> opt/lvmdir/c2/brick/logfiles/availability/CELLO_AVAILABILITY2_LOG.xml
> trusted.afr.c_glusterfs-client-1=0x   // and here we can say that there is no
> split-brain, but the file is out of sync
> trusted.afr.dirty=0x
> trusted.bit-rot.version=0x001156d86c290005735c
> trusted.gf

Re: [Gluster-users] [Gluster-devel] Query on healing process

2016-03-04 Thread ABHISHEK PALIWAL
On Fri, Mar 4, 2016 at 6:36 PM, Ravishankar N 
wrote:

> On 03/04/2016 06:23 PM, ABHISHEK PALIWAL wrote:
>
>
>> Ok, just to confirm, glusterd  and other brick processes are running
>> after this node rebooted?
>> When you run the above command, you need to check
>> /var/log/glusterfs/glfsheal-volname.log logs errros. Setting
>> client-log-level to DEBUG would give you a more verbose message
>>
>> Yes, glusterd and other brick processes running fine. I have check the
> /var/log/glusterfs/glfsheal-volname.log file without the log-level= DEBUG.
> Here is the logs from that file
>
> [2016-03-02 13:51:39.059440] I [MSGID: 101190]
> [event-epoll.c:632:event_dispatch_epoll_worker] 0-epoll: Started thread
> with index 1
> [2016-03-02 13:51:39.072172] W [MSGID: 101012]
> [common-utils.c:2776:gf_get_reserved_ports] 0-glusterfs: could not open the
> file /proc/sys/net/ipv4/ip_local_reserved_ports for getting reserved ports
> info [No such file or directory]
> [2016-03-02 13:51:39.072228] W [MSGID: 101081]
> [common-utils.c:2810:gf_process_reserved_ports] 0-glusterfs: Not able to
> get reserved ports, hence there is a possibility that glusterfs may consume
> reserved port
> [2016-03-02 13:51:39.072583] E [socket.c:2278:socket_connect_finish]
> 0-gfapi: connection to 127.0.0.1:24007 failed (Connection refused)
>
>
> Not sure why ^^ occurs. You could try flushing iptables (iptables -F),
> restart glusterd and run the heal info command again .
>

No hint from the logs? I'll try your suggestion.

>
> [2016-03-02 13:51:39.072663] E [MSGID: 104024]
> [glfs-mgmt.c:738:mgmt_rpc_notify] 0-glfs-mgmt: failed to connect with
> remote-host: localhost (Transport endpoint is not connected) [Transport
> endpoint is not connected]
> [2016-03-02 13:51:39.072700] I [MSGID: 104025]
> [glfs-mgmt.c:744:mgmt_rpc_notify] 0-glfs-mgmt: Exhausted all volfile
> servers [Transport endpoint is not connected]
>
>> # gluster volume heal c_glusterfs info split-brain
>> c_glusterfs: Not able to fetch volfile from glusterd
>> Volume heal failed.
>>
>>
>>
>>
>> And based on the your observation I understood that this is not the
>> problem of split-brain but *is there any way through which can find out
>> the file which is not in split-brain as well as not in sync?*
>>
>>
>> `gluster volume heal c_glusterfs info split-brain`  should give you files
>> that need heal.
>>
>
> Sorry  I meant 'gluster volume heal c_glusterfs info' should give you the
> files that need heal and 'gluster volume heal c_glusterfs info
> split-brain' the list of files in split-brain.
> The commands are detailed in
> https://github.com/gluster/glusterfs-specs/blob/master/done/Features/heal-info-and-split-brain-resolution.md
>

Yes, I have tried this as well. It is also giving "Number of entries: 0",
which means no healing is required, but the file
/opt/lvmdir/c2/brick/logfiles/availability/CELLO_AVAILABILITY2_LOG.xml is
not in sync; both bricks are showing a different version of this file.

You can see it in the getfattr command outcome as well.


# getfattr -m . -d -e hex
/opt/lvmdir/c2/brick/logfiles/availability/CELLO_AVAILABILITY2_LOG.xml
getfattr: Removing leading '/' from absolute path names
# file:
opt/lvmdir/c2/brick/logfiles/availability/CELLO_AVAILABILITY2_LOG.xml
trusted.afr.c_glusterfs-client-0=0x
trusted.afr.c_glusterfs-client-2=0x
trusted.afr.c_glusterfs-client-4=0x
trusted.afr.c_glusterfs-client-6=0x
trusted.afr.c_glusterfs-client-8=0x0006   // because client-8 is the latest
client in our case, and the starting 8 digits (0006) are saying there is
something in the changelog data
trusted.afr.dirty=0x
trusted.bit-rot.version=0x001356d86c0c000217fd
trusted.gfid=0x9f5e354ecfda40149ddce7d5ffe760ae

# lhsh 002500 getfattr -m . -d -e hex
/opt/lvmdir/c2/brick/logfiles/availability/CELLO_AVAILABILITY2_LOG.xml
getfattr: Removing leading '/' from absolute path names
# file:
opt/lvmdir/c2/brick/logfiles/availability/CELLO_AVAILABILITY2_LOG.xml
trusted.afr.c_glusterfs-client-1=0x   // and here we can say that there is no
split-brain, but the file is out of sync
trusted.afr.dirty=0x
trusted.bit-rot.version=0x001156d86c290005735c
trusted.gfid=0x9f5e354ecfda40149ddce7d5ffe760ae
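
As a reading aid for these values (my understanding of the AFR changelog
format, not something stated in this thread): each trusted.afr.<vol>-client-N
value packs three 32-bit counters, in hex, in the order

  0x 00000006 00000000 00000000
     data     metadata entry     (pending operations blamed on that client)

so a leading 00000006 means six pending data operations, an all-zero value
means nothing pending, and AFR suspects split-brain only when the two copies
blame each other.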


> Regards,
>
   Abhishek

>
___
Gluster-users mailing list
Gluster-users@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-users

Re: [Gluster-users] [Gluster-devel] Query on healing process

2016-03-04 Thread Ravishankar N

On 03/04/2016 06:23 PM, ABHISHEK PALIWAL wrote:



Ok, just to confirm, glusterd  and other brick processes are
running after this node rebooted?
When you run the above command, you need to check
/var/log/glusterfs/glfsheal-volname.log for errors. Setting
client-log-level to DEBUG would give you a more verbose message

Yes, glusterd and the other brick processes are running fine. I have checked
the /var/log/glusterfs/glfsheal-volname.log file without log-level=DEBUG.
Here are the logs from that file:


[2016-03-02 13:51:39.059440] I [MSGID: 101190] 
[event-epoll.c:632:event_dispatch_epoll_worker] 0-epoll: Started 
thread with index 1
[2016-03-02 13:51:39.072172] W [MSGID: 101012] 
[common-utils.c:2776:gf_get_reserved_ports] 0-glusterfs: could not 
open the file /proc/sys/net/ipv4/ip_local_reserved_ports for getting 
reserved ports info [No such file or directory]
[2016-03-02 13:51:39.072228] W [MSGID: 101081] 
[common-utils.c:2810:gf_process_reserved_ports] 0-glusterfs: Not able 
to get reserved ports, hence there is a possibility that glusterfs may 
consume reserved port
[2016-03-02 13:51:39.072583] E [socket.c:2278:socket_connect_finish] 
0-gfapi: connection to 127.0.0.1:24007  failed 
(Connection refused)


Not sure why ^^ occurs. You could try flushing iptables (iptables -F), 
restart glusterd and run the heal info command again .
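
In concrete terms, something along these lines (a sketch; the way glusterd is
restarted depends on the init system on these boards):

  iptables -F                        # flush rules that could block 127.0.0.1:24007
  systemctl restart glusterd         # or: /etc/init.d/glusterd restart
  gluster volume heal c_glusterfs info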


[2016-03-02 13:51:39.072663] E [MSGID: 104024] 
[glfs-mgmt.c:738:mgmt_rpc_notify] 0-glfs-mgmt: failed to connect with 
remote-host: localhost (Transport endpoint is not connected) 
[Transport endpoint is not connected]
[2016-03-02 13:51:39.072700] I [MSGID: 104025] 
[glfs-mgmt.c:744:mgmt_rpc_notify] 0-glfs-mgmt: Exhausted all volfile 
servers [Transport endpoint is not connected]



# gluster volume heal c_glusterfs info split-brain
c_glusterfs: Not able to fetch volfile from glusterd
Volume heal failed.





And based on your observation I understood that this is not
a split-brain problem, but *is there any way through which I
can find out the file which is not in split-brain but also not
in sync?*


`gluster volume heal c_glusterfs info split-brain`  should give
you files that need heal.



Sorry  I meant 'gluster volume heal c_glusterfs info' should give you 
the files that need heal and 'gluster volume heal c_glusterfs info 
split-brain' the list of files in split-brain.
The commands are detailed in 
https://github.com/gluster/glusterfs-specs/blob/master/done/Features/heal-info-and-split-brain-resolution.md




I have run the "gluster volume heal c_glusterfs info split-brain" command,
but it is not showing the file which is out of sync. That is the issue:
the file is not in sync on both of the bricks, yet the split-brain command
does not list it in its output as needing heal.

That is why I am asking whether there is any command, other than this
split-brain command, through which I can find the files that require a
heal operation but are not displayed in the output of
"gluster volume heal c_glusterfs info split-brain".








___
Gluster-users mailing list
Gluster-users@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-users

Re: [Gluster-users] [Gluster-devel] Query on healing process

2016-03-04 Thread ABHISHEK PALIWAL
On Fri, Mar 4, 2016 at 5:31 PM, Ravishankar N 
wrote:

> On 03/04/2016 12:10 PM, ABHISHEK PALIWAL wrote:
>
> Hi Ravi,
>
> 3. On the rebooted node, do you have ssl enabled by any chance? There is a
> bug for "Not able to fetch volfile' when ssl is enabled:
> 
> https://bugzilla.redhat.com/show_bug.cgi?id=1258931
>
> -> I have checked but ssl is disabled but still getting these errors
>
> # gluster volume heal c_glusterfs info
> c_glusterfs: Not able to fetch volfile from glusterd
> Volume heal failed.
>
>
> Ok, just to confirm, glusterd  and other brick processes are running after
> this node rebooted?
> When you run the above command, you need to check
> /var/log/glusterfs/glfsheal-volname.log logs errros. Setting
> client-log-level to DEBUG would give you a more verbose message
>
Yes, glusterd and the other brick processes are running fine. I have checked
the /var/log/glusterfs/glfsheal-volname.log file without log-level=DEBUG.
Here are the logs from that file:

[2016-03-02 13:51:39.059440] I [MSGID: 101190]
[event-epoll.c:632:event_dispatch_epoll_worker] 0-epoll: Started thread
with index 1
[2016-03-02 13:51:39.072172] W [MSGID: 101012]
[common-utils.c:2776:gf_get_reserved_ports] 0-glusterfs: could not open the
file /proc/sys/net/ipv4/ip_local_reserved_ports for getting reserved ports
info [No such file or directory]
[2016-03-02 13:51:39.072228] W [MSGID: 101081]
[common-utils.c:2810:gf_process_reserved_ports] 0-glusterfs: Not able to
get reserved ports, hence there is a possibility that glusterfs may consume
reserved port
[2016-03-02 13:51:39.072583] E [socket.c:2278:socket_connect_finish]
0-gfapi: connection to 127.0.0.1:24007 failed (Connection refused)
[2016-03-02 13:51:39.072663] E [MSGID: 104024]
[glfs-mgmt.c:738:mgmt_rpc_notify] 0-glfs-mgmt: failed to connect with
remote-host: localhost (Transport endpoint is not connected) [Transport
endpoint is not connected]
[2016-03-02 13:51:39.072700] I [MSGID: 104025]
[glfs-mgmt.c:744:mgmt_rpc_notify] 0-glfs-mgmt: Exhausted all volfile
servers [Transport endpoint is not connected]

> # gluster volume heal c_glusterfs info split-brain
> c_glusterfs: Not able to fetch volfile from glusterd
> Volume heal failed.
>
>
>
>
> And based on your observation I understood that this is not a split-brain
> problem, but *is there any way through which I can find out the file which
> is not in split-brain but also not in sync?*
>
>
> `gluster volume heal c_glusterfs info split-brain`  should give you files
> that need heal.
>

I have run the "gluster volume heal c_glusterfs info split-brain" command, but
it is not showing the file which is out of sync. That is the issue: the file is
not in sync on both of the bricks, yet the split-brain command does not list it
in its output as needing heal.

That is why I am asking whether there is any command, other than this
split-brain command, through which I can find the files that require a heal
operation but are not displayed in the output of "gluster volume heal
c_glusterfs info split-brain".

>
>
> # getfattr -m . -d -e hex
> /opt/lvmdir/c2/brick/logfiles/availability/CELLO_AVAILABILITY2_LOG.xml
> getfattr: Removing leading '/' from absolute path names
> # file:
> opt/lvmdir/c2/brick/logfiles/availability/CELLO_AVAILABILITY2_LOG.xml
> trusted.afr.c_glusterfs-client-0=0x
> trusted.afr.c_glusterfs-client-2=0x
> trusted.afr.c_glusterfs-client-4=0x
> trusted.afr.c_glusterfs-client-6=0x
> trusted.afr.c_glusterfs-client-8=0x0006   // because client-8 is the latest
> client in our case, and the starting 8 digits (0006) are saying there is
> something in the changelog data
> trusted.afr.dirty=0x
> trusted.bit-rot.version=0x001356d86c0c000217fd
> trusted.gfid=0x9f5e354ecfda40149ddce7d5ffe760ae
>
> # lhsh 002500 getfattr -m . -d -e hex
> /opt/lvmdir/c2/brick/logfiles/availability/CELLO_AVAILABILITY2_LOG.xml
> getfattr: Removing leading '/' from absolute path names
> # file:
> opt/lvmdir/c2/brick/logfiles/availability/CELLO_AVAILABILITY2_LOG.xml
> trusted.afr.c_glusterfs-client-1=0x   // and here we can say that there is no
> split-brain, but the file is out of sync
> trusted.afr.dirty=0x
> trusted.bit-rot.version=0x001156d86c290005735c
> trusted.gfid=0x9f5e354ecfda40149ddce7d5ffe760ae
>
> # gluster volume info
>
> Volume Name: c_glusterfs
> Type: Replicate
> Volume ID: c6a61455-d378-48bf-ad40-7a3ce897fc9c
> Status: Started
> Number of Bricks: 1 x 2 = 2
> Transport-type: tcp
> Bricks:
> Brick1: 10.32.0.48:/opt/lvmdir/c2/brick
> Brick2: 10.32.1.144:/opt/lvmdir/c2/brick
> Options Reconfigured:
> performance.readdir-ahead: on
> network.ping-timeout: 4
> nfs.disable: on
>
>
> # gluster volume info
>
> Volume Name: c_glusterfs
> Type: Replicate
> Volume ID: c6a61455-d378-48bf-ad

Re: [Gluster-users] [Gluster-devel] Query on healing process

2016-03-04 Thread Ravishankar N

On 03/04/2016 12:10 PM, ABHISHEK PALIWAL wrote:

Hi Ravi,

3. On the rebooted node, do you have ssl enabled by any chance? There 
is a bug for "Not able to fetch volfile' when ssl is enabled: 
https://bugzilla.redhat.com/show_bug.cgi?id=1258931


-> I have checked, but ssl is disabled and I am still getting these errors

# gluster volume heal c_glusterfs info
c_glusterfs: Not able to fetch volfile from glusterd
Volume heal failed.



Ok, just to confirm, glusterd  and other brick processes are running 
after this node rebooted?
When you run the above command, you need to check
/var/log/glusterfs/glfsheal-volname.log for errors. Setting
client-log-level to DEBUG would give you a more verbose message.
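
For reference, that log level is a volume option, so something along these
lines should raise the verbosity (a sketch; option names as in the 3.7 series
used here):

  gluster volume set c_glusterfs diagnostics.client-log-level DEBUG
  # re-run the failing heal command, inspect the glfsheal log, then restore
  gluster volume set c_glusterfs diagnostics.client-log-level INFO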



# gluster volume heal c_glusterfs info split-brain
c_glusterfs: Not able to fetch volfile from glusterd
Volume heal failed.





And based on your observation I understood that this is not a split-brain
problem, but *is there any way through which I can find out the file which
is not in split-brain but also not in sync?*


`gluster volume heal c_glusterfs info split-brain` should give you files 
that need heal.




# getfattr -m . -d -e hex 
/opt/lvmdir/c2/brick/logfiles/availability/CELLO_AVAILABILITY2_LOG.xml

getfattr: Removing leading '/' from absolute path names
# file: 
opt/lvmdir/c2/brick/logfiles/availability/CELLO_AVAILABILITY2_LOG.xml

trusted.afr.c_glusterfs-client-0=0x
trusted.afr.c_glusterfs-client-2=0x
trusted.afr.c_glusterfs-client-4=0x
trusted.afr.c_glusterfs-client-6=0x
trusted.afr.c_glusterfs-client-8=0x0006   // because client-8 is the latest
client in our case, and the starting 8 digits (0006) are saying there is
something in the changelog data
trusted.afr.dirty=0x
trusted.bit-rot.version=0x001356d86c0c000217fd
trusted.gfid=0x9f5e354ecfda40149ddce7d5ffe760ae

# lhsh 002500 getfattr -m . -d -e hex 
/opt/lvmdir/c2/brick/logfiles/availability/CELLO_AVAILABILITY2_LOG.xml

getfattr: Removing leading '/' from absolute path names
# file: 
opt/lvmdir/c2/brick/logfiles/availability/CELLO_AVAILABILITY2_LOG.xml
trusted.afr.c_glusterfs-client-1=0x   // and here we can say that there is no
split-brain, but the file is out of sync

trusted.afr.dirty=0x
trusted.bit-rot.version=0x001156d86c290005735c
trusted.gfid=0x9f5e354ecfda40149ddce7d5ffe760ae

# gluster volume info

Volume Name: c_glusterfs
Type: Replicate
Volume ID: c6a61455-d378-48bf-ad40-7a3ce897fc9c
Status: Started
Number of Bricks: 1 x 2 = 2
Transport-type: tcp
Bricks:
Brick1: 10.32.0.48:/opt/lvmdir/c2/brick
Brick2: 10.32.1.144:/opt/lvmdir/c2/brick
Options Reconfigured:
performance.readdir-ahead: on
network.ping-timeout: 4
nfs.disable: on


# gluster volume info

Volume Name: c_glusterfs
Type: Replicate
Volume ID: c6a61455-d378-48bf-ad40-7a3ce897fc9c
Status: Started
Number of Bricks: 1 x 2 = 2
Transport-type: tcp
Bricks:
Brick1: 10.32.0.48:/opt/lvmdir/c2/brick
Brick2: 10.32.1.144:/opt/lvmdir/c2/brick
Options Reconfigured:
performance.readdir-ahead: on
network.ping-timeout: 4
nfs.disable: on

# gluster --version
glusterfs 3.7.8 built on Feb 17 2016 07:49:49
Repository revision: git://git.gluster.com/glusterfs.git 

Copyright (c) 2006-2011 Gluster Inc. > 


GlusterFS comes with ABSOLUTELY NO WARRANTY.
You may redistribute copies of GlusterFS under the terms of the GNU 
General Public License.

# gluster volume heal info heal-failed
Usage: volume heal  [enable | disable | full |statistics 
[heal-count [replica ]] |info [healed | 
heal-failed | split-brain] |split-brain {bigger-file  
|source-brick  []}]

# gluster volume heal c_glusterfs info heal-failed
Command not supported. Please use "gluster volume heal c_glusterfs 
info" and logs to find the heal information.

# lhsh 002500
 ___  _ _  _ __   _ _ _ _ _
 |   |_] |_]  ||   | \  | | |  \___/
 |_  |   |  |_ __|__ |  \_| |_| _/   \_

002500> gluster --version
glusterfs 3.7.8 built on Feb 17 2016 07:49:49
Repository revision: git://git.gluster.com/glusterfs.git 

Copyright (c) 2006-2011 Gluster Inc. > 


GlusterFS comes with ABSOLUTELY NO WARRANTY.
You may redistribute copies of GlusterFS under the terms of the GNU 
General Public License.

002500>

Re: [Gluster-users] [Gluster-devel] Query on healing process

2016-03-03 Thread ABHISHEK PALIWAL
Hi Ravi,

3. On the rebooted node, do you have ssl enabled by any chance? There is a
bug for "Not able to fetch volfile' when ssl is enabled:
https://bugzilla.redhat.com/show_bug.cgi?id=1258931

-> I have checked, but ssl is disabled and I am still getting these errors

# gluster volume heal c_glusterfs info
c_glusterfs: Not able to fetch volfile from glusterd
Volume heal failed.

# gluster volume heal c_glusterfs info split-brain
c_glusterfs: Not able to fetch volfile from glusterd
Volume heal failed.

And based on your observation I understood that this is not a split-brain
problem, but *is there any way through which I can find out the file which is
not in split-brain but also not in sync?*

# getfattr -m . -d -e hex
/opt/lvmdir/c2/brick/logfiles/availability/CELLO_AVAILABILITY2_LOG.xml
getfattr: Removing leading '/' from absolute path names
# file:
opt/lvmdir/c2/brick/logfiles/availability/CELLO_AVAILABILITY2_LOG.xml
trusted.afr.c_glusterfs-client-0=0x
trusted.afr.c_glusterfs-client-2=0x
trusted.afr.c_glusterfs-client-4=0x
trusted.afr.c_glusterfs-client-6=0x
trusted.afr.c_glusterfs-client-8=0x0006   // because client-8 is the latest
client in our case, and the starting 8 digits (0006) are saying there is
something in the changelog data
trusted.afr.dirty=0x
trusted.bit-rot.version=0x001356d86c0c000217fd
trusted.gfid=0x9f5e354ecfda40149ddce7d5ffe760ae

# lhsh 002500 getfattr -m . -d -e hex
/opt/lvmdir/c2/brick/logfiles/availability/CELLO_AVAILABILITY2_LOG.xml
getfattr: Removing leading '/' from absolute path names
# file:
opt/lvmdir/c2/brick/logfiles/availability/CELLO_AVAILABILITY2_LOG.xml
trusted.afr.c_glusterfs-client-1=0x   // and here we can say that there is no
split-brain, but the file is out of sync
trusted.afr.dirty=0x
trusted.bit-rot.version=0x001156d86c290005735c
trusted.gfid=0x9f5e354ecfda40149ddce7d5ffe760ae

# gluster volume info

Volume Name: c_glusterfs
Type: Replicate
Volume ID: c6a61455-d378-48bf-ad40-7a3ce897fc9c
Status: Started
Number of Bricks: 1 x 2 = 2
Transport-type: tcp
Bricks:
Brick1: 10.32.0.48:/opt/lvmdir/c2/brick
Brick2: 10.32.1.144:/opt/lvmdir/c2/brick
Options Reconfigured:
performance.readdir-ahead: on
network.ping-timeout: 4
nfs.disable: on


# gluster volume info

Volume Name: c_glusterfs
Type: Replicate
Volume ID: c6a61455-d378-48bf-ad40-7a3ce897fc9c
Status: Started
Number of Bricks: 1 x 2 = 2
Transport-type: tcp
Bricks:
Brick1: 10.32.0.48:/opt/lvmdir/c2/brick
Brick2: 10.32.1.144:/opt/lvmdir/c2/brick
Options Reconfigured:
performance.readdir-ahead: on
network.ping-timeout: 4
nfs.disable: on

# gluster --version
glusterfs 3.7.8 built on Feb 17 2016 07:49:49
Repository revision: git://git.gluster.com/glusterfs.git
Copyright (c) 2006-2011 Gluster Inc. 
>
GlusterFS comes with ABSOLUTELY NO WARRANTY.
You may redistribute copies of GlusterFS under the terms of the GNU General
Public License.
# gluster volume heal info heal-failed
Usage: volume heal  [enable | disable | full |statistics
[heal-count [replica ]] |info [healed | heal-failed |
split-brain] |split-brain {bigger-file  |source-brick
 []}]
# gluster volume heal c_glusterfs info heal-failed
Command not supported. Please use "gluster volume heal c_glusterfs info"
and logs to find the heal information.
# lhsh 002500
 ___  _   _  _ __   _ _ _ _ _
 |   |_] |_]  ||   | \  | | |  \___/
 |_  |   ||_ __|__ |  \_| |_| _/   \_

002500> gluster --version
glusterfs 3.7.8 built on Feb 17 2016 07:49:49
Repository revision: git://git.gluster.com/glusterfs.git
Copyright (c) 2006-2011 Gluster Inc. 
>
GlusterFS comes with ABSOLUTELY NO WARRANTY.
You may redistribute copies of GlusterFS under the terms of the GNU General
Public License.
002500>

Regards,
Abhishek

On Thu, Mar 3, 2016 at 4:54 PM, ABHISHEK PALIWAL 
wrote:

>
> On Thu, Mar 3, 2016 at 4:10 PM, Ravishankar N 
> wrote:
>
>> Hi,
>>
>> On 03/03/2016 11:14 AM, ABHISHEK PALIWAL wrote:
>>
>> Hi Ravi,
>>
>> As I discussed earlier this issue, I investigated this issue and find
>> that healing is not triggered because the "gluster volume heal c_glusterfs
>> info split-brain" command not showing any entries as a outcome of this
>> command even though the file in split brain case.
>>
>>
>> Couple of observations from th

Re: [Gluster-users] [Gluster-devel] Query on healing process

2016-03-03 Thread ABHISHEK PALIWAL
On Thu, Mar 3, 2016 at 4:10 PM, Ravishankar N 
wrote:

> Hi,
>
> On 03/03/2016 11:14 AM, ABHISHEK PALIWAL wrote:
>
> Hi Ravi,
>
> As I discussed earlier, I investigated this issue and found that
> healing is not triggered because the "gluster volume heal c_glusterfs info
> split-brain" command is not showing any entries in its output,
> even though the file is in a split-brain state.
>
>
> Couple of observations from the 'commands_output' file.
>
> getfattr -d -m . -e hex
> opt/lvmdir/c2/brick/logfiles/availability/CELLO_AVAILABILITY2_LOG.xml
> The afr xattrs do not indicate that the file is in split brain:
> # file:
> opt/lvmdir/c2/brick/logfiles/availability/CELLO_AVAILABILITY2_LOG.xml
> trusted.afr.c_glusterfs-client-1=0x
> trusted.afr.dirty=0x
> trusted.bit-rot.version=0x000b56d6dd1d000ec7a9
> trusted.gfid=0x9f5e354ecfda40149ddce7d5ffe760ae
>
>
>
> getfattr -d -m . -e hex
> opt/lvmdir/c2/brick/logfiles/availability/CELLO_AVAILABILITY2_LOG.xml
> trusted.afr.c_glusterfs-client-0=0x0008
> trusted.afr.c_glusterfs-client-2=0x0002
> trusted.afr.c_glusterfs-client-4=0x0002
> trusted.afr.c_glusterfs-client-6=0x0002
> trusted.afr.dirty=0x
> trusted.bit-rot.version=0x000b56d6dcb7000c87e7
> trusted.gfid=0x9f5e354ecfda40149ddce7d5ffe760ae
>
> 1. There doesn't seem to be a split-brain going by the trusted.afr* xattrs.
>

If it is not a split-brain problem, then how can I resolve this?


> 2. You seem to have re-used the bricks from another volume/setup. For
> replica 2, only trusted.afr.c_glusterfs-client-0 and
> trusted.afr.c_glusterfs-client-1 must be present but I see 4 xattrs -
> client-0,2,4 and 6
>

Could you please suggest why these entries are there, because I am not able
to figure out the scenario. I am rebooting one board multiple times to
reproduce the issue, and after every reboot I am doing a remove-brick and
add-brick on the same volume for the second board.


> 3. On the rebooted node, do you have ssl enabled by any chance? There is a
> bug for "Not able to fetch volfile' when ssl is enabled:
> https://bugzilla.redhat.com/show_bug.cgi?id=1258931
>
> Btw, for data and metadata split-brains you can use the gluster CLI
> https://github.com/gluster/glusterfs-specs/blob/master/done/Features/heal-info-and-split-brain-resolution.md
> instead of modifying the file from the back end.
>

But you are saying it is not a split-brain problem, and even the split-brain
command is not showing any file, so how can I find the file that is bigger in
size? Also, in my case the file size is fixed at 2 MB; it is overwritten every time.

>
> -Ravi
>
>
> So, what I have done I manually deleted the gfid entry of that file from
> .glusterfs directory and follow the instruction mentioned in the following
> link to do heal
>
>
> https://github.com/gluster/glusterfs/blob/master/doc/debugging/split-brain.md
>
> and this works fine for me.
>
> But my question is why the split-brain command not showing any file in
> output.
>
> Here I am attaching all the log which I get from the node for you and also
> the output of commands from both of the boards
>
> In this tar file two directories are present
>
> 000300 - log for the board which is running continuously
> 002500-  log for the board which is rebooted
>
> I am waiting for your reply please help me out on this issue.
>
> Thanks in advanced.
>
> Regards,
> Abhishek
>
> On Fri, Feb 26, 2016 at 1:21 PM, ABHISHEK PALIWAL <
> abhishpali...@gmail.com> wrote:
>
>> On Fri, Feb 26, 2016 at 10:28 AM, Ravishankar N <
>> ravishan...@redhat.com> wrote:
>>
>>> On 02/26/2016 10:10 AM, ABHISHEK PALIWAL wrote:
>>>
>>> Yes correct
>>>
>>>
>>> Okay, so when you say the files are not in sync until some time, are you
>>> getting stale data when accessing from the mount?
>>> I'm not able to figure out why heal info shows zero when the files are
>>> not in sync, despite all IO happening from the mounts. Could you provide
>>> the output of getfattr -d -m . -e hex /brick/file-name from both bricks
>>> when you hit this issue?
>>>
>>> I'll provide the logs once I get. here delay means we are powering on
>>> the second board after the 10 minutes.
>>>
>>>
>>> On Feb 26, 2016 9:57 AM, "Ravishankar N" < 
>>> ravishan...@redhat.com> wrote:
>>>
 Hello,

 On 02/26/2016 08:29 AM, ABHISHEK PALIWAL wrote:

 Hi Ravi,

 Thanks for the response.

We are using GlusterFS 3.7.8

 Here is the use case:

 We have a logging file which saves logs of the events for every board
of a node, and these files are in sync using glusterfs. The system is in replica 2
mode, which means that when one brick in a replicated volume goes offline, the
 glusterd daemons on the other nodes keep track of all the files that are
 not replicated to the offline brick. When the offline brick becomes
 available again, the clust

Re: [Gluster-users] [Gluster-devel] Query on healing process

2016-03-03 Thread Ravishankar N

Hi,

On 03/03/2016 11:14 AM, ABHISHEK PALIWAL wrote:

Hi Ravi,

As I discussed earlier, I investigated this issue and found
that healing is not triggered because the "gluster volume heal
c_glusterfs info split-brain" command is not showing any entries in its
output, even though the file is in a split-brain state.


Couple of observations from the 'commands_output' file.

getfattr -d -m . -e hex 
opt/lvmdir/c2/brick/logfiles/availability/CELLO_AVAILABILITY2_LOG.xml

The afr xattrs do not indicate that the file is in split brain:
# file: 
opt/lvmdir/c2/brick/logfiles/availability/CELLO_AVAILABILITY2_LOG.xml

trusted.afr.c_glusterfs-client-1=0x
trusted.afr.dirty=0x
trusted.bit-rot.version=0x000b56d6dd1d000ec7a9
trusted.gfid=0x9f5e354ecfda40149ddce7d5ffe760ae



getfattr -d -m . -e hex 
opt/lvmdir/c2/brick/logfiles/availability/CELLO_AVAILABILITY2_LOG.xml

trusted.afr.c_glusterfs-client-0=0x0008
trusted.afr.c_glusterfs-client-2=0x0002
trusted.afr.c_glusterfs-client-4=0x0002
trusted.afr.c_glusterfs-client-6=0x0002
trusted.afr.dirty=0x
trusted.bit-rot.version=0x000b56d6dcb7000c87e7
trusted.gfid=0x9f5e354ecfda40149ddce7d5ffe760ae

1. There doesn't seem to be a split-brain going by the trusted.afr* xattrs.
2. You seem to have re-used the bricks from another volume/setup. For 
replica 2, only trusted.afr.c_glusterfs-client-0 and 
trusted.afr.c_glusterfs-client-1 must be present but I see 4 xattrs - 
client-0,2,4 and 6
3. On the rebooted node, do you have ssl enabled by any chance? There is 
a bug for "Not able to fetch volfile' when ssl is enabled: 
https://bugzilla.redhat.com/show_bug.cgi?id=1258931


Btw, for data and metadata split-brains you can use the gluster CLI
https://github.com/gluster/glusterfs-specs/blob/master/done/Features/heal-info-and-split-brain-resolution.md 
instead of modifying the file from the back end.
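
For completeness, the CLI forms described in that document look roughly like
this (a sketch: the file path is given relative to the volume root, the brick
is one of those listed in 'gluster volume info', and these only apply to files
that the heal info commands actually report as being in split-brain):

  gluster volume heal c_glusterfs split-brain bigger-file \
      /logfiles/availability/CELLO_AVAILABILITY2_LOG.xml
  gluster volume heal c_glusterfs split-brain source-brick \
      10.32.0.48:/opt/lvmdir/c2/brick \
      /logfiles/availability/CELLO_AVAILABILITY2_LOG.xml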


-Ravi


So, what I have done is manually delete the gfid entry of that file
from the .glusterfs directory and follow the instructions mentioned in the
following link to do the heal:

https://github.com/gluster/glusterfs/blob/master/doc/debugging/split-brain.md

and this works fine for me.

But my question is why the split-brain command is not showing any file in
its output.


Here I am attaching all the log which I get from the node for you and 
also the output of commands from both of the boards


In this tar file two directories are present

000300 - log for the board which is running continuously
002500-  log for the board which is rebooted

I am waiting for your reply please help me out on this issue.

Thanks in advanced.

Regards,
Abhishek

On Fri, Feb 26, 2016 at 1:21 PM, ABHISHEK PALIWAL 
mailto:abhishpali...@gmail.com>> wrote:


On Fri, Feb 26, 2016 at 10:28 AM, Ravishankar N
mailto:ravishan...@redhat.com>> wrote:

On 02/26/2016 10:10 AM, ABHISHEK PALIWAL wrote:


Yes correct



Okay, so when you say the files are not in sync until some
time, are you getting stale data when accessing from the mount?
I'm not able to figure out why heal info shows zero when the
files are not in sync, despite all IO happening from the
mounts. Could you provide the output of getfattr -d -m . -e
hex /brick/file-name from both bricks when you hit this issue?

I'll provide the logs once I get. here delay means we are
powering on the second board after the 10 minutes.



On Feb 26, 2016 9:57 AM, "Ravishankar N"
mailto:ravishan...@redhat.com>> wrote:

Hello,

On 02/26/2016 08:29 AM, ABHISHEK PALIWAL wrote:

Hi Ravi,

Thanks for the response.

We are using GlusterFS 3.7.8

Here is the use case:

We have a logging file which saves logs of the events
for every board of a node, and these files are in sync
using glusterfs. The system is in replica 2 mode, which
means that when one brick in a replicated volume goes
offline, the glusterd daemons on the other nodes keep
track of all the files that are not replicated to the
offline brick. When the offline brick becomes available
again, the cluster initiates a healing process, replicating
the updated files to that brick. But in our case, we see
that the log file of one board is not in sync and its
format is corrupted, meaning the files are not in sync.


Just to understand you correctly, you have mounted the 2
node replica-2 volume on both these nodes and writing to
a logging file from the mounts right?



Even the outcome of "# gluster volume heal c_glusterfs
info" shows that there are no pending heals.

Also , The logging file which is u