Re: [Gluster-devel] Query regarding exposing client-pid to the fuse process

2019-10-13 Thread Aravinda Vishwanathapura Krishna Murthy
Geo-replication uses this option to identify itself as an internal client.

On Sun, Oct 13, 2019 at 11:41 AM Amar Tumballi  wrote:

>
>
> On Fri, Oct 11, 2019 at 5:05 PM Mohit Agrawal  wrote:
>
>> Hi,
>>
>> Yes, you are right it is not a default value.
>>
>> We can assign the client_pid only while volume has mounted after through
>> a glusterfs binary directly like
>> /usr/local/sbin/glusterfs --process-name fuse
>> --volfile-server=192.168.1.3 --client-pid=-3 --volfile-id=/test /mnt1
>>
>>
> I agree that this is in general risky, and good to fix. But as the check
> for this happens after basic auth check in RPC (ip/user based), it should
> be OK.  Good to open a github issue and have some possible design options
> so we can have more discussions on this.
>
> -Amar
>
>
>
>> Regards,
>> Mohit Agrawal
>>
>>
>> On Fri, Oct 11, 2019 at 4:52 PM Nithya Balachandran 
>> wrote:
>>
>>>
>>>
>>> On Fri, 11 Oct 2019 at 14:56, Mohit Agrawal  wrote:
>>>
 Hi,

   I have a query specific to authenticate a client based on the PID
 (client-pid).
   It can break the bricks xlator functionality, Usually, on the brick
 side we take a decision about the
source of fop request based on PID.If PID value is -ve xlator
 considers the request has come from an internal
   client otherwise it has come from an external client.

   If a user has mounted the volume through fuse after provide
 --client-pid to command line argument similar to internal client PID
   in that case brick_xlator consider external fop request also as an
 internal and it will break functionality.

   We are checking pid in (lease, posix-lock, worm, trash) xlator to
 know about the source of the fops.
   Even there are other brick xlators also we are checking specific PID
 value for all internal
   clients that can be break if the external client has the same pid.

   My query is why we need to expose client-pid as an argument to the
 fuse process?

>>>
>>>
>>> I don't think this is a default value to the fuse mount. One place where
>>> this helps us is with the script based file migration and rebalance - we
>>> can provide a negative pid to  the special client mount to ensure these
>>> fops are also treated as internal fops.
>>>
>>> In the meantime I do not see the harm in having this option available as
>>> it allows a specific purpose. Are there any other client processes that use
>>> this?
>>>
>>>I think we need to resolve it. Please share your view on the same.

 Thanks,
 Mohit Agrawal

-- 
regards
Aravinda VK



Re: [Gluster-devel] Query regarding exposing client-pid to the fuse process

2019-10-13 Thread Amar Tumballi
On Fri, Oct 11, 2019 at 5:05 PM Mohit Agrawal  wrote:

> Hi,
>
> Yes, you are right it is not a default value.
>
> We can assign the client_pid only while volume has mounted after through a
> glusterfs binary directly like
> /usr/local/sbin/glusterfs --process-name fuse --volfile-server=192.168.1.3
> --client-pid=-3 --volfile-id=/test /mnt1
>
>
I agree that this is risky in general, and good to fix. But as the check
for this happens after the basic auth check in RPC (IP/user based), it should
be OK. It would be good to open a GitHub issue with some possible design
options so we can have more discussion on this.

-Amar



> Regards,
> Mohit Agrawal
>
>
> On Fri, Oct 11, 2019 at 4:52 PM Nithya Balachandran 
> wrote:
>
>>
>>
>> On Fri, 11 Oct 2019 at 14:56, Mohit Agrawal  wrote:
>>
>>> Hi,
>>>
>>>   I have a query specific to authenticate a client based on the PID
>>> (client-pid).
>>>   It can break the bricks xlator functionality, Usually, on the brick
>>> side we take a decision about the
>>>source of fop request based on PID.If PID value is -ve xlator
>>> considers the request has come from an internal
>>>   client otherwise it has come from an external client.
>>>
>>>   If a user has mounted the volume through fuse after provide
>>> --client-pid to command line argument similar to internal client PID
>>>   in that case brick_xlator consider external fop request also as an
>>> internal and it will break functionality.
>>>
>>>   We are checking pid in (lease, posix-lock, worm, trash) xlator to know
>>> about the source of the fops.
>>>   Even there are other brick xlators also we are checking specific PID
>>> value for all internal
>>>   clients that can be break if the external client has the same pid.
>>>
>>>   My query is why we need to expose client-pid as an argument to the
>>> fuse process?
>>>
>>
>>
>> I don't think this is a default value to the fuse mount. One place where
>> this helps us is with the script based file migration and rebalance - we
>> can provide a negative pid to  the special client mount to ensure these
>> fops are also treated as internal fops.
>>
>> In the meantime I do not see the harm in having this option available as
>> it allows a specific purpose. Are there any other client processes that use
>> this?
>>
>>I think we need to resolve it. Please share your view on the same.
>>>
>>> Thanks,
>>> Mohit Agrawal



Re: [Gluster-devel] Query regarding exposing client-pid to the fuse process

2019-10-11 Thread Mohit Agrawal
Hi,

Yes, you are right, it is not a default value.

We can assign client_pid only when the volume is mounted by invoking the
glusterfs binary directly, for example:
/usr/local/sbin/glusterfs --process-name fuse --volfile-server=192.168.1.3
--client-pid=-3 --volfile-id=/test /mnt1

Regards,
Mohit Agrawal


On Fri, Oct 11, 2019 at 4:52 PM Nithya Balachandran 
wrote:

>
>
> On Fri, 11 Oct 2019 at 14:56, Mohit Agrawal  wrote:
>
>> Hi,
>>
>>   I have a query specific to authenticate a client based on the PID
>> (client-pid).
>>   It can break the bricks xlator functionality, Usually, on the brick
>> side we take a decision about the
>>source of fop request based on PID.If PID value is -ve xlator
>> considers the request has come from an internal
>>   client otherwise it has come from an external client.
>>
>>   If a user has mounted the volume through fuse after provide
>> --client-pid to command line argument similar to internal client PID
>>   in that case brick_xlator consider external fop request also as an
>> internal and it will break functionality.
>>
>>   We are checking pid in (lease, posix-lock, worm, trash) xlator to know
>> about the source of the fops.
>>   Even there are other brick xlators also we are checking specific PID
>> value for all internal
>>   clients that can be break if the external client has the same pid.
>>
>>   My query is why we need to expose client-pid as an argument to the fuse
>> process?
>>
>
>
> I don't think this is a default value to the fuse mount. One place where
> this helps us is with the script based file migration and rebalance - we
> can provide a negative pid to  the special client mount to ensure these
> fops are also treated as internal fops.
>
> In the meantime I do not see the harm in having this option available as
> it allows a specific purpose. Are there any other client processes that use
> this?
>
>I think we need to resolve it. Please share your view on the same.
>>
>> Thanks,
>> Mohit Agrawal



Re: [Gluster-devel] Query regarding exposing client-pid to the fuse process

2019-10-11 Thread Nithya Balachandran
On Fri, 11 Oct 2019 at 14:56, Mohit Agrawal  wrote:

> Hi,
>
>   I have a query specific to authenticate a client based on the PID
> (client-pid).
>   It can break the bricks xlator functionality, Usually, on the brick side
> we take a decision about the
>source of fop request based on PID.If PID value is -ve xlator considers
> the request has come from an internal
>   client otherwise it has come from an external client.
>
>   If a user has mounted the volume through fuse after provide --client-pid
> to command line argument similar to internal client PID
>   in that case brick_xlator consider external fop request also as an
> internal and it will break functionality.
>
>   We are checking pid in (lease, posix-lock, worm, trash) xlator to know
> about the source of the fops.
>   Even there are other brick xlators also we are checking specific PID
> value for all internal
>   clients that can be break if the external client has the same pid.
>
>   My query is why we need to expose client-pid as an argument to the fuse
> process?
>


I don't think this is a default value for the fuse mount. One place where
this helps us is with script-based file migration and rebalance: we can
provide a negative pid to the special client mount to ensure those fops are
also treated as internal fops.

In the meantime, I do not see the harm in having this option available, as it
serves a specific purpose. Are there any other client processes that use
this?

   I think we need to resolve it. Please share your view on the same.
>
> Thanks,
> Mohit Agrawal



[Gluster-devel] Query regarding exposing client-pid to the fuse process

2019-10-11 Thread Mohit Agrawal
Hi,

  I have a query about authenticating a client based on the PID
(client-pid).
  It can break brick-side xlator functionality. On the brick side we usually
decide the source of a fop request based on the PID: if the PID value is
negative, the xlator considers the request to have come from an internal
client; otherwise it is treated as coming from an external client.

  If a user mounts the volume through fuse and passes a --client-pid
command-line argument that matches an internal client PID, the brick xlators
treat those external fop requests as internal, which breaks functionality.

  We check the pid in the lease, posix-lock, worm and trash xlators to know
the source of the fops. Other brick xlators also check specific PID values
for internal clients, and those checks can break if an external client uses
the same pid.
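  For illustration, a minimal sketch of the kind of brick-side check being
  described (simplified stand-ins, not the actual xlator code; in GlusterFS
  the originating PID travels in frame->root->pid):

  /*
   * Illustrative only: internal clients (self-heal, rebalance, gsyncd, ...)
   * use negative PIDs, so a fuse mount started with --client-pid=-3 looks
   * exactly like an internal client to a check of this shape.
   */
  #include <stdbool.h>

  struct toy_call_root {
      int pid;                      /* stand-in for frame->root->pid */
  };

  static bool fop_is_internal(const struct toy_call_root *root)
  {
      return root->pid < 0;         /* negative PID => internal client */
  }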

  My query is: why do we need to expose client-pid as an argument to the
fuse process?
   I think we need to resolve it. Please share your view on the same.

Thanks,
Mohit Agrawal



Re: [Gluster-devel] Query regarding dictionary logic

2019-05-02 Thread Vijay Bellur
Hi Mohit,

Thank you for the update. More inline.

On Wed, May 1, 2019 at 11:45 PM Mohit Agrawal  wrote:

> Hi Vijay,
>
> I have tried to execute smallfile tool on volume(12x3), i have not found
> any significant performance improvement
> for smallfile operations, I have configured 4 clients and 8 thread to run
> operations.
>

For measuring performance, did you measure both the time taken and the CPU
consumed? Normally O(n) computations are CPU-expensive, and we might see
better results with a hash table when a large number of objects (a few
thousand) are present in a single dictionary. If you haven't gathered CPU
statistics, please gather those as well for comparison.
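As a self-contained illustration (plain POSIX calls, independent of the
smallfile tool and of GlusterFS), both numbers can be captured around a test
workload roughly like this:

/* Sketch: capture wall-clock time and CPU time around a workload. */
#include <stdio.h>
#include <time.h>
#include <sys/time.h>
#include <sys/resource.h>

static double tv_sec(struct timeval tv)
{
    return tv.tv_sec + tv.tv_usec / 1e6;
}

int main(void)
{
    struct timespec t0, t1;
    struct rusage ru;

    clock_gettime(CLOCK_MONOTONIC, &t0);
    /* ... run the dictionary-heavy workload here ... */
    clock_gettime(CLOCK_MONOTONIC, &t1);
    getrusage(RUSAGE_SELF, &ru);

    printf("wall %.3fs  user-cpu %.3fs  sys-cpu %.3fs\n",
           (t1.tv_sec - t0.tv_sec) + (t1.tv_nsec - t0.tv_nsec) / 1e9,
           tv_sec(ru.ru_utime), tv_sec(ru.ru_stime));
    return 0;
}

For the long-running gluster daemons themselves, sampling the same numbers
externally (for example with pidstat, or from /proc/<pid>/stat before and
after the run) gives the equivalent comparison.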


> I have generated statedump and found below data for dictionaries specific
> to gluster processes
>
> brick
> max-pairs-per-dict=50
> total-pairs-used=192212171
> total-dicts-used=24794349
> average-pairs-per-dict=7
>
>
> glusterd
> max-pairs-per-dict=301
> total-pairs-used=156677
> total-dicts-used=30719
> average-pairs-per-dict=5
>
>
> fuse process
> [dict]
> max-pairs-per-dict=50
> total-pairs-used=88669561
> total-dicts-used=12360543
> average-pairs-per-dict=7
>
> It seems dictionary has max-pairs in case of glusterd and while no. of
> volumes are high the number can be increased.
> I think there is no performance regression in case of brick and fuse. I
> have used hash_size 20 for the dictionary.
> Let me know if you can provide some other test to validate the same.
>

A few more items to try out:

1. Vary the number of buckets and test.
2. Create about 1 volumes and measure performance for a volume info
 operation on some random volume?
3. Check the related patch from Facebook and see if we can incorporate any
ideas from their patch.

Thanks,
Vijay



> Thanks,
> Mohit Agrawal
>
> On Tue, Apr 30, 2019 at 2:29 PM Mohit Agrawal  wrote:
>
>> Thanks, Amar for sharing the patch, I will test and share the result.
>>
>> On Tue, Apr 30, 2019 at 2:23 PM Amar Tumballi Suryanarayan <
>> atumb...@redhat.com> wrote:
>>
>>> Shreyas/Kevin tried to address it some time back using
>>> https://bugzilla.redhat.com/show_bug.cgi?id=1428049 (
>>> https://review.gluster.org/16830)
>>>
>>> I vaguely remember the reason to keep the hash value 1 was done during
>>> the time when we had dictionary itself sent as on wire protocol, and in
>>> most other places, number of entries in dictionary was on an avg, 3. So, we
>>> felt, saving on a bit of memory for optimization was better at that time.
>>>
>>> -Amar
>>>
>>> On Tue, Apr 30, 2019 at 12:02 PM Mohit Agrawal 
>>> wrote:
>>>
 sure Vijay, I will try and update.

 Regards,
 Mohit Agrawal

 On Tue, Apr 30, 2019 at 11:44 AM Vijay Bellur 
 wrote:

> Hi Mohit,
>
> On Mon, Apr 29, 2019 at 7:15 AM Mohit Agrawal 
> wrote:
>
>> Hi All,
>>
>>   I was just looking at the code of dict, I have one query current
>> dictionary logic.
>>   I am not able to understand why we use hash_size is 1 for a
>> dictionary.IMO with the
>>   hash_size of 1 dictionary always work like a list, not a hash, for
>> every lookup
>>   in dictionary complexity is O(n).
>>
>>   Before optimizing the code I just want to know what was the exact
>> reason to define
>>   hash_size is 1?
>>
>
> This is a good question. I looked up the source in gluster's historic
> repo [1] and hash_size is 1 even there. So, this could have been the case
> since the first version of the dictionary code.
>
> Would you be able to run some tests with a larger hash_size and share
> your observations?
>
> Thanks,
> Vijay
>
> [1]
> https://github.com/gluster/historic/blob/master/libglusterfs/src/dict.c
>
>
>


Re: [Gluster-devel] Query regarding dictionary logic

2019-05-02 Thread Mohit Agrawal
Hi Vijay,

I have tried running the smallfile tool on a volume (12x3) and have not
found any significant performance improvement for smallfile operations. I
configured 4 clients and 8 threads to run the operations.

I generated statedumps and found the data below for the dictionaries in each
gluster process:

brick
max-pairs-per-dict=50
total-pairs-used=192212171
total-dicts-used=24794349
average-pairs-per-dict=7


glusterd
max-pairs-per-dict=301
total-pairs-used=156677
total-dicts-used=30719
average-pairs-per-dict=5


fuse process
[dict]
max-pairs-per-dict=50
total-pairs-used=88669561
total-dicts-used=12360543
average-pairs-per-dict=7

It seems the dictionary has the most pairs in the case of glusterd, and when
the number of volumes is high that number can increase further.
I think there is no performance regression in the case of the brick and fuse
processes. I used a hash_size of 20 for the dictionary.
Let me know if you can suggest some other test to validate the same.

Thanks,
Mohit Agrawal

On Tue, Apr 30, 2019 at 2:29 PM Mohit Agrawal  wrote:

> Thanks, Amar for sharing the patch, I will test and share the result.
>
> On Tue, Apr 30, 2019 at 2:23 PM Amar Tumballi Suryanarayan <
> atumb...@redhat.com> wrote:
>
>> Shreyas/Kevin tried to address it some time back using
>> https://bugzilla.redhat.com/show_bug.cgi?id=1428049 (
>> https://review.gluster.org/16830)
>>
>> I vaguely remember the reason to keep the hash value 1 was done during
>> the time when we had dictionary itself sent as on wire protocol, and in
>> most other places, number of entries in dictionary was on an avg, 3. So, we
>> felt, saving on a bit of memory for optimization was better at that time.
>>
>> -Amar
>>
>> On Tue, Apr 30, 2019 at 12:02 PM Mohit Agrawal 
>> wrote:
>>
>>> sure Vijay, I will try and update.
>>>
>>> Regards,
>>> Mohit Agrawal
>>>
>>> On Tue, Apr 30, 2019 at 11:44 AM Vijay Bellur 
>>> wrote:
>>>
 Hi Mohit,

 On Mon, Apr 29, 2019 at 7:15 AM Mohit Agrawal 
 wrote:

> Hi All,
>
>   I was just looking at the code of dict, I have one query current
> dictionary logic.
>   I am not able to understand why we use hash_size is 1 for a
> dictionary.IMO with the
>   hash_size of 1 dictionary always work like a list, not a hash, for
> every lookup
>   in dictionary complexity is O(n).
>
>   Before optimizing the code I just want to know what was the exact
> reason to define
>   hash_size is 1?
>

 This is a good question. I looked up the source in gluster's historic
 repo [1] and hash_size is 1 even there. So, this could have been the case
 since the first version of the dictionary code.

 Would you be able to run some tests with a larger hash_size and share
 your observations?

 Thanks,
 Vijay

 [1]
 https://github.com/gluster/historic/blob/master/libglusterfs/src/dict.c



>
>   Please share your view on the same.
>
> Thanks,
> Mohit Agrawal

>>
>>
>>
>> --
>> Amar Tumballi (amarts)
>>
>

Re: [Gluster-devel] Query regarding dictionary logic

2019-04-30 Thread Mohit Agrawal
Thanks, Amar, for sharing the patch. I will test and share the result.

On Tue, Apr 30, 2019 at 2:23 PM Amar Tumballi Suryanarayan <
atumb...@redhat.com> wrote:

> Shreyas/Kevin tried to address it some time back using
> https://bugzilla.redhat.com/show_bug.cgi?id=1428049 (
> https://review.gluster.org/16830)
>
> I vaguely remember the reason to keep the hash value 1 was done during the
> time when we had dictionary itself sent as on wire protocol, and in most
> other places, number of entries in dictionary was on an avg, 3. So, we
> felt, saving on a bit of memory for optimization was better at that time.
>
> -Amar
>
> On Tue, Apr 30, 2019 at 12:02 PM Mohit Agrawal 
> wrote:
>
>> sure Vijay, I will try and update.
>>
>> Regards,
>> Mohit Agrawal
>>
>> On Tue, Apr 30, 2019 at 11:44 AM Vijay Bellur  wrote:
>>
>>> Hi Mohit,
>>>
>>> On Mon, Apr 29, 2019 at 7:15 AM Mohit Agrawal 
>>> wrote:
>>>
 Hi All,

   I was just looking at the code of dict, I have one query current
 dictionary logic.
   I am not able to understand why we use hash_size is 1 for a
 dictionary.IMO with the
   hash_size of 1 dictionary always work like a list, not a hash, for
 every lookup
   in dictionary complexity is O(n).

   Before optimizing the code I just want to know what was the exact
 reason to define
   hash_size is 1?

>>>
>>> This is a good question. I looked up the source in gluster's historic
>>> repo [1] and hash_size is 1 even there. So, this could have been the case
>>> since the first version of the dictionary code.
>>>
>>> Would you be able to run some tests with a larger hash_size and share
>>> your observations?
>>>
>>> Thanks,
>>> Vijay
>>>
>>> [1]
>>> https://github.com/gluster/historic/blob/master/libglusterfs/src/dict.c
>>>
>>>
>>>

   Please share your view on the same.

 Thanks,
 Mohit Agrawal
>>>
>
>
>
> --
> Amar Tumballi (amarts)
>

Re: [Gluster-devel] Query regarding dictionary logic

2019-04-30 Thread Amar Tumballi Suryanarayan
Shreyas/Kevin tried to address it some time back using
https://bugzilla.redhat.com/show_bug.cgi?id=1428049 (
https://review.gluster.org/16830)

I vaguely remember that the decision to keep the hash size at 1 was made
back when we had the dictionary itself sent as the on-wire protocol, and in
most other places the number of entries in a dictionary was about 3 on
average. So we felt that saving a bit of memory was the better optimization
at that time.
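A rough illustration of that trade-off (the numbers and layout here are
illustrative, not the real dict_t from libglusterfs):

/*
 * Illustration only: every dictionary instance carries hash_size bucket
 * pointers, so for dicts that hold ~3 entries on average a big bucket
 * array is mostly wasted, while hash_size == 1 costs a single pointer.
 */
#include <stdio.h>

int main(void)
{
    const unsigned long sizes[] = { 1, 16, 64, 1024 };
    unsigned long i;

    for (i = 0; i < sizeof(sizes) / sizeof(sizes[0]); i++)
        printf("hash_size %4lu -> %5lu bytes of bucket pointers per dict\n",
               sizes[i], sizes[i] * (unsigned long)sizeof(void *));
    return 0;
}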

-Amar

On Tue, Apr 30, 2019 at 12:02 PM Mohit Agrawal  wrote:

> sure Vijay, I will try and update.
>
> Regards,
> Mohit Agrawal
>
> On Tue, Apr 30, 2019 at 11:44 AM Vijay Bellur  wrote:
>
>> Hi Mohit,
>>
>> On Mon, Apr 29, 2019 at 7:15 AM Mohit Agrawal 
>> wrote:
>>
>>> Hi All,
>>>
>>>   I was just looking at the code of dict, I have one query current
>>> dictionary logic.
>>>   I am not able to understand why we use hash_size is 1 for a
>>> dictionary.IMO with the
>>>   hash_size of 1 dictionary always work like a list, not a hash, for
>>> every lookup
>>>   in dictionary complexity is O(n).
>>>
>>>   Before optimizing the code I just want to know what was the exact
>>> reason to define
>>>   hash_size is 1?
>>>
>>
>> This is a good question. I looked up the source in gluster's historic
>> repo [1] and hash_size is 1 even there. So, this could have been the case
>> since the first version of the dictionary code.
>>
>> Would you be able to run some tests with a larger hash_size and share
>> your observations?
>>
>> Thanks,
>> Vijay
>>
>> [1]
>> https://github.com/gluster/historic/blob/master/libglusterfs/src/dict.c
>>
>>
>>
>>>
>>>   Please share your view on the same.
>>>
>>> Thanks,
>>> Mohit Agrawal
>>



-- 
Amar Tumballi (amarts)

Re: [Gluster-devel] Query regarding dictionary logic

2019-04-30 Thread Mohit Agrawal
sure Vijay, I will try and update.

Regards,
Mohit Agrawal

On Tue, Apr 30, 2019 at 11:44 AM Vijay Bellur  wrote:

> Hi Mohit,
>
> On Mon, Apr 29, 2019 at 7:15 AM Mohit Agrawal  wrote:
>
>> Hi All,
>>
>>   I was just looking at the code of dict, I have one query current
>> dictionary logic.
>>   I am not able to understand why we use hash_size is 1 for a
>> dictionary.IMO with the
>>   hash_size of 1 dictionary always work like a list, not a hash, for
>> every lookup
>>   in dictionary complexity is O(n).
>>
>>   Before optimizing the code I just want to know what was the exact
>> reason to define
>>   hash_size is 1?
>>
>
> This is a good question. I looked up the source in gluster's historic repo
> [1] and hash_size is 1 even there. So, this could have been the case since
> the first version of the dictionary code.
>
> Would you be able to run some tests with a larger hash_size and share your
> observations?
>
> Thanks,
> Vijay
>
> [1]
> https://github.com/gluster/historic/blob/master/libglusterfs/src/dict.c
>
>
>
>>
>>   Please share your view on the same.
>>
>> Thanks,
>> Mohit Agrawal
>
>

Re: [Gluster-devel] Query regarding dictionary logic

2019-04-30 Thread Vijay Bellur
Hi Mohit,

On Mon, Apr 29, 2019 at 7:15 AM Mohit Agrawal  wrote:

> Hi All,
>
>   I was just looking at the code of dict, I have one query current
> dictionary logic.
>   I am not able to understand why we use hash_size is 1 for a
> dictionary.IMO with the
>   hash_size of 1 dictionary always work like a list, not a hash, for every
> lookup
>   in dictionary complexity is O(n).
>
>   Before optimizing the code I just want to know what was the exact reason
> to define
>   hash_size is 1?
>

This is a good question. I looked up the source in gluster's historic repo
[1] and hash_size is 1 even there. So, this could have been the case since
the first version of the dictionary code.

Would you be able to run some tests with a larger hash_size and share your
observations?

Thanks,
Vijay

[1] https://github.com/gluster/historic/blob/master/libglusterfs/src/dict.c



>
>   Please share your view on the same.
>
> Thanks,
> Mohit Agrawal

[Gluster-devel] Query regarding dictionary logic

2019-04-29 Thread Mohit Agrawal
Hi All,

  I was just looking at the dict code and I have one query about the current
dictionary logic.
  I am not able to understand why we use a hash_size of 1 for a dictionary.
IMO, with a hash_size of 1 the dictionary always works like a list, not a
hash, and every lookup in the dictionary has O(n) complexity.

  Before optimizing the code I just want to know: what was the exact reason
for defining hash_size as 1?
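  To illustrate the point, a toy sketch (hypothetical structures, not the
  actual dict.c code) of why a bucket count of 1 degenerates into a linear
  scan:

  /*
   * Toy illustration: with hash_size == 1 every key lands in bucket 0, so a
   * lookup walks the whole member chain and is O(n); with more buckets the
   * expected chain length shrinks to roughly n / number-of-buckets.
   */
  #include <stddef.h>
  #include <string.h>

  typedef struct toy_pair {
      const char      *key;
      void            *value;
      struct toy_pair *hash_next;
  } toy_pair_t;

  typedef struct {
      unsigned     hash_size;     /* 1 in today's dict_t */
      toy_pair_t **members;       /* array of hash_size bucket heads */
  } toy_dict_t;

  static unsigned toy_hash(const char *key)
  {
      unsigned h = 5381;                          /* djb2, for illustration */
      while (*key)
          h = h * 33 + (unsigned char)*key++;
      return h;
  }

  static toy_pair_t *toy_lookup(const toy_dict_t *d, const char *key)
  {
      unsigned bucket = toy_hash(key) % d->hash_size; /* always 0 if hash_size == 1 */
      toy_pair_t *p;

      for (p = d->members[bucket]; p != NULL; p = p->hash_next)
          if (strcmp(p->key, key) == 0)
              return p;                             /* linear scan of the chain */
      return NULL;
  }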

  Please share your view on the same.

Thanks,
Mohit Agrawal

Re: [Gluster-devel] query about one glustershd coredump issue

2018-09-28 Thread Zhou, Cynthia (NSB - CN/Hangzhou)


From: Ravishankar N 
Sent: Thursday, September 27, 2018 6:04 PM
To: Zhou, Cynthia (NSB - CN/Hangzhou) 
Subject: Re: query about one glustershd coredump issue


Hi,

I think it is better to send it to the gluster-users mailing list to get
more attention.
Regards,
Ravi
On 09/27/2018 01:10 PM, Zhou, Cynthia (NSB - CN/Hangzhou) wrote:
Hi,
In my test env (glusterfs 3.12.3, 3-brick config), when restarting sn-0, sn-1
and sn-2 at the same time, I occasionally hit a glustershd coredump on sn-0.
Could you help shed some light on this issue? Thanks!


My gdb result of the coredump file

[root@sn-0:/root]
# gdb /usr/sbin/glusterfs 
core.glusterfs.0.c5f0c5547fbd4e5aa8f350b748e5675e.1812.153796707500
GNU gdb (GDB) Fedora 8.1-14.wf29
Copyright (C) 2018 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later 

This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.  Type "show copying"
and "show warranty" for details.
This GDB was configured as "x86_64-redhat-linux-gnu".
Type "show configuration" for configuration details.
For bug reporting instructions, please see:
.
Find the GDB manual and other documentation resources online at:
.
For help, type "help".
Type "apropos word" to search for commands related to "word"...
Reading symbols from /usr/sbin/glusterfs...(no debugging symbols found)...done.

warning: core file may not match specified executable file.
[New LWP 1818]
[New LWP 1812]
[New LWP 1813]
[New LWP 1817]
[New LWP 1966]
[New LWP 1968]
[New LWP 1970]
[New LWP 1974]
[New LWP 1976]
[New LWP 1814]
[New LWP 1815]
[New LWP 1816]
[New LWP 1828]
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib64/libthread_db.so.1".
Core was generated by `/usr/sbin/glusterfs -s sn-0.local --volfile-id 
gluster/glustershd -p /var/run/g'.
Program terminated with signal SIGSEGV, Segmentation fault.
#0  0x7f1b5e5d7d24 in client3_3_lookup_cbk (req=0x7f1b44002300, 
iov=0x7f1b44002340, count=1,
myframe=0x7f1b4401c850) at client-rpc-fops.c:2802
2802  client-rpc-fops.c: No such file or directory.
[Current thread is 1 (Thread 0x7f1b5f00c700 (LWP 1818))]
Missing separate debuginfos, use: dnf debuginfo-install 
rcp-pack-glusterfs-1.2.0-RCP2.wf29.x86_64
(gdb) bt
#0  0x7f1b5e5d7d24 in client3_3_lookup_cbk (req=0x7f1b44002300, 
iov=0x7f1b44002340, count=1,
myframe=0x7f1b4401c850) at client-rpc-fops.c:2802
#1  0x7f1b64553d47 in rpc_clnt_handle_reply (clnt=0x7f1b5808bbb0, 
pollin=0x7f1b580c6620)
at rpc-clnt.c:778
#2  0x7f1b645542e5 in rpc_clnt_notify (trans=0x7f1b5808bde0, 
mydata=0x7f1b5808bbe0,
event=RPC_TRANSPORT_MSG_RECEIVED, data=0x7f1b580c6620) at rpc-clnt.c:971
#3  0x7f1b64550319 in rpc_transport_notify (this=0x7f1b5808bde0, 
event=RPC_TRANSPORT_MSG_RECEIVED,
data=0x7f1b580c6620) at rpc-transport.c:538
#4  0x7f1b5f49734d in socket_event_poll_in (this=0x7f1b5808bde0, 
notify_handled=_gf_true)
at socket.c:2315
#5  0x7f1b5f497992 in socket_event_handler (fd=25, idx=15, gen=7, 
data=0x7f1b5808bde0, poll_in=1,
poll_out=0, poll_err=0) at socket.c:2471
#6  0x7f1b647fe5ac in event_dispatch_epoll_handler (event_pool=0x230cb00, 
event=0x7f1b5f00be84)
at event-epoll.c:583
#7  0x7f1b647fe883 in event_dispatch_epoll_worker (data=0x23543d0) at 
event-epoll.c:659
#8  0x7f1b6354a5da in start_thread () from /lib64/libpthread.so.0
#9  0x7f1b62e20cbf in clone () from /lib64/libc.so.6
(gdb) info thread
  Id   Target Id Frame
* 1Thread 0x7f1b5f00c700 (LWP 1818) 0x7f1b5e5d7d24 in 
client3_3_lookup_cbk (req=0x7f1b44002300,
iov=0x7f1b44002340, count=1, myframe=0x7f1b4401c850) at 
client-rpc-fops.c:2802
  2Thread 0x7f1b64c83780 (LWP 1812) 0x7f1b6354ba3d in 
__pthread_timedjoin_ex ()
   from /lib64/libpthread.so.0
  3Thread 0x7f1b61eae700 (LWP 1813) 0x7f1b63554300 in nanosleep () from 
/lib64/libpthread.so.0
  4Thread 0x7f1b5feaa700 (LWP 1817) 0x7f1b635508ca in 
pthread_cond_timedwait@@GLIBC_2.3.2 
()
   from /lib64/libpthread.so.0
  5Thread 0x7f1b5ca2b700 (LWP 1966) 0x7f1b62dee4b0 in nanosleep () from 
/lib64/libc.so.6
  6Thread 0x7f1b4f7fe700 (LWP 1968) 0x7f1b6355050c in 
pthread_cond_wait@@GLIBC_2.3.2 ()
   from /lib64/libpthread.so.0
  7Thread 0x7f1b4e7fc700 (LWP 1970) 0x7f1b6355050c in 
pthread_cond_wait@@GLIBC_2.3.2 ()
   from /lib64/libpthread.so.0
  8Thread 0x7f1b4d7fa700 (LWP 1974) 0x7f1b62dee4b0 in nanosleep () from 
/lib64/libc.so.6
  9Thread 0x7f1b33fff700 (LWP 1976) 0x7f1b62dee4b0 in nanosleep () from 
/lib64/libc.so.6
  10   

Re: [Gluster-devel] query about a split-brain problem found in glusterfs3.12.3

2018-02-10 Thread Zhou, Cynthia (NSB - CN/Hangzhou)
Hi,
I checked the link you provided. It does not mention the "dirty" attribute.
If I try to fix this split-brain with a manual setfattr command, should I
only set the "trusted.afr.export-client-0" attribute?
By the way, I find it quite strange that the output of the "gluster volume
heal export info" command shows two entries with the same name; how does this
happen?
gluster volume heal export info
Brick sn-0.local:/mnt/bricks/export/brick
Status: Connected
Number of entries: 0

Brick sn-1.local:/mnt/bricks/export/brick
/testdir - Is in split-brain

/testdir - Possibly undergoing heal

Status: Connected
Number of entries: 2

I also did some other tests: when the sn-0 side file/dir does not have the
"dirty" and "trusted.afr.export-client-*" attributes and the sn-1 side
file/dir has both "dirty" and "trusted.afr.export-client-*" non-zero, gluster
can self-heal that scenario. But in this case it can never self-heal.

From: Ravishankar N [mailto:ravishan...@redhat.com]
Sent: Thursday, February 08, 2018 11:56 AM
To: Zhou, Cynthia (NSB - CN/Hangzhou) ; 
Gluster-devel@gluster.org
Subject: Re: query about a split-brain problem found in glusterfs3.12.3




On 02/08/2018 07:16 AM, Zhou, Cynthia (NSB - CN/Hangzhou) wrote:
Hi,
Thanks for responding?
If split-brain happen in such kind of test is reasonable, how to fix this 
split-brain situation?

If you are using replica 2, then there is no prevention. Once they occur, you 
can resolve them using 
http://docs.gluster.org/en/latest/Troubleshooting/resolving-splitbrain/

If you want to prevent split-brain, you would need to use replica 3 or arbiter 
volume.

Regards,
Ravi

From: Ravishankar N [mailto:ravishan...@redhat.com]
Sent: Thursday, February 08, 2018 12:12 AM
To: Zhou, Cynthia (NSB - CN/Hangzhou) 
; 
Gluster-devel@gluster.org
Subject: Re: query about a split-brain problem found in glusterfs3.12.3




On 02/07/2018 10:39 AM, Zhou, Cynthia (NSB - CN/Hangzhou) wrote:

Hi glusterfs expert:
   Good day.
   Lately, we meet a glusterfs split brain problem in our env in 
/mnt/export/testdir. We start 3 ior process (IOR tool) from non-sn nodes, which 
is creating/removing files repeatedly in testdir. then we reboot sn nodes(sn0 
and sn1) by sequence. Then we meet following problem.
Do you have some comments on how this could happen? And how to fix it in 
this situation? Thanks!

Is the problem that split-brain is happening? Is this a replica 2 volume? If 
yes, then it looks like it is expected behavior?
Regards
Ravi




gluster volume heal export info
Brick sn-0.local:/mnt/bricks/export/brick
Status: Connected
Number of entries: 0

Brick sn-1.local:/mnt/bricks/export/brick
/testdir - Is in split-brain

/testdir - Possibly undergoing heal

Status: Connected
Number of entries: 2

wait for a while .

gluster volume heal export info
Brick sn-0.local:/mnt/bricks/export/brick
Status: Connected
Number of entries: 0

Brick sn-1.local:/mnt/bricks/export/brick
/testdir - Possibly undergoing heal

/testdir - Possibly undergoing heal

and finally:

[root@sn-0:/root]
# gluster v heal export info
Brick sn-0.local:/mnt/bricks/export/brick
Status: Connected
Number of entries: 0

Brick sn-1.local:/mnt/bricks/export/brick
/testdir - Is in split-brain

Status: Connected
Number of entries: 1



[root@sn-0:/root]

# getfattr -m .* -d -e  hex /mnt/bricks/export/brick/testdir

getfattr: Removing leading '/' from absolute path names

# file: mnt/bricks/export/brick/testdir

trusted.gfid=0x5622cff893b3484dbdb6a20a0edb0e77

trusted.glusterfs.dht=0x0001



[root@sn-1:/root]

# getfattr -m .* -d -e  hex /mnt/bricks/export/brick/testdir

getfattr: Removing leading '/' from absolute path names

# file: mnt/bricks/export/brick/testdir

trusted.afr.dirty=0x0001

trusted.afr.export-client-0=0x0038

trusted.gfid=0x5622cff893b3484dbdb6a20a0edb0e77

trusted.glusterfs.dht=0x0001





Re: [Gluster-devel] query about a split-brain problem found in glusterfs3.12.3

2018-02-10 Thread Zhou, Cynthia (NSB - CN/Hangzhou)
Hi,
Thanks for responding!
If split-brain in this kind of test is expected, how do we fix this
split-brain situation?

From: Ravishankar N [mailto:ravishan...@redhat.com]
Sent: Thursday, February 08, 2018 12:12 AM
To: Zhou, Cynthia (NSB - CN/Hangzhou) ; 
Gluster-devel@gluster.org
Subject: Re: query about a split-brain problem found in glusterfs3.12.3




On 02/07/2018 10:39 AM, Zhou, Cynthia (NSB - CN/Hangzhou) wrote:

Hi glusterfs expert:
   Good day.
   Lately, we meet a glusterfs split brain problem in our env in 
/mnt/export/testdir. We start 3 ior process (IOR tool) from non-sn nodes, which 
is creating/removing files repeatedly in testdir. then we reboot sn nodes(sn0 
and sn1) by sequence. Then we meet following problem.
Do you have some comments on how this could happen? And how to fix it in 
this situation? Thanks!

Is the problem that split-brain is happening? Is this a replica 2 volume? If 
yes, then it looks like it is expected behavior?
Regards
Ravi



gluster volume heal export info
Brick sn-0.local:/mnt/bricks/export/brick
Status: Connected
Number of entries: 0

Brick sn-1.local:/mnt/bricks/export/brick
/testdir - Is in split-brain

/testdir - Possibly undergoing heal

Status: Connected
Number of entries: 2

wait for a while .

gluster volume heal export info
Brick sn-0.local:/mnt/bricks/export/brick
Status: Connected
Number of entries: 0

Brick sn-1.local:/mnt/bricks/export/brick
/testdir - Possibly undergoing heal

/testdir - Possibly undergoing heal

and finally:

[root@sn-0:/root]
# gluster v heal export info
Brick sn-0.local:/mnt/bricks/export/brick
Status: Connected
Number of entries: 0

Brick sn-1.local:/mnt/bricks/export/brick
/testdir - Is in split-brain

Status: Connected
Number of entries: 1



[root@sn-0:/root]

# getfattr -m .* -d -e  hex /mnt/bricks/export/brick/testdir

getfattr: Removing leading '/' from absolute path names

# file: mnt/bricks/export/brick/testdir

trusted.gfid=0x5622cff893b3484dbdb6a20a0edb0e77

trusted.glusterfs.dht=0x0001



[root@sn-1:/root]

# getfattr -m .* -d -e  hex /mnt/bricks/export/brick/testdir

getfattr: Removing leading '/' from absolute path names

# file: mnt/bricks/export/brick/testdir

trusted.afr.dirty=0x0001

trusted.afr.export-client-0=0x0038

trusted.gfid=0x5622cff893b3484dbdb6a20a0edb0e77

trusted.glusterfs.dht=0x0001




Re: [Gluster-devel] query about a split-brain problem found in glusterfs3.12.3

2018-02-08 Thread Ravishankar N



On 02/08/2018 01:08 PM, Zhou, Cynthia (NSB - CN/Hangzhou) wrote:


Hi,

I check the link you provided. It does not mention the the “dirty” 
attribute, if I try to fix this split-brain by manually setfattr 
command, should I only set the “trusted.afr.export-client-0” command?


Manually resetting xattrs is not recommended. Use the gluster CLI to 
resolve it.


By the way, I feel it is quite strange that the output of “gluster 
volume heal export info” command there is two entries with the same 
name, how does this happen?


Maybe the same entry is listed in different subfolders of 
.glusterfs/indices?


gluster volume heal export info
Brick sn-0.local:/mnt/bricks/export/brick
Status: Connected
Number of entries: 0

Brick sn-1.local:/mnt/bricks/export/brick
/testdir - Is in split-brain

/testdir - Possibly undergoing heal

Status: Connected
Number of entries: 2

I also do some other test, when sn-0 side file/dir does not has 
“dirty” and “trusted.afr.export-client-*” attribute and sn-1 side 
file/dir has both “dirty” and “trusted.afr.export-client-*” non-zero. 
The gluster could self heal such scenario. But in this case the it 
could never self heal.


*From:*Ravishankar N [mailto:ravishan...@redhat.com]
*Sent:* Thursday, February 08, 2018 11:56 AM
*To:* Zhou, Cynthia (NSB - CN/Hangzhou) 
; Gluster-devel@gluster.org

*Subject:* Re: query about a split-brain problem found in glusterfs3.12.3

On 02/08/2018 07:16 AM, Zhou, Cynthia (NSB - CN/Hangzhou) wrote:

Hi,

Thanks for responding?

If split-brain happen in such kind of test is reasonable, how to
fix this split-brain situation?

If you are using replica 2, then there is no prevention. Once they 
occur, you can resolve them using 
http://docs.gluster.org/en/latest/Troubleshooting/resolving-splitbrain/


If you want to prevent split-brain, you would need to use replica 3 or 
arbiter volume.


Regards,
Ravi

*From:*Ravishankar N [mailto:ravishan...@redhat.com]
*Sent:* Thursday, February 08, 2018 12:12 AM
*To:* Zhou, Cynthia (NSB - CN/Hangzhou)

; Gluster-devel@gluster.org

*Subject:* Re: query about a split-brain problem found in
glusterfs3.12.3

On 02/07/2018 10:39 AM, Zhou, Cynthia (NSB - CN/Hangzhou) wrote:

Hi glusterfs expert:

Good day.

Lately, we meet a glusterfs split brain problem in our env in
/mnt/export/testdir. We start 3 ior process (IOR tool) from
non-sn nodes, which is creating/removing files repeatedly in
testdir. then we reboot sn nodes(sn0 and sn1) by sequence.
Then we meet following problem.

Do you have some comments on how this could happen? And how to
fix it in this situation? Thanks!


Is the problem that split-brain is happening? Is this a replica 2
volume? If yes, then it looks like it is expected behavior?
Regards
Ravi


gluster volume heal export info
Brick sn-0.local:/mnt/bricks/export/brick
Status: Connected
Number of entries: 0

Brick sn-1.local:/mnt/bricks/export/brick
/testdir - Is in split-brain

/testdir - Possibly undergoing heal

Status: Connected
Number of entries: 2

wait for a while …..

gluster volume heal export info
Brick sn-0.local:/mnt/bricks/export/brick
Status: Connected
Number of entries: 0

Brick sn-1.local:/mnt/bricks/export/brick
/testdir - Possibly undergoing heal

/testdir - Possibly undergoing heal

and finally:

[root@sn-0:/root ]
# gluster v heal export info
Brick sn-0.local:/mnt/bricks/export/brick

Status: Connected
Number of entries: 0

Brick sn-1.local:/mnt/bricks/export/brick

/testdir - Is in split-brain

Status: Connected
Number of entries: 1

[root@sn-0:/root ]

# getfattr -m .* -d -e  hex /mnt/bricks/export/brick/testdir

getfattr: Removing leading '/' from absolute path names

# file: mnt/bricks/export/brick/testdir

trusted.gfid=0x5622cff893b3484dbdb6a20a0edb0e77

trusted.glusterfs.dht=0x0001

[root@sn-1:/root ]

# getfattr -m .* -d -e  hex /mnt/bricks/export/brick/testdir

getfattr: Removing leading '/' from absolute path names

# file: mnt/bricks/export/brick/testdir

trusted.afr.dirty=0x0001

trusted.afr.export-client-0=0x0038

trusted.gfid=0x5622cff893b3484dbdb6a20a0edb0e77

trusted.glusterfs.dht=0x0001




Re: [Gluster-devel] query about a split-brain problem found in glusterfs3.12.3

2018-02-07 Thread Ravishankar N



On 02/08/2018 07:16 AM, Zhou, Cynthia (NSB - CN/Hangzhou) wrote:


Hi,

Thanks for responding?

If split-brain happen in such kind of test is reasonable, how to fix 
this split-brain situation?


If you are using replica 2, then there is no prevention. Once they 
occur, you can resolve them using 
http://docs.gluster.org/en/latest/Troubleshooting/resolving-splitbrain/


If you want to prevent split-brain, you would need to use replica 3 or 
arbiter volume.


Regards,
Ravi


*From:*Ravishankar N [mailto:ravishan...@redhat.com]
*Sent:* Thursday, February 08, 2018 12:12 AM
*To:* Zhou, Cynthia (NSB - CN/Hangzhou) 
; Gluster-devel@gluster.org

*Subject:* Re: query about a split-brain problem found in glusterfs3.12.3

On 02/07/2018 10:39 AM, Zhou, Cynthia (NSB - CN/Hangzhou) wrote:

Hi glusterfs expert:

Good day.

Lately, we meet a glusterfs split brain problem in our env in
/mnt/export/testdir. We start 3 ior process (IOR tool) from non-sn
nodes, which is creating/removing files repeatedly in testdir.
then we reboot sn nodes(sn0 and sn1) by sequence. Then we meet
following problem.

Do you have some comments on how this could happen? And how to fix
it in this situation? Thanks!


Is the problem that split-brain is happening? Is this a replica 2
volume? If yes, then it looks like expected behavior.

Regards
Ravi

gluster volume heal export info
Brick sn-0.local:/mnt/bricks/export/brick
Status: Connected
Number of entries: 0

Brick sn-1.local:/mnt/bricks/export/brick
/testdir - Is in split-brain

/testdir - Possibly undergoing heal

Status: Connected
Number of entries: 2

wait for a while …..

gluster volume heal export info
Brick sn-0.local:/mnt/bricks/export/brick
Status: Connected
Number of entries: 0

Brick sn-1.local:/mnt/bricks/export/brick
/testdir - Possibly undergoing heal

/testdir - Possibly undergoing heal

and finally:

[root@sn-0:/root ]
# gluster v heal export info
Brick sn-0.local:/mnt/bricks/export/brick

Status: Connected
Number of entries: 0

Brick sn-1.local:/mnt/bricks/export/brick

/testdir - Is in split-brain

Status: Connected
Number of entries: 1

[root@sn-0:/root ]

# getfattr -m .* -d -e  hex /mnt/bricks/export/brick/testdir

getfattr: Removing leading '/' from absolute path names

# file: mnt/bricks/export/brick/testdir

trusted.gfid=0x5622cff893b3484dbdb6a20a0edb0e77

trusted.glusterfs.dht=0x0001

[root@sn-1:/root ]

# getfattr -m .* -d -e  hex /mnt/bricks/export/brick/testdir

getfattr: Removing leading '/' from absolute path names

# file: mnt/bricks/export/brick/testdir

trusted.afr.dirty=0x0001

trusted.afr.export-client-0=0x0038

trusted.gfid=0x5622cff893b3484dbdb6a20a0edb0e77

trusted.glusterfs.dht=0x0001




Re: [Gluster-devel] query about a split-brain problem found in glusterfs3.12.3

2018-02-07 Thread Ravishankar N



On 02/07/2018 10:39 AM, Zhou, Cynthia (NSB - CN/Hangzhou) wrote:


Hi glusterfs expert:

Good day.

Lately, we meet a glusterfs split brain problem in our env in 
/mnt/export/testdir. We start 3 ior process (IOR tool) from non-sn 
nodes, which is creating/removing files repeatedly in testdir. then we 
reboot sn nodes(sn0 and sn1) by sequence. Then we meet following problem.


Do you have some comments on how this could happen? And how to fix it 
in this situation? Thanks!




Is the problem that split-brain is happening? Is this a replica 2
volume? If yes, then it looks like expected behavior.

Regards
Ravi


gluster volume heal export info
Brick sn-0.local:/mnt/bricks/export/brick
Status: Connected
Number of entries: 0

Brick sn-1.local:/mnt/bricks/export/brick
/testdir - Is in split-brain

/testdir - Possibly undergoing heal

Status: Connected
Number of entries: 2

wait for a while …..

gluster volume heal export info
Brick sn-0.local:/mnt/bricks/export/brick
Status: Connected
Number of entries: 0

Brick sn-1.local:/mnt/bricks/export/brick
/testdir - Possibly undergoing heal

/testdir - Possibly undergoing heal

and finally:

[root@sn-0:/root ]
# gluster v heal export info
Brick sn-0.local:/mnt/bricks/export/brick 


Status: Connected
Number of entries: 0

Brick sn-1.local:/mnt/bricks/export/brick 


/testdir - Is in split-brain

Status: Connected
Number of entries: 1

[root@sn-0:/root ]

# getfattr -m .* -d -e  hex /mnt/bricks/export/brick/testdir

getfattr: Removing leading '/' from absolute path names

# file: mnt/bricks/export/brick/testdir

trusted.gfid=0x5622cff893b3484dbdb6a20a0edb0e77

trusted.glusterfs.dht=0x0001

[root@sn-1:/root ]

# getfattr -m .* -d -e  hex /mnt/bricks/export/brick/testdir

getfattr: Removing leading '/' from absolute path names

# file: mnt/bricks/export/brick/testdir

trusted.afr.dirty=0x0001

trusted.afr.export-client-0=0x0038

trusted.gfid=0x5622cff893b3484dbdb6a20a0edb0e77

trusted.glusterfs.dht=0x0001




[Gluster-devel] query about a split-brain problem found in glusterfs3.12.3

2018-02-07 Thread Zhou, Cynthia (NSB - CN/Hangzhou)

Hi glusterfs expert:
   Good day.
   Lately we hit a glusterfs split-brain problem in our env in
/mnt/export/testdir. We start 3 ior processes (IOR tool) from non-sn nodes,
which create/remove files repeatedly in testdir; then we reboot the sn nodes
(sn-0 and sn-1) in sequence. Then we see the following problem.
Do you have any comments on how this could happen, and how to fix it in this
situation? Thanks!


gluster volume heal export info
Brick sn-0.local:/mnt/bricks/export/brick
Status: Connected
Number of entries: 0

Brick sn-1.local:/mnt/bricks/export/brick
/testdir - Is in split-brain

/testdir - Possibly undergoing heal

Status: Connected
Number of entries: 2

wait for a while .

gluster volume heal export info
Brick sn-0.local:/mnt/bricks/export/brick
Status: Connected
Number of entries: 0

Brick sn-1.local:/mnt/bricks/export/brick
/testdir - Possibly undergoing heal

/testdir - Possibly undergoing heal

and finally:

[root@sn-0:/root]
# gluster v heal export info
Brick sn-0.local:/mnt/bricks/export/brick
Status: Connected
Number of entries: 0

Brick sn-1.local:/mnt/bricks/export/brick
/testdir - Is in split-brain

Status: Connected
Number of entries: 1



[root@sn-0:/root]

# getfattr -m .* -d -e  hex /mnt/bricks/export/brick/testdir

getfattr: Removing leading '/' from absolute path names

# file: mnt/bricks/export/brick/testdir

trusted.gfid=0x5622cff893b3484dbdb6a20a0edb0e77

trusted.glusterfs.dht=0x0001



[root@sn-1:/root]

# getfattr -m .* -d -e  hex /mnt/bricks/export/brick/testdir

getfattr: Removing leading '/' from absolute path names

# file: mnt/bricks/export/brick/testdir

trusted.afr.dirty=0x0001

trusted.afr.export-client-0=0x0038

trusted.gfid=0x5622cff893b3484dbdb6a20a0edb0e77

trusted.glusterfs.dht=0x0001



[Gluster-devel] query about a split-brain problem found in glusterfs3.12.3

2018-02-07 Thread Zhou, Cynthia (NSB - CN/Hangzhou)
Hi glusterfs expert:
   Good day.
   Lately we hit a glusterfs split-brain problem in our env in
/mnt/export/testdir. We start 3 ior processes (IOR tool) from non-sn nodes,
which create/remove files repeatedly in testdir; then we reboot the sn nodes
(sn-0 and sn-1) in sequence. Then we see the following problem.
Do you have any comments on how this could happen, and how to fix it in this
situation? Thanks!


gluster volume heal export info
Brick sn-0.local:/mnt/bricks/export/brick
Status: Connected
Number of entries: 0

Brick sn-1.local:/mnt/bricks/export/brick
/testdir - Is in split-brain

/testdir - Possibly undergoing heal

Status: Connected
Number of entries: 2

wait for a while …..

gluster volume heal export info
Brick sn-0.local:/mnt/bricks/export/brick
Status: Connected
Number of entries: 0

Brick sn-1.local:/mnt/bricks/export/brick
/testdir - Possibly undergoing heal

/testdir - Possibly undergoing heal
and finally:

[root@sn-0:/root]
# gluster v heal export info
Brick sn-0.local:/mnt/bricks/export/brick
Status: Connected
Number of entries: 0

Brick sn-1.local:/mnt/bricks/export/brick
/testdir - Is in split-brain

Status: Connected
Number of entries: 1



[root@sn-0:/root]

# getfattr -m .* -d -e  hex /mnt/bricks/export/brick/testdir

getfattr: Removing leading '/' from absolute path names

# file: mnt/bricks/export/brick/testdir

trusted.gfid=0x5622cff893b3484dbdb6a20a0edb0e77

trusted.glusterfs.dht=0x0001



[root@sn-1:/root]

# getfattr -m .* -d -e  hex /mnt/bricks/export/brick/testdir

getfattr: Removing leading '/' from absolute path names

# file: mnt/bricks/export/brick/testdir

trusted.afr.dirty=0x0001

trusted.afr.export-client-0=0x0038

trusted.gfid=0x5622cff893b3484dbdb6a20a0edb0e77

trusted.glusterfs.dht=0x0001


Re: [Gluster-devel] query about why glustershd can not afr_selfheal_recreate_entry because of "afr: Prevent null gfids in self-heal entry re-creation"

2018-01-16 Thread Ravishankar N



On 01/16/2018 02:22 PM, Lian, George (NSB - CN/Hangzhou) wrote:


Hi,

Thanks a lot for your update.

I would like to give more detail about where the issue came from.

This issue came from a test case in our team; the steps are as follows:


1) Set up a glusterfs env with replica 2, two storage server nodes and 2
client nodes.


2) Generate a split-brain file: sn-0 is normal, sn-1 is dirty.

Hi, sorry, I did not understand the test case. What type of split-brain did
you create (data/metadata, gfid, or file-type mismatch)?


3) Delete the directory before heal begins (in this phase, the normal
correct file on sn-0 is deleted by the "rm" command; the dirty file is still
there).



Delete from the backend brick directly?


4) After that, the self-heal process always fails, with the log that was
attached in the last mail.


Maybe you can write a script or a .t file (like the ones in 
https://github.com/gluster/glusterfs/tree/master/tests/basic/afr) so 
that your test can be understood unambiguously.



Some command output is also attached FYI.

From my understanding, glusterfs maybe cannot handle the split-brain file in
this case. Could you share your comments and confirm whether some enhancement
should be done for this case or not?


If you create a split-brain in gluster, self-heal cannot heal it. You 
need to resolve it using one of the methods listed in 
https://docs.gluster.org/en/latest/Troubleshooting/resolving-splitbrain/#heal-info-and-split-brain-resolution


Thanks,
Ravi

rm -rf /mnt/export/testdir
rm: cannot remove '/mnt/export/testdir/test file': No data available

[root@sn-1:/root]
# ls -l /mnt/export/testdir/
ls: cannot access '/mnt/export/testdir/IORFILE_82_2': No data available
total 0
-? ? ? ? ?    ? test_file

[root@sn-1:/root]
# getfattr -m . -d -e hex /mnt/bricks/export/brick/testdir/
getfattr: Removing leading '/' from absolute path names
# file: mnt/bricks/export/brick/testdir/
trusted.afr.dirty=0x0001
trusted.afr.export-client-0=0x0054
trusted.gfid=0xb217d6af49024f189a69e0ccf5207572
trusted.glusterfs.dht=0x0001

[root@sn-0:/var/log/glusterfs]
#  getfattr -m . -d -e hex /mnt/bricks/export/brick/testdir/
getfattr: Removing leading '/' from absolute path names
# file: mnt/bricks/export/brick/testdir/
trusted.gfid=0xb217d6af49024f189a69e0ccf5207572
trusted.glusterfs.dht=0x0001

Best Regards

George

From: gluster-devel-boun...@gluster.org 
[mailto:gluster-devel-boun...@gluster.org] On Behalf Of Ravishankar N

Sent: Tuesday, January 16, 2018 1:44 PM
To: Zhou, Cynthia (NSB - CN/Hangzhou) 
<cynthia.z...@nokia-sbell.com>; Gluster Devel <gluster-devel@gluster.org>
Subject: Re: [Gluster-devel] query about why glustershd can not 
afr_selfheal_recreate_entry because of "afr: Prevent null gfids in 
self-heal entry re-creation"


+ gluster-devel

On 01/15/2018 01:41 PM, Zhou, Cynthia (NSB - CN/Hangzhou) wrote:

Hi glusterfs expert,

    Good day,

    When I do some tests of glusterfs self-heal I find the
following prints showing that when the dir/file type gets an error it
cannot get self-healed.

Could you help to check whether this is expected behavior? I find that
the code change https://review.gluster.org/#/c/17981/ adds a
check for iatt->ia_type, so what if a file’s ia_type gets
corrupted? In that case it would not get self-healed?


Yes, without knowing the ia_type, afr_selfheal_recreate_entry() cannot 
decide which type of FOP to perform (mkdir/link/mknod) to create the 
appropriate file on the sink. You would need to find out why the 
source brick is not returning a valid ia_type, i.e. why 
replies[source].poststat is not valid.
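
To make the dependence on ia_type concrete, here is a simplified, 
self-contained sketch of the kind of type dispatch involved; the names 
below are hypothetical stand-ins, not the actual AFR syncop code:

/* Simplified stand-ins for illustration only; the real code dispatches to
 * sync operations (mkdir/symlink/mknod) on the sink brick. */
typedef enum { IA_INVAL = 0, IA_IFREG, IA_IFDIR, IA_IFLNK } ia_type_t;

static int recreate_dir (const char *name)     { (void) name; return 0; }  /* mkdir   */
static int recreate_symlink (const char *name) { (void) name; return 0; }  /* symlink */
static int recreate_node (const char *name)    { (void) name; return 0; }  /* mknod   */

static int
recreate_entry (ia_type_t type, const char *name)
{
        switch (type) {
        case IA_IFDIR:
                return recreate_dir (name);
        case IA_IFLNK:
                return recreate_symlink (name);
        case IA_INVAL:
                /* No valid type in the source reply: nothing sensible can
                 * be created, which is why the heal is refused. */
                return -1;
        default:
                return recreate_node (name);
        }
}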

Thanks,
Ravi


Thanks!

//heal info output

[root@sn-0:/home/robot]

# gluster v heal export info

Brick sn-0.local:/mnt/bricks/export/brick

Status: Connected

Number of entries: 0

Brick sn-1.local:/mnt/bricks/export/brick

/testdir - Is in split-brain

Status: Connected

Number of entries: 1

//sn-1 glustershd
log///

[2018-01-15 03:53:40.011422] I [MSGID: 108026]
[afr-self-heal-entry.c:887:afr_selfheal_entry_do]
0-export-replicate-0: performing entry selfheal on
b217d6af-4902-4f18-9a69-e0ccf5207572

[2018-01-15 03:53:40.013994] W [MSGID: 114031]
[client-rpc-fops.c:2860:client3_3_lookup_cbk] 0-export-client-1:
remote operation failed. Path: (null)
(----) [No data available]

[2018-01-15 03:53:40.014025] E [MSGID: 108037]
[afr-self-heal-entry.c:92:afr_selfheal_recreate_entry]
0-export-replic

Re: [Gluster-devel] query about why glustershd can not afr_selfheal_recreate_entry because of "afr: Prevent null gfids in self-heal entry re-creation"

2018-01-15 Thread Ravishankar N

+ gluster-devel


On 01/15/2018 01:41 PM, Zhou, Cynthia (NSB - CN/Hangzhou) wrote:

Hi glusterfs expert,
    Good day,
    When I do some tests of glusterfs self-heal I find the following 
prints showing that when the dir/file type gets an error it cannot get self-healed.
Could you help to check whether this is expected behavior? I find that 
the code change https://review.gluster.org/#/c/17981/ adds a 
check for iatt->ia_type, so what if a file’s ia_type gets 
corrupted? In that case it would not get self-healed?


Yes, without knowing the ia_type, afr_selfheal_recreate_entry() cannot 
decide which type of FOP to perform (mkdir/link/mknod) to create the 
appropriate file on the sink. You would need to find out why the source 
brick is not returning a valid ia_type, i.e. why replies[source].poststat 
is not valid.

Thanks,
Ravi


Thanks!
//heal info output
[root@sn-0:/home/robot]
# gluster v heal export info
Brick sn-0.local:/mnt/bricks/export/brick
Status: Connected
Number of entries: 0
Brick sn-1.local:/mnt/bricks/export/brick
/testdir - Is in split-brain
Status: Connected
Number of entries: 1
//sn-1 glustershd 
log///
[2018-01-15 03:53:40.011422] I [MSGID: 108026] 
[afr-self-heal-entry.c:887:afr_selfheal_entry_do] 
0-export-replicate-0: performing entry selfheal on 
b217d6af-4902-4f18-9a69-e0ccf5207572
[2018-01-15 03:53:40.013994] W [MSGID: 114031] 
[client-rpc-fops.c:2860:client3_3_lookup_cbk] 0-export-client-1: 
remote operation failed. Path: (null) 
(----) [No data available]
[2018-01-15 03:53:40.014025] E [MSGID: 108037] 
[afr-self-heal-entry.c:92:afr_selfheal_recreate_entry] 
0-export-replicate-0: Invalid ia_type (0) or 
gfid(----). source brick=1, 
pargfid=----, name=IORFILE_82_2
//gdb attached to sn-1 
glustershd/

root@sn-1:/var/log/glusterfs]
# gdb attach 2191
GNU gdb (GDB) 8.0.1
Copyright (C) 2017 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later 
<_http://gnu.org/licenses/gpl.html_>

This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.  Type "show copying"
and "show warranty" for details.
This GDB was configured as "x86_64-linux-gnu".
Type "show configuration" for configuration details.
For bug reporting instructions, please see:
<_http://www.gnu.org/software/gdb/bugs/_>.
Find the GDB manual and other documentation resources online at:
<_http://www.gnu.org/software/gdb/documentation/_>.
For help, type "help".
Type "apropos word" to search for commands related to "word"...
attach: No such file or directory.
Attaching to process 2191
[New LWP 2192]
[New LWP 2193]
[New LWP 2194]
[New LWP 2195]
[New LWP 2196]
[New LWP 2197]
[New LWP 2239]
[New LWP 2241]
[New LWP 2243]
[New LWP 2245]
[New LWP 2247]
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib64/libthread_db.so.1".
0x7f90aca037bd in __pthread_join (threadid=140259279345408, 
thread_return=0x0) at pthread_join.c:90

90 pthread_join.c: No such file or directory.
(gdb) break afr_selfheal_recreate_entry
Breakpoint 1 at 0x7f90a3b56dec: file afr-self-heal-entry.c, line 73.
(gdb) c
Continuing.
[Switching to Thread 0x7f90a1b8e700 (LWP 2241)]
Thread 9 "glustershdheal" hit Breakpoint 1, 
afr_selfheal_recreate_entry (frame=0x7f90980018d0, dst=0, source=1, 
sources=0x7f90a1b8ceb0 "", dir=0x7f9098011940, name=0x7f909c015d48 
"IORFILE_82_2",

inode=0x7f9098001bd0, replies=0x7f90a1b8c890) at afr-self-heal-entry.c:73
73 afr-self-heal-entry.c: No such file or directory.
(gdb) n
74  in afr-self-heal-entry.c
(gdb) n
75  in afr-self-heal-entry.c
(gdb) n
76  in afr-self-heal-entry.c
(gdb) n
77  in afr-self-heal-entry.c
(gdb) n
78  in afr-self-heal-entry.c
(gdb) n
79  in afr-self-heal-entry.c
(gdb) n
80  in afr-self-heal-entry.c
(gdb) n
81  in afr-self-heal-entry.c
(gdb) n
82  in afr-self-heal-entry.c
(gdb) n
83  in afr-self-heal-entry.c
(gdb) n
85  in afr-self-heal-entry.c
(gdb) n
86  in afr-self-heal-entry.c
(gdb) n
87  in afr-self-heal-entry.c
(gdb) print iatt->ia_type
$1 = IA_INVAL
(gdb) print gf_uuid_is_null(iatt->ia_gfid)
$2 = 1
(gdb) bt
#0 afr_selfheal_recreate_entry (frame=0x7f90980018d0, dst=0, source=1, 
sources=0x7f90a1b8ceb0 "", dir=0x7f9098011940, name=0x7f909c015d48 
"IORFILE_82_2", inode=0x7f9098001bd0, replies=0x7f90a1b8c890)

    at afr-self-heal-entry.c:87
#1 0x7f90a3b57d20 in __afr_selfheal_merge_dirent 
(frame=0x7f90980018d0, this=0x7f90a4024610, fd=0x7f9098413090, 
name=0x7f909c015d48 "IORFILE_82_2", inode=0x7f9098001bd0,
sources=0x7f90a1b8ceb0 "", healed_sinks=0x7f90a1b8ce70 
"\001\001A\230\220\177", locked_on=0x7f90a1b8ce50 
"\001\001\270\241\220\177", 

Re: [Gluster-devel] Query specific to getting crash

2017-10-09 Thread Mohit Agrawal
Hi Niels,

   Thanks for your response. I will file a bug and update the same backtrace in
the bug as well.
   I don't know about a reproducer; I saw the crash only once.
   Please let us know if anyone has an objection to merging this patch.

Thanks
Mohit Agrawal
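
As background for the crash report quoted below, this is a minimal,
hypothetical illustration of the race it describes (one thread invalidating
a cache entry while another still dereferences the pointer it got from a
lookup); it is not the actual auth-cache code:

#include <pthread.h>
#include <stdlib.h>

struct entry { int rw; };

static struct entry    *cached;   /* shared cache slot */
static pthread_mutex_t  lock = PTHREAD_MUTEX_INITIALIZER;

static struct entry *
cache_lookup (void)
{
        struct entry *e;

        pthread_mutex_lock (&lock);
        e = cached;                /* pointer escapes the critical section... */
        pthread_mutex_unlock (&lock);
        return e;                  /* ...without taking a reference */
}

static void
cache_invalidate (void)
{
        pthread_mutex_lock (&lock);
        free (cached);             /* "thread 10": the entry is freed */
        cached = NULL;
        pthread_mutex_unlock (&lock);
}

/* "thread 1": by the time e->rw is read, the entry may already be freed:
 *
 *     struct entry *e = cache_lookup ();
 *     if (e)
 *             can_write = e->rw;     <- potential use-after-free
 */

Either reference-counting the entry or copying the needed fields out while
still holding the lock closes that window.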

On Mon, Oct 9, 2017 at 4:16 PM, Niels de Vos  wrote:

> On Mon, Oct 09, 2017 at 02:07:23PM +0530, Mohit Agrawal wrote:
> > +
> >
> > On Mon, Oct 9, 2017 at 11:33 AM, Mohit Agrawal 
> wrote:
> >
> > >
> > > On Mon, Oct 9, 2017 at 11:16 AM, Mohit Agrawal 
> > > wrote:
> > >
> > >> Hi All,
> > >>
> > >>
> > >> For specific to this patch(https://review.gluster.org/#/c/18436/) i
> am
> > >> getting crash in nfs(only once) for the
> > >> test case (./tests/basic/mount-nfs-auth.t), although i tried to
> execute
> > >> the same test case in a loop on centos
> > >> machine but i have not found any crash.
> > >>
> > >> After analyzing the crash it seems the cache entry is invalidated in thread
> 10
> > >> and the same entry is being accessed
> > >> in thread 1.
> > >>
> > >> >>>.
> > >>
> > >> (gdb) thread 1
> > >> [Switching to thread 1 (Thread 0x7fe852cfe700 (LWP 19073))]#0
> > >>  0x7fe859665c85 in auth_cache_lookup (
> > >> cache=0x7fe854027db0, fh=0x7fe84466684c, host_addr=0x7fe844565e40
> > >> "23.253.175.80",
> > >> timestamp=0x7fe852cfb1e0, can_write=0x7fe852cfb1dc)
> > >> at /home/jenkins/root/workspace/my_glusterfs_build/glusterfs-4.
> > >> 0dev/xlators/nfs/server/src/auth-cache.c:295
> > >> 295*can_write = lookup_res->item->opts->rw;
> > >> (gdb) bt
> > >> #0  0x7fe859665c85 in auth_cache_lookup (cache=0x7fe854027db0,
> > >> fh=0x7fe84466684c,
> > >> host_addr=0x7fe844565e40 "23.253.175.80",
> timestamp=0x7fe852cfb1e0,
> > >> can_write=0x7fe852cfb1dc)
> > >> at /home/jenkins/root/workspace/my_glusterfs_build/glusterfs-4.
> > >> 0dev/xlators/nfs/server/src/auth-cache.c:295
> > >> #1  0x7fe859665ebc in is_nfs_fh_cached (cache=0x7fe854027db0,
> > >> fh=0x7fe84466684c,
> > >> host_addr=0x7fe844565e40 "23.253.175.80")
> > >> at /home/jenkins/root/workspace/my_glusterfs_build/glusterfs-4.
> > >> 0dev/xlators/nfs/server/src/auth-cache.c:390
> > >> #2  0x7fe85962b82c in mnt3_check_cached_fh (ms=0x7fe854023d60,
> > >> fh=0x7fe84466684c,
> > >> host_addr=0x7fe844565e40 "23.253.175.80", is_write_op=_gf_false)
> > >> at /home/jenkins/root/workspace/my_glusterfs_build/glusterfs-4.
> > >> 0dev/xlators/nfs/server/src/mount3.c:1954
> > >> #3  0x7fe85962ba92 in _mnt3_authenticate_req (ms=0x7fe854023d60,
> > >> req=0x7fe844679148,
> > >> fh=0x7fe84466684c, path=0x0, authorized_export=0x0,
> > >> authorized_host=0x0, is_write_op=_gf_false)
> > >> at /home/jenkins/root/workspace/my_glusterfs_build/glusterfs-4.
> > >> 0dev/xlators/nfs/server/src/mount3.c:2011
> > >> #4  0x7fe85962bf65 in mnt3_authenticate_request
> (ms=0x7fe854023d60,
> > >> req=0x7fe844679148,
> > >> fh=0x7fe84466684c, volname=0x0, path=0x0, authorized_path=0x0,
> > >> authorized_host=0x0,
> > >> is_write_op=_gf_false)
> > >> at /home/jenkins/root/workspace/my_glusterfs_build/glusterfs-4.
> > >> 0dev/xlators/nfs/server/src/mount3.c:2130
> > >> #5  0x7fe859652370 in nfs3_fh_auth_nfsop (cs=0x7fe8446663c8,
> > >> is_write_op=_gf_false)
> > >> at /home/jenkins/root/workspace/my_glusterfs_build/glusterfs-4.
> > >> 0dev/xlators/nfs/server/src/nfs3-helpers.c:3981
> > >> #6  0x7fe85963631a in nfs3_lookup_resume (carg=0x7fe8446663c8)
> > >> at /home/jenkins/root/workspace/my_glusterfs_build/glusterfs-4.
> > >> 0dev/xlators/nfs/server/src/nfs3.c:155---Type  to continue,
> or q
> > >>  to quit---
> > >> 9
> > >> #7  0x7fe859651b98 in nfs3_fh_resolve_entry_hard
> (cs=0x7fe8446663c8)
> > >> at /home/jenkins/root/workspace/my_glusterfs_build/glusterfs-4.
> > >> 0dev/xlators/nfs/server/src/nfs3-helpers.c:3791
> > >> #8  0x7fe859651e35 in nfs3_fh_resolve_entry (cs=0x7fe8446663c8)
> > >> at /home/jenkins/root/workspace/my_glusterfs_build/glusterfs-4.
> > >> 0dev/xlators/nfs/server/src/nfs3-helpers.c:3844
> > >> #9  0x7fe859651e94 in nfs3_fh_resolve_resume (cs=0x7fe8446663c8)
> > >> at /home/jenkins/root/workspace/my_glusterfs_build/glusterfs-4.
> > >> 0dev/xlators/nfs/server/src/nfs3-helpers.c:3862
> > >> #10 0x7fe8596520ad in nfs3_fh_resolve_root (cs=0x7fe8446663c8)
> > >> at /home/jenkins/root/workspace/my_glusterfs_build/glusterfs-4.
> > >> 0dev/xlators/nfs/server/src/nfs3-helpers.c:3915
> > >> #11 0x7fe85965245f in nfs3_fh_resolve_and_resume
> (cs=0x7fe8446663c8,
> > >> fh=0x7fe852cfc980,
> > >> entry=0x7fe852cfc9c0 "test-bg-write", resum_fn=0x7fe85963621d
> > >> )
> > >> at /home/jenkins/root/workspace/my_glusterfs_build/glusterfs-4.
> > >> 0dev/xlators/nfs/server/src/nfs3-helpers.c:4011
> > >> #12 0x7fe859636dcf in nfs3_lookup (req=0x7fe844679148,
> > >> fh=0x7fe852cfc980, fhlen=52,
> > >>

Re: [Gluster-devel] Query specific to getting crash

2017-10-09 Thread Niels de Vos
On Mon, Oct 09, 2017 at 02:07:23PM +0530, Mohit Agrawal wrote:
> +
> 
> On Mon, Oct 9, 2017 at 11:33 AM, Mohit Agrawal  wrote:
> 
> >
> > On Mon, Oct 9, 2017 at 11:16 AM, Mohit Agrawal 
> > wrote:
> >
> >> Hi All,
> >>
> >>
> >> For specific to this patch(https://review.gluster.org/#/c/18436/) i am
> >> getting crash in nfs(only once) for the
> >> test case (./tests/basic/mount-nfs-auth.t), although i tried to execute
> >> the same test case in a loop on centos
> >> machine but i have not found any crash.
> >>
> >> After analyzing the crash it seems the cache entry is invalidated in thread 10
> >> and the same entry is being accessed
> >> in thread 1.
> >>
> >> >>>.
> >>
> >> (gdb) thread 1
> >> [Switching to thread 1 (Thread 0x7fe852cfe700 (LWP 19073))]#0
> >>  0x7fe859665c85 in auth_cache_lookup (
> >> cache=0x7fe854027db0, fh=0x7fe84466684c, host_addr=0x7fe844565e40
> >> "23.253.175.80",
> >> timestamp=0x7fe852cfb1e0, can_write=0x7fe852cfb1dc)
> >> at /home/jenkins/root/workspace/my_glusterfs_build/glusterfs-4.
> >> 0dev/xlators/nfs/server/src/auth-cache.c:295
> >> 295*can_write = lookup_res->item->opts->rw;
> >> (gdb) bt
> >> #0  0x7fe859665c85 in auth_cache_lookup (cache=0x7fe854027db0,
> >> fh=0x7fe84466684c,
> >> host_addr=0x7fe844565e40 "23.253.175.80", timestamp=0x7fe852cfb1e0,
> >> can_write=0x7fe852cfb1dc)
> >> at /home/jenkins/root/workspace/my_glusterfs_build/glusterfs-4.
> >> 0dev/xlators/nfs/server/src/auth-cache.c:295
> >> #1  0x7fe859665ebc in is_nfs_fh_cached (cache=0x7fe854027db0,
> >> fh=0x7fe84466684c,
> >> host_addr=0x7fe844565e40 "23.253.175.80")
> >> at /home/jenkins/root/workspace/my_glusterfs_build/glusterfs-4.
> >> 0dev/xlators/nfs/server/src/auth-cache.c:390
> >> #2  0x7fe85962b82c in mnt3_check_cached_fh (ms=0x7fe854023d60,
> >> fh=0x7fe84466684c,
> >> host_addr=0x7fe844565e40 "23.253.175.80", is_write_op=_gf_false)
> >> at /home/jenkins/root/workspace/my_glusterfs_build/glusterfs-4.
> >> 0dev/xlators/nfs/server/src/mount3.c:1954
> >> #3  0x7fe85962ba92 in _mnt3_authenticate_req (ms=0x7fe854023d60,
> >> req=0x7fe844679148,
> >> fh=0x7fe84466684c, path=0x0, authorized_export=0x0,
> >> authorized_host=0x0, is_write_op=_gf_false)
> >> at /home/jenkins/root/workspace/my_glusterfs_build/glusterfs-4.
> >> 0dev/xlators/nfs/server/src/mount3.c:2011
> >> #4  0x7fe85962bf65 in mnt3_authenticate_request (ms=0x7fe854023d60,
> >> req=0x7fe844679148,
> >> fh=0x7fe84466684c, volname=0x0, path=0x0, authorized_path=0x0,
> >> authorized_host=0x0,
> >> is_write_op=_gf_false)
> >> at /home/jenkins/root/workspace/my_glusterfs_build/glusterfs-4.
> >> 0dev/xlators/nfs/server/src/mount3.c:2130
> >> #5  0x7fe859652370 in nfs3_fh_auth_nfsop (cs=0x7fe8446663c8,
> >> is_write_op=_gf_false)
> >> at /home/jenkins/root/workspace/my_glusterfs_build/glusterfs-4.
> >> 0dev/xlators/nfs/server/src/nfs3-helpers.c:3981
> >> #6  0x7fe85963631a in nfs3_lookup_resume (carg=0x7fe8446663c8)
> >> at /home/jenkins/root/workspace/my_glusterfs_build/glusterfs-4.
> >> 0dev/xlators/nfs/server/src/nfs3.c:155---Type  to continue, or q
> >>  to quit---
> >> 9
> >> #7  0x7fe859651b98 in nfs3_fh_resolve_entry_hard (cs=0x7fe8446663c8)
> >> at /home/jenkins/root/workspace/my_glusterfs_build/glusterfs-4.
> >> 0dev/xlators/nfs/server/src/nfs3-helpers.c:3791
> >> #8  0x7fe859651e35 in nfs3_fh_resolve_entry (cs=0x7fe8446663c8)
> >> at /home/jenkins/root/workspace/my_glusterfs_build/glusterfs-4.
> >> 0dev/xlators/nfs/server/src/nfs3-helpers.c:3844
> >> #9  0x7fe859651e94 in nfs3_fh_resolve_resume (cs=0x7fe8446663c8)
> >> at /home/jenkins/root/workspace/my_glusterfs_build/glusterfs-4.
> >> 0dev/xlators/nfs/server/src/nfs3-helpers.c:3862
> >> #10 0x7fe8596520ad in nfs3_fh_resolve_root (cs=0x7fe8446663c8)
> >> at /home/jenkins/root/workspace/my_glusterfs_build/glusterfs-4.
> >> 0dev/xlators/nfs/server/src/nfs3-helpers.c:3915
> >> #11 0x7fe85965245f in nfs3_fh_resolve_and_resume (cs=0x7fe8446663c8,
> >> fh=0x7fe852cfc980,
> >> entry=0x7fe852cfc9c0 "test-bg-write", resum_fn=0x7fe85963621d
> >> )
> >> at /home/jenkins/root/workspace/my_glusterfs_build/glusterfs-4.
> >> 0dev/xlators/nfs/server/src/nfs3-helpers.c:4011
> >> #12 0x7fe859636dcf in nfs3_lookup (req=0x7fe844679148,
> >> fh=0x7fe852cfc980, fhlen=52,
> >> name=0x7fe852cfc9c0 "test-bg-write")
> >> at /home/jenkins/root/workspace/my_glusterfs_build/glusterfs-4.
> >> 0dev/xlators/nfs/server/src/nfs3.c:1620
> >> #13 0x7fe85963703f in nfs3svc_lookup (req=0x7fe844679148)
> >> at /home/jenkins/root/workspace/my_glusterfs_build/glusterfs-4.
> >> 0dev/xlators/nfs/server/src/nfs3.c:1666
> >> #14 0x7fe86765f585 in rpcsvc_handle_rpc_call (svc=0x7fe854022a00,
> >> trans=0x7fe8545c1fa0,
> >> msg=0x7fe844334610)
> >> ---Type  to continue, or q  to quit---
> >> at 

Re: [Gluster-devel] Query specific to getting crash

2017-10-09 Thread Mohit Agrawal
+

On Mon, Oct 9, 2017 at 11:33 AM, Mohit Agrawal  wrote:

>
> On Mon, Oct 9, 2017 at 11:16 AM, Mohit Agrawal 
> wrote:
>
>> Hi All,
>>
>>
>> For specific to this patch(https://review.gluster.org/#/c/18436/) i am
>> getting crash in nfs(only once) for the
>> test case (./tests/basic/mount-nfs-auth.t), although i tried to execute
>> the same test case in a loop on centos
>> machine but i have not found any crash.
>>
>> After analyzing the crash it seems the cache entry is invalidated in thread 10
>> and the same entry is being accessed
>> in thread 1.
>>
>> >>>.
>>
>> (gdb) thread 1
>> [Switching to thread 1 (Thread 0x7fe852cfe700 (LWP 19073))]#0
>>  0x7fe859665c85 in auth_cache_lookup (
>> cache=0x7fe854027db0, fh=0x7fe84466684c, host_addr=0x7fe844565e40
>> "23.253.175.80",
>> timestamp=0x7fe852cfb1e0, can_write=0x7fe852cfb1dc)
>> at /home/jenkins/root/workspace/my_glusterfs_build/glusterfs-4.
>> 0dev/xlators/nfs/server/src/auth-cache.c:295
>> 295*can_write = lookup_res->item->opts->rw;
>> (gdb) bt
>> #0  0x7fe859665c85 in auth_cache_lookup (cache=0x7fe854027db0,
>> fh=0x7fe84466684c,
>> host_addr=0x7fe844565e40 "23.253.175.80", timestamp=0x7fe852cfb1e0,
>> can_write=0x7fe852cfb1dc)
>> at /home/jenkins/root/workspace/my_glusterfs_build/glusterfs-4.
>> 0dev/xlators/nfs/server/src/auth-cache.c:295
>> #1  0x7fe859665ebc in is_nfs_fh_cached (cache=0x7fe854027db0,
>> fh=0x7fe84466684c,
>> host_addr=0x7fe844565e40 "23.253.175.80")
>> at /home/jenkins/root/workspace/my_glusterfs_build/glusterfs-4.
>> 0dev/xlators/nfs/server/src/auth-cache.c:390
>> #2  0x7fe85962b82c in mnt3_check_cached_fh (ms=0x7fe854023d60,
>> fh=0x7fe84466684c,
>> host_addr=0x7fe844565e40 "23.253.175.80", is_write_op=_gf_false)
>> at /home/jenkins/root/workspace/my_glusterfs_build/glusterfs-4.
>> 0dev/xlators/nfs/server/src/mount3.c:1954
>> #3  0x7fe85962ba92 in _mnt3_authenticate_req (ms=0x7fe854023d60,
>> req=0x7fe844679148,
>> fh=0x7fe84466684c, path=0x0, authorized_export=0x0,
>> authorized_host=0x0, is_write_op=_gf_false)
>> at /home/jenkins/root/workspace/my_glusterfs_build/glusterfs-4.
>> 0dev/xlators/nfs/server/src/mount3.c:2011
>> #4  0x7fe85962bf65 in mnt3_authenticate_request (ms=0x7fe854023d60,
>> req=0x7fe844679148,
>> fh=0x7fe84466684c, volname=0x0, path=0x0, authorized_path=0x0,
>> authorized_host=0x0,
>> is_write_op=_gf_false)
>> at /home/jenkins/root/workspace/my_glusterfs_build/glusterfs-4.
>> 0dev/xlators/nfs/server/src/mount3.c:2130
>> #5  0x7fe859652370 in nfs3_fh_auth_nfsop (cs=0x7fe8446663c8,
>> is_write_op=_gf_false)
>> at /home/jenkins/root/workspace/my_glusterfs_build/glusterfs-4.
>> 0dev/xlators/nfs/server/src/nfs3-helpers.c:3981
>> #6  0x7fe85963631a in nfs3_lookup_resume (carg=0x7fe8446663c8)
>> at /home/jenkins/root/workspace/my_glusterfs_build/glusterfs-4.
>> 0dev/xlators/nfs/server/src/nfs3.c:155---Type  to continue, or q
>>  to quit---
>> 9
>> #7  0x7fe859651b98 in nfs3_fh_resolve_entry_hard (cs=0x7fe8446663c8)
>> at /home/jenkins/root/workspace/my_glusterfs_build/glusterfs-4.
>> 0dev/xlators/nfs/server/src/nfs3-helpers.c:3791
>> #8  0x7fe859651e35 in nfs3_fh_resolve_entry (cs=0x7fe8446663c8)
>> at /home/jenkins/root/workspace/my_glusterfs_build/glusterfs-4.
>> 0dev/xlators/nfs/server/src/nfs3-helpers.c:3844
>> #9  0x7fe859651e94 in nfs3_fh_resolve_resume (cs=0x7fe8446663c8)
>> at /home/jenkins/root/workspace/my_glusterfs_build/glusterfs-4.
>> 0dev/xlators/nfs/server/src/nfs3-helpers.c:3862
>> #10 0x7fe8596520ad in nfs3_fh_resolve_root (cs=0x7fe8446663c8)
>> at /home/jenkins/root/workspace/my_glusterfs_build/glusterfs-4.
>> 0dev/xlators/nfs/server/src/nfs3-helpers.c:3915
>> #11 0x7fe85965245f in nfs3_fh_resolve_and_resume (cs=0x7fe8446663c8,
>> fh=0x7fe852cfc980,
>> entry=0x7fe852cfc9c0 "test-bg-write", resum_fn=0x7fe85963621d
>> )
>> at /home/jenkins/root/workspace/my_glusterfs_build/glusterfs-4.
>> 0dev/xlators/nfs/server/src/nfs3-helpers.c:4011
>> #12 0x7fe859636dcf in nfs3_lookup (req=0x7fe844679148,
>> fh=0x7fe852cfc980, fhlen=52,
>> name=0x7fe852cfc9c0 "test-bg-write")
>> at /home/jenkins/root/workspace/my_glusterfs_build/glusterfs-4.
>> 0dev/xlators/nfs/server/src/nfs3.c:1620
>> #13 0x7fe85963703f in nfs3svc_lookup (req=0x7fe844679148)
>> at /home/jenkins/root/workspace/my_glusterfs_build/glusterfs-4.
>> 0dev/xlators/nfs/server/src/nfs3.c:1666
>> #14 0x7fe86765f585 in rpcsvc_handle_rpc_call (svc=0x7fe854022a00,
>> trans=0x7fe8545c1fa0,
>> msg=0x7fe844334610)
>> ---Type  to continue, or q  to quit---
>> at /home/jenkins/root/workspace/my_glusterfs_build/glusterfs-4.
>> 0dev/rpc/rpc-lib/src/rpcsvc.c:711
>> #15 0x7fe86765f8f8 in rpcsvc_notify (trans=0x7fe8545c1fa0,
>> mydata=0x7fe854022a00,
>> event=RPC_TRANSPORT_MSG_RECEIVED, data=0x7fe844334610)
>> at 

[Gluster-devel] Query regards to take decision about hashed_subvol to heal user xattr

2017-04-04 Thread Mohit Agrawal
Hi All,

 I have used the approach below to heal custom xattrs (user, acl, quota) in patch (
https://review.gluster.org/#/c/15468/). It needs to know the hashed_subvol because
the fop (setxattr) is first wound on the hashed_subvol and only then is the same
update attempted on the other subvols.

 In dht_revalidate/lookup_cbk I save the hashed subvolume in the inode_ctx, and
at setxattr time (dht_setxattr) I call the function
dht_inode_ctx_hashvol_get to retrieve the saved hashed_subvol.

1) First it checks the stored hashed_subvol on the inode (by calling the
function dht_inode_ctx_hashvol_get) and saves the status in the ret variable.

2) If ret is 0 (meaning the hashed_subvol exists in the inode), it checks the
status of the hashed_subvol:
   if it is up, it sets that index into the fop_wind variable and breaks out
of the loop;
   if it is down, it sets ret to 1, then checks the next up subvol and sets
that index into fop_wind.

   If the hashed_subvol index is the last one in the array, it sets fop_wind
to the index of the previous up subvolume.

3) If ret is not 0 (meaning no hashed_subvol is stored in the inode), it sets
the index to the last up subvol and breaks out of the loop.

Below is the code that takes the decision about the hashed_subvolume in dht_setxattr

>>

        /* Fetch the hashed subvol saved on the inode ctx during lookup
         * (the second argument was mangled by the archive; it is the
         * address of hashed_subvol). */
        ret = dht_inode_ctx_hashvol_get (loc->inode, this,
                                         &hashed_subvol);
        for (i = 0; i < call_cnt; i++) {
                if (!ret && conf->subvolumes[i] == hashed_subvol) {
                        if (!conf->subvolume_status[i]) {
                                /* hashed subvol is known but down: warn and
                                 * fall back to another up subvol */
                                gf_msg (this->name, GF_LOG_WARNING, 0,
                                        DHT_MSG_HASHED_SUBVOL_DOWN,
                                        "hash subvolume %s is down "
                                        "for path %s",
                                        hashed_subvol->name, loc->path);
                                ret = 1;
                        } else {
                                /* hashed subvol is up: wind the fop here */
                                fop_wind = i;
                                break;
                        }
                } else {
                        /* remember the latest up subvol as a fallback */
                        if (conf->subvolume_status[i])
                                fop_wind = i;
                }
        }


>>>


Please share your input if you see any issue with this approach to deciding the
hashed_subvolume.
I appreciate your inputs.

Regards
Mohit Agrawal
___
Gluster-devel mailing list
Gluster-devel@gluster.org
http://lists.gluster.org/mailman/listinfo/gluster-devel

Re: [Gluster-devel] Query regards to heal xattr heal in dht

2016-09-16 Thread Mohit Agrawal
Hi,

I think we should divide the problem into two parts.

  1) The user extended attribute is not shown correctly by getxattr on the mount
point.
  2) Healing the user xattrs on bricks that were down at the time setxattr
was run.


To return the correct extended attribute on the mount point I think the quorum
approach is good. I think it is sufficient to consider nodes as the source if
more than half of the nodes have the same user xattr value.

How can we find the correct value?

1) For every volume we can calculate a hash value over its user xattr key/value
pairs and store the hash value in a dict with the volume instance as the key.
2) Find the volumes in the dict that have the same hash value and return that
xattr to the application.

The volumes that have the same hash value we can consider as the source and
the others as sinks. As of the latest patch (http://review.gluster.org/#/c/15468/) I
am not deleting any xattr; it replaces the existing user xattr on a volume
if it already exists, otherwise it creates the new xattr.
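
A rough sketch of the per-volume hashing step under these assumptions
(hypothetical helpers, not the patch's actual code) could look like this:

#include <stdint.h>
#include <string.h>

/* Fold all user xattr key/value pairs of one volume into a single 64-bit
 * FNV-1a hash, so that volumes with identical user xattrs produce identical
 * hashes and can be grouped to find the majority ("source") set. */
static uint64_t
fnv1a_update (uint64_t h, const void *data, size_t len)
{
        const unsigned char *p = data;
        size_t i;

        for (i = 0; i < len; i++) {
                h ^= p[i];
                h *= 1099511628211ULL;            /* FNV-1a prime */
        }
        return h;
}

static uint64_t
hash_user_xattrs (const char *keys[], const char *vals[], int count)
{
        uint64_t h = 14695981039346656037ULL;     /* FNV-1a offset basis */
        int i;

        for (i = 0; i < count; i++) {
                h = fnv1a_update (h, keys[i], strlen (keys[i]));
                h = fnv1a_update (h, vals[i], strlen (vals[i]));
        }
        return h;
}

One caveat of any such scheme is that the key/value pairs must be visited in a
stable order on every volume, otherwise identical xattr sets would hash to
different values.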

As for ACL/SELinux, because I am updating only user xattrs, those will
remain the same after the heal function has run.

Regards
Mohit Agrawal

On Fri, Sep 16, 2016 at 9:42 AM, Nithya Balachandran <nbala...@redhat.com>
wrote:

>
>
> On 15 September 2016 at 17:21, Raghavendra Gowdappa <rgowd...@redhat.com>
> wrote:
>
>>
>>
>> - Original Message -
>> > From: "Xavier Hernandez" <xhernan...@datalab.es>
>> > To: "Raghavendra G" <raghaven...@gluster.com>, "Nithya Balachandran" <
>> nbala...@redhat.com>
>> > Cc: "Gluster Devel" <gluster-devel@gluster.org>, "Mohit Agrawal" <
>> moagr...@redhat.com>
>> > Sent: Thursday, September 15, 2016 4:54:25 PM
>> > Subject: Re: [Gluster-devel] Query regards to heal xattr heal in dht
>> >
>> >
>> >
>> > On 15/09/16 11:31, Raghavendra G wrote:
>> > >
>> > >
>> > > On Thu, Sep 15, 2016 at 12:02 PM, Nithya Balachandran
>> > > <nbala...@redhat.com <mailto:nbala...@redhat.com>> wrote:
>> > >
>> > >
>> > >
>> > > On 8 September 2016 at 12:02, Mohit Agrawal <moagr...@redhat.com
>> > > <mailto:moagr...@redhat.com>> wrote:
>> > >
>> > > Hi All,
>> > >
>> > >I have one another solution to heal user xattr but before
>> > > implement it i would like to discuss with you.
>> > >
>> > >Can i call function (dht_dir_xattr_heal internally it is
>> > > calling syncop_setxattr) to heal xattr in dht_getxattr_cbk in
>> last
>> > >after make sure we have a valid xattr.
>> > >In function(dht_dir_xattr_heal) it will copy blindly all
>> user
>> > > xattr on all subvolume or i can compare subvol xattr with
>> valid
>> > > xattr if there is any mismatch then i will call
>> syncop_setxattr
>> > > otherwise no need to call. syncop_setxattr.
>> > >
>> > >
>> > >
>> > > This can be problematic if a particular xattr is being removed -
>> it
>> > > might still exist on some subvols. IIUC, the heal would go and
>> reset
>> > > it again?
>> > >
>> > > One option is to use the hash subvol for the dir as the source -
>> so
>> > > perform xattr op on hashed subvol first and on the others only if
>> it
>> > > succeeds on the hashed. This does have the problem of being unable
>> > > to set xattrs if the hashed subvol is unavailable. This might not
>> be
>> > > such a big deal in case of distributed replicate or distribute
>> > > disperse volumes but will affect pure distribute. However, this
>> way
>> > > we can at least be reasonably certain of the correctness (leaving
>> > > rebalance out of the picture).
>> > >
>> > >
>> > > * What is the behavior of getxattr when hashed subvol is down? Should
>> we
>> > > succeed with values from non-hashed subvols or should we fail
>> getxattr?
>> > > With hashed-subvol as source of truth, its difficult to determine
>> > > correctness of xattrs and their values when it is down.
>> > >
>> > > * setxattr is an inode operation (as opposed to entry operation). So,
>> we
>> > > cannot calculate hashed-subvol as in (get)(set)xattr, parent layout
>> and
>> > > "basename" is not ava

Re: [Gluster-devel] Query regards to heal xattr heal in dht

2016-09-15 Thread Nithya Balachandran
On 15 September 2016 at 17:21, Raghavendra Gowdappa <rgowd...@redhat.com>
wrote:

>
>
> - Original Message -
> > From: "Xavier Hernandez" <xhernan...@datalab.es>
> > To: "Raghavendra G" <raghaven...@gluster.com>, "Nithya Balachandran" <
> nbala...@redhat.com>
> > Cc: "Gluster Devel" <gluster-devel@gluster.org>, "Mohit Agrawal" <
> moagr...@redhat.com>
> > Sent: Thursday, September 15, 2016 4:54:25 PM
> > Subject: Re: [Gluster-devel] Query regards to heal xattr heal in dht
> >
> >
> >
> > On 15/09/16 11:31, Raghavendra G wrote:
> > >
> > >
> > > On Thu, Sep 15, 2016 at 12:02 PM, Nithya Balachandran
> > > <nbala...@redhat.com <mailto:nbala...@redhat.com>> wrote:
> > >
> > >
> > >
> > > On 8 September 2016 at 12:02, Mohit Agrawal <moagr...@redhat.com
> > > <mailto:moagr...@redhat.com>> wrote:
> > >
> > > Hi All,
> > >
> > >I have one another solution to heal user xattr but before
> > > implement it i would like to discuss with you.
> > >
> > >Can i call function (dht_dir_xattr_heal internally it is
> > > calling syncop_setxattr) to heal xattr in dht_getxattr_cbk in
> last
> > >after make sure we have a valid xattr.
> > >In function(dht_dir_xattr_heal) it will copy blindly all
> user
> > > xattr on all subvolume or i can compare subvol xattr with valid
> > > xattr if there is any mismatch then i will call syncop_setxattr
> > > otherwise no need to call. syncop_setxattr.
> > >
> > >
> > >
> > > This can be problematic if a particular xattr is being removed - it
> > > might still exist on some subvols. IIUC, the heal would go and
> reset
> > > it again?
> > >
> > > One option is to use the hash subvol for the dir as the source - so
> > > perform xattr op on hashed subvol first and on the others only if
> it
> > > succeeds on the hashed. This does have the problem of being unable
> > > to set xattrs if the hashed subvol is unavailable. This might not
> be
> > > such a big deal in case of distributed replicate or distribute
> > > disperse volumes but will affect pure distribute. However, this way
> > > we can at least be reasonably certain of the correctness (leaving
> > > rebalance out of the picture).
> > >
> > >
> > > * What is the behavior of getxattr when hashed subvol is down? Should
> we
> > > succeed with values from non-hashed subvols or should we fail getxattr?
> > > With hashed-subvol as source of truth, its difficult to determine
> > > correctness of xattrs and their values when it is down.
> > >
> > > * setxattr is an inode operation (as opposed to entry operation). So,
> we
> > > cannot calculate hashed-subvol as in (get)(set)xattr, parent layout and
> > > "basename" is not available. This forces us to store hashed subvol in
> > > inode-ctx. Now, when the hashed-subvol changes we need to update these
> > > inode-ctxs too.
> > >
> > > What do you think about a Quorum based solution to this problem?
> > >
> > > 1. setxattr succeeds only if it is successful on at least (n/2 + 1)
> > > number of subvols.
> > > 2. getxattr succeeds only if it is successful and values match on at
> > > least (n/2 + 1) number of subvols.
> > >
> > > The flip-side of this solution is we are increasing the probability of
> > > failure of (get)(set)xattr operations as opposed to the hashed-subvol
> as
> > > source of truth solution. Or are we - how do we compare probability of
> > > hashed-subvol going down with probability of (n/2 + 1) nodes going down
> > > simultaneously? Is it 1/n vs (1/n*1/n*... (n/2+1 times)?. Is 1/n
> correct
> > > probability for _a specific subvol (hashed-subvol)_ going down (as
> > > opposed to _any one subvol_ going down)?
> >
> > If we suppose p to be the probability of failure of a subvolume in a
> > period of time (a year for example), all subvolumes have the same
> > probability, and we have N subvolumes, then:
> >
> > Probability of failure of hashed-subvol: p
> > Probability of failure of N/2 + 1 or more subvols: 
>
> Thanks Xavi. That was quick :).
>
> >
> > Note that this probabili

Re: [Gluster-devel] Query regards to heal xattr heal in dht

2016-09-15 Thread Raghavendra Gowdappa


- Original Message -
> From: "Xavier Hernandez" <xhernan...@datalab.es>
> To: "Raghavendra G" <raghaven...@gluster.com>, "Nithya Balachandran" 
> <nbala...@redhat.com>
> Cc: "Gluster Devel" <gluster-devel@gluster.org>, "Mohit Agrawal" 
> <moagr...@redhat.com>
> Sent: Thursday, September 15, 2016 4:54:25 PM
> Subject: Re: [Gluster-devel] Query regards to heal xattr heal in dht
> 
> 
> 
> On 15/09/16 11:31, Raghavendra G wrote:
> >
> >
> > On Thu, Sep 15, 2016 at 12:02 PM, Nithya Balachandran
> > <nbala...@redhat.com <mailto:nbala...@redhat.com>> wrote:
> >
> >
> >
> > On 8 September 2016 at 12:02, Mohit Agrawal <moagr...@redhat.com
> > <mailto:moagr...@redhat.com>> wrote:
> >
> > Hi All,
> >
> >I have one another solution to heal user xattr but before
> > implement it i would like to discuss with you.
> >
> >Can i call function (dht_dir_xattr_heal internally it is
> > calling syncop_setxattr) to heal xattr in dht_getxattr_cbk in last
> >after make sure we have a valid xattr.
> >In function(dht_dir_xattr_heal) it will copy blindly all user
> > xattr on all subvolume or i can compare subvol xattr with valid
> > xattr if there is any mismatch then i will call syncop_setxattr
> > otherwise no need to call. syncop_setxattr.
> >
> >
> >
> > This can be problematic if a particular xattr is being removed - it
> > might still exist on some subvols. IIUC, the heal would go and reset
> > it again?
> >
> > One option is to use the hash subvol for the dir as the source - so
> > perform xattr op on hashed subvol first and on the others only if it
> > succeeds on the hashed. This does have the problem of being unable
> > to set xattrs if the hashed subvol is unavailable. This might not be
> > such a big deal in case of distributed replicate or distribute
> > disperse volumes but will affect pure distribute. However, this way
> > we can at least be reasonably certain of the correctness (leaving
> > rebalance out of the picture).
> >
> >
> > * What is the behavior of getxattr when hashed subvol is down? Should we
> > succeed with values from non-hashed subvols or should we fail getxattr?
> > With hashed-subvol as source of truth, its difficult to determine
> > correctness of xattrs and their values when it is down.
> >
> > * setxattr is an inode operation (as opposed to entry operation). So, we
> > cannot calculate hashed-subvol as in (get)(set)xattr, parent layout and
> > "basename" is not available. This forces us to store hashed subvol in
> > inode-ctx. Now, when the hashed-subvol changes we need to update these
> > inode-ctxs too.
> >
> > What do you think about a Quorum based solution to this problem?
> >
> > 1. setxattr succeeds only if it is successful on at least (n/2 + 1)
> > number of subvols.
> > 2. getxattr succeeds only if it is successful and values match on at
> > least (n/2 + 1) number of subvols.
> >
> > The flip-side of this solution is we are increasing the probability of
> > failure of (get)(set)xattr operations as opposed to the hashed-subvol as
> > source of truth solution. Or are we - how do we compare probability of
> > hashed-subvol going down with probability of (n/2 + 1) nodes going down
> > simultaneously? Is it 1/n vs (1/n*1/n*... (n/2+1 times)?. Is 1/n correct
> > probability for _a specific subvol (hashed-subvol)_ going down (as
> > opposed to _any one subvol_ going down)?
> 
> If we suppose p to be the probability of failure of a subvolume in a
> period of time (a year for example), all subvolumes have the same
> probability, and we have N subvolumes, then:
> 
> Probability of failure of hashed-subvol: p
> Probability of failure of N/2 + 1 or more subvols: 

Thanks Xavi. That was quick :).

> 
> Note that this probability says how much probable is that N/2 + 1
> subvols or more fail in the specified period of time, but not
> necessarily simultaneously. If we suppose that subvolumes are recovered
> as fast as possible, the real probability of simultaneous failure will
> be much smaller.
> 
> In worst case (not recovering the failed subvolumes in the given period
> of time), if p < 0.5 or N = 2 (and p != 1), then it's always better to
> check N/2 + 1 subvolumes. Otherwise, it's better to check the hashed-subvol.
> 
> I think that p should always be much small

Re: [Gluster-devel] Query regards to heal xattr heal in dht

2016-09-15 Thread Xavier Hernandez



On 15/09/16 11:31, Raghavendra G wrote:



On Thu, Sep 15, 2016 at 12:02 PM, Nithya Balachandran
> wrote:



On 8 September 2016 at 12:02, Mohit Agrawal > wrote:

Hi All,

   I have one another solution to heal user xattr but before
implement it i would like to discuss with you.

   Can i call function (dht_dir_xattr_heal internally it is
calling syncop_setxattr) to heal xattr in dht_getxattr_cbk in last
   after make sure we have a valid xattr.
   In function(dht_dir_xattr_heal) it will copy blindly all user
xattr on all subvolume or i can compare subvol xattr with valid
xattr if there is any mismatch then i will call syncop_setxattr
otherwise no need to call. syncop_setxattr.



This can be problematic if a particular xattr is being removed - it
might still exist on some subvols. IIUC, the heal would go and reset
it again?

One option is to use the hash subvol for the dir as the source - so
perform xattr op on hashed subvol first and on the others only if it
succeeds on the hashed. This does have the problem of being unable
to set xattrs if the hashed subvol is unavailable. This might not be
such a big deal in case of distributed replicate or distribute
disperse volumes but will affect pure distribute. However, this way
we can at least be reasonably certain of the correctness (leaving
rebalance out of the picture).


* What is the behavior of getxattr when hashed subvol is down? Should we
succeed with values from non-hashed subvols or should we fail getxattr?
With hashed-subvol as source of truth, its difficult to determine
correctness of xattrs and their values when it is down.

* setxattr is an inode operation (as opposed to entry operation). So, we
cannot calculate hashed-subvol as in (get)(set)xattr, parent layout and
"basename" is not available. This forces us to store hashed subvol in
inode-ctx. Now, when the hashed-subvol changes we need to update these
inode-ctxs too.

What do you think about a Quorum based solution to this problem?

1. setxattr succeeds only if it is successful on at least (n/2 + 1)
number of subvols.
2. getxattr succeeds only if it is successful and values match on at
least (n/2 + 1) number of subvols.

The flip-side of this solution is we are increasing the probability of
failure of (get)(set)xattr operations as opposed to the hashed-subvol as
source of truth solution. Or are we - how do we compare probability of
hashed-subvol going down with probability of (n/2 + 1) nodes going down
simultaneously? Is it 1/n vs (1/n*1/n*... (n/2+1 times)?. Is 1/n correct
probability for _a specific subvol (hashed-subvol)_ going down (as
opposed to _any one subvol_ going down)?


If we suppose p to be the probability of failure of a subvolume in a 
period of time (a year for example), all subvolumes have the same 
probability, and we have N subvolumes, then:


Probability of failure of hashed-subvol: p
Probability of failure of N/2 + 1 or more subvols: 
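
(The expression itself did not survive in the archive; assuming the N
subvolumes fail independently, each with probability p, the standard
binomial tail would be:)

    P(\text{at least } N/2 + 1 \text{ subvols fail})
        = \sum_{k=\lfloor N/2 \rfloor + 1}^{N} \binom{N}{k}\, p^{k}\, (1-p)^{N-k}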

Note that this probability says how much probable is that N/2 + 1 
subvols or more fail in the specified period of time, but not 
necessarily simultaneously. If we suppose that subvolumes are recovered 
as fast as possible, the real probability of simultaneous failure will 
be much smaller.


In worst case (not recovering the failed subvolumes in the given period 
of time), if p < 0.5 or N = 2 (and p != 1), then it's always better to 
check N/2 + 1 subvolumes. Otherwise, it's better to check the hashed-subvol.


I think that p should always be much smaller than 0.5 for small periods 
of time where subvolume recovery could not be completed before other 
failures, so checking half plus one subvols should always be the best 
option in terms of probability. Performance can suffer though if some 
kind of synchronization is needed.


Xavi







   Let me know if this approach is suitable.



Regards
Mohit Agrawal

On Wed, Sep 7, 2016 at 10:27 PM, Pranith Kumar Karampuri
> wrote:



On Wed, Sep 7, 2016 at 9:46 PM, Mohit Agrawal
> wrote:

Hi Pranith,


In current approach i am getting list of xattr from
first up volume and update the user attributes from that
xattr to
all other volumes.

I have assumed first up subvol is source and rest of
them are sink as we are doing same in dht_dir_attr_heal.


I think first up subvol is different for different mounts as
per my understanding, I could be wrong.



Regards
Mohit Agrawal

On Wed, Sep 7, 2016 at 9:34 PM, Pranith Kumar Karampuri
  

Re: [Gluster-devel] Query regards to heal xattr heal in dht

2016-09-15 Thread Raghavendra G
On Thu, Sep 15, 2016 at 12:02 PM, Nithya Balachandran 
wrote:

>
>
> On 8 September 2016 at 12:02, Mohit Agrawal  wrote:
>
>> Hi All,
>>
>>I have one another solution to heal user xattr but before implement it
>> i would like to discuss with you.
>>
>>Can i call function (dht_dir_xattr_heal internally it is calling
>> syncop_setxattr) to heal xattr in dht_getxattr_cbk in last
>>after make sure we have a valid xattr.
>>In function(dht_dir_xattr_heal) it will copy blindly all user xattr on
>> all subvolume or i can compare subvol xattr with valid xattr if there is
>> any mismatch then i will call syncop_setxattr otherwise no need to call.
>> syncop_setxattr.
>>
>
>
> This can be problematic if a particular xattr is being removed - it might
> still exist on some subvols. IIUC, the heal would go and reset it again?
>
> One option is to use the hash subvol for the dir as the source - so
> perform xattr op on hashed subvol first and on the others only if it
> succeeds on the hashed. This does have the problem of being unable to set
> xattrs if the hashed subvol is unavailable. This might not be such a big
> deal in case of distributed replicate or distribute disperse volumes but
> will affect pure distribute. However, this way we can at least be
> reasonably certain of the correctness (leaving rebalance out of the
> picture).
>

* What is the behavior of getxattr when hashed subvol is down? Should we
succeed with values from non-hashed subvols or should we fail getxattr?
With hashed-subvol as source of truth, its difficult to determine
correctness of xattrs and their values when it is down.

* setxattr is an inode operation (as opposed to entry operation). So, we
cannot calculate hashed-subvol as in (get)(set)xattr, parent layout and
"basename" is not available. This forces us to store hashed subvol in
inode-ctx. Now, when the hashed-subvol changes we need to update these
inode-ctxs too.

What do you think about a Quorum based solution to this problem?

1. setxattr succeeds only if it is successful on at least (n/2 + 1) number
of subvols.
2. getxattr succeeds only if it is successful and values match on at least
(n/2 + 1) number of subvols.

The flip-side of this solution is we are increasing the probability of
failure of (get)(set)xattr operations as opposed to the hashed-subvol as
source of truth solution. Or are we - how do we compare probability of
hashed-subvol going down with probability of (n/2 + 1) nodes going down
simultaneously? Is it 1/n vs (1/n*1/n*... (n/2+1 times)?. Is 1/n correct
probability for _a specific subvol (hashed-subvol)_ going down (as opposed
to _any one subvol_ going down)?
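
A minimal sketch of such a quorum check over the collected per-subvol replies
(hypothetical helpers and types, not existing DHT code) might be:

#include <stdbool.h>
#include <string.h>

/* Hypothetical per-subvol reply for one xattr operation. */
struct xattr_reply {
        bool        op_succeeded;
        const char *value;        /* NULL when getxattr failed */
};

/* setxattr quorum: the fop succeeded on at least (n/2 + 1) subvols. */
static bool
setxattr_has_quorum (const struct xattr_reply *r, int n)
{
        int i, ok = 0;

        for (i = 0; i < n; i++)
                if (r[i].op_succeeded)
                        ok++;
        return ok >= (n / 2 + 1);
}

/* getxattr quorum: at least (n/2 + 1) subvols succeeded and agree on the value. */
static bool
getxattr_has_quorum (const struct xattr_reply *r, int n, const char **winner)
{
        int i, j, matches;

        for (i = 0; i < n; i++) {
                if (!r[i].op_succeeded || !r[i].value)
                        continue;
                matches = 0;
                for (j = 0; j < n; j++)
                        if (r[j].op_succeeded && r[j].value &&
                            strcmp (r[i].value, r[j].value) == 0)
                                matches++;
                if (matches >= (n / 2 + 1)) {
                        *winner = r[i].value;
                        return true;
                }
        }
        return false;
}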



>
>
>>
>>Let me know if this approach is suitable.
>>
>>
>>
>> Regards
>> Mohit Agrawal
>>
>> On Wed, Sep 7, 2016 at 10:27 PM, Pranith Kumar Karampuri <
>> pkara...@redhat.com> wrote:
>>
>>>
>>>
>>> On Wed, Sep 7, 2016 at 9:46 PM, Mohit Agrawal 
>>> wrote:
>>>
 Hi Pranith,


 In current approach i am getting list of xattr from first up volume and
 update the user attributes from that xattr to
 all other volumes.

 I have assumed first up subvol is source and rest of them are sink as
 we are doing same in dht_dir_attr_heal.

>>>
>>> I think first up subvol is different for different mounts as per my
>>> understanding, I could be wrong.
>>>
>>>

 Regards
 Mohit Agrawal

 On Wed, Sep 7, 2016 at 9:34 PM, Pranith Kumar Karampuri <
 pkara...@redhat.com> wrote:

> hi Mohit,
>How does dht find which subvolume has the correct list of
> xattrs? i.e. how does it determine which subvolume is source and which is
> sink?
>
> On Wed, Sep 7, 2016 at 2:35 PM, Mohit Agrawal 
> wrote:
>
>> Hi,
>>
>>   I am trying to find out solution of one problem in dht specific to
>> user xattr healing.
>>   I tried to correct it in a same way as we are doing for healing dir
>> attribute but i feel it is not best solution.
>>
>>   To find a right way to heal xattr i want to discuss with you if
>> anyone does have better solution to correct it.
>>
>>   Problem:
>>In a distributed volume environment custom extended attribute
>> value for a directory does not display correct value after stop/start the
>> brick. If any extended attribute value is set for a directory after stop
>> the brick the attribute value is not updated on brick after start the 
>> brick.
>>
>>   Current approach:
>> 1) function set_user_xattr to store user extended attribute in
>> dictionary
>> 2) function dht_dir_xattr_heal call syncop_setxattr to update the
>> attribute on all volume
>> 3) Call the function (dht_dir_xattr_heal) for every directory
>> lookup in dht_lookup_revalidate_cbk
>>
>>   Pseudocode for function dht_dir_xattr_heal is like below

Re: [Gluster-devel] Query regards to heal xattr heal in dht

2016-09-15 Thread Pranith Kumar Karampuri
On Thu, Sep 15, 2016 at 12:02 PM, Nithya Balachandran 
wrote:

>
>
> On 8 September 2016 at 12:02, Mohit Agrawal  wrote:
>
>> Hi All,
>>
>>I have one another solution to heal user xattr but before implement it
>> i would like to discuss with you.
>>
>>Can i call function (dht_dir_xattr_heal internally it is calling
>> syncop_setxattr) to heal xattr in dht_getxattr_cbk in last
>>after make sure we have a valid xattr.
>>In function(dht_dir_xattr_heal) it will copy blindly all user xattr on
>> all subvolume or i can compare subvol xattr with valid xattr if there is
>> any mismatch then i will call syncop_setxattr otherwise no need to call.
>> syncop_setxattr.
>>
>
>
> This can be problematic if a particular xattr is being removed - it might
> still exist on some subvols. IIUC, the heal would go and reset it again?
>
> One option is to use the hash subvol for the dir as the source - so
> perform xattr op on hashed subvol first and on the others only if it
> succeeds on the hashed. This does have the problem of being unable to set
> xattrs if the hashed subvol is unavailable. This might not be such a big
> deal in case of distributed replicate or distribute disperse volumes but
> will affect pure distribute. However, this way we can at least be
> reasonably certain of the correctness (leaving rebalance out of the
> picture).
>

Yes, this seems fine.


>
>
>
>>
>>Let me know if this approach is suitable.
>>
>>
>>
>> Regards
>> Mohit Agrawal
>>
>> On Wed, Sep 7, 2016 at 10:27 PM, Pranith Kumar Karampuri <
>> pkara...@redhat.com> wrote:
>>
>>>
>>>
>>> On Wed, Sep 7, 2016 at 9:46 PM, Mohit Agrawal 
>>> wrote:
>>>
 Hi Pranith,


 In current approach i am getting list of xattr from first up volume and
 update the user attributes from that xattr to
 all other volumes.

 I have assumed first up subvol is source and rest of them are sink as
 we are doing same in dht_dir_attr_heal.

>>>
>>> I think first up subvol is different for different mounts as per my
>>> understanding, I could be wrong.
>>>
>>>

 Regards
 Mohit Agrawal

 On Wed, Sep 7, 2016 at 9:34 PM, Pranith Kumar Karampuri <
 pkara...@redhat.com> wrote:

> hi Mohit,
>How does dht find which subvolume has the correct list of
> xattrs? i.e. how does it determine which subvolume is source and which is
> sink?
>
> On Wed, Sep 7, 2016 at 2:35 PM, Mohit Agrawal 
> wrote:
>
>> Hi,
>>
>>   I am trying to find out solution of one problem in dht specific to
>> user xattr healing.
>>   I tried to correct it in a same way as we are doing for healing dir
>> attribute but i feel it is not best solution.
>>
>>   To find a right way to heal xattr i want to discuss with you if
>> anyone does have better solution to correct it.
>>
>>   Problem:
>>In a distributed volume environment custom extended attribute
>> value for a directory does not display correct value after stop/start the
>> brick. If any extended attribute value is set for a directory after stop
>> the brick the attribute value is not updated on brick after start the 
>> brick.
>>
>>   Current approach:
>> 1) function set_user_xattr to store user extended attribute in
>> dictionary
>> 2) function dht_dir_xattr_heal call syncop_setxattr to update the
>> attribute on all volume
>> 3) Call the function (dht_dir_xattr_heal) for every directory
>> lookup in dht_lookup_revalidate_cbk
>>
>>   Pseudocode for function dht_dir_xattr_heal is like below
>>
>>    1) First it will fetch attributes from first up volume and store
>> into xattr.
>>2) Run loop on all subvolume and fetch existing attributes from
>> every volume
>>3) Replace user attributes from current attributes with xattr user
>> attributes
>>4) Set latest extended attributes(current + old user attributes)
>> into subvol.
>>
>>
>>In this current approach problem is
>>
>>1) it will call heal function(dht_dir_xattr_heal) for every
>> directory lookup without comparing xattr.
>> 2) The function internally call syncop xattr for every subvolume
>> that would be a expensive operation.
>>
>>I have one another way like below to correct it but again in this
>> one it does have dependency on time (not sure time is synch on all bricks
>> or not)
>>
>>1) At the time of set extended attribute(setxattr) change time in
>> metadata at server side
>>2) Compare change time before call healing function in
>> dht_revalidate_cbk
>>
>> Please share your input on this.
>> Appreciate your input.
>>
>> Regards
>> Mohit Agrawal
>>
>> ___
>> 

Re: [Gluster-devel] Query regards to heal xattr heal in dht

2016-09-15 Thread Nithya Balachandran
On 8 September 2016 at 12:02, Mohit Agrawal  wrote:

> Hi All,
>
>I have one another solution to heal user xattr but before implement it
> i would like to discuss with you.
>
>Can i call function (dht_dir_xattr_heal internally it is calling
> syncop_setxattr) to heal xattr in dht_getxattr_cbk in last
>after make sure we have a valid xattr.
>In function(dht_dir_xattr_heal) it will copy blindly all user xattr on
> all subvolume or i can compare subvol xattr with valid xattr if there is
> any mismatch then i will call syncop_setxattr otherwise no need to call.
> syncop_setxattr.
>


This can be problematic if a particular xattr is being removed - it might
still exist on some subvols. IIUC, the heal would go and reset it again?

One option is to use the hash subvol for the dir as the source - so perform
xattr op on hashed subvol first and on the others only if it succeeds on
the hashed. This does have the problem of being unable to set xattrs if the
hashed subvol is unavailable. This might not be such a big deal in case of
distributed replicate or distribute disperse volumes but will affect pure
distribute. However, this way we can at least be reasonably certain of the
correctness (leaving rebalance out of the picture).



>
>Let me know if this approach is suitable.
>
>
>
> Regards
> Mohit Agrawal
>
> On Wed, Sep 7, 2016 at 10:27 PM, Pranith Kumar Karampuri <
> pkara...@redhat.com> wrote:
>
>>
>>
>> On Wed, Sep 7, 2016 at 9:46 PM, Mohit Agrawal 
>> wrote:
>>
>>> Hi Pranith,
>>>
>>>
>>> In current approach i am getting list of xattr from first up volume and
>>> update the user attributes from that xattr to
>>> all other volumes.
>>>
>>> I have assumed first up subvol is source and rest of them are sink as we
>>> are doing same in dht_dir_attr_heal.
>>>
>>
>> I think first up subvol is different for different mounts as per my
>> understanding, I could be wrong.
>>
>>
>>>
>>> Regards
>>> Mohit Agrawal
>>>
>>> On Wed, Sep 7, 2016 at 9:34 PM, Pranith Kumar Karampuri <
>>> pkara...@redhat.com> wrote:
>>>
 hi Mohit,
How does dht find which subvolume has the correct list of
 xattrs? i.e. how does it determine which subvolume is source and which is
 sink?

 On Wed, Sep 7, 2016 at 2:35 PM, Mohit Agrawal 
 wrote:

> Hi,
>
>   I am trying to find out solution of one problem in dht specific to
> user xattr healing.
>   I tried to correct it in a same way as we are doing for healing dir
> attribute but i feel it is not best solution.
>
>   To find a right way to heal xattr i want to discuss with you if
> anyone does have better solution to correct it.
>
>   Problem:
>In a distributed volume environment custom extended attribute value
> for a directory does not display correct value after stop/start the brick.
> If any extended attribute value is set for a directory after stop the 
> brick
> the attribute value is not updated on brick after start the brick.
>
>   Current approach:
> 1) function set_user_xattr to store user extended attribute in
> dictionary
> 2) function dht_dir_xattr_heal call syncop_setxattr to update the
> attribute on all volume
> 3) Call the function (dht_dir_xattr_heal) for every directory
> lookup in dht_lookup_revalidate_cbk
>
>   Pseudocode for function dht_dir_xattr_heal is like below
>
>1) First it will fetch attributes from first up volume and store
> into xattr.
>2) Run loop on all subvolume and fetch existing attributes from
> every volume
>3) Replace user attributes from current attributes with xattr user
> attributes
>4) Set latest extended attributes(current + old user attributes)
> into subvol.
>
>
>In this current approach problem is
>
>1) it will call heal function(dht_dir_xattr_heal) for every
> directory lookup without comparing xattr.
> 2) The function internally call syncop xattr for every subvolume
> that would be a expensive operation.
>
>I have one another way like below to correct it but again in this
> one it does have dependency on time (not sure time is synch on all bricks
> or not)
>
>1) At the time of set extended attribute(setxattr) change time in
> metadata at server side
>2) Compare change time before call healing function in
> dht_revalidate_cbk
>
> Please share your input on this.
> Appreciate your input.
>
> Regards
> Mohit Agrawal
>
> ___
> Gluster-devel mailing list
> Gluster-devel@gluster.org
> http://www.gluster.org/mailman/listinfo/gluster-devel
>



 --
 Pranith

>>>
>>>
>>
>>
>> --
>> Pranith
>>
>
>
> ___
> 

Re: [Gluster-devel] Query regards to heal xattr heal in dht

2016-09-10 Thread Mohit Agrawal
Hi All,

   I have uploaded a new patch (http://review.gluster.org/#/c/15456/); please
do the code review.

Regards
Mohit Agrawal


On Thu, Sep 8, 2016 at 12:02 PM, Mohit Agrawal  wrote:

> Hi All,
>
>I have one another solution to heal user xattr but before implement it
> i would like to discuss with you.
>
>Can i call function (dht_dir_xattr_heal internally it is calling
> syncop_setxattr) to heal xattr in dht_getxattr_cbk in last
>after make sure we have a valid xattr.
>In function(dht_dir_xattr_heal) it will copy blindly all user xattr on
> all subvolume or i can compare subvol xattr with valid xattr if there is
> any mismatch then i will call syncop_setxattr otherwise no need to call.
> syncop_setxattr.
>
>Let me know if this approach is suitable.
>
>
>
> Regards
> Mohit Agrawal
>
> On Wed, Sep 7, 2016 at 10:27 PM, Pranith Kumar Karampuri <
> pkara...@redhat.com> wrote:
>
>>
>>
>> On Wed, Sep 7, 2016 at 9:46 PM, Mohit Agrawal 
>> wrote:
>>
>>> Hi Pranith,
>>>
>>>
>>> In current approach i am getting list of xattr from first up volume and
>>> update the user attributes from that xattr to
>>> all other volumes.
>>>
>>> I have assumed first up subvol is source and rest of them are sink as we
>>> are doing same in dht_dir_attr_heal.
>>>
>>
>> I think first up subvol is different for different mounts as per my
>> understanding, I could be wrong.
>>
>>
>>>
>>> Regards
>>> Mohit Agrawal
>>>
>>> On Wed, Sep 7, 2016 at 9:34 PM, Pranith Kumar Karampuri <
>>> pkara...@redhat.com> wrote:
>>>
 hi Mohit,
How does dht find which subvolume has the correct list of
 xattrs? i.e. how does it determine which subvolume is source and which is
 sink?

 On Wed, Sep 7, 2016 at 2:35 PM, Mohit Agrawal 
 wrote:

> Hi,
>
>   I am trying to find out solution of one problem in dht specific to
> user xattr healing.
>   I tried to correct it in a same way as we are doing for healing dir
> attribute but i feel it is not best solution.
>
>   To find a right way to heal xattr i want to discuss with you if
> anyone does have better solution to correct it.
>
>   Problem:
>In a distributed volume environment custom extended attribute value
> for a directory does not display correct value after stop/start the brick.
> If any extended attribute value is set for a directory after stop the 
> brick
> the attribute value is not updated on brick after start the brick.
>
>   Current approach:
> 1) function set_user_xattr to store user extended attribute in
> dictionary
> 2) function dht_dir_xattr_heal call syncop_setxattr to update the
> attribute on all volume
> 3) Call the function (dht_dir_xattr_heal) for every directory
> lookup in dht_lookup_revalidate_cbk
>
>   Pseudocode for function dht_dir_xattr_heal is like below
>
>1) First it will fetch attributes from first up volume and store
> into xattr.
>2) Run loop on all subvolume and fetch existing attributes from
> every volume
>3) Replace user attributes from current attributes with xattr user
> attributes
>4) Set latest extended attributes(current + old user attributes)
> inot subvol.
>
>
>In this current approach problem is
>
>1) it will call heal function(dht_dir_xattr_heal) for every
> directory lookup without comparing xattr.
> 2) The function internally call syncop xattr for every subvolume
> that would be a expensive operation.
>
>I have one another way like below to correct it but again in this
> one it does have dependency on time (not sure time is synch on all bricks
> or not)
>
>1) At the time of set extended attribute(setxattr) change time in
> metadata at server side
>2) Compare change time before call healing function in
> dht_revalidate_cbk
>
> Please share your input on this.
> Appreciate your input.
>
> Regards
> Mohit Agrawal
>
> ___
> Gluster-devel mailing list
> Gluster-devel@gluster.org
> http://www.gluster.org/mailman/listinfo/gluster-devel
>



 --
 Pranith

>>>
>>>
>>
>>
>> --
>> Pranith
>>
>
>
___
Gluster-devel mailing list
Gluster-devel@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-devel

Re: [Gluster-devel] Query regards to heal xattr heal in dht

2016-09-07 Thread Pranith Kumar Karampuri
On Wed, Sep 7, 2016 at 9:46 PM, Mohit Agrawal  wrote:

> Hi Pranith,
>
>
> In the current approach I get the list of xattrs from the first up
> subvolume and update the user attributes from that list on
> all other subvolumes.
>
> I have assumed the first up subvol is the source and the rest are sinks,
> as we do the same in dht_dir_attr_heal.
>

I think the first up subvol can be different for different mounts, as per my
understanding; I could be wrong.


>
> Regards
> Mohit Agrawal
>
> On Wed, Sep 7, 2016 at 9:34 PM, Pranith Kumar Karampuri <
> pkara...@redhat.com> wrote:
>
>> hi Mohit,
>>How does dht find which subvolume has the correct list of xattrs?
>> i.e. how does it determine which subvolume is source and which is sink?
>>
>> On Wed, Sep 7, 2016 at 2:35 PM, Mohit Agrawal 
>> wrote:
>>
>>> Hi,
>>>
>>>   I am trying to find a solution to a problem in dht specific to user
>>> xattr healing.
>>>   I tried to correct it in the same way we do for healing directory
>>> attributes, but I feel that is not the best solution.
>>>
>>>   To find the right way to heal xattrs I want to discuss it with you, in
>>> case anyone has a better solution.
>>>
>>>   Problem:
>>>    In a distributed volume, a custom extended attribute value set on a
>>> directory does not show the correct value after stopping and starting a
>>> brick. If an extended attribute value is set on a directory while a brick
>>> is stopped, that value is not updated on the brick after the brick is
>>> started again.
>>>
>>>   Current approach:
>>>     1) A function set_user_xattr stores the user extended attributes in a
>>> dictionary.
>>>     2) The function dht_dir_xattr_heal calls syncop_setxattr to update the
>>> attributes on all subvolumes.
>>>     3) dht_dir_xattr_heal is called for every directory lookup in
>>> dht_lookup_revalidate_cbk.
>>>
>>>   Pseudocode for the function dht_dir_xattr_heal is as below:
>>>
>>>    1) First it fetches the attributes from the first up subvolume and
>>> stores them into xattr.
>>>    2) It loops over all subvolumes and fetches the existing attributes
>>> from every subvolume.
>>>    3) It replaces the user attributes in the current attributes with the
>>> user attributes from xattr.
>>>    4) It sets the latest extended attributes (current + old user
>>> attributes) on the subvolume.
>>>
>>>
>>>    The problems with this current approach are:
>>>
>>>    1) It calls the heal function (dht_dir_xattr_heal) for every directory
>>> lookup without comparing xattrs.
>>>    2) The function internally calls syncop_setxattr for every subvolume,
>>> which is an expensive operation.
>>>
>>>    I have another way, described below, to correct it, but it depends on
>>> time (I am not sure whether time is in sync on all bricks or not):
>>>
>>>    1) At the time of setting an extended attribute (setxattr), update the
>>> change time in the metadata on the server side.
>>>    2) Compare the change time before calling the healing function in
>>> dht_revalidate_cbk.
>>>
>>> Please share your input on this.
>>> Appreciate your input.
>>>
>>> Regards
>>> Mohit Agrawal
>>>
>>> ___
>>> Gluster-devel mailing list
>>> Gluster-devel@gluster.org
>>> http://www.gluster.org/mailman/listinfo/gluster-devel
>>>
>>
>>
>>
>> --
>> Pranith
>>
>
>


-- 
Pranith
___
Gluster-devel mailing list
Gluster-devel@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-devel

Re: [Gluster-devel] Query regards to heal xattr heal in dht

2016-09-07 Thread Pranith Kumar Karampuri
hi Mohit,
   How does dht find which subvolume has the correct list of xattrs?
i.e. how does it determine which subvolume is source and which is sink?

On Wed, Sep 7, 2016 at 2:35 PM, Mohit Agrawal  wrote:

> Hi,
>
>   I am trying to find a solution to a problem in dht specific to user
> xattr healing.
>   I tried to correct it in the same way we do for healing directory
> attributes, but I feel that is not the best solution.
>
>   To find the right way to heal xattrs I want to discuss it with you, in
> case anyone has a better solution.
>
>   Problem:
>    In a distributed volume, a custom extended attribute value set on a
> directory does not show the correct value after stopping and starting a
> brick. If an extended attribute value is set on a directory while a brick
> is stopped, that value is not updated on the brick after the brick is
> started again.
>
>   Current approach:
>     1) A function set_user_xattr stores the user extended attributes in a
> dictionary.
>     2) The function dht_dir_xattr_heal calls syncop_setxattr to update the
> attributes on all subvolumes.
>     3) dht_dir_xattr_heal is called for every directory lookup in
> dht_lookup_revalidate_cbk.
>
>   Pseudocode for the function dht_dir_xattr_heal is as below:
>
>    1) First it fetches the attributes from the first up subvolume and
> stores them into xattr.
>    2) It loops over all subvolumes and fetches the existing attributes
> from every subvolume.
>    3) It replaces the user attributes in the current attributes with the
> user attributes from xattr.
>    4) It sets the latest extended attributes (current + old user
> attributes) on the subvolume.
>
>
>    The problems with this current approach are:
>
>    1) It calls the heal function (dht_dir_xattr_heal) for every directory
> lookup without comparing xattrs.
>    2) The function internally calls syncop_setxattr for every subvolume,
> which is an expensive operation.
>
>    I have another way, described below, to correct it, but it depends on
> time (I am not sure whether time is in sync on all bricks or not):
>
>    1) At the time of setting an extended attribute (setxattr), update the
> change time in the metadata on the server side.
>    2) Compare the change time before calling the healing function in
> dht_revalidate_cbk.
>
> Please share your input on this.
> Appreciate your input.
>
> Regards
> Mohit Agrawal
>
> ___
> Gluster-devel mailing list
> Gluster-devel@gluster.org
> http://www.gluster.org/mailman/listinfo/gluster-devel
>



-- 
Pranith
___
Gluster-devel mailing list
Gluster-devel@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-devel

[Gluster-devel] Query regards to heal xattr heal in dht

2016-09-07 Thread Mohit Agrawal
Hi,

  I am trying to find a solution to a problem in dht specific to user
xattr healing.
  I tried to correct it in the same way we do for healing directory
attributes, but I feel that is not the best solution.

  To find the right way to heal xattrs I want to discuss it with you, in
case anyone has a better solution.

  Problem:
   In a distributed volume, a custom extended attribute value set on a
directory does not show the correct value after stopping and starting a
brick. If an extended attribute value is set on a directory while a brick
is stopped, that value is not updated on the brick after the brick is
started again.

  Current approach:
    1) A function set_user_xattr stores the user extended attributes in a
dictionary.
    2) The function dht_dir_xattr_heal calls syncop_setxattr to update the
attributes on all subvolumes.
    3) dht_dir_xattr_heal is called for every directory lookup in
dht_lookup_revalidate_cbk.

  Pseudocode for the function dht_dir_xattr_heal is as below:

   1) First it fetches the attributes from the first up subvolume and
stores them into xattr.
   2) It loops over all subvolumes and fetches the existing attributes
from every subvolume.
   3) It replaces the user attributes in the current attributes with the
user attributes from xattr.
   4) It sets the latest extended attributes (current + old user
attributes) on the subvolume.


   The problems with this current approach are:

   1) It calls the heal function (dht_dir_xattr_heal) for every directory
lookup without comparing xattrs.
   2) The function internally calls syncop_setxattr for every subvolume,
which is an expensive operation.

   I have another way, described below, to correct it, but it depends on
time (I am not sure whether time is in sync on all bricks or not); a small
illustrative sketch follows this list:

   1) At the time of setting an extended attribute (setxattr), update the
change time in the metadata on the server side.
   2) Compare the change time before calling the healing function in
dht_revalidate_cbk.
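
As a rough illustration of the ctime-based idea above (and of its clock-sync
caveat), here is a small sketch, assuming locally accessible brick paths,
that compares the change time of the same directory on two bricks with
stat(2); the program and paths are placeholders invented for the example.

/* ctime_check.c -- illustrative only: decide whether an xattr heal is worth
 * attempting by comparing the change time of a directory on two bricks. */
#include <stdio.h>
#include <sys/stat.h>

int main(int argc, char **argv)
{
    struct stat a, b;

    if (argc != 3) {
        fprintf(stderr, "usage: %s <dir-on-brick1> <dir-on-brick2>\n", argv[0]);
        return 1;
    }
    if (stat(argv[1], &a) != 0 || stat(argv[2], &b) != 0) {
        perror("stat");
        return 1;
    }

    /* setxattr updates st_ctime, so differing ctimes hint that the extended
     * attributes may be out of sync -- but only if the brick clocks are in
     * sync, which is exactly the dependency mentioned above. */
    if (a.st_ctime != b.st_ctime)
        printf("ctime mismatch: heal candidate\n");
    else
        printf("ctime matches: skip the heal\n");
    return 0;
}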

Please share your input on this.
Appreciate your input.

Regards
Mohit Agrawal
___
Gluster-devel mailing list
Gluster-devel@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-devel

[Gluster-devel] Query!

2016-06-17 Thread ABHISHEK PALIWAL
Hi,

I am using Gluster 3.7.6 and performing plug-in/plug-out of the board, but I
am getting the following brick logs after plugging the board in again:

[2016-06-17 07:14:36.122421] W [trash.c:1858:trash_mkdir]
0-c_glusterfs-trash: mkdir issued on /.trashcan/, which is not permitted
[2016-06-17 07:14:36.122487] E [MSGID: 115056]
[server-rpc-fops.c:509:server_mkdir_cbk] 0-c_glusterfs-server: 9705: MKDIR
/.trashcan (----0001/.trashcan) ==> (Operation
not permitted) [Operation not permitted]
[2016-06-17 07:14:36.139773] W [trash.c:1858:trash_mkdir]
0-c_glusterfs-trash: mkdir issued on /.trashcan/, which is not permitted
[2016-06-17 07:14:36.139861] E [MSGID: 115056]
[server-rpc-fops.c:509:server_mkdir_cbk] 0-c_glusterfs-server: 9722: MKDIR
/.trashcan (----0001/.trashcan) ==> (Operation
not permitted) [Operation not permitted]


Could anyone tell me the reason behind this failure, i.e. when and why these
logs occur?
I have already posted the same query previously but did not get any response.

-- 




Regards
Abhishek Paliwal
___
Gluster-devel mailing list
Gluster-devel@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-devel

Re: [Gluster-devel] Query!

2016-05-20 Thread ABHISHEK PALIWAL
I am not getting any failure, and after restarting glusterd, when I run the
volume info command it creates the brick directory
as well as .glusterfs (and the xattrs).

But sometimes, even after restarting glusterd, the volume info command shows
no volume present.

Could you please tell me why this unpredictable problem is occurring?

Regards,
Abhishek

On Fri, May 20, 2016 at 3:50 PM, Kaushal M  wrote:

> This would erase the xattrs set on the brick root (volume-id), which
> identify it as a brick. Brick processes will fail to start when this
> xattr isn't present.
>
>
> On Fri, May 20, 2016 at 3:42 PM, ABHISHEK PALIWAL
>  wrote:
> > Hi
> >
> > What will happen if we format the filesystem on which the bricks of a
> > replicated gluster volume are created and then restart glusterd on both
> > nodes?
> >
> > Will it work fine, or do we also need to remove the /var/lib/glusterd
> > directory in this case?
> >
> > --
> > Regards
> > Abhishek Paliwal
> >
> > ___
> > Gluster-devel mailing list
> > Gluster-devel@gluster.org
> > http://www.gluster.org/mailman/listinfo/gluster-devel
>



-- 




Regards
Abhishek Paliwal
___
Gluster-devel mailing list
Gluster-devel@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-devel

Re: [Gluster-devel] Query!

2016-05-20 Thread Kaushal M
This would erase the xattrs set on the brick root (volume-id), which
identify it as a brick. Brick processes will fail to start when this
xattr isn't present.
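
As a quick illustration of the effect Kaushal describes, the sketch below
simply checks whether a brick root still carries the trusted.glusterfs.volume-id
xattr (the name commonly seen on a brick root; verify it on your installation,
and note that reading trusted.* xattrs needs root). After reformatting the
brick filesystem that xattr is gone, which is why the brick process refuses to
start. This is only an illustrative check written for this thread, not anything
glusterd itself runs.

/* brick_id_check.c -- illustrative only: report whether a brick root still
 * carries the volume-id xattr that identifies it as a brick. */
#include <errno.h>
#include <stdio.h>
#include <string.h>
#include <sys/types.h>
#include <sys/xattr.h>

int main(int argc, char **argv)
{
    unsigned char id[16];
    ssize_t n;

    if (argc != 2) {
        fprintf(stderr, "usage: %s <brick-root>\n", argv[0]);
        return 1;
    }

    n = getxattr(argv[1], "trusted.glusterfs.volume-id", id, sizeof(id));
    if (n < 0) {
        /* After reformatting the brick filesystem this is what you get,
         * and the brick process will refuse to start. */
        fprintf(stderr, "volume-id missing: %s\n", strerror(errno));
        return 1;
    }

    printf("volume-id present (%zd bytes)\n", n);
    return 0;
}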


On Fri, May 20, 2016 at 3:42 PM, ABHISHEK PALIWAL
 wrote:
> Hi
>
> What will happen if we format the filesystem on which the bricks of a
> replicated gluster volume are created and then restart glusterd on both
> nodes?
>
> Will it work fine, or do we also need to remove the /var/lib/glusterd
> directory in this case?
>
> --
> Regards
> Abhishek Paliwal
>
> ___
> Gluster-devel mailing list
> Gluster-devel@gluster.org
> http://www.gluster.org/mailman/listinfo/gluster-devel
___
Gluster-devel mailing list
Gluster-devel@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-devel


[Gluster-devel] Query!

2016-05-20 Thread ABHISHEK PALIWAL
Hi

What will happen if we format the filesystem on which the bricks of a
replicated gluster volume are created and then restart glusterd on both
nodes?

Will it work fine, or do we also need to remove the /var/lib/glusterd
directory in this case?

-- 
Regards
Abhishek Paliwal
___
Gluster-devel mailing list
Gluster-devel@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-devel

Re: [Gluster-devel] Query on healing process

2016-03-14 Thread Ravishankar N

On 03/14/2016 10:36 AM, ABHISHEK PALIWAL wrote:

Hi Ravishankar,

I just want to inform you that this file has some properties that differ 
from other files: it is a file with a fixed size, and when there is no 
space left in the file, the next data wraps around and is written from 
the top of the file.


That means in this file we are wrapping the data as well.

So, I just want to know whether this behaviour of the file will affect 
gluster's ability to identify split-brain or the xattr attributes?

Hi,
No, it shouldn't matter at what offset the writes happen. The xattrs only 
track that the write was missed (and therefore a pending heal), 
irrespective of (offset, length).

Ravi
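
For context, the wrap-around logging pattern described in the question above
boils down to something like the sketch below: a simplified model, assuming
fixed-size records so that a write never straddles the wrap point. As noted
in the reply, the offset at which such a write lands does not change how the
pending-heal xattrs are accounted.

/* wraplog.c -- simplified model of a fixed-size log file whose writes wrap
 * around to the top once the end of the file is reached. */
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

#define LOG_SIZE 4096   /* fixed total size of the log file */
#define REC_SIZE 64     /* fixed record size; divides LOG_SIZE evenly */

static off_t next_off;

static int log_record(int fd, const char *msg)
{
    char rec[REC_SIZE];

    memset(rec, 0, sizeof(rec));
    snprintf(rec, sizeof(rec), "%s", msg);

    /* Wrap back to offset 0 and overwrite the oldest entries. */
    if (next_off + REC_SIZE > LOG_SIZE)
        next_off = 0;

    if (pwrite(fd, rec, sizeof(rec), next_off) != REC_SIZE)
        return -1;

    next_off += REC_SIZE;
    return 0;
}

int main(void)
{
    int fd = open("wrapped.log", O_CREAT | O_RDWR, 0644);

    if (fd < 0)
        return 1;

    for (int i = 0; i < 200; i++) {   /* more records than fit, so it wraps */
        char msg[32];
        snprintf(msg, sizeof(msg), "event %d", i);
        if (log_record(fd, msg) != 0)
            return 1;
    }
    close(fd);
    return 0;
}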



Regards,
Abhishek

On Fri, Mar 4, 2016 at 7:00 PM, ABHISHEK PALIWAL 
> wrote:




On Fri, Mar 4, 2016 at 6:36 PM, Ravishankar N
> wrote:

On 03/04/2016 06:23 PM, ABHISHEK PALIWAL wrote:



Ok, just to confirm, glusterd  and other brick processes
are running after this node rebooted?
When you run the above command, you need to check
/var/log/glusterfs/glfsheal-volname.log logs errros.
Setting client-log-level to DEBUG would give you a more
verbose message

Yes, glusterd and other brick processes running fine. I have
check the /var/log/glusterfs/glfsheal-volname.log file
without the log-level= DEBUG. Here is the logs from that file

[2016-03-02 13:51:39.059440] I [MSGID: 101190]
[event-epoll.c:632:event_dispatch_epoll_worker] 0-epoll:
Started thread with index 1
[2016-03-02 13:51:39.072172] W [MSGID: 101012]
[common-utils.c:2776:gf_get_reserved_ports] 0-glusterfs:
could not open the file
/proc/sys/net/ipv4/ip_local_reserved_ports for getting
reserved ports info [No such file or directory]
[2016-03-02 13:51:39.072228] W [MSGID: 101081]
[common-utils.c:2810:gf_process_reserved_ports] 0-glusterfs:
Not able to get reserved ports, hence there is a possibility
that glusterfs may consume reserved port
[2016-03-02 13:51:39.072583] E
[socket.c:2278:socket_connect_finish] 0-gfapi: connection to
127.0.0.1:24007  failed (Connection
refused)


Not sure why ^^ occurs. You could try flushing iptables
(iptables -F), restart glusterd and run the heal info command
again .


No hint from the logs? I'll try your suggestion.



[2016-03-02 13:51:39.072663] E [MSGID: 104024]
[glfs-mgmt.c:738:mgmt_rpc_notify] 0-glfs-mgmt: failed to
connect with remote-host: localhost (Transport endpoint is
not connected) [Transport endpoint is not connected]
[2016-03-02 13:51:39.072700] I [MSGID: 104025]
[glfs-mgmt.c:744:mgmt_rpc_notify] 0-glfs-mgmt: Exhausted all
volfile servers [Transport endpoint is not connected]


# gluster volume heal c_glusterfs info split-brain
c_glusterfs: Not able to fetch volfile from glusterd
Volume heal failed.





And based on the your observation I understood that this
is not the problem of split-brain but *is there any way
through which can find out the file which is not in
split-brain as well as not in sync?*


`gluster volume heal c_glusterfs info split-brain` 
should give you files that need heal.




Sorry  I meant 'gluster volume heal c_glusterfs info' should
give you the files that need heal and 'gluster volume heal
c_glusterfs info split-brain' the list of files in split-brain.
The commands are detailed in

https://github.com/gluster/glusterfs-specs/blob/master/done/Features/heal-info-and-split-brain-resolution.md


Yes, I have tried this as well It is also giving Number of entries
: 0 means no healing is required but the file
/opt/lvmdir/c2/brick/logfiles/availability/CELLO_AVAILABILITY2_LOG.xml
is not in sync both of brick showing the different version of this
file.

You can see it in the getfattr command outcome as well.


# getfattr -m . -d -e hex
/opt/lvmdir/c2/brick/logfiles/availability/CELLO_AVAILABILITY2_LOG.xml

getfattr: Removing leading '/' from absolute path names
# file:
opt/lvmdir/c2/brick/logfiles/availability/CELLO_AVAILABILITY2_LOG.xml
trusted.afr.c_glusterfs-client-0=0x
trusted.afr.c_glusterfs-client-2=0x
trusted.afr.c_glusterfs-client-4=0x
trusted.afr.c_glusterfs-client-6=0x
trusted.afr.c_glusterfs-client-8=0x0006   // because
client8 is the latest client in our case, and the starting 8 digits
0006 are saying there is something in the changelog data.

Re: [Gluster-devel] Query on healing process

2016-03-13 Thread ABHISHEK PALIWAL
Hi Ravishankar,

I just want to inform you that this file has some properties that differ from
other files: it is a file with a fixed size, and when there is no space left
in the file, the next data wraps around and is written from the top of
the file.

That means in this file we are wrapping the data as well.

So, I just want to know whether this behaviour of the file will affect
gluster's ability to identify split-brain or the xattr attributes?

Regards,
Abhishek

On Fri, Mar 4, 2016 at 7:00 PM, ABHISHEK PALIWAL 
wrote:

>
>
> On Fri, Mar 4, 2016 at 6:36 PM, Ravishankar N 
> wrote:
>
>> On 03/04/2016 06:23 PM, ABHISHEK PALIWAL wrote:
>>
>>
>>> Ok, just to confirm, glusterd  and other brick processes are running
>>> after this node rebooted?
>>> When you run the above command, you need to check
>>> /var/log/glusterfs/glfsheal-volname.log logs errros. Setting
>>> client-log-level to DEBUG would give you a more verbose message
>>>
>>> Yes, glusterd and other brick processes running fine. I have check the
>> /var/log/glusterfs/glfsheal-volname.log file without the log-level= DEBUG.
>> Here is the logs from that file
>>
>> [2016-03-02 13:51:39.059440] I [MSGID: 101190]
>> [event-epoll.c:632:event_dispatch_epoll_worker] 0-epoll: Started thread
>> with index 1
>> [2016-03-02 13:51:39.072172] W [MSGID: 101012]
>> [common-utils.c:2776:gf_get_reserved_ports] 0-glusterfs: could not open the
>> file /proc/sys/net/ipv4/ip_local_reserved_ports for getting reserved ports
>> info [No such file or directory]
>> [2016-03-02 13:51:39.072228] W [MSGID: 101081]
>> [common-utils.c:2810:gf_process_reserved_ports] 0-glusterfs: Not able to
>> get reserved ports, hence there is a possibility that glusterfs may consume
>> reserved port
>> [2016-03-02 13:51:39.072583] E [socket.c:2278:socket_connect_finish]
>> 0-gfapi: connection to 127.0.0.1:24007 failed (Connection refused)
>>
>>
>> Not sure why ^^ occurs. You could try flushing iptables (iptables -F),
>> restart glusterd and run the heal info command again .
>>
>
> No hint from the logs? I'll try your suggestion.
>
>>
>> [2016-03-02 13:51:39.072663] E [MSGID: 104024]
>> [glfs-mgmt.c:738:mgmt_rpc_notify] 0-glfs-mgmt: failed to connect with
>> remote-host: localhost (Transport endpoint is not connected) [Transport
>> endpoint is not connected]
>> [2016-03-02 13:51:39.072700] I [MSGID: 104025]
>> [glfs-mgmt.c:744:mgmt_rpc_notify] 0-glfs-mgmt: Exhausted all volfile
>> servers [Transport endpoint is not connected]
>>
>>> # gluster volume heal c_glusterfs info split-brain
>>> c_glusterfs: Not able to fetch volfile from glusterd
>>> Volume heal failed.
>>>
>>>
>>>
>>>
>>> And based on the your observation I understood that this is not the
>>> problem of split-brain but *is there any way through which can find out
>>> the file which is not in split-brain as well as not in sync?*
>>>
>>>
>>> `gluster volume heal c_glusterfs info split-brain`  should give you
>>> files that need heal.
>>>
>>
>> Sorry  I meant 'gluster volume heal c_glusterfs info' should give you
>> the files that need heal and 'gluster volume heal c_glusterfs info
>> split-brain' the list of files in split-brain.
>> The commands are detailed in
>> https://github.com/gluster/glusterfs-specs/blob/master/done/Features/heal-info-and-split-brain-resolution.md
>>
>
> Yes, I have tried this as well It is also giving Number of entries : 0
> means no healing is required but the file
> /opt/lvmdir/c2/brick/logfiles/availability/CELLO_AVAILABILITY2_LOG.xml is
> not in sync both of brick showing the different version of this file.
>
> You can see it in the getfattr command outcome as well.
>
>
> # getfattr -m . -d -e hex
> /opt/lvmdir/c2/brick/logfiles/availability/CELLO_AVAILABILITY2_LOG.xml
> getfattr: Removing leading '/' from absolute path names
> # file:
> opt/lvmdir/c2/brick/logfiles/availability/CELLO_AVAILABILITY2_LOG.xml
> trusted.afr.c_glusterfs-client-0=0x
> trusted.afr.c_glusterfs-client-2=0x
> trusted.afr.c_glusterfs-client-4=0x
> trusted.afr.c_glusterfs-client-6=0x
> trusted.afr.c_glusterfs-client-8=0x0006   // because
> client8 is the latest client in our case and the starting 8 digits
>
> 0006 are saying there is something in the changelog data.
> trusted.afr.dirty=0x
> trusted.bit-rot.version=0x001356d86c0c000217fd
> trusted.gfid=0x9f5e354ecfda40149ddce7d5ffe760ae
>
> # lhsh 002500 getfattr -m . -d -e hex
> /opt/lvmdir/c2/brick/logfiles/availability/CELLO_AVAILABILITY2_LOG.xml
> getfattr: Removing leading '/' from absolute path names
> # file:
> opt/lvmdir/c2/brick/logfiles/availability/CELLO_AVAILABILITY2_LOG.xml
> trusted.afr.c_glusterfs-client-1=0x   // and
> here we can say that there is no split-brain but the file is out of sync
> trusted.afr.dirty=0x
> 

Re: [Gluster-devel] Query on healing process

2016-03-07 Thread ABHISHEK PALIWAL
On Fri, Mar 4, 2016 at 5:31 PM, Ravishankar N 
wrote:

> On 03/04/2016 12:10 PM, ABHISHEK PALIWAL wrote:
>
> Hi Ravi,
>
> 3. On the rebooted node, do you have ssl enabled by any chance? There is a
> bug for "Not able to fetch volfile' when ssl is enabled:
> 
> https://bugzilla.redhat.com/show_bug.cgi?id=1258931
>
> -> I have checked but ssl is disabled but still getting these errors
>
> # gluster volume heal c_glusterfs info
> c_glusterfs: Not able to fetch volfile from glusterd
> Volume heal failed.
>
>
> Ok, just to confirm, glusterd  and other brick processes are running after
> this node rebooted?
> When you run the above command, you need to check
> /var/log/glusterfs/glfsheal-volname.log logs errros. Setting
> client-log-level to DEBUG would give you a more verbose message
>
> Yes, glusterd and other brick processes are running fine. I have checked the
/var/log/glusterfs/glfsheal-volname.log file without log-level=DEBUG.
Here are the logs from that file:

[2016-03-02 13:51:39.059440] I [MSGID: 101190]
[event-epoll.c:632:event_dispatch_epoll_worker] 0-epoll: Started thread
with index 1
[2016-03-02 13:51:39.072172] W [MSGID: 101012]
[common-utils.c:2776:gf_get_reserved_ports] 0-glusterfs: could not open the
file /proc/sys/net/ipv4/ip_local_reserved_ports for getting reserved ports
info [No such file or directory]
[2016-03-02 13:51:39.072228] W [MSGID: 101081]
[common-utils.c:2810:gf_process_reserved_ports] 0-glusterfs: Not able to
get reserved ports, hence there is a possibility that glusterfs may consume
reserved port
[2016-03-02 13:51:39.072583] E [socket.c:2278:socket_connect_finish]
0-gfapi: connection to 127.0.0.1:24007 failed (Connection refused)
[2016-03-02 13:51:39.072663] E [MSGID: 104024]
[glfs-mgmt.c:738:mgmt_rpc_notify] 0-glfs-mgmt: failed to connect with
remote-host: localhost (Transport endpoint is not connected) [Transport
endpoint is not connected]
[2016-03-02 13:51:39.072700] I [MSGID: 104025]
[glfs-mgmt.c:744:mgmt_rpc_notify] 0-glfs-mgmt: Exhausted all volfile
servers [Transport endpoint is not connected]

> # gluster volume heal c_glusterfs info split-brain
> c_glusterfs: Not able to fetch volfile from glusterd
> Volume heal failed.
>
>
>
>
> And based on your observation I understood that this is not a split-brain
> problem, but is there any way to find out the files which are not in
> split-brain and yet not in sync?
>
>
> `gluster volume heal c_glusterfs info split-brain`  should give you files
> that need heal.
>

I have run "gluster volume heal c_glusterfs info split-brain" command but
it is not showing that file which is out of sync that is the issue file is
not in sync on both of the brick and split-brain is not showing that
command in output for heal required.

Thats is why I am asking that is there any command other than this split
brain command so that I can find out the files those are required the heal
operation but not displayed in the output of "gluster volume heal
c_glusterfs info split-brain" command.

>
>
> # getfattr -m . -d -e hex
> /opt/lvmdir/c2/brick/logfiles/availability/CELLO_AVAILABILITY2_LOG.xml
> getfattr: Removing leading '/' from absolute path names
> # file:
> opt/lvmdir/c2/brick/logfiles/availability/CELLO_AVAILABILITY2_LOG.xml
> trusted.afr.c_glusterfs-client-0=0x
> trusted.afr.c_glusterfs-client-2=0x
> trusted.afr.c_glusterfs-client-4=0x
> trusted.afr.c_glusterfs-client-6=0x
> trusted.afr.c_glusterfs-client-8=0x0006   // because
> client8 is the latest client in our case and the starting 8 digits
>
> 0006 are saying there is something in the changelog data.
> trusted.afr.dirty=0x
> trusted.bit-rot.version=0x001356d86c0c000217fd
> trusted.gfid=0x9f5e354ecfda40149ddce7d5ffe760ae
>
> # lhsh 002500 getfattr -m . -d -e hex
> /opt/lvmdir/c2/brick/logfiles/availability/CELLO_AVAILABILITY2_LOG.xml
> getfattr: Removing leading '/' from absolute path names
> # file:
> opt/lvmdir/c2/brick/logfiles/availability/CELLO_AVAILABILITY2_LOG.xml
> trusted.afr.c_glusterfs-client-1=0x   // and
> here we can say that there is no split-brain but the file is out of sync
> trusted.afr.dirty=0x
> trusted.bit-rot.version=0x001156d86c290005735c
> trusted.gfid=0x9f5e354ecfda40149ddce7d5ffe760ae
>
> # gluster volume info
>
> Volume Name: c_glusterfs
> Type: Replicate
> Volume ID: c6a61455-d378-48bf-ad40-7a3ce897fc9c
> Status: Started
> Number of Bricks: 1 x 2 = 2
> Transport-type: tcp
> Bricks:
> Brick1: 10.32.0.48:/opt/lvmdir/c2/brick
> Brick2: 10.32.1.144:/opt/lvmdir/c2/brick
> Options Reconfigured:
> performance.readdir-ahead: on
> network.ping-timeout: 4
> nfs.disable: on
>
>
> # gluster volume info
>
> Volume Name: c_glusterfs
> Type: Replicate
> Volume 

Re: [Gluster-devel] Query on healing process

2016-03-04 Thread Ravishankar N

On 03/04/2016 06:23 PM, ABHISHEK PALIWAL wrote:



Ok, just to confirm, glusterd  and other brick processes are
running after this node rebooted?
When you run the above command, you need to check
/var/log/glusterfs/glfsheal-volname.log logs errros. Setting
client-log-level to DEBUG would give you a more verbose message

Yes, glusterd and other brick processes are running fine. I have checked the 
/var/log/glusterfs/glfsheal-volname.log file without log-level=DEBUG. 
Here are the logs from that file:


[2016-03-02 13:51:39.059440] I [MSGID: 101190] 
[event-epoll.c:632:event_dispatch_epoll_worker] 0-epoll: Started 
thread with index 1
[2016-03-02 13:51:39.072172] W [MSGID: 101012] 
[common-utils.c:2776:gf_get_reserved_ports] 0-glusterfs: could not 
open the file /proc/sys/net/ipv4/ip_local_reserved_ports for getting 
reserved ports info [No such file or directory]
[2016-03-02 13:51:39.072228] W [MSGID: 101081] 
[common-utils.c:2810:gf_process_reserved_ports] 0-glusterfs: Not able 
to get reserved ports, hence there is a possibility that glusterfs may 
consume reserved port
[2016-03-02 13:51:39.072583] E [socket.c:2278:socket_connect_finish] 
0-gfapi: connection to 127.0.0.1:24007  failed 
(Connection refused)


Not sure why ^^ occurs. You could try flushing iptables (iptables -F), 
restart glusterd and run the heal info command again .


[2016-03-02 13:51:39.072663] E [MSGID: 104024] 
[glfs-mgmt.c:738:mgmt_rpc_notify] 0-glfs-mgmt: failed to connect with 
remote-host: localhost (Transport endpoint is not connected) 
[Transport endpoint is not connected]
[2016-03-02 13:51:39.072700] I [MSGID: 104025] 
[glfs-mgmt.c:744:mgmt_rpc_notify] 0-glfs-mgmt: Exhausted all volfile 
servers [Transport endpoint is not connected]



# gluster volume heal c_glusterfs info split-brain
c_glusterfs: Not able to fetch volfile from glusterd
Volume heal failed.





And based on the your observation I understood that this is not
the problem of split-brain but *is there any way through which
can find out the file which is not in split-brain as well as not
in sync?*


`gluster volume heal c_glusterfs info split-brain`  should give
you files that need heal.



Sorry  I meant 'gluster volume heal c_glusterfs info' should give you 
the files that need heal and 'gluster volume heal c_glusterfs info 
split-brain' the list of files in split-brain.
The commands are detailed in 
https://github.com/gluster/glusterfs-specs/blob/master/done/Features/heal-info-and-split-brain-resolution.md




I have run "gluster volume heal c_glusterfs info split-brain" command 
but it is not showing that file which is out of sync that is the issue 
file is not in sync on both of the brick and split-brain is not 
showing that command in output for heal required.


Thats is why I am asking that is there any command other than this 
split brain command so that I can find out the files those are 
required the heal operation but not displayed in the output of 
"gluster volume heal c_glusterfs info split-brain" command.








___
Gluster-devel mailing list
Gluster-devel@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-devel

Re: [Gluster-devel] Query on healing process

2016-03-04 Thread Ravishankar N

On 03/04/2016 12:10 PM, ABHISHEK PALIWAL wrote:

Hi Ravi,

3. On the rebooted node, do you have ssl enabled by any chance? There 
is a bug for "Not able to fetch volfile' when ssl is enabled: 
https://bugzilla.redhat.com/show_bug.cgi?id=1258931


-> I have checked but ssl is disabled but still getting these errors

# gluster volume heal c_glusterfs info
c_glusterfs: Not able to fetch volfile from glusterd
Volume heal failed.



Ok, just to confirm: glusterd and the other brick processes are running 
after this node rebooted?
When you run the above command, you need to check 
/var/log/glusterfs/glfsheal-volname.log for errors. Setting 
client-log-level to DEBUG would give you a more verbose message.



# gluster volume heal c_glusterfs info split-brain
c_glusterfs: Not able to fetch volfile from glusterd
Volume heal failed.





And based on your observation I understood that this is not a split-brain 
problem, but is there any way to find out the files which are not in 
split-brain and yet not in sync?


`gluster volume heal c_glusterfs info split-brain` should give you files 
that need heal.




# getfattr -m . -d -e hex 
/opt/lvmdir/c2/brick/logfiles/availability/CELLO_AVAILABILITY2_LOG.xml

getfattr: Removing leading '/' from absolute path names
# file: 
opt/lvmdir/c2/brick/logfiles/availability/CELLO_AVAILABILITY2_LOG.xml

trusted.afr.c_glusterfs-client-0=0x
trusted.afr.c_glusterfs-client-2=0x
trusted.afr.c_glusterfs-client-4=0x
trusted.afr.c_glusterfs-client-6=0x
trusted.afr.c_glusterfs-client-8=0x0006   // because client8 
is the latest client in our case and the starting 8 digits

0006 are saying there is something in the changelog data.
trusted.afr.dirty=0x
trusted.bit-rot.version=0x001356d86c0c000217fd
trusted.gfid=0x9f5e354ecfda40149ddce7d5ffe760ae

# lhsh 002500 getfattr -m . -d -e hex 
/opt/lvmdir/c2/brick/logfiles/availability/CELLO_AVAILABILITY2_LOG.xml

getfattr: Removing leading '/' from absolute path names
# file: 
opt/lvmdir/c2/brick/logfiles/availability/CELLO_AVAILABILITY2_LOG.xml
trusted.afr.c_glusterfs-client-1=0x   // and 
here we can say that there is no split-brain but the file is out of sync

trusted.afr.dirty=0x
trusted.bit-rot.version=0x001156d86c290005735c
trusted.gfid=0x9f5e354ecfda40149ddce7d5ffe760ae

# gluster volume info

Volume Name: c_glusterfs
Type: Replicate
Volume ID: c6a61455-d378-48bf-ad40-7a3ce897fc9c
Status: Started
Number of Bricks: 1 x 2 = 2
Transport-type: tcp
Bricks:
Brick1: 10.32.0.48:/opt/lvmdir/c2/brick
Brick2: 10.32.1.144:/opt/lvmdir/c2/brick
Options Reconfigured:
performance.readdir-ahead: on
network.ping-timeout: 4
nfs.disable: on


# gluster volume info

Volume Name: c_glusterfs
Type: Replicate
Volume ID: c6a61455-d378-48bf-ad40-7a3ce897fc9c
Status: Started
Number of Bricks: 1 x 2 = 2
Transport-type: tcp
Bricks:
Brick1: 10.32.0.48:/opt/lvmdir/c2/brick
Brick2: 10.32.1.144:/opt/lvmdir/c2/brick
Options Reconfigured:
performance.readdir-ahead: on
network.ping-timeout: 4
nfs.disable: on

# gluster --version
glusterfs 3.7.8 built on Feb 17 2016 07:49:49
Repository revision: git://git.gluster.com/glusterfs.git 

Copyright (c) 2006-2011 Gluster Inc. > 


GlusterFS comes with ABSOLUTELY NO WARRANTY.
You may redistribute copies of GlusterFS under the terms of the GNU 
General Public License.

# gluster volume heal info heal-failed
Usage: volume heal  [enable | disable | full |statistics 
[heal-count [replica ]] |info [healed | 
heal-failed | split-brain] |split-brain {bigger-file  
|source-brick  []}]

# gluster volume heal c_glusterfs info heal-failed
Command not supported. Please use "gluster volume heal c_glusterfs 
info" and logs to find the heal information.

# lhsh 002500
 ___  _ _  _ __   _ _ _ _ _
 |   |_] |_]  ||   | \  | | |  \___/
 |_  |   |  |_ __|__ |  \_| |_| _/   \_

002500> gluster --version
glusterfs 3.7.8 built on Feb 17 2016 07:49:49
Repository revision: git://git.gluster.com/glusterfs.git 

Copyright (c) 2006-2011 Gluster Inc. > 


GlusterFS comes with ABSOLUTELY NO WARRANTY.
You may redistribute copies of GlusterFS under the terms of the GNU 
General Public License.

002500>


Re: [Gluster-devel] Query on healing process

2016-03-03 Thread Ravishankar N

Hi,

On 03/03/2016 11:14 AM, ABHISHEK PALIWAL wrote:

Hi Ravi,

As discussed earlier, I investigated this issue and found 
that healing is not triggered because the "gluster volume heal 
c_glusterfs info split-brain" command does not show any entries in its 
output, even though the file is in a split-brain state.


Couple of observations from the 'commands_output' file.

getfattr -d -m . -e hex 
opt/lvmdir/c2/brick/logfiles/availability/CELLO_AVAILABILITY2_LOG.xml

The afr xattrs do not indicate that the file is in split brain:
# file: 
opt/lvmdir/c2/brick/logfiles/availability/CELLO_AVAILABILITY2_LOG.xml

trusted.afr.c_glusterfs-client-1=0x
trusted.afr.dirty=0x
trusted.bit-rot.version=0x000b56d6dd1d000ec7a9
trusted.gfid=0x9f5e354ecfda40149ddce7d5ffe760ae



getfattr -d -m . -e hex 
opt/lvmdir/c2/brick/logfiles/availability/CELLO_AVAILABILITY2_LOG.xml

trusted.afr.c_glusterfs-client-0=0x0008
trusted.afr.c_glusterfs-client-2=0x0002
trusted.afr.c_glusterfs-client-4=0x0002
trusted.afr.c_glusterfs-client-6=0x0002
trusted.afr.dirty=0x
trusted.bit-rot.version=0x000b56d6dcb7000c87e7
trusted.gfid=0x9f5e354ecfda40149ddce7d5ffe760ae

1. There doesn't seem to be a split-brain going by the trusted.afr* xattrs.
2. You seem to have re-used the bricks from another volume/setup. For 
replica 2, only trusted.afr.c_glusterfs-client-0 and 
trusted.afr.c_glusterfs-client-1 must be present but I see 4 xattrs - 
client-0,2,4 and 6
3. On the rebooted node, do you have ssl enabled by any chance? There is 
a bug for "Not able to fetch volfile' when ssl is enabled: 
https://bugzilla.redhat.com/show_bug.cgi?id=1258931


Btw, for data and metadata split-brains you can use the gluster CLI 
https://github.com/gluster/glusterfs-specs/blob/master/done/Features/heal-info-and-split-brain-resolution.md 
instead of modifying the file from the back end.
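
To make getfattr output like the above a little easier to read, here is a
small decoder sketch. It assumes the commonly documented layout of
trusted.afr.* changelog values, namely three big-endian 32-bit counters for
pending data, metadata and entry operations; treat that layout as an
assumption and check it against your GlusterFS version.

/* afr_decode.c -- illustrative decoder for a trusted.afr.* changelog value
 * such as 0x000000060000000000000000.  Assumed layout: three big-endian
 * 32-bit counters for pending data, metadata and entry operations. */
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

int main(int argc, char **argv)
{
    unsigned long counters[3];   /* data, metadata, entry */
    const char *hex;

    if (argc != 2 || strncmp(argv[1], "0x", 2) != 0 || strlen(argv[1]) != 26) {
        fprintf(stderr, "usage: %s 0x<24 hex digits>\n", argv[0]);
        return 1;
    }
    hex = argv[1] + 2;

    for (int i = 0; i < 3; i++) {
        char part[9] = { 0 };
        memcpy(part, hex + i * 8, 8);        /* 8 hex digits = 32 bits */
        counters[i] = strtoul(part, NULL, 16);
    }

    printf("pending data ops:     %lu\n", counters[0]);
    printf("pending metadata ops: %lu\n", counters[1]);
    printf("pending entry ops:    %lu\n", counters[2]);
    return 0;
}

All-zero counters on both bricks mean nothing is pending, which is consistent
with heal info reporting zero entries.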


-Ravi


So, what I did was manually delete the gfid entry of that file 
from the .glusterfs directory and follow the instructions in the 
following link to do the heal:


https://github.com/gluster/glusterfs/blob/master/doc/debugging/split-brain.md

and this works fine for me.

But my question is why the split-brain command does not show any file in its 
output.


Here I am attaching all the logs I got from the node, along with the 
output of the commands from both boards.


In this tar file two directories are present:

000300 - logs for the board which is running continuously
002500 - logs for the board which was rebooted

I am waiting for your reply; please help me out with this issue.

Thanks in advanced.

Regards,
Abhishek

On Fri, Feb 26, 2016 at 1:21 PM, ABHISHEK PALIWAL 
> wrote:


On Fri, Feb 26, 2016 at 10:28 AM, Ravishankar N
> wrote:

On 02/26/2016 10:10 AM, ABHISHEK PALIWAL wrote:


Yes correct



Okay, so when you say the files are not in sync until some
time, are you getting stale data when accessing from the mount?
I'm not able to figure out why heal info shows zero when the
files are not in sync, despite all IO happening from the
mounts. Could you provide the output of getfattr -d -m . -e
hex /brick/file-name from both bricks when you hit this issue?

I'll provide the logs once I get. here delay means we are
powering on the second board after the 10 minutes.



On Feb 26, 2016 9:57 AM, "Ravishankar N"
> wrote:

Hello,

On 02/26/2016 08:29 AM, ABHISHEK PALIWAL wrote:

Hi Ravi,

Thanks for the response.

We are using Glugsterfs-3.7.8

Here is the use case:

We have a logging file which saves logs of the events
for every board of a node and these files are in sync
using glusterfs. System in replica 2 mode it means When
one brick in a replicated volume goes offline, the
glusterd daemons on the other nodes keep track of all
the files that are not replicated to the offline brick.
When the offline brick becomes available again, the
cluster initiates a healing process, replicating the
updated files to that brick. But in our casse, we see
that log file of one board is not in the sync and its
format is corrupted means files are not in sync.


Just to understand you correctly, you have mounted the 2
node replica-2 volume on both these nodes and writing to
a logging file from the mounts right?



Even the outcome of #gluster volume heal c_glusterfs
info shows 

Re: [Gluster-devel] Query on healing process

2016-02-25 Thread Ravishankar N

Hello,

On 02/26/2016 08:29 AM, ABHISHEK PALIWAL wrote:

Hi Ravi,

Thanks for the response.

We are using Glusterfs-3.7.8

Here is the use case:

We have a logging file which saves logs of the events for every board 
of a node, and these files are kept in sync using glusterfs. The system 
is in replica-2 mode, which means that when one brick in a replicated 
volume goes offline, the glusterd daemons on the other nodes keep track 
of all the files that are not replicated to the offline brick. When the 
offline brick becomes available again, the cluster initiates a healing 
process, replicating the updated files to that brick. But in our 
case, we see that the log file of one board is not in sync and its 
format is corrupted, i.e. the files are not in sync.


Just to understand you correctly, you have mounted the 2 node replica-2 
volume on both these nodes and writing to a logging file from the mounts 
right?




Even the outcome of #gluster volume heal c_glusterfs info shows that 
there is no pending heals.


Also, the logging file which is updated is of fixed size and the new 
entries wrap around, overwriting the old entries.


This way we have seen that after a few restarts the contents of the 
same file on the two bricks are different, but the volume heal info shows 
zero entries.


Solution:

When we put a delay of more than 5 minutes before the healing, everything 
works fine.


Regards,
Abhishek

On Fri, Feb 26, 2016 at 6:35 AM, Ravishankar N > wrote:


On 02/25/2016 06:01 PM, ABHISHEK PALIWAL wrote:

Hi,

Here, I have one query regarding the time taken by the healing
process.
In current two node setup when we rebooted one node then the
self-healing process starts less than 5min interval on the board
which resulting the corruption of the some files data.


Heal should start immediately after the brick process comes up.
What version of gluster are you using? What do you mean by
corruption of data? Also, how did you observe that the heal
started after 5 minutes?
-Ravi


And to resolve it I have search on google and found the following
link:
https://support.rackspace.com/how-to/glusterfs-troubleshooting/

Mentioning that the healing process can takes upto 10min of time
to start this process.

Here is the statement from the link:

"Healing replicated volumes

When any brick in a replicated volume goes offline, the glusterd
daemons on the remaining nodes keep track of all the files that
are not replicated to the offline brick. When the offline brick
becomes available again, the cluster initiates a healing process,
replicating the updated files to that brick. *The start of this
process can take up to 10 minutes, based on observation.*"

After giving the time of more than 5 min file corruption problem
has been resolved.

So, Here my question is there any way through which we can reduce
the time taken by the healing process to start?


Regards,
Abhishek Paliwal




___
Gluster-devel mailing list
Gluster-devel@gluster.org 
http://www.gluster.org/mailman/listinfo/gluster-devel






--




Regards
Abhishek Paliwal



___
Gluster-devel mailing list
Gluster-devel@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-devel

Re: [Gluster-devel] Query on healing process

2016-02-25 Thread Ravishankar N

On 02/25/2016 06:01 PM, ABHISHEK PALIWAL wrote:

Hi,

Here, I have one query regarding the time taken by the healing process.
In current two node setup when we rebooted one node then the 
self-healing process starts less than 5min interval on the board which 
resulting the corruption of the some files data.


Heal should start immediately after the brick process comes up. What 
version of gluster are you using? What do you mean by corruption of 
data? Also, how did you observe that the heal started after 5 minutes?

-Ravi


And to resolve it I have search on google and found the following link:
https://support.rackspace.com/how-to/glusterfs-troubleshooting/

Mentioning that the healing process can takes upto 10min of time to 
start this process.


Here is the statement from the link:

"Healing replicated volumes

When any brick in a replicated volume goes offline, the glusterd 
daemons on the remaining nodes keep track of all the files that are 
not replicated to the offline brick. When the offline brick becomes 
available again, the cluster initiates a healing process, replicating 
the updated files to that brick. *The start of this process can take 
up to 10 minutes, based on observation.*"


After giving the time of more than 5 min file corruption problem has 
been resolved.


So, Here my question is there any way through which we can reduce the 
time taken by the healing process to start?



Regards,
Abhishek Paliwal




___
Gluster-devel mailing list
Gluster-devel@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-devel



___
Gluster-devel mailing list
Gluster-devel@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-devel

[Gluster-devel] Query on healing process

2016-02-25 Thread ABHISHEK PALIWAL
Hi,

Here, I have one query regarding the time taken by the healing process.
In our current two-node setup, when we reboot one node, the self-healing
process starts in less than a 5-minute interval on the board, which results
in the corruption of some files' data.

And to resolve it I searched on Google and found the following link:
https://support.rackspace.com/how-to/glusterfs-troubleshooting/

It mentions that the healing process can take up to 10 minutes to start.

Here is the statement from the link:

"Healing replicated volumes

When any brick in a replicated volume goes offline, the glusterd daemons on
the remaining nodes keep track of all the files that are not replicated to
the offline brick. When the offline brick becomes available again, the
cluster initiates a healing process, replicating the updated files to that
brick. *The start of this process can take up to 10 minutes, based on
observation.*"

After allowing a delay of more than 5 minutes, the file corruption problem is
resolved.

So, here my question is: is there any way we can reduce the time
taken by the healing process to start?


Regards,
Abhishek Paliwal
___
Gluster-devel mailing list
Gluster-devel@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-devel


Re: [Gluster-devel] Query: Trash feature.

2015-03-18 Thread Anoop C S

On 03/19/2015 10:58 AM, Kotresh Hiremath Ravishankar wrote:
 Hi All,
 
 Why is the .trashcan directory created at the same level as .glusterfs and
 not inside .glusterfs? In glusterfs, crawlers usually crawl the back-end
 filesystem and ignore .glusterfs. Now, in addition, they have to ignore
 .trashcan.

As per the design of the trash translator, it should be visible from the
mount point. That's why the trash directory is not placed inside .glusterfs.
Moreover, the trash directory can be identified by a fixed gfid
{00000000-0000-0000-0000-000000000005}.
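
So a crawler that walks the brick back end just needs one more name in its
skip list. A trivial, illustrative sketch (plain readdir on a brick root, not
the actual crawler code):

/* crawl_skip.c -- illustrative brick crawl that ignores the internal
 * .glusterfs and .trashcan directories at the brick root. */
#include <dirent.h>
#include <stdio.h>
#include <string.h>

int main(int argc, char **argv)
{
    const char *root = (argc > 1) ? argv[1] : ".";
    DIR *dir = opendir(root);
    struct dirent *ent;

    if (!dir) {
        perror("opendir");
        return 1;
    }

    while ((ent = readdir(dir)) != NULL) {
        if (strcmp(ent->d_name, ".") == 0 || strcmp(ent->d_name, "..") == 0)
            continue;
        if (strcmp(ent->d_name, ".glusterfs") == 0 ||
            strcmp(ent->d_name, ".trashcan") == 0)
            continue;                 /* internal directories: do not crawl */
        printf("crawl: %s/%s\n", root, ent->d_name);
    }
    closedir(dir);
    return 0;
}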

--Anoop C S.

 
 Thanks and Regards,
 Kotresh H R
 
 ___
 Gluster-devel mailing list
 Gluster-devel@gluster.org
 http://www.gluster.org/mailman/listinfo/gluster-devel
 
___
Gluster-devel mailing list
Gluster-devel@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-devel