Re: [Gluster-devel] Question about EC locking

2017-01-13 Thread Xavier Hernandez

Hi,

On 13/01/17 10:58, jayakrishnan mm wrote:

Hi Xavier,
I went through the source code. Some questions remain.

1. If two clients try to write to the same file, both writes should succeed,
even if they overlap. (Locks should ensure they are applied in sequence on
the bricks.) From the source code:

 lock->flock.l_type = F_WRLCK;
 lock->flock.l_whence = SEEK_SET;

 fop->flock.l_len += ec_adjust_offset(fop->xl->private,
                                      &fop->flock.l_start, 1);
 fop->flock.l_len = ec_adjust_size(fop->xl->private,
                                   fop->flock.l_len, 1);

If flock.l_len is 0, the entire file is locked for writing.

In my test case with 2 clients, I always get flock.l_len as 0, but I am
still able to write to the same file from both clients at the same time.


How are you sure you are really writing at the same time? Do you get
partial writes from some of the clients?




If it is acquiring locks chunk by chunk, why am I always getting
l_len = 0?


EC doesn't acquire partial locks. The entire file is locked when a
modification is needed. This makes it possible to reuse the lock for
future operations (eager locking).
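
A minimal sketch of the eager-locking idea described above, not the actual EC implementation: the client keeps the whole-file lock after a write so that later writes do not need another lock request. The names eager_lock_t, eager_write() and eager_release() are hypothetical, and the policy that decides when the lock is finally released is an assumption left outside the sketch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical per-file lock state kept by one client. */
typedef struct {
    bool held;          /* whole-file lock currently held by this client */
    int lock_requests;  /* lock requests actually sent to the bricks */
    int writes;         /* writes executed under the lock */
} eager_lock_t;

/* Take the whole-file lock only if this client does not already hold it. */
static void eager_acquire(eager_lock_t *l)
{
    assert(l != NULL);
    if (!l->held) {
        l->lock_requests++;   /* stands in for a real lock request */
        l->held = true;
    }
}

/* Run one write under the lock and keep the lock for the next operation. */
static void eager_write(eager_lock_t *l)
{
    eager_acquire(l);
    l->writes++;
}

/* Give the lock back, e.g. because another client is waiting (assumed policy). */
static void eager_release(eager_lock_t *l)
{
    assert(l != NULL);
    l->held = false;
}
```

In this sketch, three back-to-back calls to eager_write() cost a single lock request; only after eager_release() does the next write pay for a new one.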



Why am I not getting the actual write size and offset (for
flock.l_len and flock.l_start respectively) for each write FOP?
(In AFR, they are set to transaction.len and transaction.start respectively,
which in turn are the write length and offset for the normal write case.)


Because an erasure code splits the data into smaller fragments for each
brick, so offsets and lengths need to be adjusted.
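
The adjustment can be illustrated with a rough sketch. This is not the code of ec_adjust_offset() or ec_adjust_size(); it only assumes that a client range is aligned to whole stripes and then divided by the number of data fragments to get the range each brick sees. The type layout_t, the function names and the numbers used are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical layout parameters, for illustration only. */
typedef struct {
    uint64_t fragments;    /* number of data fragments per stripe */
    uint64_t stripe_size;  /* bytes of user data per stripe */
} layout_t;

/* Round *offset down to a stripe boundary and return the bytes skipped.
 * When scale is set, convert the aligned offset to a per-brick offset. */
static uint64_t adjust_offset(const layout_t *l, uint64_t *offset, int scale)
{
    assert(l->fragments > 0 && l->stripe_size > 0);
    uint64_t head = *offset % l->stripe_size;
    *offset -= head;
    if (scale)
        *offset /= l->fragments;
    return head;
}

/* Round size up to a whole number of stripes; optionally scale per brick. */
static uint64_t adjust_size(const layout_t *l, uint64_t size, int scale)
{
    uint64_t rem = size % l->stripe_size;
    if (rem != 0)
        size += l->stripe_size - rem;
    if (scale)
        size /= l->fragments;
    return size;
}

/* Convert a client byte range into the range seen by each brick. */
static void adjust_range(const layout_t *l, uint64_t *start, uint64_t *len)
{
    *len += adjust_offset(l, start, 1);
    *len = adjust_size(l, *len, 1);
}
```

With 4 fragments and a 2048-byte stripe, a client range starting at 3000 with length 100 becomes a per-brick range starting at 512 with length 512. A range of (0, 0) stays (0, 0) under this arithmetic, so a full-file lock remains a full-file lock after the adjustment.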




2. As per the source code, a full file lock is also taken by the shd.

 ec_heal_inodelk(heal, F_WRLCK, 1, 0, 0);

which means offset=0 and size=0 in the ec_heal_lock() function in ec-heal.c:

 flock.l_start = offset;
 flock.l_len = size;

Does it mean that, for a single file, a write cannot happen simultaneously
with healing?


Correct. The heal procedure is like an additional client. If a client and
the heal process try to write at the same time, they must be serialized,
like any other regular write. However, heal only takes the full lock for
some critical operations. Regular self-heal of file contents is done by
locking chunk by chunk.
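
As a rough illustration of chunk-by-chunk processing, the helper below computes the byte range of each chunk of a file. In a heal loop each range would be locked, repaired and unlocked in turn, so client writes can be served between chunks. The function name and the chunk size are hypothetical, and the lock, heal and unlock calls are deliberately left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Compute the byte range of chunk number `index` of a file.
 * Returns false when the index lies past the end of the file. */
static bool chunk_range(uint64_t file_size, uint64_t chunk_size, uint64_t index,
                        uint64_t *offset, uint64_t *size)
{
    assert(chunk_size > 0);
    if (index > file_size / chunk_size)
        return false;

    uint64_t start = index * chunk_size;   /* cannot overflow after the check */
    if (start >= file_size)
        return false;

    uint64_t left = file_size - start;
    *offset = start;
    *size = left < chunk_size ? left : chunk_size;  /* last chunk may be short */
    return true;
}
```

For a 10-byte file and 4-byte chunks this yields the ranges (0, 4), (4, 4) and (8, 2), and reports that there is no fourth chunk.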


Xavi



Correct me if I am wrong.

Best Regards
JK







Re: [Gluster-devel] Question about EC locking

2017-01-13 Thread jayakrishnan mm
Hi Xavier,
I went through the source code. Some questions remain.

1. If two clients try to write to the same file, both writes should succeed,
even if they overlap. (Locks should ensure they are applied in sequence on
the bricks.) From the source code:

 lock->flock.l_type = F_WRLCK;
 lock->flock.l_whence = SEEK_SET;

 fop->flock.l_len += ec_adjust_offset(fop->xl->private,
                                      &fop->flock.l_start, 1);
 fop->flock.l_len = ec_adjust_size(fop->xl->private,
                                   fop->flock.l_len, 1);

If flock.l_len is 0, the entire file is locked for writing.

In my test case with 2 clients, I always get flock.l_len as 0, but I am
still able to write to the same file from both clients at the same time.

If it is acquiring locks chunk by chunk, why am I always getting l_len = 0?
Why am I not getting the actual write size and offset (for flock.l_len and
flock.l_start respectively) for each write FOP?
(In AFR, they are set to transaction.len and transaction.start respectively,
which in turn are the write length and offset for the normal write case.)

2. As per the source code, a full file lock is also taken by the shd.

 ec_heal_inodelk(heal, F_WRLCK, 1, 0, 0);

which means offset=0 and size=0 in the ec_heal_lock() function in ec-heal.c:

 flock.l_start = offset;
 flock.l_len = size;

Does it mean that, for a single file, a write cannot happen simultaneously
with healing?

Correct me if I am wrong.

Best Regards
JK






On Wed, Dec 14, 2016 at 12:07 PM, jayakrishnan mm  wrote:

> Thanks Xavier, for making it clear.
> Regards
> JK
Re: [Gluster-devel] Question about EC locking

2016-12-12 Thread Xavier Hernandez

Hi JK,

On 12/13/2016 08:34 AM, jayakrishnan mm wrote:

Dear Xavi,

How do I test the locks, for example the locks for the write fop? I have two
independent clients, both trying to write to the same file.


1. According to my understanding, both can write successfully if the
offsets don't overlap. I mean, the WRITE FOP takes a chunk lock on the
file. As long as the clients don't try to write to the same chunk, it
should be OK. If no locks are present, it can lead to inconsistency.


With locks, all writes will be fine as defined by POSIX (i.e. the final
result will be equivalent to the sequential execution of both
operations, though in an undefined order), even if they overlap. Without
locks, there is a chance that some bricks execute the operations in one
order and the remaining bricks execute the same operations in the
reverse order, causing data corruption.
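
The ordering problem can be reproduced with a small simulation. It is a simplification under stated assumptions: each "brick" is modelled as a plain in-memory copy of the same 8 bytes, whereas real EC bricks hold different fragments. The names write_op_t, apply_write() and run_in_order() are hypothetical.

```c
#include <assert.h>
#include <string.h>

#define BRICK_SIZE 8

/* A write request: copy len bytes of data to the given offset. */
typedef struct {
    size_t offset;
    size_t len;
    const char *data;
} write_op_t;

/* Apply one write to a brick's in-memory copy of the file. */
static void apply_write(char *brick, const write_op_t *op)
{
    assert(op->offset + op->len <= BRICK_SIZE);
    memcpy(brick + op->offset, op->data, op->len);
}

/* Start from an empty brick and apply two writes in the given order. */
static void run_in_order(char *brick, const write_op_t *first,
                         const write_op_t *second)
{
    memset(brick, '.', BRICK_SIZE);
    apply_write(brick, first);
    apply_write(brick, second);
}
```

With a write of "AAAA" at offset 0 and a write of "BBBB" at offset 2, a brick that applies them in that order ends with "AABBBB..", while a brick that applies them in the reverse order ends with "AAAABB..". Either result alone is acceptable; the corruption comes from different bricks disagreeing, which is what the lock prevents.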





2. Different FOPs can always run simultaneously (for example, WRITE and
READ FOPs, or two READ FOPs).


All fops can be executed concurrently. If there's any chance that two 
operations could interfere, locks are taken in the appropriate places. 
For example, reads cannot be merged with overlapping writes. Otherwise 
they could return inconsistent data.
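
Whether two operations can interfere comes down to a range-overlap test. The helper below is a hypothetical sketch, not a function from the EC sources; it follows the flock convention used elsewhere in this thread, where a length of 0 means "up to the end of the file". Overflow of start + len is ignored for brevity.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Exclusive end of a range; a length of 0 means "to the end of the file". */
static uint64_t range_end(uint64_t start, uint64_t len)
{
    return len == 0 ? UINT64_MAX : start + len;
}

/* Return true when the two byte ranges share at least one byte. */
static bool ranges_overlap(uint64_t start1, uint64_t len1,
                           uint64_t start2, uint64_t len2)
{
    return start1 < range_end(start2, len2) &&
           start2 < range_end(start1, len1);
}
```

A full-file range (0, 0) overlaps every other range, so an operation holding it serializes against any read or write on the same file, while (0, 100) and (100, 50) are adjacent but do not overlap.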




3. WRITE and some metadata FOP (like setattr) together: with locks, they
cannot happen at the same time, even though the chances of that are very low.


As in 2, if there's any possible interference, the appropriate locks 
will be taken.


You can look at the code to see which locks are taken for each fop. See 
the corresponding ec_manager_() function, in the EC_STATE_LOCK 
switch case. There you will see calls to ec_lock_prepare_xxx() for each 
taken lock.


Xavi



Pls. clarify.

Best regards
JK



___
Gluster-devel mailing list
Gluster-devel@gluster.org 
http://www.gluster.org/mailman/listinfo/gluster-devel









Re: [Gluster-devel] Question about EC locking

2016-12-12 Thread jayakrishnan mm
Dear Xavi,

How do I test the locks, for example the locks for the write fop? I have two
independent clients, both trying to write to the same file.


1. According to my understanding, both can write successfully if the
offsets don't overlap. I mean, the WRITE FOP takes a chunk lock on the
file. As long as the clients don't try to write to the same chunk, it
should be OK. If no locks are present, it can lead to inconsistency.


2. Different FOPs can always run simultaneously (for example, WRITE and READ
FOPs, or two READ FOPs).

3. WRITE and some metadata FOP (like setattr) together: with locks, they
cannot happen at the same time, even though the chances of that are very low.

Pls. clarify.

Best regards
JK




Re: [Gluster-devel] Question about EC locking

2016-11-30 Thread jayakrishnan mm
Hi Xavier,

Thank you very much for your explanation. This helped me to understand
more about locking in EC.

Best Regards
JK



Re: [Gluster-devel] Question about EC locking

2016-11-28 Thread Xavier Hernandez

Hi,

On 11/28/2016 02:59 AM, jayakrishnan mm wrote:

Hi Xavier,

I noticed that the EC xlator uses blocking locks. Is there any specific
reason for this?


In a distributed filesystem like gluster a synchronization mechanism is 
a must to avoid data corruption.




Do you think this will affect performance?


Of course the need for locks has a performance impact, and we cannot
avoid them if we want to guarantee data integrity. However, some
optimizations have been applied, especially eager locking, which allows a
lock to be reused without unlocking and locking again.




(In comparison, AFR first tries non-blocking locks and, if not
successful, then tries blocking locks.)


EC also tries a non-blocking lock first.
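
The "non-blocking first, then blocking" pattern can be sketched with ordinary POSIX record locks on a local file. This is only an analogy: Gluster translators use their own inodelk operations rather than fcntl(), and the function name lock_whole_file() is hypothetical. The sketch also shows the l_len = 0 convention for locking the entire file.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/* Take a write lock on the whole file behind fd.
 * Returns 0 if the non-blocking attempt succeeded, 1 if the call had to
 * fall back to a blocking wait, and -1 on any other error. */
static int lock_whole_file(int fd)
{
    struct flock fl = {0};

    assert(fd >= 0);
    fl.l_type = F_WRLCK;
    fl.l_whence = SEEK_SET;
    fl.l_start = 0;
    fl.l_len = 0;               /* 0 means "up to the end of the file" */

    /* First attempt: fail immediately if someone else holds the lock. */
    if (fcntl(fd, F_SETLK, &fl) == 0)
        return 0;
    if (errno != EACCES && errno != EAGAIN)
        return -1;

    /* The lock is busy: wait until it becomes available. */
    if (fcntl(fd, F_SETLKW, &fl) == 0)
        return 1;
    return -1;
}
```

The cheap attempt avoids queuing when there is no contention; the blocking call is only paid for when another holder exists.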



Also, why are two locks needed per FOP? One for normal I/O and
another for self-healing?


The only fop that currently needs two locks is 'rename', and only when 
source and destination directories are different. All other fops only 
take one lock at most.


Best regards,

Xavi



Best regards
JK




[Gluster-devel] Question about EC locking

2016-11-27 Thread jayakrishnan mm
Hi Xavier,

I noticed that the EC xlator uses blocking locks. Is there any specific
reason for this?

Do you think this will affect performance?

(In comparison, AFR first tries non-blocking locks and, if not successful,
then tries blocking locks.)

Also, why are two locks needed per FOP? One for normal I/O and another
for self-healing?

Best regards
JK

[Gluster-devel] Question about EC Locking

2016-11-27 Thread jayakrishnan mm
Hi Xavier,

I noticed that the EC xlator uses blocking locks. Is there any specific
reason for this?

Do you think this will affect the read/write performance?