> > Assume we have 2 nodes.
> > 1. Node A & B reach step 3) in the same time.
> > 2. sfex_lock on Node B is scheduled out due to some other reasons.
> > 3. sfex_lock on Node A goes through step 3 to 6, and Node A holds 
> > the lock now.

Node A is sure to hold the lock at this moment.
sfex_lock() returns 0, and the RA starts monitoring on Node A.
During the monitor operation, sfex_update() runs; it checks and updates
the status of Node A.

If Node B updates the lock status _at just the right moment_,
sfex_update() detects that the other node is trying to update its status,
and it terminates with exit(2).

> > 4. sfex_lock on Node B is scheduled back, and goes through step 3 to 
> > 6 also.

The RA monitor on Node A is then stopped as well.
Node B can get the lock in a situation like this.
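
To make the sequence easier to follow, here is a rough simulation of it.
It is only a sketch under assumptions: the lock record is reduced to a
single owner field, and LockRecord, check(), reserve() and simulate() are
hypothetical names, not the real sfex code or its on-disk layout. Only the
return value 0 and the exit status 2 are taken from the description above.

```python
from dataclasses import dataclass

EXIT_OK = 0          # update succeeded, node still owns the lock
EXIT_LOCK_LOST = 2   # models the exit(2) of sfex_update()


@dataclass
class LockRecord:
    # Hypothetical, simplified lock record on the shared partition.
    owner: str = ""  # node that last wrote the record; "" means free


def check(record):
    """'Check' step: the lock looks free if no owner is recorded."""
    return record.owner == ""


def reserve(record, node):
    """'Reserve' step: write our own node name into the record."""
    record.owner = node


def sfex_update(record, node):
    """Model of the periodic update: fail if another node wrote the record."""
    if record.owner != node:
        return EXIT_LOCK_LOST
    return EXIT_OK


def simulate():
    disk = LockRecord()
    events = []
    # 1. A and B check at the same time; both see the lock as free.
    a_free = check(disk)
    b_free = check(disk)
    # 2. B is scheduled out.  3. A reserves and holds the lock.
    if a_free:
        reserve(disk, "A")
    events.append(("A update", sfex_update(disk, "A")))
    # 4. B is scheduled back and acts on its earlier check result.
    if b_free:
        reserve(disk, "B")
    # A's next update notices the other node and gives up the lock.
    events.append(("A update", sfex_update(disk, "A")))
    events.append(("B update", sfex_update(disk, "B")))
    return events


if __name__ == "__main__":
    for name, status in simulate():
        print(name, status)
```

In this model, A's first update returns 0, A's second update returns 2
once B has written the record, and B's update then returns 0, so only one
node keeps running the monitor with the lock.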

> This statement is wrong according to your code.
> Especially, your check-and-reserve is not an atomic CAS operation.

By the way, the lock status is stored directly on the partition (not through
a file system), so, as a communication medium, it keeps each read and write
operation atomic.
The nodes' actions, reading (check) or writing (reserve) the status, won't
collide with each other.
Or is this remark beside the point?

Thanks,
Junko

_______________________________________________________
Linux-HA-Dev: Linux-HA-Dev@lists.linux-ha.org
http://lists.linux-ha.org/mailman/listinfo/linux-ha-dev
Home Page: http://linux-ha.org/
