On Sun, 21 Feb 2010, Derrick Brashear (Code Review) wrote:
> this is common, not merely linux; the change in 
> 49b7bbdd3b45df694fadbef48f9ed99d9bfe07b9 doesn't appear to play with the 
> length other than for linux, though.
> 
> is this change deliberate for all platforms? rationale?
> 
> To respond, visit http://gerrit.openafs.org/1353

[I can’t currently sign into Gerrit, so I’m replying by email for now.  
I’ll copy this to Gerrit when my account gets fixed.]

Yes, my reasoning is as follows:

The Linux off-by-one bug fixed by 49b7bbd has been present since the 
beginning of time (87c10e8 Initial IBM OpenAFS 1.0 tree).  That bug caused 
l_len to be set to OFFSET_MAX rather than 0 for whole-file locks, which 
would have made it _appear_ as if the afs_lockctl code needs to check for 
both 0 and various 0x7fffffff-like values.  To be safe, those checks were 
implemented in the platform-independent code.

If there were other platforms that also have the off-by-one bug, they 
would also need to be fixed, but I don’t think there are.  Other platforms 
are given an l_len directly, instead of needing to compute one from fl_end 
- fl_start + 1.

Now, as it turns out, there is also a buggy application that sets l_len = 
0x7fffffffffffffff.  But on Linux, thanks to the off-by-one bug, 
afs_lockctl was seeing the wrong value 0x7ffffffffffffffe.  Instead of 
fixing the off-by-one bug, a (platform-independent!) check for that value 
was added in commits 0188224 lock-mask-64bit-negative-1-for-java-20070212, 
65d8923 java-locking-redux-20070214, 226c1ee java-lock-fix-200702310.  So 
that check needs to stay, but it should be adjusted to check the right 
value 0x7fffffffffffffff.  No platform other than Linux would have been 
seeing 0x7ffffffffffffffe.

If there were buggy applications that set l_len = 0x7fffffff, then we’d 
also need to check for that value.  However, that would have shown up on 
Linux as 0x7ffffffe until very recently, and a check for 0x7ffffffe has 
never been found to be necessary.  Furthermore, checking for 0x7fffffff 
may be wrong on ≥ 2 GiB files, so I left it out.  But I could go either 
way on this.

(The _real_ fix for this giant mess would be to stop ignoring byte-range 
locks, so that we don’t need heuristics to detect them.  Hopefully some 
day the protocol will be extended to deal with byte-range locks, but until 
then, it would still be nice to treat byte-range locks as byte-range locks 
locally and send a whole-file lock to the server, rather than risking 
corruption from other machines.)

Anders
_______________________________________________
OpenAFS-devel mailing list
[email protected]
https://lists.openafs.org/mailman/listinfo/openafs-devel

Reply via email to