R. David Murray <rdmur...@bitdance.com> added the comment:

Well, it turns out that this sporadic failure is not a test bug, but a real bug 
in the mailbox module that the test is revealing.

This issue is the same one that motivated the changes in issue 6896.  Those 
changes, however, merely reduced the problem, but didn't solve it.

The fundamental problem is that mailbox is relying on comparing the system 
clock with the mtime.  But the mtime, it turns out, is not guaranteed to follow 
the system clock.  You can see this most easily if you think about, say, an NFS 
file system.  The mtime is set by the server's clock, and if the server's clock 
is different, the mtime won't match the local clock.  For some reason the same 
also appears to be true on a vserver virthost: as reported in issue 6896, the 
mtime is sometimes set to a value a full second earlier than time.time().
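
One way to observe the mismatch described above is to create a file and compare 
its mtime with the local clock.  The following is only a rough sketch: the 
mtime_offset helper is a hypothetical name, not part of the mailbox module, and 
the value it reports mixes clock offset with the file system's mtime resolution.

```python
import os
import tempfile
import time


def mtime_offset(directory):
    """Return (mtime of a new file) minus (local clock), in seconds.

    Hypothetical diagnostic helper.  A clearly negative result means
    the file system stamped the file with a time earlier than the
    local clock reading taken just before the file was created.
    """
    before = time.time()
    fd, path = tempfile.mkstemp(dir=directory)
    try:
        os.close(fd)
        return os.path.getmtime(path) - before
    finally:
        # Clean up the probe file so the directory is left unchanged.
        os.remove(path)
```

On a local file system the result is normally close to zero; on a network mount 
it reflects whatever difference exists between the server's clock and the local 
one.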

Ironically, this was discussed in the original bug report that introduced the 
mtime checking code (issue 1607951), and I found that bug on the first page of 
hits while searching for mtime/system clock synchronization problems.  The 
solution that Andrew proposed in that issue is slightly different from the one 
he actually implemented, but his proposed solution was also flawed.

The actual solution involves dealing correctly with two factors: the mtime 
"clock" is not synchronized with the system clock, and the mtime only has a 
resolution of one second.  The first means that when looking for changes to the 
mtime we must compare it to the previous value of the mtime.  The second means 
that if _refresh was last called less than a second ago by our clock (the only 
one we can query), we had best recheck because the directory may have been 
updated without the mtime changing.

I also added an extra delta in case the file system clock is drifting 
relative to the system clock.  I made this a class attribute so that it is 
adjustable; perhaps it should be made public and documented.
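
The approach described above can be sketched roughly as follows.  This is an 
illustration only, not the code in the attached patch: the DirectoryCache class, 
its attribute names, and the 0.1 second skew_factor default are all assumptions 
made for the example.

```python
import os
import time


class DirectoryCache:
    """Hypothetical illustration; not the code in mailbox_mtime.patch."""

    # Assumed extra margin, in seconds, for drift between the file
    # system clock and the local clock.  A class attribute so that it
    # can be adjusted.
    skew_factor = 0.1

    def __init__(self, path):
        self.path = path
        self.entries = []
        self.last_mtime = None  # directory mtime seen at the last scan
        self.last_read = 0.0    # local clock reading at the last scan

    def needs_refresh(self):
        # The mtime has one-second resolution, so if the last scan was
        # less than a second (plus margin) ago by our clock, a later
        # change may not have altered the mtime.  Recheck to be safe.
        if time.time() - self.last_read <= 1 + self.skew_factor:
            return True
        # Otherwise compare the mtime with its own previous value,
        # never with the local clock.
        return os.path.getmtime(self.path) != self.last_mtime

    def refresh(self):
        """Rescan the directory if needed; return True if it was rescanned."""
        if not self.needs_refresh():
            return False
        self.last_mtime = os.path.getmtime(self.path)
        self.entries = sorted(os.listdir(self.path))
        self.last_read = time.time()
        return True
```

The local clock is used only to measure how long ago the last scan happened, 
which is the one thing it can be trusted for here; the decision about whether 
the directory changed rests entirely on mtime-to-mtime comparison.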

Attached is a patch implementing the fix.  It undoes the 6896 patch, since it 
is no longer needed.  At this writing my buildbot has run test_mailbox 50 times 
without failing, where before it would fail every third run or so.

Sadly, I had to reintroduce a 1.1 second fixed sleep into the test.  No way 
around it.  But it is a deterministic sleep, not a "hope this is long enough" 
sleep.

----------
Added file: http://bugs.python.org/file21904/mailbox_mtime.patch

_______________________________________
Python tracker <rep...@bugs.python.org>
<http://bugs.python.org/issue11999>
_______________________________________