Justin Mason wrote:
Daryl C. W. O'Shea writes:
Justin Mason wrote:
Daryl C. W. O'Shea writes:
This won't work. I did nearly the same thing and reverted it. The server will basically hang up trying to retry these messages. The outstanding message list needs to be dealt with to do this.

This was yet another thing I wanted to get to before being slammed with "real" work this last week. :(
uh, crap.  The nightly mass-check on the zone is currently hosed because
of this bug, and I'm not going to be able to get to it either for a few
days, what with a massive switch of servers and an impending driving
test...
I'd at least revert this change for now. As it stands, it'll cause the mass-check processes never to end (if it skips even a single message), consuming more and more memory (and CPU time, depending on the number of messages skipped) every time another instance starts.

argh.

I think I've got this fixed when not running in --cs_paths_only mode. I couldn't break it or cause it to hang/loop in a couple quick tests.


What's causing the messages to disappear during the mass-check run?

probably the corpus being updated via rsync.  it's a very big corpus.

To avoid this in what is probably a nearly identical setup, I "tag" the corpus by making a hard-linked duplicate of it for that particular mass-check run, and then delete the linked copy when the server exits. "cp -al" is your friend.
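
For illustration, here is a minimal sketch of that tagging idea in Python rather than shell. The name corpus_snapshot and the directory layout are hypothetical and not part of mass-check. It assumes the snapshot lives on the same filesystem as the corpus (hard links cannot cross filesystems) and that the updater replaces files rather than rewriting them in place, which is rsync's default behaviour without --inplace.

```python
import os
import shutil
import tempfile
from contextlib import contextmanager


@contextmanager
def corpus_snapshot(corpus_dir):
    """Yield the path of a hard-linked copy of corpus_dir; remove it on exit.

    Hypothetical helper, roughly equivalent to "cp -al corpus snapshot"
    followed by "rm -rf snapshot" once the run is over.
    """
    # Create the snapshot beside the corpus so both are on one filesystem.
    parent = os.path.dirname(os.path.abspath(corpus_dir))
    holder = tempfile.mkdtemp(prefix="masscheck-snap-", dir=parent)
    snapshot = os.path.join(holder, "corpus")
    try:
        # copy_function=os.link creates hard links instead of copying data,
        # so the snapshot costs directory entries, not message-sized disk space.
        shutil.copytree(corpus_dir, snapshot, copy_function=os.link)
        yield snapshot
    finally:
        # Drop the linked copy when the run ends, even on error.
        shutil.rmtree(holder, ignore_errors=True)
```

The server would then be pointed at the snapshot path inside a "with corpus_snapshot(...) as snap:" block, so files that are deleted or replaced in the live corpus during the run stay readable through their hard links.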

As an aside, if bandwidth is free, the whole mass-check will run quite a bit faster if you rsync the corpus to each of the slaves. Of course that assumes you've got the disk space and i/o to spare (i/o you may already have if /tmp isn't a ramdisk).


Daryl
