Daryl C. W. O'Shea writes:
> I think I've got this fixed when not running in --cs_paths_only mode.  I
> couldn't break it or cause it to hang/loop in a couple quick tests.

yay ;)

> >> What's causing the messages to disappear during the mass-check run?
> >
> > probably the corpus being updated via rsync.  it's a very big corpus.
>
> To avoid this in a probably nearly identical setup I "tag" the corpus by
> making a linked duplicate of it for that particular mass-check run and
> then delete the linked copy when the server exits.  "cp -al" is your friend.

This may be a good option.  I'd prefer if mass-check was just resilient,
though. ;)

The zone's nightly-mc corpus (uploaded corpora) are this big (in KB):

2        /export/home/bbmass/rawcor/doc
19760    /export/home/bbmass/rawcor/fredt
6764040  /export/home/bbmass/rawcor/jm   (mostly spam, since May 2007)
209393   /export/home/bbmass/rawcor/zmi

so that's pretty big.  In terms of disk space usage, that probably wouldn't
take much space to cp -al; but it'd take a fair bit of time, esp on the
zone, which has serious I/O bottleneck problems.

> As an aside, if bandwidth is free, the whole mass-check will run quite a
> bit faster if you rsync the corpus to each of the slaves.  Of course
> that assumes you've got the disk space and i/o to spare (i/o you may
> already have if /tmp isn't a ramdisk).

yeah, rsyncing about 7GB of corpora, nightly, would definitely be slow ;)

--j.
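
For reference, the "tag the corpus with a linked duplicate" idea quoted above can be sketched roughly as follows. This is only an illustration, not what mass-check or Daryl's setup actually does: the function name tagged_corpus and the paths in the usage note are hypothetical, and it assumes the snapshot lives on the same filesystem as the corpus (hard links cannot cross filesystems). It also assumes the updater replaces or unlinks files rather than rewriting them in place, since an in-place write would show through the hard link.

```python
import contextlib
import os
import shutil


@contextlib.contextmanager
def tagged_corpus(src, dst):
    """Hypothetical helper: hard-link the tree at src into dst, roughly like
    'cp -al src dst', and remove dst again when the block exits."""
    # copy_function=os.link creates a hard link for each regular file instead
    # of copying its data; directories are created fresh. symlinks=True keeps
    # symlinks as symlinks rather than following them.
    shutil.copytree(src, dst, symlinks=True, copy_function=os.link)
    try:
        yield dst
    finally:
        # Deleting the snapshot only drops the extra links; files still
        # present in src are untouched.
        shutil.rmtree(dst)
```

A run would then point at the snapshot instead of the live corpus, e.g. `with tagged_corpus("rawcor", "rawcor.run") as run_dir: ...` (both directory names made up here). Files removed from the live tree during the run stay reachable through the snapshot until the block exits. The snapshot costs little disk space, but every file still needs a link() call, so the I/O time concern raised above still applies.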
