I suspect that the answer to this question is, "Don't do that, then," but in case anyone has any suggestions:
I was doing a load test. The configuration is MythTV 0.18.1, with an MBE with five PVR-250s, and an SBE with one PVR-350 that NFS-mounts the MBE's recording directory. CPUs are AMD 2800+s with 512MB RAM, on 100baseT ethernet through a hub; disks are Seagate 7200.7 200GB PATA (one per machine); the filesystem for everything except the recordings is ext3. (Recordings use JFS.)

I set up test recordings on all six tuners simultaneously, then started a dirvish backup of the entire MBE filesystem, -excluding- the actual video directory and things like /dev and /proc and so forth; I was doing this partly to set up the initial dirvish vault for keeping the machine backed up. (For those who've never heard of dirvish, this amounts to running rsync on the entire filesystem and schlepping the results to a third machine.)

About one minute into the transfer (and about six minutes into the recording), the recording on the PVR-350 (i.e., on the SBE) got corrupted for a few seconds. That threw off its audio sync for the rest of the recording, and then crashed the 350 itself when I played back that particular stream a few hours later (more about that in a message to ivtv-users). The slave logged three complaints within 300ms in its kernel log that the master's NFS server had timed out while it was making that recording, so I'm hardly surprised the recording had problems. Interestingly, it glitched only that once; I'm not sure why.

I have real-time commflagging turned on; the test recordings started about five minutes before the rsync, and most of the commflaggers run on the slave, so that added to both the disk and the network contention. In retrospect I suppose it's not surprising that something hiccuped, and that -was- the point of the test... (Though, OTOH, the commflagger waits several minutes [exactly 5? more?] before starting, so maybe it wasn't that. OTGH, I have a 10-second job-queue check interval, so five of them could have started within a 50-second window once enough recording had accumulated.)

The question is: can I do better? I don't imagine I'll often be rsyncing the entire disk, but I will certainly be rsyncing the delta since the last rsync, and I'd rather not have to worry about that disrupting recordings. (That's a bit harder to test under load, since the deltas will vary, but I'll attempt it to make sure that rsync's scanning phase alone isn't enough to cause disruption; sketches of the throttling and scan-only testing I have in mind are in the P.P.S. below.)

Current NFS mount options are an rsize and wsize of 8192; would increasing these help on 100baseT, or just lead to more fragment reassembly? (Actually, the "hub" is an SMC 8508T gigabit switch with jumbo-frame support, but the NICs on the machines are only 100 megabit.) The other NFS options are soft and nfsvers=3, which are pretty standard. (A mount-line sketch is in the P.P.S. as well.)

Would increasing ivtv's buffer sizes on the slave help? (Again, see the P.P.S. for what I mean.) Or would I be screwed anyway because the buffers have already been flushed by the time the NFS timeout manifests?

P.S. Is it feasible to let the slave record on its own filesystem instead of on the NFS-mounted master's? I suspect there's no way to seamlessly play, commflag, or transcode if recordings might be split across multiple filesystems, but if somebody has a good idea, I'm willing to listen...

Thanks!
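P.P.S. To make "can I do better" concrete, here's roughly the sort of throttled transfer I have in mind (the host and paths are made up for illustration, and I'd still have to check whether dirvish can be told to pass extra options through to rsync):

    # Cap network use at ~2 MB/s and run at idle CPU/disk priority so the
    # recordings win any contention (ionice's idle class needs CFQ).
    nice -n 19 ionice -c3 rsync -a --bwlimit=2000 \
        --exclude=/var/video --exclude=/dev --exclude=/proc \
        / backuphost:/backups/mbe/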
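For the delta question, a dry run should exercise just the file-list scan (a stat of every file on both ends) without moving any data, which ought to show whether traversal alone is enough to disturb a recording:

    # -n: dry run, nothing is transferred; --stats reports what a real
    # run would have sent.
    rsync -an --stats --exclude=/var/video --exclude=/dev \
        --exclude=/proc / backuphost:/backups/mbe/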
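On the NFS side, here's the shape of the current mount line and the variant I'm wondering about (the export path is illustrative, the 32k values are a guess, and I realize hard,intr trades "corrupted write" for "blocked writer", which may or may not be better for a recorder):

    # current:
    mbe:/var/video  /var/video  nfs  rsize=8192,wsize=8192,soft,nfsvers=3  0 0
    # candidate:
    mbe:/var/video  /var/video  nfs  rsize=32768,wsize=32768,hard,intr,nfsvers=3  0 0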
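And by ivtv buffer sizes I mean something along these lines, though the parameter names vary by driver version, so modinfo is the authority here (the mpg_buffers name and the value 32 below are just placeholders; substitute whatever your build actually reports):

    # See which buffer knobs this ivtv build actually exposes:
    modinfo ivtv | grep -i buf
    # Then, e.g., in /etc/modprobe.conf:
    options ivtv mpg_buffers=32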