Miguel Barreiro writes:
> > Why would you be sending a file to a datagram socket?
>
> To send prepacketized video to a DTV headend.

How do you handle flow control and/or pacing?  How do you set the
message boundaries?

> Context: most IPTV deployments deliver video as MPEG-2 Transport
> Streams over UDP.  Digital cable headends are also increasingly IP
> based.  For high-density video on demand you need to send as many
> streams from a node as possible; the bottleneck is almost always
> I/O... unless you stream from a Thumper.

Yep; understood.

> To make things more interesting, you cannot always send jumbo frames,
> and streams should follow the exact timestamps with a precision on
> the order of 20-50 msecs or better.

That would seem to make it particularly unsuited to sendfile()-like
behavior, which doesn't try to follow any timestamps or precise
timing.  (Leaving aside the argument that precise timing is not
actually required if you have a modest amount of buffering at the
receiver.)

> The following data is from an x4500 sending about 1.2 Gbps of video
> streams (about 550 streams in this particular setup) over two
> interfaces, all of them from different media, and all of them 2 to
> 4 GB long.  Sender processes basically read(), usleep() and send()
> following a prebuilt index.
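For what it's worth, my mental model of that inner loop is roughly the
following (a sketch of my own, not your code -- the index layout, the
7-packets-per-datagram bundling, and the usleep()-based pacing are all
assumptions on my part):

    /* Paced UDP sender sketch: read a chunk of prepacketized MPEG-2 TS,
     * sleep until the prebuilt index says it's due, then send it.  One
     * datagram per send(), so the UDP datagram boundary is the message
     * boundary and no extra framing is needed on the wire. */
    #include <stdint.h>
    #include <sys/types.h>
    #include <sys/time.h>
    #include <sys/socket.h>
    #include <unistd.h>

    #define TS_PKT       188        /* MPEG-2 TS packet size */
    #define PKTS_PER_DG  7          /* 7 * 188 = 1316, fits a 1500 MTU */

    struct index_entry {            /* hypothetical prebuilt index entry */
        off_t    offset;            /* where the datagram starts in file */
        uint64_t due_usec;          /* due time, offset from stream start */
    };

    static uint64_t
    now_usec(void)
    {
        struct timeval tv;
        (void) gettimeofday(&tv, NULL);
        return ((uint64_t)tv.tv_sec * 1000000 + tv.tv_usec);
    }

    static void
    stream_one(int fd, int sock, const struct index_entry *idx,
        size_t nent, uint64_t start_usec)
    {
        char buf[TS_PKT * PKTS_PER_DG];

        for (size_t i = 0; i < nent; i++) {
            ssize_t n = pread(fd, buf, sizeof (buf), idx[i].offset);
            if (n <= 0)
                break;
            uint64_t due = start_usec + idx[i].due_usec;
            uint64_t now = now_usec();
            if (due > now)
                usleep(due - now);  /* crude pacing */
            (void) send(sock, buf, (size_t)n, 0);
        }
    }

Even if the reads and sleeps are batched, the per-datagram send() alone
works out to roughly 114k calls per second at 1.2 Gbps with 1316-byte
payloads, which is the right order of magnitude for the syscall rates
in your vmstat output below.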
> - That much processor use in kernel space?
>
> $ sar -u 3
> 15:59:05    %usr    %sys    %wio   %idle
> 16:00:08       2      49       0      49
> 16:00:11       2      49       0      49

I think you'd probably need to do some more in-depth characterization
to find out where the time is being spent.  Sar is useful, but pretty
blunt as a design tool.

> Yes, we are doing a huge number of syscalls/sec here and a lot of
> kernel-userspace copying (thus I'd love to try the same with
> sendfile, as we already do in Linux); see below...
>
> $ vmstat 1
>  kthr      memory            page            disk          faults      cpu
>  r b w   swap    free   re mf pi po fr de sr m1 m1 m1 m2   in     sy    cs us sy id
>  0 0 0 1903944 2118896  0  0  0  0  0  0  0  0  0  0  0  9577 148303 27172  2 55 43
>  0 0 0 1820744 2035696  0  0  0  0  0  0  0  0  0  0  0  7518 149480 25834  2 50 48
>  0 0 0 1762504 1977456  0  0  0  0  0  0  0  0  0  0  0  8547 150297 26799  2 53 45

Yes, an interface with fewer copies may well be helpful.  I just don't
think that interface looks like sendfile().

> - Is that the Fire Engine being too aggressive at interrupt affinity?
> (see the number of interrupts per core)

There are a number of factors here, and you can't really get detailed
information from just mpstat.  I suspect that the large number of
interrupts on CPU 1 represents one of your network interfaces.  The
hardware interrupt has to go somewhere.

Yes, the squeue (Fire Engine) does attempt to do as much processing on
a single CPU as it can in order to keep the cache hot.  That's not
necessarily a bad thing -- jumping back and forth between CPUs in an
attempt to "balance the load" can actually lower performance by
forcing cache invalidations.

> I'd love to hear other suggestions on how to make this beast perform
> better.  Should I get very different results with a recent Nevada
> snapshot (with the Yosemite patches)?

Exactly what version/update are you running?  I don't think you said
(other than "S10").  Yes, this is precisely the sort of workload that
Yosemite was designed to handle, so it's well worth a try.
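P.S.  One thing you could try today, without waiting for any new
kernel interface: mmap() each file and send() directly out of the
mapping.  That drops the per-datagram read() -- one syscall and one
copy fewer per packet, though the data is still copied once, user to
kernel, inside send().  A rough sketch of my own (untested, error
handling mostly omitted, reusing the hypothetical index_entry and
now_usec from the sketch above):

    #include <sys/mman.h>
    #include <sys/stat.h>
    #include <fcntl.h>

    #define DG_LEN (188 * 7)        /* same 1316-byte datagrams */

    static int
    stream_mapped(const char *path, int sock,
        const struct index_entry *idx, size_t nent, uint64_t start_usec)
    {
        struct stat st;
        int fd = open(path, O_RDONLY);

        if (fd == -1 || fstat(fd, &st) == -1)
            return (-1);
        char *base = mmap(NULL, st.st_size, PROT_READ, MAP_SHARED, fd, 0);
        if (base == MAP_FAILED)
            return (-1);
        /* hint sequential access so the page cache reads ahead */
        (void) madvise(base, st.st_size, MADV_SEQUENTIAL);

        for (size_t i = 0; i < nent; i++) {
            uint64_t due = start_usec + idx[i].due_usec;
            uint64_t now = now_usec();
            if (due > now)
                usleep(due - now);
            /* send straight from the mapping; no read() into a buffer */
            (void) send(sock, base + idx[i].offset, DG_LEN, 0);
        }
        (void) munmap(base, st.st_size);
        (void) close(fd);
        return (0);
    }

Whether that's a win in practice depends on how well the page cache
keeps up with 550 concurrent sequential readers, so I'd measure before
and after.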
-- 
James Carlson, Solaris Networking              <[EMAIL PROTECTED]>
Sun Microsystems / 1 Network Drive         71.232W   Vox +1 781 442 2084
MS UBUR02-212 / Burlington MA 01803-2757   42.496N   Fax +1 781 442 1677

_______________________________________________
networking-discuss mailing list
[email protected]