Eugene,

I ran pipe performance tests on a 2.56GHz PIV with a 333MHz FSB, the same box on which I ran the 2.16.16 and 2.16.18 pipe performance tests last time (about 18 months ago). I show 2.18 as comparable to 2.16.16. The performance gains of 2.16.18 showed up only when message sizes were within a FASTBUF.

Performance tests on RH7.2 2.4.20-28.7bigmem (the SMP kernel I tested on last time) show LiS 2.18 STREAMS-based pipes clocking in at a dismal 13% when compared to Linux SVR3-style native pipes. 2.16.18 showed about 20% 18 months ago, but only beneath 64-byte read/write sizes, and then fell back to about 10% after that.

Performance tests on CentOS 4 (CL4, a RHEL4 clone) and FC4 show the performance gains of the 2.6 kernel (and recent re-optimizing compilers) to be quite significant. For native pipes, the per-byte read/write latency drops from about 750 ps (picoseconds) on RH7.2 and CL4 to about 500 ps on FC4 (a regparms kernel). LiS 2.18 experiences a per-byte read/write latency drop from 1600 ps on RH7.2 to 900 ps on CL4 and FC4. I attribute the gain on native to the regparms FC4 kernel. I attribute the gain on LiS to the better compilers (GCC 3.4.3 and 4.0) on CL4 and FC4, which better find their way around cruft in the code.

Per-message read/write delays for LiS 2.18 drop from 20 us on RH7.2 to 8 us on CL4 and FC4, while native pipes sit at around 2.5 us on all three. I attribute the gain on LiS to the tighter scheduling latency and O(1) scheduler of the 2.6 kernel. STREAMS-based pipes are far more susceptible to scheduling latency.

Overall, when compared to native pipes, LiS 2.18 performs at 12.7% for RH7.2, 28.1% for CL4 and a top end of 38.8% for FC4. The FC4 native pipes really cruise, so 38.8% is quite good. I attribute the good FC4 results to the O(1) scheduler, the regparms kernel and the re-optimizing GCC 4 compiler.

It is interesting that kernel improvements generate better performance gains than could be accomplished within LiS with the changes from 2.16.16 to 2.16.18. There, it was only a 2x gain when compared to native pipes, and only beneath 64-byte writes. The FC4 improvements are across all message sizes (tested linear with .999 correlation up to 4096 bytes).

So I suppose the story with LiS is, if you want the best performance, use a good 2.6 kernel.
Because 2.16.18 only runs on a 2.4 kernel, 2.18 is the better choice of the two for performance. If you are running on a 2.4 kernel, however, expect better performance from 2.16.18 at message sizes within a FASTBUF.

Now, Linux Fast-STREAMS...

LfS (streams-0.7a.4) in the same performance tests relative to Linux native pipes clocked in at 40%, 60% and 75% on RH7.2, CL4 and FC4 over all message sizes. When compared to LiS at 13%, 28% and 39% in the same tests, LfS performs 3.1x, 2.1x and 1.9x as fast as LiS.

The 3x performance gain on 2.4 SMP over LiS 2.18 is quite impressive, particularly when you consider that, compared to native pipes, LfS runs as fast on RH7.2 2.4 as LiS 2.18 runs on FC4. The other impressive figure is that LfS on FC4 is running at 75% of the performance of a native Linux pipe. This exceeds John Boyd's "impressed" threshold (better than 50% native pipe performance).

LfS is the best performance choice on any kernel. Transitioning from LiS 2.16.18 to LfS is a better performance choice than transitioning to LiS 2.18. But then, that's why I called it "Fast".

I will send a separate note on some of my discoveries regarding performance on LiS and LfS.

Here is the raw (well, half-cooked) data (obtained using the perftest program included in the OpenSS7 LiS 2.18.2 release and the streams 0.7a.4 release):

Linear regression was performed on pipe throughput at 4, 8, 16, 32, 64, 128, 256, 512, 1024, 2048 and 4096 byte message sizes, running the pipe wide open (100% cpu utilization). Correlations were usually 99.9%. Slope is the per-byte read/write delay; intercept is the per-message read/write delay. The delay is y = mx + b, where y is the delay per message, m is the slope, b is the intercept and x is the message size in bytes.
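
For anyone who wants to play with the numbers, here is a minimal sketch of the delay model in Python. It is not the perftest program and not how the figures in this message were produced; the function names (fit_line, delay_us) are made up for illustration. It fits y = mx + b by ordinary least squares and then evaluates the model with the RH7.2 slope and intercept values from the tables in this message. The "measurements" fed to the fit are synthetic, generated from the model itself, only to check that the fit recovers the parameters.

```python
def fit_line(sizes, delays):
    """Ordinary least-squares fit of delays = slope * size + intercept."""
    n = len(sizes)
    mean_x = sum(sizes) / n
    mean_y = sum(delays) / n
    sxx = sum((x - mean_x) ** 2 for x in sizes)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(sizes, delays))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def delay_us(size, per_byte_ps, per_msg_us):
    """Modelled delay per message in microseconds (y = m*x + b).

    per_byte_ps is the slope in picoseconds per byte (1 ps = 1e-6 us),
    per_msg_us is the intercept in microseconds.
    """
    return per_byte_ps * 1e-6 * size + per_msg_us


# Message sizes used in the tests, in bytes.
SIZES = [4, 8, 16, 32, 64, 128, 256, 512, 1024, 2048, 4096]

# (slope in ps/byte, intercept in us/message) for RH7.2, from the tables.
LIS_RH72 = (1620, 19.20)
LINUX_RH72 = (760, 2.43)

# Synthetic delays generated from the model (NOT measured data).
synthetic = [delay_us(x, *LIS_RH72) for x in SIZES]
slope_us, intercept_us = fit_line(SIZES, synthetic)

# Relative performance of LiS against native pipes for 4-byte messages:
# a shorter delay per message means proportionally higher throughput.
ratio = delay_us(4, *LINUX_RH72) / delay_us(4, *LIS_RH72)

print(f"fitted slope:     {slope_us * 1e6:.1f} ps/byte")
print(f"fitted intercept: {intercept_us:.2f} us/message")
print(f"LiS vs native at 4 bytes: {ratio:.1%}")
```

Under this model the 4-byte ratio for RH7.2 comes out near the 12.7% quoted above, which suggests the per-message intercept dominates at small message sizes. How the overall percentages were actually derived is not stated here, so treat the agreement as suggestive only.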

Per byte delay, slope, (picoseconds):

            RH7.2       CL4       FC4
          ---------  --------  --------
  LiS        1620       886       925
  LfS        1230       919       987
  Linux       760       750       482

Per write delay, intercept, (microseconds):

            RH7.2       CL4       FC4
          ---------  --------  --------
  LiS       19.20      7.72      7.83
  LfS        6.54      3.57      4.02
  Linux      2.43      2.17      3.03

--brian

On Fri, 02 Dec 2005, [EMAIL PROTECTED] wrote:

> Hello,
>
> I'm curious about people impressions about LiS-2.18 performance.
> Is it better comparing to LiS-2.16.18 ?
>
> Are there any known 2.18 issues that can be fixed to improve
> performance?
>
> My understanding is that in LiS-2.18 most(all?) of the queue processing
> is done by LiS kernel threads and queuerun is never executed from
> the driver tasklet context. That may result, I guess, in excessive
> process switching overhead and poorer performance.
> I might be missing something, though.
>
> The other thing I noticed when I ran my tests on a 4 processor system
> is that only one LiS thread accumulated CPU time:
>
> root  9574  1  0 Dec01 ?  00:02:27 [LiS-2.18.0:0]  <-------
> root  9575  1  0 Dec01 ?  00:00:01 [LiS-2.18.0:1]
> root  9576  1  0 Dec01 ?  00:00:00 [LiS-2.18.0:2]
> root  9577  1  0 Dec01 ?  00:00:00 [LiS-2.18.0:3]
>
> Is it the way it's supposed to be, or it's a bug?
>
> I'd appreciate any comment/advices regarding performance issues on
> LiS-2.18.
>
> --
> Eugene

--
Brian F. G. Bidulock    ¦ The reasonable man adapts himself to the ¦
[EMAIL PROTECTED]       ¦ world; the unreasonable one persists in  ¦
http://www.openss7.org/ ¦ trying to adapt the world to himself.    ¦
                        ¦ Therefore all progress depends on the    ¦
                        ¦ unreasonable man. -- George Bernard Shaw ¦
_______________________________________________
Linux-streams mailing list
Linux-streams@gsyc.escet.urjc.es
http://gsyc.escet.urjc.es/mailman/listinfo/linux-streams