In a former life, I once benchmarked C with stdio against Perl's <FH> and found no difference at all. My application (and, therefore, my test) was line-oriented, IIRC.
What's more, I tried bypassing stdio in the C program and also found only a minuscule benefit (especially for larger files, where the buffering actually helps). Running sar (on Solaris) on the test programs showed a lot of time in I/O wait states. Go figure. At the time I decided to relax and enjoy <FH>. :-) I wonder whether the result would be different if I repeated the test today with a good array disk controller.

>>> "UG" == "Uri Guttman" <u...@stemsystems.com> writes:
>>> "CW" == Conor Walsh <c...@adverb.ly> writes:

CW> Uri noted a couple days ago that Perl's stream
CW> I/O in <FH> calls is terrible. This matches
CW> somewhat with my own experience, and whenever
CW> I need to parse a file I either slurp it (if
CW> I'm certain it's within certain bounds) or
CW> do something like while {sysread(LARGE_BUFSIZE)}.

CW> Is anyone here perlgutsy enough to say *why*
CW> <FH> is so slow? Is it just the split /(?=$/)/,
CW> or is there more going on there that I'm
CW> missing?

CW> If I bypass <FH>, am I gaining speed by not
CW> doing work I don't need to do, or is it just
CW> one of the more atrocious legacy code paths?

UG> my guess is that stdio does more work than needed
UG> for slurping. it does buffering with smaller reads,
UG> it does line ending, it does eof checking, etc. it
UG> is designed for flexibility rather than speed. i
UG> also bet it has a large tree of sub calls to do its
UG> work. stdio does many things which aren't needed
UG> for just slurping in a file.
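
For reference, here's a rough sketch of the two read styles CW mentions, side by side under the Benchmark module. The 1 MB buffer, the 20 iterations, and the test-file argument are my own placeholder choices, not anything from the original tests, so treat the timings as relative only.

#!/usr/bin/perl
use strict;
use warnings;
use Benchmark qw(timethese);

# Buffer size and iteration count are arbitrary picks for illustration;
# tune them to the size of your test file.
my $file    = shift @ARGV or die "usage: $0 testfile\n";
my $bufsize = 1 << 20;    # 1 MB per sysread

timethese(20, {

    # Plain readline loop: <FH> hands back one line at a time,
    # which means scanning every buffer for $/ along the way.
    line_by_line => sub {
        open my $fh, '<', $file or die "open $file: $!";
        my $count = 0;
        while (my $line = <$fh>) {
            $count++;
        }
        close $fh;
    },

    # Large-block sysread: no line splitting, no per-line overhead,
    # just raw reads appended to one scalar (i.e. a manual slurp).
    sysread_blocks => sub {
        open my $fh, '<', $file or die "open $file: $!";
        my ($data, $buf) = ('', '');
        while (sysread($fh, $buf, $bufsize)) {
            $data .= $buf;
        }
        close $fh;
    },
});

The point of the contrast is the one UG makes above: sysread bypasses the buffered I/O layer and reads whatever chunk size you ask for, while <FH> has to scan each buffer for $/ and build a fresh scalar per line.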