In a former life, I once benchmarked C+stdio vs. Perl <FH> and found no
difference at all.  My application (and, therefore, my test) was
line-oriented, IIRC.

What's more, I tried bypassing stdio in the C program and also found only a
minuscule benefit (especially for larger files, where the buffering actually
helps).

Running sar (on Solaris) on the test programs showed a lot of time in I/O wait 
states. Go figure.  At the time I decided to relax and enjoy <FH>.  :-)  

I wonder whether the result would be different if I repeated the test today
with a good disk array controller.  (A rough sketch of the Perl side of that
old test follows, for the curious.)
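
Reconstructing from memory (this is not the original program; the file
name, repetition count, and buffer size below are made up), the Perl side
of that test looked roughly like this: a plain <FH> line loop against a
raw sysread() chunk loop over the same file, timed with the core
Benchmark module.

    #!/usr/bin/perl
    use strict;
    use warnings;
    use Benchmark qw(timethese);

    my $file    = shift || 'big.log';   # hypothetical input file
    my $bufsize = 64 * 1024;            # hypothetical buffer size

    timethese( 10, {
        'diamond' => sub {
            open my $fh, '<', $file or die "open $file: $!";
            my $lines = 0;
            $lines++ while <$fh>;       # per-line reads via readline
            close $fh;
        },
        'sysread' => sub {
            open my $fh, '<', $file or die "open $file: $!";
            my ( $buf, $lines ) = ( '', 0 );
            while ( sysread( $fh, $buf, $bufsize ) ) {
                $lines += ( $buf =~ tr/\n// );   # count newlines per chunk
            }
            close $fh;
        },
    } );

Both loops just count lines, so they do comparable per-record work; the
difference you measure is almost entirely the cost of the read path.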


>>> "UG" == "Uri Guttman" <u...@stemsystems.com> writes:
>>> "CW" == Conor Walsh <c...@adverb.ly> writes:

CW> Uri noted a couple of days ago that Perl's stream
CW> I/O in <FH> calls is terrible.  This matches
CW> somewhat with my own experience, and whenever
CW> I need to parse a file I either slurp it (if
CW> I'm sure it stays within reasonable bounds) or
CW> do something like while {sysread(LARGE_BUFSIZE)}.

CW> Is anyone here perlgutsy enough to say *why* 
CW> <FH> is so slow?  Is it just the split /(?=$/)/, 
CW> or is there more going on there that I'm
CW> missing?

CW> If I bypass <FH>, am I gaining speed by not 
CW> doing work I don't need to do, or is it just
CW> one of the more atrocious legacy code paths?

UG> my guess is that stdio does more work than needed 
UG> for slurping. it does buffering with smaller reads, 
UG> it does line-ending handling, it does eof checking, etc. it
UG> is designed for flexibility rather than speed. i 
UG> also bet it has a large tree of sub calls to do its 
UG> work. stdio does many things which aren't needed 
UG> for just slurping in a file.
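
For reference, my reading of the sysread()-style slurp Conor describes is
roughly the sketch below.  This is only my interpretation, not his code:
LARGE_BUFSIZE is his placeholder (256K here is a guess), and the final
split into lines is mine.

    use strict;
    use warnings;

    use constant LARGE_BUFSIZE => 256 * 1024;   # guessed chunk size

    my $file = shift || 'big.log';              # hypothetical input file
    open my $fh, '<', $file or die "open $file: $!";

    # Pull the whole file into one scalar, LARGE_BUFSIZE bytes at a time,
    # skipping the per-line readline machinery entirely.
    my ( $data, $buf ) = ( '', '' );
    while ( sysread( $fh, $buf, LARGE_BUFSIZE ) ) {
        $data .= $buf;
    }
    close $fh;

    # Break the buffer into lines only once, after the slurp.
    my @lines = split /\n/, $data;
    printf "%d bytes, %d lines\n", length $data, scalar @lines;

The win, if any, comes from doing one big split at the end instead of
paying the readline bookkeeping on every line, which fits Uri's guess
about where the overhead goes.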


