On Jul 29, 2008, at 1:11 AM, Hein, Nashua NH wrote:

On Jul 29, 1:17 am, [EMAIL PROTECTED] (Jeremy Begg) wrote:


This appears to be fixed in perl 5.10.
On 5.8-6 the extra elapsed time is burned strictly in usermode cpu
time (in DECCRTL)

Thanks Jeremy and Hein for looking deeper. I think the difference between 5.8.x and 5.10.x is that in 5.10 the bottom layer of Perl's I/O layering is unixio rather than stdio, i.e. read/write and friends rather than fread/fwrite and friends. But that's not the whole story. I think the real problem is in our implementation of fwrite(), and 5.10 just dodges the problem by not using it. We don't use the CRTL version of fwrite because it's been co-opted to do record I/O and introduces record boundaries in many places where they are not wanted.

You can see what our home-grown version of fwrite does by searching for my_fwrite in the file here:

http://public.activestate.com/cgi-bin/perlbrowse/f/vms/vms.c

You'll see that we implement it in terms of fputs and fputc. Since fputs treats a null byte as the end of the string, we use fputc to output the null bytes and fputs to output any non-null chunks. That's probably not too much of a hit when you have the occasional null byte in the midst of mostly non-null data, but when every byte is null, you end up writing out one byte at a time with fputc. I'm pretty sure this is the cause of the performance hit Jörg noted, though I haven't had time yet to step through it in the debugger and confirm that.
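
To make the cost concrete, here is a minimal sketch in C of the strategy just described. It is not the actual my_fwrite from vms.c: the names sketch_fwrite, sketch_fputs_calls and sketch_fputc_calls are invented for illustration, the counters exist only to show how many stdio calls get made, and the copy into a temporary buffer is just one assumed way of guaranteeing that fputs always finds a terminator.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical counters, for illustration only. */
static unsigned long sketch_fputs_calls;
static unsigned long sketch_fputc_calls;

/* Sketch of an fwrite-like routine built on fputs and fputc.
 * NOT the real my_fwrite; it only mirrors the approach described above.
 * Overflow of size * nitems is not checked in this sketch. */
static size_t sketch_fwrite(const void *src, size_t size, size_t nitems, FILE *fp)
{
    size_t total = size * nitems;
    size_t pos = 0;
    char *buf;

    assert(fp != NULL);
    if (total == 0)
        return 0;

    /* Copy the data so there is always a trailing null for fputs to stop at. */
    buf = malloc(total + 1);
    if (buf == NULL)
        return 0;
    memcpy(buf, src, total);
    buf[total] = '\0';

    while (pos < total) {
        if (buf[pos] == '\0') {
            /* fputs cannot emit a null byte, so each one costs an fputc call. */
            if (fputc('\0', fp) == EOF)
                break;
            sketch_fputc_calls++;
            pos++;
        } else {
            /* A run of non-null bytes goes out in a single fputs call. */
            size_t len = strlen(buf + pos);
            if (fputs(buf + pos, fp) == EOF)
                break;
            sketch_fputs_calls++;
            pos += len;
        }
    }

    free(buf);
    return pos / size;   /* number of complete items written */
}
```

With this sketch, a buffer of N null bytes costs N fputc calls, while a buffer of N non-null bytes costs a single fputs call, which is consistent with the slowdown only showing up on null-heavy data.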

I'm not sure what to do about this. What we have now works reliably, but that much of a performance hit is a real problem. What's probably needed is a complete reimplementation of fwrite that doesn't have the old behavior inherited from VAX C days of thinking that an "item" to be output corresponds directly to an RMS record. This could be tricky to get right for all file types.

________________________________________
Craig A. Berry
mailto:[EMAIL PROTECTED]

"... getting out of a sonnet is much more
 difficult than getting in."
                 Brad Leithauser
