On 29 Sep 2009, at 5:39 pm, John Cowan wrote:

> Alaric Snell-Pym scripsit:
>
>>> Character encoding/decoding needs to be done in big buffers for the
>>> same reason that actual I/O does.
>>
>> Why? Just because of the procedure call overhead of read-char?
>
> Well, yes -- in a compiler.  But in an interpreter, you are crossing
> an expensive abstraction barrier between interpreted code and
> compiled code every time you convert a character, which (in
> file-oriented programs) is probably your inner loop.  Much better to
> have that inner loop in C, possibly inside libiconv, which knows
> more about the subject than the average Scheme implementer.

OK, but then the issue is more one of providing procedures that read
whole strings, rather than a character at a time. Reading ahead a
string in order to feed it to the user character by character is,
indeed, a problem if you don't know when you'll want to stop reading
strings and go back to reading bytes. So don't do it ;-) Give the user
(read-string-of-fixed-length <bytes> [<port>] [<encoding>]) or
(read-string-up-to-delimiter <codepoint>|(<codepoint> ...) [<port>]
[<encoding>]), or some such.
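
Purely as a sketch of how those might look in use -- the names,
argument order, and the 'utf-8 / 'latin-1 encoding symbols are
illustrative, not a settled interface; assume port is an already-open
input port, and note get-bytevector-n is standard R6RS:

  ;; Read a 512-byte fixed-length header as UTF-8, then drop back
  ;; to raw bytes for the payload:
  (define header  (read-string-of-fixed-length 512 port 'utf-8))
  (define payload (get-bytevector-n port 1024))

  ;; Read one line as Latin-1, stopping at codepoint 10 (LF):
  (define line (read-string-up-to-delimiter 10 port 'latin-1))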

> Similarly, why not just read one byte from the filesystem whenever
> we want binary data?  The kernel does plenty of hidden buffering
> already; why bother with stdio and user buffers?  Because it is
> expensive to cross the abstraction barrier between userland and the
> kernel, and your performance goes to hell.

Yep, but such readahead can be hidden from the user. You call
getchar() on a FILE *, and, without your having to pay any attention,
a read() syscall happens only every few KB, and everyone's a winner.
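
The same trick works a level up: a Scheme port can keep a private
byte buffer and refill it with one bulk read whenever it runs dry.
A minimal sketch in R6RS terms (make-buffered-reader is a made-up
name, and it fakes decoding by treating bytes as Latin-1 rather
than doing the real thing):

  (import (rnrs))

  (define buffer-size 8192)

  ;; Returns a thunk that acts like read-char on binary-port, but
  ;; only touches the port in buffer-size gulps.
  (define (make-buffered-reader binary-port)
    (let ((buf (make-bytevector buffer-size))
          (pos 0)    ; index of next unread byte
          (end 0))   ; count of valid bytes in buf; -1 once at EOF
      (lambda ()
        (when (= pos end)                 ; buffer drained: refill
          (let ((n (get-bytevector-n! binary-port buf 0 buffer-size)))
            (set! pos 0)
            (set! end (if (eof-object? n) -1 n))))
        (if (= end -1)
            (eof-object)
            (let ((b (bytevector-u8-ref buf pos)))
              (set! pos (+ pos 1))
              ;; stand-in for real decoding: bytes as Latin-1
              (integer->char b))))))

A real implementation would decode properly here (or hand the whole
buffer to libiconv), but the syscall-per-character cost disappears in
exactly the same way as with getchar().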

ABS

--
Alaric Snell-Pym
Work: http://www.snell-systems.co.uk/
Play: http://www.snell-pym.org.uk/alaric/
Blog: http://www.snell-pym.org.uk/archives/author/alaric/



