On Mon, 21 Oct 2013 17:49:43 +0100, H. S. Teoh <hst...@quickfur.ath.cx> wrote:

On Mon, Oct 21, 2013 at 04:47:05PM +0100, Regan Heath wrote:
On Mon, 21 Oct 2013 15:02:35 +0100, H. S. Teoh
<hst...@quickfur.ath.cx> wrote:

>On Mon, Oct 21, 2013 at 10:40:14AM +0100, Regan Heath wrote:
>>On Fri, 18 Oct 2013 17:36:28 +0100, Dicebot <pub...@dicebot.lv> wrote:
>>
>>>On Friday, 18 October 2013 at 15:42:56 UTC, Andrei Alexandrescu wrote:
>>>>That's bad API design, pure and simple. The function should e.g.
>>>>return the string including the line terminator, and only return
>>>>an empty (or null) string upon EOF.
>>>
>>>I'd say it should throw upon EOF as it is pretty high-level
>>>convenience function.
>>
>>I disagree.  Exceptions should never be used for flow control, so the
>>rule is to throw on exceptional occurrences ONLY, not on something
>>that will ALWAYS eventually happen.
>[...]
>
>    while (!file.eof) {
>            auto line = file.readln(); // never throws
>            ...
>    }

For a file this is implementable (without a buffer) but not for a
socket or similar source/stream where a read MUST be performed to
detect EOF.  So, if you're implementing a line reader over multiple
sources, you would need to buffer.  Not the end of the world, but
definitely more complicated than just returning a null, no?
[...]

This is actually a very interesting issue to me, and one which I've
thought about a lot in the past. There are two incompatible (albeit with
much overlap) approaches here. One is the Unix approach where EOF is
unknown until you try to read past the end of a file (socket, etc.), and
the other is where EOF is known *before* you perform a read.

Personally, I prefer the second approach as being conceptually cleaner:
an input stream should "know" when it doesn't have any more data, so
that its EOF state can be queried at any time. Conceptually speaking one
shouldn't need to (try to) read from it before realizing there's nothing
left.

However, I understand that the Unix approach is easier to implement, in
the sense that if you have a network socket, it may be the case that
when you attempt to read from it, it is still connected, but before any
further data is received, the remote end disconnects. In this case, the
OS can't reasonably predict when there will be more incoming data, so
you do have to read the socket before finding out that the remote end
is going to disconnect without sending anything more.

In terms of API design, though, I still lean towards the approach where
EOF is always query-able, because it leads to cleaner code. This can be
implemented on Posix by having .eof read a single byte (or whatever unit
is expected) and buffering it, and the subsequent readln() takes this
buffering into account. This slight complication in implementation is
worth achieving the nicer user-facing API, IMO.
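
The peek-and-buffer idea described above could be sketched roughly as follows in D. This is only an illustration under stated assumptions, not how std.stdio implements it: PeekingLineReader and its readByte delegate are hypothetical names, with readByte standing in for a blocking read on a socket or pipe that returns -1 once the source is exhausted.

```d
import std.stdio : write;

/// Hypothetical wrapper, not part of Phobos. `readByte` stands in for a
/// blocking read on a socket or pipe and returns -1 at end of stream.
struct PeekingLineReader
{
    int delegate() readByte;
    int pending = -2; // -2: nothing buffered, -1: end of stream seen

    /// Answers the EOF query by reading one byte ahead and keeping it.
    bool eof()
    {
        if (pending == -2)
            pending = readByte();
        return pending == -1;
    }

    /// Returns the next line including its terminator. Never throws;
    /// returns an empty string once eof() is true.
    string readln()
    {
        char[] line;
        while (!eof())
        {
            immutable c = cast(char) pending;
            pending = -2; // consume the buffered byte
            line ~= c;
            if (c == '\n')
                break;
        }
        return line.idup;
    }
}

void main()
{
    string data = "one\ntwo\n";
    size_t pos = 0;
    // In-memory stand-in for a real stream, one byte per call.
    int delegate() next = () => pos < data.length ? cast(int) data[pos++] : -1;

    auto reader = PeekingLineReader(next);
    while (!reader.eof)
        write(reader.readln());
}
```

Note that in this sketch eof() may block on a socket until a byte arrives or the peer disconnects, which is the cost of making EOF query-able before the read.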

I don't agree that the user-facing API is nicer. It is more complex in both concept and implementation.

API #1: 1 function, readline(), returns null on EOF. You call readline() and check the result for null. The check naturally follows the attempt to read, which is the task you are trying to accomplish. Simple, straightforward.

API #2: 2 functions, readline() throws on EOF, isEof() checks for EOF. Your purpose is to read lines, so you call readline(), and it is easy to forget to call isEof(). Coding the example loop above requires you to think about EOF /before/ you read a line, and this is not how people think. This API is therefore more complex and less intuitive, for no gain.

So, having a usable null state allows the simpler, more direct API. Lack of it requires a more complicated design and a more complicated implementation.
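
To make the comparison concrete, here is a rough sketch of both APIs in D. The types NullOnEof and ThrowOnEof are hypothetical, and each reads from an in-memory array of lines rather than a real file or socket, so treat this as an illustration of the calling pattern only.

```d
import std.stdio : write;

/// Hypothetical API #1: one function, returns null at EOF.
struct NullOnEof
{
    string[] lines;

    string readline()
    {
        if (lines.length == 0)
            return null;
        auto line = lines[0];
        lines = lines[1 .. $];
        return line;
    }
}

/// Hypothetical API #2: readline() throws at EOF, isEof() must be asked first.
struct ThrowOnEof
{
    string[] lines;

    bool isEof() const
    {
        return lines.length == 0;
    }

    string readline()
    {
        if (isEof)
            throw new Exception("read past EOF");
        auto line = lines[0];
        lines = lines[1 .. $];
        return line;
    }
}

void main()
{
    // API #1: read, then check the result.
    auto a = NullOnEof(["first\n", "\n", "last\n"]);
    string line;
    while ((line = a.readline()) !is null)
        write(line);

    // API #2: check for EOF, then read.
    auto b = ThrowOnEof(["first\n", "\n", "last\n"]);
    while (!b.isEof)
        write(b.readline());
}
```

One D-specific caveat: the check has to be `is null`, not `== null`, because `==` compares array contents and would treat an empty string as equal to null.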

R

--
Using Opera's revolutionary email client: http://www.opera.com/mail/
