Anne van Kesteren wrote:
On Mon, 21 Sep 2009 17:39:14 +0200, Per-Erik Brodin wrote:
So what you are saying is that "\r\n" will always be a Windows line
ending and never a Mac line ending followed by a Unix line ending?

That's what should happen as that would be consistent with other text formats, e.g. text/html. I guess this should be stated below the ABNF or the ABNF should be rewritten to a more parser/state-like thingy.

I'm envisioning a scenario where event stream data is aggregated from
various sources, and done so improperly so that multiple different line
endings end up in the stream. For example, appending a carriage return
to a string that already ends with a carriage return produces a
different result than appending a line feed to the same string. Since
a new format is being defined, why not make it clean and simple?

Consider the following example:
print "data: hello\r";
print "data: world\r";
print "\n";  # dispatch!
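
To make the difference concrete, here is a minimal sketch of how the example stream splits under the two readings discussed in this thread. It assumes that a blank line is what triggers dispatch; the helper name complete_lines and its crlf_is_one flag are made up for illustration and are not taken from the draft.

```python
import re

# The three print statements above, concatenated into one stream.
stream = "data: hello\r" + "data: world\r" + "\n"


def complete_lines(text, crlf_is_one):
    """Return the fully terminated lines in text (hypothetical helper).

    crlf_is_one=True  : "\r\n" counts as a single line ending.
    crlf_is_one=False : every "\r" and every "\n" ends a line by itself.
    """
    pattern = r"\r\n|\r|\n" if crlf_is_one else r"[\r\n]"
    parts = re.split(pattern, text)
    # The last element is whatever follows the final terminator,
    # i.e. an unterminated remainder, so it is not a complete line.
    return parts[:-1]


def would_dispatch(lines):
    # Assumption: an empty line is what dispatches the event.
    return "" in lines


print(complete_lines(stream, True), would_dispatch(complete_lines(stream, True)))
print(complete_lines(stream, False), would_dispatch(complete_lines(stream, False)))
```

Under the first reading the trailing "\r" and "\n" merge into one line ending, so no blank line is seen and nothing is dispatched. Under the second reading the "\n" produces an empty line and the event is dispatched.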

Keep in mind that we are parsing a continuous stream where data arrives
in chunks. It is entirely possible for a "\r\n" pair to be split
between two chunks. That could be handled by either 1) dispatching an
event immediately on receiving a carriage return and then, when the next
chunk arrives, "remembering" that the last character in the previous
chunk was a carriage return and discarding the first character if it is
a line feed, or 2) not dispatching an event until the character after
the carriage return has been received, which could delay event
dispatch. Both of these options are far from ideal.

The first option should not be too hard to implement, right? Just a simple state variable in the tokenizer.
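
A rough sketch of that first option, assuming the stream has already been decoded to text and is fed to the parser one chunk at a time. The class name LineSplitter and its feed method are hypothetical, chosen only for this illustration.

```python
class LineSplitter:
    """Incremental line splitter that treats "\r\n", "\r" and "\n" as
    line endings, even when a "\r\n" pair straddles two chunks."""

    def __init__(self):
        self._buf = []              # characters of the current, unfinished line
        self._last_was_cr = False   # the single state variable

    def feed(self, chunk):
        """Consume one chunk and return the lines completed by it."""
        lines = []
        for ch in chunk:
            if ch == "\n" and self._last_was_cr:
                # Second half of a "\r\n" pair: the line was already
                # emitted when the "\r" arrived, so drop this character.
                self._last_was_cr = False
                continue
            self._last_was_cr = (ch == "\r")
            if ch in ("\r", "\n"):
                # Emit immediately, without waiting for the next character.
                lines.append("".join(self._buf))
                self._buf = []
            else:
                self._buf.append(ch)
        return lines


splitter = LineSplitter()
print(splitter.feed("data: x\r"))   # line is available as soon as "\r" arrives
print(splitter.feed("\n\r\n"))      # leading "\n" is discarded, then a blank line
```

The line is handed over as soon as the carriage return is seen, so dispatch is not delayed. The cost is that the flag has to survive across calls to feed.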


My point was not that it would be particularly hard to implement.

--
Per-Erik Brodin
Ericsson Research



