Hi Rainer,

I don't know UTF-8 very well. I am under the impression that 0x00 can
occur in UTF-8, in multi-byte character sequences. You've been
researching UTF-8. Can you determine if that's true? If it is, then we
cannot limit octets to the range 1..255.
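
One rough way to test this, as a sketch only: the Python snippet below
encodes every Unicode code point (skipping the surrogates, which UTF-8
cannot represent) and collects those whose encoding contains a 0x00
octet. The function name is made up for illustration, and the check
assumes Python's built-in codec follows the UTF-8 standard.

```python
def code_points_with_nul_byte():
    """Return every Unicode code point whose UTF-8 encoding contains 0x00."""
    hits = []
    for cp in range(0x110000):          # U+0000 .. U+10FFFF
        if 0xD800 <= cp <= 0xDFFF:
            continue                    # surrogates cannot be encoded in UTF-8
        if 0 in chr(cp).encode("utf-8"):
            hits.append(cp)
    return hits


if __name__ == "__main__":
    print(code_points_with_nul_byte())
```

By the UTF-8 definition every octet of a multi-byte sequence has its
high bit set, so the expected result is [0]: only U+0000 itself encodes
to a 0x00 octet. If that holds, limiting octets to 1..255 would exclude
just the NUL character and nothing else.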

dbh

> -----Original Message-----
> From: Rainer Gerhards [mailto:[EMAIL PROTECTED]
> Sent: Friday, February 06, 2004 11:51 AM
> To: Anton Okmianski; Harrington, David; [EMAIL PROTECTED]
> Subject: RE: -international: trailer
>
> Anton:
>
> > I agree with your conclusion that we need to support all
> > Unicode/UTF.  I also think that doing any kind of escaping is
> > generally bad and should be deferred until it is absolutely
> > necessary (like maybe escaping line breaks for storage).
>
> I agree, escaping needs to be done when it goes to the storage
> subsystem - eventually. We always think "text file", though some use
> databases, where this is no issue at all. But that's not the point.
>
> I thought about this issue for a while during the course of the
> day... We can have the storage subsystem escape non-printable
> characters. Obviously, it is up to the storage subsystem how it does
> this. When the data is then read back, the storage subsystem should
> decode the persisted message and provide the original block to e.g.
> the message verifier. So we do not have an issue with -sign.
>
> Obviously, a syslog-storage RFC comes to mind, but I think we are
> busy enough with current discussions ;) Let's take one step at a
> time...
>
> So I am more or less prepared to edit protocol-03 so that all
> characters are allowed, including ASCII control characters.
>
> The thing left that really frightens me is the 0x00 character. I
> know allowing it will break a lot of existing code and make it hard to
> update it to the new format. On the other hand, explicitly allowing it
> will remove a potential security weakness... some of the bad guys may
> have fun sending 0x00, especially when we do not allow it.
>
> I am still tempted to allow only octets in the range of 1..255. ;)
>
> I'd appreciate comments on this issue. If we can resolve it, we can
> settle both this issue and the trailer. And I think we are close
> to doing that.
>
> >
> > I guess this means we can't have a line separator trailer unless we
> > escape all others inside the message.  I really would prefer
> > no escaping.
>
> I agree with this - UTF-8 kills the TRAILER.
>
> On second thought, the trailer was a bad idea to begin with. After
> all, my intention was to have an extra sanity check for the framing -
> but that is a transport issue, not a general format issue (for a
> transport-ignorant message format).
>
> > I think alternatively, a UDP transport can define an
> > optional/required structured element for message length in octets,
> > but it is tricky.
>
> I am in favour of not doing this. As you say, it is tricky - and the
> extra sanity check does not buy us much in UDP (so many things can go
> wrong anyhow...)
>
> Rainer
>

