Truncation of UTF-8 is actually slightly worse than has been described.

It is possible to determine from the UTF-8 octets where one coded character ends
and another begins.  But because Unicode contains combining characters, with no
limit on how many of these there can be, and these modify the meaning of
previous or later coded characters, it is not possible to determine where one
'symbol' ends.  So truncation at a UTF-8 boundary could subtly change the
meaning of a message, or even breach security.  This is not something we can
guard against, but we should mention it.
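
To make the point concrete, here is a minimal Python sketch (not taken from the
draft; the function names are made up for illustration).  It assumes the input
is already valid UTF-8.  The first function cuts only at a coded-character
boundary; the second is a rough check for whether the cut separated a base
character from a following combining mark.  It only looks at the canonical
combining class, so it will not catch every kind of modifier.

```python
import unicodedata


def truncate_utf8(data: bytes, max_octets: int) -> bytes:
    """Return at most max_octets octets without splitting a coded character.

    Assumes `data` is valid UTF-8 and max_octets >= 0.
    """
    if len(data) <= max_octets:
        return data
    end = max_octets
    # Continuation octets look like 10xxxxxx.  If the octet just past the
    # cut is one, the cut is inside a character, so back up to its lead octet.
    while end > 0 and (data[end] & 0xC0) == 0x80:
        end -= 1
    return data[:end]


def cut_drops_combining_mark(data: bytes, cut: int) -> bool:
    """Rough check: is the first character after the cut a combining mark?

    `cut` must be a coded-character boundary in valid UTF-8.
    """
    rest = data[cut:].decode("utf-8")
    return bool(rest) and unicodedata.combining(rest[0]) != 0


# "cafe" followed by U+0301 COMBINING ACUTE ACCENT: 4 + 2 octets.
original = "cafe\u0301".encode("utf-8")
kept = truncate_utf8(original, 5)
# kept is valid UTF-8, but the accent that modified the final "e" is gone.
```

The result is still well-formed UTF-8, which is all a receiver can check
cheaply, yet the visible 'symbol' has changed.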

Tom Petch

----- Original Message -----
From: "Rainer Gerhards" <[EMAIL PROTECTED]>
To: "Anton Okmianski (aokmians)" <[EMAIL PROTECTED]>
Cc: <[EMAIL PROTECTED]>
Sent: Wednesday, January 11, 2006 11:30 AM
Subject: RE: [Syslog] Sec 6.1: Truncation


Anton and all,

I have now changed section 6.1 to:

###
6.1.  Message Length

   Syslog message size limits are dictated by the syslog transport
   mapping in use.  There is no upper limit per se.  Each transport
   mapping MUST define the minimum required message length support.  Any
   syslog transport mapping MUST support messages of up to and including
   480 octets in length.

   Any syslog receiver MUST be able to accept messages of up to and
   including 480 octets in length.  All receiver implementations SHOULD
   be able to accept messages of up to and including 2048 octets in
   length.  Receivers MAY receive messages larger than 2048 octets in
   length.  If a receiver receives a message with a length larger than
   it supports, the receiver MAY discard the message or truncate the
   payload.

   If a receiver truncates messages, the truncation MUST occur at the
   end of the message.  UTF-8 encoding and STRUCTURED-DATA MUST be kept
   valid during truncation.  SD-ELEMENTs MUST NOT partly be truncated.
   If an SD-ELEMENT is to be truncated, the whole SD-ELEMENT MUST be
   deleted.  If the last SD-ELEMENT of a message is deleted, the
   STRUCTURED-DATA field MUST be changed to NILVALUE.
###
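
As a rough illustration of the truncation paragraph above, here is a Python
sketch.  It is only one possible reading, not normative text.  It assumes the
message has already been parsed into HEADER, a list of SD-ELEMENTs and MSG,
that the fields are joined by a single SP, and that NILVALUE is "-".  The
function names are hypothetical.

```python
NILVALUE = b"-"  # assumption: NILVALUE is a single hyphen


def _utf8_cut(data: bytes, max_octets: int) -> bytes:
    """Cut valid UTF-8 to at most max_octets without splitting a character."""
    if max_octets <= 0:
        return b""
    if len(data) <= max_octets:
        return data
    end = max_octets
    while end > 0 and (data[end] & 0xC0) == 0x80:
        end -= 1
    return data[:end]


def assemble(header: bytes, sd_elements: list, msg: bytes) -> bytes:
    """Join the fields with SP; an empty SD-ELEMENT list becomes NILVALUE."""
    sd = b"".join(sd_elements) if sd_elements else NILVALUE
    parts = [header, sd]
    if msg:
        parts.append(msg)
    return b" ".join(parts)


def truncate_message(header: bytes, sd_elements: list, msg: bytes, limit: int):
    """Truncate from the end of the message; return None to mean 'discard'."""
    full = assemble(header, sd_elements, msg)
    if len(full) <= limit:
        return full
    # The end of the message is MSG, so shorten MSG first, keeping UTF-8 valid.
    without_msg = assemble(header, sd_elements, b"")
    room = limit - len(without_msg) - 1  # one octet for the SP before MSG
    if room > 0:
        return assemble(header, sd_elements, _utf8_cut(msg, room))
    # MSG is gone entirely; now delete whole SD-ELEMENTs, last one first.
    elements = list(sd_elements)
    while elements and len(assemble(header, elements, b"")) > limit:
        elements.pop()
    result = assemble(header, elements, b"")
    # The HEADER is never shortened; if it does not fit, discard instead.
    return result if len(result) <= limit else None
```

With header b"HEADER", elements [a x="1"] and [b y="2"], and a short MSG, a
limit of 20 octets leaves b'HEADER [a x="1"]', and a limit of 10 octets leaves
b"HEADER -".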

I have explicitly stated that there is no intrinsic upper size limit. I
did this, because we had so much confusion/misunderstanding on that fact
in the past. I've also added some details on truncation. The rest is as
suggested by Anton :)

Please review and comment.

Rainer

> -----Original Message-----
> From: [EMAIL PROTECTED]
> [mailto:[EMAIL PROTECTED] On Behalf Of Rainer Gerhards
> Sent: Monday, January 09, 2006 4:49 PM
> To: Anton Okmianski (aokmians)
> Cc: [EMAIL PROTECTED]
> Subject: RE: [Syslog] Sec 6.1: Truncation
>
> > Rainer:
> >
> > I agree - this is better than a convoluted rule.
> >
> > I think we only have any business in defining truncation for
> > relays.  For collectors, we have tried to stay away from
> > describing how messages are stored.
> >
> > For relays, I think it would be useful to state that relay
> > can't just drop arbitrary message parts. Your statements
> > about "some parts ... are lost" may be interpreted that way.
>
> Actually, this was what I meant ;) [I saw a number of use cases where
> it would make sense to strip some known-not-so-relevant SD-IDs], but ...
> >
> > I would recommend that we state that any truncation must
> > happen at the end of the message, which I think is what
> > truncation means to a lot of people anyway. This would
> > prevent an implementation which prefers to throw out
> > STRUCTURED-DATA before the MSG content.  A consistent
> > behavior is useful for interop and, in particular, may help
> > in dealing with security issues.
>   ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
> ... this is more important. I now agree with your point.
>
> As a side-note, we had the idea that relay operations may become a
> separate document, so I would prefer not to dig too deep into relay
> behaviour. To specify what you recommend, this is not necessary, so
> this is not really a discussion topic here.
>
> Rainer
> >
> > Thanks,
> > Anton.
> >
> > > -----Original Message-----
> > > From: Rainer Gerhards [mailto:[EMAIL PROTECTED]
> > > Sent: Monday, January 09, 2006 3:21 AM
> > > To: Anton Okmianski (aokmians)
> > > Subject: RE: [Syslog] Sec 6.1: Truncation
> > >
> > > Anton, Darren,
> > >
> > > I agree that the truncation rule is probably not really
> > > useful, even confusing. I think it is hard to predict for any
> > > potential message if the more interesting content is in
> > > STRUCTURED-DATA or in the MSG part.
> > > For example, with our current SD-IDs, I'd prefer to truncate
> > > them instead of MSG. Obviously, the case is different for
> > > your LINKDOWN sample. I also agree with Darren that
> > > truncation probably happens on the transport layer, the
> > > application may not even see the full message.
> > >
> > > My conclusion, however, is slightly different: I recommend
> > > now that we remove truncation rules from -protocol. We should
> > > just say that truncation might happen and that in this case
> > > some parts of the message are lost - which parts is at the
> > > discretion of the receiver.
> > >
> > > Rainer
> > >
> > > > -----Original Message-----
> > > > From: [EMAIL PROTECTED]
> > > > [mailto:[EMAIL PROTECTED] On Behalf Of Anton Okmianski
> > > > (aokmians)
> > > > Sent: Friday, January 06, 2006 9:48 PM
> > > > To: [EMAIL PROTECTED]
> > > > Subject: [Syslog] Sec 6.1: Truncation
> > > >
> > > > Rainer and all:
> > > >
> > > > I started reading draft #16. Since we are revisiting everything... I
> > > > am not very comfortable with the current truncation rules.
> > > >
> > > > "Receivers SHOULD follow this order of preference when it comes to
> > > > truncation:
> > > >
> > > >  1) No truncation
> > > >  2) Truncation by dropping SD-ELEMENTs
> > > >  3) If 2) not sufficient, truncate MSG"
> > > >
> > > > I don't think that this is a good recommendation.  I would assume
> > > > that in many cases people would try to tokenize more important data
> > > > into structured data and use message text as a secondary
> > > > user-friendly description. For example, for a LINK_DOWN message, I
> > > > would probably encode the link ID in the structured elements as this
> > > > is something that should be readily available for receivers. The
> > > > MSGID could be "LINK_DOWN" and the MSG text may simply be "Link
> > > > down".  If you truncate the structured data, it makes it harder.
> > > >
> > > > I also think, in general it is useful to put more important data
> > > > first, which is another reason for putting more valuable data into
> > > > structured data in a more compact way.
> > > >
> > > > Additionally, structured data can be used to provide message length
> > > > or digest, which can help the receiver determine whether a message
> > > > was truncated.
> > > >
> > > > Also, I think this statement is very convoluted:
> > > >
> > > > "Please note that it is possible that the MSG field is truncated
> > > > without dropping any SD-PARAMS.  This is the case if a message with
> > > > an empty STRUCTURED-DATA field must be truncated."
> > > >
> > > > I think I understand what you are driving at, but I don't see it as
> > > > adding any requirements or clarification.
> > > >
> > > > This sentence is not clear although I know what you are trying to
> > > > say:
> > > >
> > > > "The limits below are minimum maximum lengths, not maximum length."
> > > >
> > > > I propose replacing the entire section 6.1 with this text:
> > > >
> > > > "Syslog message limits are dictated by the syslog transport mapping
> > > > in use. Each transport mapping MUST define the minimum required
> > > > message length support. Any syslog transport mapping MUST support
> > > > messages of up to and including 480 octets in length.
> > > >
> > > > Any syslog receiver MUST be able to accept messages of up to and
> > > > including 480 octets in length.  All receiver implementations SHOULD
> > > > be able to accept messages of up to and including 2048 octets in
> > > > length. Receivers MAY receive messages larger than 2048 octets in
> > > > length. If a receiver receives a message with a length larger than
> > > > it supports, the receiver MAY discard the message or truncate the
> > > > payload.
> > > >
> > > > If truncation is performed by the receiver, it MUST first truncate
> > > > the MSG field as necessary to meet the supported length limit. If
> > > > truncation of the entire MSG field is not sufficient, then
> > > > additionally, the STRUCTURED-DATA field MUST be truncated by
> > > > removing one or more SD-ELEMENT fields. A minimum number of
> > > > SD-ELEMENT fields MUST be truncated starting from the end as
> > > > necessary to meet the supported length limit. SD-ELEMENT field can't
> > > > be truncated partially. If all SD-ELEMENT fields are removed,
> > > > NILVALUE MUST be specified for STRUCTURED-DATA field. Truncation of
> > > > HEADER field MUST NOT be performed."
> > > >
> > > > BTW, in your text or mine, what happens if the message is malformed?
> > > > A proxy won't be able to truncate it properly then. We don't want to
> > > > prevent it from truncating it in some way and sending the message
> > > > further, I would think.  At least you will see something at the
> > > > final destination, which may be more useful than nothing. If we just
> > > > made truncation a simple "take the first X octets out of Y octets",
> > > > it would not be an issue, but then a proxy would be allowed to turn
> > > > a well-formed message into a malformed message upon truncation.
> > > >
> > > > What do you think?
> > > >
> > > > Thanks,
> > > > Anton.
> > > >
> > > >
> > > >
> > > >
> > > > _______________________________________________
> > > > Syslog mailing list
> > > > Syslog@lists.ietf.org
> > > > https://www1.ietf.org/mailman/listinfo/syslog
> > > >
> > >
> >
>
> _______________________________________________
> Syslog mailing list
> Syslog@lists.ietf.org
> https://www1.ietf.org/mailman/listinfo/syslog
>

_______________________________________________
Syslog mailing list
Syslog@lists.ietf.org
https://www1.ietf.org/mailman/listinfo/syslog

