RE: -protocol-03

Rainer Gerhards Thu, 19 Feb 2004 10:14:40 -0800

Anton,

thanks again for a great set of comments. Be prepared for a looooong
message - you raised many good points and you deserve a long break to
read my reply ;)


Before I go into detail on your points, please let me elaborate a little
on why the draft is sometimes very specific and lengthy.

I am working in the IT security space, as probably many of you do. If I
look at past and recent vulnerabilities, there are at least two classes:
one is "simple" program bugs and the other one is interoperability
weaknesses/differences. Let's dig down on the second class: Many attacks
are successful because the underlying spec leaves (too) much room for
interpretation. A good example is malware transported via email as
vector. To avoid misunderstanding, I am NOT bashing on the mail-related
RFCs - they are well thought-out and have proven to be big drivers
behind the Internet. However, they were created at a time when the
Internet was a much friendlier space than it is today. In those days,
everybody was interested in getting things going and if something went
wrong, then it was an unintentional bug, not an intentional attack.
Consequently, it paid to try to understand what the other peer meant and
try to correct it so that things work.

Nowadays, I sadly have to admit, a lot of effort is put into trying to
*break* things. Wreck systems. Spread malware. And the like. If you
receive a malformed packet today, chances are greater that it is
intentionally malformed than they are that it is a simple program bug.

Trying to make things work with such malformed packets can actually
cause more trouble than it is worth. A good example was the email
malware vector recently discussed on e.g. the full disclosure mailing
list. As people pointed out there, different programs have different
ways of handling improperly formed MIME messages. Each implementation
tries its best to guess what the sender might have meant and works on
that assumption. Unfortunately, the assumptions are different from
vendor to vendor and program to program (some change even between
program versions). So you do not have a consistent behaviour. Malware
authors can use this to generate malformed MIME encodings which include
executable content, that the ultimate recipient (MUA) actually
interprets as executable. But they may do it in a way that the
malware-scanning MTA interprets differently. Thus, it does not think the
message contains executable code and does not scan it (it may not even
see a file at all). The end result is that malware is delivered to the
MUA (and executed there) because of different interpretations of a
malformed MIME encoding - different implementations implemented in the
MTA and MUA.

Keep in mind that this is not a hypothetical theory, but a practical
example that has been discussed (and, to the best of my knowledge,
exploited!) in the wild.

What does this lead me to? I am of the very strong opinion that, in
today's Internet, a specification must very precisely define what it
considers correct and what incorrect. And I think it should also provide
detailed instructions on what should happen to malformed packets. So
the goal should be that messages that are malformed in whatever way
are either not processed at all or at least treated in a consistent
way. Of course, even with such a detailed spec, not everybody will
actually implement it in this way, but chances are much greater that
most will do. Plus, somebody emitting invalidly formatted messages will
quickly be punished by not being understood by their peers. In today's
highly connected world, this should probably put enough pressure on the
misbehaving implementation to change - at least I hope so.

Before somebody else raises hands ... yes, I know I am in quite some
contrast to Jon Postel's "Be liberal in what you accept, and
conservative in what you send.". I don't claim I am wiser than Jon was.
I am in no way. In fact, there are many people on this list a lot
smarter than me (thanks for all their comments and letting me learn).
But I have to admit that I consider Jon's wisdom as wisdom from another
time (when the Internet was a friendly place).

If I look at today's highly "brutal" Internet, is it really a good idea
to assume that everybody is friendly and you try to help them out if
they made a little mistake? Or is it probably better to think "if this
guy sends me invalid parameters, I'd better tell him to do it right or
go away (as he is probably trying to probe me for weaknesses)". I have
to admit, I think the latter is the case.

While searching for actual cases, I found this thread here:

http://www.netsys.com/ietf/1999/5245.html

It is stated there:

###
Almost every time I see the "be liberal..." quote, the context in
which it is said it dropped. I belive the context is MORE IMPORTANT
than the quoted text. The context is achieving interoperability.
Interoperability is more important than any other thing.
###

This is a 1999 statement, from a time when the Internet was much
healthier than it is today (though its health had already begun to
diminish...). I think a very good summary of the spirit behind Jon's
quote is:

"Interoperability is more important than any other thing."

This is what I am questioning today. I think in today's Internet,
security is the most important thing and then comes interop ... and then
the rest. So I think it is a valid point to require stricter adherence
to standards than in the past - which in turn requires the standards to
be more precise in what is valid and what is not.

I guess I am asking for quite some (not so positive) feedback, but I
am ready to take it...

OK, now to the issues themselves (please read on ;))

> 1. I don't think you specified maximum message size anywhere.
>  Or did I
> miss it?

No, that was an error I made. I rush-edited the draft to meet the ID
submission deadline. Unfortunately I deleted one paragraph too many.
That will be back in -04 and the max size will be 1280 if nobody
objects. Apologies for any confusion... (I've put this info on my web
site and into my announcement mail, but it is easy to overlook).
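
To make the size limit concrete, here is a minimal sketch in Python of
how a receiver could enforce it. The value 1280 is the one proposed
above; that it counts octets, the function name, and the choice to
reject (rather than truncate) oversize messages are assumptions of this
sketch, not something the draft text quoted here specifies.

```python
MAX_MESSAGE_OCTETS = 1280  # value proposed for -04; unit (octets) assumed


def check_message_size(raw: bytes) -> bytes:
    """Return the message unchanged, or raise if it exceeds the limit.

    Rejecting is an assumption of this sketch; a real receiver might
    be required to truncate or to log a diagnostic instead.
    """
    if len(raw) > MAX_MESSAGE_OCTETS:
        raise ValueError(
            "message is %d octets, limit is %d" % (len(raw), MAX_MESSAGE_OCTETS)
        )
    return raw
```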

> 2. You talk a lot about server parsing the syslog message.
> Are we making
> an assumption that receiver must parse messages (for example 5th
> paragraph in 4.1.1)?

I think you have a point here: I missed receivers that are not
interested in parsing a message at all. However, this case should be
very, very rare because without parsing, there is nothing else that
you can do with the message - you have just a bunch of octets
without semantics. OK, you can use this to write a raw log, but that's
it. To assign any meaning to the message, you MUST parse it - no
way around it.

I can add a section for non-parsing receivers, but that won't reduce the
size of the draft ;) I am not sure if it pays to specify this very
uncommon case.

> Generally, there seems to be a lot of stuff
> dictating behavior for receiver. Why can't we just say: this
> is valid on
> the wire, this is not and leave it up to implementation to decide what
> it does with it.

See my intro ;)

> For example, we ask receivers to "log" diagnostic
> messages.  What does "log" mean here?

I assumed that "log" is a fairly well-known, well-defined term. If we
really need to define it, I can do this in the -protocol context, but
again this adds size.

To me, "to log" means to somehow persist a notification of an important
event that happened. How this is persisted and what the exact format is
purely depends on the application and environment. I have seen other
RFCs, too, that just say "SHOULD log a diagnostic message", so I still
assume this needs no further explanation. But as I said, I can add it
(but it would not be very detailed because that is actually not a
syslog issue).

> 3. You use enterprise ID of 0 (IETF) in examples.  Is this valid?
> Should we say 0 is reserved and should never be used?  What
> does 0 mean?

Enterprise ID is still a placeholder. I would have removed it if David
had not said he has some more good comments. I still see no real demand
for it, so it still is on my "to be removed list" (or probably better to
be moved to the origin structured data element, but that's another
topic).

>
> 4. Section 4.1.3.  "Any implementation MUST support free configuration
> of the FACILITY on the sender."  I think by implementation you are
> always assuming a dedicated sender or receiver library
> product.  I don't
> see why I can't just implement sending logic in my app
> directly and not
> have a fixed facility.  I think at best, this is a SHOULD.

I agree that a SHOULD is more appropriate here, we cannot actually
enforce this. It's neither a protocol nor a security issue if it is not
configurable. It can be a big drawback for the operator, but they can
choose to dictate this by not purchasing products which don't support
free configuration. Agree, it MUST be a SHOULD ;)

But I think it is a strong SHOULD, not a weak one. Because I have seen
soooo much trouble in the real world from the inability of some
products to provide free configuration.

>
> 5. Section 4.1.6 - Hostname.  So, we specify FQDN and if to
> present IP.
> Not sure if we had a discussion on this, but did we decide to bypass
> hostname?

Yes, because it is not meaningful in most cases. I think this is the
best pointer to previous discussion:
http://www.syslog.cc/ietf/autoarc/msg00715.html.

> I think it will be a common case where hostname is present,
> but machine does not know its domain suffix. I would
> generally prefer IP
> unless it is dynamic (DHCP).

OK, I see the point.

Probably it is best to ask the sender to provide
a) FQDN
b) static IP
c) hostname only (if on dynamic ip)
d) dynamic IP
e) oops... what if it knows nothing? Well, it should, so this is a
non-case. Right? Or is it worth another paragraph (like putting
"DumbDevice" or "127.0.0.1" into the hostname)?

Of course, this is a sequence of SHOULDs, not MUSTs.
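
As a rough illustration of the a) to d) preference order, a sender
could pick its hostname field like this. This is only a sketch in
Python: the function and parameter names are made up for illustration,
and how the sender learns whether its address is static or dynamic is
left out entirely.

```python
from typing import Optional


def choose_hostname(
    fqdn: Optional[str] = None,
    static_ip: Optional[str] = None,
    short_hostname: Optional[str] = None,
    dynamic_ip: Optional[str] = None,
) -> Optional[str]:
    """Return the first available identifier in the order a) to d)."""
    for candidate in (fqdn, static_ip, short_hostname, dynamic_ip):
        if candidate:
            return candidate
    # Case e): the sender knows nothing about itself. What to emit here
    # is still an open question, so this sketch just returns None.
    return None
```

A sender that knows only its short name and a DHCP-assigned address
would end up with the short name, per c) before d).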

>
> 6. Section 4.2.  Which version of Unicode do we support? UTF8 may
> support all, but I think we should limit it to some basic Unicode
> version.

I think the UTF-8 specification I am quoting is a precise description.
If it isn't, I am not educated enough on Unicode to notice this. I would
appreciate it if someone with in-depth knowledge could comment. I agree
we should precisely define what we expect to see.

>
> 7. Section 4.3.  Why do we restrict sec-frac to 3 digits?  If somebody
> has better precision why not allow it?

It's actually limited to *6* characters - I've just gone over the draft
looking for a typo, but I found no reference to 3. Can you point me to
it? But: it is limited to 6 because a) it is hard to envision that this
is not sufficient and b) we had comments on the list that we should
limit field sizes to reasonable values so that a parser can be built in
a more robust manner. I think the size of 6 serves both needs (plus, I
don't trust a nanosecond time source on a normal computer without an
atomic clock).
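
To show why a bounded field helps a parser, here is a small Python
sketch of a strict sec-frac check. It assumes the field is written as a
"." followed by 1 to 6 decimal digits; the exact grammar is in the
draft, not in this mail, so treat the pattern and the function name as
illustrative only.

```python
import re

MAX_SECFRAC_DIGITS = 6  # the limit discussed above

# Assumed shape: a dot followed by 1..6 ASCII digits, nothing else.
_SECFRAC = re.compile(r"\.([0-9]{1,%d})" % MAX_SECFRAC_DIGITS)


def parse_secfrac(text: str) -> int:
    """Return the fraction in microseconds; raise ValueError if malformed."""
    match = _SECFRAC.fullmatch(text)
    if match is None:
        raise ValueError("malformed sec-frac: %r" % text)
    digits = match.group(1)
    # Right-pad with zeros so that ".5" becomes 500000 microseconds.
    return int(digits.ljust(MAX_SECFRAC_DIGITS, "0"))
```

Because the field can never exceed six digits, the result always fits
in a small integer and anything longer is rejected outright instead of
being guessed at.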

>
> 8. Section 6.2.3.  I don't think you explain the purpose for allowing
> partcount to grow.  I assume this is for streaming.  Needs to be
> explained.  I also think it is a strange scheme.  Why don't you allow
> incrementing it by one every time?

Actually, I based this on some ideas that I took from you - possibly
not correctly understood. My reasoning is as follows:

You said that we must support devices that do not know in advance how
many message part messages will be needed to send a single oversize
message (for streaming, as you mention). So we should allow a client to
increase the number of message part messages it needs to transmit the
original message. This means the partcount must grow (but it can grow
incrementally).

On the other hand, if my device knows in advance how many message part
messages are needed, it is a plus to pass this knowledge to the
receiver. The receiver can possibly use this knowledge for buffer
allocation and other internal processing (whatever this is). There is
also a plus that loss of the last message(s) in a multi-part message can
be detected in a scheme where the number of messages is known in
advance.

This can't be done if the partcount is just being incremented. If it
were just incremented, it would be a redundant counter which we
could remove. In fact, I see the streaming case as an acceptable
degradation of the scheme (acceptable because its usefulness outweighs
its drawbacks). It definitely loses reliability. So we need to be able
to support a scheme where, from the very first message part message, the
partcount is fully correct.

To support both schemes, I could have specified that the partcount must
either be incrementing OR never change. I decided to let it grow, but
not demand it to grow incrementally. This allows a streaming
implementation that has learned something more about the expected number
of fragments to pass this information on as soon as it knows it. There
may be benefits a recipient can gain from such advance knowledge. The
only thing I cannot allow is to shrink the number of messages, as this
looks like something went wrong (a device without a clue should not
increase the partcount to a value higher than it knows it will send - if
it is in doubt, it can still resort to just incrementing). Allowing it
to grow - but not necessarily increment - still provides us the ability
to detect missing end message(s) when the sender was aware of the actual
number of messages and the receiver could receive at least one message
with this information before the message loss occurred.

In short: allowing growth, but not requiring incrementing provides added
reliability and possibly other (secondary) benefits.
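
The receiver side of this rule can be sketched in a few lines of
Python. This is an illustration under assumptions, not draft text: the
class name, the "partnum" name, and numbering parts from 1 are all
hypothetical; only "partcount" and the grow-but-never-shrink rule come
from the discussion above.

```python
from typing import List, Set


class PartCountTracker:
    """Tracks the announced partcount for one multi-part message."""

    def __init__(self) -> None:
        self.partcount = 0             # highest partcount announced so far
        self.received: Set[int] = set()  # part numbers seen so far

    def accept(self, partnum: int, partcount: int) -> None:
        # The partcount may stay the same or grow, but never shrink.
        if partcount < self.partcount:
            raise ValueError(
                "partcount shrank from %d to %d" % (self.partcount, partcount)
            )
        if not 1 <= partnum <= partcount:
            raise ValueError("partnum %d outside 1..%d" % (partnum, partcount))
        self.partcount = partcount
        self.received.add(partnum)

    def missing(self) -> List[int]:
        """Part numbers announced but not (yet) received."""
        return [n for n in range(1, self.partcount + 1) if n not in self.received]
```

If a sender announces 3 parts up front and only parts 1 and 2 arrive,
missing() reports [3]. A streaming sender that merely increments
(1 of 1, 2 of 2) leaves missing() empty even if a third part was lost,
which is exactly the reliability loss described above.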

I also think this scheme is not uncommon, I think I have seen similar
schemes in other protocols (but I have to admit I have no pointers at
hand).

>
> 9. Section 7.2.  Can we use "yes" or "no" instead of "0" and "1"?

I, too, thought about this. With respect to the message size limitation,
I decided to go for the shorter form. "0" and "1" should be fairly clear
as boolean indicators. But I will gladly change this if that is the
consensus on this list.

>
> 10. Section 8.1. If relay can't add structured data elements, it can't
> record source IP of the message.  I think we should not lose such
> information. Also, need to allow for recording of time or original
> reception.

That's a tough issue. Will it cost us digital signatures? Which goal
has the higher priority? Or should we work around this issue? Of course
it's doable, but only at the expense of growing the spec. We would
"simply" need to define a way that a relay can add a container
structured data element, which in turn could contain other structured
data elements - and which is clearly flagged as not being part of the
original message so that a signature verifier could remove that part.
This also sounds like a call, again, for XML ;)

It's doable, but it will add considerable complexity.

On the other hand, we can allow a relay to break signatures. I am not
sure if that is a good mode.

Or we do not allow it to modify the message (as currently specified),
but then we are not able to save the information that you request - and
I agree this information is valuable...

>
> 11. Section 8.3. You allow relay to break message into multiple parts.
> What happens with a message that is already multi-part?  How do you
> distinguish first level of fragmentation from second?

Actually, I think that is easier than it first looks. I specified that a
message part message is a full syslog message in its own right. As
such, you can apply all rules applying to any syslog message to a
message part message, too (because it is a regular message once it has
been formed). At least this is the spirit that I had in mind.

So, when a (message part) message is disassembled (being broken into
parts), the multi-part message headers must also be disassembled. Then
an additional multi-part message header is added. We end up with a
message that, after reassembly, becomes the original message part
message, thus in itself a message that must be reassembled to become the
original message.

In possibly more familiar terms: let's use "fragmentation" for a
moment. If a message fragment becomes further fragmented, an additional
fragmentation header is added and the fragments of this message will
then travel as a fragmented part of a fragmented message. Double
fragmentation. Obviously, there is quite some overhead.
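
A toy model in Python may make the double fragmentation easier to see.
The "num/count|" header used here is invented purely for this sketch
and has nothing to do with the actual header format in the draft; the
point is only that a fragment is treated as an ordinary message when it
is split again.

```python
from typing import List


def split_message(message: str, payload_size: int) -> List[str]:
    """Split a message into parts, each with a toy 'num/count|' header."""
    chunks = [
        message[i:i + payload_size]
        for i in range(0, len(message), payload_size)
    ]
    count = len(chunks)
    return ["%d/%d|%s" % (num, count, chunk)
            for num, chunk in enumerate(chunks, start=1)]


def join_parts(parts: List[str]) -> str:
    """Strip ONE level of headers and reassemble in part-number order."""
    numbered = []
    for part in parts:
        header, payload = part.split("|", 1)  # only the outermost header
        num, _count = header.split("/")
        numbered.append((int(num), payload))
    return "".join(payload for _num, payload in sorted(numbered))


original = "an oversize syslog message"
first_level = split_message(original, 10)         # done by the sender
second_level = split_message(first_level[0], 6)   # a relay splits part 1 again

# Reassembling the relay's parts yields the sender's part 1, header and all,
# which in turn must be reassembled with its siblings.
assert join_parts(second_level) == first_level[0]
assert join_parts(first_level) == original
```

Each level of splitting adds its own header to every part, which is
where the overhead comes from.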

I consider relays splitting messages into multi-part messages as a last
resort when there is no other way to handle the situation. It is
definitely NOT desirable.

> 12. The ID is now 45 pages long and growing with every revision.  I
> think it would help if we shortened it whenever possible.
> After all it
> is just a syslog protocol.  This is protocol used for troubleshooting.
> It can't be itself overly complicated or give such impression.

Well ... actually it gets larger with each of your well-thought-out
comments ;)

Honestly, I'd like to keep it short. But just look at this mail. How
often do you rightly ask if we can specify something more precisely? So
how can we shrink it and also make it more precise? Well, I assume we
can save some pages if a good native English editor goes over it. In
general, however, I think it is expanding. Of course, we can move all
the specifics and clarifications out into a separate web page, but what
exactly is the value of this? And who guarantees that an implementor
will visit these pages?

Of course, the growth of the document is also related to my attempt to
keep things tightly bounded (see my intro). But again, if you look at the WG
mailing list archive, you will see lots of comments that "this and that"
is unclear and needs to be specified. If we do, the spec obviously grows
;)

Despite its growing size, I think it is still moderately easy to
implement. There are few things that are actually required. The rest of
the content is hints to implementors, which I assume make it easier
to implement syslog - and make implementations more robust, as I try to
point out those issues that are known to cause security problems in
the real world.

Also, in my experience, the more is specified in depth, the easier it is
to implement the spec, because the developer does not need to think that
hard ... He or she will find most things needed inside the spec.

So for now, I think it is not a bad thing that it is growing ... but I
may be wrong ;)

Comments on everything highly appreciated...
Rainer
