Anton, thanks again for a great set of comments. Be prepared for a looooong message - you raised many good points and you deserve a long break to read my reply ;)
Before I go into detail on your points, please let me elaborate a little on why the draft is sometimes very specific and lengthy. I am working in the IT security space, as probably many of you do. If I look at past and recent vulnerabilities, there are at least two classes: one is "simple" program bugs and the other is interoperability weaknesses/differences.

Let's dig into the second class: many attacks succeed because the underlying spec leaves (too) much room for interpretation. A good example is malware transported via email. To avoid misunderstanding, I am NOT bashing the mail-related RFCs - they are well thought-out and have proven to be big drivers behind the Internet. However, they were created at a time when the Internet was a much friendlier space than it is today. In those days, everybody was interested in getting things going, and if something went wrong, it was an unintentional bug, not an intentional attack. Consequently, it paid to try to understand what the other peer meant and to correct it so that things worked.

Nowadays, I sadly have to admit, a lot of effort is put into trying to *break* things. Wreck systems. Spread malware. And the like. If you receive a malformed packet today, chances are greater that it is intentionally malformed than that it is a simple program bug. Trying to make things work with such malformed packets can actually cause more trouble than it is worth.

A good example is the email malware vector recently discussed on e.g. the full disclosure mailing list. As people pointed out there, different programs have different ways of handling improperly formed MIME messages. Each implementation tries its best to guess what the sender might have meant and works on that assumption. Unfortunately, the assumptions differ from vendor to vendor and program to program (some even change between program versions). So you do not get consistent behaviour.
Malware authors can use this to generate malformed MIME encodings that include executable content which the ultimate recipient (MUA) actually interprets as executable. But they may do it in a way that the malware-scanning MTA interprets differently. Thus, the MTA does not think the message contains executable code and does not scan it (it may not even see a file at all). The end result is that malware is delivered to the MUA (and executed there) because of different interpretations of a malformed MIME encoding - different implementations in the MTA and the MUA. Keep in mind that this is not a hypothetical scenario, but a practical example that has been discussed (and, to the best of my knowledge, exploited!) in the wild.

Where does this lead me? I am of the very strong opinion that in today's Internet, a specification must define very precisely what it considers correct and what incorrect. And I think it should also provide detailed instructions on what should happen to malformed packets. So the goal should be that messages that are malformed in whatever way are either not processed or at least treated in a consistent way. Of course, even with such a detailed spec, not everybody will actually implement it this way, but chances are much greater that most will. Plus, somebody emitting invalidly formatted messages will quickly be punished by not being understood by their peers. In today's highly connected world, this should probably put enough pressure on the misbehaving implementation to change - at least I hope so.

Before somebody else raises a hand ... yes, I know I am in quite some contrast to Jon Postel's "Be liberal in what you accept, and conservative in what you send.". I don't claim I am wiser than Jon was. I am not, in any way. In fact, there are many people on this list a lot smarter than me (thanks for all their comments and for letting me learn).
But I have to admit that I consider Jon's wisdom to be wisdom from another time (when the Internet was a friendly place). If I look at today's highly "brutal" Internet, is it really a good idea to assume that everybody is friendly and to try to help them out if they made a little mistake? Or is it probably better to think "if this guy sends me invalid parameters, I'd better tell him to do it right or go away (as he is probably trying to probe me for weaknesses)". I have to admit, I think the latter is the case.

While searching for actual cases, I found this thread:

http://www.netsys.com/ietf/1999/5245.html

It is stated there:

###
Almost every time I see the "be liberal..." quote, the context in which it is said it dropped. I belive the context is MORE IMPORTANT than the quoted text. The context is achieving interoperability. Interoperability is more important than any other thing.
###

This is a 1999 statement, from a time when the Internet was much healthier than it is today (though it had already begun to decline...). I think a very good summary of the spirit behind Jon's quote is: "Interoperability is more important than any other thing."

This is what I am questioning today. I think in today's Internet, security is the most important thing, then comes interop ... and then the rest. So I think it is a valid point to require stricter adherence to standards than in the past - which in turn requires the standards to be more precise about what is valid and what is not.

I guess I am asking for quite some (not so positive) feedback, but I am ready to stand it...

OK, now to the issues themselves (please read on ;))

> 1. I don't think you specified maximum message size anywhere.
> Or did I miss it?

No, that was an error I made. I rush-edited the draft to meet the ID submission deadline. Unfortunately, I deleted one paragraph too many. That will be back in -04, and the max size will be 1280 if nobody objects. Apologies for any confusion...
(I've put this info on my web site and into my announcement mail, but it is easy to overlook).

> 2. You talk a lot about server parsing the syslog message.
> Are we making an assumption that receiver must parse messages
> (for example 5th paragraph in 4.1.1)?

I think you make a good point here: I missed receivers that are not interested in parsing a message at all. However, this case should be very, very rare, because without parsing there is nothing else you can do with the message - you just have a bunch of octets without semantics. OK, you can use this to write a raw log, but that's it. To assign any meaning to the message, you MUST parse it - no way around it. I can add a section for non-parsing receivers, but that won't reduce the size of the draft ;) I am not sure if it pays to specify this very uncommon case.

> Generally, there seems to be a lot of stuff dictating behavior
> for receiver. Why can't we just say: this is valid on the wire,
> this is not and leave it up to implementation to decide what
> it does with it.

See my intro ;)

> For example, we ask receivers to "log" diagnostic messages.
> What does "log" mean here?

I assumed that "log" is a fairly well-known, well-defined term. If we really need to define it, I can do this in the -protocol context, but again this adds size. To me, "to log" means to somehow persist a notification of an important event that happened. How this is persisted and what the exact format is depends purely on the application and environment. I have seen other RFCs, too, that just say "SHOULD log a diagnostic message", so I still assume this needs no further explanation. But as I said, I can add it (though it would not be very detailed, because that is actually not a syslog issue).

> 3. You use enterprise ID of 0 (IETF) in examples. Is this valid?
> Should we say 0 is reserved and should never be used? What
> does 0 mean?

Enterprise ID is still a placeholder.
I would have removed it if David had not said he has some more good comments. I still see no real demand for it, so it is still on my "to be removed" list (or, probably better, to be moved to the origin structured data element, but that's another topic).

> 4. Section 4.1.3. "Any implementation MUST support free configuration
> of the FACILITY on the sender." I think by implementation you are
> always assuming a dedicated sender or receiver library product. I
> don't see why I can't just implement sending logic in my app directly
> and not have a fixed facility. I think at best, this is a SHOULD.

I agree that a SHOULD is more appropriate here; we cannot actually enforce this. It's neither a protocol nor a security issue if it is not configurable. It can be a big drawback for operators, but they can choose to dictate this by not purchasing products that don't support free configuration. Agreed, it MUST be a SHOULD ;) But I think it is a strong SHOULD, not a weak one, because I have seen soooo much trouble in the real world caused by the inability of some products to provide free configuration.

> 5. Section 4.1.6 - Hostname. So, we specify FQDN and if to present IP.
> Not sure if we had a discussion on this, but did we decide to bypass
> hostname?

Yes, because it is not meaningful in most cases. I think this is the best pointer to the previous discussion:

http://www.syslog.cc/ietf/autoarc/msg00715.html

> I think it will be a common case where hostname is present,
> but machine does not know its domain suffix. I would generally
> prefer IP unless it is dynamic (DHCP).

OK, I see the point. Probably it is best to ask the sender to provide, in this order of preference:

a) FQDN
b) static IP
c) hostname only (if on dynamic IP)
d) dynamic IP
e) oops... what if it knows nothing?

Well, it should know something, so e) is a no-case. Right? Or is it worth another paragraph (like putting "DumbDevice" or "127.0.0.1" into the hostname)? Of course, this is a sequence of SHOULDs, not MUSTs.

> 6. Section 4.2.
> Which version of Unicode do we support? UTF8 may support all,
> but I think we should limit it to some basic Unicode version.

I think the UTF-8 specification I am quoting is a precise description. If it isn't, I do not know Unicode well enough to notice. I would appreciate it if someone with in-depth knowledge could comment. I agree we should precisely define what we expect to see.

> 7. Section 4.3. Why do we restrict sec-frac to 3 digits? If somebody
> has better precision why not allow it?

It's actually limited to *6* characters - I've just gone over the paper looking for a typo, but I found no reference to 3. Can you point me to it? But: it is limited to 6 because a) it is hard to imagine that this is not sufficient and b) we had comments on the list that we should limit field sizes to reasonable values so that a parser can be built in a more robust manner. I think the size of 6 serves both needs (plus, I don't trust a nanosecond time source on a normal computer without an atomic clock).

> 8. Section 6.2.3. I don't think you explain the purpose for allowing
> partcount to grow. I assume this is for streaming. Needs to be
> explained. I also think it is a strange scheme. Why don't you allow
> incrementing it by one every time?

Actually, I based this on some ideas that I took from you - possibly not correctly understood. My reasoning is as follows: you said that we must support devices that do not know in advance how many message part messages will be needed to send a single oversize message (for streaming, as you mention). So we should allow a client to increase the number of message part messages it needs to transmit the original message. This means the partcount must be able to grow (and it can grow incrementally). On the other hand, if my device knows in advance how many message part messages are needed, it is a plus to pass this knowledge to the receiver.
The receiver can possibly use this knowledge for buffer allocation and other internal processing (whatever this is). It is also a plus that loss of the last message(s) in a multi-part message can be detected in a scheme where the number of messages is known in advance. This can't be done if the partcount is just being incremented. If it were just incremented, it would be a redundant counter which we could remove.

In fact, I see the streaming case as an acceptable degradation of the scheme (acceptable because its usefulness outweighs its drawbacks). It definitely loses reliability. So we need to be able to support a scheme where the partcount is fully correct from the very first message part message on. To support both schemes, I could have specified that the partcount must either be incrementing OR never change. I decided to let it grow, but not to demand that it grow incrementally. This allows a streaming implementation that has learned something more about the expected number of fragments to pass this information on as soon as it knows it. There may be benefits a recipient can gain from such advance knowledge.

The only thing I cannot allow is shrinking the number of messages, as this looks like something went wrong (a device without a clue should not increase the partcount to a value higher than it knows it will send - if it is in doubt, it can still resort to just incrementing). Allowing it to grow - but not necessarily increment - still gives us the ability to detect missing end message(s) when the sender was aware of the actual number of messages and the receiver received at least one message with this information before the message loss occurred.

In short: allowing growth, but not requiring incrementing, provides added reliability and possibly other (secondary) benefits. I also think this scheme is not uncommon; I think I have seen similar schemes in other protocols (but I have to admit I have no pointers at hand).

> 9. Section 7.2.
> Can we use "yes" or "no" instead of "0" and "1"?

I, too, thought about this. With respect to the message size limitation, I decided to go for the shorter form. "0" and "1" should be fairly clear as boolean indicators. But I will gladly change this if that is the consensus on this list.

> 10. Section 8.1. If relay can't add structured data elements, it can't
> record source IP of the message. I think we should not lose such
> information. Also, need to allow for recording of time or original
> reception.

That's a tough issue. Will it cost us digital signatures? Which goal has the higher priority? Or should we work around this issue? Of course it's doable, but only at the expense of growing the spec. We would "simply" need to define a way for a relay to add a container structured data element, which in turn could contain other structured data elements - and which is clearly flagged as not being part of the original message, so that a signature verifier could remove that part. This also sounds like a call, again, for XML ;) It's doable, but it will add considerable complexity. On the other hand, we can allow a relay to break signatures. I am not sure if that is a good approach. Or we do not allow it to modify the message (as currently specified), but then we are not able to save the information that you request - and I agree this information is valuable...

> 11. Section 8.3. You allow relay to break message into multiple parts.
> What happens with a message that is already multi-part? How do you
> distinguish first level of fragmentation from second?

Actually, I think that is easier than it first looks. I specified that a message part message is a full syslog message in its own right. As such, you can apply all rules that apply to any syslog message to a message part message, too (because it is a regular message once it has been formed). At least this is the spirit that I had in mind.
So, when a (message part) message is disassembled (broken into parts), the multi-part message headers must also be disassembled. Then an additional multi-part message header is added. We end up with a message that, after reassembly, becomes the original message part message, and thus is in itself a message that must be reassembled to become the original message.

In perhaps more familiar terms: let's use "fragmentation" for a moment. If a message fragment becomes further fragmented, an additional fragmentation header is added, and the fragments of this message will then travel as fragmented parts of a fragmented message. Double fragmentation.

Obviously, there is quite some overhead. I consider relays splitting messages into multi-part messages a last resort when there is no other way to handle the situation. It is definitely NOT desirable.

> 12. The ID is now 45 pages long and growing with every revision. I
> think it would help if we shortened it whenever possible. After all
> it is just a syslog protocol. This is protocol used for
> troubleshooting. It can't be itself overly complicated or give such
> impression.

Well ... actually it gets larger with each of your well-thought-out comments ;) Honestly, I would like to keep it short. But just look at this mail. How often do you rightly ask if we can specify something more precisely? So how can we shrink it and also make it more precise? Well, I assume we can save some pages if a good native English editor goes over it. In general, however, I think it is expanding. Of course, we can move all the specifics and clarifications out into a separate web page, but what exactly is the value of this? And who guarantees that an implementor will visit these pages?

Of course, the growth of the document is also related to my attempt to keep things tightly bounded (see my intro). But again, if you look at the WG mailing list archive, you will see lots of comments that "this and that" is unclear and needs to be specified.
If we do, the spec obviously grows ;)

Despite its growing size, I think it is still moderately easy to implement. There are few things that are actually required. The rest of the content consists of hints to implementors, which I assume make it easier to implement syslog - and make implementations more robust, as I try to point out those issues that are known to cause security problems in the real world. Also, in my experience, the more is specified in depth, the easier it is to implement the spec, because the developer does not need to think that hard ... He or she will find most of what is needed inside the spec. So for now, I think it is not a bad thing that it is growing ... but I may be wrong ;)

Comments on everything highly appreciated...

Rainer
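
Addendum to point 8 (illustration only, not text from the draft): the rule that partcount may grow but must never shrink can be sketched as a small receiver-side check. This is a minimal sketch under assumptions of my own: the field name partnum, 1-based part numbering, the requirement that partnum never exceeds partcount, and in-order arrival of the parts are all hypothetical and chosen only to make the example concrete.

```python
from dataclasses import dataclass, field


@dataclass
class PartTracker:
    """Receiver-side bookkeeping for ONE multi-part message (hypothetical)."""

    highest_partcount: int = 0               # largest partcount announced so far
    seen: set = field(default_factory=set)   # part numbers received so far

    def accept(self, partnum: int, partcount: int) -> bool:
        """Return True if the part is consistent, False if it looks malformed.

        Assumption: parts arrive in the order they were sent, so a lower
        partcount after a higher one means the sender shrank it.
        """
        if partnum < 1 or partnum > partcount:
            return False                     # part number outside announced range
        if partcount < self.highest_partcount:
            return False                     # partcount shrank: not allowed
        self.highest_partcount = partcount   # growth (by any step) is fine
        self.seen.add(partnum)
        return True

    def missing(self) -> list:
        """Part numbers announced via partcount but not received."""
        return [n for n in range(1, self.highest_partcount + 1)
                if n not in self.seen]


# Streaming sender: increments at first, then learns the real total (5).
tracker = PartTracker()
tracker.accept(1, 1)
tracker.accept(2, 2)
tracker.accept(3, 5)
print(tracker.missing())      # parts 4 and 5 were announced but not yet seen
print(tracker.accept(4, 4))   # rejected, because partcount shrank from 5 to 4
```

With a purely incrementing sender, missing() can never report a lost tail, because the receiver never learns a count larger than the last part it saw. That is the loss of reliability described under point 8 for the streaming case.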