(Apologies for the slow response)
On Fri Jun 29 15:17:50 2007, Joe Hildebrand wrote:
On Jun 29, 2007, at 3:40 AM, Dave Cridland wrote:
On Thu Jun 28 23:12:25 2007, Joe Hildebrand wrote:
The current spec could absolutely be used for this. The hardest
part is spec'ing how to generate a string that has all of the
capabilities, so that you can run the hash. Canonical XML is
massive overkill, but, for example, if we just said:
- For all sorting, use the Unicode Collation Algorithm
(http://www.unicode.org/unicode/reports/tr10/)
Feh. UTF-8 encode then i;octet - much faster, just as stable, and
a heck of a lot simpler to implement, especially given that
namespaces will be us-ascii anyway (hence UTF-8). RFC4790 defines
this. (i;basic uses TR10, but ISTR it's not yet ready).
+1. What's the standards-language way of saying that?
All ordering operations are performed using the "i;octet" collation,
as defined in section 9.3 of RFC4790.
"i;octet" in turn basically acts like a non-localized strcmp(). This
means it's case-sensitive. If you want, you could use
"i;ascii-casemap", which case-folds us-ascii letters in a
locale-unspecific manner.
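
To make the two orderings concrete, here is a rough Python sketch (not
spec text). It assumes "i;octet" amounts to comparing the UTF-8 octets,
and approximates "i;ascii-casemap" by upper-casing only the us-ascii
letters before comparing. The feature strings and helper names are
made-up examples.

```python
import string

# Map only the octets for a-z to A-Z; every other octet is left untouched.
_ASCII_UPPER = bytes.maketrans(
    string.ascii_lowercase.encode("ascii"),
    string.ascii_uppercase.encode("ascii"),
)

def octet_key(s: str) -> bytes:
    """Sort key approximating i;octet: the raw UTF-8 octets."""
    return s.encode("utf-8")

def ascii_casemap_key(s: str) -> bytes:
    """Sort key approximating i;ascii-casemap: fold us-ascii letters only."""
    return s.encode("utf-8").translate(_ASCII_UPPER)

# Hypothetical feature names, chosen to show the effect of case.
features = [
    "jabber:iq:version",
    "http://jabber.org/protocol/muc",
    "Jabber:iq:time",
]

by_octet = sorted(features, key=octet_key)
by_casemap = sorted(features, key=ascii_casemap_key)
```

With "i;octet" the capitalised entry sorts first, because "J" has a
lower octet value than any lower-case letter. With the case-folded key
it sorts next to its lower-case sibling.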
Which makes me wonder - we might need to normalize unicode input, if
there really is any. capabilityprep, anyone?
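
A small sketch of why normalization might matter, assuming Python's
standard unicodedata module. The choice of NFC is an assumption made for
illustration only; a real "capabilityprep" profile would have to pick a
form, and the helper name is hypothetical.

```python
import unicodedata

def normalize_then_encode(s: str, form: str = "NFC") -> bytes:
    """Normalize first, then UTF-8 encode, so equivalent strings hash alike."""
    return unicodedata.normalize(form, s).encode("utf-8")

composed = "caf\u00e9"      # e-acute as a single code point
decomposed = "cafe\u0301"   # plain "e" followed by a combining acute accent

# Without normalization the octets differ, so sort order and hash differ.
raw_differs = composed.encode("utf-8") != decomposed.encode("utf-8")

# After normalization both spellings yield the same octets.
same_after = normalize_then_encode(composed) == normalize_then_encode(decomposed)
```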
I'd propose something mildly more structured here, really such
that it's simpler to view by eye to ensure the formatting and
ordering is correct. This has no security impact, it's just
easier to implement.
I'm not sure why readability is important here. It's never going
on the wire.
Absolutely, but the result of the hash is. Therefore, it's useful for
debugging purposes that two hash inputs can be compared by eye to see
why they don't match. I believe this has been shown to be a bit of a
lack in things like DIGEST-MD5, for example.
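
As an illustration of the debugging argument, here is a sketch that
diffs two line-structured hash inputs using Python's difflib. The line
contents are placeholders, since the cat-line and feat-line syntax is
not restated here, and explain_mismatch is a hypothetical helper.

```python
import difflib

def explain_mismatch(mine: str, theirs: str) -> list[str]:
    """Return only the lines that differ between two hash inputs."""
    diff = difflib.ndiff(mine.splitlines(), theirs.splitlines())
    # Keep removed ("- ") and added ("+ ") lines; drop context and hint lines.
    return [line for line in diff if line.startswith(("- ", "+ "))]

# Placeholder inputs: the second one is missing a feature line.
mine = "client/pc\nfeature-a\nfeature-b\n"
theirs = "client/pc\nfeature-b\n"

differences = explain_mismatch(mine, theirs)
```

With one item per line, a missing or misordered entry shows up as a
single differing line instead of a mismatch buried in one long string.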
For better security, we could use HMAC, and/or a different hash
function. One option would be that, if H is the result of the
hash/hmac function, then V, the version string, is formed by
prepending its algorithm and a $, something like:
E = *(cat-line) *(feat-line)
H = <hash/hmac of E>
V = hash-func-name "$" H
hash-func-name = hash-name / "HMAC-" hash-name
hash-name = "MD5" / "SHA1" / "SHA-256"
We mandate that HMAC-MD5 is used, but a future specification MAY
change this requirement. MD5 does have the minor advantage of
being smaller.
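
A minimal sketch of building V from E under the grammar above, using
Python's hashlib and hmac. Two things are assumptions because the
proposal does not state them: H is rendered as lower-case hex, and the
HMAC key is simply passed in by the caller. The function name is
hypothetical.

```python
import hashlib
import hmac

# hash-name values from the grammar, mapped to hashlib constructors.
HASHES = {"MD5": hashlib.md5, "SHA1": hashlib.sha1, "SHA-256": hashlib.sha256}

def version_string(e: bytes, func_name: str, key: bytes = b"") -> str:
    """Return V = hash-func-name "$" H for the hash input E."""
    prefix = "HMAC-"
    if func_name.startswith(prefix):
        # Assumption: the key comes from the caller; the thread names no key.
        digestmod = HASHES[func_name[len(prefix):]]
        digest = hmac.new(key, e, digestmod).hexdigest()
    else:
        digest = HASHES[func_name](e).hexdigest()
    # Assumption: H is hex encoded; the proposal leaves the encoding open.
    return func_name + "$" + digest

plain = version_string(b"", "MD5")
keyed = version_string(b"abc", "HMAC-MD5", key=b"example-key")
```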
I think this is overkill. Forcing an attacker to find a one-way
collision in a hash function seems like adequate protection against
a DoS attack. The simpler this is to check, and the fewer ways there
are of messing it up, the more likely it is to get implemented.
Let's just pick a single algorithm and stick with it.
I'm quite happy with MD5 for the moment, and I don't see this
changing anytime soon.
However, I'm no expert on hash function weaknesses, and I'm dimly
aware that use of MD5 may make a prefix collision possible - i.e.,
it's possible to obtain two inputs that share only a common prefix
but produce the same hash. This might be problematic.
I still suspect that the time required would make this too tricky to
exploit in practice, but it's reasonable to consider a hash other
than MD5 as a result.
Now, we believe SHA-256 to be secure, at this stage. But we all
thought MD5 was secure back in the day, too, so it doesn't take much
foresight to say that any time you use a hash algorithm, you need a
method for hash agility - indeed, I think the IETF is mandating this
these days.
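
To show what hash agility could look like on the receiving side, here
is a sketch that reads the algorithm name from V and dispatches on it.
It assumes the name "$" digest layout from earlier in this thread and a
hex digest, and it covers the plain hashes only. check_version is a
hypothetical name.

```python
import hashlib
import hmac

# Algorithms this receiver is willing to verify; extend as hashes are added.
SUPPORTED = {"MD5": hashlib.md5, "SHA1": hashlib.sha1, "SHA-256": hashlib.sha256}

def check_version(v: str, e: bytes) -> bool:
    """Recompute the hash named in V over E and compare it with the claim."""
    name, sep, claimed = v.partition("$")
    if not sep or name not in SUPPORTED:
        # Unknown or missing algorithm: refuse rather than guess.
        raise ValueError("unsupported or malformed version string")
    actual = SUPPORTED[name](e).hexdigest()
    # Compare as bytes so non-ASCII input cannot raise inside compare_digest.
    return hmac.compare_digest(actual.encode("ascii"), claimed.encode("utf-8"))

sample = "SHA-256$" + hashlib.sha256(b"example input").hexdigest()
ok = check_version(sample, b"example input")
bad = check_version(sample, b"tampered input")
```

Because the algorithm travels with the value, moving to a new hash
later only means adding an entry to the table.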
Dave.
--
Dave Cridland - mailto:[EMAIL PROTECTED] - xmpp:[EMAIL PROTECTED]
- acap://acap.dave.cridland.net/byowner/user/dwd/bookmarks/
- http://dave.cridland.net/
Infotrope Polymer - ACAP, IMAP, ESMTP, and Lemonade