Re: [Standards] Re: [jdev] XEP-0115: Entity Capabilities
Mridul Muralidharan wrote: Joe Hildebrand wrote: Changing the meaning of node breaks backwards compatibility, whereas nothing else in the current proposal does. If there's no good reason to break backward compatibility, I suggest that we avoid it. I am not sure what was decided as the final design for the spec regarding hashing, but moving from the existing scheme of ver/ext also breaks backward compatibility. I don't think it does. 1. The 'ver' attribute used to be opaque, so existing implementations should cope perfectly if they receive one containing a base64-encoded hash value instead of something like 2.3. 2. The 'ext' attribute used to be optional, so legacy applications won't mind if it is never specified by implementations of the latest protocol versions. 3. The new 'hash' attribute (containing the name of the hash) will be ignored by existing implementations. New implementations will need to be aware that if no 'hash' value is specified they should ignore the caps element (they should not attempt to be compatible with legacy caps, since that would make them vulnerable to cache poisoning). - Ian
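Ian's third point amounts to a simple receiver-side policy. A minimal sketch in Python, assuming caps attributes arrive as a dictionary; the function name and return values are hypothetical, and the attribute names follow this thread's proposal rather than any finalized spec:

```python
def classify_caps(attrs):
    """Return 'hashed' for new-style caps (verify before caching),
    'legacy' otherwise (ignore for the shared cache - poisoning risk)."""
    if "hash" in attrs:
        return "hashed"
    return "legacy"

print(classify_caps({"node": "http://psi-im.org/caps",
                     "ver": "C+7Hteo/D9vJXQ3UfzxbwnXaijM=",
                     "hash": "sha-1"}))  # hashed
print(classify_caps({"node": "http://psi-im.org/caps",
                     "ver": "0.11"}))   # legacy
```

The point is that the presence or absence of 'hash' alone tells a new implementation which code path to take, without breaking old receivers that never look at the attribute.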
Re: [Standards] Re: [jdev] XEP-0115: Entity Capabilities
On 3 Jul 2007, at 23:43, Rachel Blackman wrote: Beyond that, I can't think of a good reason for it, but add as an anecdotal note when I pulled displaying the client version in tooltips out of the early builds of Astra, the testers howled bloody murder until I put it back. So, for whatever reason, I can attest that my users actually do want it. :) Psi has had a similar experience, and we only switched from automatically querying iq:version to showing caps info. /K -- Kevin Smith KTP Associate - Exeter University / ai Corporation Psi Jabber client developer/project leader (http://psi-im.org/) XMPP Standards Foundation Council Member (http://xmpp.org)
Re: [Standards] Re: [jdev] XEP-0115: Entity Capabilities
On Tue Jul 3 23:45:53 2007, Justin Karneges wrote: Apologies for not understanding this thread at all and just commenting out of nowhere, but what security is gained by using a hash in the caps protocol? It's an attempt at preventing a theoretical attack, as I understand things. The only instance of caps pollution being an issue appears - from my reading of this thread - to be an inadvertent error, not a deliberate attack. If there is no security gained by using a hash (e.g. everyone has access to the raw data such that they can all calculate the same hash) then what difference does it make which algorithm is used? I know Ian and Joe have answered this; I'm hoping I might add a different perspective, plus there's actually a new point at the bottom. :-) In principle, an attacker capable of mounting a selected preimage attack (specifically, one that involves being able to create a caps list such that it produces a hash identical to that provided by legitimate clients *and* gains some benefit for the attacker, in a reasonable timeframe) might be able to subvert communications. An example would be convincing a client that one or more users on the roster are, contrary to reality, unable to handle esessions, or some new encrypted Jingle, enabling the attacker to eavesdrop on communications. To do this, the attacker has to determine the real hash, use the preimage attack to find a caps list to supply via disco which is both syntactically correct and excludes the extensions that the attacker wishes to remove, do so before the target client can query any real clients, and finally place themselves in a position such that they answer such queries. The latter can be achieved either by sending a directed presence or by subverting the server entirely - we can treat this as the easy bit. 
The hard part remains the timing issue - in order to have any value, you'd need to pollute the target client's capabilities cache prior to it discovering the real capabilities, and that's an extraordinarily short time window. A simple MD5 hash will adequately prevent any chance of inadvertent cache pollution, which leaves the selection of a hash algorithm purely down to the time it'd take to mount a preimage attack. I've reviewed the various papers on MD5 as best I can, and I don't think its known weaknesses are such that a preimage attack can be mounted within a useful timeframe, hence I'm not too fussed, but I'd be happy to see SHA-1 used if people are genuinely concerned. Whatever we choose, hash functions are continually eroded, and what's reasonable now will not be in the future. (FWIW, Ian's mention of a one-hour attack is a collision attack, not a preimage attack, and finds a pair of two-block messages which collide, both of which have specific properties, and the time figures are quoted for an IBM P690, which is somewhat bigger iron than I have about, anyway. Our attacker needs a selected preimage attack, and will almost certainly need one where the legitimate message is several blocks long for MD5, and their primary source of computing power is likely to be a distributed botnet at best - I'm not clear whether this attack is distributable or not, but I'm not concerned by it.) I mentioned earlier that we could gain a benefit from ver/ext by using prepackaged sets of capabilities, in order that there was more likelihood of a cache hit; moreover, this allows clients to ship with a hardcoded cache containing these prepackaged sets already, avoiding the need to probe at all. I think it is worth noting that the more commonality we have between clients in this respect, the harder it is to mount such an attack, although correspondingly higher gains can be made. 
If clients are able to ship with a pre-populated cache, then the window of opportunity for an attacker vanishes entirely for those clients, effectively allowing them to claim immunity from such attacks. Sorry, it's another trade-off. FWIW, I lean heavily toward pre-defined sets, as I think that good clients gain in both security and efficiency, whereas old clients are unaffected. Dave. -- Dave Cridland - mailto:[EMAIL PROTECTED] - xmpp:[EMAIL PROTECTED] - acap://acap.dave.cridland.net/byowner/user/dwd/bookmarks/ - http://dave.cridland.net/ Infotrope Polymer - ACAP, IMAP, ESMTP, and Lemonade
Re: [Standards] Re: [jdev] XEP-0115: Entity Capabilities
Hello On Wed, Jul 04, 2007 at 10:38:26AM +0100, Dave Cridland wrote: (FWIW, Ian's mention of a one-hour attack is a collision attack, not a preimage attack, and finds a pair of two-block messages which collide, both of which have specific properties, and the time figures are quoted for an IBM P690, which is somewhat bigger iron than I have about, anyway. Our attacker needs a selected preimage attack, and will almost certainly need one where the legitimate message is several blocks long for MD5, and their primary source of computing power is likely to be a distributed botnet at best - I'm not clear whether this attack is distributable or not, but I'm not concerned by it). Not sure which attack he mentioned, but there is a collision project. Collisions take minutes on a PC, and there is something about generating colliding data with a given prefix, if I understand it well (there was something about generating data that has a given MD5 starting from some initial hash state, or so). http://cryptography.hyperlink.cz/MD5_collisions.html Not that I understand it much, nor have I read it properly - just that the author is from the same country as me, so I heard about it. So I think if you have a few hours or days, you'll have little trouble finding something, if you know how. -- The human mind ordinarily operates at only ten percent of its capacity -- the rest is overhead for the operating system Michal 'vorner' Vaner pgpcIx1foAxWO.pgp Description: PGP signature
Re: [Standards] Re: [jdev] XEP-0115: Entity Capabilities
On Jul 4, 2007, at 5:35 AM, Ian Paterson wrote: 'ext' and pre-defined sets only improve security if the choice of a weak hash makes pre-image attacks possible. So why don't we make things easier for everyone and simply recommend a stronger hash instead? So, to pull those bits together, I'm recommending: base64(sha1(dave-formatted id/features)) which would give ver's that look like: C+7Hteo/D9vJXQ3UfzxbwnXaijM= Which is small enough for me. -- Joe Hildebrand
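Joe's base64(sha1(...)) recommendation is easy to sketch. The canonical input string below is only an illustrative placeholder (the actual "dave-formatted" text is specified later in the thread); the point is the digest-then-base64 step, which always yields a 28-character string of the same shape as the one Joe quotes:

```python
import base64
import hashlib

def caps_ver(canonical):
    """base64 of the SHA-1 digest of the canonical caps string."""
    digest = hashlib.sha1(canonical.encode("utf-8")).digest()
    return base64.b64encode(digest).decode("ascii")

# Placeholder input, not the real canonical form from the thread:
ver = caps_ver("client/pc\nhttp://jabber.org/protocol/muc\n")
print(len(ver))  # SHA-1 digests are 20 bytes, so base64 gives 28 characters
```

Any change to the input produces an entirely different 28-character value, which is what makes the 'ver' both compact and self-validating.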
Re: [Standards] Re: [jdev] XEP-0115: Entity Capabilities
Ian Paterson wrote: Mridul wrote: So queries for both the bare jid and ns#ver will be supported (and return the same value)? And all clients using the newer spec would use the bare jid, I suppose? (so that we can deprecate ns#ver and remove it in the future) Yes. But we do lose the ability to enable/disable plugins without invalidating the user's caps data... might be an acceptable tradeoff. Yes, if 'ext' is obsoleted, the hash value in the caps element will change whenever the supported features change (including when a plugin is enabled or disabled). But as you say, the tradeoff (for simplicity) might be acceptable, since the disadvantage (of more hash values) may be marginal. Especially because we have a finite number of protocols: http://www.xmpp.org/registrar/namespaces.html And some of those are payload namespaces that would not be advertised in service discovery. Granted, the number of protocols a client might advertise will increase over time, and the number of potential combinations is large. But in practice I think that most clients will support a rather narrow range of combinations. /psa smime.p7s Description: S/MIME Cryptographic Signature
Re: [Standards] Re: [jdev] XEP-0115: Entity Capabilities
Joe Hildebrand wrote: On Jul 4, 2007, at 5:35 AM, Ian Paterson wrote: 'ext' and pre-defined sets only improve security if the choice of a weak hash makes pre-image attacks possible. So why don't we make things easier for everyone and simply recommend a stronger hash instead? So, to pull those bits together, I'm recommending: base64(sha1(dave-formatted id/features)) Seems reasonable to me. which would give ver's that look like: C+7Hteo/D9vJXQ3UfzxbwnXaijM= Which is small enough for me. Me too. I'll write that up provisionally in XEP-0115 v1.4pre1 so we can see how it looks... /psa smime.p7s Description: S/MIME Cryptographic Signature
Re: [Standards] Re: [jdev] XEP-0115: Entity Capabilities
On Jul 3, 2007, at 6:48 AM, Ian Paterson wrote: Rachel Blackman wrote: Let's say we have node='http://ceruleanstudios.com/astra/caps' and ver='h$someverylongstring' and ext='h$otherverylongstring' Or how about simply: node='$' ver='base64encodedHashOfFeatures' No. The other reason for caps is so that receivers can show a different icon for each different client that they have received presence from. There has to be a URI to define the sending client. -- Joe Hildebrand
Re: [Standards] Re: [jdev] XEP-0115: Entity Capabilities
On Jul 2, 2007, at 5:12 PM, Rachel Blackman wrote: Because the caching logic is not identical; hash-forms are global, rather than client-specific. If Psi and Exodus have precisely the same capabilities, they will generate the same hash and I should not need to re-query it, even if they have different caps nodes. This is an optimization that a receiving client might choose to use, but I'm not sure that it needs to be in the spec, other than as an implementation note. -- Joe Hildebrand
Re: [Standards] Re: [jdev] XEP-0115: Entity Capabilities
Joe Hildebrand wrote: On Jul 2, 2007, at 4:49 PM, Mridul Muralidharan wrote: Peter Saint-Andre wrote: Mridul Muralidharan wrote: Forgot to add, change name from ver ext to verh and exth? Why? Conflict with existing clients - too many of them in the wild don't use these semantics. Others have already responded to this, but just to reinforce, I *did* talk about backward compatibility. Existing clients would continue to work just fine. New clients just have to be able to detect other new clients, to know if they are supposed to be able to check the hash. Presumably, new clients could choose by policy to ignore un-hashed caps from old clients. Not sure if anyone addressed the actual issue I was thinking of (need to read the rest of the thread). Essentially, how would 'new' clients know whether something exhibited in ver or ext is a hash or an 'old' value? Aren't those identifiers expected to be opaque (though consistent)? Considering an ext of my_ext 1233ab and #hash1 #hash2 exhibited by two clients - how would the receiver know what is hashed as per the 'new' idea and what is 'old' ver/ext? In the first case, it won't hash properly to what is exhibited by disco - which might make the 'new' client think it is hitting a problem client instead of an old client. - Mridul PS: new - client based on the proposed idea, old - client conforming to the current XEP.
Re: [Standards] Re: [jdev] XEP-0115: Entity Capabilities
Mridul wrote: Joe Hildebrand wrote: On Jul 2, 2007, at 4:49 PM, Mridul Muralidharan wrote: Peter Saint-Andre wrote: Mridul Muralidharan wrote: Forgot to add, change name from ver ext to verh and exth? Why? Conflict with existing clients - too many of them in the wild don't use these semantics. Others have already responded to this, but just to reinforce, I *did* talk about backward compatibility. Existing clients would continue to work just fine. New clients just have to be able to detect other new clients, to know if they are supposed to be able to check the hash. Presumably, new clients could choose by policy to ignore un-hashed caps from old clients. Not sure if anyone addressed the actual issue I was thinking of (need to read the rest of the thread). Essentially, how would 'new' clients know whether something exhibited in ver or ext is a hash or an 'old' value? Aren't those identifiers expected to be opaque (though consistent)? Considering an ext of my_ext 1233ab and #hash1 #hash2 exhibited by two clients - how would the receiver know what is hashed as per the 'new' idea and what is 'old' ver/ext? As I understand the proposal, there would not be #hash1 #hash2 -- why do you need multiple values here? You concatenate all the supported namespaces according to some rule and then hash the whole thing. So there's only one hash. But that means something different from 'ext' or 'node' or 'ver', so I think it needs to go in its own attribute. /psa smime.p7s Description: S/MIME Cryptographic Signature
Re: [Standards] Re: [jdev] XEP-0115: Entity Capabilities
Hello On Tue, Jul 03, 2007 at 09:18:59AM -0600, Joe Hildebrand wrote: hash='MD5' and make it mutually-exclusive with ext. Why exclusive? ext for the old clients, hash to check if it makes sense? <caps:c node='client' ext='f1 f2 f3' hash='the-hash'/> Or am I missing something? -- chown -R us $BASE Michal 'vorner' Vaner pgpWOroXVyCNJ.pgp Description: PGP signature
Re: [Standards] Re: [jdev] XEP-0115: Entity Capabilities
Rachel Blackman wrote: That said, I think we can come up with some simpler logic. If a given token is prefixed with 'h$', for instance, we know it's a hash and should be both validated against the result, and -- if it matches -- cached globally instead of per-client. But for backwards compatibility, a disco on node#h$hash would still give you the proper results, and COULD be cached on a per-client basis. Possible. But where does the token go? It seems preferable to define a new attribute for this. Hmph. Why? Let's say we have node='http://ceruleanstudios.com/astra/caps' and ver='h$someverylongstring' and ext='h$otherverylongstring' Why can't something like http://ceruleanstudios.com/astra/caps#h$someverylongstring work just like the http://ceruleanstudios.com/astra/caps#4.0.0.47 that an old ver='4.0.0.47' would generate under the old system, for an old client? After all, you still have to query the hash to get the features represented, which you then hash to validate and -- if it's valid -- store globally, so that /all/ clients which have 'h$someverylongstring' have that featureset. Hmm, OK. On that model, what is the use of 'ext'? Do you put a hash of all the base features in 'ver' and a hash of all the extended features in 'ext'? That seems potentially sub-optimal, because different clients might divide base and extended differently, which means you'll need to send and receive a lot more disco queries. It seems better to me if we have only one hash for all the features. Thus, old clients can use it just fine with old-style logic. Only the new client needs to know that h$ means 'take everything after the $, and treat it as a hash;' old clients can still query seamlessly. Right. /psa smime.p7s Description: S/MIME Cryptographic Signature
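Rachel's 'h$' convention can be sketched as a small dispatch on the disco node's fragment. The function name and return values here are hypothetical illustrations of the policy, not proposed protocol:

```python
def cache_scope(disco_node):
    """Decide caching policy from a 'node#token' disco node string."""
    token = disco_node.rsplit("#", 1)[-1]
    # An 'h$'-prefixed token is a hash: once validated it is safe to
    # share the cached featureset across all clients advertising it.
    return "global" if token.startswith("h$") else "per-client"

print(cache_scope("http://ceruleanstudios.com/astra/caps#h$someverylongstring"))  # global
print(cache_scope("http://ceruleanstudios.com/astra/caps#4.0.0.47"))              # per-client
```

Old clients never inspect the token, so they keep doing per-client disco exactly as before; only new clients take the global-cache branch.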
Re: [Standards] Re: [jdev] XEP-0115: Entity Capabilities
Michal 'vorner' Vaner wrote: Hello On Tue, Jul 03, 2007 at 09:18:59AM -0600, Joe Hildebrand wrote: hash='MD5' and make it mutually-exclusive with ext. Why exclusive? ext for the old clients, hash to check if it makes sense? <caps:c node='client' ext='f1 f2 f3' hash='the-hash'/> Or am I missing something? Yes, that seems to work. /psa smime.p7s Description: S/MIME Cryptographic Signature
Re: [Standards] Re: [jdev] XEP-0115: Entity Capabilities
On Jul 3, 2007, at 7:01 AM, Joe Hildebrand wrote: On Jul 2, 2007, at 5:12 PM, Rachel Blackman wrote: Because the caching logic is not identical; hash-forms are global, rather than client-specific. If Psi and Exodus have precisely the same capabilities, they will generate the same hash and I should not need to re-query it, even if they have different caps nodes. This is an optimization that a receiving client might choose to use, but I'm not sure that it needs to be in the spec, other than as an implementation note. The two objections to caps are always that a) ZOMG someone can maybe maliciously pollute the cache, and b) we should have exts hardcoded so you never need to query ever and they should be the same across all clients. My understanding was that this proposal was addressing /both/; not only making caps something self-validating, but also extending the cache to be globally valid? -- Rachel Blackman [EMAIL PROTECTED] Trillian Messenger - http://www.trillianastra.com/
Re: [Standards] Re: [jdev] XEP-0115: Entity Capabilities
The XEP could also specify that if a client sets the value of the 'node' attribute to $ then it MUST NOT include an 'ext' attribute. Not sure about this, it really depends on how ext is actually used in the wild, as Joe said. I'd be tempted to leave this somewhat open, at least for now. It could be that we could grow a set of extensions of commonly co-implemented features, bearing no actual relation to client plugins, and cut down traffic that way. But such things require quite a bit of research. Ext is used in the wild. My initial reaction is that it is still needed, but on further thought, I can't see why. If you remove ext, you create MORE separate things to cache, and thus recreate more network traffic. Because now, client Foo with plugin Bar installed will have an entirely different hash than client Foo without Bar installed. With ext, client Foo has the same capabilities hash in both cases, but one has an additional ext hash for plugin Bar's capabilities. -- Rachel Blackman [EMAIL PROTECTED] Trillian Messenger - http://www.trillianastra.com/
Re: [Standards] Re: [jdev] XEP-0115: Entity Capabilities
Mridul Muralidharan wrote: Peter Saint-Andre wrote: Mridul wrote: Joe Hildebrand wrote: On Jul 2, 2007, at 4:49 PM, Mridul Muralidharan wrote: Peter Saint-Andre wrote: Mridul Muralidharan wrote: Forgot to add, change name from ver ext to verh and exth? Why? Conflict with existing clients - too many of them in the wild don't use these semantics. Others have already responded to this, but just to reinforce, I *did* talk about backward compatibility. Existing clients would continue to work just fine. New clients just have to be able to detect other new clients, to know if they are supposed to be able to check the hash. Presumably, new clients could choose by policy to ignore un-hashed caps from old clients. Not sure if anyone addressed the actual issue I was thinking of (need to read the rest of the thread). Essentially, how would 'new' clients know whether something exhibited in ver or ext is a hash or an 'old' value? Aren't those identifiers expected to be opaque (though consistent)? Considering an ext of my_ext 1233ab and #hash1 #hash2 exhibited by two clients - how would the receiver know what is hashed as per the 'new' idea and what is 'old' ver/ext? As I understand the proposal, there would not be #hash1 #hash2 -- why do you need multiple values here? You concatenate all the supported namespaces according to some rule and then hash the whole thing. So there's only one hash. But that means something different from 'ext' or 'node' or 'ver', so I think it needs to go in its own attribute. /psa What Joe Hildebrand proposed initially had hashes for node exts - I was referring to that here. The approach has the advantage that clients can query and validate (and so independently cache) namespace#node_hash, namespace#ext1_hash, etc. - so having multiple hashes allows reuse of caps info across clients and allows them to modify exts independently of each other. 
Addition or removal of a plugin will not result in the entire hash being invalidated - just a specific hash will be removed or modified. A single hash has the drawback that it will either protect the entire set or none at all - and so effectively we lose the ability to separate 'ver' from 'ext's, since we cannot independently validate each. Which is why my initial query was whether these should be 'verh' and 'exth', to indicate hashes and not the 'ver'/'ext' data itself (because of new clients rejecting caps from old clients) - and 'new' clients would query for these instead of ver/ext. I am sure there are better ideas to tackle this problem. Though cache pollution looks like a serious issue (especially the server cache in the PEP case), I am wondering if we are not taking this way too seriously - I did not see so much discussion for esessions ;-) Regards, Mridul s/node/ver/g Apologies. Mridul
Re: [Standards] Re: [jdev] XEP-0115: Entity Capabilities
On Tue Jul 3 17:49:24 2007, Mridul Muralidharan wrote: Though cache pollution looks like a serious issue (especially server cache for pep case), I am wondering if we are not taking this way too seriously - I did not see so much discussion for esessions ;-) Esessions are neither paintable, nor used for the storage of two-wheeled, self-propelled, personal transport systems based on a pedal/chain/gear drive system. A less facetious answer is that this is something that is relatively easy to understand, and therefore comment on. I'll happily be the first to admit that esessions are a little beyond me at the moment, whereas I can get my head around this quite easily. Dave. -- Dave Cridland - mailto:[EMAIL PROTECTED] - xmpp:[EMAIL PROTECTED] - acap://acap.dave.cridland.net/byowner/user/dwd/bookmarks/ - http://dave.cridland.net/ Infotrope Polymer - ACAP, IMAP, ESMTP, and Lemonade
Re: [Standards] Re: [jdev] XEP-0115: Entity Capabilities
On Jul 3, 2007, at 8:01 AM, Dave Cridland wrote: All ordering operations are performed using the i;octet collation, as defined in section 9.3 of RFC4790. fine. Which makes me wonder - we might need to normalize unicode input, if there really is any. capabilityprep, anyone? They're URIs, right? Don't they have defined comparability already? I'm not sure why readability is important here. It's never going on the wire. Absolutely, but the result of the hash is. Therefore, it's useful for debugging purposes that two hash inputs can be compared by eye to see why they don't match. I believe this has been shown to be a bit of a lack in things like DIGEST-MD5, for example. Fine. Let's specify it really closely, though, including precisely which line-endings to use, and the like. -- Joe Hildebrand
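Dave's "UTF-8 encode then i;octet" suggestion (RFC 4790's i;octet collation simply compares raw octet values) comes down to sorting the strings by their UTF-8 encodings. A minimal sketch:

```python
def i_octet_sort(strings):
    """Sort strings by the raw octets of their UTF-8 encodings (i;octet)."""
    return sorted(strings, key=lambda s: s.encode("utf-8"))

features = [
    "http://jabber.org/protocol/muc",
    "http://jabber.org/protocol/disco#items",
    "http://jabber.org/protocol/disco#info",
]
print(i_octet_sort(features)[0])  # ...disco#info sorts first ('n' < 't' byte-wise)
```

For the us-ascii namespace URIs in question this is identical to a plain byte sort, which is exactly why it is so much simpler to implement than UCA/TR10 collation.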
Re: [Standards] Re: [jdev] XEP-0115: Entity Capabilities
Joe Hildebrand wrote: I just talked to stpeter IRL (he's all of 10 feet from me; should have done that first thing this morning!), I just measured. It's 15 feet. :P and made sure he understood what I was after. I'm replying to Rachel's mail, since it hits on the two (in my mind) remaining interesting questions, namely: 1) is it necessary to explicitly flag that we're doing hashes? I like Kevin Smith's suggestion that the receiver should always do the hash. For old clients, the hash will never match, but it's the same policy decision to allow old clients as to allow broken senders that send bad hashes. I like that too. But I also agree with Dave that we want to future-proof this. What if 3 years from now we want to use SHA-256 and 7 years from now we want to use an algorithm that emerges from the current NIST work? It might be good to include a 'hash' attribute whose value is one of the hash function text names from the IANA registry (but the value defaults to MD5 or whatever we settle on now, so that you don't have to include it until we decide to allow alternate algorithms). 2) do we need ext in the hash world? Rachel makes a good point that without ext, more data has to be cached, since there will be redundant features associated with different vers. I think it's a toss-up; no exts is considerably simpler for all involved: no partitioning of features on the sender side and no unions on the receiving side. I don't think we need 'ext' in the hash world. To be clear and explicit, all in one place, here is what I'm recommending:

<presence from='[EMAIL PROTECTED]/globe'>
  <c xmlns='http://jabber.org/protocol/caps'
     node='http://psi-im.org/caps'
     ver='big-long-hash-goes-here'/>
</presence>

Again, I think simplicity dictates that we pick a single hash algorithm and stick with it; I'm almost entirely uninterested in what that algorithm is, as long as it doesn't produce output that has too many bytes in it. Too many bytes, my dear Mozart! What is too many bytes? 
Too many bytes for what purpose? As noted, 3 years from now we might decide to use SHA-256 or whatever even if the hashes are longer because the security properties are preferable. So yes let's settle on one hash algorithm to start, but let's not close the door to other algorithms in the future if needed. /psa smime.p7s Description: S/MIME Cryptographic Signature
Re: [Standards] Re: [jdev] XEP-0115: Entity Capabilities
Peter Saint-Andre wrote: I also agree with Dave that we want to future-proof this. What if 3 years from now we want to use SHA-256 and 7 years from now we want to use an algorithm that emerges from the current NIST work? It might be good to include a 'hash' attribute whose value is one of the hash function text names from the IANA registry (but the value defaults to MD5 or whatever we settle on now so that you don't have to include it until we decide to allow alternate algorithms). +1 If we want to prevent malicious cache poisoning going forward, then clients need to be able to upgrade the hash they are using. MD5 is not secure enough even for this purpose. (I've read about attacks that require less than an hour of computing time!) IMHO, SHA256 is the most reasonable default. 2) do we need ext in the hash world? Rachel makes a good point that without ext, more data has to be cached, since there will be redundant features associated with different ver's. I think it's a toss-up; no ext's is considerably simpler for all involved; no partitioning of features on the sender side and no unions on the receiving side. I don't think we need 'ext' in the hash world. +1 the protocol is far simpler to implement without extensions. More storage will be required. But I'm not sure that we'll hit the sweet spot for storage-challenged clients. i.e. It may well be that those clients that have insufficient storage to cache the hash for each combination of plugins also have insufficient storage to cache hashes for each separate extension (i.e. they can't use caps at all). Joe Hildebrand wrote: On Jul 3, 2007, at 6:48 AM, Ian Paterson wrote: Rachel Blackman wrote: Let's say we have node='http://ceruleanstudios.com/astra/caps' and ver='h$someverylongstring' and ext='h$otherverylongstring' Or how about simply: node='$' ver='base64encodedHashOfFeatures' No. 
The other reason for caps is so that receivers can show a different icon for each different client that they have received presence from. There has to be a URI to define the sending client. Yes, that cuts down on the old iq:version flood. Or so we hope. :) Hmm, going forward, are the clients that most people use going to continue showing these icons? Is this a feature we need to care about? Even though I'm one of the small group of people involved in the XMPP community, I really don't care what client my contacts are using. Will there ever be mass demand for this feature? On the rare occasions where people are interested, they'll probably be perfectly happy to explicitly ask their client to find out the other user's client version on a case-by-case basis. IMHO the 'node' attribute could be repurposed to be the name of the hash function (for backwards compatibility). We could also add some language to the XEP stating that clients SHOULD NOT perform an iq:version flood. (IMHO, assuming the features hash is available via caps, there is little justification for such behavior.) Dave Cridland wrote: Assuming you didn't really mean base64, since hashes are typically represented simply as hex-digit strings. Base64 would be smaller, but unusual, and potentially include character-space clashes with Disco. I did mean base64, but if people think that is too hard to implement, then hex is fine (even though it is 50% longer). I don't understand how base64 could create clashes with Disco. - Ian
Re: [Standards] Re: [jdev] XEP-0115: Entity Capabilities
Justin Karneges wrote: Apologies for not understanding this thread at all and just commenting out of nowhere, but what security is gained by using a hash in the caps protocol? If there is no security gained by using a hash (e.g. everyone has access to the raw data such that they can all calculate the same hash) then what difference does it make which algorithm is used? What if the raw data is supplied by the attacker? Imagine Eve wants to poison the caches of clients that haven't yet received presence from a brand new release of Psi. If it is easy to discover collisions for the hash used by Psi, then Eve can send Psi's hash to a client and respond to its resulting disco request with a false set of features that she generated earlier. The false set would probably include a single unrecognizable feature whose 'var' value could be manipulated to ensure the set has the correct hash value, for example: <feature var='[EMAIL PROTECTED]'/>. - Ian
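The defence against Eve's trick is for the receiver to recompute the hash over the disco'ed feature set and discard any result that doesn't match the advertised 'ver'. A sketch, assuming a newline-joined sorted feature list and SHA-1 purely for illustration (the canonical form and algorithm were still under discussion in this thread):

```python
import base64
import hashlib

def compute_ver(features):
    """Illustrative canonical form: sorted features, newline-joined."""
    canonical = "\n".join(sorted(features)) + "\n"
    digest = hashlib.sha1(canonical.encode("utf-8")).digest()
    return base64.b64encode(digest).decode("ascii")

def verify_caps(advertised_ver, features):
    """Accept the disco result only if it hashes to the advertised ver."""
    return compute_ver(features) == advertised_ver

feats = ["http://jabber.org/protocol/disco#info", "http://jabber.org/protocol/muc"]
ver = compute_ver(feats)
print(verify_caps(ver, feats))                               # True
print(verify_caps(ver, feats + ["urn:eve:forged-feature"]))  # False
```

With a collision-resistant hash, Eve cannot tune a bogus 'var' value until her forged set matches the real 'ver', which is exactly the attack Ian describes against a weak hash.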
Re: [Standards] Re: [jdev] XEP-0115: Entity Capabilities
Mridul Muralidharan wrote: Peter Saint-Andre wrote: Mridul Muralidharan wrote: Forgot to add, change name from ver ext to verh and exth ? Why? Conflict with existing clients - too many of them in the wild dont use these semantics. But introducing new attributes is backward-incompatible, no? Given that both the 'ver' and 'ext' attributes have no semantic meaning in XEP-0115 right now, I don't see why it is a problem to use those attribute names. In fact we're adding semantic meaning with the hashes, but existing clients should work just fine AFAICS. /psa smime.p7s Description: S/MIME Cryptographic Signature
Re: [Standards] Re: [jdev] XEP-0115: Entity Capabilities
Peter Saint-Andre wrote: Mridul Muralidharan wrote: Peter Saint-Andre wrote: Mridul Muralidharan wrote: Forgot to add, change name from ver ext to verh and exth? Why? Conflict with existing clients - too many of them in the wild don't use these semantics. But introducing new attributes is backward-incompatible, no? Given that both the 'ver' and 'ext' attributes have no semantic meaning in XEP-0115 right now, I don't see why it is a problem to use those attribute names. In fact we're adding semantic meaning with the hashes, but existing clients should work just fine AFAICS. /psa When new clients attempt to interop with existing clients, they will not be able to do so - since none of the ver/ext values exhibited by existing clients will match what gets generated through the md5/sha/etc. sum of features when newer clients attempt to validate them. So we will not have interop going forward (well, existing clients will be able to use new ones, though... weird situation, since usually a change leaves older clients hanging!). Or did I get it wrong? Regards, Mridul
Re: [Standards] Re: [jdev] XEP-0115: Entity Capabilities
On Thu Jun 28 23:12:25 2007, Joe Hildebrand wrote:
> The current spec could absolutely be used for this. The hardest part is spec'ing how to generate a string that has all of the capabilities, so that you can run the hash. Canonical XML is massive overkill, but, for example, if we just said:
> - For all sorting, use the Unicode Collation Algorithm (http://www.unicode.org/unicode/reports/tr10/)

Feh. UTF-8 encode then i;octet - much faster, just as stable, and a heck of a lot simpler to implement, especially given that namespaces will be us-ascii anyway (hence UTF-8). RFC 4790 defines this. (i;basic uses TR10, but ISTR it's not yet ready.)

> - Initialize an empty string E
> - sort the identities by category, then type
> - for each identity, append the category, then the type (if it exists) to E (note: any information inside an identity will be ignored)

I'd propose something mildly more structured here, really such that it's simpler to view by eye to ensure the formatting and ordering is correct. This has no security impact; it's just easier to implement. Something like, for each identity, append the following production:

    cat-line = cat-tag SP category [SP type] CRLF
               ;; Note that type MUST be present if it exists.
    cat-tag  = %x43.41.54 ;; "CAT", case-insensitively

> - sort the features by URI
> - for each feature, append the URI to E (note: any information inside a feature will be ignored)

Similarly:

    feat-line = feat-tag SP feat-uri CRLF
    feat-tag  = %x46.45.41.54 ;; "FEAT", case-insensitively

> - calculate the MD5 sum for E

MD5 has a bad reputation, but note that the stricter the input formatting, the less likely it is to be forged (i.e., the less likely it is that someone could find a colliding input that makes semantic sense). For better security, we could use HMAC, and/or a different hash function.
One option would be that, if H is the result of the hash/HMAC function, then V, the version string, is formed by prepending its algorithm and a "$", something like:

    E = *(cat-line) *(feat-line)
    H = hash/hmac of E
    V = hash-func-name "$" H
    hash-func-name = hash-name / ("HMAC-" hash-name)
    hash-name = "MD5" / "SHA1" / "SHA-256"

We mandate that HMAC-MD5 is used, but a future specification MAY change this requirement. MD5 does have the minor advantage of being smaller.

> - use this for the version number or extension name

(Given my suggestion above, we'd use V, rather than Hash(E).)

> Example (adapted from XEP-0115, example 2):
>
>     <presence from='[EMAIL PROTECTED]/home'>
>       <c xmlns='http://jabber.org/protocol/caps'
>          node='http://exodus.jabberstudio.org/caps'
>          ver='730c80b442e150dd5e19a31f8edfa8b1'
>          ext='d6224a352df544cfde1fbce177301c67 d0ef9e8327acf5873d16fe083b4d3f3f'/>
>     </presence>

This example would have the same form, roughly:

    ver='HMAC-MD5$[...]' ext='HMAC-MD5$[...] HMAC-MD5$[...]'

> The receiving client SHOULD check the hashes, after doing the IQ/gets:
> md5(clientpchttp://jabber.org/protocol/disco#infohttp://jabber.org/protocol/disco#itemshttp://jabber.org/protocol/feature-neghttp://jabber.org/protocol/muc) = 730c80b442e150dd5e19a31f8edfa8b1

This one becomes (using literal whitespace for clarification, not syntax):

    Hash(
      CAT client pc\r\n
      FEAT http://jabber.org/protocol/disco#info\r\n
      FEAT http://jabber.org/protocol/disco#items\r\n
      FEAT http://jabber.org/protocol/feature-neg\r\n
      FEAT http://jabber.org/protocol/muc\r\n
    )

I'll skip the remaining examples, but presumably you get the notion.

> If the receiving client detects an inconsistency, it MUST NOT use the information it received, and SHOULD show an error of some kind. For backwards-compatibility, any version number that is not 32 octets long consisting only of [0-9a-f] MUST be treated as if it does not implement MD5 checking.
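[Editor's note: the canonicalisation sketched in this email can be written out concretely. This is an illustration of the proposal in the thread, not spec text: the function names are mine, plain MD5 is used instead of HMAC-MD5 because the thread never says where an HMAC key would come from, and the identity/feature values are taken from the worked example above.]

```python
import hashlib

def caps_string(identities, features):
    """Build E: sorted CAT lines, then sorted FEAT lines, CRLF-terminated.

    identities: iterable of (category, type-or-None) pairs."""
    lines = []
    for cat, typ in sorted(identities):
        lines.append("CAT %s %s\r\n" % (cat, typ) if typ else "CAT %s\r\n" % cat)
    # i;octet ordering: sort by raw UTF-8 bytes, not Unicode collation
    for uri in sorted(features, key=lambda u: u.encode("utf-8")):
        lines.append("FEAT %s\r\n" % uri)
    return "".join(lines)

def caps_ver(identities, features):
    """V = hash-func-name "$" H, using plain MD5 as the stand-in algorithm."""
    h = hashlib.md5(caps_string(identities, features).encode("utf-8")).hexdigest()
    return "MD5$" + h
```

Fed the example's identity and features, caps_string produces exactly the CAT/FEAT block shown above, and caps_ver tags the digest so a receiver knows which algorithm to re-run.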
We've got slightly better error checking if we explicitly tag the data with the prefix defined by the V ABNF production above.

> Analysis:
> - Existing entities, both sending and receiving, should work fine
> - Over time, we can phase in entities that send md5 versions and ext's
> - Receiving clients that care about security can start checking MD5 hashes of the features to check for poisoning.
> - Downside: more bytes in presence than today.

We send these out, currently, with every presence update, correct? Is it worth looking at an alternate mechanism, or a generalized presence-delta? (After all, I'm pretty sure that this data won't change as often as my status.)

I have to admit, I have an odd feeling that combining all extensions together might generate a better result, too, but that's nothing more than a gut feeling.

> - Assertion: anything else we do will be at least this bad if not worse. If we add these bits to -115, will everyone agree to never bring up changing caps again, and to all agree on that the next time a n00b comes around?

I hate to say never, but I can't see how we can get much better than this.

Dave.
-- 
Dave
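[Editor's note: the tagging and backwards-compatibility rules discussed in this email can be sketched as a small classifier. This is illustrative only: classify_ver is not a spec name, and the three categories simply restate the thread's rules: an explicit "ALGO$" prefix names the algorithm to re-run; a bare 32-character lowercase-hex value is treated as an untagged MD5-style hash; anything else ("0.9", "2.3", ...) is an opaque legacy version string and is not checked at all.]

```python
import re

def classify_ver(ver):
    """Decide how a received 'ver' value should be verified."""
    if "$" in ver:
        # Explicitly tagged: V = hash-func-name "$" H
        algo, digest = ver.split("$", 1)
        return ("tagged", algo, digest)
    if re.fullmatch(r"[0-9a-f]{32}", ver):
        # Untagged but shaped like an MD5 hex digest: check as MD5
        return ("untagged-md5", None, ver)
    # Anything else is a legacy opaque version string: no checking
    return ("opaque", None, ver)
```

This is where Dave's point about the prefix pays off: the "tagged" branch is unambiguous, whereas the untagged branch can misclassify a legacy version string that happens to be 32 hex characters.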
Re: [Standards] Re: [jdev] XEP-0115: Entity Capabilities
On Fri Jun 29 01:13:26 2007, Joe Hildebrand wrote:
> You're worried about the attack where someone generates a set of features that has the same hash as a different set of features. In this case, the birthday attack doesn't help, since you only get to pick one set of ciphertext.

Also, as I think I mentioned, the more structured the input text, the harder it is to find a collision.

Let's assume that it's still possible to come up with a collision, given sufficient computing power. Why would someone expend such computing power to achieve this? We're talking weeks of work here, and even if it dropped to hours, there's a race involved: the attacker would need to find a spoof set of capability data which served whatever purpose was intended and matched the hash function's output, *and* do so before the victim's client cached the legitimate data. The cost of such an attack outweighs the benefits, it seems to me.

And that's just using a very cheap hash function. I actually suspect that HMAC-MD4 would be sufficient, if it weren't for the fact that MD4 implementations are pretty hard to find now. MD5 (and HMAC) is everywhere, and cheap, so a good one to use.

Dave.
-- 
Dave Cridland - mailto:[EMAIL PROTECTED] - xmpp:[EMAIL PROTECTED]
 - acap://acap.dave.cridland.net/byowner/user/dwd/bookmarks/
 - http://dave.cridland.net/
Infotrope Polymer - ACAP, IMAP, ESMTP, and Lemonade
Re: [Standards] Re: [jdev] XEP-0115: Entity Capabilities
On Jun 29, 2007, at 3:40 AM, Dave Cridland wrote:
> On Thu Jun 28 23:12:25 2007, Joe Hildebrand wrote:
>> The current spec could absolutely be used for this. The hardest part is spec'ing how to generate a string that has all of the capabilities, so that you can run the hash. Canonical XML is massive overkill, but, for example, if we just said:
>> - For all sorting, use the Unicode Collation Algorithm (http://www.unicode.org/unicode/reports/tr10/)
>
> Feh. UTF-8 encode then i;octet - much faster, just as stable, and a heck of a lot simpler to implement, especially given that namespaces will be us-ascii anyway (hence UTF-8). RFC 4790 defines this. (i;basic uses TR10, but ISTR it's not yet ready.)

+1. What's the standards-language way of saying that?

>> - Initialize an empty string E
>> - sort the identities by category, then type
>> - for each identity, append the category, then the type (if it exists) to E (note: any information inside an identity will be ignored)
>
> I'd propose something mildly more structured here, really such that it's simpler to view by eye to ensure the formatting and ordering is correct. This has no security impact; it's just easier to implement.

I'm not sure why readability is important here. It's never going on the wire.

> For better security, we could use HMAC, and/or a different hash function. One option would be that, if H is the result of the hash/HMAC function, then V, the version string, is formed by prepending its algorithm and a "$", something like:
>
>     E = *(cat-line) *(feat-line)
>     H = hash/hmac of E
>     V = hash-func-name "$" H
>     hash-func-name = hash-name / ("HMAC-" hash-name)
>     hash-name = "MD5" / "SHA1" / "SHA-256"
>
> We mandate that HMAC-MD5 is used, but a future specification MAY change this requirement. MD5 does have the minor advantage of being smaller.

I think this is overkill. Finding a one-way collision in a hash function seems like adequate protection against a DoS attack.
The simpler this is to check, and the fewer ways there are of messing it up, the more likely it is to get implemented. Let's just pick a single algorithm and stick with it.

>> If the receiving client detects an inconsistency, it MUST NOT use the information it received, and SHOULD show an error of some kind. For backwards-compatibility, any version number that is not 32 octets long consisting only of [0-9a-f] MUST be treated as if it does not implement MD5 checking.
>
> We've got slightly better error checking if we explicitly tag the data with the prefix defined by the V ABNF production above.

Fine. But let's just pick one prefix. Is there a urn:hash: URI scheme?

> We send these out, currently, with every presence update, correct? Is it worth looking at an alternate mechanism, or a generalized presence-delta? (After all, I'm pretty sure that this data won't change as often as my status.)

There's a server optimization that keeps it from going out as often, but I don't know that it's been implemented.

> I have to admit, I have an odd feeling that combining all extensions together might generate a better result, too, but that's nothing more than a gut feeling.

It probably depends upon how often extensions are turned on and off, whether the server optimization is in effect, and the like. I suppose there's nothing stopping the writer of an entity from doing this, and never sending ext.

-- 
Joe Hildebrand