Re: [Standards] Re: [jdev] XEP-0115: Entity Capabilities

2007-07-04 Thread Ian Paterson

Mridul Muralidharan wrote:

Joe Hildebrand wrote:
Changing the meaning of node breaks backwards compatibility, whereas 
nothing else in the current proposal does.  If there's no good reason 
to break backward compatibility, I suggest that we avoid it.


I am not sure what was decided as the final design for the spec 
regarding hashing, but moving from the existing scheme of ver & ext also 
breaks backward compatibility.


I don't think it does.

1. The 'ver' attribute used to be opaque, so existing implementations 
should cope perfectly if they receive one containing a base64-encoded 
hash value instead of something like '2.3'.
2. The 'ext' attribute used to be optional, so legacy applications won't 
mind if it is never specified by implementations of the latest protocol 
versions.
3. The new 'hash' attribute (containing the name of the hash) will be 
ignored by existing implementations.


New implementations will need to be aware that if no 'hash' value is 
specified, they should ignore the caps element (they should not attempt 
to be compatible with legacy caps, since that would make them vulnerable 
to cache poisoning).
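
To make that policy concrete, here is a minimal Python sketch 
(illustrative only; the element API and the cache are assumptions, not 
from any real client):

def accept_caps(c_element, cache):
    # c_element is an ElementTree-style element for the <c/> tag.
    hash_name = c_element.get('hash')
    ver = c_element.get('ver')
    if hash_name is None:
        # Legacy caps: ignore rather than risk cache poisoning.
        return None
    # Cache keyed on (hash, ver); a miss would trigger a disco#info query.
    return cache.get((hash_name, ver))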


- Ian



Re: [Standards] Re: [jdev] XEP-0115: Entity Capabilities

2007-07-04 Thread Kevin Smith

On 3 Jul 2007, at 23:43, Rachel Blackman wrote:
Beyond that, I can't think of a good reason for it, but I'll add an  
anecdotal note: when I pulled displaying the client version in  
tooltips out of the early builds of Astra, the testers howled  
bloody murder until I put it back.  So, for whatever reason, I can  
attest that my users actually do want it. :)



Psi has had a similar experience, and we only switched from  
automatically showing iq:version to showing caps info.


/K

--
Kevin Smith
KTP Associate - Exeter University / ai Corporation
Psi Jabber client developer/project leader (http://psi-im.org/)
XMPP Standards Foundation Council Member (http://xmpp.org)




Re: [Standards] Re: [jdev] XEP-0115: Entity Capabilities

2007-07-04 Thread Dave Cridland

On Tue Jul  3 23:45:53 2007, Justin Karneges wrote:
Apologies for not understanding this thread at all and just 
commenting out of nowhere, but what security is gained by using a 
hash in the caps protocol?  


It's an attempt at preventing a theoretical attack, as I understand 
things. The only instance of caps pollution being an issue appears 
- from my reading of this thread - to be an inadvertent error, not a 
deliberate attack.



If there is no security gained by using a hash (e.g. everyone has 
access to the raw data such that they can all calculate the same 
hash) then what difference does it make which algorithm is used?



I know Ian and Joe have answered this, but I'm hoping I might add a 
different perspective, plus there's actually a new point at 
the bottom. :-)


In principle, an attacker capable of mounting a selected preimage 
attack (specifically, one that involves being able to create a caps 
list such that it produces a hash identical to that provided by 
legitimate clients *and* it gains some benefit to the attacker, in a 
reasonable timeframe) might be able to subvert communications.


An example of this might be convincing a client that one or more 
users on the roster are, contrary to reality, unable to handle 
esessions, or some new encrypted Jingle, enabling the attacker to 
eavesdrop on communications. To do this, the attacker has to determine 
the real hash, use the preimage attack to find a caps list to 
supply by disco which is both syntactically correct and excludes the 
extensions that the attacker wishes to remove, do so before the 
target client can query any real clients, and finally place 
themselves in a position such that they answer such queries.


The latter can be achieved either by sending a directed presence or 
by subverting the server entirely - we can treat this as the easy bit.


The hard part remains the timing issue - in order to have any value, 
you'd need to pollute the target client's capability cache prior to it 
discovering the real capabilities, and that's an extraordinarily 
short time window.


A simple MD5 hash will adequately prevent any chance of inadvertent 
cache pollution, which leaves the selection of any hash algorithm 
purely down to the time it'd take to mount a preimage attack.


I've reviewed the various papers on MD5 as best I can, and I don't 
think its known weaknesses are such that a preimage attack can be 
mounted within a useful timeframe, hence I'm not too fussed, but I'd 
be happy to see SHA-1 used if people are genuinely concerned. 
Whatever we choose, hash functions are continually eroded, and what's 
reasonable now will not be in the future.


(FWIW, Ian's mention of a one-hour attack is a collision attack, 
not a preimage attack, and finds a pair of two-block messages which 
collide, both of which have specific properties, and the time figures 
are quoted for an IBM P690, which is somewhat bigger iron than I have 
about, anyway. Our attacker needs a selected preimage attack, and 
will almost certainly need one where the legitimate message is 
several blocks long for MD5, and their primary source of computing 
power is likely to be a distributed botnet at best - I'm not clear whether 
this attack is distributable or not, but I'm not concerned by it.)


I mentioned earlier that we could gain a benefit from ver/ext by 
using prepackaged sets of capabilities, in order that there was 
more likelihood of a cache hit; moreover, it allows clients to ship 
with a hardcoded cache containing these prepackaged sets already, 
avoiding the need to probe at all.


I think it might be worth noting that the more commonality we have 
between clients in this respect, the harder it is to mount such an 
attack, although correspondingly higher gains can be made. If clients 
are able to ship with a pre-populated cache, then the window of 
opportunity for an attacker vanishes entirely, allowing those clients 
to effectively claim immunity from such attacks. Sorry, it's another 
trade-off.


FWIW, I lean heavily toward pre-defined sets, as I think that good 
clients gain in both security and efficiency, whereas old clients 
are unaffected.


Dave.
--
Dave Cridland - mailto:[EMAIL PROTECTED] - xmpp:[EMAIL PROTECTED]
 - acap://acap.dave.cridland.net/byowner/user/dwd/bookmarks/
 - http://dave.cridland.net/
Infotrope Polymer - ACAP, IMAP, ESMTP, and Lemonade


Re: [Standards] Re: [jdev] XEP-0115: Entity Capabilities

2007-07-04 Thread Michal 'vorner' Vaner
Hello

On Wed, Jul 04, 2007 at 10:38:26AM +0100, Dave Cridland wrote:
  (FWIW, Ian's mention of a one-hour attack is a collision attack, not a 
  preimage attack, and finds a pair of two-block messages which collide, both 
  of which have specific properties, and the time figures are quoted for an 
  IBM P690, which is somewhat bigger iron than I have about, anyway. Our 
  attacker needs a selected preimage attack, and will almost certainly need 
  one where the legitimate message is several blocks long for MD5, and their 
  primary source of computing power is likely to be a distributed botnet at 
  best - I'm not clear whether this attack is distributable or not, but I'm not 
  concerned by it.)

I'm not sure which attack he mentioned, but there is a collision project:
collisions within minutes on a PC, and something about generating
colliding data with a given prefix, if I understand it correctly (there
was something about it being able to generate data that has a given MD5
starting from some initial hash state, or so).

http://cryptography.hyperlink.cz/MD5_collisions.html

Not that I understand it much, or have read it properly; it's just that
the author is from the same country as me, so I heard about it.

So I think that if you have a few hours or days, you won't have much
trouble finding something, if you know how.

-- 
The human mind ordinarily operates at only ten percent of its capacity
-- the rest is overhead for the operating system

Michal 'vorner' Vaner


pgpcIx1foAxWO.pgp
Description: PGP signature


Re: [Standards] Re: [jdev] XEP-0115: Entity Capabilities

2007-07-04 Thread Joe Hildebrand


On Jul 4, 2007, at 5:35 AM, Ian Paterson wrote:

'ext' and pre-defined sets only improve security if the choice of a  
weak hash makes pre-image attacks possible. So why don't we  
make things easier for everyone and simply recommend a stronger  
hash instead?


So, to pull those bits together, I'm recommending:

base64(sha1(dave-formatted id/features))

which would give ver's that look like:

C+7Hteo/D9vJXQ3UfzxbwnXaijM=

Which is small enough for me.
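
For concreteness, a minimal Python sketch of that computation (assuming 
the input string is already in the agreed canonical form):

import base64, hashlib

def caps_ver(canonical_caps_string):
    # base64(sha1(dave-formatted id/features)), as recommended above.
    digest = hashlib.sha1(canonical_caps_string.encode('utf-8')).digest()
    return base64.b64encode(digest).decode('ascii')

A SHA-1 digest is 20 octets, so the base64 form is always 28 characters, 
like the example above.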

--
Joe Hildebrand




Re: [Standards] Re: [jdev] XEP-0115: Entity Capabilities

2007-07-04 Thread Peter Saint-Andre
Ian Paterson wrote:
 Mridul wrote:
 So queries for both bare jid and ns#ver will be supported (and return
 the same value) ? And all clients using newer spec would use bare jid I
 suppose ? (so that we can deprecate ns#ver and remove this in the future)
   
 
 Yes.
 
 But we do lose the ability to enable/disable plugins without invalidating
 the user's caps data... might be an acceptable tradeoff.
   
 
 Yes, if 'ext' is obsoleted, the hash value in the caps element will
 change whenever the supported features change (including when a plugin
 is enabled or disabled).
 
 But as you say, the tradeoff (for simplicity) might be acceptable, since
 the disadvantage (of more hash values) may be marginal.

Especially because we have a finite number of protocols:

http://www.xmpp.org/registrar/namespaces.html

And some of those are payload namespaces that would not be advertised
in service discovery.

Granted, the number of protocols a client might advertise will increase
over time, and the number of potential combinations is large. But in
practice I think that most clients will support a rather narrow range of
combinations.

/psa



smime.p7s
Description: S/MIME Cryptographic Signature


Re: [Standards] Re: [jdev] XEP-0115: Entity Capabilities

2007-07-04 Thread Peter Saint-Andre
Joe Hildebrand wrote:
 
 On Jul 4, 2007, at 5:35 AM, Ian Paterson wrote:
 
 'ext' and pre-defined sets only improve security if the choice of a
 weak hash makes pre-image attacks possible. So why don't we make
 things easier for everyone and simply recommend a stronger hash instead?
 
 So, to pull those bits together, I'm recommending:
 
 base64(sha1(dave-formatted id/features))

Seems reasonable to me.

 which would give ver's that look like:
 
 C+7Hteo/D9vJXQ3UfzxbwnXaijM=
 
 Which is small enough for me.

Me too.

I'll write that up provisionally in XEP-0115 v1.4pre1 so we can see how
it looks...

/psa


smime.p7s
Description: S/MIME Cryptographic Signature


Re: [Standards] Re: [jdev] XEP-0115: Entity Capabilities

2007-07-03 Thread Joe Hildebrand


On Jul 3, 2007, at 6:48 AM, Ian Paterson wrote:


Rachel Blackman wrote:
Let's say we have node='http://ceruleanstudios.com/astra/caps' and  
ver='h$someverylongstring' and ext='h$otherverylongstring'


Or how about simply:
node='$' ver='base64encodedHashOfFeatures'


No.  The other reason for caps is so that receivers can show a  
different icon for each different client that they have received  
presence from.  There has to be a URI to define the sending client.


--
Joe Hildebrand




Re: [Standards] Re: [jdev] XEP-0115: Entity Capabilities

2007-07-03 Thread Joe Hildebrand


On Jul 2, 2007, at 5:12 PM, Rachel Blackman wrote:

Because the caching logic is not identical; hash-forms are global,  
rather than client-specific.  If Psi and Exodus have precisely the  
same capabilities, they will generate the same hash and I should  
not need to re-query it, even if they have different caps nodes.


This is an optimization that a receiving client might choose to use,  
but I'm not sure that it needs to be in the spec, other than as an  
implementation note.


--
Joe Hildebrand




Re: [Standards] Re: [jdev] XEP-0115: Entity Capabilities

2007-07-03 Thread Mridul


Joe Hildebrand wrote:
 
 On Jul 2, 2007, at 4:49 PM, Mridul Muralidharan wrote:
 
 Peter Saint-Andre wrote:
 Mridul Muralidharan wrote:
  Forgot to add: change name from ver & ext to verh and exth?
 Why?

  Conflict with existing clients - too many of them in the wild don't use
  these semantics.
 
 Others have already responded to this, but just to reinforce, I *did*
 talk about backward compatibility.  Existing clients would continue to
 work just fine. New clients just have to be able to detect other new
 clients, to know if they are supposed to be able to check the hash. 
 Presumably, new clients could choose by policy to ignore un-hashed caps
 from old clients.

Not sure if anyone addressed the actual issue I was thinking of (need to read
the rest of the thread).

Essentially, how would 'new' clients know whether something exhibited in ver
or ext is a hash or an 'old' value? Aren't those identifiers expected to
be opaque (though consistent)?

Considering an ext of 'my_ext 1233ab' and '#hash1 #hash2' exhibited by
two clients - how would the receiver know what is hashed as per the 'new'
idea and what is 'old' ver/ext?
In the first case, it won't hash properly to what is exhibited by disco -
which might make the 'new' client think it is hitting a problem client
instead of an old client.

- Mridul

PS: 'new' = a client based on the proposed idea; 'old' = a client
conforming to the current XEP.


Re: [Standards] Re: [jdev] XEP-0115: Entity Capabilities

2007-07-03 Thread Peter Saint-Andre
Mridul wrote:
 
 Joe Hildebrand wrote:
 On Jul 2, 2007, at 4:49 PM, Mridul Muralidharan wrote:

 Peter Saint-Andre wrote:
 Mridul Muralidharan wrote:
  Forgot to add: change name from ver & ext to verh and exth?
 Why?
  Conflict with existing clients - too many of them in the wild don't use
  these semantics.
 Others have already responded to this, but just to reinforce, I *did*
 talk about backward compatibility.  Existing clients would continue to
 work just fine. New clients just have to be able to detect other new
 clients, to know if they are supposed to be able to check the hash. 
 Presumably, new clients could choose by policy to ignore un-hashed caps
 from old clients.
 
 Not sure if anyone addressed the actual issue I was thinking of (need to read
 the rest of the thread).
 
 Essentially, how would 'new' clients know whether something exhibited in ver
 or ext is a hash or an 'old' value? Aren't those identifiers expected to
 be opaque (though consistent)?
 
 Considering an ext of 'my_ext 1233ab' and '#hash1 #hash2' exhibited by
 two clients - how would the receiver know what is hashed as per the 'new'
 idea and what is 'old' ver/ext?

As I understand the proposal, there would not be '#hash1 #hash2' -- why
do you need multiple values here? You concatenate all the supported
namespaces according to some rule and then hash the whole thing. So
there's only one hash. But that means something different from 'ext' or
'node' or 'ver', so I think it needs to go in its own attribute.

/psa


smime.p7s
Description: S/MIME Cryptographic Signature


Re: [Standards] Re: [jdev] XEP-0115: Entity Capabilities

2007-07-03 Thread Michal 'vorner' Vaner
Hello

On Tue, Jul 03, 2007 at 09:18:59AM -0600, Joe Hildebrand wrote:
  hash='MD5'
 
  and make it mutually-exclusive with ext.

Why exclusive? ext for the old clients, hash to check if it makes sense?

<caps:c node='client' ext='f1 f2 f3' hash='the-hash'/>

Or am I missing something?

-- 
chown -R us $BASE

Michal 'vorner' Vaner


pgpWOroXVyCNJ.pgp
Description: PGP signature


Re: [Standards] Re: [jdev] XEP-0115: Entity Capabilities

2007-07-03 Thread Peter Saint-Andre
Rachel Blackman wrote:
 That said, I think we can come up with some simpler logic.  If a given
 token is prefixed with 'h$', for instance, we know it's a hash and
 should be both validated against the result, and -- if it matches --
 cached globally instead of per-client.  But for backwards compatibility,
 a disco on node#h$hash would still give you the proper results, and
 COULD be cached on a per-client basis.

 Possible. But where does the token go? It seems preferable to define a
 new attribute for this. Hmph.
 
 Why?
 
 Let's say we have node='http://ceruleanstudios.com/astra/caps' and
 ver='h$someverylongstring' and ext='h$otherverylongstring'
 
 Why can't something like...
 
 http://ceruleanstudios.com/astra/caps#h$someverylongstring work just
 like the http://ceruleanstudios.com/astra/caps#4.0.0.47 that an old
 ver='4.0.0.47' would generate under the old system, for an old
 client?  After all, you still have to query the hash
 to get the features represented, which you then hash to validate and --
 if it's valid -- store it globally so that /all/ clients which have
 'h$someverylongstring' have that featureset.

Hmm, OK. On that model, what is the use of 'ext'? Do you put a hash of
all the base features in 'ver' and a hash of all the extended
features in 'ext'? That seems potentially sub-optimal, because different
clients might divide base and extended differently, which means
you'll need to send and receive a lot more disco queries. It seems
better to me if we have only one hash for all the features.

 Thus, old clients can use it just fine with old-style logic.  Only the
 new client needs to know that h$ means 'take everything after that $,
 and treat it as a hash;' old clients can still query seamlessly.

Right.

/psa


smime.p7s
Description: S/MIME Cryptographic Signature


Re: [Standards] Re: [jdev] XEP-0115: Entity Capabilities

2007-07-03 Thread Peter Saint-Andre
Michal 'vorner' Vaner wrote:
 Hello
 
 On Tue, Jul 03, 2007 at 09:18:59AM -0600, Joe Hildebrand wrote:
  hash='MD5'

  and make it mutually-exclusive with ext.
 
 Why exclusive? ext for the old clients, hash to check if it makes sense?
 
  <caps:c node='client' ext='f1 f2 f3' hash='the-hash'/>
 
 Or am I missing something?

Yes that seems to work.

/psa


smime.p7s
Description: S/MIME Cryptographic Signature


Re: [Standards] Re: [jdev] XEP-0115: Entity Capabilities

2007-07-03 Thread Rachel Blackman


On Jul 3, 2007, at 7:01 AM, Joe Hildebrand wrote:



On Jul 2, 2007, at 5:12 PM, Rachel Blackman wrote:

Because the caching logic is not identical; hash-forms are global,  
rather than client-specific.  If Psi and Exodus have precisely the  
same capabilities, they will generate the same hash and I should  
not need to re-query it, even if they have different caps nodes.


This is an optimization that a receiving client might choose to  
use, but I'm not sure that it needs to be in the spec, other than  
as an implementation note.


The two objections to caps are always that a) ZOMG someone can maybe  
maliciously pollute the cache, and b) we should have exts hardcoded  
so you never need to query ever and they should be the same across  
all clients.


My understanding was that this proposal was addressing /both/; not  
only making caps something self-validating, but also extending the  
cache to be globally valid?


--
Rachel Blackman [EMAIL PROTECTED]
Trillian Messenger - http://www.trillianastra.com/




Re: [Standards] Re: [jdev] XEP-0115: Entity Capabilities

2007-07-03 Thread Rachel Blackman
The XEP could also specify that if a client sets the value of the  
'node' attribute to $ then it MUST NOT include an 'ext' attribute.
Not sure about this, it really depends on how ext is actually used  
in the wild, as Joe said. I'd be tempted to leave this somewhat  
open, at least for now. It could be that we could grow a set of  
extensions of commonly co-implemented features, bearing no actual  
relation to client plugins, and cut down traffic that way. But  
such things require quite a bit of research.


Ext is used in the wild.  My initial reaction is that it is still  
needed, but on further thought, I can't see why.


If you remove ext, you create MORE separate things to cache, and thus  
generate more network traffic.  Because now, client Foo with plugin  
Bar installed will have an entirely different hash than client Foo  
without Bar installed.  With ext, client Foo has the same  
capabilities hash in both cases, but one has an additional ext hash  
for plugin Bar's capabilities.


--
Rachel Blackman [EMAIL PROTECTED]
Trillian Messenger - http://www.trillianastra.com/




Re: [Standards] Re: [jdev] XEP-0115: Entity Capabilities

2007-07-03 Thread Mridul Muralidharan

Mridul Muralidharan wrote:

Peter Saint-Andre wrote:

Mridul wrote:

Joe Hildebrand wrote:

On Jul 2, 2007, at 4:49 PM, Mridul Muralidharan wrote:


Peter Saint-Andre wrote:

Mridul Muralidharan wrote:

Forgot to add: change name from ver & ext to verh and exth?

Why?

Conflict with existing clients - too many of them in the wild don't use
these semantics.

Others have already responded to this, but just to reinforce, I *did*
talk about backward compatibility.  Existing clients would continue to
work just fine. New clients just have to be able to detect other new
clients, to know if they are supposed to be able to check the hash. 
Presumably, new clients could choose by policy to ignore un-hashed caps

from old clients.

Not sure if anyone addressed the actual issue I was thinking of (need to read
the rest of the thread).

Essentially, how would 'new' clients know whether something exhibited in ver
or ext is a hash or an 'old' value? Aren't those identifiers expected to
be opaque (though consistent)?

Considering an ext of 'my_ext 1233ab' and '#hash1 #hash2' exhibited by
two clients - how would the receiver know what is hashed as per the 'new'
idea and what is 'old' ver/ext?


As I understand the proposal, there would not be #hash1 #hash2 -- why
do you need multiple values here? You concatenate all the supported
namespaces according to some rule and then hash the whole thing. So
there's only one hash. But that means something different from 'ext' or
'node' or 'ver' so I think it needs to go in its own attribute.

/psa


What Joe Hildebrand proposed initially had hashes for node & ext's - I 
was referring to that here.
The approach has the advantage that clients can query and validate (and 
so independently cache) namespace#node_hash, namespace#ext1_hash, etc. - 
and so having multiple hashes allows reuse of caps info across clients 
and allows them to modify each ext independently of the others.
Addition or removal of a plugin will not result in the entire hash being 
invalidated - just a specific hash will be removed, or modified.



A single hash has the drawback that it will either protect the entire 
set, or none at all - and so effectively we lose the ability to separate 
node's from ext's, since we cannot independently validate each.


Which is why my initial query was whether this should be 'verh' and 'exth' to 
indicate hashes, and not 'ver' & 'ext' for the data itself (because of 
new clients rejecting caps from old clients) - and 'new' clients would 
query for these instead of ver/ext. I am sure there are better 
ideas to tackle this problem.


Though cache pollution looks like a serious issue (especially the server 
cache in the PEP case), I am wondering if we are not taking this way too 
seriously - I did not see so much discussion for esessions ;-)


Regards,
Mridul



s/node/ver/g
Apologies.

Mridul


Re: [Standards] Re: [jdev] XEP-0115: Entity Capabilities

2007-07-03 Thread Dave Cridland

On Tue Jul  3 17:49:24 2007, Mridul Muralidharan wrote:
Though cache pollution looks like a serious issue (especially 
the server cache in the PEP case), I am wondering if we are not taking 
this way too seriously - I did not see so much discussion for 
esessions ;-)


Esessions are neither paintable, nor used for the storage of 
two-wheeled, self-propelled, personal transport systems based on a 
pedal/chain/gear drive system.


A less facetious answer is that this is something that is relatively 
easy to understand, and therefore comment on. I'll happily be the 
first to admit that esessions are a little beyond me at the moment, 
whereas I can get my head around this quite easily.


Dave.
--
Dave Cridland - mailto:[EMAIL PROTECTED] - xmpp:[EMAIL PROTECTED]
 - acap://acap.dave.cridland.net/byowner/user/dwd/bookmarks/
 - http://dave.cridland.net/
Infotrope Polymer - ACAP, IMAP, ESMTP, and Lemonade


Re: [Standards] Re: [jdev] XEP-0115: Entity Capabilities

2007-07-03 Thread Joe Hildebrand


On Jul 3, 2007, at 8:01 AM, Dave Cridland wrote:

	All ordering operations are performed using the i;octet collation, 
as defined in section 9.3 of RFC4790.


fine.

Which makes me wonder - we might need to normalize Unicode input,  
if there really is any. capabilityprep, anyone?


They're URIs, right?  Don't they have defined comparability already?

I'm not sure why readability is important here.  It's never going  
on  the wire.
Absolutely, but the result of the hash is. Therefore, it's useful  
for debugging purposes that two hash inputs can be compared by eye  
to see why they don't match. I believe this has been shown to be a  
bit of a lack in things like DIGEST-MD5, for example.


Fine.  Let's specify it really closely, though, including precisely  
which line-endings to use, and the like.


--
Joe Hildebrand




Re: [Standards] Re: [jdev] XEP-0115: Entity Capabilities

2007-07-03 Thread Peter Saint-Andre
Joe Hildebrand wrote:
 I just talked to stpeter IRL (he's all of 10 feet from me; should have
 done that first thing this morning!), 

I just measured. It's 15 feet. :P

 and made sure he understood what I
 was after.  I'm replying to Rachel's mail, since it hits on the two (in
 my mind) remaining interesting questions, namely:
 
 1) is it necessary to explicitly flag that we're doing hashes?
 
 I like Kevin Smith's suggestion that the receiver should always do the
 hash.  For old clients, the hash will never match, but it's the same
 policy decision to allow old clients as to allow broken senders that
 send bad hashes.

I like that too. But I also agree with Dave that we want to future-proof
this. What if 3 years from now we want to use SHA-256 and 7 years from
now we want to use an algorithm that emerges from the current NIST work?
 It might be good to include a 'hash' attribute whose value is one of
the hash function text names from the IANA registry (but the value
defaults to MD5 or whatever we settle on now so that you don't have to
include it until we decide to allow alternate algorithms).
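
As an illustration of that defaulting behaviour, a minimal Python sketch 
(the name mapping is an assumption, not an IANA registry lookup, and the 
default is whatever the XEP finally picks):

import hashlib

ALGORITHMS = {'md5': hashlib.md5, 'sha-1': hashlib.sha1,
              'sha-256': hashlib.sha256}
DEFAULT = 'md5'  # placeholder until the XEP settles on its initial algorithm

def digest_for(hash_attr):
    # An absent 'hash' attribute means the default; an unknown name should
    # cause the caps element to be ignored, not guessed at.
    return ALGORITHMS[(hash_attr or DEFAULT).lower()]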

 2) do we need ext in the hash world?
 
 Rachel makes a good point that without ext, more data has to be cached,
 since there will be redundant features associated with different ver's. 
 I think it's a toss-up; no ext's is considerably simpler for all
 involved; no partitioning of features on the sender side and no unions
 on the receiving side.

I don't think we need 'ext' in the hash world.

 To be clear and explicit, all in one place, here is what I'm recommending:
 
 <presence from='[EMAIL PROTECTED]/globe'>
   <c xmlns='http://jabber.org/protocol/caps'
      node='http://psi-im.org/caps'
      ver='big-long-hash-goes-here'/>
 </presence>
 
 Again, I think simplicity dictates that we pick a single hash algorithm
 and stick with it; I'm almost entirely disinterested in what that
 algorithm is, as long as it doesn't provide output that has too many
 bytes in it.

Too many bytes, my dear Mozart!

What is too many bytes? Too many bytes for what purpose? As noted, 3
years from now we might decide to use SHA-256 or whatever even if the
hashes are longer because the security properties are preferable.

So yes let's settle on one hash algorithm to start, but let's not close
the door to other algorithms in the future if needed.

/psa





smime.p7s
Description: S/MIME Cryptographic Signature


Re: [Standards] Re: [jdev] XEP-0115: Entity Capabilities

2007-07-03 Thread Ian Paterson

Peter Saint-Andre wrote:

I also agree with Dave that we want to future-proof
this. What if 3 years from now we want to use SHA-256 and 7 years from
now we want to use an algorithm that emerges from the current NIST work?
 It might be good to include a 'hash' attribute whose value is one of
the hash function text names from the IANA registry (but the value
defaults to MD5 or whatever we settle on now so that you don't have to
include it until we decide to allow alternate algorithms).
  


+1. If we want to prevent malicious cache poisoning going forward, then 
clients need to be able to upgrade the hash they are using. MD5 is not 
secure enough even for this purpose. (I've read about attacks that 
require less than an hour of computing time!) IMHO, SHA-256 is the most 
reasonable default.



2) do we need ext in the hash world?

Rachel makes a good point that without ext, more data has to be cached,
since there will be redundant features associated with different ver's. 
I think it's a toss-up; no ext's is considerably simpler for all

involved; no partitioning of features on the sender side and no unions
on the receiving side.



I don't think we need 'ext' in the hash world.
  


+1. The protocol is far simpler to implement without extensions. More 
storage will be required, but I'm not sure that we'll hit the sweet spot 
for storage-challenged clients. I.e., it may well be that those clients 
that have insufficient storage to cache the hash for each combination of 
plugins also have insufficient storage to cache hashes for each separate 
extension (i.e. they can't use caps at all).



Joe Hildebrand wrote:
  

On Jul 3, 2007, at 6:48 AM, Ian Paterson wrote:



Rachel Blackman wrote:
  

Let's say we have node='http://ceruleanstudios.com/astra/caps' and
ver='h$someverylongstring' and ext='h$otherverylongstring'


Or how about simply:
node='$' ver='base64encodedHashOfFeatures'
  

No.  The other reason for caps is so that receivers can show a different
icon for each different client that they have received presence from. 
There has to be a URI to define the sending client.



Yes, that cuts down on the old iq:version flood. Or so we hope. :)
  


Hmm, going forward, are the clients that most people use going to 
continue showing these icons? Is this a feature we need to care about? 
Even though I'm one of the small group of people involved in the XMPP 
community, I really don't care what client my contacts are using. Will 
there ever be mass demand for this feature? On the rare occasions where 
people are interested, they'll probably be perfectly happy to explicitly 
ask their client to find out the other user's client version on a 
case-by-case basis.


IMHO the 'node' attribute could be repurposed to be the name of the hash 
function (for backwards compatibility). We could also add some language 
to the XEP stating that clients SHOULD NOT perform an iq:version flood. 
(IMHO, assuming the features hash is available via caps, there is little 
justification for such behavior.)


Dave Cridland wrote:
Assuming you didn't really mean base64, since hashes are typically 
represented as strings simply as hex digits. Base64 would be smaller, 
but unusual, and potentially include character-space clashes with Disco.


I did mean base64, but if people think that is too hard to implement, 
then hex is fine (even though it is 50% longer). I don't understand how 
base64 could create clashes with Disco.
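
The 50% figure follows from the encodings themselves: hex spends 2 
characters per octet, base64 spends 4 characters per 3 octets, and 
2 / (4/3) = 1.5. A quick Python check (a hair under 50% here because 
of base64 padding):

import base64, hashlib

digest = hashlib.sha1(b'example').digest()  # 20 octets
print(len(digest.hex()))                    # 40 characters in hex
print(len(base64.b64encode(digest)))        # 28 characters in base64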


- Ian



Re: [Standards] Re: [jdev] XEP-0115: Entity Capabilities

2007-07-03 Thread Ian Paterson

Justin Karneges wrote:
Apologies for not understanding this thread at all and just commenting out of 
nowhere, but what security is gained by using a hash in the caps protocol?  
If there is no security gained by using a hash (e.g. everyone has access to 
the raw data such that they can all calculate the same hash) then what 
difference does it make which algorithm is used?
  


What if the raw data is supplied by the attacker?

Imagine Eve wants to poison the caches of clients that haven't yet 
received presence from a brand new release of Psi.


If it is easy to discover collisions for the hash used by Psi, then Eve 
can send Psi's hash to a client and respond to its resulting disco 
request with a false set of features that she generated earlier. The 
false set would probably include a single unrecognizable feature whose 
'var' value could be manipulated to ensure the set has the correct hash 
value, for example:

<feature var='[EMAIL PROTECTED]'/>
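
The defence implied here is that receivers must re-hash the disco result 
before trusting it. A minimal Python sketch (the canonicalisation 
function and the choice of digest are placeholders for whatever the XEP 
finally mandates):

import hashlib

def safe_to_cache(advertised_ver, identities, features, canonicalise,
                  algo=hashlib.md5):
    # Only accept a disco#info result if it re-hashes to the advertised
    # 'ver'; otherwise Eve's pre-computed collision gets cached. Assumes
    # a hex-encoded 'ver'; adjust the encoding to match the final spec.
    E = canonicalise(identities, features)
    return algo(E.encode('utf-8')).hexdigest() == advertised_ver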

- Ian



Re: [Standards] Re: [jdev] XEP-0115: Entity Capabilities

2007-07-02 Thread Peter Saint-Andre
Mridul Muralidharan wrote:
 Peter Saint-Andre wrote:
 Mridul Muralidharan wrote:

 Forgot to add, change name from ver  ext to verh and exth ?

 Why?

 
 Conflict with existing clients - too many of them in the wild don't use
 these semantics.

But introducing new attributes is backward-incompatible, no?

Given that both the 'ver' and 'ext' attributes have no semantic meaning
in XEP-0115 right now, I don't see why it is a problem to use those
attribute names. In fact we're adding semantic meaning with the hashes,
but existing clients should work just fine AFAICS.

/psa



smime.p7s
Description: S/MIME Cryptographic Signature


Re: [Standards] Re: [jdev] XEP-0115: Entity Capabilities

2007-07-02 Thread Mridul Muralidharan

Peter Saint-Andre wrote:

Mridul Muralidharan wrote:

Peter Saint-Andre wrote:

Mridul Muralidharan wrote:


Forgot to add: change name from ver & ext to verh and exth?

Why?


Conflict with existing clients - too many of them in the wild don't use
these semantics.


But introducing new attributes is backward-incompatible, no?

Given that both the 'ver' and 'ext' attributes have no semantic meaning
in XEP-0115 right now, I don't see why it is a problem to use those
attribute names. In fact we're adding semantic meaning with the hashes,
but existing clients should work just fine AFAICS.

/psa



  When new clients attempt to interop with existing clients, they will 
not be able to do so - since none of the ver/ext exhibited by existing 
clients will match what gets generated through the md5/sha/etc. sum of 
features & capabilities when newer clients attempt to validate them.
So we will not have interop going forward (well, existing clients will 
be able to use new ones, though... a weird situation, since usually a 
change leaves older clients hanging!).


Or did I get it wrong ?

Regards,
Mridul


Re: [Standards] Re: [jdev] XEP-0115: Entity Capabilities

2007-06-29 Thread Dave Cridland

On Thu Jun 28 23:12:25 2007, Joe Hildebrand wrote:
The current spec could absolutely be used for this.  The hardest 
part  is spec'ing how to generate a string that has all of the  
capabilities, so that you can run the hash.  Canonical XML is 
massive  overkill, but, for example, if we just said:


- For all sorting, use the Unicode Collation Algorithm 
(http://www.unicode.org/unicode/reports/tr10/)


Feh. UTF-8 encode then i;octet - much faster, just as stable, and a 
heck of a lot simpler to implement, especially given that namespaces 
will be us-ascii anyway (hence UTF-8). RFC4790 defines this. (i;basic 
uses TR10, but ISTR it's not yet ready).
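
In code terms, "UTF-8 encode then i;octet" is just a byte-wise sort. A 
minimal Python sketch:

def octet_sort(strings):
    # i;octet compares raw octets, so sorting by the UTF-8 encoding is
    # sufficient; for pure US-ASCII namespace URIs this coincides with
    # a plain string sort.
    return sorted(strings, key=lambda s: s.encode('utf-8'))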




- Initialize an empty string E
- sort the identities by category, then type
- for each identity, append the category, then the type (if it  
exists) to E (note: any information inside an identity will be 
ignored)


I'd propose something mildly more structured here, really such that 
it's simpler to view by eye to ensure the formatting and ordering is 
correct. This has no security impact, it's just easier to implement. 
Something like:


For each identity, append the following production:

cat-line = cat-tag SP category [SP type] CRLF
           ;; Note that type MUST be present if it exists.
cat-tag  = %x43.41.54
           ;; "CAT", case insensitively.



- sort the features by URI
- for each feature, append the URI to E  (note: any information  
inside a feature will be ignored)


Similarly:

feat-line = feat-tag SP feat-uri CRLF
feat-tag  = %x46.45.41.54
            ;; "FEAT", case insensitively.



- calculate the MD5 sum for E


MD5 has a bad reputation, but note that the stricter the input 
formatting, the less likely it is to be forged (i.e., the less likely 
it is that someone could find a colliding input that makes semantic 
sense).


For better security, we could use HMAC, and/or a different hash 
function. One option would be that, if H is the result of the 
hash/hmac function, then V, the version string, is formed by 
prepending its algorithm and a $, something like:


E = *(cat-line) *(feat-line)
H = hash/hmac of E
V = hash-func-name "$" H
hash-func-name = hash-name / ("HMAC-" hash-name)
hash-name = "MD5" / "SHA1" / "SHA-256"

We mandate that HMAC-MD5 is used, but a future specification MAY 
change this requirement. MD5 does have the minor advantage of being 
smaller.
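
A minimal Python sketch of that tagged version string (the HMAC key 
handling is a placeholder; the thread doesn't pin down where a key would 
come from):

import hashlib, hmac

def version_string(E, key=b''):
    # V = "HMAC-MD5" "$" H, per the productions above; an empty key is
    # illustrative only and adds no secrecy.
    H = hmac.new(key, E.encode('utf-8'), hashlib.md5).hexdigest()
    return 'HMAC-MD5$' + H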




- use this for the version number or extension name



(Given my suggestion above, we'd use V, rather than Hash(E)).



Example (adapted from XEP-115, example 2):

<presence from='[EMAIL PROTECTED]/home'>
  <c xmlns='http://jabber.org/protocol/caps'
     node='http://exodus.jabberstudio.org/caps'
     ver='730c80b442e150dd5e19a31f8edfa8b1'
     ext='d6224a352df544cfde1fbce177301c67 d0ef9e8327acf5873d16fe083b4d3f3f'/>
</presence>



This example would have the same form, roughly:

ver='HMAC-MD5$[...]'
ext='HMAC-MD5$[...] HMAC-MD5$[...]'


The receiving client SHOULD check the hashes, after doing the 
IQ/gets:


md5(clientpchttp://jabber.org/protocol/disco#infohttp://jabber.org/protocol/disco#itemshttp://jabber.org/protocol/feature-neghttp://jabber.org/protocol/muc) = 730c80b442e150dd5e19a31f8edfa8b1


This one becomes (using literal whitespace for clarification, not 
syntax):


Hash(
CAT client pc\r\n
FEAT http://jabber.org/protocol/disco#info\r\n
FEAT http://jabber.org/protocol/disco#items\r\n
FEAT http://jabber.org/protocol/feature-neg\r\n
FEAT http://jabber.org/protocol/muc\r\n
)
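
To show the construction end to end, a minimal Python sketch of building 
E from the CAT/FEAT productions above and hashing it (identities as 
(category, type) pairs, with type possibly None):

import hashlib

def caps_hash(identities, features):
    # Sorting plain ASCII strings matches i;octet on their UTF-8 octets.
    lines = []
    for cat, typ in sorted(identities, key=lambda i: (i[0], i[1] or '')):
        lines.append('CAT %s %s\r\n' % (cat, typ) if typ
                     else 'CAT %s\r\n' % cat)
    for uri in sorted(features):
        lines.append('FEAT %s\r\n' % uri)
    return hashlib.md5(''.join(lines).encode('utf-8')).hexdigest()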

I'll skip the remaining examples, but presumably you get the notion.

If the receiving client detects an inconsistency, it MUST NOT use 
the  information it received, and SHOULD show an error of some kind.


For backwards compatibility, any version number that is not 32 
octets long, consisting only of [0-9a-f], MUST be treated as if its 
sender does not implement MD5 checking.
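
That backwards-compatibility test is a one-liner; a Python sketch:

import re

MD5_HEX = re.compile(r'^[0-9a-f]{32}$')

def sender_implements_md5_checking(ver):
    # Anything that isn't a 32-character lowercase-hex string is a
    # legacy opaque version number.
    return bool(MD5_HEX.match(ver))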



We've got slightly better error checking if we explicitly tag the 
data with the prefix defined by the V ABNF production above.




Analysis:
- Existing entities, both sending and receiving, should work fine
- Over time, we can phase in entities that send md5 versions and 
ext's
- Receiving clients that care about security can start checking MD5 
 hashes of the features to check for poisoning.

- Downside: more bytes in presence than today.


We send these out, currently, with every presence update, correct? Is 
it worth looking at an alternate mechanism, or a generalized 
presence-delta? (After all, I'm pretty sure that this data won't 
change as often as my status).


I have to admit, I have an odd feeling that combining all extensions 
together might generate a better result, too, but that's nothing more 
than a gut feeling.



- Assertion: anything else we do will be at least this bad if not 
worse.


If we add these bits to -115, will everyone agree to never bring up 
 changing caps again, and to all agree on that the next time a n00b 
 comes around?


I hate to say never, but I can't see how we can get much better than 
this.


Dave.
--
Dave 

Re: [Standards] Re: [jdev] XEP-0115: Entity Capabilities

2007-06-29 Thread Dave Cridland

On Fri Jun 29 01:13:26 2007, Joe Hildebrand wrote:
You're worried about the attack where someone generates a set of  
features that has the same hash as a different set of features. 
In this case, the birthday attack doesn't help, since you only 
get to pick one set of ciphertext.



Also, as I think I mentioned, the more structured the input text, the 
harder it is to find a collision.


Let's assume that it's still possible to come up with a collision, 
given sufficient computing power. Why would someone expend such 
computing power to achieve this? We're talking weeks of work, here, 
and even if it dropped to hours, there's a race involved - the 
attacker would need to find a spoof set of capability data which 
served whatever purpose was intended and matched the hash function's 
output, *and* do so before the victim's client cached the legitimate 
data.


To me, that makes the cost of such an attack outweigh the benefits. 
And that's just using a very cheap hash function. I actually 
suspect that HMAC-MD4 would be sufficient, if it weren't for the fact 
that MD4 implementations are pretty hard to find now. MD5 (and HMAC) 
is everywhere, and cheap, so a good one to use.


Dave.
--
Dave Cridland - mailto:[EMAIL PROTECTED] - xmpp:[EMAIL PROTECTED]
 - acap://acap.dave.cridland.net/byowner/user/dwd/bookmarks/
 - http://dave.cridland.net/
Infotrope Polymer - ACAP, IMAP, ESMTP, and Lemonade


Re: [Standards] Re: [jdev] XEP-0115: Entity Capabilities

2007-06-29 Thread Joe Hildebrand


On Jun 29, 2007, at 3:40 AM, Dave Cridland wrote:


On Thu Jun 28 23:12:25 2007, Joe Hildebrand wrote:
The current spec could absolutely be used for this.  The hardest  
part  is spec'ing how to generate a string that has all of the   
capabilities, so that you can run the hash.  Canonical XML is  
massive  overkill, but, for example, if we just said:
- For all sorting, use the Unicode Collation Algorithm  
(http://www.unicode.org/unicode/reports/tr10/)


Feh. UTF-8 encode then i;octet - much faster, just as stable, and a  
heck of a lot simpler to implement, especially given that  
namespaces will be us-ascii anyway (hence UTF-8). RFC4790 defines  
this. (i;basic uses TR10, but ISTR it's not yet ready).


+1.  What's the standards-language way of saying that?


- Initialize an empty string E
- sort the identities by category, then type
- for each identity, append the category, then the type (if it   
exists) to E (note: any information inside an identity will be  
ignored)


I'd propose something mildly more structured here, really such that  
it's simpler to view by eye to ensure the formatting and ordering  
is correct. This has no security impact, it's just easier to  
implement.


I'm not sure why readability is important here.  It's never going on  
the wire.


For better security, we could use HMAC, and/or a different hash  
function. One option would be that, if H is the result of the hash/ 
hmac function, then V, the version string, is formed by prepending  
its algorithm and a $, something like:


E = *(cat-line) *(feat-line)
H = hash/hmac of E
V = hash-func-name "$" H
hash-func-name = hash-name / ("HMAC-" hash-name)
hash-name = "MD5" / "SHA1" / "SHA-256"

We mandate that HMAC-MD5 is used, but a future specification MAY  
change this requirement. MD5 does have the minor advantage of being  
smaller.


I think this is overkill.  Requiring an attacker to find a one-way  
collision in a hash function seems like adequate protection against a  
DoS attack.  The simpler this is to check, and the fewer ways of  
messing it up, the more likely it is to get implemented.  Let's just  
pick a single algorithm and stick with it.


If the receiving client detects an inconsistency, it MUST NOT use  
the  information it received, and SHOULD show an error of some kind.
For backwards compatibility, any version number that is not 32  
octets long, consisting only of [0-9a-f], MUST be treated as if its  
sender does not implement MD5 checking.
We've got slightly better error checking if we explicitly tag the  
data with the prefix defined by the V ABNF production above.


Fine.  But let's just pick one prefix.  Is there a urn:hash: URI scheme?

We send these out, currently, with every presence update, correct?  
Is it worth looking at an alternate mechanism, or a generalized  
presence-delta? (After all, I'm pretty sure that this data won't  
change as often as my status).


There's a server optimization that keeps it from going out as often,  
but I don't know that it's been implemented.


I have to admit, I have an odd feeling that combining all  
extensions together might generate a better result, too, but that's  
nothing more than a gut feeling.


It probably depends upon how often extensions are turned on and off,  
whether the server optimization is in effect, and the like.  I  
suppose there's nothing stopping the writer of an entity from doing  
this, and never sending ext.


--
Joe Hildebrand