Dirk Meyer wrote:
> Peter Saint-Andre wrote:
>> The first installment in this series was about VoIP security. Now I turn
>> my attention to e2e XMPP security. The usual caveats apply (IANAMOTSM).
>>
>> For direct client-to-client ("c2c") communication where two entities
>> communicate over a local or wide-area network with no server
>> infrastructure in place (Serverless Messaging =
>> <http://xmpp.org/extensions/xep-0174.html>), the insecure channel is an
>> XML stream over TCP, and the stream can be secured using STARTTLS just
>> as for c2s and s2s.
> 
> Yes. We already have that in place.
> 
>> For end-to-end communication where two entities communicate over XMPP
>> through one or two intermediate servers, the insecure channel is XMPP
>> itself (typically in the form of In-Band Bytestreams =
>> <http://xmpp.org/extensions/xep-0047.html>) or potentially some
>> out-of-band streaming transport (such as SOCKS5 Bytestreams =
>> <http://xmpp.org/extensions/xep-0065.html> or someday ICE-TCP), and here
>> again the stream can be secured using STARTTLS.
>>
>> So we have 4 cases: c2s, s2s, c2c, and e2e. In all of them, we start
>> with an insecure channel and upgrade it to secure using STARTTLS.
> 
> We could do that, maybe we don't want to. The good thing about XTLS
> compared to Jingle is that it is simple and fast. The whole STARTTLS
> feature takes many round trips that we may not need...
> 
>> For e2e, we need a way to start the stream over XMPP itself. The method
>> we are proposing is to use Jingle to negotiate the transport and other
>> parameters as described in <http://xmpp.org/extensions/xep-0247.html>.
> 
> ... we can do this without XEP-0247. It is similar to serverless
> messaging and therefore easy to implement, but takes some time and is
> more or less limited to XML streams. For secure file transfer, it is not
> needed and it would be confusing to send XML streams on the stream
> before we re-use it for file transfer...
> 
>> 1. Initiator sends Jingle session-initiate with offer, including hints
>> about TLS methods and fingerprints
>>
>> 2. Initiator and responder agree on transport and negotiate IBB or
>> SOCKS5 (or future ICE-TCP) connection
> 
> I agree up to this point.
> 
>> 3. Parties start XML stream over negotiated transport (e.g.,
>> encapsulated in IBB packets)
>>
>> 4. Parties upgrade stream using STARTTLS
>>
>> 5. If STARTTLS succeeds, the e2e stream is now secured
> 
> Why not skip all this and fire up the TLS lib after (2)?

I have no objections to that approach. So send raw TLS handshake over
IBB or SOCKS5 Bytestreams or ICE-TCP or whatever. Correct?
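
To make that concrete, here is a rough sketch of "hand the stream to the
TLS lib" when the transport is not a socket. It assumes Python's standard
ssl module with memory BIOs; the helper names (tls_client_over_bio,
first_flight, ibb_payloads) are made up for illustration, and the block
size is treated as a limit on the raw chunk before base64 encoding, which
is my assumption. Certificate checking is switched off here because a real
client would compare the peer's fingerprint against the hint from the
Jingle offer instead (not shown).

```python
import base64
import ssl


def tls_client_over_bio():
    """Create a TLS client that reads/writes memory buffers, not a socket."""
    ctx = ssl.SSLContext(ssl.PROTOCOL_TLS_CLIENT)
    # Sketch only: no PKI validation; a fingerprint check would replace it.
    ctx.check_hostname = False
    ctx.verify_mode = ssl.CERT_NONE
    incoming = ssl.MemoryBIO()   # bytes received from the peer go in here
    outgoing = ssl.MemoryBIO()   # bytes to send to the peer come out here
    tls = ctx.wrap_bio(incoming, outgoing, server_side=False)
    return tls, incoming, outgoing


def first_flight(tls, outgoing):
    """Start the handshake and return the raw bytes to put on the transport."""
    try:
        tls.do_handshake()
    except ssl.SSLWantReadError:
        # Expected: the TLS lib now waits for the peer's reply.
        pass
    return outgoing.read()


def ibb_payloads(raw, block_size=4096):
    """Split raw TLS bytes into base64 strings, one per IBB <data/> element."""
    return [
        base64.b64encode(raw[i:i + block_size]).decode("ascii")
        for i in range(0, len(raw), block_size)
    ]


tls, incoming, outgoing = tls_client_over_bio()
hello = first_flight(tls, outgoing)
chunks = ibb_payloads(hello)
```

Whatever arrives from the other side would be base64-decoded, written to
the incoming buffer, and do_handshake() called again until it stops
raising. The same loop works unchanged over SOCKS5 Bytestreams, minus the
base64 step.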

> We know that we
> want to use TLS, there is no point in doing all this. We can negotiate
> XEP-0250 while we create the stream in (2). After that, give the stream
> to the TLS lib and wait until it is set up. I propose:
> 
> 2a. Initiator and responder agree on transport and negotiate IBB or
> SOCKS5 (or future ICE-TCP) connection
> 
> 2b. Initiator and responder exchange XEP-0250 information

Couldn't that be part of the Jingle offer? I would foresee something
like this:

<iq from='[email protected]/foo'
    id='jingle1'
    to='[email protected]/bar'
    type='set'>
  <jingle xmlns='urn:xmpp:jingle:0'
          action='session-initiate'
          initiator='[email protected]/foo'
          sid='851ba2'>
    <content creator='initiator' name='e2e-tls'>
      <description xmlns='urn:xmpp:jingle:apps:tls'>
        <methods>
          <method name='x509'>

<print>72:72:DF:06:3A:4D:7D:80:97:0E:53:77:0A:8F:CD:A9:80:A3:CB:38</print>
          </method>
          <method name='openpgp'>

<print>E5:59:E9:8A:9B:C9:ED:0F:60:21:FD:EA:34:BF:24:E4:0D:B0:FE:FC</print>
          </method>
          <method name='srp'/>
        </methods>
      </description>
      <transport xmlns='urn:xmpp:jingle:transports:ibb:0'/>
    </content>
  </jingle>
</iq>

That is, I'd like to negotiate an e2e TLS session with you over IBB.
Hint: I could use X.509, OpenPGP, or SRP as the TLS method. For X.509
I'd use a certificate whose fingerprint is "72:72:DF:..." and for
OpenPGP I'd use a key whose fingerprint is "E5:59:E9:..." and for SRP
we'd use some shared secret known to the two of us.
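
For what it's worth, checking such a hint on the receiving side could be
as simple as the sketch below. It assumes the fingerprint is a SHA-1 hash
over the DER-encoded certificate (the 20-byte values above suggest SHA-1,
but nothing here fixes the algorithm), and the function names are
hypothetical.

```python
import hashlib
import hmac


def fingerprint(der_bytes):
    """Return a colon-separated, upper-case SHA-1 fingerprint of the input."""
    digest = hashlib.sha1(der_bytes).hexdigest().upper()
    return ":".join(digest[i:i + 2] for i in range(0, len(digest), 2))


def matches_hint(der_bytes, hinted_print):
    """Compare the computed fingerprint with the <print/> value from the offer."""
    computed = fingerprint(der_bytes)
    # Constant-time comparison; case-insensitive on the hinted value.
    return hmac.compare_digest(computed, hinted_print.strip().upper())
```

After the TLS handshake, the client would feed the peer's certificate
bytes to matches_hint() along with the <print/> it received in the offer,
and tear down the session on a mismatch.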

If we want to provide the keying material outside the content
description then we could do this instead:

<iq from='[email protected]/foo'
    id='jingle1'
    to='[email protected]/bar'
    type='set'>
  <jingle xmlns='urn:xmpp:jingle:0'
          action='session-initiate'
          initiator='[email protected]/foo'
          sid='851ba2'>
    <content creator='initiator' name='e2e-tls'>
      <description xmlns='urn:xmpp:jingle:apps:tls'>
        <methods>
          <method name='x509'/>
          <method name='openpgp'/>
          <method name='srp'/>
        </methods>
      </description>
      <transport xmlns='urn:xmpp:jingle:transports:ibb:0'/>
      <offer xmlns='urn:xmpp:security-offer'>
        <print>72:72:DF:06:3A:4D:7D:80:97:0E:53:77:0A:8F:CD:A9:80:A3:CB:38</print>
        <print>E5:59:E9:8A:9B:C9:ED:0F:60:21:FD:EA:34:BF:24:E4:0D:B0:FE:FC</print>
      </offer>
    </content>
  </jingle>
</iq>

Or somesuch. But it seems to me that the keys are there for the sake of
the TLS negotiation here.

> 3. Responder acts as TLS client, initiator as TLS server. The TLS lib
> makes its four-way handshake

Would it be better for the responder to be the TLS server and the
initiator to be the TLS client? That is similar to how we do things for
c2s right now (receiving entity is the TLS server).

> After that
> 
>> 6. Responder sends Jingle session-accept to initiator
> 
> And then we are done. We can use the now secured stream for XML stanzas,
> we can use it for file transfer, and for many other things.

Correct.

Peter
