On Fri, Sep 18, 2009 at 3:04 PM, Pedro Melo <[email protected]> wrote:
> Hi,
>
> On 2009/09/17, at 19:56, Waqas Hussain wrote:
>
>> ---
>> The problem: The only thing the current caps XEP adds over legacy caps is
>> obfuscation. Everything else is equivalent from a security perspective.
>>
>
> Not true. The ver attribute value in the presence is derived in a
> (hopefully) secure way from the response you will get from a disco#info query.
>
> This link between ver and the disco response allows for better caching
> opportunities without the risks of cache poisoning from malicious clients.
>
Note the qualifiers: 1) the _current_ caps XEP (which is open to trivial
preimage attacks), and 2) from a security perspective. Oh, and legacy caps
did allow caching; that was the whole point of the XEP. The fact that this
problem exists is why we are having this discussion.
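For context, the v1.5 'ver' generation that links the hash to the disco#info
response works roughly like this (a simplified sketch of the XEP-0115
algorithm covering identities and features only, with data forms omitted;
`caps_ver` is just an illustrative name, not from the XEP):

```python
import base64
import hashlib

def caps_ver(identities, features):
    # Build the v1.5 verification string: sorted identities serialized as
    # "category/type/lang/name<", then sorted feature vars, each followed
    # by "<" (the data forms section is omitted in this sketch).
    s = ""
    for category, typ, lang, name in sorted(identities):
        s += "%s/%s/%s/%s<" % (category, typ, lang, name)
    for feature in sorted(features):
        s += feature + "<"
    # SHA-1 over the UTF-8 bytes, base64-encoded, becomes the 'ver' value.
    return base64.b64encode(hashlib.sha1(s.encode("utf-8")).digest()).decode("ascii")
```

Because the same disco#info response always yields the same string, a verifier
can recompute the hash and safely cache the result under that ver value.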
>> So, with a <feature var=''/> and a form with an empty FORM_TYPE value
>> added, we have separation between the sections.
>>
>
> I used NUL as the separator between sections because it is an invalid XML
> character. The usage of an empty feature or FORM_TYPE would only duplicate
> the separator, so the result is similar: another separator between sections.
>
> With '<' escaped, the features section is safe.
>>
>> With '/' not forbidden or escaped, the identity section is not safe. With
>> some exceptions, any <identity/> element with one of the attribute values
>> containing a '/' is open to attack.
>>
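To make that ambiguity concrete: because identities are flattened into a
'/'-delimited string before hashing, two different identities can serialize
to the same bytes, so the hash cannot distinguish them (the tuples below are
illustrative values, not from any real client):

```python
def serialize_identity(category, typ, lang, name):
    # The flat "category/type/lang/name" form used when building the
    # verification string.
    return "%s/%s/%s/%s" % (category, typ, lang, name)

# Two different identities, same serialization, hence the same hash input:
a = serialize_identity("client", "pc", "en", "a/b")   # '/' in the name
b = serialize_identity("client", "pc", "en/a", "b")   # '/' in the lang
assert a == b == "client/pc/en/a/b"
```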
>
> If we use invalid XML characters, we don't need to worry about safe or
> unsafe characters inside the text values.
>
> If you don't like NUL use another character, or even multiple characters,
> but I fail to see why would we use a valid character as the separator. It
> only complicates the process, because now we have to encode the texts before
> using.
>
>> Now we are left with the service discovery extensions section. This
>> section happens to have subsections,
>>
>
> When you mention subsections, you are referring to having multiple
> jabber:x:data elements, each one with a different FORM_TYPE, right?
>
>> which are open to attack. We'll have to add a special field to every
>> extension, so that the boundaries between extensions are preserved. Ah, of
>> course, there also need to be boundaries between individual fields of an
>> extension. And of course, these boundaries at three levels need to be
>> distinguishable. Nice, no?
>>
>
> I need to check, but I'm sure we have more invalid XML characters. Or we can
> use \0{size}, where size is an int32 network-order integer giving the size
> of the segment that follows.
>
> As long as the separator is a sequence that will not occur in the text, and
> we have no optional parts (empty values or missing attributes are
> represented not as missing parts but as empty ones), we are safe.
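A sketch of that length-prefix idea, assuming a NUL marker followed by a
4-byte network-order size before each segment (`encode_segment` is a
hypothetical helper, not anything specified):

```python
import struct

def encode_segment(text):
    # NUL marker, then a 4-byte network-order (big-endian) length,
    # then the UTF-8 bytes of the segment itself.
    data = text.encode("utf-8")
    return b"\x00" + struct.pack("!I", len(data)) + data

def encode_segments(segments):
    return b"".join(encode_segment(s) for s in segments)

# With explicit lengths, segment boundaries can no longer be forged by
# putting separator characters inside a value:
assert encode_segments(["a/b", "c"]) != encode_segments(["a", "b/c"])
```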
>
>> The way I see it, there is no simple way to preserve structural
>> information for the extensions section. It's several levels deep, which
>> makes things _very_ complicated. I don't think this can reasonably be done
>> while maintaining backwards compatibility.
>>
>> The replacement algorithm Pedro Melo suggested still has many of these
>> issues.
>>
>
> My suggestion can be improved I'm sure.
>
> I don't expect to keep backwards compatibility. I see breaking backwards
> compatibility with 1.5 a feature. Let me explain.
>
All my messages in this portion of the email were about fixes which
maintained backwards compatibility. Once you drop that requirement, your
options are unlimited.
Personally, I don't think maintaining backwards compatibility is feasible.
The reason for discussing it is that many people consider it very important.
> If we release 1.6 in the future with a different algorithm to generate the
> ver attribute, one that we find secure against the problems described, we
> will break compatibility with clients that already use 1.5.
>
> Let's examine the scenarios: two clients, A and B. The scenarios where both
> use the same version are irrelevant; they are compatible. I also don't
> consider legacy caps, because that is already covered in the current
> version and I don't think it needs to change.
>
> a) A 1.5 vs B 1.6
>
> When a 1.6 client receives the caps presence from a 1.5 client, it will
> calculate the hash using the new algorithm and the verification will fail.
> The fallback in that case is not to cache the information but to use the
> received information directly (as specified in 0115, section 5.4.3.9).
>
> This is a good thing: 1.5 is insecure, so a new client should not trust
> the hash and cache it, as that would allow poisoning.
>
> b) A 1.6 vs B 1.5
>
> In the opposite scenario, the old client using 1.5 will also fail
> verification and will not cache the new value, and it will use the
> disco#info information returned.
>
> Same solution.
>
>
> So in both cases no poisoning will occur, and the worst case is no caching
> and extra traffic.
I think dropping support just like that is harsh, mainly because I think the
extra traffic would be quite significant. Dozens of IQ requests every time I
change my presence, simply because I'm an early adopter? Thank you, but no.
What would Peter do, with his thousands of contacts?
As much as I would like to take this idealistic approach, pragmatism says
otherwise.
> ---
>> Now, moving on to incompatible changes:
>>
>> Many of us don't like service discovery extensions. I think this is due to
>> the problems they are causing in the caps algorithm. While I have agreed
>> with making them obsolete in the past, they have valid use-cases. They
>> should be fixed (or replaced by disco#meta).
>>
>
> They have valid use cases, and until there is something better, they are
> here to stay.
>
> You could implement 115 ignoring service extensions and therefore only cache
> features and identities. And if your application really requires the service
> extensions part, fall back to a disco#info query.
>
> But I think that it is possible to implement 115 including service
> extensions. We only need to come up with a canonical representation for the
> information contained.
>
I think it would be quite interesting to come up with a generic (not caps
specific) canonical representation format, along with a hashing function to
work on it. Not sure how feasible it would be, but having something like
that would be very useful indeed. I think it's worth discussing. Yes, I'm
aware of the many problems, but I think they could be overcome.
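As a starting point for that discussion, such a generic canonical format
might look like a bencode-style encoding: every value is tagged and
length-prefixed, and mapping keys are sorted, so nested structure survives
without any in-band separator characters. This is purely a sketch under my
own assumptions, nothing standardized (`canonical` and `canonical_hash` are
hypothetical names):

```python
import hashlib

def canonical(value):
    # Strings are tagged 's', lists 'l', mappings 'd'; everything is
    # length-prefixed and mapping keys are sorted, so nesting survives
    # without any in-band separator characters.
    if isinstance(value, str):
        data = value.encode("utf-8")
        return b"s%d:%s" % (len(data), data)
    if isinstance(value, list):
        body = b"".join(sorted(canonical(v) for v in value))
        return b"l%d:%s" % (len(body), body)
    if isinstance(value, dict):
        body = b"".join(canonical(k) + canonical(v)
                        for k, v in sorted(value.items()))
        return b"d%d:%s" % (len(body), body)
    raise TypeError("unsupported type: %r" % type(value))

def canonical_hash(value):
    return hashlib.sha256(canonical(value)).hexdigest()
```

Sorting list items and keys makes the hash independent of the order the
server happened to emit them in, which is the property the gen-ver string
tries to achieve with its sorting rules.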
> So, picking up the topic of a new XEP, I do think that changing the
> gen-ver algorithm and releasing a 1.6 version is the best solution here. The
> incompatibilities with 1.5 are a good thing, as they prevent old clients
> from poisoning the new clients' cache.
>
> A new XEP will have to solve the same problems we have here. You would
> still have to come up with a canonical encoding for all the information
> contained in the disco#info response. Unless there is a radically
> different new approach to disco#info caching, I think we are well served
> with 115 if we just fix gen-ver.
>
As stated above, I think the cost of a new value for the 'hash' attribute is
too significant. A new XEP has several advantages. First, it can work in
parallel with the current one. Second, it can drop the historical baggage
the current XEP inherited from legacy caps, making the design cleaner and
simpler. Third, it can have a more extensible design (while I'm not
proposing one here, I think hashes for data other than disco#info are worth
discussing).
The current XEP is implemented in many clients, but those implementers are
exactly the early adopters. As I see it, they could use the two XEPs in
parallel for a few versions, and then drop the old one.
On the server side, Prosody is one of the very few implementations (I'm
told Jabber XCP is another, but that's about it). A new XEP, with the old one
deprecated, would give servers more incentive to move (there wasn't much
incentive to move from legacy caps to the current one, and servers didn't).
In summary:
1. The current XEP needs to be changed.
2. Changing it while keeping backwards compatibility is not feasible (IMHO).
3. Changing it without keeping backwards compatibility has two significant
costs: software needs to implement it, and there is a high bandwidth cost
during the transition.
4. A new XEP has one cost: software needs to implement it.
Conclusion: A new XEP (which can work in parallel with the old one until the
old one is phased out) is the least-cost option.
Best regards,
--
Waqas Hussain