Re: [Standards] Binary data over XMPP
Rachel Blackman wrote:
> Alternatively, if we have decided that sending 100k custom emoticons
> over mobile phones generates 33k of 'needless' traffic which is a
> deal-breaker to the point that solution is throwing out XMPP 1.0 and
> starting over with 2.0, I would say that the more practical solution
> here is not to support custom emoticons on mobile phones.

I remember, years ago, talking about an xmpp<->binary proxy designed for pay-per-byte environments.

> I just think we may be overthinking this.

I agree.

--
Jesus Cea Avion
[EMAIL PROTECTED]
http://www.argo.es/~jcea/
jabber / xmpp:[EMAIL PROTECTED]
Re: [Standards] Binary data over XMPP
Dave Cridland wrote:
> Note that compressing first, then base64 encoding, then compressing
> *again* actually gave better results than base64 *then* compressing,
> meaning that almost every file transfer we do under base64 should be
> compressed first.

But if you compress the output before sending, you again have 8-bit data that you must encode to send inside XMPP.

The point is: if you compress before base64, you send less data. Sure. But that is orthogonal to the choice of baseX encoding.

--
Jesus Cea Avion
[EMAIL PROTECTED]
http://www.argo.es/~jcea/
jabber / xmpp:[EMAIL PROTECTED]
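The ordering question above can be checked with a small experiment. This is only a sketch: zlib stands in for whatever compressor is actually used, and the repetitive sample payload is made up for illustration, so the absolute numbers mean little and will change with the input.

```python
import base64
import zlib

def wire_sizes(payload: bytes) -> dict:
    """Return the size in bytes of several encodings of payload."""
    compressed = zlib.compress(payload)
    return {
        "raw": len(payload),
        "base64": len(base64.b64encode(payload)),
        "compress": len(compressed),
        # Compressed output is 8-bit data again, so it still needs base64
        # before it can travel inside an XML stanza.
        "compress+base64": len(base64.b64encode(compressed)),
        # Stream-level compression applied on top of the base64 text.
        "base64+compress": len(zlib.compress(base64.b64encode(payload))),
        "compress+base64+compress": len(
            zlib.compress(base64.b64encode(compressed))
        ),
    }

# Hypothetical payload: highly repetitive, so it compresses well.
sample = b"<presence from='romeo@example.net/orchard'/>" * 200
sizes = wire_sizes(sample)
for name, size in sizes.items():
    print(f"{name:26} {size:6d} bytes")
```

Whatever the compressor does, the base64 step still costs 4 output bytes per 3 input bytes, which is the sense in which compression is orthogonal to the encoding.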
Re: [Standards] Binary data over XMPP
On Sat Nov 10 02:07:08 2007, Justin Karneges wrote:
> On Friday 09 November 2007 3:35 pm, Dave Cridland wrote:
> > ubiquitous encryption
>
> Best laugh of the day!

Oh, I'm not laughing.

> Other protocols have been fighting this battle for years. Is XMPP so much
> different? I can see the headlines: "XMPP finally gets everyone in the
> world to use encryption. Email working group wasted their lives."

To understand why those efforts failed, it's worth looking at what's changed over the years.

When Internet Mail started, it was purely an interoperability facility between heterogeneous systems - as were pretty well all protocols back then. You can see this in the way that an email address is specified - there's no specification at all for the local-part - it can contain pretty much anything at all, it may or may not be case-sensitive, etc.

As I said, most protocols of the time were similar. FTP exposes the host's filesystem semantics, so using FTP requires that you know the remote host's filesystem layout. IMAP, similarly, exposes the host's mailbox layout and hierarchy - giving endless fun for client developers who usually expect all IMAP servers to look the same.

So providing any end-to-end service over email is tricky, because the majority of email servers - still - are not "Internet" mailservers, but LAN mail systems that have a gateway. (Exchange is, now, finally dealing with Internet Mail internally, but until very recently it was X.400 internally, and was much happier talking X.400 P1 rather than ESMTP). Hence most ESMTP extensions assume that somewhere, the Internet Mail system stops, and gets gatewayed into something local.

A sea-change (or paradigm shift, if you like playing buzzword bingo) in protocol design happened around the early 90's, when protocol designers shifted from exposing local semantics to providing a homogeneous model. XMPP is a late protocol, by this metric, as is HTTP.

Many protocols have shifted toward this style, too - FTP now has TVFS, IMAP servers increasingly provide a fairly homogeneous layout, etc. This makes deploying end-to-end services significantly easier.

The other factor is that email isn't a close-knit community. At the SDO level it is - the majority of email standards developers know each other to some degree. However, the vast majority of client - and even server - developers don't participate. This contrasts heavily with XMPP, where the vast majority of client and server developers are active on this list.

Finally, we're a much younger protocol. Email is thoroughly ancient, and encryption is a comparatively new issue, and even there, multiple paths have been explored, and problems discovered. We've got the benefit of hindsight here - we know which bits have proven difficult to deploy, and which bits have proven easy. We know what end-users actually want, as well. All of this knowledge has effectively come from email.

I strongly suspect that we're in a much better position to achieve ubiquitous (or near ubiquitous) encryption than email ever was, and I certainly don't think that it's worth giving up before we've started.

Dave.
--
Dave Cridland - mailto:[EMAIL PROTECTED] - xmpp:[EMAIL PROTECTED]
 - acap://acap.dave.cridland.net/byowner/user/dwd/bookmarks/
 - http://dave.cridland.net/
Infotrope Polymer - ACAP, IMAP, ESMTP, and Lemonade
Re: [Standards] Binary data over XMPP
On Nov 11, 2007 12:09 AM, Fabio Forno <[EMAIL PROTECTED]> wrote:
> But if the ids are independently chosen by the clients there may be
> the risk of colliding acks, so how can the server choose in the correct
> way?

To answer myself: when I first read the XEP, I mistook the id used for reconnecting for the ids used for acks. Since the "ack_id" for reconnecting is given at the beginning of the examples, while the example that uses it for reconnecting comes only at the end, I suggest giving it a more meaningful name to avoid misunderstandings like mine ;)

--
Fabio Forno, PhD
Istituto Superiore Mario Boella
Jabber ID: xmpp:[EMAIL PROTECTED]
** Try Jabber http://www.jabber.org
Re: [Standards] Binary data over XMPP
On Nov 9, 2007 7:37 PM, Rachel Blackman <[EMAIL PROTECTED]> wrote: > Facetious comments aside, my point is that if we're talking about > modifying how the XMPP parser works, why bother doing things halfway > with little workarounds? Throw out XMPP 1.0 entirely and come up with > an extensible 2.0 binary protocol. > If we like to chant the 'XMPP is not really XML' mantra and the 'we > must shave off every byte we can to spare the poor mobile users' > mantras, that's great. But considering we only have 3 actual main > stanza types, a purely binary (and not necessarily XML-related) > protocol would be more efficient. That's exactly my point: XMPP 1.0 is good for desktop clients, and at present for a series of reasons I've already talked about I prefer BOSH for mobiles, but an extensible binary xml protocol would be the best of both worlds. > I think we've lost sight of whatever the original problem we were > trying to solve was (inline images? Size of binary blobs to mobiles?) > and have become caught up in hypothetical solutions which may no > longer be directly connected to the issue. :) One more good reason for using BOSH with mobiles: you can fix very quickly the binary data issue offering the decoded, more compact, data on the same channel, accessing it using a different path in the request. The change would be almost trivial, leaving the time for a decent binary XMPP 2.0 -- Fabio Forno, PhD Istituto Superiore Mario Boella Jabber ID: xmpp:[EMAIL PROTECTED] ** Try Jabber http://www.jabber.org
Re: [Standards] Binary data over XMPP
On Nov 8, 2007 8:11 PM, Justin Karneges <[EMAIL PROTECTED]> wrote:
> When you connect again, you specify the ack session id of the disconnected
> connection, so that the server knows which session you are trying to recover.

But if the ids are chosen independently by the clients, there is a risk of colliding acks, so how can the server choose the correct one?

> According to the XEP, you then would do resource binding. At the very least,
> the XEP should be updated to state that the client must bind to the same
> resource as before, and if it doesn't then the server must assert the correct
> resource in the bind iq reply. However, I'm tempted to say that when you
> resume an ack session, the resource binding step should just be skipped.
> After all, both parties already know what the resource is supposed to be.

I think the resource binding MUST be skipped and the resource maintained: in servers, a session object is usually associated with a given resource that cannot be changed during its life. Moreover, if the resource changed, the other clients would see a presence of type unavailable from the former resource and a new presence from the new one. I don't think that this is the behavior we want.

--
Fabio Forno, PhD
Istituto Superiore Mario Boella
Jabber ID: xmpp:[EMAIL PROTECTED]
** Try Jabber http://www.jabber.org
Re: [Standards] Binary data over XMPP
On Friday 09 November 2007 3:35 pm, Dave Cridland wrote: > ubiquitous encryption Best laugh of the day! Other protocols have been fighting this battle for years. Is XMPP so much different? I can see the headlines: "XMPP finally gets everyone in the world to use encryption. Email working group wasted their lives." -Justin
Re: [Standards] Binary data over XMPP
On Fri Nov 9 18:37:08 2007, Rachel Blackman wrote:
> If we like to chant the 'XMPP is not really XML' mantra and the 'we must
> shave off every byte we can to spare the poor mobile users' mantras,
> that's great.

I'm not chanting any mantras, sorry. If encrypted sessions become the rule, rather than the rare exception - and I do want this to happen - then 25% of a server owner's bandwidth bill is going to be down to base64. If you're okay with that, please send me the cash instead. :-)

> But considering we only have 3 actual main stanza types, a purely binary
> (and not necessarily XML-related) protocol would be more efficient.

Much harder to code and debug, though - we need a middle ground here. An escape mechanism makes sense to me, but I'm easy to persuade otherwise.

> And if we're going to break the world by changing how XMPP parsing works,
> then why on earth would we go through the pain of breaking our protocol to
> glue the ability to include a few extra characters in just to go ASCII85
> or BASE91 instead of BASE64?

This I definitely agree with, not least because it still doesn't gain us anything particularly useful in terms of bandwidth improvements. We might drop that overhead from 33% to 10% with a serious amount of work, but that's as good as it gets, and means introducing tricky-to-write untested codecs everywhere. Fun and games.

> I think we've lost sight of whatever the original problem we were trying
> to solve was (inline images? Size of binary blobs to mobiles?) and have
> become caught up in hypothetical solutions which may no longer be directly
> connected to the issue. :)

The problem is hypothetical, which makes solutions also hypothetical.

The hypothesis is: XMPP will display a tendency toward being used increasingly for binary data, in particular via encryption, but also for various other things (including file transfer). As this trend continues, the issue of base64 encoding will play a significant role in bandwidth figures for both servers and clients.

This trend is desirable, because it indicates an uptake of encryption, and therefore is to be encouraged by support within the protocol.

Inlined images aren't driving this for me at all. At best it seems that addressing these if we can has merit. I'm really thinking in terms of IBB, and leveraging that for use in encrypted session support et al ready for the future when we'll actually need this.

I'm entirely cool with agreeing it's not needed now, but the sooner we start thinking about this the better - I think you're clearly stating that if we choose to address this, it'll be a major bit of work.

Please don't consider this in terms of inlined images and fringe users - think of it in terms of ubiquitous encryption and servers.

Dave.
--
Dave Cridland - mailto:[EMAIL PROTECTED] - xmpp:[EMAIL PROTECTED]
 - acap://acap.dave.cridland.net/byowner/user/dwd/bookmarks/
 - http://dave.cridland.net/
Infotrope Polymer - ACAP, IMAP, ESMTP, and Lemonade
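The 33% and 25% figures in this message are presumably the same base64 cost measured against two different baselines: base64 output is one third larger than the raw input, which means one quarter of the bytes on the wire are overhead. A rough check, using Python's built-in Ascii85 codec as a stand-in for ASCII85 (the sample data is arbitrary):

```python
import base64

def overhead(encoded_len: int, raw_len: int) -> tuple:
    """Return (expansion over raw, share of wire bytes that is overhead)."""
    extra = encoded_len - raw_len
    return extra / raw_len, extra / encoded_len

# 3072 bytes: divisible by 3 and 4, so neither encoding needs padding,
# and no 4-byte group is all zeros (Ascii85 shortens those to "z").
raw = bytes(range(256)) * 12

b64_expansion, b64_share = overhead(len(base64.b64encode(raw)), len(raw))
a85_expansion, a85_share = overhead(len(base64.a85encode(raw)), len(raw))

print(f"base64:  +{b64_expansion:.1%} over raw, {b64_share:.1%} of wire bytes")
print(f"ascii85: +{a85_expansion:.1%} over raw, {a85_share:.1%} of wire bytes")
```

On this input Ascii85 only moves the expansion from a third to a quarter, which fits the remark that a denser text encoding does not buy much.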
Re: [Standards] Binary data over XMPP
On 09-11-2007 (Fri) at 10:23 -0800, Justin Karneges wrote:
> Each session is given a unique id. There is no "guessing" for the
> server to do, because no two sessions would be given the same id.

Right. I've reread the XEP and there is no confusion for me anymore.

> Why is this thread still going? :)

I think the amount of text in XEP-0198, and its formatting, is overwhelming. Everyone I have talked to about it said that it's very hard to grasp and that we almost need to decipher it methodically. ;-)

--
 /\_./o__ Tomasz Sterna
(/^/(_^^' Xiaoka.com
._.(_.)_  XMPP: [EMAIL PROTECTED]
Re: [Standards] Binary data over XMPP
On 9 Nov 2007, at 20:49, Peter Saint-Andre wrote:
> So let's bite the bullet and say that In-Band Bytestreams is perfectly
> fine for small bits of data (and maybe even larger blobs of data). If we
> need something that's good for including really tiny bits of data in a
> stanza (e.g., via data: URL) then let's define that too so that we can do
> small inline icons or rasterized images for whiteboards or whatever all
> else people want to build. All this hypothetical stuff is well and good
> but it's a tangent.

That all sounds eminently sensible.

/K
Re: [Standards] Binary data over XMPP
Rachel Blackman wrote:
> I think we've lost sight of whatever the original problem we were trying
> to solve was (inline images? Size of binary blobs to mobiles?) and have
> become caught up in hypothetical solutions which may no longer be
> directly connected to the issue. :)

Right!

So let's bite the bullet and say that In-Band Bytestreams is perfectly fine for small bits of data (and maybe even larger blobs of data). If we need something that's good for including really tiny bits of data in a stanza (e.g., via data: URL) then let's define that too so that we can do small inline icons or rasterized images for whiteboards or whatever all else people want to build. All this hypothetical stuff is well and good but it's a tangent.

Peter

--
Peter Saint-Andre
https://stpeter.im/
Re: [Standards] Binary data over XMPP
Exactly, it doesn't matter what character the already available methods/implementation output as long as they don't output more than 101 different characters. If they output one we can't use we just replace it with one we can use. cheers Tobias On Nov 9, 2007 7:45 PM, Michal 'vorner' Vaner <[EMAIL PROTECTED]> wrote: > Hello > > On Fri, Nov 09, 2007 at 10:01:39AM -0700, Joe Hildebrand wrote: > > > > > On Nov 9, 2007, at 8:47 AM, Tobias Markmann wrote: > > > >> There are already several binary-to-text encodings which perform a bit > >> better than Base64, two of them are: > >> > >> 1. http://en.wikipedia.org/wiki/ASCII85 invented by Adobe > >> 2. http://base91.sourceforge.net/ > > > > Both of those seem to allow < and &, which make them less than ideal for > > embedding in XML. > > Why not replace these 2 with something else? > > -- > If you are over 80 years old and accompanied by your parents, we will > cash your check. > > Michal 'vorner' Vaner >
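The substitution idea above (take an existing encoder and swap out the characters XML cannot carry raw) might look like the following. It is a sketch under assumptions: Python's `base64.a85encode` stands in for Adobe's ASCII85, and the replacement characters 'v', 'w' and 'x' are an arbitrary choice made here, not something proposed in the thread. '>' is swapped as well so that the sequence ']]>' can never appear in character data.

```python
import base64

# Ascii85 uses the characters '!' (33) through 'u' (117), which include
# '<', '&' and '>'. The substitutes lie outside that range and avoid
# 'y' and 'z', which Ascii85 variants reserve for shortcuts.
UNSAFE = b"<&>"
SUBSTITUTES = b"vwx"
TO_SAFE = bytes.maketrans(UNSAFE, SUBSTITUTES)
FROM_SAFE = bytes.maketrans(SUBSTITUTES, UNSAFE)

def encode_xml_safe(data: bytes) -> str:
    """Ascii85-encode data, then swap out characters XML would choke on."""
    return base64.a85encode(data).translate(TO_SAFE).decode("ascii")

def decode_xml_safe(text: str) -> bytes:
    """Undo the substitution, then Ascii85-decode."""
    return base64.a85decode(text.encode("ascii").translate(FROM_SAFE))

blob = bytes(range(256))
encoded = encode_xml_safe(blob)
assert decode_xml_safe(encoded) == blob
```

The output still contains quote characters, so it is only safe as element content, not inside an attribute value. The size is unchanged by the swap: 5 characters per 4 bytes.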
Re: [Standards] Binary data over XMPP
Hello

On Fri, Nov 09, 2007 at 10:01:39AM -0700, Joe Hildebrand wrote:
> On Nov 9, 2007, at 8:47 AM, Tobias Markmann wrote:
>
>> There are already several binary-to-text encodings which perform a bit
>> better than Base64, two of them are:
>>
>> 1. http://en.wikipedia.org/wiki/ASCII85 invented by Adobe
>> 2. http://base91.sourceforge.net/
>
> Both of those seem to allow < and &, which make them less than ideal for
> embedding in XML.

Why not replace these 2 with something else?

--
If you are over 80 years old and accompanied by your parents, we will cash your check.

Michal 'vorner' Vaner
Re: [Standards] Binary data over XMPP
On Nov 9, 2007, at 10:27 AM, Rachel Blackman wrote: On Nov 9, 2007, at 8:47 AM, Tobias Markmann wrote: There are already several binary-to-text encodings which perform a bit better than Base64, two of them are: 1. http://en.wikipedia.org/wiki/ASCII85 invented by Adobe 2. http://base91.sourceforge.net/ Both of those seem to allow < and &, which make them less than ideal for embedding in XML. "XMPP is not XML" :-))) No. But just because a is not b does not imply that b is not a. XMPP is a /subset/ of XML: all XML is not valid XMPP, but all XMPP is (or should be) valid XML when the session is taken as a document. :) Both from a design standpoint, and a practical standpoint (re-using existing XML parsers for XMPP is easy given that XMPP obeys a subset of the XML rules). So one would think that < and & are still equally important not to have appearing raw in an XMPP stream. On top of which, if you modify the XMPP stream/parser rules to allow raw & and < in a stream you really have to roll your own parser anyway. So at that point, why the hell not just send the raw binary blob rather than trying to needlessly encode it? I mean, if you are completely throwing out the idea and redoing how streams work, why do it halfway? Why change it so that you can allow < and & raw in a stream, just so that you can shave a few bytes off by replacing BASE64? Let's just go to a completely-binary protocol like AIM's OSCAR; it opens up a lot of doors without having to worry about parsing rules. Just define a binary packet format with a header and a length field and hey, we're good to go on whatever! Facetious comments aside, my point is that if we're talking about modifying how the XMPP parser works, why bother doing things halfway with little workarounds? Throw out XMPP 1.0 entirely and come up with an extensible 2.0 binary protocol. If we like to chant the 'XMPP is not really XML' mantra and the 'we must shave off every byte we can to spare the poor mobile users' mantras, that's great. 
But considering we only have 3 actual main stanza types, a purely binary (and not necessarily XML-related) protocol would be more efficient. And if we're going to break the world by changing how XMPP parsing works, then why on earth would we go through the pain of breaking our protocol to glue the ability to include a few extra characters in just to go ASCII85 or BASE91 instead of BASE64? I think we've lost sight of whatever the original problem we were trying to solve was (inline images? Size of binary blobs to mobiles?) and have become caught up in hypothetical solutions which may no longer be directly connected to the issue. :) -- Rachel Blackman <[EMAIL PROTECTED]> Trillian Messenger - http://www.trillianastra.com/
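For comparison, the "binary packet format with a header and a length field" mentioned above (facetiously) takes only a few lines. The layout below is invented for this sketch and is not taken from OSCAR or from any XMPP proposal.

```python
import struct

# Hypothetical frame layout, invented for this sketch:
#   1 byte  frame type
#   4 bytes payload length, unsigned, network byte order
#   N bytes payload
HEADER = struct.Struct("!BI")

def pack_frame(frame_type: int, payload: bytes) -> bytes:
    """Prefix payload with its type and length."""
    return HEADER.pack(frame_type, len(payload)) + payload

def unpack_frames(buffer: bytes):
    """Yield (frame_type, payload) for every complete frame in buffer."""
    offset = 0
    while offset + HEADER.size <= len(buffer):
        frame_type, length = HEADER.unpack_from(buffer, offset)
        end = offset + HEADER.size + length
        if end > len(buffer):
            break  # incomplete frame; wait for more bytes
        yield frame_type, buffer[offset + HEADER.size:end]
        offset = end

# The second payload holds '<', '&' and a NUL byte, none of which need escaping.
wire = pack_frame(1, b"hello") + pack_frame(2, bytes([0x3C, 0x26, 0x00]))
frames = list(unpack_frames(wire))
```

The framing itself costs 5 bytes per frame regardless of content; what it gives up is everything an off-the-shelf XML parser provides.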
Re: [Standards] Binary data over XMPP
On Nov 9, 2007, at 8:47 AM, Tobias Markmann wrote: There are already several binary-to-text encodings which perform a bit better than Base64, two of them are: 1. http://en.wikipedia.org/wiki/ASCII85 invented by Adobe 2. http://base91.sourceforge.net/ Both of those seem to allow < and &, which make them less than ideal for embedding in XML. "XMPP is not XML" :-))) No. But just because a is not b does not imply that b is not a. XMPP is a /subset/ of XML: all XML is not valid XMPP, but all XMPP is (or should be) valid XML when the session is taken as a document. :) Both from a design standpoint, and a practical standpoint (re-using existing XML parsers for XMPP is easy given that XMPP obeys a subset of the XML rules). So one would think that < and & are still equally important not to have appearing raw in an XMPP stream. -- Rachel Blackman <[EMAIL PROTECTED]> Trillian Messenger - http://www.trillianastra.com/
Re: [Standards] Binary data over XMPP
On Friday 09 November 2007 1:10 am, Tomasz Sterna wrote:
> On 08-11-2007 (Thu) at 11:11 -0800, Justin Karneges wrote:
> > > I mean: in the unlucky case of an entity having two open sessions and
> > > losing both of them, how can the server decide which is the session to
> > > recover, but adding some semantics to the ids?
> >
> > When you connect again, you specify the ack session id of the
> > disconnected connection, so that the server knows which session you are
> > trying to recover.
>
> The problem described is what happens with two lost sessions at the same
> time. Without semantics encoded in the ID server has no way of guessing
> which session is trying to reconnect.

Each session is given a unique id. There is no "guessing" for the server to do, because no two sessions would be given the same id.

Why is this thread still going? :)

-Justin
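The point about unique ids can be illustrated with a toy server-side table. This is not XEP-0198's wire protocol; the class, the method names and the example JIDs are all hypothetical, and the only assumption carried over from the thread is that the server, not the client, issues the id.

```python
import secrets

class SessionStore:
    """Server-side table of resumable sessions, keyed by a server-issued id."""

    def __init__(self):
        self._sessions = {}

    def open(self, jid: str, resource: str) -> str:
        # The server picks the id, so two sessions never share one.
        session_id = secrets.token_hex(16)
        while session_id in self._sessions:
            session_id = secrets.token_hex(16)
        self._sessions[session_id] = {"jid": jid, "resource": resource}
        return session_id

    def resume(self, session_id: str):
        # No guessing: the id names exactly one session, or none.
        return self._sessions.get(session_id)

store = SessionStore()
home = store.open("juliet@example.com", "home")
phone = store.open("juliet@example.com", "phone")

# Both connections drop; each reconnect presents its own id.
print(store.resume(phone)["resource"])
print(store.resume(home)["resource"])
```

Because the stored entry already records the resource, a resumed session in this sketch has nothing left to bind.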
Re: [Standards] Binary data over XMPP
On Fri, Nov 09, 2007 at 10:01:39AM -0700, Joe Hildebrand wrote: > > On Nov 9, 2007, at 8:47 AM, Tobias Markmann wrote: > > >There are already several binary-to-text encodings which perform a bit > >better than Base64, two of them are: > > > >1. http://en.wikipedia.org/wiki/ASCII85 invented by Adobe > >2. http://base91.sourceforge.net/ > > Both of those seem to allow < and &, which make them less than ideal > for embedding in XML. "XMPP is not XML" :-))) R
Re: [Standards] Binary data over XMPP
On Nov 9, 2007, at 8:47 AM, Tobias Markmann wrote: There are already several binary-to-text encodings which perform a bit better than Base64, two of them are: 1. http://en.wikipedia.org/wiki/ASCII85 invented by Adobe 2. http://base91.sourceforge.net/ Both of those seem to allow < and &, which make them less than ideal for embedding in XML. -- Joe Hildebrand
Re: [Standards] Binary data over XMPP
There are already several binary-to-text encodings which perform a bit better than Base64, two of them are: 1. http://en.wikipedia.org/wiki/ASCII85 invented by Adobe 2. http://base91.sourceforge.net/ cheers Tobias Markmann On Nov 9, 2007 10:29 AM, Michal 'vorner' Vaner <[EMAIL PROTECTED]> wrote: > Hello > > On Fri, Nov 09, 2007 at 10:18:51AM +0100, Matthias Wimmer wrote: > > Hi Thomasz! > > > > Tomasz Sterna schrieb: > > > Simplest that comes to mind: > > > Let's take first 256 allowable UTF-8 characters and assign them to 256 > > > values of a single byte. > > > > It is not possible to sent the complete set of the first 256 Unicode > > code points within XML. E.g. U+ cannot be present in an XML document. > > That's why there was 'allowable' -- the ones which you are allowed to > send -- put characters in line, strike out all the ones you can't send > and take the first 256. > > -- > Hallowed be the zeroes and ones > > Michal 'vorner' Vaner >
Re: [Standards] Binary data over XMPP
On 09-11-2007 (Fri) at 10:29 +0100, Michal 'vorner' Vaner wrote:
> put characters in line, strike out all the ones you can't send
> and take the first 256.

Thanks Michal. I couldn't have worded it better. :-)
English is not my native language... still.

--
 /\_./o__ Tomasz Sterna
(/^/(_^^' Xiaoka.com
._.(_.)_  XMPP: [EMAIL PROTECTED]
Re: [Standards] Binary data over XMPP
Hello

On Fri, Nov 09, 2007 at 10:18:51AM +0100, Matthias Wimmer wrote:
> Hi Thomasz!
>
> Tomasz Sterna wrote:
> > Simplest that comes to mind:
> > Let's take first 256 allowable UTF-8 characters and assign them to 256
> > values of a single byte.
>
> It is not possible to send the complete set of the first 256 Unicode
> code points within XML. E.g. U+ cannot be present in an XML document.

That's why there was 'allowable' -- the ones which you are allowed to send -- put characters in line, strike out all the ones you can't send and take the first 256.

--
Hallowed be the zeroes and ones

Michal 'vorner' Vaner
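The "strike out and take the first 256" recipe can be sketched as below. Which characters count as allowable is an assumption made for this sketch: it starts at U+0020, skips '<', '&' and '>', and skips U+007F to U+009F. The thread does not pin this set down.

```python
def allowable_alphabet(size: int = 256) -> list:
    """Collect the first `size` code points this sketch treats as sendable."""
    alphabet = []
    cp = 0x20  # start at space; the C0 controls below it are skipped
    while len(alphabet) < size:
        ch = chr(cp)
        if ch not in "<&>" and not (0x7F <= cp <= 0x9F):
            alphabet.append(ch)
        cp += 1
    return alphabet

ALPHABET = allowable_alphabet()
DECODE = {ch: value for value, ch in enumerate(ALPHABET)}

def encode(data: bytes) -> str:
    """Map every byte value to one character of the alphabet."""
    return "".join(ALPHABET[b] for b in data)

def decode(text: str) -> bytes:
    return bytes(DECODE[ch] for ch in text)

every_byte = bytes(range(256))
on_the_wire = encode(every_byte).encode("utf-8")
print(len(every_byte), "bytes in,", len(on_the_wire), "bytes out")
```

One character per byte is not one byte per byte. Under these assumptions only 92 of the 256 characters fit in a single UTF-8 byte, and the other 164 take two, so evenly distributed data comes out as 420 bytes for every 256, about 64% larger, which is worse than base64.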
Re: [Standards] Binary data over XMPP
Hi Thomasz!

Tomasz Sterna wrote:
> Simplest that comes to mind:
> Let's take first 256 allowable UTF-8 characters and assign them to 256
> values of a single byte.

It is not possible to send the complete set of the first 256 Unicode code points within XML. E.g. U+ cannot be present in an XML document.

Matthias

--
Matthias Wimmer
Fon +49-700 77 00 77 70
Fax +49-89 95 89 91 56
Züricher Str. 243
81476 München
http://ma.tthias.eu/
Re: [Standards] Binary data over XMPP
On 08-11-2007 (Thu) at 11:11 -0800, Justin Karneges wrote:
> > I mean: in the unlucky case of an entity having two open sessions and
> > losing both of them, how can the server decide which is the session to
> > recover, but adding some semantics to the ids?
>
> When you connect again, you specify the ack session id of the
> disconnected connection, so that the server knows which session you are
> trying to recover.

The problem described is what happens with two lost sessions at the same time. Without semantics encoded in the ID, the server has no way of guessing which session is trying to reconnect.

--
 /\_./o__ Tomasz Sterna
(/^/(_^^' Xiaoka.com
._.(_.)_  XMPP: [EMAIL PROTECTED]
Re: [Standards] Binary data over XMPP
On Thursday 08 November 2007 1:26 am, Fabio Forno wrote:
> On Nov 8, 2007 9:52 AM, Tomasz Sterna <[EMAIL PROTECTED]> wrote:
> > On 08-11-2007 (Thu) at 00:29 +0100, Fabio Forno wrote:
> > > One of the reasons I tend to use BOSH is the ability to keep the
> > > session open when the client temporarily disconnects
> >
> > XEP-0198: Stanza Acknowledgements already supports session recovery
> > after disconnection with element.
> > It also ensures that no packets get lost with the connection failure.
>
> Too many XEPs, sometimes you get lost ;) Yep, it seems to work for TCP
> connections too. There is one thing I can't understand: when the
> initiating entity tries to recover from a packet of a previous
> session, how does the server choose the correct session if the
> of the resources and stanzas acks are offered together? I mean: in the
> unlucky case of an entity having two open sessions and losing both of
> them, how can the server decide which is the session to recover, but
> adding some semantics to the ids?

When you connect again, you specify the ack session id of the disconnected connection, so that the server knows which session you are trying to recover.

According to the XEP, you then would do resource binding. At the very least, the XEP should be updated to state that the client must bind to the same resource as before, and if it doesn't then the server must assert the correct resource in the bind iq reply. However, I'm tempted to say that when you resume an ack session, the resource binding step should just be skipped. After all, both parties already know what the resource is supposed to be.

-Justin
Re: [Standards] Binary data over XMPP
On Nov 8, 2007 9:52 AM, Tomasz Sterna <[EMAIL PROTECTED]> wrote:
> On 08-11-2007 (Thu) at 00:29 +0100, Fabio Forno wrote:
> > One of the reasons I tend to use BOSH is the ability to keep the
> > session open when the client temporarily disconnects
>
> XEP-0198: Stanza Acknowledgements already supports session recovery
> after disconnection with element.
> It also ensures that no packets get lost with the connection failure.

Too many XEPs, sometimes you get lost ;) Yep, it seems to work for TCP connections too. There is one thing I can't understand: when the initiating entity tries to recover from a packet of a previous session, how does the server choose the correct session if the of the resources and stanzas acks are offered together? I mean: in the unlucky case of an entity having two open sessions and losing both of them, how can the server decide which is the session to recover, but adding some semantics to the ids?

--
Fabio Forno, PhD
Istituto Superiore Mario Boella
Jabber ID: xmpp:[EMAIL PROTECTED]
** Try Jabber http://www.jabber.org
Re: [Standards] Binary data over XMPP
On 08-11-2007 (Thu) at 00:29 +0100, Fabio Forno wrote:
> One of the reasons I tend to use BOSH is the ability to keep the
> session open when the client temporarily disconnects

XEP-0198: Stanza Acknowledgements already supports session recovery after disconnection with element.
It also ensures that no packets get lost when the connection fails.

--
 /\_./o__ Tomasz Sterna
(/^/(_^^' Xiaoka.com
._.(_.)_  XMPP: [EMAIL PROTECTED]
Re: [Standards] Binary data over XMPP
On 06-11-2007 (Tue) at 11:09 +0100, Michal 'vorner' Vaner wrote:
> Because the FTP data channel (not to mention it offers passive
> transfer, too) is _inbound_. If you opened not one TCP connection to
> the server, but two, one for XML and one for blobs, how it would be
> different from single TCP connection?

So, what you're suggesting is opening a second connection to port 5222 and negotiating a second, binary stream?

Well... I like the idea. But in reality this would create a second network for binary packet routing, based on XMPP-Core and its servers but effectively unrelated to XMPP-IM. This network could be used for, e.g., Ethernet over XMPP with JIDs in place of MAC addresses, or any other wild idea. :-)

--
 /\_./o__ Tomasz Sterna
(/^/(_^^' Xiaoka.com
._.(_.)_  XMPP: [EMAIL PROTECTED]
Re: [Standards] Binary data over XMPP
Robin Redeker wrote:
On Mon, Nov 05, 2007 at 09:56:12AM -0800, Justin Karneges wrote:

1) XML element to indicate binary mode: this is probably the least destructive approach. Keep in mind that we already have an XML to binary protocol change in XMPP: the TLS and SASL encryption layers. Your XML parser needs to be able to stop on a dime when it sees that final '>' character, so asking for that in this discussion should not be a big deal.

Just keep in mind that we don't have a way to change "back". The current change is a very drastic one, like "flush the whole parser state and begin from start".

We can do exactly the same with the switching to binary mode (I put comments in [] brackets):

[We are in XMPP stream. Here the new stanza begins:]
[at this point the parser drops its state; stops precisely after the closing '>' ]
random-bytes-as-under-TLS-layer
[we know which byte is the last - the length could be written as a prefix of the blob or as an attribute of opening XML tags]
[These two closing tags mark the end of a stanza and always have to be the same - we can merely look for '' string or whatever. Here one doesn't need XML parser's intervention.]

The XML parser has lost its state (probably just removing openings from the stack), but the XMPP layer still remembers that the stream is open and is able to receive the next stanzas.

Suppose client and server agreed to use such a protocol as a replacement for base64. Since we can efficiently send binary data only as a topmost XML chunk, additional identifiers are needed that indicate which blob goes where. I mean, instead of:

large-base64-data

we could send two stanzas:

arbitrary-bytes

The overhead is roughly a few hundred bytes, so for <1kB base64 works better. That doesn't matter, since we are looking for a way to send midsize blocks.

If we didn't care about breaking current implementations, a good solution would be to make everyone do the parsing in the natural multilayer way - as AFAIK some XMPP software already does. I mean: when TLS starts,

* suspend the outer XML parser (don't flush)
* intercept next bytes to feed them into new inner TLS/XML stack
* continue with the outer parser - may parse the closing

This way we could do the binary mode switching just in places where base64 data would otherwise appear.

Dawid
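The reader side of such a binary-mode switch could be sketched as follows. The element name `binary` and its `length` attribute are hypothetical, invented for this example because the original markup did not survive in the archive; only the idea (stop exactly after '>', read a known number of raw bytes, then match a fixed closing string) comes from the message.

```python
import io
import re

# Hypothetical element name and attribute, invented for this sketch.
OPEN_TAG = re.compile(rb"<binary length='(\d+)'>")
CLOSE_TAG = b"</binary>"

def read_until(stream, terminator: bytes) -> bytes:
    """Read byte by byte until terminator has been consumed."""
    data = bytearray()
    while not data.endswith(terminator):
        byte = stream.read(1)
        if not byte:
            raise EOFError("stream ended inside a tag")
        data += byte
    return bytes(data)

def read_binary_stanza(stream) -> bytes:
    # Stop precisely after the '>' that closes the opening tag.
    match = OPEN_TAG.fullmatch(read_until(stream, b">"))
    if match is None:
        raise ValueError("not a binary-mode opening tag")
    length = int(match.group(1))
    # The next `length` bytes are raw; no XML parsing happens here.
    blob = stream.read(length)
    if len(blob) != length:
        raise EOFError("stream ended inside the blob")
    # The closing tag is a fixed string, so a plain comparison is enough.
    if stream.read(len(CLOSE_TAG)) != CLOSE_TAG:
        raise ValueError("missing closing tag")
    return blob

payload = b"\x00<&>\xff raw bytes"
wire = b"<binary length='%d'>" % len(payload) + payload + CLOSE_TAG + b"<presence/>"
stream = io.BytesIO(wire)
blob = read_binary_stanza(stream)
rest = stream.read()  # ordinary XML resumes right after the closing tag
```

A real implementation would have to hand control back and forth between this reader and the XML parser, which is the part most off-the-shelf parsers make difficult.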
Re: [Standards] Binary data over XMPP
On Nov 7, 2007 11:53 PM, Dave Cridland <[EMAIL PROTECTED]> wrote:
> (Hmm, this reminds me, I need to get around to finishing and
> publishing an I-D before the deadline on fast reauth).

Perhaps I'm missing something... Fast reauth? Do you mean just a speedup of the login process (e.g. a token for rebinding a session), or also some optimizations such as avoiding the initial presence burst when going online? One of the reasons I tend to use BOSH is the ability to keep the session open when the client temporarily disconnects.

> > and therefore we prefer bosh based
> > connections.
>
> I do think that BOSH is an exceedingly good design. But, FWIW, I use
> long-lived TCP connections over mobile networks quite a lot, and I
> find they work fine, even when moving between cells. (I use XMPP, but
> also IMAP and ACAP, all of which have server initiated data
> transfers, or "push" as the media calls it).

I'm aware of this. We also support long-lived TCP connections and they work fine (well, the main reason is that we're not ready yet to proxy all the possible traffic through BOSH, and public servers at present support almost only TCP connections). Also, BOSH implemented on pipelined HTTP 1.1 uses long-lived TCP sockets, and the connections are pretty robust. The real advantage of BOSH is that packets are implicitly framed by HTTP requests/responses, so it's very easy to recover from any error, and it's also easy to interleave other types of packets. Unfortunately TCP streams don't have this kind of framing, and any attempt at inserting something between the raw socket and the XML stanzas is dangerous...

> As far as I know, the OMA are increasingly interested in long-lived
> TCP based protocols, too, so the stability of mobile networks will
> hopefully improve.

The problem is that 100% reliable connections are impossible, and a framed binding such as BOSH helps in recovering ;) Basically my rationale is: why put a lot of effort and resources into recovering from a very small failure rate, knowing that fixing everything is impossible, when a smarter data protocol does all the work?

> You're absolutely right - right now, exchanging large amounts of
> binary data over long thin pipes is a very unlikely state of affairs.
>
> I think this will change, primarily due to encryption, and - as a
> much more minor issue - due to increased "rich messaging". (I'm
> thinking about radio stations showing you pictures of the band now
> playing, and such things, which certainly some mobile companies are
> very keen on).

Yeah, but in these cases too, encryption aside, I don't think that in-band binary data is really necessary. Perhaps there is only one practical reason: many platforms (I'm thinking of J2ME) don't allow opening new connections without user authorization, so receiving OOB data may make for an annoying user experience.

--
Fabio Forno, PhD
Istituto Superiore Mario Boella
Jabber ID: xmpp:[EMAIL PROTECTED]
** Try Jabber http://www.jabber.org
Re: [Standards] Binary data over XMPP
On Wed Nov 7 22:27:34 2007, Fabio Forno wrote:
> We're developing a mobile client and we think that kind of information should be treated in a different manner. In mobile networks regular socket connections have many problems (mainly disconnection handling that forces a new login)

(Hmm, this reminds me, I need to get around to finishing and publishing an I-D before the deadline on fast reauth).

> and therefore we prefer BOSH based connections.

I do think that BOSH is an exceedingly good design. But, FWIW, I use long-lived TCP connections over mobile networks quite a lot, and I find they work fine, even when moving between cells. (I use XMPP, but also IMAP and ACAP, all of which have server initiated data transfers, or "push" as the media calls it).

As far as I know, the OMA are increasingly interested in long-lived TCP based protocols, too, so the stability of mobile networks will hopefully improve.

> as rosters, avatars...). Moreover large amounts of binary data exchanged with mobiles are very unlikely, so I don't see the necessity of making XML streams more complex for use cases that are not well defined, if not improbable.

Right - I'll skip your discussion of binary XML formats (although important), but I do want to pick up on this.

You're absolutely right - right now, exchanging large amounts of binary data over long thin pipes is a very unlikely state of affairs.

I think this will change, primarily due to encryption, and - as a much more minor issue - due to increased "rich messaging". (I'm thinking about radio stations showing you pictures of the band now playing, and such things, which certainly some mobile companies are very keen on).

If the rich messaging doesn't happen, I won't be too bothered. If encryption doesn't happen on mobile devices because it's too expensive, I'll be very troubled indeed.
Dave -- Dave Cridland - mailto:[EMAIL PROTECTED] - xmpp:[EMAIL PROTECTED] - acap://acap.dave.cridland.net/byowner/user/dwd/bookmarks/ - http://dave.cridland.net/ Infotrope Polymer - ACAP, IMAP, ESMTP, and Lemonade
Re: [Standards] Binary data over XMPP
On Nov 7, 2007 1:56 PM, Dave Cridland <[EMAIL PROTECTED]> wrote: > > > Yes, base64 is acceptable here, although bear in mind that over a > charged-by-transfer medium - such as many mobile phone tariffs - that > 100k image is transferred as 133k, and 33k that you didn't really > need to transfer sounds like an additional cost we could drop if we > had the technology to do so. It's not a driver for it, though, I > agree. We're developing a mobile client and we think that kind of information should be treated in a different manner. In mobile networks regular socket connections have many problems (mainly disconnection handling that forces a new login) and therefore we prefer BOSH-based connections. BOSH also has the advantage that it may act more intelligently than a simple proxy: the connector could be an agent shaping information in a more suitable way for the mobile client (optimized compression, caching, sending only the diffs of some data such as rosters, avatars...). Moreover, large amounts of binary data exchanged with mobiles are very unlikely, so I don't see the necessity of making XML streams more complex for use cases that are not well defined, if not improbable. What would be nice (and we're giving it some thought) is a binary BOSH binding, with binary XML and binary data if necessary. Binary XML + compression is by far the most bandwidth-efficient way of exchanging XML, and it may have the non-negligible advantage of making it possible to implement parsers in very small clients (e.g. PIC-based nodes in sensor networks). IMHO this is the only approach allowing full compatibility with existing installations and IBB binary data for the few clients that really need it. For regular socket-based clients, as others have already pointed out, there are always alternatives, and IBB is the fallback for the few clients / applications that cannot do otherwise.
-- Fabio Forno, PhD Istituto Superiore Mario Boella Jabber ID: xmpp:[EMAIL PROTECTED] ** Try Jabber http://www.jabber.org
Re: [Standards] Binary data over XMPP
On Wednesday 07 November 2007 4:56 am, Dave Cridland wrote: > I personally feel that if we're to say that XMPP truly supports end > to end encryption, we need to ensure it's of near-equal cost to the > current way of doing things. Totally disagree. Encrypted e2e communication is almost universally ASCII-encoded (see PGP/MIME, S/MIME, OTR, PGP over IM, XEP-27). It would be nice to transmit this more efficiently, but I don't buy encryption as a motivator. -Justin
Re: [Standards] Binary data over XMPP
On Nov 7, 2007, at 10:00 AM, Michal 'vorner' Vaner wrote: Sure it is not optimal. I was just wondering, how far we want to go in solving it - how big the problem is. I still think redefining XMPP from the ground up because of this is not the right way. +1. Throwing out the underlying XMPP-ness of XMPP seems the wrong approach here to me. Or if we do, we need to immediately go, okay, this is no longer XMPP as you know it and make a concerted effort. Alternatively, if we have decided that sending 100k custom emoticons over mobile phones generates 33k of 'needless' traffic which is a deal-breaker to the point that the solution is throwing out XMPP 1.0 and starting over with 2.0, I would say that the more practical solution here is not to support custom emoticons on mobile phones. About the only situation I can see for a mobile phone to be sending a binary file inline is if you have just taken a picture with your cellphone camera and want to send it to a contact in an tag. Which is a useful situation, but not one where I expect BASE64 use would be a deal-breaker, as it is not like you will be getting 14 inline images per IM session, generally. Unlike the 'custom emoticon' use case mentioned earlier. I just think we may be overthinking this. If things go through the server, they're safely XML enclosed. If you want to send raw binary data without escaping it, we have at least two methods to negotiate client-to-client streams: Jingle and the somewhat more retro stream initiation. Yes, that leaves you a bit out in the cold if you are behind some firewall and XMPP is your only tunnel to the outside world, but then you have IBB (and 33k should not be a deal-breaker in that case, I would think, since you are not on a cellular bandwidth plan). I'm just not convinced this is a problem which doesn't already have multiple available solutions, much less one severe enough to require throwing out the underlying stream definition and starting over.
:) -- Rachel Blackman <[EMAIL PROTECTED]> Trillian Messenger - http://www.trillianastra.com/
Re: [Standards] Binary data over XMPP
Hello On Wed, Nov 07, 2007 at 04:11:21PM +, Dave Cridland wrote: > On Wed Nov 7 15:02:57 2007, Michal 'vorner' Vaner wrote: >> Can't compression solve this? Does anyone know, how the base64 encoded >> data grow/shrink, if they are put through zlib? Would be nice to know, >> how far it is worth going with the blob transfers & modifications to >> protocol. > > I've been accused - on this list - of treating compression as a panacea. > But it's not a substitute for efficiency. Base64 encoding is recovered to a > degree by a good minimal redundancy algorithm, but it tends to shield > patterns from a dictionary algorithm. DEFLATE uses a Lempel-Ziv dictionary > algorithm first, then Huffman, a minimal redundancy algorithm. Sure it is not optimal. I was just wondering, how far we want to go in solving it - how big the problem is. I still think redefining XMPP from the ground up because of this is not the right way. -- The problem with graduate students, in general, is that they have to sleep every few days. Michal 'vorner' Vaner
Re: [Standards] Binary data over XMPP
Tomasz Sterna wrote: > On 07-11-2007, Wed at 02:27 -0700, Peter Saint-Andre wrote: >> 2. Attach a larger color sketch -- a file, the image for which a >> thumbnail is a representation, or whatever (50k to 1M?). I think we >> use >> HTTP-PUT (perhaps via WebDAV) and jabber:x:oob, with IBB as a >> fallback. > > The OOB approach does not work in cases, where XMPP is the only window > to the world - either as a BOSH, SSL tunnel to a server listening on 443 > (https) port, firewall rule allowing traffic to 5222 port... In that situation is it acceptable to fall back to IBB? Peter -- Peter Saint-Andre https://stpeter.im/
Re: [Standards] Binary data over XMPP
Hello On Wed, Nov 07, 2007 at 12:56:31PM +, Dave Cridland wrote: > Yes, base64 is acceptable here, although bear in mind that over a > charged-by-transfer medium - such as many mobile phone tariffs - that 100k > image is transferred as 133k, and 33k that you didn't really need to > transfer sounds like an additional cost we could drop if we had the > technology to do so. It's not a driver for it, though, I agree. Can't compression solve this? Does anyone know how base64-encoded data grows or shrinks when it is put through zlib? It would be nice to know how far it is worth going with the blob transfers & modifications to the protocol. -- Wait few minutes before opening this email. The temperature difference could lead to vapour condensation. Michal 'vorner' Vaner
Re: [Standards] Binary data over XMPP
On Wed Nov 7 15:02:57 2007, Michal 'vorner' Vaner wrote:
> Can't compression solve this? Does anyone know, how the base64 encoded data grow/shrink, if they are put through zlib? Would be nice to know, how far it is worth going with the blob transfers & modifications to protocol.

I've been accused - on this list - of treating compression as a panacea. But it's not a substitute for efficiency. Base64 encoding is recovered to a degree by a good minimal redundancy algorithm, but it tends to shield patterns from a dictionary algorithm. DEFLATE uses a Lempel-Ziv dictionary algorithm first, then Huffman, a minimal redundancy algorithm.

Luckily, practice is easier than theory. Grab some suitable data, compress it, base64+compress it, and compare all the sizes. Gzip is a useful tool to do this - the results aren't 100% accurate due to gzip overhead, but are close to the zlib compression we use in the application layer of XMPP, and are pretty close to DEFLATE (as we should be using, and as TLS uses).

I took a C source file, and found this:

-rwxr-xr-x 1 dwd dwd 36K 2007-11-07 15:43 connection.c
The original file. (100%)

-rw-r--r-- 1 dwd dwd 49K 2007-11-07 15:44 connection.c.b64
Base64 encoded, traditionally, with newlines. (135%)

-rw-r--r-- 1 dwd dwd 15K 2007-11-07 15:44 connection.c.b64.gz
Base64, then gzipped. (40%)

-rw-r--r-- 1 dwd dwd 8.1K 2007-11-07 15:44 connection.c.gz
Just gzipped. Note it's nearly half the size. We'll use this as an uncompressible object. (22% / 100%)

-rw-r--r-- 1 dwd dwd 11K 2007-11-07 15:45 connection.c.gz.b64
Gzipped, then base64. (30% / 135%)

-rw-r--r-- 1 dwd dwd 8.4K 2007-11-07 15:45 connection.c.gz.b64.gz
Now gzip it again. In principle, this should have recovered the base64 encoding, but note that it hasn't. (23% / 103%)

This suggests to me that not only does gzip not recover the base64 encoding fully - although close - but base64 encoding prior to compression really hurts the compressor.
Note that compressing first, then base64 encoding, then compressing *again* actually gave better results than base64 *then* compressing, meaning that almost every file transfer we do under base64 should be compressed first. Dave. -- Dave Cridland - mailto:[EMAIL PROTECTED] - xmpp:[EMAIL PROTECTED] - acap://acap.dave.cridland.net/byowner/user/dwd/bookmarks/ - http://dave.cridland.net/ Infotrope Polymer - ACAP, IMAP, ESMTP, and Lemonade
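The experiment above is easy to repeat on other data. What follows is only a rough sketch, not the commands used in the message: it uses Python's zlib rather than the gzip tool, so the absolute numbers will differ slightly, and the function name `encoding_sizes` and the sample data are assumptions made for illustration.

```python
import base64
import zlib


def encoding_sizes(data: bytes) -> dict:
    """Return the byte count for each encoding order discussed in the thread."""
    b64 = base64.b64encode(data)            # base64 only (no newlines)
    z = zlib.compress(data, 9)              # compressed only
    z_b64 = base64.b64encode(z)             # compress, then base64
    return {
        "original": len(data),
        "b64": len(b64),
        "b64.z": len(zlib.compress(b64, 9)),      # base64, then compress
        "z": len(z),
        "z.b64": len(z_b64),
        "z.b64.z": len(zlib.compress(z_b64, 9)),  # compress, base64, compress
    }


# Hypothetical sample input; substitute the contents of any real file.
sample = b"static int connection_read(struct connection *c, char *buf);\n" * 300
for name, size in encoding_sizes(sample).items():
    print(f"{name:8} {size:7d} bytes")
```

Reading a real source file with `open(path, "rb").read()` and passing it to `encoding_sizes` gives figures that can be compared directly with the listing in the message.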
Re: [Standards] Binary data over XMPP
On Wed Nov 7 09:27:49 2007, Peter Saint-Andre wrote:
> As always, what are the use cases? If XML is black-and-white, I see:
>
> 1. Include a little dab of color -- an emoticon, a PNG avatar, a thumbnail for a file, a small inline image for whiteboarding, or whatever (something less than 50k and perhaps less than 10k). Here Base64 might be all we need, via data: or cid: URLs perhaps.

Yes, base64 is acceptable here, although bear in mind that over a charged-by-transfer medium - such as many mobile phone tariffs - that 100k image is transferred as 133k, and 33k that you didn't really need to transfer sounds like an additional cost we could drop if we had the technology to do so. It's not a driver for it, though, I agree.

> 2. Attach a larger color sketch -- a file, the image for which a thumbnail is a representation, or whatever (50k to 1M?). I think we use HTTP-PUT (perhaps via WebDAV) and jabber:x:oob, with IBB as a fallback.

Right, we're into "would be very nice" territory here. 333k (or thereabouts) is a noticeable chunk on my mobile bill, and it's around 11 seconds of time on my (256k uplink) DSL.

> 3. Send a huge color canvas -- a music file, a podcast, a video, or whatever (1M+?). I don't know what we use for this.

Once we're into filesharing, then yes, we need either a binary streaming protocol (a binary variant of IBB), or else we want to ship the data out of band.

There's also a use-case you seem to have forgotten, which is the reason I raised this now:

4. XTLS and similar encrypted server-mediated client-client streams. To send these via a peer-to-peer session negotiated via XMPP - like Jingle - strikes me as losing a fundamental benefit of XMPP, but it's also much cheaper in terms of bandwidth than sending them via the server right now. I personally feel that if we're to say that XMPP truly supports end to end encryption, we need to ensure it's of near-equal cost to the current way of doing things.

Dave.
-- Dave Cridland - mailto:[EMAIL PROTECTED] - xmpp:[EMAIL PROTECTED] - acap://acap.dave.cridland.net/byowner/user/dwd/bookmarks/ - http://dave.cridland.net/ Infotrope Polymer - ACAP, IMAP, ESMTP, and Lemonade
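The 133k and "around 11 seconds" figures in the message above follow from base64 turning every 3 input octets into 4 output characters. A small sketch of that arithmetic, assuming "k" means 1000 octets and "256k uplink" means 256,000 bits per second (both are assumptions; the helper names are made up for this example):

```python
def base64_size(n_octets: int) -> int:
    """Length of standard padded base64 output for n_octets of input."""
    return 4 * ((n_octets + 2) // 3)


def overhead_seconds(n_octets: int, uplink_bits_per_s: int) -> float:
    """Time spent sending only the extra octets that base64 adds."""
    extra = base64_size(n_octets) - n_octets
    return extra * 8 / uplink_bits_per_s


print(base64_size(100_000))                  # size on the wire of a 100k image
print(base64_size(1_000_000) - 1_000_000)    # extra octets for a 1M file
print(overhead_seconds(1_000_000, 256_000))  # seconds of uplink for that extra
```

Line breaks inserted by traditional base64 tools add a little more on top of this, which is why the file listing earlier in the thread shows 135% and not 133%.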
Re: [Standards] Binary data over XMPP
On 07-11-2007, Wed at 02:27 -0700, Peter Saint-Andre wrote: > 2. Attach a larger color sketch -- a file, the image for which a > thumbnail is a representation, or whatever (50k to 1M?). I think we > use > HTTP-PUT (perhaps via WebDAV) and jabber:x:oob, with IBB as a > fallback. The OOB approach does not work in cases where XMPP is the only window to the world - either as BOSH, an SSL tunnel to a server listening on port 443 (https), a firewall rule allowing traffic to port 5222... I know this usage is very common, and XMPP is known to be "the way" to get a running IM in networks where Internet == HTTP. And the decision about what is too large for the XMPP server to put through should be a deployment decision - some server administrators do not like 10kB transfers and some are just fine with 100MB transfers. -- /\_./o__ Tomasz Sterna (/^/(_^^' Xiaoka.com ._.(_.)_ XMPP: [EMAIL PROTECTED]
Re: [Standards] Binary data over XMPP
Kevin Smith wrote: > On 7 Nov 2007, at 09:27, Peter Saint-Andre wrote: >> 2. Attach a larger color sketch -- a file, the image for which a >> thumbnail is a representation, or whatever (50k to 1M?). I think we use >> HTTP-PUT (perhaps via WebDAV) and jabber:x:oob, with IBB as a fallback. >> 3. Send a huge color canvas -- a music file, a podcast, a video, or >> whatever (1M+?). I don't know what we use for this. > > The Jabber Disk method seems to work rather well for these scenarios... Yes it does: http://dev.jabbim.cz/jdisk I would be perfectly happy to standardize on that approach for "larger" blobs (64k+ or whatever), with IBB as a fallback. For "smaller" blobs (I think of this as less than 64k since that's the upper stanza size limit on the jabber.org service, but it might even be smaller) it seems just fine to include the blob "inline" via some method yet to be worked out. Adam Nemeth was working on something like this for emoticons. It's funny, I was chatting with Jeremie Miller the other day (he's not on this list AFAIK) and he said "If I had known that someday people would choose their IM technology based on emoticons, I would have designed a simple binary-inclusion technology into Jabber from the beginning." So now we have the chance to remedy the oversight. But please let's keep it simple, shall we? This is for small stuff like emoticons and thumbnails. Peter -- Peter Saint-Andre https://stpeter.im/ smime.p7s Description: S/MIME Cryptographic Signature
Re: [Standards] Binary data over XMPP
On 7 Nov 2007, at 09:27, Peter Saint-Andre wrote: 2. Attach a larger color sketch -- a file, the image for which a thumbnail is a representation, or whatever (50k to 1M?). I think we use HTTP-PUT (perhaps via WebDAV) and jabber:x:oob, with IBB as a fallback. 3. Send a huge color canvas -- a music file, a podcast, a video, or whatever (1M+?). I don't know what we use for this. The Jabber Disk method seems to work rather well for these scenarios... /K
Re: [Standards] Binary data over XMPP
Dave Cridland wrote: > On Mon Nov 5 15:11:33 2007, Thomas Charron wrote: >> On 11/5/07, Michal 'vorner' Vaner <[EMAIL PROTECTED]> wrote: >> > Hello >> > On Mon, Nov 05, 2007 at 02:45:05PM +, Dave Cridland wrote: >> > > Another option would be to setup a distinct connection (and >> protocol) for >> > > routing blobs, and so send them through the server, yet not >> in-band. I'm >> > > not comfortable with this, because it means essentially >> duplicating all >> > > security information, and maintaining synchronization between two >> distinct >> > > streams. >> > Or make the connection blobs by default, and some blobs could contain >> > complete XML documents, like this: >> > lenght of first block >> > >> > length of second block >> > >> > length of third block >> > some binary data. >> > It is as much drastic approach as the blobs, it changes the protocol >> > from the very basic ground. Furthermore, you can extract the stanza and >> > feed it to any XML parser. >> >> Not to mention the documentation would be much easier. We could >> just refer to the BEEP standards instead of having to write our own. >> Of course, one could argue, just use BEEP at that point. > > Way ahead of you. See the first paragraph of the mail quoted above. :-) > > The essential principle is much the same, but I'm not advocating > bringing the whole of BEEP into play, here. That has flow-control and > all sorts, and supports the splitting of a message into multiple frames, > which brings in a lot of complexity. > > This complexity is unwarranted, in my opinion, in the context of XMPP. > The one thing we might want - and I stress might - is the framing of > arbitrary data by framing everything. > > We've always relied, in XMPP, on the implicit framing that XML can give > us, but that's not always the best option, as we've seen. Base64 doesn't > - in my opinion - grant us sufficient efficiency in a number of > circumstances. 
> > So we need something else, and our two options boil down to either > framing everything - the BEEP method - or an escape mechanism which is > used to frame non-XML data - we can call this the IMAP method, since > it's pretty similar. > > I strongly suspect, given the way the discussion is going, that we > either have to consider framing everything - and that's a huge break > from XMPP - or else we need an escape mechanism that works. Or, of > course, we decide to give up and frame using XML as now, and use base64 > to cope. As always, what are the use cases? If XML is black-and-white, I see: 1. Include a little dab of color -- an emoticon, a PNG avatar, a thumbnail for a file, a small inline image for whiteboarding, or whatever (something less than 50k and perhaps less than 10k). Here Base64 might be all we need, via data: or cid: URLs perhaps. 2. Attach a larger color sketch -- a file, the image for which a thumbnail is a representation, or whatever (50k to 1M?). I think we use HTTP-PUT (perhaps via WebDAV) and jabber:x:oob, with IBB as a fallback. 3. Send a huge color canvas -- a music file, a podcast, a video, or whatever (1M+?). I don't know what we use for this. Peter -- Peter Saint-Andre https://stpeter.im/ smime.p7s Description: S/MIME Cryptographic Signature
Re: [Standards] Binary data over XMPP
On Tue Nov 6 15:25:44 2007, Tomasz Sterna wrote: Dnia 06-11-2007, Wt o godzinie 14:56 +, Dave Cridland pisze: > I'm not following something. So encode the octets #x00 #x01 #x02 > #x5D #x3E, and tell me what you get. Like this: Binary <-> Encoded 0x00 <-> 0xC4, 0x80 0x01 <-> 0xC4, 0x81 ... Ah, okay - so you're adding 0x100 to these. I thought this would yield 3-octet characters, hence my confusion. 0x20 <-> 0x20 0x21 <-> 0x21 .. 0x7F <-> 0x7F 0x80 <-> 0xC2, 0x80 .. 0xFF <-> 0xC3, 0xBF Right. > I get three bytes that are not legal in a CDATA section, followed by > a sequence of bytes which decode (via UTF-8) to "]]>", which in turn > would end the CDATA section. Good point. We either transfer this chunk in &...; escaping, or just transcode 0x3E or 0x5D bytes to 2byte UTF-8 character. (Maybe '>' to '»' :) Or add 0x100 again. (I checked this time, 0x5D encodes to 0xC5 0x9D). However, using this technique, truly random data will expand by - roughly - 60.5%. Base64 beats this, at only 33%. There's only 101 octets that are legal single-byte UTF-8 octets that we can allow safely in CDATA sections, by my count, so that leaves 155 that are double-byte. Base64 operates by encoding 6 bits into an alphabet of 64 symbols; encoding 7 bits needs an alphabet of 2^7, or 128 symbols, and would give us growth of 14.2% - we don't have 128 symbols to play with, though. We could choose an additional 17 double-octet symbols, in which case we'd see growth of 20.5% overall. Slightly better than base64. So we'd encode each 7 bits using an alphabet of #x9 | #xA | #xD | [#x20-#x3D] | [#x3F-#x5C] | [#x5E-#x111], which would then be UTF-8 encoded, and be roughly 90% of the size of base64. However, I think you need to factor in the overhead that no encoder/decoder library exists for this, and each individual implementation would have to code one, (or wait for someone else to do so). Dave. 
-- Dave Cridland - mailto:[EMAIL PROTECTED] - xmpp:[EMAIL PROTECTED] - acap://acap.dave.cridland.net/byowner/user/dwd/bookmarks/ - http://dave.cridland.net/ Infotrope Polymer - ACAP, IMAP, ESMTP, and Lemonade
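The octet-to-character table discussed above can be written down in a few lines. This is a sketch under stated assumptions, not an agreed encoding: octets 0x00-0x1F (including tab, LF and CR, for simplicity) and the two octets 0x5D and 0x3E that could form "]]>" are shifted up by 0x100, and everything else maps to the code point of the same value. The names `encode` and `decode` are made up for the example.

```python
import os

SHIFTED = set(range(0x20)) | {0x3E, 0x5D}  # octets moved to U+0100 + value


def encode(data: bytes) -> bytes:
    """Map each octet to one code point, then serialise as UTF-8."""
    chars = [chr(b + 0x100) if b in SHIFTED else chr(b) for b in data]
    return "".join(chars).encode("utf-8")


def decode(wire: bytes) -> bytes:
    """Reverse the mapping: code points >= U+0100 were shifted octets."""
    return bytes(
        cp - 0x100 if cp >= 0x100 else cp
        for cp in map(ord, wire.decode("utf-8"))
    )


# The example octets from the thread: 00 01 02 5D 5D 3E
print(encode(bytes([0x00, 0x01, 0x02, 0x5D, 0x5D, 0x3E])).hex(" "))

# Rough expansion on random data, to compare with base64's 33%
blob = os.urandom(100_000)
print(f"growth: {100 * (len(encode(blob)) / len(blob) - 1):.1f}%")
```

With this particular table 162 of the 256 octet values become two-byte characters, so uniformly random input grows by about 63%; the exact figure depends on which octets are kept single-byte, which is why it differs a little from the 60.5% quoted above. Either way it is well above base64.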
Re: [Standards] Binary data over XMPP
On 06-11-2007, Tue at 14:56 +, Dave Cridland wrote:
> I'm not following something. So encode the octets #x00 #x01 #x02
> #x5D #x3E, and tell me what you get.

Like this:

Binary <-> Encoded
0x00 <-> 0xC4, 0x80
0x01 <-> 0xC4, 0x81
...
0x20 <-> 0x20
0x21 <-> 0x21
..
0x7F <-> 0x7F
0x80 <-> 0xC2, 0x80
..
0xFF <-> 0xC3, 0xBF

> I get three bytes that are not legal in a CDATA section, followed by
> a sequence of bytes which decode (via UTF-8) to "]]>", which in turn
> would end the CDATA section.

Good point. We either transfer this chunk in &...; escaping, or just transcode 0x3E or 0x5D bytes to 2byte UTF-8 character. (Maybe '>' to '»' :)

-- /\_./o__ Tomasz Sterna (/^/(_^^' Xiaoka.com ._.(_.)_ XMPP: [EMAIL PROTECTED]
Re: [Standards] Binary data over XMPP
On Tue Nov 6 14:46:32 2007, Tomasz Sterna wrote: Dnia 06-11-2007, Wt o godzinie 14:35 +, Dave Cridland pisze: > > Let's take first 256 allowable UTF-8 characters [...] > Can't do that, because many of those characters are going to be > illegal even in CDATA sections. First _allowable_ 256 UTF-8 characters are for sure legal in CDATA section. I'm not following something. So encode the octets #x00 #x01 #x02 #x5D #x5D #x3E, and tell me what you get. I get three bytes that are not legal in a CDATA section, followed by a sequence of bytes which decode (via UTF-8) to "]]>", which in turn would end the CDATA section. As far as I can tell, all those octet values would need to be further escaped. Dave. -- Dave Cridland - mailto:[EMAIL PROTECTED] - xmpp:[EMAIL PROTECTED] - acap://acap.dave.cridland.net/byowner/user/dwd/bookmarks/ - http://dave.cridland.net/ Infotrope Polymer - ACAP, IMAP, ESMTP, and Lemonade
Re: [Standards] Binary data over XMPP
> First _allowable_ 256 UTF-8 characters are for sure legal in CDATA section.

What about 0x0..0x1F? These chars are invalid in CDATA sections (and anywhere else in XML 1.0) except 0x9, 0xA and 0xD.
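Which low code points are usable can be checked against the Char production of XML 1.0 (#x9 | #xA | #xD | [#x20-#xD7FF] | [#xE000-#xFFFD] | [#x10000-#x10FFFF]). A minimal sketch; the function name is chosen for this example only:

```python
def is_xml10_char(cp: int) -> bool:
    """True if the code point matches the XML 1.0 Char production."""
    return (
        cp in (0x9, 0xA, 0xD)
        or 0x20 <= cp <= 0xD7FF
        or 0xE000 <= cp <= 0xFFFD
        or 0x10000 <= cp <= 0x10FFFF
    )


# Control characters below 0x20 that can never appear in an XML 1.0 document
illegal_low = [cp for cp in range(0x20) if not is_xml10_char(cp)]
print([hex(cp) for cp in illegal_low])
```

Note that being a legal Char is not enough inside a CDATA section: the sequence "]]>" still terminates it, as pointed out elsewhere in the thread.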
Re: [Standards] Binary data over XMPP
Michal 'vorner' Vaner wrote: > But a different question - is binary XML able to transfer binary data? > And is it possible to map normal XML <-> binary XML one to one? If so, > we could have a stream feature "use binary XML instead and transfer blob > elements not-base64-encoded" or something like that. If the server > needed to push it to a non-binary stream, it would have to base64 it (or > something like that). > > Does it make sense? (Just an crazy idea, I do not know, if it could be > of any use). > EXI seems to encode binary data as a length-prefixed blob, and I think all EXI files can be converted to normal XML, so that may work nicely. The downside of not doing framing or a separate connection is that a large image in a chat message will stall all subsequent messages until it is done. And how should a file transfer be handled if NAT boxes or firewalls prevent Jingle connections? Wouldn't it be a pragmatic solution to negotiate a framing protocol like BEEP (or maybe simpler) when opening the connection, or fall back to base 64 if one of the parties doesn't support that? -- Niklas
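To make the "frame everything" option concrete, here is a minimal sketch of a length-prefixed framing layer of the kind being discussed. It is not BEEP and not any XMPP extension; the header layout (one type octet plus a 4-octet big-endian length) and the names `XML`, `BLOB`, `encode_frame` and `decode_frames` are assumptions made up for illustration.

```python
import struct

# Hypothetical header: 1-octet frame type, 4-octet big-endian payload length
HEADER = struct.Struct("!BI")
XML, BLOB = 0, 1


def encode_frame(kind: int, payload: bytes) -> bytes:
    """Prefix the payload with its type and length."""
    return HEADER.pack(kind, len(payload)) + payload


def decode_frames(buffer: bytes):
    """Split a receive buffer into complete frames plus leftover octets."""
    frames = []
    offset = 0
    while len(buffer) - offset >= HEADER.size:
        kind, length = HEADER.unpack_from(buffer, offset)
        end = offset + HEADER.size + length
        if end > len(buffer):
            break  # frame not fully received yet; wait for more data
        frames.append((kind, buffer[offset + HEADER.size:end]))
        offset = end
    return frames, buffer[offset:]


wire = encode_frame(XML, b"<message to='a@example.org'/>")
wire += encode_frame(BLOB, b"\x00\x01<>]]>\xff")  # raw octets, no escaping
frames, rest = decode_frames(wire)
print(frames, rest)
```

Because the receiver counts octets instead of scanning for markup, a blob frame may contain '<', '>' or "]]>" freely, and each XML frame can be handed to an ordinary parser on its own. Interleaving small frames from a large transfer with chat stanzas, which is what would avoid the stalling problem mentioned above, would need additional rules that this sketch does not cover.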
Re: [Standards] Binary data over XMPP
On 06-11-2007, Tue at 14:35 +, Dave Cridland wrote: > > Let's take first 256 allowable UTF-8 characters [...] > Can't do that, because many of those characters are going to be > illegal even in CDATA sections. The first _allowable_ 256 UTF-8 characters are for sure legal in a CDATA section. > But bear in mind that even then, to encode a single octet will yield > between 1 and 3 characters. I would only use those UTF-8 characters that map to at most 2 bytes, leaving out the 3-byte and longer ones. And a better mapping: bytes that are valid UTF-8 characters are mapped 1 to 1. Only the invalid ones are mapped to 2-byte characters. This way, if the "binary" data is ASCII text, it stays human readable. This is a simple 256-row translation table that could be defined verbatim. -- /\_./o__ Tomasz Sterna (/^/(_^^' Xiaoka.com ._.(_.)_ XMPP: [EMAIL PROTECTED]
Re: [Standards] Binary data over XMPP
On Tue Nov 6 13:00:44 2007, Tomasz Sterna wrote: Dnia 05-11-2007, Pn o godzinie 16:23 +0100, Tomasz Sterna pisze: > Alternatively we could invent binary-2-utf mapping which has less > overhead than BASE64. Simplest that comes to mind: Let's take first 256 allowable UTF-8 characters and assign them to 256 values of a single byte. That would be less than 33% BASE64 overhead. Can't do that, because many of those characters are going to be illegal even in CDATA sections. You could take all those ones, though, and add 256 to the codepoint value before encoding - that would - I think - be sufficient. But bear in mind that even then, to encode a single octet will yield between 1 and 3 characters. Encoding essentially random data - which includes the output of any decent encryption algorithm - will encode half the octets using 2-byte characters, yielding - on average - a 50% inflation. That's higher than base64, of course. It's possible that a modified UTF-7 might be better. (And UTF-7, modified or not, is acceptable UTF-8). Dave. -- Dave Cridland - mailto:[EMAIL PROTECTED] - xmpp:[EMAIL PROTECTED] - acap://acap.dave.cridland.net/byowner/user/dwd/bookmarks/ - http://dave.cridland.net/ Infotrope Polymer - ACAP, IMAP, ESMTP, and Lemonade
Re: [Standards] Binary data over XMPP
Hello On Tue, Nov 06, 2007 at 02:00:44PM +0100, Tomasz Sterna wrote: > On 05-11-2007, Mon at 16:23 +0100, Tomasz Sterna wrote: > > Alternatively we could invent binary-2-utf mapping which has less > > overhead than BASE64. > > Simplest that comes to mind: > Let's take first 256 allowable UTF-8 characters and assign them to 256 > values of a single byte. > That would be less than 33% BASE64 overhead. > > But I'm sure one of the more knowledgeable in the UTF internals would > come up with better mapping. If you want to map every byte to a char (for simplicity), then you cannot come up with anything better, since the chars at the beginning are the shortest ones and their size grows with their position. But how would the transferred data sizes change if the stream was UTF-7? Most of it is namespaces, which contain only ASCII. Then you have base64 data, and most of the text transferred is usually ASCII too. This could be quite simple to add as a stream feature. -- Anyone who goes to a psychiatrist ought to have his head examined. -- Samuel Goldwyn Michal 'vorner' Vaner
Re: [Standards] Binary data over XMPP
On 05-11-2007, Mon at 16:23 +0100, Tomasz Sterna wrote: > Alternatively we could invent binary-2-utf mapping which has less > overhead than BASE64. The simplest that comes to mind: let's take the first 256 allowable UTF-8 characters and assign them to the 256 values of a single byte. That would be less than the 33% BASE64 overhead. But I'm sure someone more knowledgeable in the UTF internals will come up with a better mapping. -- /\_./o__ Tomasz Sterna (/^/(_^^' Xiaoka.com ._.(_.)_ XMPP: [EMAIL PROTECTED]
Re: [Standards] Binary data over XMPP
On Tue, Nov 06, 2007 at 10:16:19AM +, Dave Cridland wrote: [.snip.] > > You would no longer be able to do that with binary blobs; you > >would have to special-case blob stanzas fairly heavily, since I > >guarantee you that if the characters '<' or '>' appear un-escaped > >in the binary blob, Expat will choke and die. > > > > > Sure, but there's two options with an escaping mechanism - either > synchronized or non-synchronized - and they can be negotiated easily. > > With a non-synch mechanism, the sender just sends out the > element, then sends out the binary data, then continues with XML. It > can be done in a single TCP packet, but it requires that the receiver > processes the data into stanzas prior to processing through the XML > parser. Some receivers already do this, so it seems reasonable that > this can be an option. Also take into account that the sender has to customize the XML writer to allow writing raw octets, which breaks multiple layers of nice cosy XML abstractions. (Of course with current XMPP you already need an XML writer that allows some customisations). > With a synch mechanism, the sender sends out a element, and > then waits. The receiver then says it's ready for binary data > (sending a stanza to indicate this), and the sender then sends the > binary data - followed immediately by more XML as required, since a > "binary parser" is going to be octet counting anyway. For people who > parse all the network traffic at once through a SAX-like parser, this > should work fine, at the expense of some efficiency. > > Note that anyone can send non-synchronized blobs, but not everyone > can receive them, so a client (for instance) which is built to stream > network data directly into a SAX parser can still *send* blobs > efficiently. How do you propose the receiver determines the end of the binary data? Is it going to be prefixed by a length? Generally: pumping binary crap through XMPP is another big step _away_ from XML compatibility.
Also transforming a stream (TLS) into packets (stanzas) which are used to again emulate a stream (TLS) sounds crazy to me anyway :) To encrypt stanzas (packets) with TLS (stream encryption)... *shudder* :) Robin
Re: [Standards] Binary data over XMPP
> Ever tried to get FTP protocol through FW/NAT? > It requires protocol level command channel tracking, to find out related > data channels and let them in. > Special handling, special modules, special setup - ergo: nobody bothers. Well, as Dave has already pointed out, what you are talking about (PORT FTP) is completely different from what I suggested, in that with PORT it's the client opening the port and not the server. What I was suggesting was the server having an extra port open (or even the normal XMPP C2S port, with a special negotiation turning it into a framed binary connection), and the client just maintaining two connections to the server: one that carries the normal XMPP traffic and one that carries the binary frames. You could even use a single framed binary connection (rather than two) and have a special XMPP XML frame type to denote a frame containing XMPP stanzas. This is what I do in my server implementation, which supports framed as well as normal XMPP streams; among other things, I find it makes implementing a low-overhead keepalive/ping-pong protocol a whole lot easier. > This is one of the reasons why HTTP (one connection) is omnipresent, > even for file archives, and FTP is becoming forgotten. Sorry, but that is not the actual reason by any means, and there are plenty of FTP archives around. Have you never downloaded a Linux ISO? Most of the Linux ISO download servers I've come across have been FTP servers. But anyway, this is getting rather off topic. Richard
Re: [Standards] Binary data over XMPP
On Tue Nov 6 10:09:41 2007, Michal 'vorner' Vaner wrote: Because the FTP data channel (not to mention it offers passive transfer, too) is _inbound_. Well, PASV initiated connections are client->server, whereas PORT intiated are server->client callbacks. PORT is *almost* dead, now, as a result of the complexities of running an ALG in the firewall. If you opened not one TCP connection to the server, but two, one for XML and one for blobs, how it would be different from single TCP connection? Well, to state the obvious, it's not a *single* TCP connection. There's still a distinct increase in attack surface by trying to ensure that two connections are assuredly the same client. In addition, you've got to synchronize the blobs on one session with the XML on the other. I think this would get complicated fast. But a different question - is binary XML able to transfer binary data? And is it possible to map normal XML <-> binary XML one to one? If so, we could have a stream feature "use binary XML instead and transfer blob elements not-base64-encoded" or something like that. If the server needed to push it to a non-binary stream, it would have to base64 it (or something like that). Does it make sense? (Just an crazy idea, I do not know, if it could be of any use). I don't know the binary XML representations very well, but it's certainly something I'd be curious about. One thing of note, though - the bulk of XMPP traffic *now* is not binary. We want this to change - or at least, we want this to be able to change - without penalty. So a binary XML format would have to maintain near-equal efficiency when used for traditional XMPP traffic, and in addition be a simple upgrade for implementors. Dave -- Dave Cridland - mailto:[EMAIL PROTECTED] - xmpp:[EMAIL PROTECTED] - acap://acap.dave.cridland.net/byowner/user/dwd/bookmarks/ - http://dave.cridland.net/ Infotrope Polymer - ACAP, IMAP, ESMTP, and Lemonade
Re: [Standards] Binary data over XMPP
Forgive me for sounding like an idiot, but I seem to be missing the point here: On Mon Nov 5 17:45:53 2007, Rachel Blackman wrote: Expat (a fairly common XML parser out there) will do the job just fine. Your network engine has to separate each stanza out, sure, but that's not hard. And then you can pass each stanza unaltered through expat and get back your usual XML structures. Is this saying that given a string containing multiple stanzas, you need to seperate them out into one stanza per string, before feeding them in? I thought that with a SAX-like XML parser, you needn't bother doing that. You would no longer be able to do that with binary blobs; you would have to special-case blob stanzas fairly heavily, since I guarantee you that if the characters '<' or '>' appear un-escaped in the binary blob, Expat will choke and die. Sure, but there's two options with an escaping mechanism - either synchronized or non-synchronized - and they can be negotiated easily. With a non-synch mechanism, the sender just sends out the element, then sends out the binary data, then continues with XML. It can be done in a single TCP packet, but it requires that the receiver processes the data into stanzas prior to processing through the XML parser. Some receivers already do this, so it seems reasonable that this can be an option. With a synch mechanism, the sender sends out a element, and then waits. The receiver then says it's ready for binary data (sending a stanza to indicate this), and the sender then sends the binary data - followed immediately by more XML as required, since a "binary parser" is going to be octet counting anyway. For people who parse all the network traffic at once through a SAX-like parser, this should work fine, at the expense of some efficiency. Note that anyone can send non-synchronized blobs, but not everyone can receive them, so a client (for instance) which is built to stream network data directly into a SAX parser can still *send* blobs efficiently. 
If we really need a non-BASE64 method of sending binary data between clients, I suggest we re-use Jingle. That already is a mechanism for negotiation of 'I want to send you this type of data, how do I get it to you?' There's very few cases I can think of where we would want to be sending binary blobs in a server-cached manner anyway. Server-proxied, not cached. This implies that encrypted chat sessions don't go via the server, for example, meaning that a client intending to encrypt all conversations by default is going to use XMPP purely as a session initiation protocol, and lose all efficiency (and a degree of privacy) as a result. Or else it'll be base64 encoding the entire conversation, and lose efficiency that way. Either way, it will directly impact the usage of encryption - and that's ignoring the other ways that binary data is commonly used within XMPP. Dave. -- Dave Cridland - mailto:[EMAIL PROTECTED] - xmpp:[EMAIL PROTECTED] - acap://acap.dave.cridland.net/byowner/user/dwd/bookmarks/ - http://dave.cridland.net/ Infotrope Polymer - ACAP, IMAP, ESMTP, and Lemonade
Re: [Standards] Binary data over XMPP
Hello On Tue, Nov 06, 2007 at 10:44:15AM +0100, Tomasz Sterna wrote: > On 06-11-2007 (Tue) at 09:17 +, Richard Dobson wrote: > > > And repeat the FTP + stateful firewall nightmare? > > Sorry but what??? Can you explain exactly what you mean by this. > > Ever tried to get FTP protocol through FW/NAT? > It requires protocol level command channel tracking, to find out related > data channels and let them in. > Special handling, special modules, special setup - ergo: nobody bothers. That is because the FTP data channel (not to mention it offers passive transfer, too) is _inbound_. If you opened not one TCP connection to the server but two, one for XML and one for blobs, how would that be different from a single TCP connection? But a different question - is binary XML able to transfer binary data? And is it possible to map normal XML <-> binary XML one to one? If so, we could have a stream feature "use binary XML instead and transfer blob elements not-base64-encoded" or something like that. If the server needed to push it to a non-binary stream, it would have to base64 it (or something like that). Does it make sense? (Just a crazy idea; I do not know if it could be of any use.) -- Q: Why was Stonehenge abandoned? A: It wasn't IBM compatible. Michal 'vorner' Vaner
Re: [Standards] Binary data over XMPP
On 06-11-2007 (Tue) at 09:17 +, Richard Dobson wrote: > > And repeat the FTP + stateful firewall nightmare? > Sorry but what??? Can you explain exactly what you mean by this. Ever tried to get the FTP protocol through a FW/NAT? It requires protocol-level command channel tracking to find the related data channels and let them in. Special handling, special modules, special setup - ergo: nobody bothers. This is one of the reasons why HTTP (one connection) is omnipresent, even for file archives, and FTP is becoming forgotten. -- /\_./o__ Tomasz Sterna (/^/(_^^' Xiaoka.com ._.(_.)_ XMPP: [EMAIL PROTECTED]
Re: [Standards] Binary data over XMPP
Tomasz Sterna wrote: > On 05-11-2007 (Mon) at 16:24 +, Richard Dobson wrote: > > Personally I think it would be better to do as someone already suggested > > and have a separate connection for framed blobs that you maintain or > > establish when needed to send those > And repeat the FTP + stateful firewall nightmare? Sorry, but what? Can you explain exactly what you mean by this?
Re: [Standards] Binary data over XMPP
On 05-11-2007 (Mon) at 16:24 +, Richard Dobson wrote: > Personally I think it would be better to do as someone already > suggested > and have a separate connection for framed blobs that you maintain or > establish when needed to send those And repeat the FTP + stateful firewall nightmare? -- /\_./o__ Tomasz Sterna (/^/(_^^' Xiaoka.com ._.(_.)_ XMPP: [EMAIL PROTECTED]
Re: [Standards] Binary data over XMPP
On Mon, Nov 05, 2007 at 09:56:12AM -0800, Justin Karneges wrote: > On Monday 05 November 2007 3:40 am, Dave Cridland wrote: > > Now, we can't expect that the entire Internet will bend to our will > > and instantly upgrade, so we need a sane fallback - probably to IBB, > > or something fairly similar. The interesting question is whether we > > choose to have this negotiated end to end (which means we'll need to > > have each hop along the route tested), or whether we say that this > > down-conversion happens within servers. > > Binary over XMPP has been on my TODO for awhile now, and I have some notes > written up about it but nothing publicized. I think a hop-by-hop approach is > best, if we want to have any hope for compatibility. > > Comments on the two formatting approaches: > > 1) XML element to indicate binary mode: this is probably the least > destructive approach. Keep in mind that we already have an XML to binary > protocol change in XMPP: the TLS and SASL encryption layers. Your XML parser > needs to be able to stop on a dime when it sees that final '>' character, so > asking for that in this discussion should not be a big deal. Just keep in mind that we don't have a way to change "back". The current change is a very drastic one, like "flush the whole parser state and begin from start". Robin
Re: [Standards] Binary data over XMPP
On Mon, Nov 05, 2007 at 06:47:40PM +0100, Michal 'vorner' Vaner wrote: > Hello > > On Mon, Nov 05, 2007 at 06:27:33PM +0100, Robin Redeker wrote: > > On Mon, Nov 05, 2007 at 04:04:10PM +0100, Michal 'vorner' Vaner wrote: > > > > > > It is as much drastic approach as the blobs, it changes the protocol > > > from the very basic ground. Furthermore, you can extract the stanza and > > > feed it to any XML parser. > > > > +1 for "real" protocol frames! > > Actually, I was just showing, how deep change was the blobs thing. I'm > against changing the whole infrastructure inside out. I didn't mean to > propagate the frames, I just took them as example. Heh, ok. Just wanted to take the chance to promote the idea once more :) R
Re: [Standards] Binary data over XMPP
> You do not want to use 65 too much. If I skip the fact it is going to get > deprecated by jingle, probably, it is really heavy for small blobs, like an > icon or a funny image in a message. (Of course, it is the right way for 1GB > file you want to send). Jingle doesn't deprecate XEP-0065 as far as I am aware (just XEP-0095/96), given that Peter was working on writing a spec so you can use XEP-0065 proxies with Jingle. Also, if all you are using it for is small things like icons or little images, then I fail to see what real benefit this has over just using IBB, given the complexity it will take to implement framing in XMPP. If the blobs are small, the overall overhead is negligible compared to the effort we would all need to go through to implement framing.
Re: [Standards] Binary data over XMPP
On Monday 05 November 2007 3:40 am, Dave Cridland wrote: > Now, we can't expect that the entire Internet will bend to our will > and instantly upgrade, so we need a sane fallback - probably to IBB, > or something fairly similar. The interesting question is whether we > choose to have this negotiated end to end (which means we'll need to > have each hop along the route tested), or whether we say that this > down-conversion happens within servers. Binary over XMPP has been on my TODO for awhile now, and I have some notes written up about it but nothing publicized. I think a hop-by-hop approach is best, if we want to have any hope for compatibility. Comments on the two formatting approaches: 1) XML element to indicate binary mode: this is probably the least destructive approach. Keep in mind that we already have an XML to binary protocol change in XMPP: the TLS and SASL encryption layers. Your XML parser needs to be able to stop on a dime when it sees that final '>' character, so asking for that in this discussion should not be a big deal. 2) Framing mode: this is probably the most optimized approach, but then the protocol becomes very unlike XMPP, and yes it may be worth using BEEP then (although honestly I haven't read the BEEP RFC in awhile, it probably does more than we need). For framing, I came up with two approaches: "interleaved binary" and "stream multiplexing". Either way you have your TLV framing, and a very tight binding to what we're trying to accomplish. For the interleaved binary, there are two types: XML (0) and binary (1). :) Either packet type can contain arbitrary amounts of data. It would not be required for the XML type to contain a complete element, for example. The following two transmissions would be equivalent (whitespace added for clarity). C0: C0: C0: SGVsbG8gd29ybGQ= C0: C0: C0: C0: C1: Hello world C0: C0: The binary type could be converted to and from Base64 by any hop. 
Thus, it is important to consider with this protocol that you're not sending a random blob of binary, you're sending Base64'd CDATA just in a more optimized format. This simplifies integration into existing XMPP applications. Stanza input and output would look exactly as they do today (containing binary that is Base64 encoded). Only the transport layer would worry about converting back and forth. Indeed, this means that if binary data is received on the network, it would probably be Base64 encoded and plugged into the stanza as CDATA before passing upwards to the application (to then be decoded again :) ). The advantage of the interleaved approach is that anywhere there is Base64 we could do a binary transfer. So not just IBB, but a presence signature, a vcard avatar, etc. For the stream multiplexing approach, there would be a number of "channels". Channel 0 would be the XML stream, and would operate like normal. Channel 1 would be an IBB packet. This gives is a very tight binding to IBB, but that may be fine since that's the main way you'd want to transfer binary anyway. Typical IBB handshake: C0: C0: C0: S0: Client sets channel 1 to be used for this IBB stream: C0: Client sends some IBB packets: C1: Hello world C1: Data sent on this channel is not Base64 encoded Server replies also using a channel: S0: S1: You're right, and neither is this data! If the next hop does not support ibbbind, then you would transmit as a regular IBB packet. Yes, this means a server supporting ibbbind would have to know the IBB protocol (it would not be enough to expand the binary back into Base64 and send, it would truly have to reconstruct the ibb iq packet with the right sequence number, etc). However, this intimate binding would end up being very optimized. -Justin
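The interleaved-binary idea above can be sketched roughly as follows. This is only an illustration under stated assumptions: the frame layout (one type octet, then a 4-octet big-endian length), the function names, and the `<data>` element are all hypothetical and are not taken from any XEP or from Justin's notes. It shows how a hop whose next link lacks binary support could fold binary frames back into Base64 character data.

```python
import base64
import struct

FRAME_XML = 0     # payload is a fragment of the XML stream (assumed type code)
FRAME_BINARY = 1  # payload is raw octets standing in for Base64 CDATA

def encode_frame(frame_type: int, payload: bytes) -> bytes:
    """Prefix the payload with a 1-octet type and a 4-octet length."""
    return struct.pack("!BI", frame_type, len(payload)) + payload

def decode_frames(data: bytes):
    """Yield (type, payload) pairs from a buffer of complete frames."""
    offset = 0
    while offset < len(data):
        frame_type, length = struct.unpack_from("!BI", data, offset)
        offset += 5  # size of the type + length header
        yield frame_type, data[offset:offset + length]
        offset += length

def downconvert(data: bytes) -> bytes:
    """Rebuild a plain XML stream, Base64-encoding every binary frame."""
    out = []
    for frame_type, payload in decode_frames(data):
        if frame_type == FRAME_BINARY:
            payload = base64.b64encode(payload)
        out.append(payload)
    return b"".join(out)

wire = (encode_frame(FRAME_XML, b"<data>")
        + encode_frame(FRAME_BINARY, b"Hello world")
        + encode_frame(FRAME_XML, b"</data>"))
plain = downconvert(wire)
```

Here `plain` carries the same `SGVsbG8gd29ybGQ=` text as the all-XML transmission in the example above, which is the sense in which the two transmissions are equivalent.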
Re: [Standards] Binary data over XMPP
Hello On Mon, Nov 05, 2007 at 06:27:33PM +0100, Robin Redeker wrote: > On Mon, Nov 05, 2007 at 04:04:10PM +0100, Michal 'vorner' Vaner wrote: > > > > It is as much drastic approach as the blobs, it changes the protocol > > from the very basic ground. Furthermore, you can extract the stanza and > > feed it to any XML parser. > > +1 for "real" protocol frames! Actually, I was just showing, how deep change was the blobs thing. I'm against changing the whole infrastructure inside out. I didn't mean to propagate the frames, I just took them as example. -- Einstein argued that there must be simplified explanations of nature, because God is not capricious or arbitrary. No such faith comforts the software engineer. -- Fred Brooks Michal 'vorner' Vaner pgpvuuTU0R7sK.pgp Description: PGP signature
Re: [Standards] Binary data over XMPP
On Nov 5, 2007, at 6:35 AM, Tomasz Sterna wrote: Dnia 05-11-2007, Pn o godzinie 12:51 +0100, Michal 'vorner' Vaner pisze: You probably can not do that with any reasonably out-of-the-box XML parser. You cannot use out-of-the-box XML parser anyway. You can too. Expat (a fairly common XML parser out there) will do the job just fine. Your network engine has to separate each stanza out, sure, but that's not hard. And then you can pass each stanza unaltered through expat and get back your usual XML structures. You would no longer be able to do that with binary blobs; you would have to special-case blob stanzas fairly heavily, since I guarantee you that if the characters '<' or '>' appear un-escaped in the binary blob, Expat will choke and die. I'm reasonably sure the same could be said of most other off-the-shelf XML parsers. Sorry, but I'm with vorner on this one; the blob mechanism is neat, but too much of a departure from what we have to make it a smooth upgrade. Changing stanza types and so on is one thing, but changing the entire parser -- and requiring people to literally re-invent the wheel and roll their own XML parsers -- is much less likely to be a friendly upgrade, or receive any sort of wide adoption. If we really need a non-BASE64 method of sending binary data between clients, I suggest we re-use Jingle. That already is a mechanism for negotiation of 'I want to send you this type of data, how do I get it to you?' There's very few cases I can think of where we would want to be sending binary blobs in a server-cached manner anyway. -- Rachel Blackman <[EMAIL PROTECTED]> Trillian Messenger - http://www.trillianastra.com/
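The claim about Expat can be checked with a small sketch using Python's standard `xml.parsers.expat` binding. The `blob` element name and the sample payloads are assumptions made for illustration only; no such stanza is defined anywhere.

```python
import xml.parsers.expat

def parses_cleanly(chunk: bytes) -> bool:
    """Return True if Expat accepts the chunk as a complete document."""
    parser = xml.parsers.expat.ParserCreate()
    try:
        parser.Parse(chunk, True)  # True marks this as the final chunk
        return True
    except xml.parsers.expat.ExpatError:
        return False

# Base64 text inside the (hypothetical) element is ordinary character data.
escaped = b"<blob>SGVsbG8gd29ybGQ=</blob>"
# Raw octets, including NUL and unescaped '<' and '>', are not well-formed XML.
raw = b"<blob>\x00\x01<\xff></blob>"

escaped_ok = parses_cleanly(escaped)
raw_ok = parses_cleanly(raw)
```

In this sketch `escaped_ok` is true and `raw_ok` is false, so any raw blob has to be cut out of the octet stream before the parser sees it.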
Re: [Standards] Binary data over XMPP
On Mon, Nov 05, 2007 at 04:04:10PM +0100, Michal 'vorner' Vaner wrote: > Hello > > On Mon, Nov 05, 2007 at 02:45:05PM +, Dave Cridland wrote: > > Another option would be to setup a distinct connection (and protocol) for > > routing blobs, and so send them through the server, yet not in-band. I'm > > not comfortable with this, because it means essentially duplicating all > > security information, and maintaining synchronization between two distinct > > streams. > > Or make the connection blobs by default, and some blobs could contain > complete XML documents, like this: > length of first block > > length of second block > > length of third block > some binary data. > > It is as much drastic approach as the blobs, it changes the protocol > from the very basic ground. Furthermore, you can extract the stanza and > feed it to any XML parser. +1 for "real" protocol frames! R
Re: [Standards] Binary data over XMPP
Hello On Mon, Nov 05, 2007 at 04:24:17PM +, Richard Dobson wrote: >> I strongly suspect, given the way the discussion is going, that we either >> have to consider framing everything - and that's a huge break from XMPP - >> or else we need an escape mechanism that works. Or, of course, we decide >> to give up and frame using XML as now, and use base64 to cope. > Personally I think it would be better to do as someone already suggested > and have a separate connection for framed blobs that you maintain or > establish when needed to send those, sort of like XEP-0065, or why not just > use XEP-0065 itself??, and if the server you are using doesn't have a > XEP-0065 proxy then you can safely assume that the server administrators > don't want you sending lots of data through their server infrastructure. You do not want to use 65 too much. If I skip the fact it is going to get deprecated by jingle, probably, it is really heavy for small blobs, like an icon or a funny image in a message. (Of course, it is the right way for 1GB file you want to send). -- Einstein argued that there must be simplified explanations of nature, because God is not capricious or arbitrary. No such faith comforts the software engineer. -- Fred Brooks Michal 'vorner' Vaner pgp2CpNMCHfnO.pgp Description: PGP signature
Re: [Standards] Binary data over XMPP
I strongly suspect, given the way the discussion is going, that we either have to consider framing everything - and that's a huge break from XMPP - or else we need an escape mechanism that works. Or, of course, we decide to give up and frame using XML as now, and use base64 to cope. Personally I think it would be better to do as someone already suggested and have a separate connection for framed blobs that you maintain or establish when needed to send those, sort of like XEP-0065, or why not just use XEP-0065 itself??, and if the server you are using doesn't have a XEP-0065 proxy then you can safely assume that the server administrators don't want you sending lots of data through their server infrastructure. Richard
Re: [Standards] Binary data over XMPP
On Mon Nov 5 15:11:33 2007, Thomas Charron wrote: On 11/5/07, Michal 'vorner' Vaner <[EMAIL PROTECTED]> wrote: > Hello > On Mon, Nov 05, 2007 at 02:45:05PM +, Dave Cridland wrote: > > Another option would be to setup a distinct connection (and protocol) for > > routing blobs, and so send them through the server, yet not in-band. I'm > > not comfortable with this, because it means essentially duplicating all > > security information, and maintaining synchronization between two distinct > > streams. > Or make the connection blobs by default, and some blobs could contain > complete XML documents, like this: > lenght of first block > > length of second block > > length of third block > some binary data. > It is as much drastic approach as the blobs, it changes the protocol > from the very basic ground. Furthermore, you can extract the stanza and > feed it to any XML parser. Not to mention the documentation would be much easier. We could just refer to the BEEP standards instead of having to write our own. Of course, one could argue, just use BEEP at that point. Way ahead of you. See the first paragraph of the mail quoted above. :-) The essential principle is much the same, but I'm not advocating bringing the whole of BEEP into play, here. That has flow-control and all sorts, and supports the splitting of a message into multiple frames, which brings in a lot of complexity. This complexity is unwarranted, in my opinion, in the context of XMPP. The one thing we might want - and I stress might - is the framing of arbitrary data by framing everything. We've always relied, in XMPP, on the implicit framing that XML can give us, but that's not always the best option, as we've seen. Base64 doesn't - in my opinion - grant us sufficient efficiency in a number of circumstances. 
So we need something else, and our two options boil down to either framing everything - the BEEP method - or an escape mechanism which is used to frame non-XML data - we can call this the IMAP method, since it's pretty similar. I strongly suspect, given the way the discussion is going, that we either have to consider framing everything - and that's a huge break from XMPP - or else we need an escape mechanism that works. Or, of course, we decide to give up and frame using XML as now, and use base64 to cope. Dave. -- Dave Cridland - mailto:[EMAIL PROTECTED] - xmpp:[EMAIL PROTECTED] - acap://acap.dave.cridland.net/byowner/user/dwd/bookmarks/ - http://dave.cridland.net/ Infotrope Polymer - ACAP, IMAP, ESMTP, and Lemonade
Re: [Standards] Binary data over XMPP
On 05-11-2007 (Mon) at 16:36 +0100, Michal 'vorner' Vaner wrote: > Now I can use SAX. If I had to care about the blobs, I couldn't, or I > couldn't in an easy way. Now I see your point. Taken. > So you start splitting the data with one more run through them? That is > nasty Matter of taste. ;-) > and slow. I would guess that you would need a _very_ targeted benchmark to actually see a slowdown. Let's keep in mind that we're I/O bound, so processor time in our context is cheap. -- /\_./o__ Tomasz Sterna (/^/(_^^' Xiaoka.com ._.(_.)_ XMPP: [EMAIL PROTECTED]
Re: [Standards] Binary data over XMPP
Hello On Mon, Nov 05, 2007 at 04:21:21PM +0100, Tomasz Sterna wrote: > On 05-11-2007 (Mon) at 15:59 +0100, Michal 'vorner' Vaner wrote: > > > On 05-11-2007 (Mon) at 12:51 +0100, Michal 'vorner' Vaner wrote: > > > > You probably can not do that with any reasonably out-of-the-box XML > > > > parser. > > > > > > You cannot use out-of-the-box XML parser anyway. > > > You need a one that parses and returns every subelement > > > separately. > > > > Sax. > > So you can use out-of-the-box parser, or you cannot? > Please make up your mind. ;-) As things are now, I can use SAX. If I had to care about the blobs, I couldn't, or at least not in an easy way. > > > you stop feeding the data read from socket to parser, and fetch it > > > directly for routing. > > > > Unless you work like: > > Got something on network, read all or full buffer (lets say max 4kB), > > push it through utf-8->internal strings and take the whole lot and feed > > it to the parser. > > So you read until '>' is spotted, as Greg suggested. So you start splitting the data with one more run through it? That is nasty and slow. -- I left the ssh key under the doormat Michal 'vorner' Vaner
Re: [Standards] Binary data over XMPP
On 05-11-2007 (Mon) at 11:40 +, Dave Cridland wrote: > It seems to me that there's a number of cases where shipping binary > blobs over XMPP is useful, and we don't want to be resorting to > base64 every time. Alternatively, we could invent a binary-to-UTF mapping which has less overhead than BASE64. -- /\_./o__ Tomasz Sterna (/^/(_^^' Xiaoka.com ._.(_.)_ XMPP: [EMAIL PROTECTED]
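To put rough numbers on the overhead question, here is a small sketch comparing Base64 with Base85 from Python's standard `base64` module. Base85 is used purely as a stand-in for "some denser text mapping"; nobody in this thread proposed it, and the 100,000-octet payload is an arbitrary test size.

```python
import base64

# 100,000 octets of arbitrary test data; the content does not affect the sizes.
payload = bytes(range(250)) * 400

b64 = base64.b64encode(payload)  # 3 octets -> 4 characters
b85 = base64.b85encode(payload)  # 4 octets -> 5 characters

overhead_b64 = len(b64) / len(payload) - 1
overhead_b85 = len(b85) / len(payload) - 1
```

For this payload Base64 adds about 33% and Base85 adds 25%. One caveat: the alphabet used by `b85encode` includes '<', '>' and '&', so its output would still need XML escaping, which eats into the gain. A mapping designed for XMPP would have to pick its alphabet with that in mind.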
Re: [Standards] Binary data over XMPP
On 05-11-2007 (Mon) at 15:59 +0100, Michal 'vorner' Vaner wrote: > > On 05-11-2007 (Mon) at 12:51 +0100, Michal 'vorner' Vaner wrote: > > > You probably can not do that with any reasonably out-of-the-box XML > > > parser. > > > > You cannot use out-of-the-box XML parser anyway. > > You need a one that parses and returns every subelement > > separately. > > Sax. So can you use an out-of-the-box parser, or can you not? Please make up your mind. ;-) > > you stop feeding the data read from socket to parser, and fetch it > > directly for routing. > > Unless you work like: > Got something on network, read all or full buffer (lets say max 4kB), > push it through utf-8->internal strings and take the whole lot and feed > it to the parser. So you read until '>' is spotted, as Greg suggested. -- /\_./o__ Tomasz Sterna (/^/(_^^' Xiaoka.com ._.(_.)_ XMPP: [EMAIL PROTECTED]
Re: [Standards] Binary data over XMPP
On 11/5/07, Michal 'vorner' Vaner <[EMAIL PROTECTED]> wrote: > Hello > On Mon, Nov 05, 2007 at 02:45:05PM +, Dave Cridland wrote: > > Another option would be to setup a distinct connection (and protocol) for > > routing blobs, and so send them through the server, yet not in-band. I'm > > not comfortable with this, because it means essentially duplicating all > > security information, and maintaining synchronization between two distinct > > streams. > Or make the connection blobs by default, and some blobs could contain > complete XML documents, like this: > length of first block > > length of second block > > length of third block > some binary data. > It is as much drastic approach as the blobs, it changes the protocol > from the very basic ground. Furthermore, you can extract the stanza and > feed it to any XML parser. Not to mention the documentation would be much easier. We could just refer to the BEEP standards instead of having to write our own. Of course, one could argue, just use BEEP at that point. :-D -- -- Thomas
Re: [Standards] Binary data over XMPP
Hello On Mon, Nov 05, 2007 at 02:45:05PM +, Dave Cridland wrote: > Another option would be to setup a distinct connection (and protocol) for > routing blobs, and so send them through the server, yet not in-band. I'm > not comfortable with this, because it means essentially duplicating all > security information, and maintaining synchronization between two distinct > streams. Or make the connection blobs by default, and some blobs could contain complete XML documents, like this: length of first block length of second block length of third block some binary data. It is as drastic an approach as the blobs; it changes the protocol from the ground up. On the other hand, you can extract the stanza and feed it to any XML parser. -- When eating an elephant take one bite at a time. -- Gen. C. Abrams Michal 'vorner' Vaner
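A rough sketch of what "blobs by default" could look like on the receiving side, assuming a hypothetical 4-octet big-endian length prefix per block (the message above does not specify any encoding for the length). The reader copes with blocks arriving split across several network reads.

```python
import struct
from typing import List

def write_block(payload: bytes) -> bytes:
    """Prefix a block with its length as a 4-octet big-endian integer."""
    return struct.pack("!I", len(payload)) + payload

class BlockReader:
    """Accumulates network data and hands back complete blocks."""

    def __init__(self) -> None:
        self._buffer = bytearray()

    def feed(self, data: bytes) -> List[bytes]:
        self._buffer += data
        blocks = []
        while len(self._buffer) >= 4:
            (length,) = struct.unpack_from("!I", self._buffer, 0)
            if len(self._buffer) < 4 + length:
                break  # block not complete yet, wait for more data
            blocks.append(bytes(self._buffer[4:4 + length]))
            del self._buffer[:4 + length]
        return blocks

stream = write_block(b"<message/>") + write_block(b"\x00\x01<>")
reader = BlockReader()
first = reader.feed(stream[:7])    # only part of the first block has arrived
second = reader.feed(stream[7:])   # the rest of the data
```

After the second call, `second` holds one block that can go straight to any XML parser and one that is opaque binary. The sketch does not say which is which; a real design would need a type marker or a convention for that.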
Re: [Standards] Binary data over XMPP
Hello On Mon, Nov 05, 2007 at 03:35:19PM +0100, Tomasz Sterna wrote: > On 05-11-2007 (Mon) at 12:51 +0100, Michal 'vorner' Vaner wrote: > > You probably can not do that with any reasonably out-of-the-box XML > > parser. > > You cannot use out-of-the-box XML parser anyway. > You need a one that parses and returns every subelement > separately. SAX. > you stop feeding the data read from socket to parser, and fetch it > directly for routing. Unless you work like this: you get something on the network, read all of it or a full buffer (let's say max 4kB), push it through the UTF-8 -> internal strings conversion, and take the whole lot and feed it to the parser. Now you have a blob somewhere in the middle that you dragged through the codepage changer (destroying it, and potentially the rest of the data too) and pushed down the throat of the poor parser before it reported the blob start. > > Furthermore, you may need to pass the stream through charset > > decoder to get some internal stringish representation. > > What for? > Does your language-of-choice not have an effective binary blob > representation? But I want to feed my parser with strings. I cannot even feed it chars one by one, because I do not know when each UTF-8 char ends. -- This message has optimized support for formatting. Please choose green font and black background so it looks like it should. Michal 'vorner' Vaner
Re: [Standards] Binary data over XMPP
On Mon, 2007-11-05 at 15:35 +0100, Tomasz Sterna wrote: > You cannot use out-of-the-box XML parser anyway. SAX-model XML parsers still qualify as "out of the box." > So, once is extracted from the stream and reported, > you stop feeding the data read from socket to parser, and fetch it > directly for routing. By the time you have received an event reporting the blob element, you have potentially already fed it a chunk containing the and some of your binary data. (Unless you're handing it characters one by one, or being careful to never feed chunks which contain a > character except at the end. Either is inefficient and hackish.)
Re: [Standards] Binary data over XMPP
On Mon Nov 5 11:51:16 2007, Michal 'vorner' Vaner wrote:
> On Mon, Nov 05, 2007 at 11:40:18AM +, Dave Cridland wrote:
> > A new top-level stanza of (say) <blob/>, with much the same attributes as
> > any other routable stanza, but also has an octet count. Upon receipt, the
> > XML processing is suspended, and the following octets are handled verbatim:
> >
> > to='[EMAIL PROTECTED]/court'
> > octet-count='4'/>1234
>
> You probably cannot do that with any reasonably out-of-the-box XML parser.
> Furthermore, you may need to pass the stream through a charset decoder to
> get some internal stringish representation. This will make it mad.
>
> So, in short, I strongly disagree here.

An alternative would be to encapsulate both XML and blobs, which'd be an even more radical departure. (And look impressively like BEEP). So for each chunk, you'd predefine how long it was, and whether it was blob or XML. (Yes, there are defined formats for doing so, which have been mentioned on this list before).

Or - just a thought - we could pinch IMAP's synchronizing literals:

C:
S:
C: [4096 octets of blob]
C: [... more XML ...]

This adds a round-trip to all blob-stanza transfers, of course. (Although it's a hop-by-hop RTT, not an end-to-end RTT). No reason that couldn't be an option, too, so implementations which can cope with non-synchronizing blobs can say so. (I personally suspect many will be able to).

> But you may like SCTP, or whatever the protocol is called, and push the
> blobs out of the stream.

Yes, but I doubt that'd get much traction. SCTP stacks are rare enough, and especially so in those areas where the base64 encoding overhead of (say) IBB makes a serious difference. (Yes, you could encourage all XMPP clients to include an SCTP/UDP implementation, but that's a heavy requirement, I'd have thought).

> Or another "blobby" TCP connection to the server. (if you really want to
> send these things through the server).

Well, I think increasingly we need to send these things via the server. In fact, we're doing so quite a bit - the question is, do we care about the base64 overhead enough that we want to address this.

Another option would be to set up a distinct connection (and protocol) for routing blobs, and so send them through the server, yet not in-band. I'm not comfortable with this, because it means essentially duplicating all security information, and maintaining synchronization between two distinct streams.

Dave.
--
Dave Cridland - mailto:[EMAIL PROTECTED] - xmpp:[EMAIL PROTECTED]
 - acap://acap.dave.cridland.net/byowner/user/dwd/bookmarks/
 - http://dave.cridland.net/
Infotrope Polymer - ACAP, IMAP, ESMTP, and Lemonade
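The "encapsulate both XML and blobs" idea, where each chunk declares its length and kind up front, might look something like the Python sketch below. The framing (one kind byte plus a four-byte big-endian length) is purely an assumption made for illustration. It is not BEEP and not any format proposed on the list. The last two lines compare its per-blob cost with the base64 overhead under discussion.

```python
import base64
import struct

# Hypothetical framing, invented for this sketch: a 1-byte kind
# (b"X" = XML, b"B" = blob) and a 4-byte big-endian payload length.
HEADER = struct.Struct("!cI")

def encode_chunk(kind: bytes, payload: bytes) -> bytes:
    """Prefix a payload with its kind and length."""
    return HEADER.pack(kind, len(payload)) + payload

def decode_chunks(data: bytes):
    """Yield (kind, payload) pairs from a buffer of complete chunks."""
    offset = 0
    while offset < len(data):
        kind, length = HEADER.unpack_from(data, offset)
        offset += HEADER.size
        yield kind, data[offset:offset + length]
        offset += length

blob = bytes(range(256)) * 16                  # 4096 arbitrary octets
wire = encode_chunk(b"X", b"<message/>") + encode_chunk(b"B", blob)
chunks = list(decode_chunks(wire))

framing_overhead = HEADER.size                 # octets added per chunk
base64_overhead = len(base64.b64encode(blob)) - len(blob)
```

For a 4096-octet blob, base64 adds 1368 octets (about a third), while this framing adds 5. The price is that the receiver never hands blob chunks to the XML parser at all, which is the "radical departure" part.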
Re: [Standards] Binary data over XMPP
On Mon, 2007-11-05 at 12:51 +0100, Michal 'vorner' Vaner wrote:
> You probably can not do that with any reasonably out-of-the-box XML
> parser.

You cannot use an out-of-the-box XML parser anyway. You need one that parses and returns every subelement separately. So, once <blob/> is extracted from the stream and reported, you stop feeding the data read from the socket to the parser, and fetch it directly for routing.

> Furthermore, you may need to pass the stream trough charset
> decoder to get some internal stringish representation.

What for? Does your language of choice not have an effective binary blob representation?

--
  /\_./o__ Tomasz Sterna
 (/^/(_^^' Xiaoka.com
._.(_.)_   XMPP: [EMAIL PROTECTED]
Re: [Standards] Binary data over XMPP
Hello

On Mon, Nov 05, 2007 at 11:40:18AM +, Dave Cridland wrote:
> A new top-level stanza of (say) <blob/>, with much the same attributes as
> any other routable stanza, but also has an octet count. Upon receipt, the
> XML processing is suspended, and the following octets are handled verbatim:
>
> octet-count='4'/>1234

You probably cannot do that with any reasonably out-of-the-box XML parser. Furthermore, you may need to pass the stream through a charset decoder to get some internal stringish representation. This will make it mad.

So, in short, I strongly disagree here.

But you may like SCTP, or whatever the protocol is called, and push the blobs out of the stream. Or another "blobby" TCP connection to the server. (if you really want to send these things through the server).

--
"Don't worry about people stealing your ideas. If your ideas are any good, you'll have to ram them down people's throats." -- Howard Aiken

Michal 'vorner' Vaner
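The charset-decoder objection can be illustrated with a few lines of Python. This is only a sketch: it assumes the whole stream is passed through a strict incremental UTF-8 decoder before parsing, and the blob tag and the two raw octets are made-up sample data.

```python
import codecs

# A strict incremental decoder, as a stream reader might use before parsing.
decoder = codecs.getincrementaldecoder("utf-8")()

text = decoder.decode(b"<blob octet-count='2'/>")   # the tag itself is fine

try:
    decoder.decode(b"\xff\xfe")                     # raw octets after the tag
    blob_decoded = True
except UnicodeDecodeError:
    # 0xFF can never appear in well-formed UTF-8, so the decoder rejects it.
    blob_decoded = False
```

The tag decodes cleanly, but the octets that follow it do not, so the blob would have to be pulled off the wire before the decoder ever sees it.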
[Standards] Binary data over XMPP
It seems to me that there's a number of cases where shipping binary blobs over XMPP is useful, and we don't want to be resorting to base64 every time. I'm thinking, in particular, that this is needed for encrypted stanzas, images, and file transfers.

Is it worth our while to consider a single standardized mechanism for doing so?

There's a number of ways this might work, here's one as a basis for discussion:

A new top-level stanza of (say) <blob/>, with much the same attributes as any other routable stanza, but also has an octet count. Upon receipt, the XML processing is suspended, and the following octets are handled verbatim:

octet-count='4'/>1234

I'm using characters here instead of octets for clarity, but the "contents" of the blob element could contain NUL octets, non-UTF-8 data, etc. Note that I've chosen to express it as an empty element followed by the contents - this is primarily because I strongly suspect that this is simpler to process for many implementations, although it is distinctly un-XML-ish.

The above won't handle imagery, and other blobs that need referencing. There's two ways of tackling this - we either allow for blobs to be sent inlined with other elements (which I think would be difficult to handle), or else we define a new URI scheme - or reuse cid - and stick id and content-type attributes on <blob/>, so:

to='[EMAIL PROTECTED]/court'>
Yo, Shylock, here's a pound of flesh.
Yo, Shylock, here's a pound of flesh:
id='foo' octet-count='426' content-type='matter-transport/flesh'/>[426 octets of, presumably, image]

(See RFC1437 for the top-level MIME type used).

Alternately, we might prefer that the blobs are carried on demand in this instance.

Finally, we should probably consider blocking and flow-control - at this point, I'll either suggest we examine BEEP, or else we just reuse what we have in IBB.

Now, we can't expect that the entire Internet will bend to our will and instantly upgrade, so we need a sane fallback - probably to IBB, or something fairly similar. The interesting question is whether we choose to have this negotiated end to end (which means we'll need to have each hop along the route tested), or whether we say that this down-conversion happens within servers.

Dave.
--
Dave Cridland - mailto:[EMAIL PROTECTED] - xmpp:[EMAIL PROTECTED]
 - acap://acap.dave.cridland.net/byowner/user/dwd/bookmarks/
 - http://dave.cridland.net/
Infotrope Polymer - ACAP, IMAP, ESMTP, and Lemonade
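A receiver for the proposal above could be sketched roughly as follows, in Python. The exact tag layout is an assumption: the attribute names id, octet-count and content-type come from the message, but the function read_blob, the single-quoted attribute syntax, and the sample data are invented here. The sketch reads the empty element up to its closing "/>", then takes exactly octet-count octets verbatim.

```python
import io
import re

# Naive attribute matcher for name='value' pairs (assumed syntax).
ATTR = re.compile(rb"([\w-]+)='([^']*)'")

def read_blob(stream):
    """Read one hypothetical <blob .../> header plus its payload.

    `stream` is any binary file-like object. A real socket may return
    short reads, so production code would loop until `count` octets
    have arrived; io.BytesIO does not need that.
    """
    header = bytearray()
    while not header.endswith(b"/>"):
        byte = stream.read(1)
        if not byte:
            raise EOFError("stream ended inside the blob header")
        header += byte
    attrs = {k.decode(): v.decode() for k, v in ATTR.findall(bytes(header))}
    count = int(attrs["octet-count"])
    payload = stream.read(count)        # handled verbatim, never parsed
    if len(payload) != count:
        raise EOFError("stream ended inside the blob payload")
    return attrs, payload

stream = io.BytesIO(
    b"<blob id='foo' octet-count='4' content-type='matter-transport/flesh'/>"
    b"\x00\x01<>"       # four raw octets, including NUL and angle brackets
    b"<message/>"       # ordinary XML resumes here
)
attrs, payload = read_blob(stream)
rest = stream.read()
```

Reading the header one octet at a time keeps the sketch short. The objection raised elsewhere in the thread still applies: an off-the-shelf parser fed in larger chunks would already have consumed part of the payload by the time it reports the element.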