[jdev] Splitting the stream
Hello,

If I have an incoming XML stream and I want to split it, is it enough to count start and end tags, or could they be embedded in something like an attribute? (E.g., is <tag attr='<tag>'/> legal or not?)

And if I take these split pieces and insert each one between the stream header I got at the beginning and a corresponding stream end, can I parse each stanza separately with a DOM parser?

<stream:stream xmlns=...>
<message ...> . . . </message>
</stream:stream>

Can a stanza depend on previous stanzas, or does it depend only on the stream header (prefixes and so on)?

Thank you

--
Michal 'vorner' Vaner
Re: [jdev] Splitting the stream
Hi Michal!

Michal 'vorner' Vaner wrote:
> Do you think, if I have the XML stream (incoming one) and I want to split it, is it enough to count starting and ending tags, or they could be embedded in something like attribute? (eg. is <tag attr='<tag>'/> a legal thing or not?).

<tag attr='<tag>'/> is not legal ... but note that <tag attr='&lt;tag>'/> is legal ...

> And, if I have these split things and insert them between the stream header I got in the beginning and an corresponding stream end, can I parse it using DOM parser for each separate stanza?
> <stream:stream xmlns=...> <message ...> . . . </message> </stream:stream>

This representation (a single stanza wrapped inside a stream root element) is what I am currently using in my new changes to jadc2s.

> - can the thing depend or previous stanzas, or it depends only on the stream header? (with prefixes and so)

Some implementations (e.g. jabberd14, jabberd2) even store and process only the stanza, without the stream root element. So a stanza wrapped by the root element should be okay as well.

Tot kijk
Matthias

--
Matthias Wimmer Fon +49-700 77 00 77 70
Züricher Str. 243 Fax +49-89 95 89 91 56
81476 München http://ma.tthias.eu/
Re: [jdev] Splitting the stream
On 11/1/06, Michal 'vorner' Vaner [EMAIL PROTECTED] wrote:
> Do you think, if I have the XML stream (incoming one) and I want to split it, is it enough to count starting and ending tags,

Yes, this is the best way.

> or they could be embedded in something like attribute? (eg. is <tag attr='<tag>'/> a legal thing or not?).

It is illegal XML.

> And, if I have these split things and insert them between the stream header I got in the beginning and an corresponding stream end, can I parse it using DOM parser for each separate stanza?

Yes.

> <stream:stream xmlns=...> <message ...> . </message> </stream:stream>
> - can the thing depend or previous stanzas, or it depends only on the stream header? (with prefixes and so)

It depends only on the header, not on previous stanzas.

I have to ask why you're implementing yet another XMPP parsing library. Why not pick one of the existing ones and use that?

--
- Norman Rasmussen - Email: [EMAIL PROTECTED] - Home page: http://norman.rasmussen.co.za/
Re: [jdev] Splitting the stream
Michal 'vorner' Vaner wrote:
> And, if I have these split things and insert them between the stream header I got in the beginning and an corresponding stream end, can I parse it using DOM parser for each separate stanza?

But that is something I would not do. It requires you to parse the complete data twice. I'd just use a SAX parser for the complete stream, and generate DOM nodes in a DOM document from the SAX events.

E.g. you could have one document for the stream, built inside the SAX event handlers. When you notice that a stanza has been completed, you can call the stanza handler with the handle/pointer/... to that element in the document. After the handler has returned (i.e. the stanza has been processed), you can just remove that stanza's root element from the DOM document again.

Tot kijk
Matthias
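The SAX-then-DOM approach Matthias describes could look roughly like the sketch below. It assumes Python and its xml.parsers.expat module as the event-driven parser, and uses ElementTree elements in place of a full DOM. The names StanzaSplitter, feed and on_stanza are made up for illustration and do not come from any of the servers mentioned in the thread. Instead of removing the finished stanza from a stream-wide document, this version simply never attaches stanzas to the stream root.

```python
import xml.parsers.expat
import xml.etree.ElementTree as ET


class StanzaSplitter:
    """Feed raw stream text in arbitrary chunks; get one element per stanza."""

    def __init__(self, on_stanza):
        self.on_stanza = on_stanza   # callback invoked with each finished stanza
        self.stack = []              # open elements of the stanza being built
        self.depth = 0               # 0 = outside the stream root element
        self.parser = xml.parsers.expat.ParserCreate()
        self.parser.StartElementHandler = self._start
        self.parser.EndElementHandler = self._end
        self.parser.CharacterDataHandler = self._text

    def feed(self, data):
        # isfinal=False: more data may follow, expat buffers partial tags
        self.parser.Parse(data, False)

    def _start(self, name, attrs):
        self.depth += 1
        if self.depth == 1:
            return                   # the stream root itself is not a stanza
        elem = ET.Element(name, attrs)
        if self.stack:
            self.stack[-1].append(elem)
        self.stack.append(elem)

    def _end(self, name):
        self.depth -= 1
        if self.depth == 0:
            return                   # stream root closed
        elem = self.stack.pop()
        if self.depth == 1:          # a direct child of the root is complete
            self.on_stanza(elem)

    def _text(self, data):
        if not self.stack:
            return                   # whitespace between stanzas
        current = self.stack[-1]
        if len(current):             # text after a child goes into its tail
            last = current[-1]
            last.tail = (last.tail or "") + data
        else:
            current.text = (current.text or "") + data


if __name__ == "__main__":
    splitter = StanzaSplitter(lambda stanza: print(ET.tostring(stanza)))
    splitter.feed("<stream:stream xmlns='jabber:client' "
                  "xmlns:stream='http://etherx.jabber.org/streams'>")
    splitter.feed("<message to='a@example.org'><body>hi</bo")
    splitter.feed("dy></message>")
```

The data is parsed only once, and the handler is called as soon as the closing tag of a stanza arrives, even when a tag is split across two reads.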
Re: [jdev] XMPP Ping method?
Cross-posting to sjig:

On 1 Nov 2006, at 16:45, Peter Saint-Andre wrote:
> Scott Robinson wrote:
> > What is the proper method of performing a ping across a client XMPP connection. That is, from a server's perspective, if a client mysteriously and unexpectedly drops off the Internet, it won't know it until the TCP connection times out.
> Most servers use whitespace pings. It would be good for us to document that method in an XMPP extension.

Perhaps we could also revisit the ping/ack (J|X)EP, as the whitespace ping doesn't really address the issue as we might desire?

/K
--
Kevin Smith
Psi XMPP Client Project Leader (http://psi-im.org)
Re: [jdev] XMPP Ping method?
Isn't that a TCP problem, since it can happen to any protocol that is based on TCP?

On 11/1/06, Scott Robinson [EMAIL PROTECTED] wrote:
> What is the proper method of performing a ping across a client XMPP connection. That is, from a server's perspective, if a client mysteriously and unexpectedly drops off the Internet, it won't know it until the TCP connection times out.
> This obviously can lead to two inconvenient situations:
> 1. Dropped messages that could otherwise be stored.
> 2. A stale presence of availability.
> A quick Google search didn't result in anything useful, except a thread noting how it _shouldn't_ be done.
> Any ideas?
> --
> Scott Robinson [EMAIL PROTECTED] http://quadhome.com/
Re: [jdev] XMPP Ping method?
On Wed, Nov 01, 2006 at 06:07:39PM +0100, Tobias Markmann wrote:
> Isn't that a TCP problem since that can happen to any protocol which is based on TCP?

Well, it is partly an implementation problem. Many OSes (as I heard) are able to tell you how much was already delivered, and if you remember which part of the data was which stanza, you can resend it after reconnection. But that is a bit more work, of course, and a lot more data.

--
Michal 'vorner' Vaner
[jdev] Re: Splitting the stream
Matthias Wimmer wrote:
> Michal 'vorner' Vaner wrote:
> > And, if I have these split things and insert them between the stream header I got in the beginning and an corresponding stream end, can I parse it using DOM parser for each separate stanza?
> But that is something I would not do. It requires you to parse the complete data twice. I'd just use a SAX parser for the complete stream. And with the SAX-events you can generate DOM-Nodes in a DOM-Document. E.g. you could have one document for the stream, generated inside the SAX-Events. When you notice, that a stanza has been completed, you can call the handler for the stanza with the handle/pointer/... to that element in the document. After the handler returned/the stanza has been processed, you can just remove that stanza root element from the DOM document again.

I agree with Matthias. The technique you should use also depends on the programming language and the XML libraries you use. In my experience, SAX-like techniques work very well for XMPP streams. That is: parse the stream with SAX and build your DOM or stanzas from the SAX callbacks.

As Matthias already mentioned, some developers see the complete stream (including the header) as one XML document; others use the header only to grab the correct namespace and see each stanza as a new DOM that is not in the context of the stream header. All techniques have their pros and cons, which also depend strongly on your programming language and XML libraries.

It also depends on your project and requirements. It makes a big difference whether you need a light and really fast implementation for an embedded device, or whether you don't care about speed but want a complete, namespace-correct and validating implementation.

Alex
Re: [jdev] XMPP Ping method?
Justin Karneges wrote:
> On Tuesday 31 October 2006 4:49 pm, Scott Robinson wrote:
> > What is the proper method of performing a ping across a client XMPP connection.
> There isn't. There was a JEP/XEP proposal to add Acking and Ping features to the protocol, but it wasn't accepted as a XEP (that is, it died before it even got a XEP number). The JSF Council recommendation was to take care of these features in the XMPP-Core RFC instead of a XEP.

But the latest discussions about this (a month or two ago?) were that this is most appropriate as an XMPP extension (XEP), so please send me the latest version and we'll get that on the docket for the next XMPP Council meeting.

> This seems unlikely though, as I think the RFC revision has already been made, and there was no consideration for these features.

It is not accurate to say that the RFC revision has already been made. I published -00 versions of rfc3920bis and rfc3921bis, but that is the start of discussion, not the end.

Peter

--
Peter Saint-Andre
Jabber Software Foundation
http://www.jabber.org/people/stpeter.shtml
Re: [jdev] Re: Splitting the stream
Hello,

On Wed, Nov 01, 2006 at 06:12:15PM +0100, Alexander Gnauck wrote:
> Matthias Wimmer wrote:
> > Michal 'vorner' Vaner wrote:
> > > And, if I have these split things and insert them between the stream header I got in the beginning and an corresponding stream end, can I parse it using DOM parser for each separate stanza?
> > But that is something I would not do. It requires you to parse the complete data twice. I'd just use a SAX parser for the complete stream. And with the SAX-events you can generate DOM-Nodes in a DOM-Document. E.g. you could have one document for the stream, generated inside the SAX-Events. When you notice, that a stanza has been completed, you can call the handler for the stanza with the handle/pointer/... to that element in the document. After the handler returned/the stanza has been processed, you can just remove that stanza root element from the DOM document again.
> i agree with Matthias. The technique you should use also depends on the programming language and the XML libraries you use. From my experience SAX like techniques works very well for XMPP streams. == Parsing the stream with Sax and build your Dom or stanzas from the sax callbacks.

Well, I just think I do not need to _parse_ it if I'm not interested in the information there. I only want to split it into parts and feed those to a different program.

> As Matthias already mentioned some developers see the complete stream (including the header) as one XML Document, others use the header only to grab the correct namespace and see each stanza as a new DOM which is not in the context of the stream header. All techniques have their pros and cons which also strongly depends on your programming language and XML libraries. It also depends on your project and requirements. It's a big difference if you need a light and really fast implementation for a embedded device, or if you don't care about speed but want a complete namespace correct and validating implementation.

I need to build it as many small independent programs, which I think nobody has done yet. It is just an experiment: maybe it won't work at all, maybe something very flexible will come of it. I just do not know.

Thank you for your comments.

--
Please stay calm. There is no use both of us being hysterical.
Michal 'vorner' Vaner
Re: [jdev] XMPP Ping method?
On Wednesday 01 November 2006 9:17 am, Peter Saint-Andre wrote: But the latest discussions about this (a month or two ago?) were that this is most appropriate as an XMPP extension (XEP), so please send me the latest version and we'll get that on the docket for the next XMPP Council meeting. http://www.xmpp.org/extensions/inbox/ack.html -Justin
RE: [jdev] XMPP Ping method?
From the very first day I wrote a Jabber packet, I was looking for a Ping command. I would still like to see one.

I usually end up using IQ:Time or IQ:Version as a ping. Almost everything supports this: clients, servers, bots, etc.

I would still love to see IQ:Ping.

--
Chris Mullins

-----Original Message-----
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Peter Saint-Andre
Sent: Wednesday, November 01, 2006 8:45 AM
To: Jabber software development list
Subject: Re: [jdev] XMPP Ping method?

Scott Robinson wrote:
> What is the proper method of performing a ping across a client XMPP connection. That is, from a server's perspective, if a client mysteriously and unexpectedly drops off the Internet, it won't know it until the TCP connection times out.

Most servers use whitespace pings. It would be good for us to document that method in an XMPP extension.

Peter

--
Peter Saint-Andre
Jabber Software Foundation
http://www.jabber.org/people/stpeter.shtml
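Chris's workaround of using IQ:Version as a ping can be sketched as follows, assuming Python and the standard ElementTree module. The helper names make_version_ping and is_reply are hypothetical. The namespace jabber:iq:version is the usual software-version query; the point is only that any reply to the request, whether a result or an error, shows that the other side is still there.

```python
import itertools
import xml.etree.ElementTree as ET

_ids = itertools.count(1)   # simple per-process id generator


def make_version_ping(to):
    """Return (id, xml) for a jabber:iq:version request addressed to `to`."""
    ping_id = "ping%d" % next(_ids)
    iq = ET.Element("iq", {"type": "get", "to": to, "id": ping_id})
    ET.SubElement(iq, "query", {"xmlns": "jabber:iq:version"})
    return ping_id, ET.tostring(iq, encoding="unicode")


def is_reply(stanza_xml, ping_id):
    """True if the stanza is an iq result or error answering our request."""
    iq = ET.fromstring(stanza_xml)
    local_name = iq.tag.rsplit("}", 1)[-1]   # ignore a namespace, if present
    return (local_name == "iq"
            and iq.get("id") == ping_id
            and iq.get("type") in ("result", "error"))


if __name__ == "__main__":
    ping_id, xml = make_version_ping("example.org")
    print(xml)
```

A caller would send the returned XML, start a timer, and treat the connection as dead if no stanza satisfying is_reply arrives before the timer fires.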
RE: [jdev] XMPP Ping method?
That's quite a bit more complicated than I envisioned.

I was just thinking IQ:Ping, and leave it there. For all of the use cases I've come across, this would be sufficient.

--
Chris Mullins

-----Original Message-----
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Justin Karneges
Sent: Wednesday, November 01, 2006 9:27 AM
To: Jabber software development list
Subject: Re: [jdev] XMPP Ping method?

On Wednesday 01 November 2006 9:17 am, Peter Saint-Andre wrote:
> But the latest discussions about this (a month or two ago?) were that this is most appropriate as an XMPP extension (XEP), so please send me the latest version and we'll get that on the docket for the next XMPP Council meeting.

http://www.xmpp.org/extensions/inbox/ack.html

-Justin
[jdev] Re: Splitting the stream
Michal 'vorner' Vaner wrote:
> Well, I just think I do not need to _parse_ it if I'm not interested in the information there. I only want to split it to parts and feed that to different program.

What is parsing for you? Splitting is parsing for me. A SAX or pull parser, or an XML tokenizer, splits your stream and does not care about the content.

Alex
[jdev] Re: XMPP Ping method?
Michal 'vorner' Vaner [EMAIL PROTECTED] writes:
> Well, it is partly implementation problem, many OSes (as I heard) are able to tell you how much was already delivered and if you remember what part of data was what stanza, you can resend it after reconnection. But that is bit more work, of course, and a lot more data.

I'm in no way an expert in network programming, so what I'm about to write might qualify as disinformation; please write corrections or completions.

We want to know if the remote side has ACKed receipt of all the bytes in a stanza. In Linux, there is a way to get the size of the TCP send queue:

http://mail.jabber.org/pipermail/standards-jig/2003-December/004570.html

But it seems to me that using that method would be cumbersome:

1. Send stanza A to connection. Save copy of A and size(A).
2. Prepare to send stanza B. If send queue is 0, forget A and goto 1. Else save B and size(B), and increase size(A) with size(B).
3. If send queue size is less than (the modified) size(A), consider A to be acked. Likewise for B.
4. If connection fails, queue or bounce all stanzas sent but not acked.

Or something like that. I probably got it wrong somewhere, and I would probably make more errors if I tried to convert that into code.

So it would be nice if sending a piece of data returned the sequence number of the last byte sent. Then you could just compare it to the sequence number of the last byte ACKed, and then you immediately know if the stanza was received. Thus, we should try to convince makers of socket APIs to include functions to do just that. Or did I miss anything?

--
Magnus
JID: [EMAIL PROTECTED]
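The bookkeeping in Magnus's steps can be written down a little more simply by tracking cumulative byte offsets rather than adjusting sizes. The sketch below is in Python and is only the accounting part: the class name AckTracker and its methods are hypothetical, and the current send-queue size is passed in as a plain number, on the assumption that the platform can report how many written bytes are not yet acknowledged (how to query that is platform-specific and not shown).

```python
from collections import deque


class AckTracker:
    """Map a 'bytes still in the send queue' figure back to whole stanzas."""

    def __init__(self):
        self.total_written = 0      # bytes handed to the socket so far
        self.pending = deque()      # (end_offset, stanza), oldest first

    def sent(self, stanza):
        """Record a stanza (bytes) that was just written to the connection."""
        self.total_written += len(stanza)
        self.pending.append((self.total_written, stanza))

    def update(self, queue_size):
        """Given the current send-queue size, return newly acked stanzas."""
        acked_up_to = self.total_written - queue_size
        done = []
        # a stanza is acked once its last byte has left the queue
        while self.pending and self.pending[0][0] <= acked_up_to:
            done.append(self.pending.popleft()[1])
        return done

    def unacked(self):
        """Stanzas to queue or bounce if the connection fails now."""
        return [stanza for _, stanza in self.pending]


if __name__ == "__main__":
    tracker = AckTracker()
    tracker.sent(b"<presence/>")
    tracker.sent(b"<message to='a@example.org'/>")
    print(tracker.update(queue_size=5))
    print(tracker.unacked())
```

As Dave points out later in the thread, a TCP-level ACK only says the bytes reached the peer's TCP stack, not that the stanza was accepted, so this is at best a hint.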
Re: [jdev] XMPP Ping method?
On 11/1/06, Scott Robinson [EMAIL PROTECTED] wrote:
> What is the proper method of performing a ping across a client XMPP connection. That is, from a server's perspective, if a client mysteriously and unexpectedly drops off the Internet, it won't know it until the TCP connection times out.

Isn't TCP enough to handle the situation? Why do you need to implement it at the application level?

--
smk
Re: [jdev] XMPP Ping method?
On Wednesday 01 November 2006 20:08, Tomasz Sterna wrote:
> On 11/1/06, Scott Robinson [EMAIL PROTECTED] wrote:
> > What is the proper method of performing a ping across a client XMPP connection. That is, from a server's perspective, if a client mysteriously and unexpectedly drops off the Internet, it won't know it until the TCP connection times out.
> Isn't TCP enough to handle the situation? Why do you need to implement it on application level?

If two machines are connected over a TCP/IP connection and there are one or more firewalls in between, try powering off one of the machines. The other side will not notice the connection drop. The default TCP/IP timeout is 300 sec.

This kind of power-off disconnection happens surprisingly often. It is especially a problem for network servers handling many simultaneous connections: such clients reconnect within a few seconds, so the server ends up with many dead connections...

Artur

--
Artur Hefczyc
http://www.tigase.org/
http://wttools.sf.net/
[jdev] Re: XMPP Ping method?
Tomasz Sterna wrote:
> On 11/1/06, Scott Robinson [EMAIL PROTECTED] wrote:
> > What is the proper method of performing a ping across a client XMPP connection. That is, from a server's perspective, if a client mysteriously and unexpectedly drops off the Internet, it won't know it until the TCP connection times out.
> Isn't TCP enough to handle the situation? Why do you need to implement it on application level?

For some devices it's not. If you work with wireless devices (WLAN, GSM, UMTS) you will see lots of strange behavior.

Alex
Re: [jdev] XMPP Ping method?
On Wed Nov 1 17:07:11 2006, Michal 'vorner' Vaner wrote:
> On Wed, Nov 01, 2006 at 06:07:39PM +0100, Tobias Markmann wrote:
> > Isn't that a TCP problem since that can happen to any protocol which is based on TCP?
> Well, it is partly implementation problem, many OSes (as I heard) are able to tell you how much was already delivered and if you remember what part of data was what stanza, you can resend it after reconnection. But that is bit more work, of course, and a lot more data.

No OS can tell you what's been delivered, but some might be able to tell you what hasn't been sent and what hasn't been acknowledged. I looked for how to do this on Linux, which usually provides the richest API to the network layer, but I couldn't find anything to tell me either.

But this isn't quite the same thing anyway. You want to know which stanzas have been accepted: what happens if the ACKs get lost, or the server dies?

Consider ESMTP, which has data-level acknowledgement. There's a long-known problem whereby after DATA (and these days, BDAT and BURL), there's a chance that you'll lose the connection before you get the 2xx acknowledgement from the server. This is on the increase again, partly due to the preference for protocol-level rejections instead of DSNs, and partly due to the marked increase in usage of ESMTP over things like GPRS.

It's important to note that this is specifically about hop-by-hop, not end-to-end, which are different problems entirely. Finding out if the guy you're talking to is still connected is quite easy: just send an IQ (in principle *any* IQ), and you'll see.

Hop-by-hop tests are quite easy, too, but there's a gotcha: when they fail, you want to know which stanzas you need to resend. XMPP does not provide any mechanism for that, and neither do pings.

My last suggestion, adding a sequence attribute to stanzas, didn't seem to impress most people, partly because it requires servers to rewrite stanzas between hops.

If instead the sender appends a distinct stanza (which could be an iq, or could be something else) to every TCP segment sent, and that stanza itself contains a sequence, then it can be used as the restart token with almost precisely the same effect, and requires no rewriting of stanzas.

So, the sender appends, for instance,

<iq type='set' id='ping123'><ping xmlns='urn:xmpp:ping' sent='1' recv='47'/></iq>

to each send() call's payload, and the receiver can then note this simply, and respond with an iq reply when it suits it, which also contains sent and recv sequence counts.

Loosely, you'd add that to the end of each TCP packet; in practice, about every 1.5k or at the end of each send() should be quite safe.

Dave.

--
Dave Cridland - mailto:[EMAIL PROTECTED] - xmpp:[EMAIL PROTECTED]
 - acap://acap.dave.cridland.net/byowner/user/dwd/bookmarks/
 - http://dave.cridland.net/
Infotrope Polymer - ACAP, IMAP, ESMTP, and Lemonade
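A rough sketch of the bookkeeping behind Dave's marker stanza, in Python. The thread does not define the exact meaning of the sent and recv attributes, so this assumes that sent is the number of stanzas this side has sent so far, that recv is the number of stanzas it has received from the peer, and that the marker itself is not counted. The class SeqSession and its method names are hypothetical.

```python
class SeqSession:
    """Per-connection counters for the marker-stanza idea (assumed semantics)."""

    MARKER = ("<iq type='set' id='ping%d'>"
              "<ping xmlns='urn:xmpp:ping' sent='%d' recv='%d'/></iq>")

    def __init__(self):
        self.sent = 0            # stanzas we have written so far
        self.recv = 0            # stanzas we have received from the peer
        self.unconfirmed = []    # (sequence number, stanza) not yet confirmed
        self._marker_id = 0

    def wrap(self, stanzas):
        """Build the payload for one send(): the stanzas plus a marker."""
        for stanza in stanzas:
            self.sent += 1
            self.unconfirmed.append((self.sent, stanza))
        self._marker_id += 1
        marker = self.MARKER % (self._marker_id, self.sent, self.recv)
        return "".join(stanzas) + marker

    def stanza_received(self):
        """Call once for every ordinary stanza read from the peer."""
        self.recv += 1

    def peer_marker(self, peer_recv):
        """The peer reports it has received `peer_recv` of our stanzas."""
        self.unconfirmed = [(n, s) for n, s in self.unconfirmed if n > peer_recv]

    def to_resend(self):
        """Stanzas to send again after a reconnect."""
        return [stanza for _, stanza in self.unconfirmed]


if __name__ == "__main__":
    session = SeqSession()
    print(session.wrap(["<message id='a'/>", "<message id='b'/>"]))
    session.peer_marker(1)
    print(session.to_resend())
```

Because the counters live in a separate stanza, the forwarded stanzas themselves are never rewritten, which was the objection to the earlier sequence-attribute proposal.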
Re: [jdev] Re: Splitting the stream
Michal 'vorner' Vaner wrote:
> Well, I just think I do not need to _parse_ it if I'm not interested in the information there. I only want to split it to parts and feed that to different program.

As Alexander already said: I think what you plan to do (you wrote in a private mail that you plan to use regular expressions) is parsing as well.

I am not yet sure how you plan to write your regular expressions, but it should not be possible to write a regular expression that matches exactly the valid stanzas. With regular expressions you can only express regular languages (Chomsky type 3), and with the pumping lemma for regular languages you can show that the set of all valid stanzas is not a regular language.

> I need to make it many small independent programs. Which I think nobody yet did. It is just an experiment, maybe it wont work at all, it is possible something very flexible may become of it, I just do not know.

What do you plan to do if the component splits off an invalid stanza and forwards it to the small program? I think that might cause you problems as well. You must not just expect that everything you get from a server is valid. This might make you more vulnerable to attacks.

The overhead of using a proven XML parser should not be very large, and you can be sure that other people have already put their thinking into handling all the special cases that can occur in XML documents. E.g. with a naive approach to finding the start and end tags you might get confused if you receive something like the following: message from='[EMAIL PROTECTED]' to='[EMAIL PROTECTED]' id='123/'... and others.

How do you plan to route the stanzas to the small programs? I guess you will need some information out of the stanzas for that as well, no? If you parsed the stream with an XML parser and generated a DOM or DOM-like document, your programs could then use XPath expressions to subscribe to the stanzas they are interested in, which might be very handy as well.

Tot kijk
Matthias
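To illustrate the kind of special case Matthias has in mind, here is a small Python comparison between a deliberately naive, pattern-based tag counter and a real parser (xml.parsers.expat). The stanza used is a made-up example, not the one from the mail: it carries the characters "/>" inside an attribute value, which is legal XML. Both function names are hypothetical.

```python
import re
import xml.parsers.expat


def naive_depth(chunk):
    """Count open minus closed tags by pattern matching alone (naive on purpose)."""
    depth = 0
    for tag in re.findall(r"<[^>]*>", chunk):
        if tag.startswith("</"):
            depth -= 1
        elif not tag.endswith("/>"):   # anything ending in "/>" looks empty
            depth += 1
    return depth


def parser_depth(chunk):
    """Same count, but driven by the start/end events of a real XML parser."""
    depth = 0

    def start(name, attrs):
        nonlocal depth
        depth += 1

    def end(name):
        nonlocal depth
        depth -= 1

    parser = xml.parsers.expat.ParserCreate()
    parser.StartElementHandler = start
    parser.EndElementHandler = end
    parser.Parse(chunk, True)
    return depth


# Hypothetical stanza: "/>" appears inside the id attribute value.
STANZA = "<message to='a@example.org' id='123/>'><body>hi</body></message>"

if __name__ == "__main__":
    print("naive :", naive_depth(STANZA))
    print("parser:", parser_depth(STANZA))
```

For a complete, well-formed stanza the count should come back to zero. The parser gets there; the pattern-based counter treats the opening message tag as an empty element and ends up out of step.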
Re: [jdev] Re: XMPP Ping method?
On 11/1/06, Artur Hefczyc [EMAIL PROTECTED] wrote:
> If 2 machines are connected over TCP/IP connection and there is one or more firewalls in between, try to power-off one of the machines. Another side will not notice connection drop.

Why not? TCP handles a disappearing end well.

> Default TCP/IP timeout is 300sec.

Is there a MUST to use the defaults?

On 11/1/06, Alexander Gnauck [EMAIL PROTECTED] wrote:
> for some devices it's not. If you work with wireless devices (WLAN, GSM, UMTS) you will see lot's of strange behavior.

Isn't that a broken TCP implementation then?

--
smk
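On Tomasz's point that nothing forces the defaults: one knob that exists for this is TCP keepalive, sketched below in Python. This assumes the platform exposes the per-socket options TCP_KEEPIDLE, TCP_KEEPINTVL and TCP_KEEPCNT (they are not available everywhere, so the code checks for each one). The function name enable_keepalive and the numeric defaults are illustrative, not a recommendation from the thread, and keepalive does not help with the by-design wireless cases Alexander mentions.

```python
import socket


def enable_keepalive(sock, idle=60, interval=10, probes=5):
    """Turn on TCP keepalive and, where supported, tighten its timing.

    idle:     seconds of silence before the first probe
    interval: seconds between probes
    probes:   unanswered probes before the connection is declared dead
    Returns a dict of the options that were actually applied.
    """
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE, 1)
    applied = {"SO_KEEPALIVE": 1}
    wanted = (("TCP_KEEPIDLE", idle),
              ("TCP_KEEPINTVL", interval),
              ("TCP_KEEPCNT", probes))
    for name, value in wanted:
        option = getattr(socket, name, None)   # missing on some platforms
        if option is None:
            continue
        try:
            sock.setsockopt(socket.IPPROTO_TCP, option, value)
            applied[name] = value
        except OSError:
            pass                               # known name, but refused here
    return applied


if __name__ == "__main__":
    s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    print(enable_keepalive(s))
    s.close()
```

With values like these, a peer that was powered off behind a firewall would be noticed after roughly idle + interval * probes seconds on platforms that honor all three options, instead of whatever the system default is.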
[jdev] Re: XMPP Ping method?
Tomasz Sterna wrote:
> > for some devices it's not. If you work with wireless devices (WLAN, GSM, UMTS) you will see lot's of strange behavior.
> Isn't that broken TCP implementation then?

Sometimes it is, but sometimes it's by design. I think we have to address these issues and can't just say the TCP implementation is broken. In these cases it's sometimes better to use HTTP-Polling and HTTP-Binding.

Alex
Re: [jdev] Re: Splitting the stream
On 11/1/06, Matthias Wimmer [EMAIL PROTECTED] wrote:
> As Alexander already said: I think what you plan to do (you wrote in a private mail, that you plan to use regular expressions) is parsing as well.

Yegh, try to steer clear of RegEx for parsing XML. A standard SAX parser should be more than sufficient, and it's probably the least memory-intensive and fastest parser for what you want to do.

--
- Norman Rasmussen - Email: [EMAIL PROTECTED] - Home page: http://norman.rasmussen.co.za/