[jdev] Splitting the stream

2006-11-01 Thread Michal 'vorner' Vaner
Hello,
If I have an incoming XML stream and I want to split it, is it enough
to count start and end tags, or could they be embedded in something
like an attribute? (E.g. is <tag attr='<tag>'/> a legal thing or not?)

And if I take these split pieces and insert each one between the stream
header I got at the beginning and a corresponding stream end, can I
parse each separate stanza with a DOM parser?

<stream:stream xmlns=...>
  <message ...>
.
.
.
  </message>
</stream:stream>

- Can a stanza depend on previous stanzas, or does it depend only on the
stream header (prefixes and so on)?

Thank you

-- 
This email has not been checked by an antivirus system.
No virus found.

Michal vorner Vaner




Re: [jdev] Splitting the stream

2006-11-01 Thread Matthias Wimmer
Hi Michal!

Michal 'vorner' Vaner schrieb:
 Do you think, if I have the XML stream (incoming one) and I want to
 split it, is it enough to count starting and ending tags, or they could
 be embedded in something like attribute? (eg. is <tag attr='<tag>'/> a
 legal thing or not?).

tag attr='tag'/ is not legal ... but note that tag attr='tag'/ is
legal ...
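
The angle brackets in the examples above seem to have been lost in the archive, so the exact strings cannot be recovered. The underlying XML rule can still be checked directly: a literal `<` is not allowed inside an attribute value, while a literal `>` (or the escaped form `&lt;`) is. A minimal sketch using Python's bundled expat bindings; the helper name `is_well_formed` is made up for this illustration:

```python
import xml.parsers.expat


def is_well_formed(text):
    """Return True if expat accepts `text` as a complete XML document."""
    parser = xml.parsers.expat.ParserCreate()
    try:
        parser.Parse(text, True)  # True marks this as the final chunk
        return True
    except xml.parsers.expat.ExpatError:
        return False


# A raw '<' inside an attribute value is rejected by the parser.
print(is_well_formed("<tag attr='<tag>'/>"))
# A raw '>' inside an attribute value is accepted.
print(is_well_formed("<tag attr='tag>'/>"))
# Escaped angle brackets are accepted as well.
print(is_well_formed("<tag attr='&lt;tag&gt;'/>"))
```

So a splitter that only counts `<` characters cannot be fooled by attribute values, but one that looks for `>` or `/>` can.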

 And, if I have these split things and insert them between the stream
 header I got in the beginning and an corresponding stream end, can I
 parse it using DOM parser for each separate stanza?
 
 <stream:stream xmlns=...>
   <message ...>
 .
 .
 .
   </message>
 </stream:stream>

This representation (single stanza wrapped inside a stream root element)
is what I am currently using in my new changes to jadc2s.

 - can the thing depend or previous stanzas, or it depends only on the
 stream header? (with prefixes and so)

Some implementations (e.g. jabberd14, jabberd2) even only store and
process the stanza without the stream root element. So a stanza wrapped
by the root element should be okay as well.


Tot kijk
Matthias

-- 
Matthias Wimmer      Fon +49-700 77 00 77 70
Züricher Str. 243    Fax +49-89 95 89 91 56
81476 München        http://ma.tthias.eu/


Re: [jdev] Splitting the stream

2006-11-01 Thread Norman Rasmussen

On 11/1/06, Michal 'vorner' Vaner [EMAIL PROTECTED] wrote:

Do you think, if I have the XML stream (incoming one) and I want to
split it, is it enough to count starting and ending tags,

yes, this is the best way.


or they could
be embedded in something like attribute? (eg. is <tag attr='<tag>'/> a
legal thing or not?).

It is illegal XML.


And, if I have these split things and insert them between the stream
header I got in the beginning and an corresponding stream end, can I
parse it using DOM parser for each separate stanza?

yes.


<stream:stream xmlns=...>
  <message ...>
.
  </message>
</stream:stream>

- can the thing depend or previous stanzas, or it depends only on the
stream header? (with prefixes and so)


it only depends on the header, not previous stanzas.
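
A minimal sketch of the wrapping approach discussed here, assuming the stream header declares the usual `jabber:client` default namespace and the `stream` prefix. The constants `STREAM_HEADER` and `STREAM_END` and the helper `parse_stanza` are names invented for this illustration; a real splitter would reuse the header text it actually received.

```python
from xml.dom import minidom

# Assumed header; in practice this is the header text received from the peer.
STREAM_HEADER = ("<stream:stream xmlns:stream='http://etherx.jabber.org/streams' "
                 "xmlns='jabber:client'>")
STREAM_END = "</stream:stream>"


def parse_stanza(stanza_text):
    """Wrap one already-split stanza in the stream root and parse it with a DOM parser."""
    doc = minidom.parseString(STREAM_HEADER + stanza_text + STREAM_END)
    # The stanza is the only child of the synthetic stream root.
    return doc.documentElement.firstChild


stanza = parse_stanza("<message to='someone@example.org'><body>hi</body></message>")
# The stanza inherits its namespace from the header, not from earlier stanzas.
print(stanza.tagName, stanza.namespaceURI)
```

Because namespace declarations only flow down from the root, each stanza can be parsed this way independently of the stanzas before it.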


I have to ask why you're implementing
yet-another-xmpp-parsing-library.  Why not pick one of the existing
ones and use that?

--
- Norman Rasmussen
- Email: [EMAIL PROTECTED]
- Home page: http://norman.rasmussen.co.za/


Re: [jdev] Splitting the stream

2006-11-01 Thread Matthias Wimmer
Michal 'vorner' Vaner schrieb:
 And, if I have these split things and insert them between the stream
 header I got in the beginning and an corresponding stream end, can I
 parse it using DOM parser for each separate stanza?

But that is something I would not do. It requires you to parse the
complete data twice. I'd just use a SAX parser for the complete stream.
And with the SAX-events you can generate DOM-Nodes in a DOM-Document.

E.g. you could have one document for the stream, generated inside the
SAX-Events. When you notice, that a stanza has been completed, you can
call the handler for the stanza with the handle/pointer/... to that
element in the document. After the handler returned/the stanza has been
processed, you can just remove that stanza root element from the DOM
document again.
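
The scheme described above can be sketched with Python's standard library, using `xml.etree.ElementTree.XMLPullParser` as the event source instead of a raw SAX handler (a substitution made for brevity, not necessarily what jadc2s does). The class name `StanzaSplitter` and the callback `on_stanza` are hypothetical.

```python
import xml.etree.ElementTree as ET


class StanzaSplitter:
    """Parse one long-lived stream once and hand out each completed stanza."""

    def __init__(self, on_stanza):
        self.parser = ET.XMLPullParser(events=("start", "end"))
        self.depth = 0          # 0 = outside the stream, 1 = directly inside the root
        self.root = None        # the stream root element
        self.on_stanza = on_stanza

    def feed(self, data):
        """Feed a chunk of bytes exactly as read from the socket."""
        self.parser.feed(data)
        for event, elem in self.parser.read_events():
            if event == "start":
                self.depth += 1
                if self.depth == 1:
                    self.root = elem
            else:
                self.depth -= 1
                if self.depth == 1:
                    # A direct child of the root just closed: one full stanza.
                    self.on_stanza(elem)
                    # Drop it again so the document does not grow forever.
                    self.root.remove(elem)


splitter = StanzaSplitter(lambda stanza: print(stanza.tag, stanza.attrib))
splitter.feed(b"<stream:stream xmlns:stream='http://etherx.jabber.org/streams' "
              b"xmlns='jabber:client'><message id='1'><bo")
splitter.feed(b"dy>hi</body></message><iq id='2'/>")
```

Chunk boundaries may fall anywhere, even inside a tag, because the parser buffers incomplete input itself.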


Tot kijk
Matthias

-- 
Matthias Wimmer      Fon +49-700 77 00 77 70
Züricher Str. 243    Fax +49-89 95 89 91 56
81476 München        http://ma.tthias.eu/


Re: [jdev] XMPP Ping method?

2006-11-01 Thread Kevin Smith

Cross-posting to sjig:

On 1 Nov 2006, at 16:45, Peter Saint-Andre wrote:

Scott Robinson wrote:

What is the proper method of performing a ping across a client XMPP
connection. That is, from a server's perspective, if a client
mysteriously and unexpectedly drops off the Internet, it won't know it
until the TCP connection times out.


Most servers use whitespace pings. It would be good for us to document
that method in an XMPP extension.


Perhaps we could also revisit the ping/ack (J|X)EP, as the whitespace
ping doesn't really address the issue as we might desire?
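
For reference, a whitespace ping amounts to writing a single space between stanzas. A minimal sketch, assuming a plain connected socket; the function name `whitespace_ping` is made up here. A successful send only means the byte was queued locally, so a dead peer shows up on a later write or not at all, which is the limitation raised above.

```python
import socket


def whitespace_ping(sock):
    """Write one space to the stream; return False if the local stack reports an error."""
    try:
        sock.sendall(b" ")
        return True
    except OSError:
        # The connection is already known to be broken on this side.
        return False


# Demonstration on a local socket pair standing in for a client connection.
server_side, client_side = socket.socketpair()
print(whitespace_ping(server_side))   # the byte was handed to the OS
print(client_side.recv(1))            # the peer just sees whitespace between stanzas
server_side.close()
client_side.close()
```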


/K

--
Kevin Smith
Psi XMPP Client Project Leader (http://psi-im.org)





Re: [jdev] XMPP Ping method?

2006-11-01 Thread Tobias Markmann
Isn't that a TCP problem, since it can happen to any protocol that is based on TCP?

On 11/1/06, Scott Robinson [EMAIL PROTECTED] wrote:

What is the proper method of performing a ping across a client XMPP
connection. That is, from a server's perspective, if a client
mysteriously and unexpectedly drops off the Internet, it won't know it
until the TCP connection times out.

This obviously can lead to two inconvenient situations:

1. Dropped messages that could otherwise be stored.
2. A stale presence of availability.

A quick Google search didn't result in anything useful, except a thread
noting how it _shouldn't_ be done.

Any ideas?

--
Scott Robinson [EMAIL PROTECTED]
http://quadhome.com/


Re: [jdev] XMPP Ping method?

2006-11-01 Thread Michal 'vorner' Vaner
On Wed, Nov 01, 2006 at 06:07:39PM +0100, Tobias Markmann wrote:
 Isn't that a TCP problem since that can happen to any protocol which is
 based to TCP?

Well, it is partly an implementation problem. Many OSes (as I have heard) are
able to tell you how much has already been delivered, and if you remember which
part of the data was which stanza, you can resend it after reconnecting.

But that is a bit more work, of course, and a lot more data.

-- 
This email has not been checked by an antivirus system.
No virus found.

Michal vorner Vaner




[jdev] Re: Splitting the stream

2006-11-01 Thread Alexander Gnauck

Matthias Wimmer wrote:

Michal 'vorner' Vaner schrieb:

And, if I have these split things and insert them between the stream
header I got in the beginning and an corresponding stream end, can I
parse it using DOM parser for each separate stanza?


But that is something I would not do. It requires you to parse the
complete data twice. I'd just use a SAX parser for the complete stream.
And with the SAX-events you can generate DOM-Nodes in a DOM-Document.

E.g. you could have one document for the stream, generated inside the
SAX-Events. When you notice, that a stanza has been completed, you can
call the handler for the stanza with the handle/pointer/... to that
element in the document. After the handler returned/the stanza has been
processed, you can just remove that stanza root element from the DOM
document again.


I agree with Matthias.
The technique you should use also depends on the programming language
and the XML libraries you use. In my experience, SAX-like techniques
work very well for XMPP streams.
=> Parse the stream with SAX and build your DOM or stanzas from the
SAX callbacks.


As Matthias already mentioned some developers see the complete stream 
(including the header) as one XML Document, others use the header only 
to grab the correct namespace and see each stanza as a new DOM which is 
not in the context of the stream header.


All techniques have their pros and cons, which also strongly depend on
your programming language and XML libraries.


It also depends on your project and requirements. It makes a big difference
whether you need a light and really fast implementation for an embedded
device, or don't care about speed but want a completely
namespace-correct and validating implementation.


Alex




Re: [jdev] XMPP Ping method?

2006-11-01 Thread Peter Saint-Andre
Justin Karneges wrote:
 On Tuesday 31 October 2006 4:49 pm, Scott Robinson wrote:
 What is the proper method of performing a ping across a client XMPP
 connection.
 
 There isn't.  There was a JEP/XEP proposal to add Acking and Ping features to 
 the protocol, but it wasn't accepted as a XEP (that is, it died before it 
 even got a XEP number).  The JSF Council recommendation was to take care of 
 these features in the XMPP-Core RFC instead of a XEP.  

But the latest discussions about this (a month or two ago?) were that
this is most appropriate as an XMPP extension (XEP), so please send me
the latest version and we'll get that on the docket for the next XMPP
Council meeting.

 This seems unlikely 
 though, as I think the RFC revision has already been made, and there was no 
 consideration for these features.

It is not accurate to say that the RFC revision has already been made.
I published -00 versions of rfc3920bis and rfc3921bis, but that is the
start of discussion, not the end.

Peter

-- 
Peter Saint-Andre
Jabber Software Foundation
http://www.jabber.org/people/stpeter.shtml





Re: [jdev] Re: Splitting the stream

2006-11-01 Thread Michal 'vorner' Vaner
Hello,

On Wed, Nov 01, 2006 at 06:12:15PM +0100, Alexander Gnauck wrote:
 Matthias Wimmer wrote:
 Michal 'vorner' Vaner schrieb:
 And, if I have these split things and insert them between the stream
 header I got in the beginning and an corresponding stream end, can I
 parse it using DOM parser for each separate stanza?
 
 But that is something I would not do. It requires you to parse the
 complete data twice. I'd just use a SAX parser for the complete stream.
 And with the SAX-events you can generate DOM-Nodes in a DOM-Document.
 
 E.g. you could have one document for the stream, generated inside the
 SAX-Events. When you notice, that a stanza has been completed, you can
 call the handler for the stanza with the handle/pointer/... to that
 element in the document. After the handler returned/the stanza has been
 processed, you can just remove that stanza root element from the DOM
 document again.

 i agree with Matthias.
 The technique you should use also depends on the programming language 
 and the XML libraries you use. From my experience SAX like techniques 
 works very well for XMPP streams.
 == Parsing the stream with Sax and build your Dom or stanzas from the 
 sax callbacks.


Well, I just think I do not need to _parse_ it if I'm not interested in
the information there. I only want to split it to parts and feed that to
different program.

 As Matthias already mentioned some developers see the complete stream 
 (including the header) as one XML Document, others use the header only 
 to grab the correct namespace and see each stanza as a new DOM which is 
 not in the context of the stream header.
 
 All techniques have their pros and cons which also strongly depends on 
 your programming language and XML libraries.
 
 It also depends on your project and requirements. It's a big difference 
 if you need a light and really fast implementation for a embedded 
 device, or if you don't care about speed but want a complete namespace 
 correct and validating implementation.
 
I need to build it as many small independent programs, which I think nobody
has done yet. It is just an experiment: maybe it won't work at all, or maybe
something very flexible will come of it. I just do not know.

Thank you for your comments.

-- 
Please stay calm. There is no use both of us being hysterical.

Michal vorner Vaner




Re: [jdev] XMPP Ping method?

2006-11-01 Thread Justin Karneges
On Wednesday 01 November 2006 9:17 am, Peter Saint-Andre wrote:
 But the latest discussions about this (a month or two ago?) were that
 this is most appropriate as an XMPP extension (XEP), so please send me
 the latest version and we'll get that on the docket for the next XMPP
 Council meeting.

http://www.xmpp.org/extensions/inbox/ack.html

-Justin


RE: [jdev] XMPP Ping method?

2006-11-01 Thread Chris Mullins
From the very first day I wrote a Jabber packet, I was looking for a
Ping command. I would still like to see one.

I usually end up using IQ:Time, or IQ:Version as a ping. Almost
everything supports this - clients, servers, bots, etc. 
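
A rough sketch of using a version query as a ping, assuming the `jabber:iq:version` query namespace. The helper names `make_version_probe` and `answers_probe` and the example address are invented for illustration. Whether an error reply should count as "alive" is a policy choice left open here; this sketch treats any `result` or `error` with the matching id as an answer.

```python
from xml.sax.saxutils import quoteattr
import xml.etree.ElementTree as ET


def make_version_probe(to, probe_id):
    """Build an iq 'get' carrying a software-version query, used only as a liveness probe."""
    return ("<iq type='get' to=%s id=%s><query xmlns='jabber:iq:version'/></iq>"
            % (quoteattr(to), quoteattr(probe_id)))


def answers_probe(stanza_xml, probe_id):
    """True if the stanza is an iq result or error that carries the probe's id."""
    elem = ET.fromstring(stanza_xml)
    local_name = elem.tag.rsplit("}", 1)[-1]   # ignore any namespace part
    return (local_name == "iq"
            and elem.get("id") == probe_id
            and elem.get("type") in ("result", "error"))


print(make_version_probe("peer@example.org/home", "probe1"))
print(answers_probe("<iq type='result' id='probe1' from='peer@example.org/home'/>", "probe1"))
```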

I would still love to see IQ:Ping. 

--
Chris Mullins

-Original Message-
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf
Of Peter Saint-Andre
Sent: Wednesday, November 01, 2006 8:45 AM
To: Jabber software development list
Subject: Re: [jdev] XMPP Ping method?

Scott Robinson wrote:
 What is the proper method of performing a ping across a client XMPP
 connection. That is, from a sever's perspective, if a client
 mysteriously and unexpectedly drops off the Internet, it won't know it
 until the TCP connection times out.

Most servers use whitespace pings. It would be good for us to document
that method in an XMPP extension.

Peter

-- 
Peter Saint-Andre
Jabber Software Foundation
http://www.jabber.org/people/stpeter.shtml



RE: [jdev] XMPP Ping method?

2006-11-01 Thread Chris Mullins
That's quite a bit more complicated than I envisioned. 

I was just thinking IQ:Ping, and leave it there. For all of the use cases 
I've come across, this would be sufficient. 

--
Chris Mullins

-Original Message-
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Justin Karneges
Sent: Wednesday, November 01, 2006 9:27 AM
To: Jabber software development list
Subject: Re: [jdev] XMPP Ping method?

On Wednesday 01 November 2006 9:17 am, Peter Saint-Andre wrote:
 But the latest discussions about this (a month or two ago?) were that
 this is most appropriate as an XMPP extension (XEP), so please send me
 the latest version and we'll get that on the docket for the next XMPP
 Council meeting.

http://www.xmpp.org/extensions/inbox/ack.html

-Justin


[jdev] Re: Splitting the stream

2006-11-01 Thread Alexander Gnauck

Michal 'vorner' Vaner wrote:

Well, I just think I do not need to _parse_ it if I'm not interested in
the information there. I only want to split it to parts and feed that to
different program.


What is parsing for you?
Splitting is parsing for me. A SAX or pull parser, or an XML tokenizer,
splits your stream and does not care about the content.


Alex



[jdev] Re: XMPP Ping method?

2006-11-01 Thread Magnus Henoch
Michal 'vorner' Vaner [EMAIL PROTECTED] writes:

 Well, it is partly implementation problem, many OSes (as I heard) are
 able to tell you how much was already delivered and if you remember what
 part of data was what stanza, you can resend it after reconnection.

 But that is bit more work, of course, and alot more data.

I'm in no way an expert in network programming, so what I'm about to
write might qualify as disinformation; please write corrections or
completions.

We want to know if the remote side has ACKed receipt of all the bytes
in a stanza.  In Linux, there is a way to get the size of the TCP send
queue:
http://mail.jabber.org/pipermail/standards-jig/2003-December/004570.html
But it seems to me that using that method would be cumbersome:

1. Send stanza A to connection.  Save copy of A and size(A).
2. Prepare to send stanza B.  If send queue is 0, forget A and goto 1.
   Else save B and size(B), and increase size(A) with size(B).
3. If send queue size is less than (the modified) size(A), consider A
   to be acked.  Likewise for B.
4. If connection fails, queue or bounce all stanzas sent but not
   acked.

Or something like that.  I probably got it wrong somewhere, and I
would probably make more errors if I tried to convert that into code.
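
One way to sketch the bookkeeping is with cumulative byte offsets rather than the per-stanza size adjustments listed above. This is a simplified reformulation, not the exact steps. It assumes some OS facility reports the current send-queue size (bytes written but not yet acknowledged); here that number is simply passed in as `send_queue_size`. The class name `UnackedTracker` is hypothetical.

```python
from collections import deque


class UnackedTracker:
    """Remember sent stanzas until the TCP send queue shows they were acknowledged."""

    def __init__(self):
        self.total_sent = 0      # bytes written to the connection so far
        self.pending = deque()   # (stream offset just past the stanza, stanza bytes)

    def record_send(self, stanza):
        self.total_sent += len(stanza)
        self.pending.append((self.total_sent, stanza))

    def update(self, send_queue_size):
        """Return the stanzas that are fully acknowledged given the current queue size."""
        acked_up_to = self.total_sent - send_queue_size
        acked = []
        while self.pending and self.pending[0][0] <= acked_up_to:
            acked.append(self.pending.popleft()[1])
        return acked

    def unacked(self):
        """Stanzas to queue or bounce if the connection fails now."""
        return [stanza for _, stanza in self.pending]


tracker = UnackedTracker()
tracker.record_send(b"<a/>")    # 4 bytes
tracker.record_send(b"<bb/>")   # 5 bytes
print(tracker.update(5))        # 5 bytes still queued: only the first stanza is acked
print(tracker.unacked())
```

A TCP-level acknowledgement only says the remote kernel received the bytes, not that the remote application processed the stanza.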

So it would be nice if sending a piece of data returned the sequence
number of the last byte sent.  Then you could just compare it to the
sequence number of the last byte ACKed, and then you immediately know
if the stanza was received.

Thus, we should try to convince makers of socket APIs to include
functions to do just that.  Or did I miss anything?

-- 
Magnus
JID: [EMAIL PROTECTED]



Re: [jdev] XMPP Ping method?

2006-11-01 Thread Tomasz Sterna

On 11/1/06, Scott Robinson [EMAIL PROTECTED] wrote:

What is the proper method of performing a ping across a client XMPP
connection. That is, from a sever's perspective, if a client
mysteriously and unexpectedly drops off the Internet, it won't know it
until the TCP connection times out.


Isn't TCP enough to handle the situation?
Why do you need to implement it on application level?

--
smk


Re: [jdev] XMPP Ping method?

2006-11-01 Thread Artur Hefczyc
On Wednesday 01 November 2006 20:08, Tomasz Sterna wrote:
 On 11/1/06, Scott Robinson [EMAIL PROTECTED] wrote:
  What is the proper method of performing a ping across a client XMPP
  connection. That is, from a sever's perspective, if a client
  mysteriously and unexpectedly drops off the Internet, it won't know it
  until the TCP connection times out.

 Isn't TCP enough to handle the situation?
 Why do you need to implement it on application level?

If two machines are connected over a TCP/IP connection and there are one
or more firewalls in between, try powering off one of the machines.
The other side will not notice the connection drop.

Default TCP/IP timeout is 300sec.

These kinds of power-off disconnections happen surprisingly often.
This is especially a problem for network servers serving many
simultaneous connections, as such clients reconnect within a few
seconds, so the server ends up with many dead connections...
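
One transport-level knob that bears on this is TCP keepalive, which makes the kernel probe an idle connection. A minimal sketch, assuming a platform that exposes the usual socket options; the per-connection timing options are not available everywhere, so they are set only when present. The function name `enable_keepalive` and the default timings are illustrative choices, not recommendations from this thread.

```python
import socket


def enable_keepalive(sock, idle=60, interval=10, count=3):
    """Ask the kernel to probe an idle connection instead of waiting for its default timeout."""
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE, 1)
    # Timing options are platform-specific; apply only those this platform defines.
    for name, value in (("TCP_KEEPIDLE", idle),       # idle seconds before the first probe
                        ("TCP_KEEPINTVL", interval),  # seconds between probes
                        ("TCP_KEEPCNT", count)):      # failed probes before giving up
        option = getattr(socket, name, None)
        if option is not None:
            sock.setsockopt(socket.IPPROTO_TCP, option, value)


sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
enable_keepalive(sock)
print(sock.getsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE) != 0)
sock.close()
```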

Artur
-- 
Artur Hefczyc
http://www.tigase.org/
http://wttools.sf.net/


[jdev] Re: XMPP Ping method?

2006-11-01 Thread Alexander Gnauck

Tomasz Sterna wrote:

On 11/1/06, Scott Robinson [EMAIL PROTECTED] wrote:

What is the proper method of performing a ping across a client XMPP
connection. That is, from a sever's perspective, if a client
mysteriously and unexpectedly drops off the Internet, it won't know it
until the TCP connection times out.


Isn't TCP enough to handle the situation?
Why do you need to implement it on application level?


For some devices it's not. If you work with wireless devices (WLAN, GSM,
UMTS) you will see lots of strange behavior.


Alex



Re: [jdev] XMPP Ping method?

2006-11-01 Thread Dave Cridland

On Wed Nov  1 17:07:11 2006, Michal 'vorner' Vaner wrote:

On Wed, Nov 01, 2006 at 06:07:39PM +0100, Tobias Markmann wrote:
 Isn't that a TCP problem since that can happen to any protocol which is
 based to TCP?

Well, it is partly implementation problem, many OSes (as I heard) are
able to tell you how much was already delivered and if you remember what
part of data was what stanza, you can resend it after reconnection.

But that is bit more work, of course, and alot more data.


No OS can tell you what's been delivered, but some might be able to 
tell you what hasn't been sent, and what hasn't been acknowledged. I 
looked for how to do this on Linux, which usually provides the 
richest API to the network layer, but I couldn't find anything to 
tell me either.


But this isn't quite the same thing anyway - you want to know what 
stanzas have been accepted - what happens if the ACKs get lost, or 
the server dies?


Consider ESMTP, which has got data level acknowledgement. There's a 
long-known problem whereby after DATA (and these days, BDAT and 
BURL), there's a chance that you'll lose the connection before you 
get the 2xx acknowledgement from the server. This is on the increase 
again, partly due to the preference for protocol-level rejections 
instead of DSNs, partly due to the marked increase in usage of ESMTP 
over things like GPRS.


It's important to note that this specifically is about hop-by-hop, 
and not end-to-end, which are different problems entirely. Finding 
out if the guy you're talking to is still connected is quite easy, 
just send an IQ (in principle *any* IQ), and you'll see.


Hop-by-hop tests are quite easy, too, but there's a gotcha - when 
they fail, you want to know which stanzas you need to resend. And 
XMPP does not provide any mechanism for that, and nor do pings.


My last suggestion - adding a sequence attribute to stanzas - didn't 
seem to impress most people, partly because it requires servers to 
rewrite stanzas between hops.


If instead the sender appends a distinct stanza (which could be an 
iq, or could be something else) to every TCP segment sent, which 
itself contains a sequence, then that can be used as the restart 
token with almost precisely the same effect, and requires no 
rewriting of stanzas.


So, the sender appends, for instance,
<iq type='set' id='ping123'><ping xmlns='urn:xmpp:ping' sent='1' recv='47'/></iq>
to each send() call's payload, and the receiver can then note this
simply, and respond with an iq reply when it suits it, which also
contains sent and recv sequence counts.


Loosely, you'd add that to the end of each TCP packet, in practise 
about every 1.5k or at the end of each send() should be quite safe.
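
A small sketch of building and reading such a sequence-carrying ping, following the element and attribute names from the suggestion above. The message does not spell out exactly what `sent` and `recv` count, so the reading here (own sequence number, and the last sequence number seen from the peer) is an assumption. The helper names `make_ping` and `read_ping` are invented.

```python
from xml.sax.saxutils import quoteattr
import xml.etree.ElementTree as ET

PING_NS = "urn:xmpp:ping"


def make_ping(ping_id, sent, recv):
    """Build the iq to append to an outgoing write, carrying both sequence counters."""
    return ("<iq type='set' id=%s><ping xmlns='%s' sent='%d' recv='%d'/></iq>"
            % (quoteattr(ping_id), PING_NS, sent, recv))


def read_ping(stanza_xml):
    """Return (sent, recv) from a ping iq, or None if the stanza carries no ping."""
    ping = ET.fromstring(stanza_xml).find("{%s}ping" % PING_NS)
    if ping is None:
        return None
    return int(ping.get("sent")), int(ping.get("recv"))


payload = "<message to='peer@example.org'><body>hi</body></message>" + make_ping("ping123", 1, 47)
print(payload)
print(read_ping(make_ping("ping123", 1, 47)))
```

After a reconnect, the last `recv` value seen from the peer would serve as the restart token for deciding which stanzas to resend.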


Dave.
--
Dave Cridland - mailto:[EMAIL PROTECTED] - xmpp:[EMAIL PROTECTED]
 - acap://acap.dave.cridland.net/byowner/user/dwd/bookmarks/
 - http://dave.cridland.net/
Infotrope Polymer - ACAP, IMAP, ESMTP, and Lemonade


Re: [jdev] Re: Splitting the stream

2006-11-01 Thread Matthias Wimmer
Michal 'vorner' Vaner schrieb:
 Well, I just think I do not need to _parse_ it if I'm not interested in
 the information there. I only want to split it to parts and feed that to
 different program.

As Alexander already said: I think what you plan to do (you wrote in a
private mail, that you plan to use regular expressions) is parsing as well.

I am not yet sure how you plan to write your regular expressions, but
it should not be possible to write a regular expression that matches
exactly the valid stanzas.
Regular expressions can only express regular languages
(Chomsky type 3), and with the pumping lemma for regular languages you
can show that the set of all valid stanzas is not a regular language.

 I need to make it many small independent programs. Which I think nobody
 yet did. It is just an experiment, maybe it wont work at all, it is
 possible something very flexible may become of it, I just do not know.

What do you plan to do if the component splits off an invalid stanza and
forwards it to the small program? I think that might cause you problems
as well. You must not just expect that everything you get from a server is
valid. This might make you more vulnerable to attacks.

The overhead you have by using a proven XML parser should not be very
much, and you can be sure, that other people already spent their
thinking in handling all special cases that can occur in XML documents.
E.g. with a naive approach to finding the start and end tags you might
get confused if you receive something like the following:
<message from='[EMAIL PROTECTED]' to='[EMAIL PROTECTED]' id='123/>'>... and others.
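
An attribute value ending in `/>` is well-formed XML, but to a naive scanner it looks like the end of an empty element. A short sketch of the difference; the regular expression stands in for a hypothetical naive splitter, and the addresses are placeholders:

```python
import re
import xml.etree.ElementTree as ET

stanza = ("<message from='a@example.org' to='b@example.org' id='123/>'>"
          "<body>hi</body></message>")

# Naive check: "an opening tag that ends in '/>' is an empty element".
naive_thinks_empty = re.match(r"<[^>]*/>", stanza) is not None

# A real parser sees one message element with a body child and the full id value.
elem = ET.fromstring(stanza)

print(naive_thinks_empty)
print(elem.get("id"), elem.find("body").text)
```

The naive check would cut the stanza in the middle of the `id` attribute, while the parser keeps it intact.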

How do you plan to route the stanzas to the small programs? I guess you
will need some information out of the stanzas for that as well, no? If
you'd parse the stream using an XML parser and generate a DOM- or
DOM-like document, your programs could then use XPath expressions to
subscribe to the stanzas they are interested in, which might be very
handy as well.
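
The subscription idea might look roughly like this, using the limited XPath subset built into Python's `xml.etree.ElementTree` rather than a full XPath engine. The `Router` class, its method names, and the wrapper element are assumptions made for the sketch.

```python
import xml.etree.ElementTree as ET


class Router:
    """Let small handlers subscribe to stanzas with path expressions."""

    def __init__(self):
        self.subscriptions = []   # list of (path, handler)

    def subscribe(self, path, handler):
        self.subscriptions.append((path, handler))

    def dispatch(self, stanza):
        """Call every handler whose path matches; return how many matched."""
        # Wrap the stanza so a path can test the stanza element itself.
        holder = ET.Element("holder")
        holder.append(stanza)
        matched = 0
        for path, handler in self.subscriptions:
            if holder.find(path) is not None:
                handler(stanza)
                matched += 1
        return matched


router = Router()
router.subscribe("./{jabber:client}message[@type='chat']",
                 lambda s: print("chat handler got", s.get("id")))
router.dispatch(ET.fromstring(
    "<message xmlns='jabber:client' type='chat' id='m1'><body>hi</body></message>"))
```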


Tot kijk
Matthias

-- 
Matthias Wimmer      Fon +49-700 77 00 77 70
Züricher Str. 243    Fax +49-89 95 89 91 56
81476 München        http://ma.tthias.eu/


Re: [jdev] Re: XMPP Ping method?

2006-11-01 Thread Tomasz Sterna

On 11/1/06, Artur Hefczyc [EMAIL PROTECTED] wrote:

If 2 machines are connected over TCP/IP connection and there is one
or more firewalls in between, try to power-off one of the machines.
Another side will not notice connection drop.


Why not?
TCP handles disappearing end well.



Default TCP/IP timeout is 300sec.


Is there a MUST to use defaults?


On 11/1/06, Alexander Gnauck [EMAIL PROTECTED] wrote:

for some devices it's not. If you work with wireless devices (WLAN, GSM,
UMTS) you will see lot's of strange behavior.


Isn't that broken TCP implementation then?


--
smk


[jdev] Re: XMPP Ping method?

2006-11-01 Thread Alexander Gnauck

Tomasz Sterna wrote:

for some devices it's not. If you work with wireless devices (WLAN, GSM,
UMTS) you will see lot's of strange behavior.


Isn't that broken TCP implementation then?


Sometimes it is, but sometimes it's by design. I think we have to
address these issues and can't just say the TCP implementation is broken.

In these cases it's sometimes better to use HTTP-Polling and HTTP-Binding.

Alex



Re: [jdev] Re: Splitting the stream

2006-11-01 Thread Norman Rasmussen

On 11/1/06, Matthias Wimmer [EMAIL PROTECTED] wrote:

As Alexander already said: I think what you plan to do (you wrote in a
private mail, that you plan to use regular expressions) is parsing as well.


Yegh, try and steer clear of RegEx to parse XML.  Using a standard SAX
parser should be more than sufficient, and it's probably the least
memory-intensive and fastest parser for what you want to do.

--
- Norman Rasmussen
- Email: [EMAIL PROTECTED]
- Home page: http://norman.rasmussen.co.za/