Re: [Standards] UPDATED: XEP-0292 (vCard4 Over XMPP)

2011-06-24 Thread Peter Saint-Andre
On 6/24/11 10:02 AM, XMPP Extensions Editor wrote:
> Version 0.5 of XEP-0292 (vCard4 Over XMPP) has been released.
> 
> Abstract: This document specifies an XMPP extension for use of the
> vCard4 XML format in XMPP systems, with the intent of obsoleting the
> vcard-temp format.
> 
> Changelog: Corrected XSLT script; provided detailed examples of the
> vcard-temp and vCard4 XML formats. (psa)
> 
> Diff: http://xmpp.org/extensions/diff/api/xep/0292/diff/0.4/vs/0.5
> 
> URL: http://xmpp.org/extensions/xep-0292.html

Folks, I'm done with this one for a little while (too many other things
to work on). Reviews would be welcome.

As background, to make progress on this spec I compared the vCard
"flavors" specified in these documents:

vcard3 = http://tools.ietf.org/html/rfc2426

vcard3 XML = http://tools.ietf.org/id/draft-dawson-vcard-xml-dtd-03.txt

vcard-temp = http://xmpp.org/extensions/xep-0054.html

vcard4 = http://tools.ietf.org/html/draft-ietf-vcarddav-vcardrev-22

vcard4 XML = http://tools.ietf.org/html/draft-ietf-vcarddav-vcardxml-11

There are many differences and discrepancies between those flavors,
complicating the task of migrating from vcard-temp to vcard4 XML.
Although I think that I've defined things correctly now, there might be
some more ugly beasts lurking about in this dark forest. :)

Thanks!

/psa


Re: [Standards] RTT: no negotiation of the feature

2011-06-24 Thread Kurt Zeilenga

On Jun 24, 2011, at 8:53 AM, Mark Rejhon wrote:

> On Fri, Jun 24, 2011 at 11:43 AM, Kurt Zeilenga  
> wrote:
> > earlier? I think this will make most people happy, and will only add a
> > few lines to the spec. See for example XEP-0085, section 4:
> > http://xmpp.org/extensions/xep-0085.html
> >
> > Kurt et cetera, would this be satisfactory in the short term?
> 
> Yes.
> 
> Ok, it's not a painful change, and allows me to get the spec up sooner before 
> too many companies do damage with proprietary RTT.  Kurt?
>  
> 
> > It would at least mean XMPP RTT would now have a basic mechanism of 
> > discovering whether the other end supports RTT, and being able to restrain 
> > from sending RTT if the other end does not support RTT. This would not be 
> > the complete session negotiation algorithm, but would allay the chief
> > concern of Kurt.
> 
> Correct, and it would allow for fall back to unextended XMPP if RTT was not 
> available end-to-end, which I would think quite important in emergency and 
> deaf communications.
> 
> Yes, but RTT is backwards compatible, so both RTT and non-RTT conversations 
> look exactly the same to a client that does not support RTT.

My point is that if one uses an extension without first successfully 
negotiating it, one runs the risk of being blocked, which is a far more 
disruptive approach than simply disrupting feature negotiation.

Where an extension causes harm (or is perceived to be harmful), one can expect 
service/network operators to take steps to prevent that harm (or perceived 
harm).  Operators would much rather just prevent such extensions by disrupting 
the negotiation of the feature's use than taking more disruptive action, such 
as dropping the XMPP sessions of the clients which send RTT without first 
successfully completing negotiation of the feature.  But if push comes to 
shove, the network's need to operate well in general will likely trump the desire 
of a few users to use some extension deemed harmful to the network.

-- Kurt

[Standards] UPDATED: XEP-0292 (vCard4 Over XMPP)

2011-06-24 Thread XMPP Extensions Editor
Version 0.5 of XEP-0292 (vCard4 Over XMPP) has been released.

Abstract: This document specifies an XMPP extension for use of the vCard4 XML 
format in XMPP systems, with the intent of obsoleting the vcard-temp format.

Changelog: Corrected XSLT script; provided detailed examples of the vcard-temp 
and vCard4 XML formats. (psa)

Diff: http://xmpp.org/extensions/diff/api/xep/0292/diff/0.4/vs/0.5

URL: http://xmpp.org/extensions/xep-0292.html



Re: [Standards] RTT: no negotiation of the feature

2011-06-24 Thread Mark Rejhon
On Fri, Jun 24, 2011 at 11:43 AM, Kurt Zeilenga wrote:
>
> > earlier? I think this will make most people happy, and will only add a
> > few lines to the spec. See for example XEP-0085, section 4:
> > http://xmpp.org/extensions/xep-0085.html
> >
> > Kurt et cetera, would this be satisfactory in the short term?


> Yes.
>

Ok, it's not a painful change, and allows me to get the spec up sooner
before too many companies do damage with proprietary RTT.  Kurt?


> It would at least mean XMPP RTT would now have a basic mechanism of
> discovering whether the other end supports RTT, and being able to restrain
> from sending RTT if the other end does not support RTT. This would not be
> the complete session negotiation algorithm, but would allay the chief
> concern of Kurt.
>
> Correct, and it would allow for fall back to unextended XMPP if RTT was not
> available end-to-end, which I would think quite important in emergency and
> deaf communications.
>

Yes, but RTT is backwards compatible, so both RTT and non-RTT conversations
look exactly the same to a client that does not support RTT.

In fact, if one wanted, one could even run groupchats with mixed RTT and
non-RTT clients without problems, even though I don't explicitly mention support for group
chats  because of the considerations I published at
http://www.marky.com/realjabber/XMPP-RTT-Supplement_2011-06-17.pdf  ...
linked from http://www.realjabber.org

Right now, RTT spec is defined for one-on-one conversations (even though the
RTT spec can be used verbatim for groupchats).  I mentioned group chats in
the previous specification, but for simplicity I removed mention/support for
group chat even though the RTT protocol continues to be compatible for use
in groupchat.

Mark Rejhon


Re: [Standards] RTT: no negotiation of the feature

2011-06-24 Thread Kurt Zeilenga

On Jun 24, 2011, at 8:27 AM, Mark Rejhon wrote:

> 2011/6/24 Remko Tronçon 
> Hi Mark,
> 
> On Fri, Jun 24, 2011 at 5:15 PM, Mark Rejhon  wrote:
> > The earlier spec covered feature negotiation via XEP-0020. However, it was
> > encouraged by a few people that spec simplification became more important,
> > and to focus chiefly on the most basic, core interop issues, at least for
> > the first published version of the specification.
> 
> How about just using Disco/Caps discovery for now, as was suggested
> earlier? I think this will make most people happy, and will only add a
> few lines to the spec. See for example XEP-0085, section 4:
> http://xmpp.org/extensions/xep-0085.html
> 
> Kurt et cetera, would this be satisfactory in the short term?

Yes.

> It would at least mean XMPP RTT would now have a basic mechanism of 
> discovering whether the other end supports RTT, and being able to restrain 
> from sending RTT if the other end does not support RTT. This would not be the 
> complete session negotiation algorithm, but would allay the chief concern of 
> Kurt.

Correct, and it would allow for fall back to unextended XMPP if RTT was not 
available end-to-end, which I would think quite important in emergency and deaf 
communications.

-- Kurt

Re: [Standards] RTT: no negotiation of the feature

2011-06-24 Thread Mark Rejhon
2011/6/24 Remko Tronçon 

> Hi Mark,
>
> On Fri, Jun 24, 2011 at 5:15 PM, Mark Rejhon  wrote:
> > The earlier spec covered feature negotiation via XEP-0020. However, it
> was
> > encouraged by a few people that spec simplification became more
> important,
> > and to focus chiefly on the most basic, core interop issues, at least for
> > the first published version of the specification.
>
> How about just using Disco/Caps discovery for now, as was suggested
> earlier? I think this will make most people happy, and will only add a
> few lines to the spec. See for example XEP-0085, section 4:
> http://xmpp.org/extensions/xep-0085.html


Kurt et cetera, would this be satisfactory in the short term?

It would at least mean XMPP RTT would now have a basic mechanism of
discovering whether the other end supports RTT, and being able to restrain
from sending RTT if the other end does not support RTT. This would not be
the complete session negotiation algorithm, but would allay the chief
concern of Kurt.


Re: [Standards] RTT: no negotiation of the feature

2011-06-24 Thread Remko Tronçon
Hi Mark,

On Fri, Jun 24, 2011 at 5:15 PM, Mark Rejhon  wrote:
> The earlier spec covered feature negotiation via XEP-0020. However, it was
> encouraged by a few people that spec simplification became more important,
> and to focus chiefly on the most basic, core interop issues, at least for
> the first published version of the specification.

How about just using Disco/Caps discovery for now, as was suggested
earlier? I think this will make most people happy, and will only add a
few lines to the spec. See for example XEP-0085, section 4:
http://xmpp.org/extensions/xep-0085.html
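
A minimal sketch of the discovery check suggested above: before sending any RTT, a client inspects the other end's service discovery (disco#info) result and only proceeds if the feature is advertised. The feature name `urn:example:rtt` and the helper `supports_feature` are hypothetical, since the thread does not name a namespace for RTT; the disco#info and chat states namespaces are the published ones.

```python
import xml.etree.ElementTree as ET

DISCO_INFO_NS = "http://jabber.org/protocol/disco#info"
# Hypothetical feature name; no RTT namespace is given in the thread.
RTT_FEATURE = "urn:example:rtt"


def supports_feature(disco_result_xml: str, feature: str) -> bool:
    """Return True if the disco#info result advertises the given feature."""
    root = ET.fromstring(disco_result_xml)
    query = root.find(f"{{{DISCO_INFO_NS}}}query")
    if query is None:
        return False
    advertised = {el.get("var") for el in query.findall(f"{{{DISCO_INFO_NS}}}feature")}
    return feature in advertised


# Example replies (addresses and ids are made up for illustration).
reply_with_rtt = """<iq type='result' from='juliet@example.com/balcony' id='d1'>
  <query xmlns='http://jabber.org/protocol/disco#info'>
    <feature var='http://jabber.org/protocol/chatstates'/>
    <feature var='urn:example:rtt'/>
  </query>
</iq>"""

reply_without_rtt = """<iq type='result' from='romeo@example.net/orchard' id='d2'>
  <query xmlns='http://jabber.org/protocol/disco#info'>
    <feature var='http://jabber.org/protocol/chatstates'/>
  </query>
</iq>"""

# Only send RTT when the other end advertises it; otherwise fall back to plain chat.
send_rtt_to_juliet = supports_feature(reply_with_rtt, RTT_FEATURE)
send_rtt_to_romeo = supports_feature(reply_without_rtt, RTT_FEATURE)
```

With a check like this, an operator that filters the feature out of the advertisement also stops well-behaved clients from sending RTT at all.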

cheers,
Remko


Re: [Standards] RTT: no negotiation of the feature

2011-06-24 Thread Kurt Zeilenga

On Jun 24, 2011, at 8:00 AM, Mark Rejhon wrote:

> On Fri, Jun 24, 2011 at 10:51 AM, Kurt Zeilenga  
> wrote:
> I should note that we'll kill this one way or the other, even if there's no 
> negotiation.  I'd just rather kill it by disrupting the negotiation.
> 
> I just think it's really bad form to have non-negotiated extensions.
> 
> It's not a 100% non-negotiated extension.

It's used before negotiated.  That's bad form.

> Negotiation is simply optional, and not documented in the specification. 
> Accept is done by continuing RTT by replying to event='start' with an 
> event='start'.

This means that you have to implement the extension to stop it, as opposed to 
simply not advertising (or disrupting the advertisement) of the extension for 
it not to be used.

> Reject is done by rejecting an attempted event='start' with an event='stop' 
> from the other end.
> 
> I also must point out that at least one 911 systems integrator is already 
> testing XMPP RTT, as a long-term replacement for deaf TDD/TTY, as a companion 
> to RFC4103 / T.140, which is also being considered for this use.  Thus, 
> servers are encouraged to stick to server policy (i.e. bandwidth 
> rate-limiting algorithms) rather than blocking the extension.

Rate-limits kick in too late, the damage would already have been done.  We need 
to stop the originator from putting the traffic onto the network.

-- Kurt

Re: [Standards] RTT: no negotiation of the feature

2011-06-24 Thread Mark Rejhon
On Fri, Jun 24, 2011 at 10:58 AM, Kevin Smith  wrote:

> > Servers can inject event='stop' to achieve the same thing. Problem
> solved?
> > (We don't condone this. XMPP RTT may interoperate with 911 services, as a
> > replacement for deaf TDD/TTY legal requirements.)
>
> In that case, surely negotiation is vital and urgent?
>

The earlier spec covered feature negotiation via XEP-0020. However, it was
encouraged by a few people that spec simplification became more important,
and to focus chiefly on the most basic, core interop issues, at least for
the first published version of the specification.

We can agree to accelerating some kind of a standards-compliant session
negotiation quickly (i.e. less than a month from now). But more than one
company already developed proprietary variations of real time text over XMPP
(all of which were inferior to XMPP RTT), and I finally successfully
convinced one of them to switch to my XMPP RTT standard.

Note: If anybody needs to block XMPP RTT abuse, they should do it via
bandwidth policy, not via extension-blocking policy, due to the possible use
of XMPP RTT by deaf individuals (assistive act violations) and 911
(emergency accessibility). Also, XMPP RTT is also used on mobile phones.
Carefully done, by a slow cellphone typist, XMPP RTT only uses 150 bytes a
second (UTF-8 XML bytes, excluding TCP/IP overhead).  Not a problem even for
GPRS.


Re: [Standards] RTT: no negotiation of the feature

2011-06-24 Thread Mark Rejhon
On Fri, Jun 24, 2011 at 10:51 AM, Kurt Zeilenga wrote:

> I should note that we'll kill this one way or the other, even if there's no
> negotiation.  I'd just rather kill it by disrupting the negotiation.
>
> I just think it's really bad form to have non-negotiated extensions.
>

It's not a 100% non-negotiated extension. Negotiation is simply optional,
and not documented in the specification. Accept is done by continuing RTT by
replying to event='start' with an event='start'. Reject is done by rejecting
an attempted event='start' with an event='stop' from the other end.

I also must point out that at least one 911 systems integrator is already
testing XMPP RTT, as a long-term replacement for deaf TDD/TTY, as a
companion to RFC4103 / T.140, which is also being considered for this use.
Thus, servers are encouraged to stick to server policy (i.e.
bandwidth rate-limiting algorithms) rather than blocking the extension.


Re: [Standards] RTT: no negotiation of the feature

2011-06-24 Thread Kevin Smith
On Fri, Jun 24, 2011 at 3:55 PM, Mark Rejhon  wrote:
> On Fri, Jun 24, 2011 at 10:48 AM, Kurt Zeilenga 
> wrote:
>>
>> Certain operational networks we support cannot deal with the extra
>> traffic.
>>
>> > That said, I agree -- it is going to be added to the spec in due time,
>> > well before XMPP RTT clients become popular.
>>
>> I see no reason why a basic yes/no negotiation of the extension cannot be
>> added now, whether by iq or by caps or some other appropriate mechanism. If
>> it needs to change later, fine.  But please add something now.
>
> Servers can inject event='stop' to achieve the same thing. Problem solved?
> (We don't condone this. XMPP RTT may interoperate with 911 services, as a
> replacement for deaf TDD/TTY legal requirements.)

In that case, surely negotiation is vital and urgent?

/K


Re: [Standards] RTT: no negotiation of the feature

2011-06-24 Thread Mark Rejhon
On Fri, Jun 24, 2011 at 10:48 AM, Kurt Zeilenga wrote:

> Certain operational networks we support cannot deal with the extra traffic.
>
> > That said, I agree -- it is going to be added to the spec in due time,
> well before XMPP RTT clients become popular.
>
> I see no reason why a basic yes/no negotiation of the extension cannot be
> added now, whether by iq or by caps or some other appropriate mechanism. If
> it needs to change later, fine.  But please add something now.
>

Servers can inject event='stop' to achieve the same thing. Problem solved?

(We don't condone this. XMPP RTT may interoperate with 911 services, as a
replacement for deaf TDD/TTY legal requirements.)


Re: [Standards] RTT: no negotiation of the feature

2011-06-24 Thread Kurt Zeilenga
I should note that we'll kill this one way or the other, even if there's no 
negotiation.  I'd just rather kill it by disrupting the negotiation.

I just think it's really bad form to have non-negotiated extensions.

-- Kurt

On Jun 24, 2011, at 7:48 AM, Kurt Zeilenga wrote:

> 
> On Jun 24, 2011, at 7:44 AM, Mark Rejhon wrote:
> 
>> On Fri, Jun 24, 2011 at 10:26 AM, Kurt Zeilenga  
>> wrote:
>> We need to keep experiments from harming our production networks.  If this 
>> extension gets used blindly, it might well trash some production network.
>> 
>> I don't care how this feature is negotiated between the two entities 
>> intended to experiment with it, I only care that I have some ability to 
>> disrupt that negotiation so I can prevent this extension's use and hence 
>> protect my network from the real harm that would come by its use on my 
>> network.
>> 
>> We should keep this in perspective: This is XMPP RTT, not in-band 
>> bytestreams (i.e. XEP-0096 file transfer). It is low bandwidth the vast 
>> majority of the time, and contributes no additional data when nobody is 
>> typing.
> 
> Certain operational networks we support cannot deal with the extra traffic.
> 
>> That said, I agree -- it is going to be added to the spec in due time, well 
>> before XMPP RTT clients become popular.
> 
> I see no reason why a basic yes/no negotiation of the extension cannot be added 
> now, whether by iq or by caps or some other appropriate mechanism.
> 
> If it needs to change later, fine.  But please add something now.
> 
> -- Kurt
> 



Re: [Standards] RTT: no negotiation of the feature

2011-06-24 Thread Kurt Zeilenga

On Jun 24, 2011, at 7:44 AM, Mark Rejhon wrote:

> On Fri, Jun 24, 2011 at 10:26 AM, Kurt Zeilenga  
> wrote:
> We need to keep experiments from harming our production networks.  If this 
> extension gets used blindly, it might well trash some production network.
> 
> I don't care how this feature is negotiated between the two entities intended 
> to experiment with it, I only care that I have some ability to disrupt that 
> negotiation so I can prevent this extension's use and hence protect my network 
> from the real harm that would come by its use on my network.
> 
> We should keep this in perspective: This is XMPP RTT, not in-band bytestreams 
> (i.e. XEP-0096 file transfer). It is low bandwidth the vast majority of the 
> time, and contributes no additional data when nobody is typing.

Certain operational networks we support cannot deal with the extra traffic.

> That said, I agree -- it is going to be added to the spec in due time, well 
> before XMPP RTT clients become popular.

I see no reason why a basic yes/no negotiation of the extension cannot be added 
now, whether by iq or by caps or some other appropriate mechanism.

If it needs to change later, fine.  But please add something now.

-- Kurt



Re: [Standards] RTT: no negotiation of the feature

2011-06-24 Thread Mark Rejhon
On Fri, Jun 24, 2011 at 10:26 AM, Kurt Zeilenga wrote:

> We need to keep experiments from harming our production networks.  If this
> extension gets used blindly, it might well trash some production network.
>
> I don't care how this feature is negotiated between the two entities
> intended to experiment with it, I only care that I have some ability to
> disrupt that negotiation so I can prevent this extension's use and hence
> protect my network from the real harm that would come by its use on my
> network.
>

We should keep this in perspective: This is XMPP RTT, not in-band
bytestreams (i.e. XEP-0096 file transfer). It is low bandwidth the vast
majority of the time, and contributes no additional data when nobody is
typing.

That said, I agree -- it is going to be added to the spec in due time, well
before XMPP RTT clients become popular.


Re: [Standards] RTT, take 2

2011-06-24 Thread Mark Rejhon
On Fri, Jun 24, 2011 at 4:08 AM, Dave Cridland  wrote:

> 1) Processing software may have decoded the UTF-8 into "something", making
> it awkward to manage.
>
> 2) Referring to UTF-8 octets means we have silly states where we could edit
> inside characters. It's even possible this may be used intentionally, in
> some languages.
>
> So I'd say that we should refer to characters in a string, and deal with
> Unicode code-points in the abstract. I'd expect that implementations would
> convert this internally into whatever made sense for them.
>

That's what I already did in v0.0.2 of the specification, but that makes it
necessary to explain which string format is meant, which in turn made it
necessary to say it is based on UTF16 strings. Unfortunately, the same
string reports different lengths depending on the programming language's
native Unicode storage format (not the wire transmission format before XML
processing):

UTF8: String.Length("Québec") == 7
UTF16: String.Length("Québec") == 6

Now, when we start using Chinese characters outside the BMP, UTF16 and UCS4
also diverge for exactly the same Chinese character:

UTF16: String.Length("#") == 2
UCS4: String.Length("#") == 1

This plays havoc with String.Insert and String.Delete operations, which
hurts interoperability.
We therefore have to go with one consistent method, and that is why we
went with counting "Unicode code points".
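
The length differences described above can be reproduced in a few lines. This is only an illustrative sketch: the helper names are made up, and U+20000 is used as a stand-in for a Chinese character outside the BMP, since the character in the original message did not survive.

```python
def utf8_octets(text: str) -> int:
    """Length as counted in UTF-8 octets."""
    return len(text.encode("utf-8"))


def utf16_code_units(text: str) -> int:
    """Length as counted in 16-bit UTF-16 code units."""
    return len(text.encode("utf-16-le")) // 2


def code_points(text: str) -> int:
    """Length as counted in Unicode code points (Python's native str length)."""
    return len(text)


quebec = "Qu\u00e9bec"       # "Québec" with a precomposed e-acute
outside_bmp = "\U00020000"   # stand-in ideograph outside the BMP (assumption)

quebec_lengths = (utf8_octets(quebec), utf16_code_units(quebec), code_points(quebec))
ideograph_lengths = (
    utf8_octets(outside_bmp),
    utf16_code_units(outside_bmp),
    code_points(outside_bmp),
)
```

Only the code point count is the same regardless of how a given platform stores its strings, which is the property needed for insert and delete positions to mean the same thing on both ends.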


Re: [Standards] RTT, take 2

2011-06-24 Thread Mark Rejhon
2011/6/24 Remko Tronçon 

> I'm wondering whether 'code points' are any better than UTF-8 based
> positioning. Isn't it possible that a codepoint position also points
> inside a character/glyph/...?


That is correct, and yes it is intentionally possible, for reasons explained
in my other reply to you today.


Re: [Standards] RTT: no negotiation of the feature

2011-06-24 Thread Kurt Zeilenga

On Jun 24, 2011, at 7:10 AM, Mark Rejhon wrote:

> Re: http://www.xmpp.org/extensions/inbox/realtimetext.html
> 
> On Fri, Jun 24, 2011 at 9:44 AM, Kurt Zeilenga  
> wrote:
> I am quite concerned that the current spec offers zero negotiation of the 
> extension before its use.
> I urge the authors to add some negotiation, preferably before it's published 
> as XEP.
> 
> I agree, we are going to be developing a session negotiation mechanism over 
> time:
> However, it is not necessary for interoperability right now:
> 
> There was a negotiation mechanism in the previous spec, but it was claimed to 
> be overly complicated. Due to section 4.3.1 (backwards compatible), it is not 
> necessary to even use 'start' or 'stop' since RTT clients can work without 
> 'start' and 'stop'. A sender can send RTT right away, and a recipient can 
> interpret RTT right away.  
> 
> Experimentation during the Experimental stage is needed to determine best 
> interoperability for the process of starting a real-time-text session

We need to keep experiments from harming our production networks.  If this 
extension gets used blindly, it might well trash some production network.

I don't care how this feature is negotiated between the two entities intended 
to experiment with it, I only care that I have some ability to disrupt that 
negotiation so I can prevent this extension's use and hence protect my network 
from the real harm that would come by its use on my network.

-- Kurt

> and signalling the remote end that a session has started (in the future, it 
> might be a process where one end starts a session, and the other end does an 
> Accept/Reject -- similar to AOL AIM Real Time IM.   Or it might be a 
> different preferred method of starting a RTT session). It is also a "out in 
> the field" user preference that might influence the preferred session 
> negotiation algorithm, and several companies (4) are already working on XMPP 
> RTT based on this standard. Due to section 4.3.1, failure of signalling is 
> not a catastrophe at this early experimental stage, RTT will simply be turned 
> off but the chat conversation will continue to function normally.
> 
> I covered some of this discussion in the "Supplemental" document at 
> www.realjabber.org as a potential candidate mechanism to mimic the AIM 
> Real-Time IM capability.
> 
> Mark Rejhon



Re: [Standards] RTT, take 2

2011-06-24 Thread Mark Rejhon
Regarding: http://xmpp.org/extensions/inbox/realtimetext.html
(Replies to Remko Tronçon, David Cridland)

On Fri, Jun 24, 2011 at 9:04 AM, Florian Zeitz  wrote:

> On 24.06.2011 11:24, Remko Tronçon wrote:
> > I'm wondering whether 'code points' are any better than UTF-8 based
> > positioning. Isn't it possible that a codepoint position also points
> > inside a character/glyph/...? Peter could probably shed some light on
> > this.
> >
> FWIW, I think using codepoints solves a somewhat different problem.
> If we count codepoints we can delete "half a character", e.g. remove the
> "combining cedilla" from ç, but if we count UTF-(8,16) based we can
> delete "half a codepoint" rendering the result undecodable, which is far
> worse.
>

Florian is correct -- this is one of the many reasons why we don't want to
use "UTF-8 counting methodology" for indexes and lengths for XMPP RTT
real-time editing (text inserts and deletes). Interoperability between
slightly buggy clients in UTF-8 can be much worse.


On Fri, Jun 24, 2011 at 5:38 AM, Dave Cridland  wrote:

> As in, adding a "C" character at the fifth code-point of "Tronçon" might
> give you "TroncÇon", or "TronçCon", depending on whether "ç" is a
> "c-with-cedilla" or a "c" followed by a "combining cedilla"?
>
> Yes, I'm quite sure that's possible.
>

Real-time editing worked fine in both cases, due to section 5.2.1
"Monitoring Message Edits". The pre-edit string is compared to the post-edit
string, in order to determine what code points changed. Although I did not
publish the algorithm, the algorithm to do so is actually simpler than most
think -- 50 lines of code (l.340-390 of RealTimeText.cs of the RealJabber
open source). By left/right scanning for unchanged characters (even if the
length has changed), you find the changed section in the middle of the
string and extract that out.  It works even with pastes, auto-spellcheckers,
auto-accenting, complex multi-keypress keyboard entry (multiple dead
characters) because we aren't worried about the input method, but only
worried about how the message changed. Which is why I added section 5.2.1 to
Implementor Notes. "Monitoring Message Edits"   which is recommended
instead of monitoring individual keypresses.
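
A minimal sketch of the left/right scan described above, counting in code points. This is not the RealJabber code; the function names are hypothetical and the result is simply a position, a number of deleted code points, and the inserted text.

```python
def find_edit(before: str, after: str):
    """Compare pre-edit and post-edit text and return the changed middle section.

    Returns (position, deleted_count, inserted_text), all in code points.
    """
    limit = min(len(before), len(after))

    # Scan from the left for unchanged code points.
    prefix = 0
    while prefix < limit and before[prefix] == after[prefix]:
        prefix += 1

    # Scan from the right, never overlapping the unchanged prefix.
    suffix = 0
    while (
        suffix < limit - prefix
        and before[len(before) - 1 - suffix] == after[len(after) - 1 - suffix]
    ):
        suffix += 1

    deleted_count = len(before) - prefix - suffix
    inserted_text = after[prefix:len(after) - suffix]
    return prefix, deleted_count, inserted_text


def apply_edit(before: str, position: int, deleted_count: int, inserted_text: str) -> str:
    """Replay an edit on the receiving side."""
    return before[:position] + inserted_text + before[position + deleted_count:]


paste = find_edit("Hello world", "Hello brave world")
autocorrect = find_edit("teh cat", "the cat")
unchanged = find_edit("abc", "abc")
```

Because only the before and after strings are compared, the same routine covers pastes, autocorrection and dead-key input without knowing which of them happened.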

In fact you can use any operating system's textbox and let the operating
system worry about presentation, which is why we aren't worried about
counting individual glyphs (besides, we have no control over counting glyphs
with most GUI frameworks)



> I don't have a solution, either, except to note that this applies to UTF-8
> octets etc as well, unless you normalize all strings first - but then it's
> really not clear to me how to translate editing actions in a GUI into that
> form.
>

The editing actions need to be executed before normalizing, because there is
not a consistent standard of normalization between different platforms. This
is an additional reason we don't count based on glyphs, too.  One platform
may display a glyph as 2 characters, and another platform as 1 character.
The method we chose solves that problem.
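
To illustrate why the normalization form matters for positions, here is a small sketch of Dave's "Tronçon" example. It assumes that "at the fifth code-point" means inserting after five code points (index 5); the helper `insert_at` is hypothetical.

```python
import unicodedata


def insert_at(text: str, index: int, new_text: str) -> str:
    """Insert new_text at a code point index."""
    return text[:index] + new_text + text[index:]


composed = unicodedata.normalize("NFC", "Tron\u00e7on")   # c-with-cedilla: 7 code points
decomposed = unicodedata.normalize("NFD", composed)       # c + combining cedilla: 8 code points

# The same index lands in different places depending on the form.
in_composed = insert_at(composed, 5, "C")       # after the whole c-with-cedilla
in_decomposed = insert_at(decomposed, 5, "C")   # between "c" and the combining cedilla

# In the decomposed case the cedilla now attaches to the inserted "C".
in_decomposed_nfc = unicodedata.normalize("NFC", in_decomposed)
```

Both ends therefore have to apply an edit to the same sequence of code points, before any local normalization, for the positions to agree.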

Mark Rejhon


Re: [Standards] RTT: no negotiation of the feature

2011-06-24 Thread Kevin Smith
On Fri, Jun 24, 2011 at 3:10 PM, Mark Rejhon  wrote:
> Re: http://www.xmpp.org/extensions/inbox/realtimetext.html
> On Fri, Jun 24, 2011 at 9:44 AM, Kurt Zeilenga 
> wrote:
>>
>> I am quite concerned that the current spec offers zero negotiation of the
>> extension before its use.
>> I urge the authors to add some negotiation, preferably before it's
>> published as XEP.
>
> I agree, we are going to be developing a session negotiation mechanism over
> time:

I think a sensible baseline, as I noted in the other thread (to which
I need to reply again) is using caps to signal support. This at least
does away with the most terrible "Doesn't support it, but gets spammed
anyway" case.

/K


Re: [Standards] RTT: no negotiation of the feature

2011-06-24 Thread Mark Rejhon
Re: http://www.xmpp.org/extensions/inbox/realtimetext.html

On Fri, Jun 24, 2011 at 9:44 AM, Kurt Zeilenga wrote:

> I am quite concerned that the current spec offers zero negotiation of the
> extension before its use.
> I urge the authors to add some negotiation, preferably before it's
> published as XEP.
>

I agree, we are going to be developing a session negotiation mechanism over
time:
However, it is not necessary for interoperability right now:

There was a negotiation mechanism in the previous spec, but it was claimed
to be overly complicated. Due to section 4.3.1 (backwards compatible), it is
not necessary to even use 'start' or 'stop' since RTT clients can work
without 'start' and 'stop'. A sender can send RTT right away, and a
recipient can interpret RTT right away.

Experimentation during the Experimental stage is needed to determine best
interoperability for the process of starting a real-time-text session and
signalling the remote end that a session has started (in the future, it
might be a process where one end starts a session, and the other end does an
Accept/Reject -- similar to AOL AIM Real Time IM.   Or it might be a
different preferred method of starting a RTT session). It is also a "out in
the field" user preference that might influence the preferred session
negotiation algorithm, and several companies (4) are already working on XMPP
RTT based on this standard. Due to section 4.3.1, failure of signalling is
not a catastrophe at this early experimental stage, RTT will simply be
turned off but the chat conversation will continue to function normally.

I covered some of this discussion in the "Supplemental" document at
www.realjabber.org as a potential candidate mechanism to mimic the AIM
Real-Time IM capability.

Mark Rejhon


[Standards] RTT: no negotiation of the feature

2011-06-24 Thread Kurt Zeilenga
I am quite concerned that the current spec offers zero negotiation of the 
extension before its use.

I urge the authors to add some negotiation, preferably before it's published as 
XEP.

-- Kurt

Re: [Standards] RTT, take 2

2011-06-24 Thread Kurt Zeilenga

On Jun 24, 2011, at 6:04 AM, Florian Zeitz wrote:

> On 24.06.2011 11:24, Remko Tronçon wrote:
>>> So I'd say that we should refer to characters in a string, and deal with
>>> Unicode code-points in the abstract.
>> 
>> I'm wondering whether 'code points' are any better than UTF-8 based
>> positioning. Isn't it possible that a codepoint position also points
>> inside a character/glyph/...? Peter could probably shed some light on
>> this.
>> 
> FWIW, I think using codepoints solves a somewhat different problem.
> 
> If we count codepoints we can delete "half a character", e.g. remove the
> "combining cedilla" from ç, but if we count UTF-(8,16) based we can
> delete "half a codepoint" rendering the result undecodable, which is far
> worse.

The protocol ought to be defined in wire terms… but state a few guidelines on 
handling of characters composed of multiple code points.

For instance, if a character is sent as <X><Y> (Y being a combining 
character), I have little problem with <Y> being edited away so long as <X> by 
itself is valid… or being replaced with <Z> (another combining character) 
without touching <X>.

It's my view that the client needs to be aware enough of what's happening 
in the GUI and on the wire to ensure both are sane.   If you try to design this 
such that clients don't have to be aware of what is really going on on the wire 
or in the GUI, it will be quite fragile and prone to interoperability problems.

-- Kurt

Re: [Standards] RTT, take 2

2011-06-24 Thread Florian Zeitz
On 24.06.2011 11:24, Remko Tronçon wrote:
>> So I'd say that we should refer to characters in a string, and deal with
>> Unicode code-points in the abstract.
> 
> I'm wondering whether 'code points' are any better than UTF-8 based
> positioning. Isn't it possible that a codepoint position also points
> inside a character/glyph/...? Peter could probably shed some light on
> this.
> 
FWIW, I think using codepoints solves a somewhat different problem.

If we count codepoints we can delete "half a character", e.g. remove the
"combining cedilla" from ç, but if we count UTF-(8,16) based we can
delete "half a codepoint" rendering the result undecodable, which is far
worse.
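
A short sketch of the two failure modes contrasted above, using a decomposed "ç" ("c" followed by COMBINING CEDILLA, U+0327). The variable names are only illustrative.

```python
decomposed = "c\u0327"   # one visible character, two code points

# Deleting "half a character": drop the last code point. The result is still valid text.
half_character = decomposed[:-1]

# Deleting "half a codepoint": drop the last UTF-8 octet. The result cannot be decoded.
octets = decomposed.encode("utf-8")   # b"c\xcc\xa7"
truncated = octets[:-1]
try:
    truncated.decode("utf-8")
    decodable = True
except UnicodeDecodeError:
    decodable = False
```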


Re: [Standards] RTT, take 2

2011-06-24 Thread Gunnar Hellström



>  I'm wondering whether 'code points' are any better than UTF-8 based
>  positioning. Isn't it possible that a codepoint position also points
>  inside a character/glyph/...?

A codepoint is the fundamental thing defined by Unicode, but there is a
related concept which could be called a character (or grapheme?), consisting
of one or more codepoints (a codepoint representing a non-combining character,
followed by zero or more codepoints representing combining characters).


Yes, this is why counting Unicode code points is the solution.
But it needs to be done at a sufficiently low level, close to the 
transmission of messages.
For e.g. erasure of one combined character consisting of two code 
points, the user interface action should at a low level result in 
erasure of two codepoints. That fact can be captured and sent in the RTT 
erasure element with an order to erase two code points.


The receiving client has its received RTT messages as reference, 
performs the action on the received string, and then passes the result on for 
presentation. The operation is then independent of any local Unicode 
habits in the receiving environment. Two code points are still two code 
points at that level, and the operation can be done without ambiguities.
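
A rough Python sketch of that receiver-side operation. The function name and
its `position`/`count` parameters are hypothetical and are not taken from the
specification; the sketch only assumes that the erasure is expressed as a
number of code points ending at a code point offset. Python 3 strings are
indexed by code point, so plain slicing is sufficient here.

```python
def erase_code_points(text: str, position: int, count: int) -> str:
    """Remove `count` code points ending at code point offset `position`.

    Hypothetical helper: the parameter names and the "erase backwards
    from position" convention are assumptions made for illustration.
    """
    start = max(0, position - count)
    return text[:start] + text[position:]


# 'c' followed by COMBINING CEDILLA: 8 code points in total.
received = "Tronc\u0327on"

# The sender erased the combined character: two code points ending at offset 6.
result = erase_code_points(received, position=6, count=2)
print(result)  # Tronon
```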


Gunnar


Re: [Standards] RTT, take 2 -network load

2011-06-24 Thread Gunnar Hellström

Dave Cridland wrote 2011-06-24 11:01:
I'd like to see, somewhere in this document, a discussion about 
network load, and a consideration that clients (and possibly servers) 
MAY, or possibly SHOULD, disable RTT if network conditions deteriorate.


There is a very brief discussion along that line in section 4.4.

The recommended transmission interval of one second, with smoothing out of 
characters in presentation during that second, is a good first safeguard 
against congestion. This is very different from the 
character-by-character transmission technologies that can cause very 
high network load with rapid typists.


So, the load is a maximum of one message per second from each active 
typist, about 5 kB/s.


If character-by-character transmission were used, it could end up at 20 
messages per second and about 100 kB/s.


So it is a huge difference.
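
The arithmetic behind those figures, as a small Python sketch. The 5 kB per
message is not stated directly; it is an assumption implied by "one message
per second, about 5 kB/s".

```python
# Assumption implied by "one message per second, about 5 kB/s".
BYTES_PER_MESSAGE = 5_000


def load_bytes_per_second(messages_per_second: float) -> float:
    """Network load for one active typist at the given message rate."""
    return messages_per_second * BYTES_PER_MESSAGE


buffered = load_bytes_per_second(1)    # one message per second
per_char = load_bytes_per_second(20)   # one message per keystroke, 20 per second

print(buffered / 1000, per_char / 1000, per_char / buffered)  # 5.0 100.0 20.0
```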

How much mean load does a message-wise IM participant cause? Is it one 
message every 10 seconds and 1 kB/s ?  Maybe a factor 5-10 less than the 
RTT user.


It is a good habit to include a "Congestion considerations" section in 
this kind of specification. So, let us aim to create one.

I see that the following parts could be created:
-Server congestion considerations
-Client load considerations
-Multiparty load considerations

What good bases for such discussions do you suggest to refer to?

/Gunnar




Re: [Standards] RTT, take 2

2011-06-24 Thread Simon McVittie
On Fri, 24 Jun 2011 at 11:24:50 +0200, Remko Tronçon wrote:
> > So I'd say that we should refer to characters in a string, and deal with
> > Unicode code-points in the abstract.
> 
> I'm wondering whether 'code points' are any better than UTF-8 based
> positioning. Isn't it possible that a codepoint position also points
> inside a character/glyph/...?

A codepoint is the fundamental thing defined by Unicode, but there is a
related concept which could be called a character (or grapheme?), consisting
of one or more codepoints (a codepoint representing a non-combining character,
followed by zero or more codepoints representing combining characters).

(A glyph is something different, and as far as I can tell is only interesting
if you make fonts or font-rendering algorithms.)

In UTF-8 a codepoint is one or more bytes, in UTF-16 a codepoint is either
one or two 16-bit words, and in UCS-4 a codepoint is one 32-bit word.
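
A short Python sketch of those sizes. UTF-32 is used as the equivalent of
UCS-4, and U+1F600 is added as an extra example (not from this mail) of a
codepoint outside the Basic Multilingual Plane.

```python
def encoded_sizes(ch: str) -> tuple[int, int, int]:
    """Return (UTF-8 bytes, UTF-16 16-bit words, UTF-32 32-bit words) for one codepoint."""
    return (
        len(ch.encode("utf-8")),
        len(ch.encode("utf-16-le")) // 2,
        len(ch.encode("utf-32-le")) // 4,
    )


for ch in ["A", "\u00c1", "\u0301", "\U0001F600"]:
    print(f"U+{ord(ch):04X}", encoded_sizes(ch))
# U+0041 (1, 1, 1)
# U+00C1 (2, 1, 1)
# U+0301 (2, 1, 1)
# U+1F600 (4, 2, 1)
```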

Here are some codepoints:

* U+0041 LATIN CAPITAL LETTER A
* U+00C1 LATIN CAPITAL LETTER A WITH ACUTE
* U+0301 COMBINING ACUTE ACCENT

The grapheme Á could either be written as U+0041 U+0301 (decomposed form),
or U+00C1 (composed form). Not all graphemes have a composed form.
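
The two forms can be compared with Python's standard `unicodedata` module; a
minimal sketch:

```python
import unicodedata

decomposed = "\u0041\u0301"   # A + COMBINING ACUTE ACCENT
composed = "\u00c1"           # LATIN CAPITAL LETTER A WITH ACUTE

# Same grapheme, different codepoint counts, and not equal as strings.
print(len(decomposed), len(composed))   # 2 1
print(decomposed == composed)           # False

# Normalization converts between the two forms.
print(unicodedata.normalize("NFC", decomposed) == composed)   # True
print(unicodedata.normalize("NFD", composed) == decomposed)   # True
```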

> For example, in Qt, this would most likely be
> implemented using a QTextCursor (
> http://doc.trolltech.com/4.7/qtextcursor.html ). However, the text
> talks about 'positioning at character X', and it doesn't seem to be
> defined what this means.

That might either be counting graphemes or codepoints, depending...

S


Re: [Standards] RTT, take 2

2011-06-24 Thread Dave Cridland

On Fri Jun 24 10:24:50 2011, Remko Tronçon wrote:
> > So I'd say that we should refer to characters in a string, and deal with
> > Unicode code-points in the abstract.
>
> I'm wondering whether 'code points' are any better than UTF-8 based
> positioning. Isn't it possible that a codepoint position also points
> inside a character/glyph/...? Peter could probably shed some light on
> this.


As in, adding a "C" character at the fifth code-point of "Tronçon"  
might give you "TroncÇon", or "TronçCon", depending on whether "ç" is  
a "c-with-cedilla" or a "c" followed by a "combining cedilla"?


Yes, I'm quite sure that's possible.
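
A small Python sketch of that case, assuming the insertion offset counts
code points from the start of the string. The helper name is hypothetical.

```python
import unicodedata


def insert_at_code_point(text: str, offset: int, new: str) -> str:
    """Insert `new` after the first `offset` code points of `text`."""
    return text[:offset] + new + text[offset:]


composed = unicodedata.normalize("NFC", "Tron\u00e7on")     # 7 code points
decomposed = unicodedata.normalize("NFD", "Tron\u00e7on")   # 8 code points

# Same offset, different results depending on how "ç" is represented.
in_composed = insert_at_code_point(composed, 5, "C")
in_decomposed = insert_at_code_point(decomposed, 5, "C")

print(in_composed)     # TronçCon
# Here the combining cedilla now attaches to the inserted "C".
print(unicodedata.normalize("NFC", in_decomposed))   # TroncÇon
```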

I don't have a solution, either, except to note that this applies to  
UTF-8 octets etc as well, unless you normalize all strings first -  
but then it's really not clear to me how to translate editing actions  
in a GUI into that form.


Dave.
--
Dave Cridland - mailto:d...@cridland.net - xmpp:d...@dave.cridland.net
 - acap://acap.dave.cridland.net/byowner/user/dwd/bookmarks/
 - http://dave.cridland.net/
Infotrope Polymer - ACAP, IMAP, ESMTP, and Lemonade


Re: [Standards] RTT, take 2

2011-06-24 Thread Remko Tronçon
> So I'd say that we should refer to characters in a string, and deal with
> Unicode code-points in the abstract.

I'm wondering whether 'code points' are any better than UTF-8 based
positioning. Isn't it possible that a codepoint position also points
inside a character/glyph/...? Peter could probably shed some light on
this.

The major problem is that you want something that you can tell your
GUI "remove N characters", but that such an operation is very
toolkit-specific and not well specified, and that you don't have any
control over this. For example, in Qt, this would most likely be
implemented using a QTextCursor (
http://doc.trolltech.com/4.7/qtextcursor.html ). However, the text
talks about 'positioning at character X', and it doesn't seem to be
defined what this means. I think that deleting one 'character' using
this API would potentially delete multiple unicode code points? (or
maybe i don't know enough about unicode).

But if my understanding is correct, then i'm not sure if such a
positioning-based API would ever work in practice (for multiple
implementations).

cheers,
Remko


Re: [Standards] RTT, take 2

2011-06-24 Thread Dave Cridland

On Wed Jun 22 16:52:53 2011, Kevin Smith wrote:
> I've performed a quick review of the new proposal. I have a handful of
> comments on the spec; I don't currently intend these to be blocking,
> for my part, when Council vote to Experimental. I consider this a vast
> improvement over the first proposed version of the document.


Just to add...

The nice trip down memory lane in Section 1 paints a rather rosy  
picture, I think.


Since I was actually about, and using the net, in those days, I feel  
a flashback coming on.


The biggest problem for a lot of these systems was the lag and  
network load they generated. This is evidenced in the way that  
Nagle's algorithm is the default in BSD derived socket stacks, for  
instance.


Most of the talkers switched to using line buffering; Internet BBS  
developed clients (and/or CLients, depending on whether you were DOC  
or YAWC) which provided local editing facilities. C{lL}ient  
connections often got in ahead of the queue (remember queueing? No,  
of course not...) because of the vastly lower network load they  
generated, and people used them because of the vastly improved user  
experience of local echo. Remote echo was not only more painful on  
its own but, in no small part due to the network load, latencies of  
30 seconds or more were quite common.


I'd like to see, somewhere in this document, a discussion about  
network load, and a consideration that clients (and possibly servers)  
MAY, or possibly SHOULD, disable RTT if network conditions  
deteriorate.


Dave.


Re: [Standards] RTT, take 2

2011-06-24 Thread Gunnar Hellström

Remko Tronçon wrote:

> [ I don't like writing me-too e-mails, but you beat me by a minute to
> sending the exact same mail, so I'm doing it anyway ;-) ]
>
> > So I'd say that we should refer to characters in a string, and deal with
> > Unicode code-points in the abstract. I'd expect that implementations would
> > convert this internally into whatever made sense for them.
>
> I think it would be the first protocol to depend on knowing how to
> count code points (I haven't needed it before), but I also think it's
> the only sensible thing to do, because you could end up with incorrect
> encodings using the protocol otherwise.
>
> Anyway, for applications that don't use Unicode libraries, rolling
> your own codepoint count isn't very hard, at least for utf-8.

We just need a concise way to express lengths and positions within the 
Unicode string. With Unicode, some characters can be composed of several 
characters. The word "characters" alone therefore risks being 
ambiguous and needs clarification.


RFC 5198 Network Unicode says:
"Unicode identifies each character by an integer, called its "code
   point", in the range 0-0x10ffff.  These integers can be encoded into
   byte sequences for transmission in at least three standard and
   generally-recognized encoding forms, all of which are completely
   defined in The Unicode Standard and the documents cited below:"

It is this "Unicode code point" that is meant in the length and position 
parameters in this specification, as any representation of the Unicode 
character number.


With RFC 5198 using both "character" and "code point", and 
"character" being slightly ambiguous, I suggest using the term "Unicode 
code point".


cheers,
Gunnar



Re: [Standards] RTT, take 2

2011-06-24 Thread Remko Tronçon
[ I don't like writing me-too e-mails, but you beat me by a minute to
sending the exact same mail, so I'm doing it anyway ;-) ]

> So I'd say that we should refer to characters in a string, and deal with
> Unicode code-points in the abstract. I'd expect that implementations would
> convert this internally into whatever made sense for them.

I think it would be the first protocol to depend on knowing how to
count code points (I haven't needed it before), but I also think it's
the only sensible thing to do, because you could end up with incorrect
encodings using the protocol otherwise.

Anyway, for applications that don't use Unicode libraries, rolling
your own codepoint count isn't very hard, at least for utf-8.
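
As a rough illustration of how little is needed, here is a Python sketch that
counts code points directly on UTF-8 octets by skipping continuation bytes
(those of the form 10xxxxxx). It assumes the input is already well-formed
UTF-8 and does no validation; the function name is illustrative.

```python
def count_code_points_utf8(data: bytes) -> int:
    """Count code points in well-formed UTF-8 by ignoring continuation bytes."""
    # Every code point starts with exactly one byte that is NOT 10xxxxxx.
    return sum(1 for byte in data if (byte & 0xC0) != 0x80)


sample = "Tron\u00e7on".encode("utf-8")
print(len(sample), count_code_points_utf8(sample))  # 8 7
```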

cheers,
Remko


Re: [Standards] RTT, take 2

2011-06-24 Thread Dave Cridland

On Fri Jun 24 02:54:12 2011, Peter Saint-Andre wrote:

> On 6/23/11 12:41 AM, Mark Rejhon wrote:
> > Opinion?
>
> On the wire there is no such thing as a code point, there are only code points
> that are encoded using an encoding form like UTF-8 or UTF-16. For
> details, see:
>
> http://tools.ietf.org/html/draft-ietf-appsawg-rfc3536bis-02
>
> Given that XMPP is pure UTF-8, I don't see a compelling reason to count
> UTF-16-encoded code points or UTF-32-encoded code points.


I think UTF-16 and UTF-32 encodings would both be a bad idea; XMPP is  
purely UTF-8 as you say.


However, I don't think that we should refer to UTF-8 octets either,  
here, for a number of reasons:


1) Processing software may have decoded the UTF-8 into "something",  
making it awkward to manage.


2) Referring to UTF-8 octets means we have silly states where we  
could edit inside characters. It's even possible this may be used  
intentionally, in some languages.


So I'd say that we should refer to characters in a string, and deal  
with Unicode code-points in the abstract. I'd expect that  
implementations would convert this internally into whatever made  
sense for them.
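
One way such an internal conversion might look, sketched in Python: an
abstract code point offset is translated into a UTF-16 code unit offset (as
used by toolkits that store text as UTF-16) or into a UTF-8 octet offset. The
function names are hypothetical.

```python
def code_point_offset_to_utf16(text: str, offset: int) -> int:
    """UTF-16 code units occupied by the first `offset` code points."""
    return len(text[:offset].encode("utf-16-le")) // 2


def code_point_offset_to_utf8(text: str, offset: int) -> int:
    """UTF-8 octets occupied by the first `offset` code points."""
    return len(text[:offset].encode("utf-8"))


# 'a', U+1F600 (outside the BMP), 'ç' (composed), 'b'
text = "a\U0001F600\u00e7b"

print(code_point_offset_to_utf16(text, 3))  # 4
print(code_point_offset_to_utf8(text, 3))   # 7
```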


Dave.