Hi Pavel! ;-)

Pavel Simerda wrote:
Hello,

I finally got to examine the XEP-0154 (User Profile).

Thanks for the review.

When reading
through I realized how big a job it was to synchronize the items and
names with various standards and other XEPs.

I think it was mostly bookkeeping, but thanks.

1) Look at the examples at http://www.xmpp.org/extensions/xep-0163.html
(Personal Eventing via Pubsub). Now look at the examples of XEP-0154.

The biggest difference is the added complexity and data because of the
data form syntax. These are actually no forms but data with a
particular structure.

I'm not a huge fan of x:data here either.

I read carefully the arguments for data forms but most of them also
apply to structured formats. Then there is database storage which
I consider a non-issue (I can give more details if needed) and
extensibility that should be IMO done differently (more details later).

Possible solution: Replace all of the data forms syntax with variable names
by custom elements of the same (or similar) names. At the same time
I propose to keep up with the examples in XEP-0154.

Example:

<iq type='set' id='pub1'>
  <pubsub xmlns='http://jabber.org/protocol/pubsub'>
    <publish node='urn:xmpp:tmp:profile'>
      <item>
        <profile xmlns='urn:xmpp:tmp:profile'>
          <x xmlns='jabber:x:data' type='result'>
            <field var='jid'>
              <value>[EMAIL PROTECTED]</value>
              <value>[EMAIL PROTECTED]</value>
            </field>
          </x>
        </profile>
      </item>
    </publish>
  </pubsub>
</iq>

would be changed (if we make no other changes) to:

<iq type='set' id='pub1'>
  <pubsub xmlns='http://jabber.org/protocol/pubsub'>
    <publish node='urn:xmpp:tmp:profile'>
      <item>
        <profile xmlns='urn:xmpp:tmp:profile'>
          <jid>[EMAIL PROTECTED]</jid>
          <jid>[EMAIL PROTECTED]</jid>
        </profile>
      </item>
    </publish>
  </pubsub>
</iq>
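
To make the mapping concrete, here is a rough sketch (not part of the XEP) of how a client or server might mechanically turn the x:data payload into the custom-element form. The function name and the example addresses are placeholders I made up for illustration; only the two namespaces come from the examples above.

```python
import xml.etree.ElementTree as ET

XDATA_NS = "jabber:x:data"
PROFILE_NS = "urn:xmpp:tmp:profile"  # temporary namespace used in the examples


def xdata_to_elements(profile_xml: str) -> ET.Element:
    """Rewrite <field var='n'><value>v</value></field> as <n>v</n>.

    Hypothetical helper: each x:data value becomes one custom element
    named after the field's 'var' attribute.
    """
    old = ET.fromstring(profile_xml.strip())
    new = ET.Element(f"{{{PROFILE_NS}}}profile")
    for field in old.iter(f"{{{XDATA_NS}}}field"):
        name = field.get("var")
        for value in field.findall(f"{{{XDATA_NS}}}value"):
            child = ET.SubElement(new, f"{{{PROFILE_NS}}}{name}")
            child.text = value.text
    return new


# Placeholder addresses; the real ones are redacted in the archive.
example = """
<profile xmlns='urn:xmpp:tmp:profile'>
  <x xmlns='jabber:x:data' type='result'>
    <field var='jid'>
      <value>first@example.org</value>
      <value>second@example.net</value>
    </field>
  </x>
</profile>
"""

converted = xdata_to_elements(example)
jids = [el.text for el in converted]
```

The conversion is lossless for plain text fields like these, which suggests the two syntaxes carry the same information here.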

Another possibility: a simple name-value format would work just as well as x:data, for example:

<iq type='set' id='pub1'>
  <pubsub xmlns='http://jabber.org/protocol/pubsub'>
    <publish node='urn:xmpp:tmp:profile'>
      <item>
        <profile xmlns='urn:xmpp:tmp:profile'>
          <item name='jid' value='[EMAIL PROTECTED]'/>
          <item name='jid' value='[EMAIL PROTECTED]'/>
        </profile>
      </item>
    </publish>
  </pubsub>
</iq>

(Btw, the <item/> element was missing in at least one of the examples,
if my understanding is correct.)

Probably. :)

2) I believe that the answer to the query for user information should
be reasonably short. This would have several consequences.

 * it would be possible to use it with a mobile client (I still have
problems with vcard images on my mobile)
 * it would not be then necessary to implement special techniques for
partial profile getting/setting

Those are good goals.

Possible solution: Don't include any data except short text. This
includes:
 * User pictures (aka avatars)

I'm not sure that user pictures are the same as avatars. But perhaps it's better to host such data via HTTP?

 * Long 'about' texts
 * Piles of unnecessary or uninteresting information

Unnecessary or uninteresting to whom?

These requirements are very easy to achieve. We only need to divide the
former monolithic profile/extended vcard. We would need more
namespaces.

Example separation:

 * urn:xmpp:tmp:profile:basic - Contact information, birthday, nameday,
etc...
 * urn:xmpp:tmp:profile:about - About text (as in vcard-temp)
 * urn:xmpp:tmp:profile:picture - User picture / photo

That seems reasonable.

Use cases:
 * I have a desktop client and good connection but lots of contacts.
   I want my client to first get the information about users and then
   (possibly) the pictures.
 * I have a mobile client and/or a slow internet connection. I just want
   to check my friend's phone number. Downloading other contact info
   is considered a reasonable overhead (I may need it if the phone
   number is not published anyway). Downloading the user's picture and a
   long about section is *not*.
 * User's picture changes, I don't need to download his about/contacts.
 * User's contacts change, I don't need to receive all the other
   (unchanged) stuff.

What is this "download" stuff? Isn't the data pushed to you via PEP?
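
As I understand XEP-0163, a client signals which nodes it wants pushed by advertising the node name followed by "+notify" among its service discovery features. A minimal sketch of how a client could pick its interests per section follows; the node names are the ones proposed above, and the helper function is hypothetical.

```python
# Node names from the proposed split above (temporary namespaces).
NODES = {
    "basic": "urn:xmpp:tmp:profile:basic",
    "about": "urn:xmpp:tmp:profile:about",
    "picture": "urn:xmpp:tmp:profile:picture",
}


def notify_features(wanted_sections):
    """Return the '+notify' feature strings for the sections a client wants."""
    return sorted(NODES[section] + "+notify" for section in wanted_sections)


# A mobile client might ask only for the short contact data,
# while a desktop client asks for everything.
mobile = notify_features(["basic"])
desktop = notify_features(["basic", "about", "picture"])
```

With the profile split into separate nodes, a constrained client simply leaves the heavy sections out of its feature list.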

3) home / work addresses

There are several solutions for designating home/work addresses. As far
as I understand the current XEP, we need to distinguish three types of
contact information pieces: generic, home and work.
Home and work addresses are usually presented in separate tabs. There
could also be use cases where home information is irrelevant or has a
different publishing policy than work information. What I'm not sure
about is the generic info, any suggestions?

Possible solution: further split the basic information into four
sections:

a) General information (and possibly generic contacts, which I don't see
much use for)
 * Name and related
 * Birthday and nameday (possibly moved to misc.)
 * Generic stuff

I agree that generic contact information may not be useful, at least for physical addresses.

b) Home information (self-describing)
c) Work information (except for several elements, it has the same
structure as home info)

Agreed.

d) Miscellaneous information
 * I've seen something about religion
 * languages spoken
 * more personal stuff
For the sake of completeness, I repeat the other sections:
e) About page
f) User picture
Some other permanent suggestions:
g) xhtml-im version of the about page

Of course this list needs an extensive review.

Yes, it does.

And let's not forget that we want to make it possible for people to add more fields as needed (for gaming or any other application).

Advantages of this approach:
 * We almost never need to read all this stuff at the same time.
   This suggestion actually copies the way the data are actually used
   and displayed (roughly one section per tab in a typical user info
   dialog).
 * With a desktop client we may want to cache the user data on
   the disk and only download if changed. We should keep the
   synchronization stuff together and there's no better place than
   PEP/PubSub, divided into categories that usually change as a
   whole.

Good point. For example if I change jobs all my work contact info will change at once.

 * In low-speed/expensive network environments we also benefit from the
   separation. The client program may download only the section the
   user actually wants to see.
 * In other environments the client may first download what the user
   wants and only then cache what the user may want next (e.g. first
   download general info and avatar for the typical current UI's first
   tab and then the other data).

Or perhaps even download only the basic info.

 * *Extensibility* is achieved by allowing more sections to be added
   later.

What about adding new fields to existing sections?

 * The natural way to add company-specific data (that shouldn't
   usually be necessary) is to add a special section for internal stuff.

Probably. I defer to people like Dave Cridland and Joe Hildebrand on that score since they deal with enterprise customers all the time. :)

4) The instant messaging contacts

The current spec proposes:

<field var='jid'>
  <value>[EMAIL PROTECTED]</value>
  <value>[EMAIL PROTECTED]</value>
</field>
<field var='msn_id'>
  <value>[EMAIL PROTECTED]</value>
</field>
<field var='yahoo_id'>
  <value>psaintandre</value>
</field>

In the words of my proposal, this is:

<jid>[EMAIL PROTECTED]</jid>
<jid>[EMAIL PROTECTED]</jid>
<msn_id>[EMAIL PROTECTED]</msn_id>
<yahoo_id>psaintandre</yahoo_id>

It's fairly inflexible and even with the current version,
we need to add another field. That means the clients won't
understand it at all. Adding new elements does no more harm
but is also no better!

What I suggest is to unify all contact addresses with the same type
(purpose). An example would be:

<im network="xmpp">[EMAIL PROTECTED]</im>
<im network="xmpp">[EMAIL PROTECTED]</im>
<im network="msn">[EMAIL PROTECTED]</im>
<im network="yahoo">psaintandre</im>

One might also consider other possible syntaxes.
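
A rough sketch of why the unified form is friendlier to clients: a generic reader can group addresses by the network attribute without knowing every network in advance. The wrapper element, the addresses, and the "newservice" network below are made up for illustration.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict


def im_addresses(section_xml: str) -> dict:
    """Group the text of every <im/> element by its 'network' attribute."""
    root = ET.fromstring(section_xml.strip())
    by_network = defaultdict(list)
    for el in root.iter("im"):
        by_network[el.get("network", "unknown")].append(el.text)
    return dict(by_network)


# <contacts/> is a hypothetical wrapper; the addresses are placeholders.
example = """
<contacts>
  <im network="xmpp">first@example.org</im>
  <im network="xmpp">second@example.net</im>
  <im network="yahoo">someuser</im>
  <im network="newservice">someone</im>
</contacts>
"""

grouped = im_addresses(example)
```

A network the client has never heard of still ends up in the result, so it can at least be displayed.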

Why not use URIs? That's what they're for, after all. :)
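
To illustrate, here is a small sketch that assumes each address is published as a URI; the scheme then plays the role of the network name and no extra attribute is needed. The addresses are placeholders, and the helper is hypothetical.

```python
from urllib.parse import urlsplit


def split_contact_uri(uri: str):
    """Return (scheme, address) for an opaque contact URI like xmpp:user@host."""
    parts = urlsplit(uri)
    return parts.scheme, parts.path


# Placeholder addresses for illustration only.
contacts = [
    "xmpp:user@example.org",
    "sip:user@example.com",
    "mailto:user@example.net",
    "tel:+1-555-0100",
]

by_scheme = {}
for uri in contacts:
    scheme, address = split_contact_uri(uri)
    by_scheme.setdefault(scheme, []).append(address)
```

A new service only needs a URI scheme; clients that don't recognize the scheme can still show the raw URI.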

This applies to:
 * 6.4. Telephony Address Data Aspects
 * 6.5. Electronic Address Data Aspects
 * 6.6. World Wide Web Resource Aspects

I certainly agree that we need to make extensibility easy, because we know that people will be launching new IM services, telephony services, social networking services, microblogging services, and everything else.

BTW, don't over-estimate the ability of people to know what technology powers their services. Does the average user know that Google Talk uses XMPP or that Gizmo5 uses SIP? I doubt it.

Alternatively, we might always call it 'type' to make mapping to
a relational database easier (but it doesn't help a generic pubsub XML
storage!).

5) Birthday / nameday

I really appreciate that you can set individual parts of the birthday.
The only thing I wonder... is whether we could make do with:

<birthday>1960-xx-xx</birthday> - only year specified
<birthday>xxxx-03-20</birthday> - month and day specified

Advantages: one birthday - one field
Disadvantages: one might consider the current way more XMLish

IMHO dates should be parseable into year, month, and day, so I think the current approach is OK.


Third alternative:
<birthday>
  <year>1960</year>
  <month>03</month>
  <!-- day is omitted -->
</birthday>

Or that, sure.
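
Whichever syntax wins, the single-field form is still parseable into year, month, and day. A minimal sketch, assuming the 'xx' placeholder convention from the examples above (the type and function names are mine):

```python
import re
from typing import NamedTuple, Optional


class PartialDate(NamedTuple):
    year: Optional[int]
    month: Optional[int]
    day: Optional[int]


# 'xxxx' / 'xx' mark an unspecified part, as in 1960-xx-xx or xxxx-03-20.
_PATTERN = re.compile(r"^(\d{4}|xxxx)-(\d{2}|xx)-(\d{2}|xx)$")


def parse_partial_date(text: str) -> PartialDate:
    """Split a partially specified date into its parts (None = not given)."""
    match = _PATTERN.match(text.strip())
    if match is None:
        raise ValueError(f"not a partial date: {text!r}")
    return PartialDate(
        *(None if part.startswith("x") else int(part) for part in match.groups())
    )


year_only = parse_partial_date("1960-xx-xx")
month_day = parse_partial_date("xxxx-03-20")
```

This only checks the shape of the string, not whether the month and day values are valid calendar dates.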

6) Non-profile

There are some data included that don't seem to fit in the "static"
profile like geolocation. But maybe I just misunderstood their purpose.

I think that is static geolocation (e.g., for work or home).

I believe these should be moved out of the scope of the user profile
specification to make a clear distinction between permanent profile
data and often-changing (more presence-like) data.

I agree in principle.

A grey area is the user picture. We should maintain a clear distinction
between dynamic avatars and profile images.

Yes, I think so -- my picture won't change (at least not frequently) whereas my avatar might.

7) Implementation issues

a) Servers need to include storage for node data.

b) Servers may (and usually will) implement the vcard-temp extension
mapped to the appropriate nodes (if available)

c) Client authors may not want to include UI for every single piece of
information. Proper division into nodes for various types of data
enables them to just choose the groups of fields their authors (and
users) want.

Server administrators may provide other means to fill-in data not
supported by some of the clients - e.g. in a web form confirmed by
"XEP-0070: Verifying HTTP Requests via XMPP".

It might be desirable to only mark some of these nodes obligatory
and maybe even move the others to separate XEPs?


These are a lot of issues to cope with; I expect a lot of comments and
discussion. I want to help as much as I can with this one :), tell me
what needs to be done.

Thank you!

/psa
