Hi everyone,

I just read this very thought-provoking thread and just subscribed (sorry for breaking/reviving the thread).

For a quick background, I'm the author of the atompub-pubsub module for ejabberd, and I wrote an Atom validator for the ejabberd XML format and a node_atom for ejabberd pubsub. Among my interests: leveraging the respective strengths of HTTP and XMPP, and interconnecting the two.

My 2 centimes :

The main difference between pull and push is that some state has to be kept on the push server.
That means we need a datastore that scales as traffic and volume grow.

Scaling datastores is by now well documented, if not solved:
- With memcached/rdbms combinations
- Or those new DBs like CouchDB and my current playground, AWS SimpleDB.

As a side note, I plan to release a few modules for ejabberd (authentication/roster/mod_last) using SimpleDB, and will probably port pubsub/PEP to it (with the payload on Amazon S3). The objective is to have an EC2 AMI running ejabberd without any data (except the transient state) stored on it.

Another problem that should be taken care of is traffic to/from the servers. There should be a way of multicasting a single notification/presence to all users of another domain. XEP-0033 (Extended Stanza Addressing) may or may not help; I haven't really looked into it.
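
To make the multicast idea concrete, here is a rough Python sketch (not ejabberd code) of the first step: grouping recipients by domain so that each remote server could receive one stanza instead of one per user. The function name and the sample JIDs are made up for illustration, and how the single stanza would actually carry the recipient list (XEP-0033 or otherwise) is left open.

```python
from collections import defaultdict

def group_by_domain(jids):
    """Group bare JIDs by domain (hypothetical helper, for illustration only)."""
    groups = defaultdict(list)
    for jid in jids:
        # Drop the resource (everything after the first '/'), if any.
        bare = jid.split("/", 1)[0]
        # The domain is whatever follows the '@' of the bare JID.
        domain = bare.rsplit("@", 1)[-1]
        groups[domain].append(bare)
    return dict(groups)

# Made-up subscriber list: three recipients spread over two domains.
subscribers = ["alice@example.org/home", "bob@example.org", "carol@example.net"]
batches = group_by_domain(subscribers)
```

With this grouping, the number of server-to-server stanzas depends on the number of remote domains rather than the number of remote users.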

Also, taking a page from the HTTP book, caching should be implemented.
We have everything specified here http://xmpp.org/extensions/xep-0131.html and of course in the HTTP RFC. Implementing HTTP-style caching (ETag/If-Modified-Since etc.) for roster, disco and pubsub get-items will help with traffic (there's a bit of "polling" in XMPP, though it's usually user-initiated ;).
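
A minimal Python sketch of the all-or-nothing validation I have in mind, assuming the server derives an ETag-like value by hashing the current roster and the client sends back the last value it saw. The function names and return tuples are hypothetical, and the way the tag would travel inside a stanza (for example as a XEP-0131 header) is not modeled here.

```python
import hashlib

def etag_for(items):
    """Derive an opaque validator from the roster content (assumed scheme)."""
    canonical = "\n".join(sorted(items)).encode("utf-8")
    return hashlib.sha1(canonical).hexdigest()

def get_roster(items, if_none_match=None):
    """Return the full roster, or a 'not-modified' marker if the tag matches."""
    tag = etag_for(items)
    if if_none_match == tag:
        # Nothing changed since the client's copy: skip the payload entirely.
        return ("not-modified", tag, None)
    return ("ok", tag, list(items))

roster = ["alice@example.org", "bob@example.org"]
status, tag, body = get_roster(roster)                      # first fetch: full payload
status2, _, body2 = get_roster(roster, if_none_match=tag)   # revalidation: no payload
```

The server keeps no per-client state here: the validator is recomputed from the data on each request, which is what makes this simpler than a versioned diff.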

There's also XEP-0230 (Service Discovery Notifications) and XEP-0237 (Roster Versioning) that could help. But from a server-side perspective they are much harder to implement than an all-or-nothing approach like HTTP's.
And they tend to add more state ...

There's also the problem of long-lasting connections and their management.
HTTP experience can't help much in this regard ;)
But XEP-0198 (Stream Management) helps with fast reconnects.
What is still lacking, IMO, is a way to rebalance connections.
If I want to dynamically add or remove servers (EC2-style) I'd like a way to inform clients that they should reconnect.
The load balancer would then direct the reconnection to another server.
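
Just to illustrate the arithmetic, here is a small Python sketch of how many clients each existing server would ask to reconnect when capacity is added, assuming we simply aim for an even spread. The function, the server names and the counts are invented; how the server would actually tell a client to reconnect is exactly the missing piece.

```python
def connections_to_shed(counts, new_servers=1):
    """Per-server number of clients to ask to reconnect (hypothetical helper).

    counts: mapping of server name to current connection count.
    new_servers: number of empty servers being added to the pool.
    """
    total = sum(counts.values())
    # Even share per server once the new servers have joined.
    target = total // (len(counts) + new_servers)
    # Only servers above the target shed connections.
    return {name: max(0, n - target) for name, n in counts.items()}

# Made-up pool: two servers, one new server being added.
shed = connections_to_shed({"xmpp1": 900, "xmpp2": 600})
```

In this example the target is 500 connections per server, so xmpp1 would shed 400 and xmpp2 would shed 100.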

Cheers,
c*

http://www.cestari.info/
http://twitter.com/cstar
JID : cstar-at-ohmforce.com
