Re: [Standards] Comments on SIFT

2010-03-10 Thread Jack Moffitt
> With per-message filtering this changes to something like this:
>
>  3. loop over the resultset and send all _allowed_ messages to the
> newly available resource
>  4. for each sent message, DELETE FROM offline_messages WHERE JID ==
> ${account_jid} and MESSAGEID == $(unique_message_id)
>
> This could be optimized somewhat, but would still be relatively complex.

This is a non-issue. The problem here is how expensive it is to find
each message to delete, but you have a list of IDs. You can avoid all
the overhead of multiple queries by sending a single SQL statement:

DELETE FROM offline_messages WHERE jid = ${account_jid} AND message_id
IN (@unique_message_ids)

This should probably be more efficient than the non-allow case,
because there are fewer messages to delete than the full list.
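
A minimal sketch of that single-statement delete, assuming an SQLite
backend reached through Python's sqlite3 module and the table and column
names used above (offline_messages, jid, message_id). The function name
delete_delivered is hypothetical. The IN list is built from bound
placeholders, so the IDs themselves are never interpolated into the SQL:

```python
import sqlite3


def delete_delivered(conn, account_jid, message_ids):
    """Delete the given offline messages for one account in one statement."""
    if not message_ids:
        return 0  # "IN ()" is not valid SQL, so skip the query entirely
    # One "?" per ID; the values are bound by the driver, not pasted in.
    placeholders = ", ".join("?" for _ in message_ids)
    sql = (
        "DELETE FROM offline_messages "
        "WHERE jid = ? AND message_id IN (" + placeholders + ")"
    )
    cur = conn.execute(sql, [account_jid, *message_ids])
    conn.commit()
    return cur.rowcount  # number of rows actually deleted
```

Backends cap the number of bound parameters per statement, so a very
long ID list would likely need to be deleted in chunks.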

jack.


Re: [Standards] Comments on SIFT

2010-03-06 Thread Jason

Interesting. I've built a variation on this for offline messages,
but allowing quite complex "allow" criteria. I couldn't make XMPP do it
(I'm not saying XMPP couldn't, just that I couldn't figure out how), as
my case seemed to require altered routing rules and raised a few other
issues around my frequent but momentary presence requirement, so I ended
up just using XMPP as a transport.


Cheers.



Waqas Hussain wrote:

While implementing mod_sift for Prosody, I saw some possibilities for
improvement and had thoughts about issues. Some of these follow.


1. Remove disallowed child elements for filtered messages and presence.

Here's a typical identi.ca message:

  
  evan: RT @sil doom. the Shuttle computer I'm setting up for
dad can't read the hard drive. Won't boot from USB, has no CD drive, I
have no USB ... [23931040]
  http://jabber.org/protocol/xhtml-im";>
  http://www.w3.org/1999/xhtml";>
  : RT @doom. the Shuttle computer I'm setting up for dad can't read
the hard drive. Won't boot from USB, has no CD drive, I have no USB
...
  http://identi.ca/evan";>evan
  
  http://identi.ca/user/279";>
  sil
  
  
  http://identi.ca/conversation/24011046#notice-23931040";>[23931040]
  
  
  http://www.w3.org/2005/Atom";>
  
  evan - Identi.ca
  http://identi.ca/evan"; />
  http://identi.ca/evan"; />
  http://creativecommons.org/licenses/by/3.0/"; />
  http://avatar.identi.ca/1-96-20090819204503.jpeg
  
  RT @sil doom. the Shuttle computer I'm setting up for dad
can't read the hard drive. Won't boot from USB, has no CD drive, I
have no USB ...
  
  evan
  http://identi.ca/user/1
  
  http://activitystrea.ms/spec/1.0/";>
  http://activitystrea.ms/schema/1.0/person
  http://www.w3.org/2005/Atom";>http://identi.ca/user/1
  http://www.w3.org/2005/Atom";>Evan Prodromou
  http://identi.ca/evan";
xmlns="http://www.w3.org/2005/Atom"; />
  http://purl.org/syndication/atommedia"; ns1:height="353"
xmlns:ns2="http://purl.org/syndication/atommedia"; ns2:width="353"
href="http://avatar.identi.ca/1-353-20090819204502.jpeg";
xmlns="http://www.w3.org/2005/Atom"; />
  http://purl.org/syndication/atommedia"; ns1:height="96"
xmlns:ns2="http://purl.org/syndication/atommedia"; ns2:width="96"
href="http://avatar.identi.ca/1-96-20090819204503.jpeg";
xmlns="http://www.w3.org/2005/Atom"; />
  http://purl.org/syndication/atommedia"; ns1:height="48"
xmlns:ns2="http://purl.org/syndication/atommedia"; ns2:width="48"
href="http://avatar.identi.ca/1-48-20090819204503.jpeg";
xmlns="http://www.w3.org/2005/Atom"; />
  http://purl.org/syndication/atommedia"; ns1:height="24"
xmlns:ns2="http://purl.org/syndication/atommedia"; ns2:width="24"
href="http://avatar.identi.ca/1-24-20090819204503.jpeg";
xmlns="http://www.w3.org/2005/Atom"; />
  http://www.georss.org/georss";>45.5088375 -73.587809
  http://portablecontacts.net/spec/1.0";>evan
  http://portablecontacts.net/spec/1.0";>Evan
Prodromou
  http://portablecontacts.net/spec/1.0";>Montreal hacker
and entrepreneur. Founder of identi.ca, lead developer of StatusNet,
CEO of StatusNet Inc.
  http://portablecontacts.net/spec/1.0";>
  Montreal, Quebec, Canada
  
  http://portablecontacts.net/spec/1.0";>
  homepage
  http://evan.prodromou.name/
  true
  
  
  http://identi.ca/notice/23931040"; />
  http://identi.ca/notice/23931040
  2010-03-06T20:01:22+00:00
  2010-03-06T20:01:22+00:00
  http://identi.ca/conversation/24011046"; />
  http://identi.ca/notice/23928915";
href="http://identi.ca/notice/23928915";
xmlns="http://ostatus.org/schema/1.0"; />
  RT @http://identi.ca/user/279"; class="url" title="Stuart
Langridge">sil doom. the
Shuttle computer I'm setting up for dad can't read the hard drive.
Won't boot from USB, has no CD drive, I have no USB ...
  
  

Look at the size of that. Should I laugh or cry?  This should be reduced to:

  
  evan: RT @sil doom. the Shuttle computer I'm setting up for
dad can't read the hard drive. Won't boot from USB, has no CD drive, I
have no USB ... [23931040]
  

for mobile clients. That's roughly 6% of the original (~4,257 bytes
reduced to ~262 bytes). I think without this behavior, message
filtering is pretty useless.
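
A rough sketch of the stripping step, using Python's
xml.etree.ElementTree. The allow-list (only the jabber:client body) and
the function name strip_children are assumptions for illustration; which
child elements a real filter lets through is up to the filter, and this
is not how mod_sift is implemented:

```python
import xml.etree.ElementTree as ET

# Assumption: only these direct children survive for a constrained client.
ALLOWED_CHILDREN = {"{jabber:client}body"}


def strip_children(stanza_xml, allowed=ALLOWED_CHILDREN):
    """Parse a stanza and remove every direct child whose tag is not allowed."""
    stanza = ET.fromstring(stanza_xml)
    for child in list(stanza):  # iterate over a copy: we mutate the parent
        if child.tag not in allowed:
            stanza.remove(child)
    return stanza  # attributes on the stanza itself are left untouched
```

Only direct children are examined, so an XHTML-IM body nested inside a
removed html element goes away with its parent.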

Useless fact: Watching offline messages from identi.ca using up
bandwidth in slow motion (slow, expensive GPRS with payment based on
bandwidth usage) is what got mod_sift for Prosody started.


2. Offline messages.

A SIFT message filter which has some <allow/> elements doesn't scale
well for large numbers of offline messages. Currently a server with an
SQL backend may do something like this:

  1. resource becomes available
  2. SELECT * FROM offline_messages WHERE JID == ${account_jid}
  3. loop over the resultset and send all messages to the newly
available resource
  4. DELETE FROM offline_messages WHERE JID == ${account_jid}
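
For reference, steps 2 to 4 of this unfiltered flow could look roughly
like the sketch below. It assumes an SQLite backend via Python's sqlite3
module; the stanza column, the send callback and the function name
deliver_offline are hypothetical:

```python
import sqlite3


def deliver_offline(conn, account_jid, send):
    """Fetch, send, then clear all offline messages stored for one account."""
    # Step 2: load everything stored for the account.
    rows = conn.execute(
        "SELECT message_id, stanza FROM offline_messages WHERE jid = ?",
        (account_jid,),
    ).fetchall()
    # Step 3: hand each stored stanza to the newly available resource.
    for _message_id, stanza in rows:
        send(stanza)
    # Step 4: one blanket DELETE, possible only because everything was sent.
    conn.execute("DELETE FROM offline_messages WHERE jid = ?", (account_jid,))
    conn.commit()
    return len(rows)
```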

With per-message filtering this changes to something like this:

  3. loop over the resultset and send all _allowed_ messages to the
newly available resource
  4. for each sent message, DELETE FROM offline_messages WHERE JID ==
${