[google-appengine] Re: A question for Jaiku's developers, if they're watching..

2009-03-11 Thread Andrew Badera

you'd have to wonder if there's a push out to Gnip somewhere ... or if
protocol buffers are involved ..

Thanks-
- Andy Badera
- and...@badera.us
- (518) 641-1280
- Tech Valley Code Camp 2009.1: http://www.techvalleycodecamp.com/
- Google me: http://www.google.com/search?q=andrew+badera




On Wed, Mar 11, 2009 at 9:21 AM, peterk  wrote:
>
> I just read on your blog (from January) the intention to release the
> appengine port of Jaiku as open source when the port is finished..but
> I was wondering if I could be so cheeky as to jump ahead with a couple
> of questions about it. The requirements of Jaiku seem to line up
> roughly similarly with issues I'm having in a slightly different
> context, that I'm finding pretty challenging to implement efficiently
> on app engine.
>
> With your service, you seem to track updates for friends and other
> people I follow..so I  might have a long list of people I'm following,
> and you feed me their updates.
>
> How do you implement this on GAE?
>
> I've been toying with a very similar problem for some time now. It
> seems to me you cannot chain together queries such as
> me.friends.updates.order(..) to get your friends' latest updates, for
> example. You can't make n writes to n update queues for n people
> following you, since writes are so costly. If I store my friends in a
> list of keys, this limits the number of friends I can query at a given
> time to 30. e.g. updates.all().filter('user IN', me.friends) is
> limited to 30 subqueries. I may have many more friends, so this
> approach doesn't seem to be sufficient.
>
>
> I've been scratching my head over a similar problem for some time now,
> coming up with various hairbrained schemes that have been overly-
> complex, none of which deliver scalability to the nth degree. So I'd
> really, really, really appreciate any insight you could provide in
> implementing this kind of data model on GAE!! Many thanks!
> >
>

--~--~-~--~~~---~--~~
You received this message because you are subscribed to the Google Groups 
"Google App Engine" group.
To post to this group, send email to google-appengine@googlegroups.com
To unsubscribe from this group, send email to 
google-appengine+unsubscr...@googlegroups.com
For more options, visit this group at 
http://groups.google.com/group/google-appengine?hl=en
-~--~~~~--~~--~--~---



[google-appengine] Re: A question for Jaiku's developers, if they're watching..

2009-03-11 Thread peterk

Hmm. Good thinking, Andrew. I've done a bit more digging along those
lines. I don't think Jaiku has publicly said whether or which messaging
protocols they use, but there seems to be some speculation that it uses
XMPP.

That might also neatly align with their comment that they are in some
cases "extending" GAE in their port, and with the recent news that XMPP
is scheduled to come to GAE at some point in the next six months.
Perhaps it's the work on the Jaiku port that's bringing that
functionality along...

I didn't know anything about messaging protocols or XMPP until I dug
around after your post, so it's all new to me. I found a project
purporting to provide XMPP functionality in advance of Google's
official support, at http://imified-demo.appspot.com/, but looking at
the code for it, I cannot for the life of me see how it could be used
to address the problems discussed in my original email. The project
seems only to take instant messages via a POST to a URL and save them
to the datastore; there's none of the 'instant distribution' of updates
to people subscribed to your presence that other XMPP software boasts
about, and nothing about pushing updates out to subscribed users.

I dug around the app gallery for other open-source apps along similar
lines. There are some would-be Twitter/Jaiku clones; all I've seen use
non-messaging approaches, but they all make one compromise or another
similar to the solutions I'd come up with myself (e.g. taking the last
five updates from each friend and sorting them in memory by date, which
has its own issues if one user has made more than five recent updates,
and which inevitably hits a wall on all those reads beyond a certain
number of friends; there are only so many friends you can do this for
before it gets too slow).

Any thoughts/ideas...?

On Mar 11, 1:54 pm, Andrew Badera  wrote:
> you'd have to wonder if there's a push out to Gnip somewhere ... or if
> protocol buffers are involved ..
>
> Thanks-
> - Andy Badera
> - and...@badera.us
> - (518) 641-1280
> - Tech Valley Code Camp 2009.1:http://www.techvalleycodecamp.com/
> - Google me:http://www.google.com/search?q=andrew+badera
>
> On Wed, Mar 11, 2009 at 9:21 AM, peterk  wrote:
>
> > I just read on your blog (from January) the intention to release the
> > appengine port of Jaiku as open source when the port is finished..but
> > I was wondering if I could be so cheeky as to jump ahead with a couple
> > of questions about it. The requirements of Jaiku seem to line up
> > roughly similarly with issues I'm having in a slightly different
> > context, that I'm finding pretty challenging to implement efficiently
> > on app engine.
>
> > With your service, you seem to track updates for friends and other
> > people I follow..so I  might have a long list of people I'm following,
> > and you feed me their updates.
>
> > How do you implement this on GAE?
>
> > I've been toying with a very similar problem for some time now. It
> > seems to me you cannot chain together queries such as
> > me.friends.updates.order(..) to get your friends' latest updates, for
> > example. You can't make n writes to n update queues for n people
> > following you, since writes are so costly. If I store my friends in a
> > list of keys, this limits the number of friends I can query at a given
> > time to 30. e.g. updates.all().filter('user IN', me.friends) is
> > limited to 30 subqueries. I may have many more friends, so this
> > approach doesn't seem to be sufficient.
>
> > I've been scratching my head over a similar problem for some time now,
> > coming up with various hairbrained schemes that have been overly-
> > complex, none of which deliver scalability to the nth degree. So I'd
> > really, really, really appreciate any insight you could provide in
> > implementing this kind of data model on GAE!! Many thanks!



[google-appengine] Re: A question for Jaiku's developers, if they're watching..

2009-03-11 Thread peterk

I've done a good bit of googling about what exactly it is I'm trying
to achieve and the correct terminology for it, and it turns out I'm
really looking for an efficient way to do pub/sub (publish/subscribe)
on App Engine.

Which led me to this eyebrow-raising little app from Brett Slatkin and
Brad Fitzpatrick:

http://code.google.com/p/pubsubhubbub/

Jaiku folk seem to be involved too.

Basically they have an experimental pub/sub system running on GAE,
using HTTP instead of XMPP. The goals are somewhat broader than my
requirements: you post data to an Atom feed at one URL, the feed
notifies a hub, and the hub then updates your subscribers at their own
URLs, so it's all a lot more distributed than I need.

I imagine one could do something similar rolled into one application
that handles publishing, the hub, and subscription (e.g. like Jaiku or
Twitter). But the big minus point that sticks out for me in
pubsubhubbub is that it sidesteps the question of the total write cost
per update. It can afford to do that because subscribers are
distributed across different domains and servers, so the write load
gets spread across all those different URLs/machines; it's not the
concern of the hub running on App Engine. But if publishers,
subscribers, and hub are all rolled into one application on App Engine,
then your application soaks up the total write cost. The hub handles
updating each subscriber in the background, I think, so it does solve
the problem of writing to n subscribers within a single request: the
publisher issues its update to the hub, returns immediately, and the
hub then partitions out the subscriber updates and works through them
over time. But it doesn't solve the monetary cost of doing n writes for
n subscribers... if one publisher had a lot of subscribers (e.g.
hundreds of thousands or millions), each of their updates could cost
$$$!
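
Just to make the cost concern concrete, the naive fan-out I keep
bumping into would be something like the sketch below; the model and
property names are purely illustrative, not taken from pubsubhubbub or
Jaiku.

from google.appengine.ext import db

class InboxEntry(db.Model):
    # One entity per (subscriber, update) pair: the expensive approach.
    subscriber_id = db.StringProperty(required=True)
    author_id = db.StringProperty(required=True)
    body = db.TextProperty()
    created = db.DateTimeProperty(auto_now_add=True)

def naive_fanout(author_id, body, subscriber_ids):
    # n subscribers means n entity writes (plus their index writes) per update.
    entries = [InboxEntry(subscriber_id=sid, author_id=author_id, body=body)
               for sid in subscriber_ids]
    db.put(entries)  # batched RPC, but still one entity written per subscriber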

But I guess one might be safe in assuming the average number of
subscribers per publisher will be much lower than that, which would
make the idea financially feasible.

Anyway, I thought this project looked very interesting...it's giving
me a lot to chew on now..

On Mar 11, 3:50 pm, peterk  wrote:
> Hmm. Good thinking Andrew. I've done a bit more digging along those
> lines..I don't think Jaiku has publically said if and what messaging
> protocols they might use, but there seems to be some speculation that
> it uses XMPP.
>
> That might also neatly align with their comment that they are in some
> cases "extending" GAE in their port..and the recent news that XMPP is
> scheduled to come to GAE at some point in the next 6 months. Perhaps
> it's the work on the Jaiku port that's bringing that functionality...
>
> I didn't know anything about messaging protocols or XMPP until I dug
> around after your post, so it's all new to me. I found a project
> purporting to provide XMPP functionality in advance of Google's
> official support, athttp://imified-demo.appspot.com/, but looking at
> the code for it, I cannot for the life of me see how this could be
> used to address the problems discussed in my original email. That
> project seems only to take instant messages via a POST to a URL and
> save them to the datastore..but there doesn't seem to be any of the
> 'instant distribution' of updates to people subscribed to your
> presence etc. that other XMPP software boast about. Using that xmpp
> project linked, there doesn't seem to be anything there about pushing
> updates to subscribed users etc.
>
> I dug around the app gallery for other open source apps along similar
> lines..there are some would-be twitter/jaiku clones..all I've seen use
> non-messaging approaches, but they all make one compromise or another
> similar to solutions I'd come up with myself (e.g. taking the last 5
> updates from friends and sorting them yourself in memory by
> date..which has its own issues, for example, if one user had made >5
> updates recently..+  you'd inevitably hit a roadblock on all these
> reads beyond a certain number of friends..there's only so many friends
> you could do this for before it gets too slow).
>
> Any thoughts/ideas...?
>
> On Mar 11, 1:54 pm, Andrew Badera  wrote:
>
> > you'd have to wonder if there's a push out to Gnip somewhere ... or if
> > protocol buffers are involved ..
>
> > Thanks-
> > - Andy Badera
> > - and...@badera.us
> > - (518) 641-1280
> > - Tech Valley Code Camp 2009.1:http://www.techvalleycodecamp.com/
> > - Google me:http://www.google.com/search?q=andrew+badera
>
> > On Wed, Mar 11, 2009 at 9:21 AM, peterk  wrote:
>
> > > I just read on your blog (from January) the intention to release the
> > > appengine port of Jaiku as open source when the port is finished..but
> > > I was wondering if I could be so cheeky as to jump ahead with a couple
> > > of questions about it. The requirements of Jaiku seem to line up
> > > roughly similarly with issues I'm having in a slightly different
> > > context, that I'm fin

[google-appengine] Re: A question for Jaiku's developers, if they're watching..

2009-03-11 Thread bFlood

great find peterk! can't help but notice the very interesting
async_apiproxy.py code in that project either... async db and url calls
would be awesome

cheers
brian

On Mar 11, 3:22 pm, peterk  wrote:
> I've done a good bit of googling about what exactly it is I'm trying
> to achieve, the correct terminology etc. and turns out I'm really
> looking for an efficient way to do pub/sub, publication/subscription,
> on appengine.
>
> Which led me to this eyebrow-raising little app from Brett Slakin and
> Brad Fitzpatrick:
>
> http://code.google.com/p/pubsubhubbub/
>
> Jaiku folk seem to be involved too.
>
> Basically they have an experimental pub/sub system running on GAE,
> using http instead of xmpp. The goals are somewhat broader than my
> requirements - basically you have an atom feed at one url you post
> data to, the atom feed updates a hub, and the hub then updates your
> subscribers at different urls, so it's all a lot more distributed than
> I need.
>
> I imagine one could do something similar rolled into one application
> that handles publishing, the hub, and subscription (e.g. like jaiku or
> twitter). But the big minus point that sticks out here for me in
> pubsubhubub is that it completely avoids the question of the total
> write-cost per update. It can afford to do that because subscribers
> here are distributed all over the place on different domains and
> servers etc. so the write load gets spread across all those different
> urls/machines..it's not the concern of the hub running on appengine.
> But if pubs/subs/hub are all rolled into one application on appengine,
> then your application will be soaking up the total write cost. The hub
> handles updating each subscriber in the background, I think, so it
> does solve the issue of writing to n subscribers in the space of one
> request..the publisher issues its update to the hub, returns
> immediately, and then the hub partitions out the subscriber updates
> and works on updating everyone over time. But it doesn't solve the
> issue of monetary cost of doing n writes for n subscribers...if one
> publisher had lots of subscribers (e.g. hundreds of thousands or
> millions), each of their updates could cost $$$!
>
> But I guess one might be safe in assuming the average number of subs
> per publisher will be much lower than that..and make the idea
> financially feasible..
>
> Anyway, I thought this project looked very interesting...it's giving
> me a lot to chew on now..
>
> On Mar 11, 3:50 pm, peterk  wrote:
>
> > Hmm. Good thinking Andrew. I've done a bit more digging along those
> > lines..I don't think Jaiku has publically said if and what messaging
> > protocols they might use, but there seems to be some speculation that
> > it uses XMPP.
>
> > That might also neatly align with their comment that they are in some
> > cases "extending" GAE in their port..and the recent news that XMPP is
> > scheduled to come to GAE at some point in the next 6 months. Perhaps
> > it's the work on the Jaiku port that's bringing that functionality...
>
> > I didn't know anything about messaging protocols or XMPP until I dug
> > around after your post, so it's all new to me. I found a project
> > purporting to provide XMPP functionality in advance of Google's
> > official support, athttp://imified-demo.appspot.com/, but looking at
> > the code for it, I cannot for the life of me see how this could be
> > used to address the problems discussed in my original email. That
> > project seems only to take instant messages via a POST to a URL and
> > save them to the datastore..but there doesn't seem to be any of the
> > 'instant distribution' of updates to people subscribed to your
> > presence etc. that other XMPP software boast about. Using that xmpp
> > project linked, there doesn't seem to be anything there about pushing
> > updates to subscribed users etc.
>
> > I dug around the app gallery for other open source apps along similar
> > lines..there are some would-be twitter/jaiku clones..all I've seen use
> > non-messaging approaches, but they all make one compromise or another
> > similar to solutions I'd come up with myself (e.g. taking the last 5
> > updates from friends and sorting them yourself in memory by
> > date..which has its own issues, for example, if one user had made >5
> > updates recently..+  you'd inevitably hit a roadblock on all these
> > reads beyond a certain number of friends..there's only so many friends
> > you could do this for before it gets too slow).
>
> > Any thoughts/ideas...?
>
> > On Mar 11, 1:54 pm, Andrew Badera  wrote:
>
> > > you'd have to wonder if there's a push out to Gnip somewhere ... or if
> > > protocol buffers are involved ..
>
> > > Thanks-
> > > - Andy Badera
> > > - and...@badera.us
> > > - (518) 641-1280
> > > - Tech Valley Code Camp 2009.1:http://www.techvalleycodecamp.com/
> > > - Google me:http://www.google.com/search?q=andrew+badera
>
> > > On Wed, Mar 11, 2009 at 9:21 AM, peterk  wrote:
>
> > > > I just read on your blog

[google-appengine] Re: A question for Jaiku's developers, if they're watching..

2009-03-11 Thread peterk

The app is actually live here:

http://pubsubhubbub.appspot.com/
http://pubsubhubbub-subscriber.appspot.com/

(pubsubhubbub-publisher isn't there, but it's trivial to upload your
own.)

This suggests it's working on App Engine as it stands now. I've been
looking through the source, and I'm not entirely clear on how the
'background workers' actually work. There are two: one for pulling feed
updates from publishers, and one for propagating updates to subscribers
in batches.

But like I say, I can't see how they're actually started or kept
running constantly. There is a video of a live demonstration here:

http://www.veodia.com/player.php?vid=fCNU1qQ1oSs

The background workers seem to behave as desired there, but I'm not
sure whether some URLs were just being polled constantly to keep the
workers alive for the purposes of the demo, or whether they actually
run continuously on their own. I can't get the live app at the URLs
above to work, but I'm not sure if that's because the background
workers aren't really running, or because I'm feeding it incorrect
URLs/configuration.

On Mar 11, 8:01 pm, bFlood  wrote:
> great find peterk! cant help but notice the very interesting
> async_apiproxy.py code in that project either...async db and url calls
> would be awesome
>
> cheers
> brian
>
> On Mar 11, 3:22 pm, peterk  wrote:
>
> > I've done a good bit of googling about what exactly it is I'm trying
> > to achieve, the correct terminology etc. and turns out I'm really
> > looking for an efficient way to do pub/sub, publication/subscription,
> > on appengine.
>
> > Which led me to this eyebrow-raising little app from Brett Slakin and
> > Brad Fitzpatrick:
>
> >http://code.google.com/p/pubsubhubbub/
>
> > Jaiku folk seem to be involved too.
>
> > Basically they have an experimental pub/sub system running on GAE,
> > using http instead of xmpp. The goals are somewhat broader than my
> > requirements - basically you have an atom feed at one url you post
> > data to, the atom feed updates a hub, and the hub then updates your
> > subscribers at different urls, so it's all a lot more distributed than
> > I need.
>
> > I imagine one could do something similar rolled into one application
> > that handles publishing, the hub, and subscription (e.g. like jaiku or
> > twitter). But the big minus point that sticks out here for me in
> > pubsubhubub is that it completely avoids the question of the total
> > write-cost per update. It can afford to do that because subscribers
> > here are distributed all over the place on different domains and
> > servers etc. so the write load gets spread across all those different
> > urls/machines..it's not the concern of the hub running on appengine.
> > But if pubs/subs/hub are all rolled into one application on appengine,
> > then your application will be soaking up the total write cost. The hub
> > handles updating each subscriber in the background, I think, so it
> > does solve the issue of writing to n subscribers in the space of one
> > request..the publisher issues its update to the hub, returns
> > immediately, and then the hub partitions out the subscriber updates
> > and works on updating everyone over time. But it doesn't solve the
> > issue of monetary cost of doing n writes for n subscribers...if one
> > publisher had lots of subscribers (e.g. hundreds of thousands or
> > millions), each of their updates could cost $$$!
>
> > But I guess one might be safe in assuming the average number of subs
> > per publisher will be much lower than that..and make the idea
> > financially feasible..
>
> > Anyway, I thought this project looked very interesting...it's giving
> > me a lot to chew on now..
>
> > On Mar 11, 3:50 pm, peterk  wrote:
>
> > > Hmm. Good thinking Andrew. I've done a bit more digging along those
> > > lines..I don't think Jaiku has publically said if and what messaging
> > > protocols they might use, but there seems to be some speculation that
> > > it uses XMPP.
>
> > > That might also neatly align with their comment that they are in some
> > > cases "extending" GAE in their port..and the recent news that XMPP is
> > > scheduled to come to GAE at some point in the next 6 months. Perhaps
> > > it's the work on the Jaiku port that's bringing that functionality...
>
> > > I didn't know anything about messaging protocols or XMPP until I dug
> > > around after your post, so it's all new to me. I found a project
> > > purporting to provide XMPP functionality in advance of Google's
> > > official support, athttp://imified-demo.appspot.com/, but looking at
> > > the code for it, I cannot for the life of me see how this could be
> > > used to address the problems discussed in my original email. That
> > > project seems only to take instant messages via a POST to a URL and
> > > save them to the datastore..but there doesn't seem to be any of the
> > > 'instant distribution' of updates to people subscribed to your
> > > presence etc. that other XMPP software boast about. Usin

[google-appengine] Re: A question for Jaiku's developers, if they're watching..

2009-03-12 Thread emi420

I implemented a very basic microblog app:

http://microbloog.appspot.com/

Regards!

On Mar 11, 10:21 am, peterk  wrote:
> I just read on your blog (from January) the intention to release the
> appengine port of Jaiku as open source when the port is finished..but
> I was wondering if I could be so cheeky as to jump ahead with a couple
> of questions about it. The requirements of Jaiku seem to line up
> roughly similarly with issues I'm having in a slightly different
> context, that I'm finding pretty challenging to implement efficiently
> on app engine.
>
> With your service, you seem to track updates for friends and other
> people I follow..so I  might have a long list of people I'm following,
> and you feed me their updates.
>
> How do you implement this on GAE?
>
> I've been toying with a very similar problem for some time now. It
> seems to me you cannot chain together queries such as
> me.friends.updates.order(..) to get your friends' latest updates, for
> example. You can't make n writes to n update queues for n people
> following you, since writes are so costly. If I store my friends in a
> list of keys, this limits the number of friends I can query at a given
> time to 30. e.g. updates.all().filter('user IN', me.friends) is
> limited to 30 subqueries. I may have many more friends, so this
> approach doesn't seem to be sufficient.
>
> I've been scratching my head over a similar problem for some time now,
> coming up with various hairbrained schemes that have been overly-
> complex, none of which deliver scalability to the nth degree. So I'd
> really, really, really appreciate any insight you could provide in
> implementing this kind of data model on GAE!! Many thanks!



[google-appengine] Re: A question for Jaiku's developers, if they're watching..

2009-03-12 Thread bFlood

right, I can't really tell either. it looks like the churn of incoming
requests is powering the "background" workers; the real change is the
async proxy (AsyncAPIProxy.start_call) firing the workers off without
affecting the incoming request (at least I think that's what is
happening...)

it would be nice to see the code behind this import (since it contains
the AsyncRPC class):

from google3.apphosting.runtime import _apphosting_runtime___python__apiproxy

cheers
brian


On Mar 11, 6:17 pm, peterk  wrote:
> The app is actually live here:
>
> http://pubsubhubbub.appspot.com/http://pubsubhubbub-subscriber.appspot.com/
>
> (pubsubhubbub-publisher isn't there, but it's trivial to upload your
> own.)
>
> This suggests it's working on appengine as it is now. Been looking
> through the source, and I'm not entirely clear on how the 'background
> workers' are actually working..there are two, one for pulling updates
> to feeds from publishers, and one for propogating updates to
> subscribers in batches.
>
> But like I say, I can't see how they're actually started and running
> constantly.  There is a video here of a live demonstration:
>
> http://www.veodia.com/player.php?vid=fCNU1qQ1oSs
>
> The background workers seem to be behaving as desired there, but I'm
> not sure if they were just constantly polling some urls to keep the
> workers live for the purposes of that demo, or if they're actually
> running somehow constantly on their own.. I can't actually get the
> live app at the urls above to work, but not sure if it's because
> background workers aren't really working, or because i'm feeding it
> incorrect urls/configuration etc.
>
> On Mar 11, 8:01 pm, bFlood  wrote:
>
> > great find peterk! cant help but notice the very interesting
> > async_apiproxy.py code in that project either...async db and url calls
> > would be awesome
>
> > cheers
> > brian
>
> > On Mar 11, 3:22 pm, peterk  wrote:
>
> > > I've done a good bit of googling about what exactly it is I'm trying
> > > to achieve, the correct terminology etc. and turns out I'm really
> > > looking for an efficient way to do pub/sub, publication/subscription,
> > > on appengine.
>
> > > Which led me to this eyebrow-raising little app from Brett Slakin and
> > > Brad Fitzpatrick:
>
> > >http://code.google.com/p/pubsubhubbub/
>
> > > Jaiku folk seem to be involved too.
>
> > > Basically they have an experimental pub/sub system running on GAE,
> > > using http instead of xmpp. The goals are somewhat broader than my
> > > requirements - basically you have an atom feed at one url you post
> > > data to, the atom feed updates a hub, and the hub then updates your
> > > subscribers at different urls, so it's all a lot more distributed than
> > > I need.
>
> > > I imagine one could do something similar rolled into one application
> > > that handles publishing, the hub, and subscription (e.g. like jaiku or
> > > twitter). But the big minus point that sticks out here for me in
> > > pubsubhubub is that it completely avoids the question of the total
> > > write-cost per update. It can afford to do that because subscribers
> > > here are distributed all over the place on different domains and
> > > servers etc. so the write load gets spread across all those different
> > > urls/machines..it's not the concern of the hub running on appengine.
> > > But if pubs/subs/hub are all rolled into one application on appengine,
> > > then your application will be soaking up the total write cost. The hub
> > > handles updating each subscriber in the background, I think, so it
> > > does solve the issue of writing to n subscribers in the space of one
> > > request..the publisher issues its update to the hub, returns
> > > immediately, and then the hub partitions out the subscriber updates
> > > and works on updating everyone over time. But it doesn't solve the
> > > issue of monetary cost of doing n writes for n subscribers...if one
> > > publisher had lots of subscribers (e.g. hundreds of thousands or
> > > millions), each of their updates could cost $$$!
>
> > > But I guess one might be safe in assuming the average number of subs
> > > per publisher will be much lower than that..and make the idea
> > > financially feasible..
>
> > > Anyway, I thought this project looked very interesting...it's giving
> > > me a lot to chew on now..
>
> > > On Mar 11, 3:50 pm, peterk  wrote:
>
> > > > Hmm. Good thinking Andrew. I've done a bit more digging along those
> > > > lines..I don't think Jaiku has publically said if and what messaging
> > > > protocols they might use, but there seems to be some speculation that
> > > > it uses XMPP.
>
> > > > That might also neatly align with their comment that they are in some
> > > > cases "extending" GAE in their port..and the recent news that XMPP is
> > > > scheduled to come to GAE at some point in the next 6 months. Perhaps
> > > > it's the work on the Jaiku port that's bringing that functionality...
>
> > > > I didn't know anything about messaging protoc

[google-appengine] Re: A question for Jaiku's developers, if they're watching..

2009-03-12 Thread peterk

The URLs at /work/pull_feeds and /work/push_events seem to manually
trigger the 'background' work of pulling feeds and pushing events, but
I still haven't actually got it working :p In the demo video it seems
to work beautifully, with instantaneous results.
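
In case anyone else wants to poke at it, this is roughly what I've been
doing to drive the workers by hand. I'm assuming a plain POST to those
paths is enough to trigger one batch of work, which may well be wrong:

import urllib2

# Hypothetical manual trigger: hit the work URLs to run one batch each.
for path in ('/work/pull_feeds', '/work/push_events'):
    urllib2.urlopen('http://pubsubhubbub.appspot.com' + path, data='')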

There is a tiny group for pubsubhubbub, currently with no messages. I
could leave one asking about this, but it seems somehow wrong for the
first post there to be some noob question of mine :p I know Brett
Slatkin watches this group too.

emi420 - thanks for the link; I actually came across your site
yesterday during my research. How do you manage feeding updates to
followers? Do you actually write each update n times, to n different
message queues?

On Mar 12, 12:30 pm, bFlood  wrote:
> right, i can't really tell either. it looks like the churn of incoming
> requests is powering the "background" workers, the real change is the
> async proxy (AsyncAPIProxy.start_call) to fire the workers off without
> affecting the incoming request (at least I think thats what is
> happening...)
>
> it would be nice to see the code behind this import (since it contains
> the AsyncRPC class)
> from google3.apphosting.runtime import
> _apphosting_runtime___python__apiproxy
>
> cheers
> brian
>
> On Mar 11, 6:17 pm, peterk  wrote:
>
> > The app is actually live here:
>
> >http://pubsubhubbub.appspot.com/http://pubsubhubbub-subscriber.appspo...
>
> > (pubsubhubbub-publisher isn't there, but it's trivial to upload your
> > own.)
>
> > This suggests it's working on appengine as it is now. Been looking
> > through the source, and I'm not entirely clear on how the 'background
> > workers' are actually working..there are two, one for pulling updates
> > to feeds from publishers, and one for propogating updates to
> > subscribers in batches.
>
> > But like I say, I can't see how they're actually started and running
> > constantly.  There is a video here of a live demonstration:
>
> >http://www.veodia.com/player.php?vid=fCNU1qQ1oSs
>
> > The background workers seem to be behaving as desired there, but I'm
> > not sure if they were just constantly polling some urls to keep the
> > workers live for the purposes of that demo, or if they're actually
> > running somehow constantly on their own.. I can't actually get the
> > live app at the urls above to work, but not sure if it's because
> > background workers aren't really working, or because i'm feeding it
> > incorrect urls/configuration etc.
>
> > On Mar 11, 8:01 pm, bFlood  wrote:
>
> > > great find peterk! cant help but notice the very interesting
> > > async_apiproxy.py code in that project either...async db and url calls
> > > would be awesome
>
> > > cheers
> > > brian
>
> > > On Mar 11, 3:22 pm, peterk  wrote:
>
> > > > I've done a good bit of googling about what exactly it is I'm trying
> > > > to achieve, the correct terminology etc. and turns out I'm really
> > > > looking for an efficient way to do pub/sub, publication/subscription,
> > > > on appengine.
>
> > > > Which led me to this eyebrow-raising little app from Brett Slakin and
> > > > Brad Fitzpatrick:
>
> > > >http://code.google.com/p/pubsubhubbub/
>
> > > > Jaiku folk seem to be involved too.
>
> > > > Basically they have an experimental pub/sub system running on GAE,
> > > > using http instead of xmpp. The goals are somewhat broader than my
> > > > requirements - basically you have an atom feed at one url you post
> > > > data to, the atom feed updates a hub, and the hub then updates your
> > > > subscribers at different urls, so it's all a lot more distributed than
> > > > I need.
>
> > > > I imagine one could do something similar rolled into one application
> > > > that handles publishing, the hub, and subscription (e.g. like jaiku or
> > > > twitter). But the big minus point that sticks out here for me in
> > > > pubsubhubub is that it completely avoids the question of the total
> > > > write-cost per update. It can afford to do that because subscribers
> > > > here are distributed all over the place on different domains and
> > > > servers etc. so the write load gets spread across all those different
> > > > urls/machines..it's not the concern of the hub running on appengine.
> > > > But if pubs/subs/hub are all rolled into one application on appengine,
> > > > then your application will be soaking up the total write cost. The hub
> > > > handles updating each subscriber in the background, I think, so it
> > > > does solve the issue of writing to n subscribers in the space of one
> > > > request..the publisher issues its update to the hub, returns
> > > > immediately, and then the hub partitions out the subscriber updates
> > > > and works on updating everyone over time. But it doesn't solve the
> > > > issue of monetary cost of doing n writes for n subscribers...if one
> > > > publisher had lots of subscribers (e.g. hundreds of thousands or
> > > > millions), each of their updates could cost $$$!
>
> > > > But I guess one might be safe in assuming the average

[google-appengine] Re: A question for Jaiku's developers, if they're watching..

2009-03-12 Thread Brett Slatkin

Heyo,

Good finds, peterk!

pubsubhubbub uses some of the same techniques that Jaiku uses for
doing one-to-many fan-out of status message updates. The migration is
underway as we speak
(http://www.jaiku.com/blog/2009/03/11/upcoming-service-break/). I
believe the code should be available very soon.

2009/3/11 peterk :
>
> The app is actually live here:
>
> http://pubsubhubbub.appspot.com/
> http://pubsubhubbub-subscriber.appspot.com/
>
> (pubsubhubbub-publisher isn't there, but it's trivial to upload your
> own.)
>
> This suggests it's working on appengine as it is now. Been looking
> through the source, and I'm not entirely clear on how the 'background
> workers' are actually working..there are two, one for pulling updates
> to feeds from publishers, and one for propogating updates to
> subscribers in batches.
>
> But like I say, I can't see how they're actually started and running
> constantly.  There is a video here of a live demonstration:
>
> http://www.veodia.com/player.php?vid=fCNU1qQ1oSs
>
> The background workers seem to be behaving as desired there, but I'm
> not sure if they were just constantly polling some urls to keep the
> workers live for the purposes of that demo, or if they're actually
> running somehow constantly on their own.. I can't actually get the
> live app at the urls above to work, but not sure if it's because
> background workers aren't really working, or because i'm feeding it
> incorrect urls/configuration etc.

Ah sorry yeah I still have the old version of the source running on
pubsubhubbub.appspot.com; I need to update that with a more recent
build. Sorry for the trouble! It's still not quite ready for
widespread use, but it should be soon.

The way pubsubhubbub does fan-out, there's no need to write an entity
for each subscriber of a feed. Instead, each time it consumes a task
from the work queue it will update the current iterator position in
the query result of subscribers for a URL. Subsequent work requests
will offset into the subscribers starting at the iterator position.
This works well in this case because it's using urlfetch to actually
notify subscribers, instead of writing to the Datastore.
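
Roughly, that batching loop might look like the sketch below; the model
and handler function names here are hypothetical, not the actual
pubsubhubbub code:

from google.appengine.ext import db
from google.appengine.api import urlfetch

BATCH_SIZE = 50

class Subscription(db.Model):
    topic_url = db.StringProperty(required=True)
    callback_url = db.StringProperty(required=True)

class DeliveryJob(db.Model):
    # Tracks how far through the subscriber list this event has been pushed.
    topic_url = db.StringProperty(required=True)
    payload = db.TextProperty()
    offset = db.IntegerProperty(default=0)

def push_events_work(job):
    # One unit of background work: notify the next batch of subscribers.
    subs = (Subscription.all()
            .filter('topic_url =', job.topic_url)
            .fetch(BATCH_SIZE, offset=job.offset))
    for sub in subs:
        # Delivery happens via urlfetch; no per-subscriber datastore write.
        urlfetch.fetch(sub.callback_url, payload=job.payload, method='POST')
    if len(subs) == BATCH_SIZE:
        job.offset += BATCH_SIZE
        job.put()      # remember the iterator position for the next pass
    else:
        job.delete()   # everyone has been notified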

For other pub/sub-style systems where you want to write to the
Datastore, the trick is to use list properties to track the
subscribers you've published to. So for instance, instead of writing a
single entity per subscriber, you write one entity with 1000-2000
subscriber IDs in a list. Then all queries for that list with an
equals filter for the subscriber will show the entity. This lets you
pack a lot of information into a single entity write, thus minimizing
Datastore overhead, cost, etc. Does that make sense?
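
In code, the pattern would be roughly this (illustrative names only,
not Jaiku's actual models):

from google.appengine.ext import db

class UpdateFanout(db.Model):
    # One entity carries the update plus a large batch of subscriber IDs.
    author_id = db.StringProperty(required=True)
    body = db.TextProperty()
    created = db.DateTimeProperty(auto_now_add=True)
    subscribers = db.StringListProperty()  # e.g. 1000-2000 IDs per entity

# Writing: a single put() covers a whole batch of subscribers.
batch_of_subscriber_ids = ['bob', 'carol', 'dave']  # stand-in data
UpdateFanout(author_id='alice', body='hello world',
             subscribers=batch_of_subscriber_ids).put()

# Reading one subscriber's stream: an equality filter on the list property
# matches any entity whose list contains that ID (needs a composite index
# on subscribers + created to allow the sort).
stream = (UpdateFanout.all()
          .filter('subscribers =', 'bob')
          .order('-created')
          .fetch(100))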


@bFlood: Indeed, the async_apiproxy.py code is interesting. Not much
to say about that at this time, besides the fact that it works. =)

-Brett




[google-appengine] Re: A question for Jaiku's developers, if they're watching..

2009-03-13 Thread Paul Kinlan
Just Curious,

> For other pub/sub-style systems where you want to write to the
> Datastore, the trick is to use list properties to track the
> subscribers you've published to. So for instance, instead of writing a
> single entity per subscriber, you write one entity with 1000-2000
> subscriber IDs in a list. Then all queries for that list with an
> equals filter for the subscriber will show the entity. This lets you
> pack a lot of information into a single entity write, thus minimizing
> Datastore overhead, cost, etc. Does that make sense?

So if you have more subscribers than the 5,000 limit, would you write
the entity twice, each with different subscriber IDs?

Paul

2009/3/13 Brett Slatkin 

>
> Heyo,
>
> Good finds, peterk!
>
> pubsubhubbub uses some of the same techniques that Jaiku uses for
> doing one-to-many fan-out of status message updates. The migration is
> underway as we speak
> (http://www.jaiku.com/blog/2009/03/11/upcoming-service-break/). I
> believe the code should be available very soon.
>
> 2009/3/11 peterk :
> >
> > The app is actually live here:
> >
> > http://pubsubhubbub.appspot.com/
> > http://pubsubhubbub-subscriber.appspot.com/
> >
> > (pubsubhubbub-publisher isn't there, but it's trivial to upload your
> > own.)
> >
> > This suggests it's working on appengine as it is now. Been looking
> > through the source, and I'm not entirely clear on how the 'background
> > workers' are actually working..there are two, one for pulling updates
> > to feeds from publishers, and one for propogating updates to
> > subscribers in batches.
> >
> > But like I say, I can't see how they're actually started and running
> > constantly.  There is a video here of a live demonstration:
> >
> > http://www.veodia.com/player.php?vid=fCNU1qQ1oSs
> >
> > The background workers seem to be behaving as desired there, but I'm
> > not sure if they were just constantly polling some urls to keep the
> > workers live for the purposes of that demo, or if they're actually
> > running somehow constantly on their own.. I can't actually get the
> > live app at the urls above to work, but not sure if it's because
> > background workers aren't really working, or because i'm feeding it
> > incorrect urls/configuration etc.
>
> Ah sorry yeah I still have the old version of the source running on
> pubsubhubbub.appspot.com; I need to update that with a more recent
> build. Sorry for the trouble! It's still not quite ready for
> widespread use, but it should be soon.
>
> The way pubsubhubbub does fan-out, there's no need to write an entity
> for each subscriber of a feed. Instead, each time it consumes a task
> from the work queue it will update the current iterator position in
> the query result of subscribers for a URL. Subsequent work requests
> will offset into the subscribers starting at the iterator position.
> This works well in this case because it's using urlfetch to actually
> notify subscribers, instead of writing to the Datastore.
>
> For other pub/sub-style systems where you want to write to the
> Datastore, the trick is to use list properties to track the
> subscribers you've published to. So for instance, instead of writing a
> single entity per subscriber, you write one entity with 1000-2000
> subscriber IDs in a list. Then all queries for that list with an
> equals filter for the subscriber will show the entity. This lets you
> pack a lot of information into a single entity write, thus minimizing
> Datastore overhead, cost, etc. Does that make sense?
>
>
> @bFlood: Indeed, the async_apiproxy.py code is interesting. Not much
> to say about that at this time, besides the fact that it works. =)
>
> -Brett
>
> >
>




[google-appengine] Re: A question for Jaiku's developers, if they're watching..

2009-03-13 Thread peterk

I was just toying around with this idea yesterday, Brett.. :D I did
some profiling, and it would reduce the write cost per subscriber to
about 24-40ms (depending on the number of subscribers you have; more
subscribers means a lower average cost per subscriber), down from
100-150ms. These are rough numbers with the entities I was using; I
have to do some more accurate profiling.
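
To put those rough figures in perspective (using my numbers above, so
take them with a grain of salt): fanning a single update out to 10,000
subscribers would be roughly 10,000 x 100-150ms = 1,000-1,500
CPU-seconds with one entity per subscriber, versus roughly
10,000 x 24-40ms = 240-400 CPU-seconds with the batched list-property
approach.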

When I first thought about doing this, I was thinking ":o I'll reduce
write cost by a factor of hundreds!", but as it turns out, the extra
index update time for an entity with a large number of list property
entries eats into that saving significantly.

But it is still a saving. Funnily enough, the per-subscriber saving
increases (up to a point) the more subscribers you have.

I'm not sure if there's anything one can do to optimise index creation
time for large lists. I'm also going to do some more work to see if
there's an optimum 'batch size' for grouping subscribers together; at
first blush, as mentioned above, it seems the larger the better (up to
the per-entity property/index cap, of course).

Thanks also for the insight on pubsubhubbub; I eagerly await updates on
that front :) Thank you!!

On Mar 13, 8:05 am, Paul Kinlan  wrote:
> Just Curious,
>
> For other pub/sub-style systems where you want to write to the
> Datastore, the trick is to use list properties to track the
> subscribers you've published to. So for instance, instead of writing a
> single entity per subscriber, you write one entity with 1000-2000
> subscriber IDs in a list. Then all queries for that list with an
> equals filter for the subscriber will show the entity. This lets you
> pack a lot of information into a single entity write, thus minimizing
> Datastore overhead, cost, etc. Does that make sense?
>
> So if you have over the 5000 limit in the subscribers would you write the
> entity twice? Each with differnt subscriber id's?
>
> Paul
>
> 2009/3/13 Brett Slatkin 
>
>
>
> > Heyo,
>
> > Good finds, peterk!
>
> > pubsubhubbub uses some of the same techniques that Jaiku uses for
> > doing one-to-many fan-out of status message updates. The migration is
> > underway as we speak
> > (http://www.jaiku.com/blog/2009/03/11/upcoming-service-break/). I
> > believe the code should be available very soon.
>
> > 2009/3/11 peterk :
>
> > > The app is actually live here:
>
> > >http://pubsubhubbub.appspot.com/
> > >http://pubsubhubbub-subscriber.appspot.com/
>
> > > (pubsubhubbub-publisher isn't there, but it's trivial to upload your
> > > own.)
>
> > > This suggests it's working on appengine as it is now. Been looking
> > > through the source, and I'm not entirely clear on how the 'background
> > > workers' are actually working..there are two, one for pulling updates
> > > to feeds from publishers, and one for propogating updates to
> > > subscribers in batches.
>
> > > But like I say, I can't see how they're actually started and running
> > > constantly.  There is a video here of a live demonstration:
>
> > >http://www.veodia.com/player.php?vid=fCNU1qQ1oSs
>
> > > The background workers seem to be behaving as desired there, but I'm
> > > not sure if they were just constantly polling some urls to keep the
> > > workers live for the purposes of that demo, or if they're actually
> > > running somehow constantly on their own.. I can't actually get the
> > > live app at the urls above to work, but not sure if it's because
> > > background workers aren't really working, or because i'm feeding it
> > > incorrect urls/configuration etc.
>
> > Ah sorry yeah I still have the old version of the source running on
> > pubsubhubbub.appspot.com; I need to update that with a more recent
> > build. Sorry for the trouble! It's still not quite ready for
> > widespread use, but it should be soon.
>
> > The way pubsubhubbub does fan-out, there's no need to write an entity
> > for each subscriber of a feed. Instead, each time it consumes a task
> > from the work queue it will update the current iterator position in
> > the query result of subscribers for a URL. Subsequent work requests
> > will offset into the subscribers starting at the iterator position.
> > This works well in this case because it's using urlfetch to actually
> > notify subscribers, instead of writing to the Datastore.
>
> > For other pub/sub-style systems where you want to write to the
> > Datastore, the trick is to use list properties to track the
> > subscribers you've published to. So for instance, instead of writing a
> > single entity per subscriber, you write one entity with 1000-2000
> > subscriber IDs in a list. Then all queries for that list with an
> > equals filter for the subscriber will show the entity. This lets you
> > pack a lot of information into a single entity write, thus minimizing
> > Datastore overhead, cost, etc. Does that make sense?
>
> > @bFlood: Indeed, the async_apiproxy.py code is interesting. Not much
> > to say about that at this time, besides the fact that it works. =)
>
> > -Brett

[google-appengine] Re: A question for Jaiku's developers, if they're watching..

2009-03-13 Thread bFlood

@peterk - if you don't need to query by the subscriber, you could
alternatively pack the list of subscribers for a feed into a
TextProperty so it is not indexed. I use TextProperty a lot to store
large lists of geometry data and they work out pretty well
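
something like this, roughly (names are just for illustration):

from google.appengine.ext import db

class FeedSubscribers(db.Model):
    feed_url = db.StringProperty(required=True)
    # TextProperty values are never indexed, so a huge packed list adds
    # no index-write cost (but you can't filter on it).
    subscriber_blob = db.TextProperty()

def save_subscribers(feed_url, subscriber_ids):
    blob = db.Text(','.join(subscriber_ids))
    FeedSubscribers(feed_url=feed_url, subscriber_blob=blob).put()

def load_subscribers(entity):
    return entity.subscriber_blob.split(',') if entity.subscriber_blob else []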

@brett - async! looking forward to it in future GAE builds. thanks

cheers
brian

On Mar 13, 5:37 am, peterk  wrote:
> I was just toying around with this idea yesterday Brett.. :D I did
> some profiling, and it would reduce the write cost per subscriber to
> about 24ms-40ms (depending on the number of subscribers you have..more
> = lower cost per avg), from 100-150ms. These are rough numbers with
> entities I was using, I have to do some more accurate profiling..
>
> When I first thought about doing this, I was thinking ":o I'll reduce
> write cost by a factor of hundreds!", but as it turns out, the extra
> index update time for an entity with a large number of list property
> entries eats into that saving significantly.
>
> But it still is a saving. Funnily enough the per subscriber saving
> increases (to a point) the more subscribers you have.
>
> I'm not sure if there's anything one can do to optimise index creation
> time with large lists.. I'm going to do some more work as well to see
> if there's an optimum 'batch size' for grouping subscribers
> together..at first blush, as mentioned above, it seems the larger the
> better (up to the per entity property/index cap of course).
>
> Thanks also for the insight on pubsubhubub..I eagerly await updates on
> that front :) Thank you!!
>
> On Mar 13, 8:05 am, Paul Kinlan  wrote:
>
> > Just Curious,
>
> > For other pub/sub-style systems where you want to write to the
> > Datastore, the trick is to use list properties to track the
> > subscribers you've published to. So for instance, instead of writing a
> > single entity per subscriber, you write one entity with 1000-2000
> > subscriber IDs in a list. Then all queries for that list with an
> > equals filter for the subscriber will show the entity. This lets you
> > pack a lot of information into a single entity write, thus minimizing
> > Datastore overhead, cost, etc. Does that make sense?
>
> > So if you have over the 5000 limit in the subscribers would you write the
> > entity twice? Each with differnt subscriber id's?
>
> > Paul
>
> > 2009/3/13 Brett Slatkin 
>
> > > Heyo,
>
> > > Good finds, peterk!
>
> > > pubsubhubbub uses some of the same techniques that Jaiku uses for
> > > doing one-to-many fan-out of status message updates. The migration is
> > > underway as we speak
> > > (http://www.jaiku.com/blog/2009/03/11/upcoming-service-break/). I
> > > believe the code should be available very soon.
>
> > > 2009/3/11 peterk :
>
> > > > The app is actually live here:
>
> > > >http://pubsubhubbub.appspot.com/
> > > >http://pubsubhubbub-subscriber.appspot.com/
>
> > > > (pubsubhubbub-publisher isn't there, but it's trivial to upload your
> > > > own.)
>
> > > > This suggests it's working on appengine as it is now. Been looking
> > > > through the source, and I'm not entirely clear on how the 'background
> > > > workers' are actually working..there are two, one for pulling updates
> > > > to feeds from publishers, and one for propogating updates to
> > > > subscribers in batches.
>
> > > > But like I say, I can't see how they're actually started and running
> > > > constantly.  There is a video here of a live demonstration:
>
> > > >http://www.veodia.com/player.php?vid=fCNU1qQ1oSs
>
> > > > The background workers seem to be behaving as desired there, but I'm
> > > > not sure if they were just constantly polling some urls to keep the
> > > > workers live for the purposes of that demo, or if they're actually
> > > > running somehow constantly on their own.. I can't actually get the
> > > > live app at the urls above to work, but not sure if it's because
> > > > background workers aren't really working, or because i'm feeding it
> > > > incorrect urls/configuration etc.
>
> > > Ah sorry yeah I still have the old version of the source running on
> > > pubsubhubbub.appspot.com; I need to update that with a more recent
> > > build. Sorry for the trouble! It's still not quite ready for
> > > widespread use, but it should be soon.
>
> > > The way pubsubhubbub does fan-out, there's no need to write an entity
> > > for each subscriber of a feed. Instead, each time it consumes a task
> > > from the work queue it will update the current iterator position in
> > > the query result of subscribers for a URL. Subsequent work requests
> > > will offset into the subscribers starting at the iterator position.
> > > This works well in this case because it's using urlfetch to actually
> > > notify subscribers, instead of writing to the Datastore.
>
> > > For other pub/sub-style systems where you want to write to the
> > > Datastore, the trick is to use list properties to track the
> > > subscribers you've published to. So for instance, instead of writing a
> > > single entity per s

[google-appengine] Re: A question for Jaiku's developers, if they're watching..

2009-03-13 Thread peterk

Unfortunately I do need to query them based on subscriber_id, so I
can't pack them into a non-indexed property.

Retrieving the updates a particular user has subscribed to is blazingly
fast though... that's the gain in the end: I can query and fetch 1000
updates for a user, sorted by date, in 20-30 CPU-ms. Love that :p In my
previous hacky approaches, where I tried to write once and then
'gather', I had to do lots of in-memory sorting and stuff, and even
then the results often weren't totally accurate.

I'm going to keep toying with the write end of things though, because
in my full app I may need to write to other entities along with the
subscriber entities to do certain things I'm trying to achieve. So I'm
going to be looking for every possible opportunity to optimise the cost
of an 'update', which in my case may go beyond notifying subscribers.
Any thoughts/ideas on further optimisation are more than welcome!!

@Paul

If you have more subscribers than will fit in one 'group', you'll need
multiple groups, correct. So you'll have n writes, where n = (number of
subscribers / group size), rounded up to the nearest whole number. Even
with the costly index creation for each of these 'group' entities,
though, it should still work out a fair bit cheaper than writing a
separate entity for each subscriber.
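
In other words, something like this sketch (group size and names are
illustrative):

from google.appengine.ext import db

GROUP_SIZE = 2000  # whatever per-entity list/index cap you settle on

class SubscriberGroup(db.Model):
    publisher_id = db.StringProperty(required=True)
    subscribers = db.StringListProperty()

def write_groups(publisher_id, subscriber_ids):
    # n writes, where n = ceil(len(subscriber_ids) / GROUP_SIZE)
    groups = [subscriber_ids[i:i + GROUP_SIZE]
              for i in range(0, len(subscriber_ids), GROUP_SIZE)]
    db.put([SubscriberGroup(publisher_id=publisher_id, subscribers=g)
            for g in groups])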


On Mar 13, 11:47 am, bFlood  wrote:
> @peterk - if you don't need to query by the subscriber, you could
> alternatively pack the list of subscribers for a feed into a
> TextProperty so it is not indexed. I use TextProperty a lot to store
> large lists of geometry data and they work out pretty well
>
> @brett - async! looking forward to it in future GAE builds. thanks
>
> cheers
> brian
>
> On Mar 13, 5:37 am, peterk  wrote:
>
> > I was just toying around with this idea yesterday Brett.. :D I did
> > some profiling, and it would reduce the write cost per subscriber to
> > about 24ms-40ms (depending on the number of subscribers you have..more
> > = lower cost per avg), from 100-150ms. These are rough numbers with
> > entities I was using, I have to do some more accurate profiling..
>
> > When I first thought about doing this, I was thinking ":o I'll reduce
> > write cost by a factor of hundreds!", but as it turns out, the extra
> > index update time for an entity with a large number of list property
> > entries eats into that saving significantly.
>
> > But it still is a saving. Funnily enough the per subscriber saving
> > increases (to a point) the more subscribers you have.
>
> > I'm not sure if there's anything one can do to optimise index creation
> > time with large lists.. I'm going to do some more work as well to see
> > if there's an optimum 'batch size' for grouping subscribers
> > together..at first blush, as mentioned above, it seems the larger the
> > better (up to the per entity property/index cap of course).
>
> > Thanks also for the insight on pubsubhubub..I eagerly await updates on
> > that front :) Thank you!!
>
> > On Mar 13, 8:05 am, Paul Kinlan  wrote:
>
> > > Just Curious,
>
> > > For other pub/sub-style systems where you want to write to the
> > > Datastore, the trick is to use list properties to track the
> > > subscribers you've published to. So for instance, instead of writing a
> > > single entity per subscriber, you write one entity with 1000-2000
> > > subscriber IDs in a list. Then all queries for that list with an
> > > equals filter for the subscriber will show the entity. This lets you
> > > pack a lot of information into a single entity write, thus minimizing
> > > Datastore overhead, cost, etc. Does that make sense?
>
> > > So if you have over the 5000 limit in the subscribers would you write the
> > > entity twice? Each with differnt subscriber id's?
>
> > > Paul
>
> > > 2009/3/13 Brett Slatkin 
>
> > > > Heyo,
>
> > > > Good finds, peterk!
>
> > > > pubsubhubbub uses some of the same techniques that Jaiku uses for
> > > > doing one-to-many fan-out of status message updates. The migration is
> > > > underway as we speak
> > > > (http://www.jaiku.com/blog/2009/03/11/upcoming-service-break/). I
> > > > believe the code should be available very soon.
>
> > > > 2009/3/11 peterk :
>
> > > > > The app is actually live here:
>
> > > > >http://pubsubhubbub.appspot.com/
> > > > >http://pubsubhubbub-subscriber.appspot.com/
>
> > > > > (pubsubhubbub-publisher isn't there, but it's trivial to upload your
> > > > > own.)
>
> > > > > This suggests it's working on appengine as it is now. Been looking
> > > > > through the source, and I'm not entirely clear on how the 'background
> > > > > workers' are actually working..there are two, one for pulling updates
> > > > > to feeds from publishers, and one for propagating updates to
> > > > > subscribers in batches.
>
> > > > > But like I say, I can't see how they're actually started and running
> > > > > constantly.  There is a video here of a live demonstration:
>
> > > > >http://www.veodia.com/player.php?vid=fCNU1qQ1oSs
>
> > > > > The background workers seem to be 

[google-appengine] Re: A question for Jaiku's developers, if they're watching..

2009-03-14 Thread peterk

Just a heads-up - Jaiku has gone open source :)

http://code.google.com/p/jaikuengine/

At a very brief first glance, I see references to xmpp stuff and
more..going to try and map out the code and see what goodies might be
there, could be stuff of interest beyond pub/sub too.

On Mar 13, 1:28 pm, peterk  wrote:
> Unfortunately I do need to query them based on subscriber_id..so I
> can't pack them into a non-indexed property.
>
> Retrieving updates a particular user has subscribed to is blazingly fast
> though...that's the gain in the end, I can query and fetch 1000
> updates for a user sorted by date in 20-30ms-cpu. Love that :p In my
> hacky approaches previously where I tried to write once and then
> 'gather', I had to do lots of in-memory sorting and stuff, and even
> the results often wouldn't be totally accurate.
>
> I'm going to keep toying with the write end of things though..because
> in my full app, I may need to write to other entities along with
> subscribers to do certain things I'm trying to achieve. So I'm going
> to be looking for every opportunity possible to optimise the cost of
> an 'update', which in my case may go beyond notifying subscribers. So
> any thoughts/ideas on further optimisation are more than welcome!!
>
> @Paul
>
> If you've more subscribers than will fit in one 'group', you'll need
> multiple groups, correct. So you'll have n writes, where n = number of
> subscribers / group size, rounded up to the nearest whole number. Even
> with the costly index creation for each of these 'group' entities,
> though, it should still work out a fair bit cheaper than writing a
> separate entity for each subscriber.

[google-appengine] Re: A question for Jaiku's developers, if they're watching..

2009-03-14 Thread Mahmoud

By the way, we do something very similar at Sponty
(http://www.thesponty.com/boston). We're adding a new feature whereby,
when one of your friends posts an event, we notify his or her friends
via email, SMS or Twitter. It would take too long to process those
communiqués right away, so we just write a PendingNotification to the
datastore.

All clients make a periodic ajax call to the server. We call that
"presence". Each presence call processes a set of PendingNotifications
in a transaction to ensure we do not email people more than once.
Since the UI is not waiting on the result of the presence call, it can
take as long as it needs.

-Mahmoud
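
A rough sketch of that pattern, with made-up model and function names
(the real Sponty code is not public). Here each notification is flipped
to 'sent' in its own small transaction, since unrelated entities cannot
share one datastore transaction:

from google.appengine.ext import db
from google.appengine.api import mail

class PendingNotification(db.Model):
    recipient_email = db.StringProperty(required=True)
    message = db.TextProperty()
    sent = db.BooleanProperty(default=False)

def deliver(notification_key):
    # The transaction guarantees two overlapping presence calls cannot
    # both claim (and therefore both email) the same notification.
    def txn():
        n = db.get(notification_key)
        if n is None or n.sent:
            return None
        n.sent = True
        n.put()
        return n
    n = db.run_in_transaction(txn)
    if n is not None:
        mail.send_mail(sender='notify@example.com',  # hypothetical sender
                       to=n.recipient_email,
                       subject='A friend posted an event',
                       body=n.message)

def process_pending(batch_size=5):
    # Called from the periodic "presence" ajax request; the UI is not
    # waiting on this, so a few notifications per call is plenty.
    keys = (PendingNotification.all(keys_only=True)
            .filter('sent =', False)
            .fetch(batch_size))
    for key in keys:
        deliver(key)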


On Mar 13, 2:23 am, Brett Slatkin  wrote:
> Heyo,
>
> Good finds, peterk!
>
> pubsubhubbub uses some of the same techniques that Jaiku uses for
> doing one-to-many fan-out of status message updates. The migration is
> underway as we speak
> (http://www.jaiku.com/blog/2009/03/11/upcoming-service-break/). I
> believe the code should be available very soon.
>
> 2009/3/11 peterk :
>
>
>
>
>
> > The app is actually live here:
>
> >http://pubsubhubbub.appspot.com/
> >http://pubsubhubbub-subscriber.appspot.com/
>
> > (pubsubhubbub-publisher isn't there, but it's trivial to upload your
> > own.)
>
> > This suggests it's working on appengine as it is now. Been looking
> > through the source, and I'm not entirely clear on how the 'background
> > workers' are actually working..there are two, one for pulling updates
> > to feeds from publishers, and one for propagating updates to
> > subscribers in batches.
>
> > But like I say, I can't see how they're actually started and running
> > constantly.  There is a video here of a live demonstration:
>
> >http://www.veodia.com/player.php?vid=fCNU1qQ1oSs
>
> > The background workers seem to be behaving as desired there, but I'm
> > not sure if they were just constantly polling some urls to keep the
> > workers live for the purposes of that demo, or if they're actually
> > running somehow constantly on their own.. I can't actually get the
> > live app at the urls above to work, but not sure if it's because
> > background workers aren't really working, or because i'm feeding it
> > incorrect urls/configuration etc.
>
> Ah sorry yeah I still have the old version of the source running on
> pubsubhubbub.appspot.com; I need to update that with a more recent
> build. Sorry for the trouble! It's still not quite ready for
> widespread use, but it should be soon.
>
> The way pubsubhubbub does fan-out, there's no need to write an entity
> for each subscriber of a feed. Instead, each time it consumes a task
> from the work queue it will update the current iterator position in
> the query result of subscribers for a URL. Subsequent work requests
> will offset into the subscribers starting at the iterator position.
> This works well in this case because it's using urlfetch to actually
> notify subscribers, instead of writing to the Datastore.
>
> For other pub/sub-style systems where you want to write to the
> Datastore, the trick is to use list properties to track the
> subscribers you've published to. So for instance, instead of writing a
> single entity per subscriber, you write one entity with 1000-2000
> subscriber IDs in a list. Then all queries for that list with an
> equals filter for the subscriber will show the entity. This lets you
> pack a lot of information into a single entity write, thus minimizing
> Datastore overhead, cost, etc. Does that make sense?
>
> @bFlood: Indeed, the async_apiproxy.py code is interesting. Not much
> to say about that at this time, besides the fact that it works. =)
>
> -Brett
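
Roughly the shape of what Brett describes, as an untested sketch with
assumed names (not the actual pubsubhubbub source): each pass notifies
one batch of subscribers over urlfetch and stores where it stopped, so
the next work item resumes from that position instead of writing an
entity per subscriber.

from google.appengine.ext import db
from google.appengine.api import urlfetch

BATCH = 100  # subscribers notified per work item (assumed)

class Subscription(db.Model):
    topic_url = db.StringProperty(required=True)
    callback_url = db.StringProperty(required=True)

class FanoutPosition(db.Model):
    # One per published update; keyed by an update id (assumption).
    # Records the last subscriber handled so work can resume there.
    last_key = db.StringProperty()

def fan_out_step(update_id, topic_url, payload):
    pos = FanoutPosition.get_or_insert(update_id)
    q = (Subscription.all()
         .filter('topic_url =', topic_url)
         .order('__key__'))
    if pos.last_key:
        q.filter('__key__ >', db.Key(pos.last_key))
    subs = q.fetch(BATCH)
    for sub in subs:
        # Notify the subscriber directly; nothing is written per subscriber.
        urlfetch.fetch(sub.callback_url, payload=payload, method='POST')
    if subs:
        pos.last_key = str(subs[-1].key())
        pos.put()
    return len(subs) == BATCH  # True: more subscribers remain, run again

Each call handles a bounded slice of the subscriber list, which is what
keeps any single request within the runtime limits.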



[google-appengine] Re: A question for Jaiku's developers, if they're watching..

2009-03-14 Thread Paul Kinlan
I am really stoked by the fact that it is now open.  I love seeing how other
people develop software.

Paul.

2009/3/14 peterk 

>
> Just a head's up - Jaiku has gone open source :)
>
> http://code.google.com/p/jaikuengine/
>
> At a very brief first glance, I see references to xmpp stuff and
> more..going to try and map out the code and see what goodies might be
> there, could be stuff of interest beyond pub/sub too.

[google-appengine] Re: A question for Jaiku's developers, if they're watching..

2009-03-15 Thread thuan

I know the topic is more about microblogging services than xmpp, but
by chance, has anybody managed to get a comet/ajax push application
working? The technique might be used to speed up message display for
popular conversations, as it is for the chat function in Google Mail.



[google-appengine] Re: A question for Jaiku's developers, if they're watching..

2009-03-15 Thread Andrew Badera

Wouldn't a Comet mechanism be fairly expensive, implemented on GAE?


On Sun, Mar 15, 2009 at 7:59 AM, thuan  wrote:
>
> I know the topic is more about microblogging services than xmpp, but
> by chance, have somebody achieved to install some comet/ajax push
> applications? The technique might be used to speed up message display
> for popular conversations as it is used for the chat function in
> google mail.
> >
>




[google-appengine] Re: A question for Jaiku's developers, if they're watching..

2009-03-15 Thread David Wilson

2009/3/15 thuan :
>
> I know the topic is more about microblogging services than xmpp, but
> by chance, have somebody achieved to install some comet/ajax push
> applications? The technique might be used to speed up message display
> for popular conversations as it is used for the chat function in
> google mail.

Since there are no blocking / sleeping primitives in AppEngine, it's
not really possible without spinning and burning a tonne of CPU, or
making lots of requests. You really need an external component for it.

Hopefully the AppEngine XMPP support that is on the roadmap will
include support for BOSH,
which would solve the Comet problem beautifully. :)


David

> >
>



-- 
It is better to be wrong than to be vague.
  — Freeman Dyson




[google-appengine] Re: A question for Jaiku's developers, if they're watching..

2009-03-16 Thread Dan Sanderson
The XMPP support mentioned on the roadmap does not include BOSH.
-- Dan

On Sun, Mar 15, 2009 at 6:14 PM, David Wilson  wrote:

>
> 2009/3/15 thuan :
> >
> > I know the topic is more about microblogging services than xmpp, but
> > by chance, have somebody achieved to install some comet/ajax push
> > applications? The technique might be used to speed up message display
> > for popular conversations as it is used for the chat function in
> > google mail.
>
> Since there are no blocking / sleeping primitives in AppEngine, it's
> not really possible without spinning and burning a tonne of CPU, or
> making lots of requests. You really need an external component for it.
>
> Hopefully the AppEngine XMPP support that is on the roadmap will
> include support for BOSH (),
> which would solve the Comet problem beautifully. :)
>
>
> David
>
> > >
> >
>
>
>
> --
> It is better to be wrong than to be vague.
>  — Freeman Dyson
>
> >
>




[google-appengine] Re: A question for Jaiku's developers, if they're watching..

2009-03-16 Thread David Wilson

Thanks for that Dan.

I just noticed that quite surprisingly, time.sleep() works.


David.
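
If sleep really is permitted, a crude short poll inside one request
might look like the sketch below (untested; check_for_new_message is a
hypothetical datastore lookup, and the request deadline still caps how
long you can wait):

import time
from google.appengine.ext import webapp

def check_for_new_message():
    # Hypothetical: query the datastore for anything newer than the
    # client's last-seen timestamp and return it, or None.
    return None

class PollHandler(webapp.RequestHandler):
    def get(self):
        deadline = time.time() + 8  # stay well under the request limit
        while time.time() < deadline:
            msg = check_for_new_message()
            if msg:
                self.response.out.write(msg)
                return
            time.sleep(1)  # relies on sleep being allowed in the sandbox
        self.response.set_status(204)  # nothing new; the client polls again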

2009/3/16 Dan Sanderson :
> The XMPP support mentioned on the roadmap does not include BOSH.
> -- Dan
>
> On Sun, Mar 15, 2009 at 6:14 PM, David Wilson  wrote:
>>
>> 2009/3/15 thuan :
>> >
>> > I know the topic is more about microblogging services than xmpp, but
>> > by chance, have somebody achieved to install some comet/ajax push
>> > applications? The technique might be used to speed up message display
>> > for popular conversations as it is used for the chat function in
>> > google mail.
>>
>> Since there are no blocking / sleeping primitives in AppEngine, it's
>> not really possible without spinning and burning a tonne of CPU, or
>> making lots of requests. You really need an external component for it.
>>
>> Hopefully the AppEngine XMPP support that is on the roadmap will
>> include support for BOSH (),
>> which would solve the Comet problem beautifully. :)
>>
>>
>> David
>>
>> > >
>> >
>>
>>
>>
>> --
>> It is better to be wrong than to be vague.
>>  — Freeman Dyson
>>
>>
>
>
> >
>



-- 
It is better to be wrong than to be vague.
  — Freeman Dyson




[google-appengine] Re: A question for Jaiku's developers, if they're watching..

2009-03-16 Thread thuan

> I just noticed that quite surprisingly, time.sleep() works.

There is a 10 second execution limit for all requests. This includes
requesting and submitting data to external services. Long polling
would not be that easy to sustain. I'm still looking for a way to
circumvent this limitation, if anybody has an idea...



[google-appengine] Re: A question for Jaiku's developers, if they're watching..

2009-03-16 Thread Andrew Badera

use a URL monitoring service to ping your URLs every n minutes ...
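
In other words, a plain request handler that does one small slice of
background work each time the external pinger hits it; a sketch with
made-up names:

from google.appengine.ext import db, webapp
from google.appengine.ext.webapp.util import run_wsgi_app

class WorkItem(db.Model):
    done = db.BooleanProperty(default=False)

class TickHandler(webapp.RequestHandler):
    # Point the monitoring service at /tick every few minutes.
    def get(self):
        batch = WorkItem.all().filter('done =', False).fetch(20)
        for item in batch:
            # ... do the real work for this item here ...
            item.done = True
        if batch:
            db.put(batch)
        self.response.out.write('processed %d items\n' % len(batch))

application = webapp.WSGIApplication([('/tick', TickHandler)])

def main():
    run_wsgi_app(application)

if __name__ == '__main__':
    main()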




On Mon, Mar 16, 2009 at 6:40 AM, thuan  wrote:
>
>> I just noticed that quite surprisingly, time.sleep() works.
>
> There is a 10 second execution limit for all requests. This includes
> requesting and submitting data to external services. Long polling
> would not be that easy to sustain. I'm still looking for a way to
> circumvent this limitation, if anybody has an idea...
> >
>
