I am really stoked by the fact that it is now open.  I love seeing how other
people develop software.

Paul.

2009/3/14 peterk <peter.ke...@gmail.com>

>
> Just a heads-up - Jaiku has gone open source :)
>
> http://code.google.com/p/jaikuengine/
>
> At a very brief first glance, I see references to xmpp stuff and
> more..going to try and map out the code and see what goodies might be
> there, could be stuff of interest beyond pub/sub too.
>
> On Mar 13, 1:28 pm, peterk <peter.ke...@gmail.com> wrote:
> > Unfortunately I do need to query them based on subscriber_id..so I
> > can't pack them into a non-indexed property.
> >
> > Retrieving the updates a particular user has subscribed to is blazingly
> > fast though...that's the gain in the end, I can query and fetch 1000
> > updates for a user sorted by date in 20-30ms-cpu. Love that :p In my
> > hacky approaches previously where I tried to write once and then
> > 'gather', I had to do lots of in-memory sorting and stuff, and even
> > the results often wouldn't be totally accurate.
> >
> > I'm going to keep toying with the write end of things though..because
> > in my full app, I may need to write to other entities along with
> > subscribers to do certain things I'm trying to achieve. So I'm going
> > to be looking for every opportunity possible to optimise the cost of
> > an 'update', which in my case may go beyond notifying subscribers. So
> > any thoughts/ideas on further optimisation are more than welcome!!
> >
> > @Paul
> >
> > If you've more subscribers than will fit in one 'group' you'll need
> > multiple groups, correct. So you'll have n writes, where n = number of
> > subscribers/group-size, rounded up to the nearest whole number. Even
> > with the costly index creation for each of these 'group' entities
> > though, it should still work out a fair bit cheaper than writing a
> > separate entity for each subscriber.
> >
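The arithmetic in the reply to Paul above can be sketched in a few lines of Python. This is only an illustration: `GROUP_SIZE`, `writes_needed` and `split_into_groups` are hypothetical names, and the group size of 1000 is an assumed value, not a measured optimum.

```python
GROUP_SIZE = 1000  # assumed group size, for illustration only


def writes_needed(subscriber_count, group_size=GROUP_SIZE):
    """Number of 'group' entities to write: subscribers / group size, rounded up."""
    # Integer ceiling division, so no floating point is involved.
    return -(-subscriber_count // group_size)


def split_into_groups(subscriber_ids, group_size=GROUP_SIZE):
    """Split the subscriber IDs into consecutive groups of at most group_size."""
    return [subscriber_ids[i:i + group_size]
            for i in range(0, len(subscriber_ids), group_size)]


groups = split_into_groups(list(range(2500)))
print(len(groups), [len(g) for g in groups])  # 3 [1000, 1000, 500]
```

With 2500 subscribers and groups of 1000, that is 3 entity writes instead of 2500, although each of those writes still carries the index cost discussed in the thread.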
> > On Mar 13, 11:47 am, bFlood <bflood...@gmail.com> wrote:
> >
> > > @peterk - if you don't need to query by the subscriber, you could
> > > alternatively pack the list of subscribers for a feed into a
> > > TextProperty so it is not indexed. I use TextProperty a lot to store
> > > large lists of geometry data and they work out pretty well
> >
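As a rough sketch of the packing idea above, assuming integer subscriber IDs and a newline-separated encoding (both are assumptions, as are the helper names `pack_subscribers` and `unpack_subscribers`), the list can be serialized into one string that would be stored in an unindexed text field:

```python
def pack_subscribers(subscriber_ids):
    """Serialize the IDs into one newline-separated string (to be stored unindexed)."""
    return "\n".join(str(s) for s in subscriber_ids)


def unpack_subscribers(blob):
    """Recover the IDs; membership can only be checked after loading the blob."""
    return [int(line) for line in blob.splitlines()]


blob = pack_subscribers([101, 202, 303])
ids = unpack_subscribers(blob)
print(202 in ids)  # True
```

The trade-off is the one raised in the reply: with no index on the packed value, there is no way to query for all entities that mention a given subscriber.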
> > > @brett - async! looking forward to it in future GAE builds. thanks
> >
> > > cheers
> > > brian
> >
> > > On Mar 13, 5:37 am, peterk <peter.ke...@gmail.com> wrote:
> >
> > > > I was just toying around with this idea yesterday Brett.. :D I did
> > > > some profiling, and it would reduce the write cost per subscriber to
> > > > about 24ms-40ms (depending on the number of subscribers you have..more
> > > > = lower cost per avg), from 100-150ms. These are rough numbers with
> > > > entities I was using, I have to do some more accurate profiling..
> >
> > > > When I first thought about doing this, I was thinking ":o I'll reduce
> > > > write cost by a factor of hundreds!", but as it turns out, the extra
> > > > index update time for an entity with a large number of list property
> > > > entries eats into that saving significantly.
> >
> > > > But it still is a saving. Funnily enough the per subscriber saving
> > > > increases (to a point) the more subscribers you have.
> >
> > > > I'm not sure if there's anything one can do to optimise index creation
> > > > time with large lists.. I'm going to do some more work as well to see
> > > > if there's an optimum 'batch size' for grouping subscribers
> > > > together..at first blush, as mentioned above, it seems the larger the
> > > > better (up to the per entity property/index cap of course).
> >
> > > > Thanks also for the insight on pubsubhubbub..I eagerly await updates on
> > > > that front :) Thank you!!
> >
> > > > On Mar 13, 8:05 am, Paul Kinlan <paul.kin...@gmail.com> wrote:
> >
> > > > > Just Curious,
> >
> > > > > For other pub/sub-style systems where you want to write to the
> > > > > Datastore, the trick is to use list properties to track the
> > > > > subscribers you've published to. So for instance, instead of writing a
> > > > > single entity per subscriber, you write one entity with 1000-2000
> > > > > subscriber IDs in a list. Then all queries for that list with an
> > > > > equals filter for the subscriber will show the entity. This lets you
> > > > > pack a lot of information into a single entity write, thus minimizing
> > > > > Datastore overhead, cost, etc. Does that make sense?
> >
> > > > > So if you have more subscribers than the 5000 limit, would you write the
> > > > > entity twice, each with different subscriber IDs?
> >
> > > > > Paul
> >
> > > > > 2009/3/13 Brett Slatkin <brett-appeng...@google.com>
> >
> > > > > > Heyo,
> >
> > > > > > Good finds, peterk!
> >
> > > > > > pubsubhubbub uses some of the same techniques that Jaiku uses for
> > > > > > doing one-to-many fan-out of status message updates. The migration is
> > > > > > underway as we speak
> > > > > > (http://www.jaiku.com/blog/2009/03/11/upcoming-service-break/). I
> > > > > > believe the code should be available very soon.
> >
> > > > > > 2009/3/11 peterk <peter.ke...@gmail.com>:
> >
> > > > > > > The app is actually live here:
> >
> > > > > > > http://pubsubhubbub.appspot.com/
> > > > > > > http://pubsubhubbub-subscriber.appspot.com/
> >
> > > > > > > (pubsubhubbub-publisher isn't there, but it's trivial to upload your
> > > > > > > own.)
> >
> > > > > > > This suggests it's working on appengine as it is now. Been looking
> > > > > > > through the source, and I'm not entirely clear on how the 'background
> > > > > > > workers' are actually working..there are two, one for pulling updates
> > > > > > > to feeds from publishers, and one for propagating updates to
> > > > > > > subscribers in batches.
> >
> > > > > > > But like I say, I can't see how they're actually started and running
> > > > > > > constantly.  There is a video here of a live demonstration:
> >
> > > > > > > http://www.veodia.com/player.php?vid=fCNU1qQ1oSs
> >
> > > > > > > The background workers seem to be behaving as desired there, but I'm
> > > > > > > not sure if they were just constantly polling some urls to keep the
> > > > > > > workers live for the purposes of that demo, or if they're actually
> > > > > > > running somehow constantly on their own.. I can't actually get the
> > > > > > > live app at the urls above to work, but not sure if it's because
> > > > > > > background workers aren't really working, or because I'm feeding it
> > > > > > > incorrect urls/configuration etc.
> >
> > > > > > Ah sorry yeah I still have the old version of the source running on
> > > > > > pubsubhubbub.appspot.com; I need to update that with a more recent
> > > > > > build. Sorry for the trouble! It's still not quite ready for
> > > > > > widespread use, but it should be soon.
> >
> > > > > > The way pubsubhubbub does fan-out, there's no need to write an entity
> > > > > > for each subscriber of a feed. Instead, each time it consumes a task
> > > > > > from the work queue it will update the current iterator position in
> > > > > > the query result of subscribers for a URL. Subsequent work requests
> > > > > > will offset into the subscribers starting at the iterator position.
> > > > > > This works well in this case because it's using urlfetch to actually
> > > > > > notify subscribers, instead of writing to the Datastore.
> >
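The iterator-position idea above can be approximated with a small in-memory simulation. This is a sketch under stated assumptions, not the pubsubhubbub code: the names (`process_task`, `notify`, `BATCH_SIZE`), the example feed URL, and the choice to carry the offset inside the task tuple are all hypothetical, and `notify` stands in for an outbound urlfetch call.

```python
from collections import deque

BATCH_SIZE = 2  # assumed; kept small so the example is easy to follow


def notify(subscriber, update, delivered):
    # Placeholder for an outbound HTTP call (urlfetch in the setting above).
    delivered.append((subscriber, update))


def process_task(task, subscribers_for, queue, delivered):
    """Handle one work item: notify one batch, then re-enqueue at the new offset."""
    url, update, offset = task
    subscribers = subscribers_for(url)
    batch = subscribers[offset:offset + BATCH_SIZE]
    for subscriber in batch:
        notify(subscriber, update, delivered)
    next_offset = offset + len(batch)
    if next_offset < len(subscribers):
        # More subscribers remain: the next work request starts where this one stopped.
        queue.append((url, update, next_offset))


subscriptions = {"http://example.com/feed": ["a", "b", "c", "d", "e"]}
queue = deque([("http://example.com/feed", "new post", 0)])
delivered = []
tasks_run = 0
while queue:
    process_task(queue.popleft(), lambda url: subscriptions[url], queue, delivered)
    tasks_run += 1
print(tasks_run, [s for s, _ in delivered])  # 3 ['a', 'b', 'c', 'd', 'e']
```

Nothing is written per subscriber here; the only state that advances is the offset, which is the point made in the paragraph above.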
> > > > > > For other pub/sub-style systems where you want to write to the
> > > > > > Datastore, the trick is to use list properties to track the
> > > > > > subscribers you've published to. So for instance, instead of writing a
> > > > > > single entity per subscriber, you write one entity with 1000-2000
> > > > > > subscriber IDs in a list. Then all queries for that list with an
> > > > > > equals filter for the subscriber will show the entity. This lets you
> > > > > > pack a lot of information into a single entity write, thus minimizing
> > > > > > Datastore overhead, cost, etc. Does that make sense?
> >
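To make the list-property trick above concrete without depending on the App Engine SDK, here is a minimal in-memory model. `FakeDatastore` and its methods are hypothetical stand-ins, not the Datastore API; the sketch assumes only the behaviour described above, namely that an equals filter on a list property matches an entity when any element of the list equals the value.

```python
class FakeDatastore:
    """In-memory stand-in: every value of a list property gets its own index entry."""

    def __init__(self):
        self.entities = []
        self.index = {}          # subscriber ID -> positions of matching entities
        self.entity_writes = 0
        self.index_entries = 0

    def put(self, payload, subscriber_ids):
        ids = list(subscriber_ids)
        self.entities.append({"payload": payload, "subscriber_ids": ids})
        self.entity_writes += 1
        position = len(self.entities) - 1
        for subscriber_id in set(ids):
            self.index.setdefault(subscriber_id, []).append(position)
            self.index_entries += 1

    def query_equals(self, subscriber_id):
        """Equality filter on the list property: match if any element equals the ID."""
        return [self.entities[p]["payload"] for p in self.index.get(subscriber_id, [])]


store = FakeDatastore()
store.put("status update 1", range(1, 1001))  # one write covers 1000 subscribers
store.put("status update 2", [7, 42])
print(store.entity_writes, store.index_entries)  # 2 1002
print(store.query_equals(42))  # ['status update 1', 'status update 2']
```

The two counters mirror the trade-off reported earlier in the thread: entity writes drop sharply, but the number of index entries still grows with the length of each list.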
> > > > > > @bFlood: Indeed, the async_apiproxy.py code is interesting. Not much
> > > > > > to say about that at this time, besides the fact that it works. =)
> >
> > > > > > -Brett
> >
>
