Just as an update: I've modified my query so that if you have more
than 30 friends, it splits the friends list into chunks of 30 and does
multiple queries.

So it gets the last 100 scores from your first 30 friends, then the
last 100 scores from your next 30 friends, and so on, and then sums up
the scores for each product.
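
Roughly, the chunking boils down to something like this (just a
sketch - it assumes a Score model with user, product, score and date
properties, as per steps 1 and 2 further down, and the helper names
are mine):

    from google.appengine.ext import db

    CHUNK_SIZE = 30    # the datastore's 'IN' filter is limited to 30 subqueries
    PER_CHUNK = 100    # last N scores fetched per chunk of friends

    def friend_score_totals(me):
        """Query each chunk of 30 friends separately and total the scores per product."""
        totals = {}
        for i in range(0, len(me.friends), CHUNK_SIZE):
            chunk = me.friends[i:i + CHUNK_SIZE]
            scores = (Score.all()
                           .filter('user IN', chunk)
                           .order('-date')
                           .fetch(PER_CHUNK))
            for s in scores:
                # group by the product's key without dereferencing the reference
                product_key = Score.product.get_value_for_datastore(s)
                totals[product_key] = totals.get(product_key, 0) + s.score
        return totals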

Performance-wise, for 50 friends at least, this might be OK: all of
this takes about 3.5 seconds. Not ideal, but OK I guess, and I can
cache the results for a few minutes at least.
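
For the caching bit, a minimal memcache sketch (the key format and the
five-minute expiry are just what I'd try first):

    from google.appengine.api import memcache

    def cached_friend_score_totals(me):
        cache_key = 'friend-score-totals:%s' % me.key()
        totals = memcache.get(cache_key)
        if totals is None:
            totals = friend_score_totals(me)           # the chunked queries above
            memcache.set(cache_key, totals, time=300)  # keep for 5 minutes
        return totals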

However, I'm conscious of the fact that the combined results from
chunks of 30 friends won't necessarily be the same as the results from
all 50 friends queried together.
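
One way to make the chunked version agree with a single query over all
50 friends, I think, is to merge the per-chunk results and only total
the newest 100 overall (a sketch, same assumptions as above):

    import heapq

    def friend_score_totals_merged(me, limit=100, chunk_size=30):
        recent = []
        for i in range(0, len(me.friends), chunk_size):
            chunk = me.friends[i:i + chunk_size]
            recent.extend(Score.all()
                               .filter('user IN', chunk)
                               .order('-date')
                               .fetch(limit))
        # keep only the newest `limit` scores across all chunks, then total them
        newest = heapq.nlargest(limit, recent, key=lambda s: s.date)
        totals = {}
        for s in newest:
            product_key = Score.product.get_value_for_datastore(s)
            totals[product_key] = totals.get(product_key, 0) + s.score
        return totals

Each chunk still over-fetches a bit, but any score in the true newest
100 is necessarily in its own chunk's newest 100, so the final totals
should match the single-query result.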

I would welcome any further thoughts on efficient ways of finding the
latest 'content' (whatever it might be) generated by a user's friends.
As a general problem it's quite a common case in social networking, so
finding good strategies for addressing it on GAE might help a lot of
people. The above may not be the best approach; maybe I need to flip
my thinking on the whole problem.

Thanks again :)

On Feb 28, 7:23 pm, peterk <peter.ke...@gmail.com> wrote:
> Hey Joe,
>
> Thanks for your thoughts!
>
> The reason I was initially thinking of having 'n' groups per user was
> that I was going to treat groups of friends as groups. That is, while
> you may be a member of only a few 'named' groups ('moms', 'knitters',
> 'chefs', for example), you'd also be a member of all your friends'
> 'friends' groups. On a busy site you could have many, many friends -
> thousands, perhaps hundreds of thousands if you were really popular.
> Every time you score a product, you'd have to ping all your friends
> too, as well as your few 'named' groups.
>
> So then I got to thinking that, rather than pinging all your friends
> every time you score a product, I could flip it around, so that when
> you want to see what your friends have been up to (with regard to
> scoring), you do lots of reads to generate the list of products your
> friends have been scoring. It wouldn't have to be an all-time list,
> just recent scoring, so I was thinking:
>
> 1) Every time a user scores a product, record it as a 'score', with
> the user, the product, and the score.
> 2) To view your friends' recent 'cool products' vis-a-vis their
> scoring, fetch scores.all().filter('user IN', me.friends).order('-
> date').fetch(limit=100) - where me.friends is a list property of your
> friends' user keys, and where I only want the most recent 100 (hence
> the date ordering and fetch limit). I'd then go through these 100
> entities, extract the products, and sum up the scores in that list to
> get a rough-ish approximation of what friends have been rating well
> recently. I capped it at 100 so as not to spend too much time in the
> request doing additions; I did a test doing 100 additions along with
> some datastore fetches, and the request time was acceptable.
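
(For reference, step 1 as a model might look something like this - a
sketch, with the property types and collection names being my guesses -
plus the composite index that the filter-on-user, order-by-date query
in step 2 needs:)

    from google.appengine.ext import db

    class Score(db.Model):
        user = db.ReferenceProperty(collection_name='scores')         # who scored
        product = db.ReferenceProperty(collection_name='scores_for')  # what was scored
        score = db.IntegerProperty(required=True)
        date = db.DateTimeProperty(auto_now_add=True)

The 'user IN' filter expands into equality filters, and combined with
the descending date sort that needs an entry in index.yaml:

    indexes:
    - kind: Score
      properties:
      - name: user
      - name: date
        direction: desc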
>
> This sort of seemed to work well in my head, but coding it up, I ran
> into a 'too many subqueries (max: 30)' error because I have 50 friends
> in my friends list. So it seems it was a naive idea to try to search
> through scores where user is IN my list of friends.
>
> I think I've gone so deep down this rabbit hole that I can't see if
> there's an obvious solution. Is there a way I can get my friends and
> then get their 100 most recent scores? If I were to do an m:m
> relationship between me and other users to track my friends, and then
> a 1:m relationship between users and scores, would I be able to query
> me.friends.scores?
>
> Long-winded, sorry... thank you again for your thoughts :)
>
> On Feb 28, 5:30 pm, Qian Qiao <qian.q...@gmail.com> wrote:
>
> > On Sat, Feb 28, 2009 at 20:20, peterk <peter.ke...@gmail.com> wrote:
>
> > > It might be because it's a Saturday morning and I'm not thinking very
> > > clearly, but I can't really seem to find a good solution to this
> > > problem, or specifically one that might work well on GAE.
>
> > > The scenario is this (bear with me, it might sound a bit silly, but
> > > within the context of my application it makes sense):
>
> > > Say we have users. And we have products. And say users can rate
> > > products, so each product has a score.
>
> > > So far, so good. But now, say users can be members of groups. N
> > > groups. And say for each group I want to track the group's score for a
> > > given product.
>
> > > So then when a user scores a product, aside from the overall score, n
> > > scores need to be updated - n being the number of groups the user is a
> > > part of.
>
> > > So, I want to be able to answer queries like - what are the highest
> > > rated products among all users? Among moms? Among my friends?
>
> > > At first blush, the naive (?) way would be for each product to have
> > > multiple scores, one for each group, and when a user scores a product,
> > > to update the scores for the groups it belongs to.
>
> > > But that could be an awful lot of writes (bad). Or could batch writes
> > > come to the rescue here? Looking at the docs, it's not clear whether
> > > batch writes actually minimise the cost of multiple writes versus
> > > multiple put() calls. Is there any way to accommodate a scenario like
> > > this with a minimum, and constant, number of writes, so that no matter
> > > how popular the app gets and how many groups a user is a member of,
> > > the cost of scoring a product scales in an OK way?
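
(Just to make the 'naive' approach concrete for anyone following
along - a sketch with my own naming: one aggregate entity per
group/product pair, all written back in a single batch put:)

    from google.appengine.ext import db

    class GroupScore(db.Model):
        # one running total per (group, product) pair, keyed so it can be fetched directly
        total = db.IntegerProperty(default=0)
        count = db.IntegerProperty(default=0)

    def record_score(group_keys, product_key, value):
        """group_keys: keys of every group the scoring user belongs to."""
        updated = []
        for gkey in group_keys:
            gs = GroupScore.get_or_insert('%s|%s' % (gkey, product_key))
            gs.total += value
            gs.count += 1
            updated.append(gs)
        db.put(updated)   # one batch call, but still n entity writes

The batch put is a single RPC, but it is still n entity writes, and
the read-modify-write above isn't transactional - which is where the
sharded counter suggestion below comes in.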
>
> > > Thanks for any input, apologies again if there's an obvious answer
> > > here.. :)
>
> > I imagine the scores don't need to be perfectly accurate - being off
> > by one or two on a busy site is acceptable, isn't it? So the sharded
> > counter approach for scores might be worth looking at.
>
> > And since the scores don't need to be perfectly accurate, you don't
> > need a transaction, you can just update all the score counters in a
> > batch update.
>
> > Realistically, a user isn't going to be in more than 20 groups, is
> > he? Even for 20 groups, that's 21 writes, which could be distributed
> > across several servers and run in parallel with a batch write; it
> > shouldn't take too long, I imagine.
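
As I understand the sharded counter idea, it'd look roughly like this
(a sketch - the shard count and key naming are arbitrary, and per
Joe's point I've skipped transactions since exact accuracy isn't
critical here):

    import random
    from google.appengine.ext import db

    NUM_SHARDS = 20

    class ScoreShard(db.Model):
        name = db.StringProperty(required=True)   # '<group key>|<product key>'
        total = db.IntegerProperty(default=0)

    def bump_group_scores(group_keys, product_key, value):
        """Add `value` to one randomly chosen shard per group, in one batch write."""
        shards = []
        for gkey in group_keys:
            name = '%s|%s' % (gkey, product_key)
            shard_id = '%s|%d' % (name, random.randint(0, NUM_SHARDS - 1))
            shard = ScoreShard.get_or_insert(shard_id, name=name)
            shard.total += value
            shards.append(shard)
        db.put(shards)

    def group_score(group_key, product_key):
        """Read a group's score for a product by summing its shards."""
        name = '%s|%s' % (group_key, product_key)
        return sum(s.total for s in ScoreShard.all().filter('name =', name))

Writes then spread across shards instead of hammering one entity, and
the read side (one query per group) is something I could cache the
same way as the friend-scores result above.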
>
> > Just my $0.02 (which is worth 200 emails with the current app engine 
> > pricing).
>
> > -- Joe
>
> > --
> > A wise man once said: never share your wisdom with others.