Hey Joe,

Thanks for your thoughts!

The reason I was initially thinking of having 'n' groups per user was
that I was going to treat groups of friends as groups. That is, while
you may be a member of only a few 'named' groups ('moms', 'knitters',
'chefs', for example), you'd also be a member of each of your friends'
'friends groups'. On a busy site, you could have a great many friends:
thousands, perhaps hundreds of thousands if you were really popular.
Every time you scored a product, you'd have to ping all your friends
too, as well as your few 'named' groups.

So then I got to thinking: rather than pinging all your friends every
time you score a product, flip it around, so that when you want to see
what your friends have been up to (with regard to scoring), you do
lots of reads to generate the list of products your friends have been
scoring. It wouldn't have to be an all-time list, just recent scoring,
so I was thinking:

1) Every time a user scores a product, record it as a 'score', with
the user, the product, and the score (plus a date, since step 2 orders
by it)
2) To view your friends' recent 'cool products' based on their
scoring, fetch

    scores.all().filter('user IN', me.friends).order('-date').fetch(limit=100)

where me.friends is a list property holding the user keys of your
friends, and where I want just the 100 most recent scores (hence the
date ordering and fetch limit). I'd then go through these 100
entities, extract the products, and sum up the scores for each product
in this list, to get a rough approximation of what friends are rating
well recently. I capped it at 100 so as not to spend too much time in
the request doing additions. I did a test doing 100 additions along
with some datastore fetches etc., and the request time was acceptable.
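
A minimal sketch of the tallying in step 2, in plain Python so it runs
without the datastore. It assumes each fetched score has already been
reduced to a (product, score) pair; the function name summarize_scores
is made up for illustration and is not part of any App Engine API.

```python
from collections import defaultdict


def summarize_scores(recent_scores):
    """Sum scores per product and rank products by total, highest first.

    `recent_scores` is an iterable of (product, score) pairs, standing in
    for the (up to) 100 most recent score entities from the query.
    """
    totals = defaultdict(int)
    for product, score in recent_scores:
        totals[product] += score
    # Rank products by their summed score, best first.
    return sorted(totals.items(), key=lambda item: item[1], reverse=True)
```

With at most 100 pairs this is a single pass plus a small sort, which
is consistent with the request time staying acceptable in the test
mentioned above.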

This seemed to work well in my head, but coding it up, I run into a
'too many subqueries (max: 30)' error, because I have 50 friends in my
friends list. So it seems it was a naive idea to try to search through
scores where user IN my list of friends.
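
One possible way around the cap, sketched here as an untested idea
rather than a recommended fix: split the friends list into chunks of
at most 30, run the query once per chunk, and merge the results by
date. The sketch is written for current Python 3, not for the App
Engine runtime. run_query is a hypothetical callback standing in for
the 'user IN' query, and scores are assumed to be dicts with a 'date'
key.

```python
import heapq
from itertools import islice

MAX_IN_VALUES = 30  # cap taken from the 'too many subqueries (max: 30)' error


def chunks(items, size):
    """Yield consecutive slices of `items`, each at most `size` long."""
    for start in range(0, len(items), size):
        yield items[start:start + size]


def recent_friend_scores(friends, run_query, limit=100):
    """Return the `limit` newest scores across all friends.

    `run_query(friend_chunk, limit)` is a hypothetical callback that must
    return that chunk's scores sorted newest first, for example by
    wrapping the 'user IN' query with a chunk of at most 30 keys.
    """
    batches = [run_query(chunk, limit)
               for chunk in chunks(friends, MAX_IN_VALUES)]
    # Each batch is already newest-first, so a k-way merge keeps that order.
    merged = heapq.merge(*batches, key=lambda s: s["date"], reverse=True)
    return list(islice(merged, limit))
```

The cost is one query per 30 friends, so this only postpones the
problem: with thousands of friends the number of reads per page view
grows accordingly.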

I think I've gone so deep down this rabbit hole that I can't see
whether there's an obvious solution. Is there a way I can get my
friends, then get their 100 most recent scores? If I were to set up an
m:m relationship between me and other users to track my friends, and
then a 1:m relationship between users and scores, would I be able to
query me.friends.scores?

Long-winded..sorry..thank you again for your thoughts :)



On Feb 28, 5:30 pm, Qian Qiao <qian.q...@gmail.com> wrote:
> On Sat, Feb 28, 2009 at 20:20, peterk <peter.ke...@gmail.com> wrote:
>
> > It might be because it's a Saturday morning and I'm not thinking very
> > clearly, but I can't really seem to find a good solution to this
> > problem, or specifically one that might work well on GAE.
>
> > The scenario is this (bear with me, it might sound a bit silly, but
> > within the context of my application it makes sense):
>
> > Say we have users. And we have products. And say users can rate
> > products, so each product has a score.
>
> > So far, so good. But now, say users can be members of groups. N
> > groups. And say for each group I want to track the group's score for a
> > given product.
>
> > So then when a user scores a product, aside from the overall score, n
> > scores need to be updated - n being the number of groups the user is a
> > part of.
>
> > So, I want to be able to answer queries like - what are the highest
> > rated products among all users? Among moms? Among my friends?
>
> > At first blush, the naive (?) way would be for each product to have
> > multiple scores, one for each group, and when a user scores a product,
> > to update the scores for the groups it belongs to.
>
> > But that could be an awful lot of writes (bad). Or could batch writes
> > come to the rescue here? Looking at the docs it's not clear if batch
> > writes actually minimise the cost of multiple writes versus multiple
> > put() calls. Is there any way to accommodate a scenario like this with
> > a minimum, and constant, number of writes, so that no matter how
> > popular the app gets and how many groups a user is a member of, the
> > cost of scoring a product scales in an OK way?
>
> > Thanks for any input, apologies again if there's an obvious answer
> > here.. :)
>
> I imagine the scores don't need to be perfectly accurate, off by one
> or two on a busy site is acceptable isn't it? So the sharded counter
> approach for scores might be worth looking at.
>
> And since the scores don't need to be perfectly accurate, you don't
> need a transaction, you can just update all the score counters in a
> batch update.
>
> Realistically, a user isn't going to be in more than 20 groups is he?
> Even for 20 groups, that's 21 writes, which could be distributed to
> several servers and run in parallel with batch write, it shouldn't
> take too long I imagine.
>
> Just my $0.02 (which is worth 200 emails with the current app engine pricing).
>
> -- Joe
>
> --
> A wise man once said: never share your wisdom with others.