On Mon, Oct 3, 2011 at 9:24 AM, Mat Jaggard <[email protected]> wrote:
> Jeff - I'm a bit confused. I thought that the whole idea of the
> datastore was that you could read or write as much as you want, as
> fast as you want as long as they are not related? So one datastore
> write per vote (and being written to different entity groups) should
> be fine? I thought that the system just split tablets if they were
> being accessed too much - so as long as the traffic didn't suddenly
> increase, there'd be no scalability issues apart from cost.
"apart from cost" he says :-) The OP posited millions of users and millions of things to vote for. Each million votes will cost you (at minimum) $1.70 for one write + one read, but it'll probably be more depending on how many page views you have and what caching strategy you have. Still, maybe this is no big deal. The bigger problem though is that vote traffic is likely to be focused on a handful of items. Popular things might get thousands of votes per second, unpopular things won't be voted for at all. It's hard to come up with a sharding strategy that works well for this - you probably don't want 1k shards for everything, storage costs go up and expense/latency of calculating totals goes up. I have to deal with a similar problem myself right now (with the added constraint that I need an instantaneously precise count). I'm considering a system that automatically tracks latency and increases the shard count when it crosses a threshold. It's not a pretty problem to solve. Jeff -- You received this message because you are subscribed to the Google Groups "Google App Engine for Java" group. To post to this group, send email to [email protected]. To unsubscribe from this group, send email to [email protected]. For more options, visit this group at http://groups.google.com/group/google-appengine-java?hl=en.
