I also remember hearing from a friend who runs KV stores in production (this is unverified, so don't quote me on it or come after me if I'm wrong) that certain distributed key/value stores actually slow down as the number of objects in the store grows, and Tokyo Tyrant was on his list. A key property of a scalable store is that it does not do this: performance should hold steady as the number of objects grows.

12,000 synchronous, serialized writes in a single sub-second request is pretty serious. I am not aware of a single website in the world that does this.

On Wed, Feb 24, 2010 at 11:35 AM, Jeff Schnitzer <j...@infohazard.org> wrote:

> I think this is actually an interesting question, and brings up a
> discussion worth having:
>
> Is datastore performance reasonable?
>
> I don't want to make this a discussion of reliability, which is a
> separate issue. It just seems to me that the datastore is actually
> kinda pokey, taking seconds to write a few hundred entities. When
> people benchmark Tokyo Tyrant, I hear numbers thrown around like
> 22,000 writes/second sustained across 1M records:
>
> http://blog.hunch.se/2009/02/28-tokyo-cabinet
>
> You might argue that the theoretical scalability of BigTable's
> distributed store is higher... but we're talking about two full orders
> of magnitude difference. Will I ever near the 100-google-server
> equivalent load? Could I pay for it if I did? 100 CPUs (measured)
> running for 1 month is about $7,200. Actual CPU speed is at least
> twice the measured rate, so a single Tokyo Tyrant is theoretically
> equivalent to almost $15,000/month of appengine hosting. Ouch.
>
> Maybe this isn't an apples to apples comparison. Sure, there aren't
> extra indexes on those Tyrant entities... but to be honest, few of my
> entities have extra indexes. What other factors could change this
> analysis?
>
> Thoughts?
>
> BTW Tim, you may very well have quite a few indexes on your entities.
> In JDO, nearly all single fields are indexed by default. You must
> explicitly add an annotation to your fields to make them unindexed.
> With Objectify, you can declare your entity as @Indexed or @Unindexed
> and then use the same annotation on individual fields to override the
> default.
>
> Jeff
>
> On Wed, Feb 24, 2010 at 12:43 AM, Tim Cooper <tco...@gmail.com> wrote:
> > I have been trying to write 12,000 objects in a single page request.
> > These objects are all very small and the total amount of memory is not
> > large. There is no index on these objects - the only GQL queries I
> > make on them are based on the primary key.
> >
> > Ikai has said: "That is - if you have to delete or create 150
> > persistent, indexed objects, you may want to rethink what problems you
> > are trying to solve."
> >
> > So I have been thinking about the problems I'm trying to solve,
> > including looking at the BuddyPoke blog and reading the GAE
> > documentation. I'm trying to populate the database with entries
> > relating to high school timetables.
> >
> > * I could do the writes asynchronously, but that looks like a lot of
> > additional effort. On my C++ app, writing the same information to my
> > laptop drive, this happens in under a second, because the amount of
> > data is actually quite small, but it times out on GAE.
> > * I am using pm.makePersistentAll(), but this doesn't help.
> > * There is no index on the objects - I access them only through the
> > primary key. (I'm pretty sure there's no index - but how can I
> > confirm this via the development server dashboard?)
> > * The objects constitute 12,000 entity groups. I could merge them
> > into fewer entity groups, but there are no natural groupings I could
> > use, so it could get quite complex to introduce a contrived grouping,
> > and also this would complicate the multi-user updating of the objects.
> > The AppEngine team seem to generally recommend using more entity
> > groups, but it's difficult to integrate that advice with the contrary
> > advice to use fewer entity groups for acceptable performance.
> > * I'd be happy if the GAE database was < 10 times slower than a
> > non-cloud RDBMS, but the way I'm using it, it's currently not.
> >
> > Does anyone have any advice?
> >
> > --
> > You received this message because you are subscribed to the Google Groups
> > "Google App Engine for Java" group.
> > To post to this group, send email to
> > google-appengine-j...@googlegroups.com.
> > To unsubscribe from this group, send email to
> > google-appengine-java+unsubscr...@googlegroups.com.
> > For more options, visit this group at
> > http://groups.google.com/group/google-appengine-java?hl=en.

--
Ikai Lan
Developer Programs Engineer, Google App Engine
http://googleappengine.blogspot.com | http://twitter.com/app_engine
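
The cost comparison in Jeff's message can be checked with a few lines of arithmetic. The sketch below is only an illustration. It assumes a 30-day month (720 hours) and a rate of 10 cents per measured CPU-hour. That rate is not quoted anywhere in the thread; it is simply what "100 CPUs (measured) running for 1 month is about $7,200" implies under the 30-day assumption. The class and method names are hypothetical.

```java
// Rough check of the cost figures quoted in the thread, using integer cents
// so that no floating-point rounding is involved.
public class CostEstimate {
    // Assumption: a 30-day month, i.e. 720 hours.
    public static final long HOURS_PER_MONTH = 30L * 24L;

    // Assumption: 10 cents per measured CPU-hour. This is the rate implied
    // by the "$7,200 for 100 CPUs for 1 month" figure; the thread does not
    // state it directly.
    public static final long CENTS_PER_CPU_HOUR = 10L;

    // Monthly cost, in cents, of running the given number of CPUs non-stop.
    public static long monthlyCostCents(long cpus) {
        return cpus * HOURS_PER_MONTH * CENTS_PER_CPU_HOUR;
    }

    // Write rate left after dividing a benchmark rate by a slowdown factor,
    // e.g. 22,000 writes/second and "two full orders of magnitude" (100).
    public static long scaledWritesPerSecond(long benchmarkRate, long factor) {
        return benchmarkRate / factor;
    }

    public static void main(String[] args) {
        long measured = monthlyCostCents(100);
        // "Actual CPU speed is at least twice the measured rate"
        long doubled = 2 * measured;

        System.out.println("100 measured CPUs for a month: $" + measured / 100);
        System.out.println("Doubled for actual CPU speed:  $" + doubled / 100);
        System.out.println("Writes/second at 1/100 of the Tyrant figure: "
                + scaledWritesPerSecond(22_000, 100));
    }
}
```

Under these assumptions the doubled figure comes to $14,400, which matches the "almost $15,000/month" in the message, and two orders of magnitude below 22,000 writes/second is roughly 220 writes/second.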