My experience with a relatively simple application using JDO makePersistentAll() was that I got Datastore Operation Timeout exceptions with batch sizes of approximately 200-300 objects ...
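A minimal sketch of one way to keep each call below that size: split the list into smaller batches and persist them one batch at a time. The class name `BatchWriter`, its method names, and the batch size of 100 used in the note below are all assumptions made for illustration, not documented limits or part of any App Engine API. The `persistBatch` callback stands in for the real JDO call.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

// Hypothetical helper: not part of JDO or the App Engine SDK.
class BatchWriter {

    /** Splits items into consecutive batches of at most batchSize elements. */
    static <T> List<List<T>> partition(List<T> items, int batchSize) {
        if (batchSize <= 0) {
            throw new IllegalArgumentException("batchSize must be positive");
        }
        List<List<T>> batches = new ArrayList<>();
        for (int start = 0; start < items.size(); start += batchSize) {
            int end = Math.min(start + batchSize, items.size());
            // Copy the view so each batch is independent of the source list.
            batches.add(new ArrayList<>(items.subList(start, end)));
        }
        return batches;
    }

    /**
     * Hands each batch to persistBatch in order and returns the number of calls made.
     * In a JDO application the callback could wrap pm.makePersistentAll(batch).
     */
    static <T> int writeInBatches(List<T> items, int batchSize, Consumer<List<T>> persistBatch) {
        int calls = 0;
        for (List<T> batch : partition(items, batchSize)) {
            persistBatch.accept(batch);
            calls++;
        }
        return calls;
    }
}
```

With a JDO `PersistenceManager` named `pm`, this could be called as `BatchWriter.writeInBatches(objects, 100, batch -> pm.makePersistentAll(batch))`. Smaller batches only reduce the chance of an individual call timing out; they do not make 12,000 writes fit into a single request, so the batches may still need to be spread over several requests.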

On Feb 24, 1:48 pm, Guillermo Schwarz <guillermo.schw...@gmail.com> wrote:
> I think we can safely assume that the programmer was trying to speed things up a little by writing 12 thousand objects in a single operation.
>
> Now whether that turns out faster or slower than writing each object separately is a matter of the internal implementation of the data store. I prefer to do no hacks, but OTOH it is sometimes better to be clear about what you want (API wise).
>
> The point here is that the programmer wants to insert 15 thousand objects in a second, and you seem to imply that is possible:
> "While it's an interesting thought exercise to see if BigTable can do it through App Engine's interface (hint: it can, globally, easily)".
>
> I rest my case ;-)
>
> Do we need to do anything to test that? Is there anything we could do to help?
>
> Cheers,
> Guillermo.
>
> On 24 feb, 18:06, "Ikai L (Google)" <ika...@google.com> wrote:
> > Simple key-only writes can definitely do it, but there are a few places where you can introduce overhead:
> >
> > - serialization
> > - network I/O
> > - indexes
> >
> > My point wasn't necessarily that it wasn't possible. makePersistentAll does use a batch write, and there are definitely sites that can do 12,000+ writes a second (and well above that), but I don't know of any that will attempt to do that in a single request. While it's an interesting thought exercise to see if BigTable can do it through App Engine's interface (hint: it can, globally, easily), I can't think of a single use case for a site to need to do this all the time and with the sub-second requirement. I think it's reasonable to ask why this design exists and why the requirements exist and rethink one or the other.
> >
> > On Wed, Feb 24, 2010 at 12:35 PM, Guillermo Schwarz <guillermo.schw...@gmail.com> wrote:
> > > Ikai,
> > >
> > > Maybe you are right. Maybe not.
> > > I'm not an expert in datastore internals, but here is my point of view.
> > >
> > > This paper claims that Berkeley DB Java edition can insert about 15,000 records per second.
> > >
> > > http://www.oracle.com/database/docs/bdb-je-architecture-whitepaper.pdf
> > >
> > > The graphic is on page 22. The main reason they claim to be able to do that is that they don't need to actually sync the write to disk, they can queue the write, update in-memory data and write a log file. Writing the log file is for transactional purposes and it is the only write really needed. That is pretty fast.
> > >
> > > Cheers,
> > > Guillermo.
> > >
> > > On 24 feb, 16:51, "Ikai L (Google)" <ika...@google.com> wrote:
> > > > I also remember hearing (and this is not verified so don't quote me on this or come after me if I'm wrong) from a friend of mine running KV stores in production that there were issues with certain distributed key/value stores that actually managed to slow down as a function of the number of objects in the store - and Tokyo Tyrant was on his list. A key property of scalable stores is that the opposite of this is true.
> > > >
> > > > 12,000 synchronous, serialized writes in a single sub-second request is pretty serious. I am not aware of a single website in the world that does this.
> > > >
> > > > On Wed, Feb 24, 2010 at 11:35 AM, Jeff Schnitzer <j...@infohazard.org> wrote:
> > > > > I think this is actually an interesting question, and brings up a discussion worth having:
> > > > >
> > > > > Is datastore performance reasonable?
> > > > >
> > > > > I don't want to make this a discussion of reliability, which is a separate issue. It just seems to me that the datastore is actually kinda pokey, taking seconds to write a few hundred entities.
> > > > > When people benchmark Tokyo Tyrant, I hear numbers thrown around like 22,000 writes/second sustained across 1M records:
> > > > >
> > > > > http://blog.hunch.se/2009/02/28-tokyo-cabinet
> > > > >
> > > > > You might argue that the theoretical scalability of BigTable's distributed store is higher... but we're talking about two full orders of magnitude difference. Will I ever near the 100-google-server equivalent load? Could I pay for it if I did? 100 CPUs (measured) running for 1 month is about $7,200. Actual CPU speed is at least twice the measured rate, so a single Tokyo Tyrant is theoretically equivalent to almost $15,000/month of appengine hosting. Ouch.
> > > > >
> > > > > Maybe this isn't an apples to apples comparison. Sure, there aren't extra indexes on those Tyrant entities... but to be honest, few of my entities have extra indexes. What other factors could change this analysis?
> > > > >
> > > > > Thoughts?
> > > > >
> > > > > BTW Tim, you may very well have quite a few indexes on your entities. In JDO, nearly all single fields are indexed by default. You must explicitly add an annotation to your fields to make them unindexed. With Objectify, you can declare your entity as @Indexed or @Unindexed and then use the same annotation on individual fields to override the default.
> > > > >
> > > > > Jeff
> > > > >
> > > > > On Wed, Feb 24, 2010 at 12:43 AM, Tim Cooper <tco...@gmail.com> wrote:
> > > > > > I have been trying to write 12,000 objects in a single page request. These objects are all very small and the total amount of memory is not large. There is no index on these objects - the only GQL queries I make on them are based on the primary key.
> > > > > >
> > > > > > Ikai has said: "That is - if you have to delete or create 150 persistent, indexed objects, you may want to rethink what problems you are trying to solve."
> > > > > >
> > > > > > So I have been thinking about the problems I'm trying to solve, including looking at the BuddyPoke blog and reading the GAE documentation. I'm trying to populate the database with entries relating to high school timetables.
> > > > > >
> > > > > > * I could do the writes asynchronously, but that looks like a lot of additional effort. On my C++ app, writing the same information to my laptop drive, this happens in under a second, because the amount of data is actually quite small, but it times out on GAE.
> > > > > > * I am using pm.makePersistentAll(), but this doesn't help.
> > > > > > * There is no index on the objects - I access them only through the primary key. (I'm pretty sure there's no index - but how can I confirm this via the development server dashboard?)
> > > > > > * The objects constitute 12,000 entity groups. I could merge them into fewer entity groups, but there are no natural groupings I could use, so it could get quite complex to introduce a contrived grouping, and also this would complicate the multi-user updating of the objects. The AppEngine team seem to generally recommend using more entity groups, but it's difficult to integrate that advice with the contrary advice to use fewer entity groups for acceptable performance.
> > > > > > * I'd be happy if the GAE database was < 10 times slower than a non-cloud RDBMS, but the way I'm using it, it's currently not.
> > > > > >
> > > > > > Does anyone have any advice?
> > > > > >
> > > > > > --
> > > > > > You received this message because you are subscribed to the Google Groups "Google App Engine for Java" group.
> > > > > > To post to this group, send email to google-appengine-j...@googlegroups.com.
> > > > > > To unsubscribe from this group, send email to google-appengine-java+unsubscr...@googlegroups.com.
> > > > > > For more options, visit this group at http://groups.google.com/group/google-appengine-java?hl=en.
> > > >
> > > > --
> > > > Ikai Lan
> > > > Developer Programs Engineer, Google App Engine
> > > > http://googleappengine.blogspot.com | http://twitter.com/app_engine
> >
> > --
> > Ikai Lan
> > Developer Programs Engineer, Google App Engine
> > http://googleappengine.blogspot.com | http://twitter.com/app_engine