If you're trying to achieve high write throughput, as it sounds like you are since you have 1,000,000 entities to write, you should be designing your schema to minimize the number of entities in an entity group. These and other general tips are listed here:
http://code.google.com/appengine/docs/python/datastore/keysandentitygroups.html#Entity_Groups_Ancestors_and_Paths

Putting all of your entities in a single group significantly impairs your application's ability to update entities, since entities in the same group can no longer be written in parallel. Large entity groups will work fine if your entities aren't being updated very often (generally 1-10 updates per second, max), but if you want to do massive bulk writes like this, I suggest rethinking your design with this in mind.

Even if you can't roll back the entire write, doing a batch put of entities in separate entity groups should throw an exception in the event of a failure, which you can catch and then retry the write for the single affected entity.

- Jason

On Sun, Sep 6, 2009 at 5:12 PM, Nicholas Albion <nalb...@gmail.com> wrote:

> On Sep 5, 10:24 am, "Jason (Google)" <apija...@google.com> wrote:
> > Batch puts are supported, yes, and as of yesterday's release, calling
> > makePersistentAll (JDO) and the equivalent JPA call will take advantage of
> > this support (previously, you had to use the low-level API).
> >
> > Two quick notes:
> >
> > 1) All of the entities that you're persisting should be in separate entity
> > groups since two entities in the same entity group can't be written to
> > consecutively, and you will see datastore timeout exceptions if many
> > simultaneous write requests come in for the same entity or entity group.
>
> Sorry Jason, I'm a bit confused now. Wouldn't that be the most common
> use case for batch puts? According to the GAE documentation, this is
> the main point of entity groups:
> "App Engine creates related entities in entity groups automatically
> to support updating related objects together"
>
> ...so you can add them together _logically_ but not chronologically?
> I've got several cases where I'd have 50,000 to 1,000,000 records which
> logically belong to a single parent entity.
> If I need to add them to
> the datastore individually it's going to take somewhere between
> 2 and 24 hours to write them all (spread across multiple HTTP requests
> in any case). If I could batch put the data (within the same entity
> group) I imagine that the time would be reduced significantly.
>
> > 2) Batch puts do not operate in a transaction. This means that some writes
> > may succeed but others may not, so if you need the ability to rollback,
> > you'll need transactions.
>
> Do you mean that if necessary, the call to makePersistentAll() should
> be wrapped in a transaction, or that makePersistentAll() _can_not_ be
> wrapped in a transaction?
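
For what it's worth, here is a rough sketch of the catch-and-retry idea from the reply at the top of this message. Everything named here (`Store`, `putWithRetry`, `maxAttempts`) is made up for illustration and is not an App Engine, JDO, or JPA class; plug in whatever datastore calls you actually use. It also assumes that re-writing an entity that was already stored is harmless (for example, because keys are assigned up front), and since a generic exception doesn't say which entity failed, the fallback simply writes every entity individually.

```java
import java.util.ArrayList;
import java.util.List;

class BatchPutRetry {

    /** Hypothetical stand-in for a datastore client; NOT an App Engine interface. */
    public interface Store<E> {
        void putAll(List<E> entities) throws Exception; // non-transactional batch write
        void put(E entity) throws Exception;            // single-entity write
    }

    /**
     * Tries one batch put. If it throws, falls back to writing each entity
     * on its own, retrying each one up to maxAttempts times.
     * Returns the entities that still could not be written.
     */
    public static <E> List<E> putWithRetry(Store<E> store, List<E> entities, int maxAttempts) {
        List<E> failed = new ArrayList<>();
        try {
            store.putAll(entities);
            return failed; // whole batch succeeded, nothing failed
        } catch (Exception batchFailure) {
            // The batch is not transactional, so some writes may already have
            // gone through. Assumption: writing those entities again is safe.
        }
        for (E entity : entities) {
            boolean written = false;
            for (int attempt = 1; attempt <= maxAttempts && !written; attempt++) {
                try {
                    store.put(entity);
                    written = true;
                } catch (Exception singleFailure) {
                    // Fall through and try again until attempts run out.
                }
            }
            if (!written) {
                failed.add(entity);
            }
        }
        return failed;
    }
}
```

Anything that comes back in the returned list could be logged or queued for a later request. A real version would probably want a short back-off between attempts, which is left out here to keep the sketch small.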