Hey guys and girls, I've got a situation where I'd have to "transactionally" update multiple entities whose cumulative size is greater than the 1MB datastore API limit... is there a decent solution for this?
For example, let's say that I start off with entities E1, E2 and E3, each of which is about 400kb. All the entities are specific to a given User. I grab them all on a "remote node" and do some calculations on them to yield a new set of "computed" entities: E1', E2' and E3'. Any failure of the remote node or the datastore is recoverable -- except when the remote node tries to *update* the datastore. At that point it has to batch the update into 2 separate .put() calls to stay under the 1MB limit, and should the remote node die after the first put(), we have a messy situation =)

My solution at the moment is to:

1. Create a UserRecord entity which has a 'version' attribute corresponding to the "latest" version of the related entities for any given User.

2. Add a 'version' attribute to all the entities.

3. Whenever a remote node creates the new set of "computed" entities, it creates them all with a new version number -- applying the same version to every entity in the same "transaction".

4. These new entities are .put() as totally separate, new entities, i.e. they do not overwrite the old ones.

5. Once a remote node has successfully written new versions of all the entities relating to a User, it updates the UserRecord with the latest version number.

6. From the remote node, delete all entities relating to that User which don't have the latest version number.

7. Have a background thread check for and delete invalid versions, in case a remote node died whilst doing step 4, 5 or 6. (There's a rough sketch of steps 3-6 at the end of this mail.)

I've left out the complications caused by multiple remote nodes working on data relating to the same User -- but, overall, the approach is pretty much the same.

Now, the advantage of this approach (as far as I can see) is that data relating to a User is never *lost*. That is, old data is never deleted before there is valid data to replace it. The disadvantage is that, for unknown periods of time, there are duplicate data sets for a given User, so queries will yield duplicate data -- gah!! All of which is caused by the fact that datastore calls cannot exceed 1MB. =(

Is there a better approach to try at all?

Thanks!

--
love, tav

plex:espians/tav | t...@espians.com | +44 (0) 7809 569 369
http://tav.espians.com | http://twitter.com/tav | skype:tavespian
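P.S. In case it helps to see it concretely, here's a rough Python sketch of the above using the db API. The model names (UserRecord, ComputedEntity), the batch size of 2, and the fetch limits are all just illustrative -- and, like the description above, it skips the race where two remote nodes grab the same version number:

    from google.appengine.ext import db


    class UserRecord(db.Model):
        # key_name is the user id; 'version' points at the latest
        # *complete* set of computed entities for that user (step 1).
        version = db.IntegerProperty(default=0)


    class ComputedEntity(db.Model):
        user = db.StringProperty(required=True)      # which User this belongs to
        version = db.IntegerProperty(required=True)  # step 2: version attribute
        payload = db.BlobProperty()                  # the ~400kb of computed data


    def put_new_version(user_id, payloads):
        """Steps 3-6: write a fresh versioned set, flip the pointer, clean up."""
        record = UserRecord.get_or_insert(user_id)
        new_version = record.version + 1

        # Steps 3/4: write the computed entities as brand new rows, batched
        # 2 at a time so each .put() stays comfortably under the 1MB limit.
        # If the node dies part-way through, the partial set simply never
        # gets "activated" and the reaper (step 7) will delete it later.
        entities = [ComputedEntity(user=user_id, version=new_version, payload=p)
                    for p in payloads]
        for i in range(0, len(entities), 2):
            db.put(entities[i:i + 2])

        # Step 5: atomically bump the pointer -- but only forwards, so a
        # slow node can't clobber a newer version written by another node.
        def bump():
            rec = UserRecord.get_by_key_name(user_id)
            if rec.version < new_version:
                rec.version = new_version
                rec.put()
                return True
            return False
        activated = db.run_in_transaction(bump)

        # Step 6: best-effort delete of stale versions (needs a composite
        # index on user + version; anything missed here is step 7's job).
        latest = UserRecord.get_by_key_name(user_id).version
        stale = ComputedEntity.all().filter('user =', user_id) \
                                    .filter('version <', latest).fetch(100)
        db.delete(stale)
        return activated


    def get_latest(user_id):
        # Readers pin their query to the current version, so the duplicate
        # data sets are invisible to them even while they coexist.
        record = UserRecord.get_by_key_name(user_id)
        if record is None:
            return []
        return ComputedEntity.all().filter('user =', user_id) \
                                   .filter('version =', record.version).fetch(10)

The get_latest() bit is why the duplicates only hurt storage rather than correctness -- as long as every reader filters on the UserRecord's current version, it never sees a mixed or stale set.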