Hi tav,

Batch puts aren't transactional unless all the entities are in the
same entity group. Transactions, however, _are_ transactional, and the
1MB limit applies only to single API calls, so you can make multiple
puts to the same entity group in a transaction.
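
For example, something like this with the Python db API (a rough,
untested sketch; the model and function names are just placeholders):

    from google.appengine.ext import db

    class Part(db.Model):
        payload = db.BlobProperty()

    def replace_parts(parent_key, payloads):
        def txn():
            # Each put() is a separate API call, so each one stays
            # under the 1MB limit, but they all commit (or roll back)
            # together because they run in one transaction.
            for i, payload in enumerate(payloads):
                Part(parent=parent_key, key_name='part_%d' % i,
                     payload=payload).put()
        db.run_in_transaction(txn)

Because every entity here shares the same parent, they're all in one
entity group, so the transaction is allowed.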

-Nick Johnson

On Fri, Jun 26, 2009 at 8:53 AM, tav<t...@espians.com> wrote:
>
> Hey guys and girls,
>
> I've got a situation where I'd have to "transactionally" update
> multiple entities which would cumulatively be greater than the 1MB
> datastore API limit... is there a decent solution for this?
>
> For example, let's say that I start off with entities E1, E2 and E3,
> each about 400KB in size. All the entities are specific to a given
> User. I grab them all on a "remote node" and do some calculations on
> them to yield new "computed" entities E1', E2', and E3'.
>
> Any failure of the remote node or the datastore is recoverable except
> when the remote node tries to *update* the datastore... in that
> situation, it'd have to split the update across 2 separate .put() calls
> to stay under the 1MB limit. And should the remote node die after the
> first put(), we have a messy situation =)
>
> My solution at the moment (see the code sketch after the steps) is to:
>
> 1. Create a UserRecord entity which has a 'version' attribute
> corresponding to the "latest" versions of the related entities for any
> given User.
>
> 2. Add a 'version' attribute to all the entities.
>
> 3. Whenever the remote node creates the "computed" new set of
> entities, it creates them all with a new version number -- applying
> the same version for all the entities in the same "transaction".
>
> 4. These new entities are actually .put() as totally separate and new
> entities, i.e. they do not overwrite the old entities.
>
> 5. Once a remote node successfully writes new versions of all the
> entities relating to a User, it updates the UserRecord with the latest
> version number.
>
> 6. From the remote node, delete all entities related to a User which
> don't have the latest version number.
>
> 7. Have a background thread check for and delete invalid versions in
> case a remote node died whilst doing step 4, 5 or 6...
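>
> Something like this in (untested) code -- the model and property
> names are just for illustration:
>
>     from google.appengine.ext import db
>
>     class UserRecord(db.Model):
>         version = db.IntegerProperty(default=0)
>
>     class Computed(db.Model):
>         user = db.ReferenceProperty(UserRecord)
>         version = db.IntegerProperty()
>         data = db.BlobProperty()
>
>     def write_new_version(user_record, blobs):
>         new_version = user_record.version + 1
>         # Steps 3-4: put the new entities under the new version
>         # number; the old entities are left untouched.
>         for blob in blobs:
>             Computed(user=user_record, version=new_version,
>                      data=blob).put()
>         # Step 5: flip the UserRecord -- the "commit point"; readers
>         # only ever trust the version it points at.
>         user_record.version = new_version
>         user_record.put()
>         # Step 6: delete the stale versions.
>         stale = Computed.all(keys_only=True) \
>                         .filter('user =', user_record) \
>                         .filter('version <', new_version)
>         db.delete(stale.fetch(1000))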
>
> I've skipped out the complications caused by multiple remote nodes
> working on data relating to the same User -- but, overall, the
> approach is pretty much the same.
>
> Now, the advantage of this approach (as far as I can see) is that data
> relating to a User is never *lost*. That is, data is never lost before
> there is valid data to replace it.
>
> However, the disadvantage is that for (unknown) periods of time, there
> would be duplicate data sets for a given User... All of which is
> caused by the fact that individual datastore calls cannot exceed 1MB. =(
>
> So queries will yield duplicate data -- gah!!
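>
> ...unless every read goes through the UserRecord and filters on its
> current version, e.g. (again just a sketch, assuming the UserRecord
> is keyed by the user id):
>
>     record = UserRecord.get_by_key_name(user_id)
>     current = Computed.all().filter('user =', record) \
>                             .filter('version =', record.version)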
>
> Is there a better approach to try at all? Thanks!
>
> --
> love, tav
>
> plex:espians/tav | t...@espians.com | +44 (0) 7809 569 369
> http://tav.espians.com | http://twitter.com/tav | skype:tavespian
>



-- 
Nick Johnson, App Engine Developer Programs Engineer
Google Ireland Ltd. :: Registered in Dublin, Ireland, Registration
Number: 368047
