There are often changes, so I integrated it into the daily processing: if an
entity is requested, the changes are applied as well, and if it comes from
another datastore it is first fetched and the (changed) version saved into the new one.
2011/3/11 Jeff Knox lairdk...@gmail.com
It would be better if the admin tool used DatastoreKeyInputReader. I think
it does use it, which would make it the fastest way of deleting a
large number of entities.
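For reference, a key-only delete job along these lines might be declared like the sketch below, assuming the GAE mapreduce library's DatastoreKeyInputReader; the job name, the handler path tasks.delete_key, and the kind models.MyModel are illustrative placeholders, not anything from this thread:

```yaml
mapreduce:
- name: Delete all MyModel entities
  mapper:
    # yields only keys, so the mapper can delete without loading entities
    input_reader: mapreduce.input_readers.DatastoreKeyInputReader
    handler: tasks.delete_key   # hypothetical handler: receives a key, deletes it
    params:
    - name: entity_kind
      default: models.MyModel
```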
It would be more cost-effective if we made every needed index explicit.
If for a single property you only need the ascending index, you
shouldn't have to pay for the descending one as well.
If you want to keep costs down but not leave the data spinning around and
around why not delete a subset every day? It may take an extremely long time
but at least it would be better than paying storage costs forever.
We are also finding that deletion is an expensive prospect. So much so
that we find ourselves discussing the cost-tradeoff of simply leaving
the dead data around because the storage cost is so much lower than
what it would take to delete - even using the techniques mentioned in
this post.
I also think deleting data is quite a CPU-intensive and expensive
process; the different techniques described above by Tim, Wesley, Ikai and
others give only marginal benefits.
There are some use cases where we deal with ephemeral data and all
entities of a model become obsolete and irrelevant after some time.
I'm also interested in this. I have 300 million entities I no longer need;
they cost $5/day to store, but a short test with the Datastore Admin seemed
to show that they would cost $1500 to delete. Not a nice thing to discover,
since I'd like to delete them precisely because I need to save money.
Crossposting my reply from Stack Overflow.
I got advice on #appengine in IRC that simply getting the keys of 2000
entities at a time and spawning tasks to delete them in pieces (you can pass
keys as strings to tasks) may be cheaper than using the Datastore Admin
tool. I am trying this now and will report back.
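As a sketch of that pattern, here is a pure-Python stand-in; the real thing would use the datastore and task queue APIs, and the names fake_store and delete_in_batches are made up for illustration:

```python
# Toy model of the keys-only batched delete described above: each pass
# stands in for one task-queue task that fetches only keys (never the
# entity bodies) and then issues one batched delete.
fake_store = {"key%04d" % i: {"payload": "x" * 100} for i in range(5500)}

BATCH = 2000  # keys fetched per "task", as suggested on IRC


def delete_in_batches(store, batch_size=BATCH):
    """Delete everything by key, batch_size keys per pass.

    Returns the number of passes (i.e. tasks) it took.
    """
    passes = 0
    while store:
        keys = sorted(store)[:batch_size]  # keys-only "query"
        for k in keys:                     # batched "delete by key"
            del store[k]
        passes += 1
    return passes


passes = delete_in_batches(fake_store)  # 5500 keys / 2000 per pass -> 3 passes
```

In the real version each pass would re-enqueue itself with a query cursor rather than looping in one request.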
Why would it be cheaper if, in the end, the datastore admin creates a
mapreduce that iterates through the model by splitting the index and loading each
entity per key name? Each map is a task queue task, so it is exactly the same :)
On 9 March 2011 18:22, Bemmu bemmu@gmail.com wrote:
I've been given the impression from these forums that the datastore admin
tool is so expensive for exactly the reason you've stated, David: it loads
each entity by key before deletion, whereas deleting purely on keys is much
cheaper CPU-wise, as you don't need to bother with the retrieval of the entities themselves.
Hmmm, well, Wesley's option B was merely because of the batch operations, I
think. One nice feature of the mapreduce is the mutation pool, which
handles the logic of batching an operation while you yield through iterations.
I guess in a big dataset like yours it makes sense (the full-entity retrieval
vs. key-only retrieval adds up).
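The mutation-pool idea can be sketched in plain Python; this is a toy, not the actual mapreduce library's pool, and max_ops=500 is just an illustrative batch limit:

```python
# Toy mutation pool: operations are buffered as you yield through
# iterations and flushed as one batch when the pool fills.
class MutationPool(object):
    def __init__(self, max_ops=500):
        self.max_ops = max_ops
        self.pending = []
        self.flushes = []  # record of flushed batches, for illustration

    def delete(self, key):
        """Queue one delete; flush automatically when the pool is full."""
        self.pending.append(key)
        if len(self.pending) >= self.max_ops:
            self.flush()

    def flush(self):
        """Send everything pending as one batched operation."""
        if self.pending:
            self.flushes.append(list(self.pending))  # one batched RPC
            self.pending = []


pool = MutationPool(max_ops=500)
for i in range(1200):
    pool.delete("key%d" % i)
pool.flush()  # don't forget the final partial batch
```

The caller's loop never worries about batch boundaries; the pool turns 1200 single deletes into three batched operations (500, 500, 200).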
I've tried using the datastore admin to delete some very large
datasets as well. The solution Bemmu is using will be *significantly*
faster, and from my experience, should be more cost effective.
I'm also eager to hear what Bemmu thinks after the delete has run for a while.
Robert
One more tip: only index the properties you need. When you delete an entity,
we also have to delete all the associated indexes.
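To make that point concrete, here is a toy count of what one delete has to remove. It is illustrative only: App Engine maintains an ascending and a descending single-property index entry per indexed property value, plus any composite index entries you define, and the helper name is made up:

```python
def rows_removed_on_delete(num_indexed_props, num_composite_entries=0):
    """Rough count of datastore rows a single entity delete must remove."""
    entity_row = 1
    # each indexed property contributes an ascending and a descending
    # single-property index row
    single_property_rows = 2 * num_indexed_props
    return entity_row + single_property_rows + num_composite_entries


rows_removed_on_delete(5)   # entity row + 10 single-property index rows
rows_removed_on_delete(0)   # nothing indexed: just the entity row
```

So marking properties as unindexed when you never query on them directly shrinks the work every future delete has to do.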
--
Ikai Lan
Developer Programs Engineer, Google App Engine
Blogger: http://googleappengine.blogspot.com
Reddit: http://www.reddit.com/r/appengine
Make sure you are only getting the keys, rather than whole entities.
That will be a lot less costly.
Rgds
Tim
--
You received this message because you are subscribed to the Google Groups
Google App Engine group.
To post to this group, send email to google-appeng...@googlegroups.com.