Re: [google-appengine] Re: Deleting Data Really Expensive!

2011-03-12 Thread Wim den Ouden
There are often changes, so I integrated them into the daily processing:
whenever an entity is requested, the changes are applied as well, and if it
comes from another datastore it is first fetched and then saved (changed) in
the new one.

2011/3/11 Jeff Knox lairdk...@gmail.com

 If you want to keep costs down but not leave the data spinning around and
 around why not delete a subset every day? It may take an extremely long time
 but at least it would be better than paying storage costs forever.

 --
 You received this message because you are subscribed to the Google Groups
 Google App Engine group.
 To post to this group, send email to google-appengine@googlegroups.com.
 To unsubscribe from this group, send email to
 google-appengine+unsubscr...@googlegroups.com.
 For more options, visit this group at
 http://groups.google.com/group/google-appengine?hl=en.




-- 
gr,
Wim den Ouden
Custom Google App Engine http://code.google.com/intl/nl/appengine/ based
webapps https://neighborshare.appspot.com/.
Free open source neighborshare framework http://code.google.com/p/relat/.
Gae tips http://code.google.com/p/relat/wiki/gaetips Datastore
(async) http://code.google.com/p/relat/wiki/gaetips?ts=1299673682&updated=gaetips#Datastore_plus_(async)




Re: [google-appengine] Re: Deleting Data Really Expensive!

2011-03-11 Thread djidjadji
It would be better if the admin tool used DatastoreKeyInputReader.
I think it will use it, and thus become the fastest way of deleting
a large number of entities.

It would be more cost-effective if we could make every needed index explicit.
Currently, if you only need the ascending index for a single property, you
also get a descending index as an extra penalty.

2011/3/10 David Mora dla.m...@gmail.com:
 why would it be cheaper if at the end, the datastore admin creates a map
 reduce that iterates thru the model via splitting the index and loading each
 entity per key name ? Each map is a task queue so it is exactly the same :)




[google-appengine] Re: Deleting Data Really Expensive!

2011-03-11 Thread Jeff Knox
If you want to keep costs down but not leave the data spinning around and 
around why not delete a subset every day? It may take an extremely long time 
but at least it would be better than paying storage costs forever.
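
A rough sketch of what such a gradual delete could look like in numbers. It assumes (this is not stated anywhere in the thread) that storage cost is proportional to the number of entities still stored and that each day's batch is deleted at the start of the day. The function name and the daily batch size are hypothetical; the 300 million entities and $5/day are the figures Bemmu reported in this thread.

```python
import math


def gradual_delete_plan(total_entities, storage_cost_per_day, entities_per_day):
    """Return (days_needed, storage_cost_paid_while_deleting).

    Assumption: storage cost scales linearly with the number of
    entities that are still stored on a given day.
    """
    if total_entities <= 0 or entities_per_day <= 0:
        raise ValueError("entity counts must be positive")

    days = math.ceil(total_entities / entities_per_day)
    cost_per_entity_day = storage_cost_per_day / total_entities

    remaining = total_entities
    storage_paid = 0.0
    for _ in range(days):
        # Delete today's batch first, then pay storage for what is left.
        remaining = max(0, remaining - entities_per_day)
        storage_paid += remaining * cost_per_entity_day
    return days, storage_paid


# Hypothetical pace of 1 million entities per day.
days, storage_paid = gradual_delete_plan(300_000_000, 5.0, 1_000_000)
```

With that hypothetical pace the delete takes 300 days and the storage paid along the way is roughly $747.50. This only covers storage; the CPU cost of the deletes themselves is not modelled and is still paid, just spread over time.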




[google-appengine] Re: Deleting Data Really Expensive!

2011-03-10 Thread Jason Collins
We are also finding that deletion is an expensive prospect. So much so
that we find ourselves discussing the cost-tradeoff of simply leaving
the dead data around because the storage cost is so much lower than
what it would take to delete - even using the techniques mentioned in
this post.

It hurts me to think that we're leaving disks spinning with junk on
them.

Google, any improvements on this front that would lead to less
datastore CPU usage for deletions?

j

On Mar 10, 5:01 am, andreas schmid a.schmi...@gmail.com wrote:
 at this point i would delete the app and create a new one at no cost!

 On Mar 10, 2011, at 3:25 AM, Robert Kluin wrote:



  I've tried using the datastore admin to delete some very large
  datasets as well.  The solution Bemmu is using will be *significantly*
  faster, and from my experience, should be more cost effective.

  I'm also eager to hear what Bemmu thinks after the delete has ran for a 
  while.

  Robert

  On Wed, Mar 9, 2011 at 21:20, David Mora dla.m...@gmail.com wrote:
  hmmm, well Wesley's option B was merely because the batch operations i
  think. One nice feature about the map reduce is the mutation pool which
  handles the logic of batching an operation while you yield thru iterations.
  I guess in a big dataset like yours make sense (the model retrieval vs key
  only). Anyways, interested case - i'll love to see where it ends :)

  On 9 March 2011 19:26, Simon Knott knott.si...@gmail.com wrote:

  I've been given the impression from these forums that the datastore admin
  tool is so expensive for exactly the reason you've stated David - it loads
  each entity by key before deletion, whereas deleting purely on keys is 
  much
  cheaper CPU-wise as you don't need to bother with the retrieval of the
  entire entity to carry out the delete.




[google-appengine] Re: Deleting Data Really Expensive!

2011-03-10 Thread nickmilon
I also think deleting data is quite a CPU-intensive and expensive
process; the different techniques described above by Tim, Wesley, Ikai and
others give only marginal benefits.
There are some use cases where we deal with ephemeral data and all
entities of a model become obsolete or irrelevant after some time.
There should be a way to wipe out a model without having to delete all
the individual entities.

Happy coding ;-)

On Mar 10, 6:27 pm, Jason Collins jason.a.coll...@gmail.com wrote:
 We are also finding that deletion is an expensive prospect. So much so
 that we find ourselves discussing the cost-tradeoff of simply leaving
 the dead data around because the storage cost is so much lower than
 what it would take to delete - even using the techniques mentioned in
 this post.

 It hurts me to think that we're leaving disks spinning with junk on
 them.

 Google, any improvements on this front that would lead to less
 datastore CPU usage for deletions?

 j

 On Mar 10, 5:01 am, andreas schmid a.schmi...@gmail.com wrote:







  at this point i would delete the app and create a new one at no cost!

  On Mar 10, 2011, at 3:25 AM, Robert Kluin wrote:

   I've tried using the datastore admin to delete some very large
   datasets as well.  The solution Bemmu is using will be *significantly*
   faster, and from my experience, should be more cost effective.

   I'm also eager to hear what Bemmu thinks after the delete has ran for a 
   while.

   Robert

   On Wed, Mar 9, 2011 at 21:20, David Mora dla.m...@gmail.com wrote:
   hmmm, well Wesley's option B was merely because the batch operations i
   think. One nice feature about the map reduce is the mutation pool which
   handles the logic of batching an operation while you yield thru 
   iterations.
   I guess in a big dataset like yours make sense (the model retrieval vs 
   key
   only). Anyways, interested case - i'll love to see where it ends :)

   On 9 March 2011 19:26, Simon Knott knott.si...@gmail.com wrote:

   I've been given the impression from these forums that the datastore 
   admin
   tool is so expensive for exactly the reason you've stated David - it 
   loads
   each entity by key before deletion, whereas deleting purely on keys is 
   much
   cheaper CPU-wise as you don't need to bother with the retrieval of the
   entire entity to carry out the delete.




Re: [google-appengine] Re: Deleting Data Really Expensive!

2011-03-09 Thread Bemmu
I'm also interested in this. I have 300 million entities I no longer need. 
They cost $5 / day to store, but a short test with the Datastore Admin seemed 
to show that they would cost $1500 to delete. Not a nice thing to discover, 
since I'd like to delete them precisely because I need to save money. Isn't 
there any cheaper way to delete all entities of a kind?
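
To put those two numbers side by side, here is a small break-even calculation. It is only a sketch: it assumes the storage cost stays constant at $5/day and that the $1500 estimate from the short test is accurate. The function name is made up for illustration.

```python
def break_even_days(delete_cost, storage_cost_per_day):
    """Number of days of storage that cost as much as one full delete."""
    if storage_cost_per_day <= 0:
        raise ValueError("storage_cost_per_day must be positive")
    return delete_cost / storage_cost_per_day


# Figures quoted in the message above.
days = break_even_days(delete_cost=1500.0, storage_cost_per_day=5.0)
```

Under those assumptions the delete pays for itself after 300 days of storage that would otherwise have been billed.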




Re: [google-appengine] Re: Deleting Data Really Expensive!

2011-03-09 Thread Bemmu
Crossposting my reply from stackoverflow.

I got advice on #appengine in IRC that simply getting the keys of 2000 
entities at a time and spawning tasks to delete them in pieces (can pass 
keys as strings to tasks) may be cheaper than using the Datastore Admin 
tool. I am trying this now. I will try to remember to report back tomorrow 
if this seems to be cheaper or not.

http://stackoverflow.com/questions/5252477/economically-deleting-data-from-app-engine
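
A minimal sketch of the batching logic described above, written in plain Python so it can be run without the App Engine SDK. The `enqueue` and `delete_by_keys` callables are hypothetical stand-ins for adding a task to a queue and for a batch delete by key; the comma-separated payload format and the chunk size are assumptions too (it relies on the string form of a key containing no commas).

```python
from itertools import islice


def chunked(iterable, size):
    """Yield successive lists of at most `size` items."""
    it = iter(iterable)
    while True:
        chunk = list(islice(it, size))
        if not chunk:
            return
        yield chunk


def spawn_delete_tasks(key_strings, enqueue, keys_per_task=200):
    """Split keys into chunks and hand each chunk to `enqueue` as one payload.

    `enqueue` is a hypothetical callable standing in for "add a task
    to the queue"; it receives the keys joined into a single string.
    """
    tasks = 0
    for chunk in chunked(key_strings, keys_per_task):
        enqueue(",".join(chunk))
        tasks += 1
    return tasks


def handle_delete_task(payload, delete_by_keys):
    """Task side: decode the key strings and delete them in one batch call."""
    keys = payload.split(",")
    delete_by_keys(keys)  # keys only; the entities are never loaded
    return len(keys)


# Stand-ins so the sketch runs anywhere: a list as the queue,
# another list that records what would have been deleted.
queue = []
deleted = []
fake_keys = ["key%d" % i for i in range(2000)]
task_count = spawn_delete_tasks(fake_keys, queue.append, keys_per_task=200)
for payload in queue:
    handle_delete_task(payload, deleted.extend)
```

The chunk size per task is a tuning knob: smaller chunks mean more tasks, larger chunks mean bigger payloads and longer-running tasks.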




Re: [google-appengine] Re: Deleting Data Really Expensive!

2011-03-09 Thread David Mora
Why would it be cheaper if, in the end, the Datastore Admin creates a
MapReduce job that iterates through the model by splitting the index and
loading each entity by key name? Each map is a task queue task, so it is
exactly the same :)

On 9 March 2011 18:22, Bemmu bemmu@gmail.com wrote:

 Crossposting my reply from stackoverflow.

 I got advice on #appengine in IRC that simply getting the keys of 2000
 entities at a time and spawning tasks to delete them in pieces (can pass
 keys as strings to tasks) may be cheaper than using the Datastore Admin
 tool. I am trying this now. I will try to remember to report back tomorrow
 if this seems to be cheaper or not.


 http://stackoverflow.com/questions/5252477/economically-deleting-data-from-app-engine





-- 
http://about.me/david.mora




Re: [google-appengine] Re: Deleting Data Really Expensive!

2011-03-09 Thread Simon Knott
I've been given the impression from these forums that the datastore admin 
tool is so expensive for exactly the reason you've stated David - it loads 
each entity by key before deletion, whereas deleting purely on keys is much 
cheaper CPU-wise as you don't need to bother with the retrieval of the 
entire entity to carry out the delete.




Re: [google-appengine] Re: Deleting Data Really Expensive!

2011-03-09 Thread David Mora
Hmm, well, I think Wesley's option B was suggested merely because of the
batch operations. One nice feature of MapReduce is the mutation pool, which
handles the logic of batching operations while you yield through iterations.
I guess with a big dataset like yours it makes sense (model retrieval vs.
keys only). Anyway, interesting case - I'd love to see where it ends :)
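
For readers unfamiliar with the idea, here is a simplified illustration of what a mutation pool does. This is not the appengine-mapreduce implementation; the class name, the `flush_fn` callable and the default batch size are all hypothetical, and the only point is to show how individual deletes get collected and sent in batches.

```python
class DeletePool:
    """Collects keys and sends them to `flush_fn` in batches.

    `flush_fn` is a hypothetical stand-in for a batch delete call.
    """

    def __init__(self, flush_fn, max_size=100):
        self._flush_fn = flush_fn
        self._max_size = max_size
        self._pending = []
        self.batches_sent = 0

    def delete(self, key):
        """Queue one key; flush automatically once the pool is full."""
        self._pending.append(key)
        if len(self._pending) >= self._max_size:
            self.flush()

    def flush(self):
        """Send whatever is pending as one batch (no-op when empty)."""
        if self._pending:
            self._flush_fn(list(self._pending))
            self.batches_sent += 1
            self._pending = []


# The caller just "deletes" one key per iteration; batching is hidden.
batches = []
pool = DeletePool(batches.append, max_size=100)
for i in range(250):
    pool.delete("key%d" % i)
pool.flush()  # send the final partial batch
```

The final `flush()` matters: without it the last partial batch would stay in the pool and never be sent.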

On 9 March 2011 19:26, Simon Knott knott.si...@gmail.com wrote:

 I've been given the impression from these forums that the datastore admin
 tool is so expensive for exactly the reason you've stated David - it loads
 each entity by key before deletion, whereas deleting purely on keys is much
 cheaper CPU-wise as you don't need to bother with the retrieval of the
 entire entity to carry out the delete.





-- 
http://about.me/david.mora




Re: [google-appengine] Re: Deleting Data Really Expensive!

2011-03-09 Thread Robert Kluin
I've tried using the datastore admin to delete some very large
datasets as well.  The solution Bemmu is using will be *significantly*
faster, and from my experience, should be more cost effective.

I'm also eager to hear what Bemmu thinks after the delete has run for a while.

Robert

On Wed, Mar 9, 2011 at 21:20, David Mora dla.m...@gmail.com wrote:
 hmmm, well Wesley's option B was merely because the batch operations i
 think. One nice feature about the map reduce is the mutation pool which
 handles the logic of batching an operation while you yield thru iterations.
 I guess in a big dataset like yours make sense (the model retrieval vs key
 only). Anyways, interested case - i'll love to see where it ends :)

 On 9 March 2011 19:26, Simon Knott knott.si...@gmail.com wrote:

 I've been given the impression from these forums that the datastore admin
 tool is so expensive for exactly the reason you've stated David - it loads
 each entity by key before deletion, whereas deleting purely on keys is much
 cheaper CPU-wise as you don't need to bother with the retrieval of the
 entire entity to carry out the delete.




Re: [google-appengine] Re: Deleting Data Really Expensive!

2010-12-29 Thread Ikai Lan (Google)
One more tip: only index the properties you need. When you delete an entity,
we also have to delete all the associated indexes.
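
As a back-of-the-envelope illustration of why this matters for deletes, the sketch below estimates how many index rows go away together with the entities. It is a toy model under loud assumptions: two rows per indexed property (ascending and descending, following djidjadji's remark elsewhere in this thread), and no composite or kind indexes. The function name and property counts are hypothetical.

```python
def index_rows_to_delete(entity_count, indexed_properties, rows_per_property=2):
    """Estimate index rows removed when deleting `entity_count` entities.

    Assumption: every indexed property contributes `rows_per_property`
    index rows per entity; composite and kind indexes are ignored.
    """
    if entity_count < 0 or indexed_properties < 0 or rows_per_property < 0:
        raise ValueError("arguments must be non-negative")
    return entity_count * indexed_properties * rows_per_property


# Hypothetical comparison: 5 indexed properties vs. only the 2 that are queried.
all_indexed = index_rows_to_delete(300_000_000, indexed_properties=5)
only_needed = index_rows_to_delete(300_000_000, indexed_properties=2)
rows_saved = all_indexed - only_needed
```

In this toy model, dropping three unneeded indexes on 300 million entities avoids 1.8 billion index-row deletions; real numbers depend on how the datastore actually lays out its indexes.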

--
Ikai Lan
Developer Programs Engineer, Google App Engine
Blogger: http://googleappengine.blogspot.com
Reddit: http://www.reddit.com/r/appengine
Twitter: http://twitter.com/app_engine



On Tue, Dec 28, 2010 at 9:01 PM, Wesley C (Google) wesc+...@google.com wrote:

 A. +1 on Tim's suggestion. As an example (in Python), the following
 piece of code...

 from google.appengine.ext import db

 for obj in Object.all():
     obj.delete()  # thousands of individual delete()s

 ... should run way slower than...

 keys = Object.all(keys_only=True)
 db.delete(keys)  # one massive delete()

 the second is faster in 2 ways:
 1) keys-only means that it doesn't have to fetch the actual data
 2) it calls google.appengine.ext.db.delete() once, passing in all
 the keys

 B. another alternative is the new Datastore Admin where you can
 manually bulk delete entities:

 http://code.google.com/appengine/docs/python/datastore/creatinggettinganddeletingdata.html#Deleting_Entities_in_Bulk

 http://googleappengine.blogspot.com/2010/10/new-app-engine-sdk-138-includes-new.html

 C. other alternatives

 when a game finishes, *must* you do the deletes before returning back
 to the user? if it's not critical, then you can farm out this job to a
 task queue or use the new mapper API (half of the MapReduce solution).
 more info on both at:

 Task Queue API
 http://code.google.com/appengine/docs/python/taskqueue/overview.html

 Mapper API
 http://googleappengine.blogspot.com/2010/07/introducing-mapper-api.html
 http://code.google.com/p/appengine-mapreduce/
 http://code.google.com/appengine/articles/mr/mapper.html

 if deletion is complex, you may also consider the new Pipeline API:
 http://code.google.com/p/appengine-pipeline/
 http://news.ycombinator.com/item?id=2013133

 hope this helps!
 -- wesley
 - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
 Core Python Programming, Prentice Hall, (c)2007,2001
 Python Fundamentals, Prentice Hall, (c)2009
http://corepython.com

 wesley.chun : wesc+api at google.com : @wescpy
 developer relations :: google app engine
 @app_engine :: googleappengine.blogspot.com




[google-appengine] Re: Deleting Data Really Expensive!

2010-12-28 Thread Tim Hoffman
Make sure you are only getting the keys, rather than whole entities.

That will be a lot less costly

Rgds

Tim
