I haven't looked too closely at your problem, but one thing that has come
up before on this list is that it's a bad idea to do things like this:
repeat {
    delete N items;
}
Basically, deleting items just flags them as deleted in the underlying
store; they are only vacuumed up for real later. Each time you re-run the
query from the start, it has to skip over all of the previously deleted
(but not yet vacuumed) items, so deleting a lot of stuff this way ends up
being a slow O(N^2) operation: effectively an ever-growing offset over all
the tombstones.
To efficiently delete large numbers of entities, either delete them all in
a single request or use a cursor.
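Something like this with the low-level API, for example (an untested
sketch; the "Record" kind, "timestamp" property, and batch size of 500 are
made up, and the API may have changed since I last used it):

import com.google.appengine.api.datastore.*;
import java.util.ArrayList;
import java.util.Date;
import java.util.List;

public class CursorPurge {
    public static void deleteOlderThan(Date cutoff) {
        DatastoreService ds = DatastoreServiceFactory.getDatastoreService();
        Query q = new Query("Record")  // made-up kind
                .addFilter("timestamp", Query.FilterOperator.LESS_THAN, cutoff)
                .setKeysOnly();        // keys are all delete() needs
        FetchOptions opts = FetchOptions.Builder.withLimit(500);
        Cursor cursor = null;
        while (true) {
            if (cursor != null) {
                opts.startCursor(cursor);  // resume where the last batch ended
            }
            QueryResultList<Entity> batch = ds.prepare(q).asQueryResultList(opts);
            if (batch.isEmpty()) {
                break;  // nothing left to delete
            }
            List<Key> keys = new ArrayList<Key>();
            for (Entity e : batch) {
                keys.add(e.getKey());
            }
            ds.delete(keys);             // one batch delete per round
            cursor = batch.getCursor();  // skip past this batch's tombstones
        }
    }
}

The cursor makes each query pick up where the previous batch ended, so you
never re-scan the tombstones left behind by earlier deletes.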
Of course, this information may be out of date.
Jeff
On Thu, Dec 8, 2011 at 8:22 PM, Michael <[email protected]> wrote:
> I have an hourly cron job that deletes datastore data older than one
> month (this data is archived elsewhere for long-term storage). The first
> run of that cron job on Tuesday, after the datastore came back up, behaved
> quite unusually: it ended with "java.lang.OutOfMemoryError: Java heap
> space", and the job hasn't completed once since then. While it is possible
> that this is pure coincidence, I'm wondering if something done during the
> maintenance resulted in this behavior. I have been unable to get this cron
> job to run correctly since.
>
> The job is quite simple, and has been running happily for about a year; I
> will present the idea here for brevity and attach the source for those
> interested. In essence the job does the following (sketched in code after
> the list):
> - retrieve at most 500 entities older than 1 month by their keys only
> - send all resulting keys as a list to datastore.delete()
> - repeat until no results are returned
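>
> In rough code form (a simplified sketch of the idea, not the attached
> source; the kind and property names here are placeholders):
>
> import com.google.appengine.api.datastore.*;
> import java.util.*;
>
> public class MonthlyPurgeJob {
>     public void run() {
>         DatastoreService ds = DatastoreServiceFactory.getDatastoreService();
>         Calendar cal = Calendar.getInstance();
>         cal.add(Calendar.MONTH, -1);       // the one-month cutoff
>         Date cutoff = cal.getTime();
>         while (true) {
>             Query q = new Query("Record")  // placeholder kind
>                     .addFilter("timestamp", Query.FilterOperator.LESS_THAN, cutoff)
>                     .setKeysOnly();        // retrieve keys only
>             List<Entity> batch =
>                     ds.prepare(q).asList(FetchOptions.Builder.withLimit(500));
>             if (batch.isEmpty()) {
>                 break;                     // repeat until no results are returned
>             }
>             List<Key> keys = new ArrayList<Key>();
>             for (Entity e : batch) {
>                 keys.add(e.getKey());
>             }
>             ds.delete(keys);               // send all resulting keys as one list
>         }
>     }
> }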
>
> The first run after maintenance produced the attached log-excerpt.txt.
> The brief version is the following:
> - deleted 500 objects
> - deleted 465 objects
> - deleted 213 objects (repeated 395 times)
> - out of memory
>
> It seems that, after actually deleting the first 752 objects of the query,
> the datastore got stuck on the next 213. The same 213 objects were sent
> repeatedly to datastore.delete(). No exceptions were generated, but the
> data was obviously not deleted.
>
> The next attempt (the job was retried since it crashed) produced almost
> identical output. This time it actually deleted 174 objects, then tried to
> delete the same 213 objects over and over until it, too, crashed with an
> OutOfMemoryError. The run after that actually deleted 8 objects before
> crashing in the same manner. This continued until the repeated failures
> ran my application out of quota for the day, at which point I got a
> notification email and went and paused the queue that these jobs run under.
>
> Note: I am not on the High Replication datastore. I do not know why this
> is happening, but it is currently an insurmountable obstacle. I tried
> unpausing the queue temporarily and running the problematic job, and this
> time I did not even get the previously frustrating but informative output;
> instead, I merely got the "A serious problem was encountered . . ." message
> on both runs.
>
> Any help in getting this fixed or understanding the problem would be
> greatly appreciated.
>
--
We are the 20%