Hey Richard,
  I think one factor could well be the small number of entities you're
removing.  I'm sure there is a good bit of overhead for a batch that
small, but I don't have an exact figure for you (not without looking
over their code or doing some tests).

  For 1,300 entities, I would try doing:
db.delete(ThatKind.all(keys_only=True).fetch(2000)).  That should take
only a second to run.  If you don't want to bother with uploading
code, set the datastore viewer to show 200 entities and remove them
manually (there's a select-all button).  Frankly, by putting little
more than "db.delete(ThatKind.all(keys_only=True).fetch(500))" inside
a loop within a task, you can very rapidly delete quite a lot of
data:

  from google.appengine.ext import db

  cursor = None
  while True:
    # Keys-only query: we only need the keys to delete the entities.
    query = ThatKind.all(keys_only=True)
    if cursor:
      query.with_cursor(cursor)
    keys = query.fetch(500)
    cursor = query.cursor()
    db.delete(keys)
    # A short batch means we've reached the end of the kind.
    if len(keys) < 500:
      break
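
  If you want to kick that off from a task rather than a normal
request, one way (just a sketch; delete_all_of_kind is whatever
function you wrap the loop above in) is the deferred library:

  from google.appengine.ext import deferred

  def delete_all_of_kind():
    # ... the cursor loop above goes here ...
    pass

  # Enqueue from a request handler; task queue requests get a
  # 10-minute deadline, which is plenty for a few thousand entities.
  deferred.defer(delete_all_of_kind)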

  So, the first thing to note is that I've never been able to
estimate the space exactly.  I can usually get close by using the
numbers in the article, then adding somewhere between a 15% and 30%
"fuzz" factor; I assume that is protocol buffer (pb) overhead, etc.
Your estimate will get you in the ballpark, though.  The basic idea
is:

  entity count * (len(your appid) + len(namespace) + len(kind name) +
                  len(property name) + len(avg property value) +
                  len(avg str(entity key)))

There's also whatever field separators they're using, so probably
another 5 or so bytes per entity.  As I said, I've got spreadsheets I
use to estimate all this, but it isn't exact.
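
  If it helps, a rough back-of-the-envelope version of that in Python
might look like this (a sketch only: the function name, parameters,
and the 30% fuzz factor are my assumptions, not figures from Google):

  def estimate_storage_bytes(entity_count, appid, namespace, kind,
                             avg_prop_name_len, avg_prop_value_len,
                             avg_key_len, separator_bytes=5):
    # Rough per-entity size: identifier lengths plus average property
    # name/value sizes plus an assumed ~5 bytes of field separators.
    per_entity = (len(appid) + len(namespace) + len(kind) +
                  avg_prop_name_len + avg_prop_value_len +
                  avg_key_len + separator_bytes)
    # Add a ~30% "fuzz" factor for pb overhead (high end of 15-30%).
    return int(entity_count * per_entity * 1.30)

  # Example: 12,000 entities, one short string property, guessed sizes.
  print estimate_storage_bytes(12000, 'myapp', '', 'ThatKind', 10, 20, 40)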


Robert



On Thu, Feb 9, 2012 at 04:41, Richard Arrano <rickarr...@gmail.com> wrote:
> Hi Robert,
> Thanks for the quick response! I gathered that the admin would be
> incurring some overhead, but does it seem reasonable that it could
> account for my estimate having been off by nearly a factor of 10? It
> seems like in that case, it would be far cheaper to just write a
> custom entity delete task that does a keys only query and calls
> db.delete on them.
>
> Thanks for the article, I think I understand. To clarify, in the
> example I mentioned, would adding a non-composite EntitiesByProperty
> ASC type index then require 12,000 * (size of EntitiesByProperty ASC
> i.e. 3 str + key + property value) on top of the original storage
> space?
>
> Thanks,
> Rick
>
> On Feb 8, 11:18 pm, Robert Kluin <robert.kl...@gmail.com> wrote:
>> Hi Richard,
>>   The datastore admin also incurs some overhead.  At the minimum it
>> will be querying to find the keys that need to be deleted, so you'll have
>> *at least* 1 additional small operation per entity deleted.  In
>> addition you'll have overhead from the queries, and the shards getting
>> / writing their status entities -- so several more datastore
>> operations per task that was run.  All those add up pretty fast.
>>
>>   You can see how many write operations an entity needs in the SDK's
>> datastore view.  There's not a really good way to easily determine the
>> storage used by an index, I've got a feature request to at least
>> provide details on what existing indexes are using:
>>  http://code.google.com/p/googleappengine/issues/detail?id=2740
>>
>>   There's also an article where this is discussed, note that the
>> article is a little out of date.  It doesn't account for namespaces,
>> so you'll need to factor those in if you're using them.  I usually do
>> my estimates in a spreadsheet.
>>    http://code.google.com/appengine/articles/storage_breakdown.html
>>
>> Robert
>>
>> On Thu, Feb 9, 2012 at 00:44, Richard Arrano <rickarr...@gmail.com> wrote:
>> > Hello,
>> > I'm having some trouble understanding the billing figures for when I
>> > perform data writes. I had 1300 entities with 1 property indexed and I
>> > kicked off a job via the Datastore Admin to delete them all. Given
>> > that:
>>
>> > Entity Delete (per entity)      2 Writes + 2 Writes per indexed property
>> > value + 1 Write per composite index value
>>
>> > It seems like in that case, for each entity there would be 2 writes +
>> > 2 writes for the 1 indexed property = 4 writes per entity, so 4 * 1300
>> > = 5200 writes used, or ~10% of the daily quota for writes consumed.
>> > However, within seconds, the data was indeed deleted but 100% of my
> quota had been consumed (0% had been consumed prior). How was I somehow
>> > off by a factor of 10?
>>
>> > On a related note, is there any tool out there for estimating how much
>> > space will be consumed by a certain object? I.e. I have a huge amount
>> > of one object, around 12,000 and I would love to see how much space
>> > would be consumed with one property indexed as opposed to two.
>>
>> > Thanks,
>> > Rick
>>
>

