Hello,

After noticing my app using far more of the datastore quota than I
expected, I conducted a little experiment with a toy app. The toy
creates records of the format:

class Record(db.Model):
    int1 = db.IntegerProperty()
    # ... repeated three more times

According to the App Engine documentation, IntegerProperty values in
the datastore are stored as 64-bit integers, so each record consists
of 8 * 4 = 32 bytes of "payload", plus overhead (the record's key,
etc). I was hoping, with this experiment, to gain some insight into
what that overhead costs.

I uploaded 141,725 (sorry it's not a round number) of these records
and waited a day for my quota to update; it now shows me using 0.09GB
of my stored data quota. (In comparison, the amount of "payload" data
in the store is only 141,725 x 32 = 4,535,200 bytes, or ~0.004GB.)

I'm not sure whether it's better to compute overhead per property or
overhead per record (each property clearly has a nontrivial cost of
its own), and working out which one is actually consuming the extra
quota would probably take more effort than it's worth, so I figured
I'd ask here.

Anyway, in terms of overhead, I'm observing (in this case)
approximately 600 bytes per record (in addition to the size of the
record's payload), or 150 bytes per property. In my particular case
(though I doubt this ratio is anything resembling constant), I'm
seeing 19 bytes of quota used for every byte of payload.
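
For anyone who wants to check the arithmetic, here is a minimal
sketch of how these figures fall out. It assumes the dashboard's
"0.09GB" means 0.09 * 10^9 bytes; that is a guess, the figure is
rounded to two decimals, and the function and variable names are
purely illustrative.

```python
# Figures taken from the experiment described above.
RECORDS = 141725
PROPERTIES_PER_RECORD = 4
BYTES_PER_INTEGER = 8      # IntegerProperty is stored as a 64-bit integer
QUOTA_GB = 0.09            # as shown on the quota page (rounded)
BYTES_PER_GB = 10 ** 9     # ASSUMPTION: decimal gigabytes; may really be 2 ** 30


def overhead_estimate(records, props, quota_gb, bytes_per_gb=BYTES_PER_GB):
    """Estimate storage overhead from a reported quota figure."""
    payload_per_record = props * BYTES_PER_INTEGER
    payload_total = records * payload_per_record
    quota_bytes = quota_gb * bytes_per_gb
    # Everything stored per record that is not payload counts as overhead.
    overhead_per_record = quota_bytes / float(records) - payload_per_record
    return {
        "payload_total": payload_total,
        "overhead_per_record": overhead_per_record,
        "overhead_per_property": overhead_per_record / props,
        "quota_bytes_per_payload_byte": quota_bytes / float(payload_total),
    }


estimate = overhead_estimate(RECORDS, PROPERTIES_PER_RECORD, QUOTA_GB)
for name, value in sorted(estimate.items()):
    print("%s: %.2f" % (name, value))
```

If the quota page actually counts a GB as 2^30 bytes, passing
bytes_per_gb=2 ** 30 raises the per-record overhead to roughly 650
bytes, so these numbers are only good to within ten percent or so.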

So I guess I have two big questions.

First, the quota presumably measures something like the total size of
the datastore tables created by the user (their size on disk?), which,
because of indexing and the like, is going to be larger than the
payload alone. Right?

Second, how can I minimize this? Is the primary cost the added
indices, such that if I disable indexing for properties I won't be
querying on, I'll save space? (Again, this is something I could
experiment with, but at this point, why not just ask?) What's the
overhead per record, regardless of indexing? Are there any other steps
I can take to minimize my datastore quota usage?

(The application I'm actually developing stores a few hundred
megabytes of data in three tables, two of which are only necessary
because of the limitations on the types of queries the datastore can
handle. While developing it, I estimated that we would stay well
within an affordable amount of quota usage, but it now appears that we
would be using something like 200GB of datastore.)

Thanks,
Calvin

