On Mar 15, 7:48 am, Stephen <sdea...@gmail.com> wrote:
> On Mar 13, 9:39 pm, "C. Scott Ananian" <canan...@gmail.com> wrote:
> > I imported about 3 million very simple records into appengine, using
> > the bulk_uploader tool. The raw size of the input CSV was 85M. After
> > upload, google showed my datastore quota usage as being about 0.16G --
> > a factor of 2 increase, but reasonable. I left the application for a
> > few days, and logged back in to check on it: suddenly my datastore
> > usage is pegged at 100% (1.0G)!
> >
> > First: why did my datastore usage grow over time? My application
> > isn't writing records to the datastore, just serving queries against
> > the static data set. Is it because Google created indexes? Can I
> > stop that from happening?
> >
> > Second: how do I find out how much space my data is "really" taking up
> > (according to the quota system). I'm reluctant to turn on billing
> > until I know whether the suddenly-bloated version of my data is 1.01G
> > or 16G -- and until I know whether it will continue to bloat further!
> >
> > http://cscott-geotest.appspot.com/ is the application URL; as you can
> > see, it's a very simple application which returns a guessed zip code
> > and country for the user based on the IP address their request comes
> > from. The model looks like:
> >
> > class IP2PostalCode(BaseModel):
> >     """Starting IPv4 address, as a 32-bit integer; ie, a.b.c.d
> >     is represented as 256*(256*(256*a+b)+c)+d"""
> >     ipStart = db.IntegerProperty(required=True)
> >     """ISO-3166 alpha2 country code for this IPv4 address block."""
> >     country = db.StringProperty()
> >     """State (AdminCode1) for this block."""
> >     state = db.StringProperty()
> >     """City for this block."""
> >     city = db.StringProperty()
> >     """Postal code for this IPv4 address block."""
> >     postcode = db.StringProperty()
>
> You only query based on ipStart? Try making the other properties
> db.TextProperty instead of db.StringProperty, which do not have
> indexes automatically created for them.
>
> Not sure how you remove the now unused indexes. The documentation
> mentions 'appcfg.py vacuum_indexes', but that only seems to apply to
> composite indexes defined in index.yaml. You may have to loop through
> every entity in your database, select it and then save it back, so the
> properties change from String to Text.
>
> BigTable does seem bloaty...

Your suggestion seems to work. Changing from StringProperty to TextProperty did bloat the size of the entities -- I went from about 0.16G after initial import to about 0.80G -- but the datastore storage doesn't seem to balloon further when I leave it alone and the indexes get created. (Or at least, it hasn't yet: fingers crossed!)

It would be nice if I could explicitly disable single-property indexes in index.yaml, so that I could remove the unneeded indexes on StringProperties without paying the 4x storage overhead of converting them to TextProperty.

--scott
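
For anyone reading along, the ipStart encoding described in the model's docstring (a.b.c.d stored as 256*(256*(256*a+b)+c)+d) can be sketched in plain Python, independent of the datastore. The function names ip_to_int and int_to_ip are made up for this illustration and are not part of the application or of any App Engine API; this is only a minimal sketch of the arithmetic.

```python
def ip_to_int(address):
    """Encode a dotted-quad IPv4 string a.b.c.d as 256*(256*(256*a+b)+c)+d.

    Hypothetical helper for illustration; the name is not from the thread.
    """
    parts = address.split(".")
    if len(parts) != 4:
        raise ValueError("expected four dot-separated octets: %r" % address)
    value = 0
    for part in parts:
        octet = int(part)
        if not 0 <= octet <= 255:
            raise ValueError("octet out of range in %r" % address)
        # Shift the accumulated value up one byte, then add the next octet.
        value = value * 256 + octet
    return value


def int_to_ip(value):
    """Decode a 32-bit integer back into dotted-quad form (inverse of ip_to_int)."""
    if not 0 <= value <= 0xFFFFFFFF:
        raise ValueError("value does not fit in 32 bits: %r" % value)
    octets = [(value >> shift) & 0xFF for shift in (24, 16, 8, 0)]
    return ".".join(str(octet) for octet in octets)


if __name__ == "__main__":
    example = "192.0.2.1"  # documentation-range address, chosen arbitrarily
    encoded = ip_to_int(example)
    print(example, "->", encoded, "->", int_to_ip(encoded))
```

Because the encoding preserves ordering, a single integer property is enough to locate the block that contains a given address, which fits the point above that only ipStart is ever queried.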