On Mar 15, 7:48 am, Stephen <sdea...@gmail.com> wrote:
> On Mar 13, 9:39 pm, "C. Scott Ananian" <canan...@gmail.com> wrote:
> > I imported about 3 million very simple records into appengine, using
> > the bulk_uploader tool.  The raw size of the input CSV was 85M.  After
> > upload, google showed my datastore quota usage as being about 0.16G --
> > a factor of 2 increase, but reasonable.  I left the application for a
> > few days, and logged back in to check on it: suddenly my datastore
> > usage is pegged at 100% (1.0G)!
>
> > First: why did my datastore usage grow over time?  My application
> > isn't writing records to the datastore, just serving queries against
> > the static data set.  Is it because Google created indexes?  Can I
> > stop that from happening?
>
> > Second: how do I find out how much space my data is "really" taking up
> > (according to the quota system)?  I'm reluctant to turn on billing
> > until I know whether the suddenly-bloated version of my data is 1.01G
> > or 16G -- and until I know whether it will continue to bloat further!
>
> > http://cscott-geotest.appspot.com/ is the application URL; as you can
> > see, it's a very simple application which returns a guessed zip code
> > and country for the user based on the IP address their request comes
> > from.  The model looks like:
>
> > class IP2PostalCode(BaseModel):
> >     """Starting IPv4 address, as a 32-bit integer; ie, a.b.c.d
> >     is represented as 256*(256*(256*a+b)+c)+d"""
> >     ipStart = db.IntegerProperty(required=True)
> >     """ISO-3166 alpha2 country code for this IPv4 address block."""
> >     country = db.StringProperty()
> >     """State (AdminCode1) for this block."""
> >     state = db.StringProperty()
> >     """City for this block."""
> >     city = db.StringProperty()
> >     """Postal code for this IPv4 address block."""
> >     postcode = db.StringProperty()
>
> You only query based on ipStart?  Try making the other properties
> db.TextProperty instead of db.StringProperty.  TextProperty values
> do not have indexes automatically created for them.
>
> Not sure how you remove the now unused indexes.  The documentation
> mentions 'appcfg.py vacuum_indexes', but that only seems to apply to
> composite indexes defined in index.yaml.  You may have to loop through
> every entity in your database, select it and then save it back, so the
> properties change from String to Text.
>
> BigTable does seem bloaty...

Your suggestion seems to work.  Changing from StringProperty to
TextProperty did bloat the size of the entities -- I went from about
0.16G after initial import to about 0.80G -- but the datastore storage
doesn't seem to balloon further when I leave it alone and the indexes
get created.  (Or at least, it hasn't yet: fingers crossed!)
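
For context on why only ipStart needs an index, here is a minimal
sketch in plain Python (no App Engine dependency) of the encoding from
the model docstring and of a lookup that picks the block with the
largest ipStart not exceeding the request address.  The helper names
and the sample rows are hypothetical, and the bisect search only
stands in for whatever datastore query the application really runs.

```python
import bisect


def ip_to_int(dotted):
    """Encode a.b.c.d as 256*(256*(256*a+b)+c)+d, as in the model docstring."""
    a, b, c, d = (int(part) for part in dotted.split("."))
    return 256 * (256 * (256 * a + b) + c) + d


# Hypothetical sample rows: (ipStart, country, postcode), sorted by ipStart.
BLOCKS = [
    (ip_to_int("10.0.0.0"), "US", "02139"),
    (ip_to_int("10.0.1.0"), "US", "94043"),
    (ip_to_int("192.168.0.0"), "GB", "SW1A"),
]
STARTS = [row[0] for row in BLOCKS]


def lookup(dotted):
    """Return the block with the largest ipStart <= address, or None."""
    i = bisect.bisect_right(STARTS, ip_to_int(dotted)) - 1
    return BLOCKS[i] if i >= 0 else None


if __name__ == "__main__":
    print(ip_to_int("1.2.3.4"))
    print(lookup("10.0.1.5"))
```

Nothing in the lookup filters or sorts on country, state, city, or
postcode, which is why leaving those properties unindexed should not
affect this kind of query.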

It would be nice if I could explicitly disable single-property indexes
in index.yaml, so that I could remove the unneeded indexes on
StringProperties without paying the 4x storage overhead of converting
them to TextProperty.
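
The growth factors quoted in this thread can be checked with a few
lines of arithmetic.  This is only a sketch using the rounded figures
above, and it assumes 1G = 1024M, which the quota page may or may not
use.

```python
# Rounded figures quoted in the thread.
csv_gb = 85 / 1024.0        # raw CSV size, assuming 1G = 1024M
after_import_gb = 0.16      # StringProperty entities right after upload
after_text_gb = 0.80        # after switching to TextProperty
quota_gb = 1.0              # where usage ended up pegged with indexes

import_factor = after_import_gb / csv_gb
text_factor = after_text_gb / after_import_gb
text_overhead = (after_text_gb - after_import_gb) / after_import_gb
index_factor = quota_gb / after_import_gb

if __name__ == "__main__":
    print("CSV -> datastore:        %.2fx" % import_factor)
    print("String -> Text total:    %.2fx" % text_factor)
    print("String -> Text overhead: %.2fx" % text_overhead)
    print("import -> quota cap:     at least %.2fx" % index_factor)
```

On these numbers the TextProperty data is about 5x the size of the
initial import, which is 4x extra on top of the original 0.16G.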
  --scott