[google-appengine] Datastore usage goes up over time?

2009-03-13 Thread C. Scott Ananian

I imported about 3 million very simple records into appengine, using
the bulk_uploader tool.  The raw size of the input CSV was 85M.  After
upload, google showed my datastore quota usage as being about 0.16G --
a factor of 2 increase, but reasonable.  I left the application for a
few days, and logged back in to check on it: suddenly my datastore
usage is pegged at 100% (1.0G)!
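
As a rough sanity check on those figures, the per-record sizes can be worked out as below. This is only back-of-the-envelope arithmetic: it takes "about 3 million" as exactly 3,000,000 records and treats M and G as decimal units, both of which are assumptions.

```python
# Figures quoted above; the record count is approximate, and reading
# M as 10**6 bytes and G as 10**9 bytes is an assumption.
RECORDS = 3 * 10**6
CSV_BYTES = 85 * 10**6          # raw input CSV, 85M
AFTER_UPLOAD_BYTES = 160 * 10**6  # quota usage right after upload, 0.16G
QUOTA_BYTES = 10**9             # quota ceiling that usage later hit, 1.0G


def bytes_per_record(total_bytes, records=RECORDS):
    """Average number of bytes attributed to each record."""
    return total_bytes / records


csv_per_record = bytes_per_record(CSV_BYTES)              # about 28 bytes
stored_per_record = bytes_per_record(AFTER_UPLOAD_BYTES)  # about 53 bytes
quota_per_record = bytes_per_record(QUOTA_BYTES)          # about 333 bytes

# Reported usage at the quota ceiling relative to the raw CSV: about 12x.
growth_vs_csv = QUOTA_BYTES / CSV_BYTES
```

Since the usage display stops at 100%, the last two numbers are lower bounds, not measurements.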

First: why did my datastore usage grow over time?  My application
isn't writing records to the datastore, just serving queries against
the static data set.  Is it because Google created indexes?  Can I
stop that from happening?

Second: how do I find out how much space my data is "really" taking up
(according to the quota system)?  I'm reluctant to turn on billing
until I know whether the suddenly-bloated version of my data is 1.01G
or 16G -- and until I know whether it will continue to bloat further!

http://cscott-geotest.appspot.com/ is the application URL; as you can
see, it's a very simple application which returns a guessed zip code
and country for the user based on the IP address their request comes
from.  The model looks like:

class IP2PostalCode(BaseModel):
    """Starting IPv4 address, as a 32-bit integer; ie, a.b.c.d
    is represented as 256*(256*(256*a+b)+c)+d"""
    ipStart = db.IntegerProperty(required=True)
    """ISO-3166 alpha2 country code for this IPv4 address block."""
    country = db.StringProperty()
    """State (AdminCode1) for this block."""
    state = db.StringProperty()
    """City for this block."""
    city = db.StringProperty()
    """Postal code for this IPv4 address block."""
    postcode = db.StringProperty()
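
For reference, the ipStart encoding described in the docstring, together with the kind of lookup it makes possible, can be sketched in plain Python. The names ip_to_int and find_block are made up for illustration, and the in-memory sorted list only stands in for the datastore; how the application actually queries is not shown here, so treat the lookup as an assumption.

```python
import bisect


def ip_to_int(dotted):
    """Convert a dotted-quad string "a.b.c.d" to 256*(256*(256*a+b)+c)+d."""
    parts = [int(p) for p in dotted.split(".")]
    if len(parts) != 4 or any(p < 0 or p > 255 for p in parts):
        raise ValueError("not a dotted-quad IPv4 address: %r" % (dotted,))
    a, b, c, d = parts
    return 256 * (256 * (256 * a + b) + c) + d


def find_block(starts, ip):
    """Return the index of the block with the largest ipStart <= ip.

    `starts` must be sorted in ascending order. Returns None when ip is
    smaller than every ipStart, i.e. no block covers it.
    """
    i = bisect.bisect_right(starts, ip) - 1
    return i if i >= 0 else None


# Hypothetical block starts, for illustration only.
starts = [ip_to_int("1.0.0.0"), ip_to_int("1.2.0.0"), ip_to_int("2.0.0.0")]
index = find_block(starts, ip_to_int("1.2.3.4"))  # index 1, the 1.2.0.0 block
```

A lookup of this shape needs an ordering on ipStart only; the other four properties are just returned with the matching record.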

Can someone help me figure out what's going on here?

--~--~-~--~~~---~--~~
You received this message because you are subscribed to the Google Groups 
"Google App Engine" group.
To post to this group, send email to google-appengine@googlegroups.com
To unsubscribe from this group, send email to 
google-appengine+unsubscr...@googlegroups.com
For more options, visit this group at 
http://groups.google.com/group/google-appengine?hl=en
-~--~~~~--~~--~--~---



[google-appengine] Re: Datastore usage goes up over time?

2009-03-20 Thread C. Scott Ananian

On Mar 15, 7:48 am, Stephen wrote:
> On Mar 13, 9:39 pm, "C. Scott Ananian" wrote:
> > I imported about 3 million very simple records into appengine, using
> > the bulk_uploader tool.  The raw size of the input CSV was 85M.  After
> > upload, google showed my datastore quota usage as being about 0.16G --
> > a factor of 2 increase, but reasonable.  I left the application for a
> > few days, and logged back in to check on it: suddenly my datastore
> > usage is pegged at 100% (1.0G)!
>
> > First: why did my datastore usage grow over time?  My application
> > isn't writing records to the datastore, just serving queries against
> > the static data set.  Is it because Google created indexes?  Can I
> > stop that from happening?
>
> > Second: how do I find out how much space my data is "really" taking up
> > (according to the quota system)?  I'm reluctant to turn on billing
> > until I know whether the suddenly-bloated version of my data is 1.01G
> > or 16G -- and until I know whether it will continue to bloat further!
>
> > http://cscott-geotest.appspot.com/ is the application URL; as you can
> > see, it's a very simple application which returns a guessed zip code
> > and country for the user based on the IP address their request comes
> > from.  The model looks like:
>
> > class IP2PostalCode(BaseModel):
> >     """Starting IPv4 address, as a 32-bit integer; ie, a.b.c.d
> >     is represented as 256*(256*(256*a+b)+c)+d"""
> >     ipStart = db.IntegerProperty(required=True)
> >     """ISO-3166 alpha2 country code for this IPv4 address block."""
> >     country = db.StringProperty()
> >     """State (AdminCode1) for this block."""
> >     state = db.StringProperty()
> >     """City for this block."""
> >     city = db.StringProperty()
> >     """Postal code for this IPv4 address block."""
> >     postcode = db.StringProperty()
>
> You only query based on ipStart?  Try making the other properties
> db.TextProperty instead of db.StringProperty; TextProperty values do
> not have indexes automatically created for them.
>
> Not sure how you remove the now unused indexes.  The documentation
> mentions 'appcfg.py vacuum_indexes', but that only seems to apply to
> composite indexes defined in index.yaml.  You may have to loop through
> every entity in your database, select it and then save it back, so the
> properties change from String to Text.
>
> BigTable does seem bloaty...

Your suggestion seems to work.  Changing from StringProperty to
TextProperty did bloat the size of the entities -- I went from about
0.16G after initial import to about .80G -- but the datastore storage
doesn't seem to balloon further when I leave it alone and the indexes
get created.  (Or at least, it hasn't yet: fingers crossed!)

It would be nice if I could explicitly disable single-property indexes
in index.yaml, so that I could remove the unneeded indexes on
StringProperties without paying the 4x storage overhead of converting
them to TextProperty.
  --scott



[google-appengine] Re: Datastore usage goes up over time?

2009-03-20 Thread C. Scott Ananian

On Mar 20, 1:58 pm, "C. Scott Ananian" wrote:
> Your suggestion seems to work.  Changing from StringProperty to
> TextProperty did bloat the size of the entities -- I went from about
> 0.16G after initial import to about .80G -- but the datastore storage
> doesn't seem to balloon further when I leave it alone and the indexes
> get created.  (Or at least, it hasn't yet: fingers crossed!)

I take it back: a few hours after import the datastore usage ballooned
and I'm "over quota" again.  So it's not only related to indexes, or
there are some indexes I didn't manage to disable?  I have no clue.
 --scott