Re: [google-appengine] Re: ~7 GB of ghost data???

2010-03-23 Thread Nick Johnson (Google)
Hi,

On Tue, Mar 23, 2010 at 10:25 AM, homunq jameson.qu...@gmail.com wrote:



 On Mar 22, 3:48 pm, Nick Johnson (Google) nick.john...@google.com
 wrote:
  On Mon, Mar 22, 2010 at 8:45 PM, homunq jameson.qu...@gmail.com wrote:
   OK, after hashing it out on IRC, I see that I have to erase my data
   and start again.
 
  Why is that? Wouldn't updating the data be a better option?

 Because everything about it is wrong for saving space - the key names,
 the field names, the indexes, and even in one case the fact of
 breaking a string out into a list (something I did in several cases
 for better searching, one of which is no longer worth it now that I
 realize 10X overhead is easy to hit).

 And because the data import runs smoothly, and I have code for that
 already.

 

 Watching my deletion process start to get trapped in molasses, as Eli
 Jones mentions above, I have to ask two things again:

 1. Is there ANY ANY way to delete all indexes on a given property
 name? Without worrying about keeping indexes in order when I'm just
 paring them down to 0, I'd just be running through key names and
 deleting them. It seems that would be much faster. (If it's any help,
 I strongly suspect that most of my key names are globally unique
 across all of Google).


No - that would violate the constraint that indexes are always kept in sync
with the data they refer to.



 2. What is the reason for the slowdown? If I understand his suggestion
 to delete every 10th record, Eli Jones seems to suspect that it's
 because there's some kind of resource conflict on specific sections of
 storage, thus the solution is to attempt to spread your load across
 machines. I don't see why that would cause a gradual slowdown. My best
 theory is that write-then-delete leaves the index somehow a little
 messier (for instance, maybe the index doesn't fully recover the
 unused space because it expects you to fill it again) and that when
 you do it on a massive scale you get massively messy and slow indexes.
 Thus, again, I suspect this question reduces to question 1, although I
 guess that if my theory is right a compress/garbage-collect/degunking
 call for the indexes would be (for me) second best after a way to nuke
 them.


Deletes using the naive approach slow down because when a record is deleted
in Bigtable, it simply inserts a 'tombstone' record indicating the original
record is deleted - the record isn't actually removed entirely from the
datastore until the tablet it's on does its next compaction cycle. Until
then, every subsequent query has to skip over the tombstone records to find
the live records.

This is easy to avoid: Use cursors to delete records sequentially. That way,
your queries won't be skipping the same tombstoned records over and over
again - O(n) instead of O(n^2)!
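
To illustrate, here is a minimal sketch of that cursor-based delete under
the Python google.appengine.ext.db API; "MyModel" and the batch size are
assumptions, not anything from this thread:

# Hypothetical sketch: sequential, cursor-based delete with keys-only queries.
from google.appengine.ext import db

class MyModel(db.Model):
    pass  # stand-in for the real kind

def delete_all(batch_size=500):
    query = MyModel.all(keys_only=True)  # keys only: no entity payloads fetched
    cursor = None
    while True:
        if cursor:
            query.with_cursor(cursor)    # resume past rows already deleted
        keys = query.fetch(batch_size)
        if not keys:
            break
        db.delete(keys)                  # bulk delete, up to 500 keys per call
        cursor = query.cursor()          # remember the position: O(n), not O(n^2)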

-Nick Johnson







-- 
Nick Johnson, Developer Programs Engineer, App Engine
Google Ireland Ltd. :: Registered in Dublin, Ireland, Registration Number:
368047




Re: [google-appengine] Re: ~7 GB of ghost data???

2010-03-23 Thread Nick Johnson (Google)
On Tue, Mar 23, 2010 at 1:57 PM, homunq jameson.qu...@gmail.com wrote:


 
   Watching my deletion process start to get trapped in molasses, as Eli
   Jones mentions above, I have to ask two things again:
 
   1. Is there ANY ANY way to delete all indexes on a given property
   name? Without worrying about keeping indexes in order when I'm just
   paring them down to 0, I'd just be running through key names and
   deleting them. It seems that would be much faster. (If it's any help,
   I strongly suspect that most of my key names are globally unique
   across all of Google).
 
  No - that would violate the constraint that indexes are always kept in sync
  with the data they refer to.
 

 It seems to me that having no index at all is the same situation as if
 the property was indexed=False from the beginning. If that's so, it
 can't be violating a hard constraint.


Internally, indexed fields are stored in the 'properties' list in the Entity
Protocol Buffer, while unindexed fields are stored in the
'unindexed_properties' list in the Entity PB. The only way to change a
property's indexing status is to fetch the entities and store them again.
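
As a sketch of what that fetch-and-store can look like under the Python db
API (model and property names here are hypothetical): redefine the property
with indexed=False, then re-put the entities so the value moves into
unindexed_properties on the next write:

# Hypothetical sketch: re-storing entities under a new property definition.
from google.appengine.ext import db

class Record(db.Model):
    # was a plain db.StringProperty(); indexed=False makes the next put()
    # write it to 'unindexed_properties' and drop its index rows
    payload = db.StringProperty(indexed=False)

def deindex_batch(batch_size=200):
    records = Record.all().fetch(batch_size)
    db.put(records)  # each rewrite stores the property unindexed
    return len(records)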



 
   2. What is the reason for the slowdown? If I understand his suggestion
   to delete every 10th record, Eli Jones seems to suspect that it's
   because there's some kind of resource conflict on specific sections of
   storage, thus the solution is to attempt to spread your load across
   machines. I don't see why that would cause a gradual slowdown. My best
   theory is that write-then-delete leaves the index somehow a little
   messier (for instance, maybe the index doesn't fully recover the
   unused space because it expects you to fill it again) and that when
   you do it on a massive scale you get massively messy and slow indexes.
   Thus, again, I suspect this question reduces to question 1, although I
   guess that if my theory is right a compress/garbage-collect/degunking
   call for the indexes would be (for me) second best after a way to nuke
   them.
 
  Deletes using the naive approach slow down because when a record is
  deleted in Bigtable, it simply inserts a 'tombstone' record indicating
  the original record is deleted - the record isn't actually removed
  entirely from the datastore until the tablet it's on does its next
  compaction cycle. Until then, every subsequent query has to skip over
  the tombstone records to find the live records.
 
  This is easy to avoid: Use cursors to delete records sequentially. That
  way, your queries won't be skipping the same tombstoned records over
  and over again - O(n) instead of O(n^2)!
 

 Thanks for explaining. Can you say anything about how often the
 compaction cycles are? Just an order of magnitude - hours, days, or
 weeks?


They're based on the quantity of modifications to data in a given tablet.
Doing many inserts, updates or deletes will, sooner or later, cause a
compaction.

-Nick Johnson



 Thanks,
 Jameson





-- 
Nick Johnson, Developer Programs Engineer, App Engine
Google Ireland Ltd. :: Registered in Dublin, Ireland, Registration Number:
368047




Re: [google-appengine] Re: ~7 GB of ghost data???

2010-03-22 Thread Patrick Twohig
Hey Nick,

Just out of curiosity, how many properties would it take to get that amount
of wasted space in overhead? Are we talking about entities with on the order
of tens, hundreds, or thousands of properties?


On Mon, Mar 22, 2010 at 9:07 AM, homunq jameson.qu...@gmail.com wrote:

 OK, I guess I'm guilty on all counts.

 Clearly, I can fix that moving forward, though it will cost me a lot
 of CPU to fix the data I've already entered. But as a short-term
 stopgap, is there any way to delete entire default indexes for a given
 property? (I mean, anything besides setting indexed=False and then
 touching each entity one-by-one). You can vacuum custom indexes - can
 you do it with indexes created by default?

 Thanks,
 Jameson

 On 22 mar, 03:42, Nick Johnson (Google) nick.john...@google.com
 wrote:
  Hi,
 
  The discrepancy between datastore stats volume and stored data is
  generally due to indexing overhead, which is not included in the
  datastore stats. This can be very high for entities with many
  properties, or with long entity and property names or entity keys. Do
  you have reason to suppose that's not the case in your situation?
 
  -Nick Johnson
 
 
 
 
 
  On Sun, Mar 21, 2010 at 3:39 AM, homunq jameson.qu...@gmail.com wrote:
   Something is wrong. My app is showing with 7.42GB of total stored
   data, but only 615 MB of datastore. There is only one version string
   uploaded, which is almost 150MB, and nothing in the blobstore. This
   discrepancy has been getting worse - several hours ago (longer than
   the period since datastore statistics were updated, if you're
   wondering), there were the same 615 MB in the datastore, and only
   3.09GB of total stored data. (at that time, my theory was that it
   was old uploads of tweaks to the same version - but the numbers have
   gone far, far beyond that explanation now.) It's not some exploding
   index; the only non-default index I have is on an entity type with
   just 33 entities.
 
   Here's the line from my dashboard:
   Total Stored Data | $0.005/GByte-day | 82% | 7.42 of 9.00 GBytes |
   $0.04 / $0.04
 
   And here is the word from my datastore statistics:
   Last updated: 1:32:13 ago | Total number of entities: 232,867 |
   Size of all entities: 615 MBytes
   (metadata 11%, if that matters)
 
   Please, can someone help me figure out this issue? I'd be happy to
   share any info or code which would help track this down. My app id is
   vulahealth.
 
 





-- 
Patrick H. Twohig.

Namazu Studios
P.O. Box 34161
San Diego, CA 92163-4161




Re: [google-appengine] Re: ~7 GB of ghost data???

2010-03-22 Thread Nick Johnson (Google)
Hi Patrick,

An overhead factor of 12 (as observed below) is high, but not outrageous.
With long model names and property names, this could happen with relatively
few indexed properties - on the order of tens, at most.
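
As purely illustrative arithmetic (the on-disk layout isn't public, so every
number below is an assumption): each indexed property gets an ascending and
a descending built-in index row, and each row repeats the kind name, the
property name, the value and the full entity key, so long names multiply
quickly:

# Back-of-envelope estimate of indexing overhead; all sizes are assumed.
kind_name = 40   # bytes in a long model name
prop_name = 30   # average property name length
key_name  = 60   # long, globally-unique key names
value     = 20   # average serialized property value
props     = 15   # indexed properties per entity

entity_size = kind_name + key_name + props * (prop_name + value)  # ~850 bytes
index_row   = kind_name + prop_name + value + key_name            # ~150 bytes
index_size  = props * 2 * index_row      # ascending + descending per property
print(index_size / float(entity_size))  # ~5x before composite indexes

List properties multiply this further, since every element gets its own pair
of index rows, and composite indexes and per-row metadata push the factor
higher still.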

-Nick Johnson

On Mon, Mar 22, 2010 at 8:07 PM, Patrick Twohig
patr...@namazustudios.comwrote:

 Hey Nick,

 Just out of curiosity, how many properties would it take to get that amount
 of wasted space in overhead? Are we talking about entities with on the order
 of tens, hundreds, or thousands of properties?



 On Mon, Mar 22, 2010 at 9:07 AM, homunq jameson.qu...@gmail.com wrote:

 OK, I guess I'm guilty on all counts.

 Clearly, I can fix that moving forward, though it will cost me a lot
 of CPU to fix the data I've already entered. But as a short-term
 stopgap, is there any way to delete entire default indexes for a given
 property? (I mean, anything besides setting indexed=False and then
 touching each entity one-by-one). You can vacuum custom indexes - can
 you do it with indexes created by default?

 Thanks,
 Jameson

 On 22 mar, 03:42, Nick Johnson (Google) nick.john...@google.com
 wrote:
  Hi,
 
  The discrepancy between datastore stats volume and stored data is
  generally due to indexing overhead, which is not included in the
  datastore stats. This can be very high for entities with many
  properties, or with long entity and property names or entity keys. Do
  you have reason to suppose that's not the case in your situation?
 
  -Nick Johnson
 
 
 
 
 
  On Sun, Mar 21, 2010 at 3:39 AM, homunq jameson.qu...@gmail.com
 wrote:
   Something is wrong. My app is showing with 7.42GB of total stored
   data, but only 615 MB of datastore. There is only one version string
   uploaded, which is almost 150MB, and nothing in the blobstore. This
   discrepancy has been getting worse - several hours ago (longer than
   the period since datastore statistics were updated, if you're
   wondering), there were the same 615 MB in the datastore, and only
   3.09GB of total stored data. (at that time, my theory was that it
   was old uploads of tweaks to the same version - but the numbers have
   gone far, far beyond that explanation now.) It's not some exploding
   index; the only non-default index I have is on an entity type with
   just 33 entities.
 
   Here's the line from my dashboard:
   Total Stored Data | $0.005/GByte-day | 82% | 7.42 of 9.00 GBytes |
   $0.04 / $0.04
 
   And here is the word from my datastore statistics:
   Last updated: 1:32:13 ago | Total number of entities: 232,867 |
   Size of all entities: 615 MBytes
   (metadata 11%, if that matters)
 
   Please, can someone help me figure out this issue? I'd be happy to
   share any info or code which would help track this down. My app id is
   vulahealth.
 
 











-- 
Nick Johnson, Developer Programs Engineer, App Engine
Google Ireland Ltd. :: Registered in Dublin, Ireland, Registration Number:
368047




Re: [google-appengine] Re: ~7 GB of ghost data???

2010-03-22 Thread Patrick Twohig
I'd use a cursor on the task queue.  Do bulk deletes in blocks of 500 (I
think that's the most keys you can pass to delete on a single call) and it
shouldn't be that hard to wipe it out.
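
A rough sketch of that pattern (the handler URL, kind name and parameters
are hypothetical, and the taskqueue import uses the 'labs' path current at
the time):

# Hypothetical sketch: a task that deletes one block per run and re-enqueues
# itself with a cursor, instead of looping inside a single request.
from google.appengine.api.labs import taskqueue
from google.appengine.ext import db, webapp

class PurgeWorker(webapp.RequestHandler):
    def post(self):
        query = db.GqlQuery("SELECT __key__ FROM MyModel")
        cursor = self.request.get('cursor')
        if cursor:
            query.with_cursor(cursor)
        keys = query.fetch(500)  # 500: the per-call delete limit noted above
        if keys:
            db.delete(keys)
            # chain the next block of 500 as a fresh task
            taskqueue.add(url='/tasks/purge', params={'cursor': query.cursor()})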

Cheers!

On Mon, Mar 22, 2010 at 1:45 PM, homunq jameson.qu...@gmail.com wrote:

 OK, after hashing it out on IRC, I see that I have to erase my data
 and start again. Since it took me 3 days of CPU quota to add the data,
 I want to know if I can erase it quickly.

 1. Is the overhead for erasing data (and thus whittling down indexes)
 over half the overhead from adding it? Under 10%? Or what? (I don't
 need exact numbers, just approximations.)

 2. If it's more like half - is there some way to just nuke all my data
 and start over?

 Thanks,
 Jameson


 On 22 mar, 03:42, Nick Johnson (Google) nick.john...@google.com
 wrote:
  Hi,
 
  The discrepancy between datastore stats volume and stored data is
  generally due to indexing overhead, which is not included in the
  datastore stats. This can be very high for entities with many
  properties, or with long entity and property names or entity keys. Do
  you have reason to suppose that's not the case in your situation?
 
  -Nick Johnson
 
 
 
 
 
  On Sun, Mar 21, 2010 at 3:39 AM, homunq jameson.qu...@gmail.com wrote:
   Something is wrong. My app is showing with 7.42GB of total stored
   data, but only 615 MB of datastore. There is only one version string
   uploaded, which is almost 150MB, and nothing in the blobstore. This
   discrepancy has been getting worse - several hours ago (longer than
   the period since datastore statistics were updated, if you're
   wondering), there were the same 615 MB in the datastore, and only
   3.09GB of total stored data. (at that time, my theory was that it
   was old uploads of tweaks to the same version - but the numbers have
   gone far, far beyond that explanation now.) It's not some exploding
   index; the only non-default index I have is on an entity type with
   just 33 entities.
 
   Here's the line from my dashboard:
   Total Stored Data | $0.005/GByte-day | 82% | 7.42 of 9.00 GBytes |
   $0.04 / $0.04
 
   And here is the word from my datastore statistics:
   Last updated: 1:32:13 ago | Total number of entities: 232,867 |
   Size of all entities: 615 MBytes
   (metadata 11%, if that matters)
 
   Please, can someone help me figure out this issue? I'd be happy to
   share any info or code which would help track this down. My app id is
   vulahealth.
 
 





-- 
Patrick H. Twohig.

Namazu Studios
P.O. Box 34161
San Diego, CA 92163-4161




Re: [google-appengine] Re: ~7 GB of ghost data???

2010-03-22 Thread Eli Jones
oh man.. well, he's going to be wiping out 7GB of junk... :)

When I went through the process of deleting something like 400MB of junk.. it
was not fun.

First I started off deleting by __key__ in batches of 500, then I had to
limit down to 200.. then down to 100.. then down to 50.. then down to 10..
then it stopped responding for hours (I could not even fetch(1) from the
Model).

There must be a sanctioned way to remove 100,000s of entities, given how
the datastore is structured.  For example, does it make sense to do
something like this:

Use a cursor to:
1. Select __key__ from Model Order By __key__.
2. Append every 10th (or 100th) result to a list, and delete that list for
every 100 or 200 or 500 keys added.
3. Once at the end of the cursor, start over at the beginning.

That way, you wouldn't be deleting everything on the same table at the same
time? The datastore completely died on me when I tried to straight delete
by __key__ using GqlQuery in a loop.. it just kept getting slower and slower.
(I think directly deleting by key_name might work better, but I never had to
do a bulk delete again.. so I have not tested that theory.) A rough sketch of
this idea appears below.
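
A rough, untested sketch of that idea in the Python db API (the kind name,
stride and batch size are all assumptions):

# Hypothetical sketch: walk the keys in order, delete a strided sample per pass.
from google.appengine.ext import db

def strided_delete_pass(stride=10, batch=500):
    query = db.GqlQuery("SELECT __key__ FROM MyModel ORDER BY __key__")
    cursor, picked, seen = None, [], 0
    while True:
        if cursor:
            query.with_cursor(cursor)
        keys = query.fetch(batch)
        if not keys:
            break
        for key in keys:
            if seen % stride == 0:    # keep every 10th key
                picked.append(key)
            seen += 1
        if len(picked) >= batch:      # delete in blocks of up to 500
            db.delete(picked)
            picked = []
        cursor = query.cursor()
    if picked:
        db.delete(picked)             # then start over at the beginning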

On Mon, Mar 22, 2010 at 5:19 PM, Patrick Twohig
patr...@namazustudios.comwrote:

 I'd use a cursor on the task queue.  Do bulk deletes in blocks of 500 (I
 think that's the most keys you can pass to delete on a single call) and it
 shouldn't be that hard to wipe it out.

 Cheers!


 On Mon, Mar 22, 2010 at 1:45 PM, homunq jameson.qu...@gmail.com wrote:

 OK, after hashing it out on IRC, I see that I have to erase my data
 and start again. Since it took me 3 days of CPU quota to add the data,
 I want to know if I can erase it quickly.

 1. Is the overhead for erasing data (and thus whittling down indexes)
 over half the overhead from adding it? Under 10%? Or what? (I don't
 need exact numbers, just approximations.)

 2. If it's more like half - is there some way to just nuke all my data
 and start over?

 Thanks,
 Jameson


 On 22 mar, 03:42, Nick Johnson (Google) nick.john...@google.com
 wrote:
  Hi,
 
  The discrepancy between datastore stats volume and stored data is
  generally due to indexing overhead, which is not included in the
  datastore stats. This can be very high for entities with many
  properties, or with long entity and property names or entity keys. Do
  you have reason to suppose that's not the case in your situation?
 
  -Nick Johnson
 
 
 
 
 
  On Sun, Mar 21, 2010 at 3:39 AM, homunq jameson.qu...@gmail.com
 wrote:
   Something is wrong. My app is showing with 7.42GB of total stored
   data, but only 615 MB of datastore. There is only one version string
   uploaded, which is almost 150MB, and nothing in the blobstore. This
   discrepancy has been getting worse - several hours ago (longer than
   the period since datastore statistics were updated, if you're
   wondering), there were the same 615 MB in the datastore, and only
   3.09GB of total stored data. (at that time, my theory was that it
   was old uploads of tweaks to the same version - but the numbers have
   gone far, far beyond that explanation now.) It's not some exploding
   index; the only non-default index I have is on an entity type with
   just 33 entities.
 
   Here's the line from my dashboard:
   Total Stored Data | $0.005/GByte-day | 82% | 7.42 of 9.00 GBytes |
   $0.04 / $0.04
 
   And here is the word from my datastore statistics:
   Last updated: 1:32:13 ago | Total number of entities: 232,867 |
   Size of all entities: 615 MBytes
   (metadata 11%, if that matters)
 
   Please, can someone help me figure out this issue? I'd be happy to
   share any info or code which would help track this down. My app id is
   vulahealth.
 
 







Re: [google-appengine] Re: ~7 GB of ghost data???

2010-03-22 Thread Nick Johnson (Google)
Hi,

On Mon, Mar 22, 2010 at 8:45 PM, homunq jameson.qu...@gmail.com wrote:

 OK, after hashing it out on IRC, I see that I have to erase my data
 and start again.


Why is that? Wouldn't updating the data be a better option?


 Since it took me 3 days of CPU quota to add the data,
 I want to know if I can erase it quickly.

 1. Is the overhead for erasing data (and thus whittling down indexes)
 over half the overhead from adding it? Under 10%? Or what? (I don't
 need exact numbers, just approximations.)


It should be significantly lower - you can do a keys-only query, and delete
the returned keys.
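
For example, a minimal sketch (kind name and batch size assumed):

# Hypothetical sketch: fetch only keys, then bulk-delete them.
from google.appengine.ext import db

keys = db.GqlQuery("SELECT __key__ FROM MyModel").fetch(500)
db.delete(keys)  # no entity payloads are read, so this is far cheaper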

-Nick Johnson



 2. If it's more like half - is there some way to just nuke all my data
 and start over?

 Thanks,
 Jameson


 On 22 mar, 03:42, Nick Johnson (Google) nick.john...@google.com
 wrote:
  Hi,
 
  The discrepancy between datastore stats volume and stored data is
  generally due to indexing overhead, which is not included in the
  datastore stats. This can be very high for entities with many
  properties, or with long entity and property names or entity keys. Do
  you have reason to suppose that's not the case in your situation?
 
  -Nick Johnson
 
 
 
 
 
  On Sun, Mar 21, 2010 at 3:39 AM, homunq jameson.qu...@gmail.com wrote:
   Something is wrong. My app is showing with 7.42GB of total stored
   data, but only 615 MB of datastore. There is only one version string
   uploaded, which is almost 150MB, and nothing in the blobstore. This
   discrepancy has been getting worse - several hours ago (longer than
   the period since datastore statistics were updated, if you're
   wondering), there were the same 615 MB in the datastore, and only
   3.09GB of total stored data. (at that time, my theory was that it
   was old uploads of tweaks to the same version - but the numbers have
   gone far, far beyond that explanation now.) It's not some exploding
   index; the only non-default index I have is on an entity type with
   just 33 entities.
 
   Here's the line from my dashboard:
   Total Stored Data | $0.005/GByte-day | 82% | 7.42 of 9.00 GBytes |
   $0.04 / $0.04
 
   And here is the word from my datastore statistics:
   Last updated: 1:32:13 ago | Total number of entities: 232,867 |
   Size of all entities: 615 MBytes
   (metadata 11%, if that matters)
 
   Please, can someone help me figure out this issue? I'd be happy to
   share any info or code which would help track this down. My app id is
   vulahealth.
 
 





-- 
Nick Johnson, Developer Programs Engineer, App Engine
Google Ireland Ltd. :: Registered in Dublin, Ireland, Registration Number:
368047
