[google-appengine] Re: having trouble with uploading app

2010-12-07 Thread homunq
My partner and I are both getting this 503-unexpected problem,
repeatedly, on different platforms, with 1.3.8. (We're in different
countries, too.)

We've done nothing with error handlers, and it was working for both of
us as of yesterday.




[google-appengine] Re: ~7 GB of ghost data???

2010-03-23 Thread homunq


On Mar 22, 3:48 pm, Nick Johnson (Google) nick.john...@google.com
wrote:
 On Mon, Mar 22, 2010 at 8:45 PM, homunq jameson.qu...@gmail.com wrote:
  OK, after hashing it out on IRC, I see that I have to erase my data
  and start again.

 Why is that? Wouldn't updating the data be a better option?

Because everything about it is wrong for saving space - the key names,
the field names, the indexes, and even, in one case, the fact of
breaking a string out into a list (something I did for better
searching in several cases, one of which isn't worth it now that I
realize 10X overhead is easy to hit).

And because the data import runs smoothly, and I have code for that
already.



Watching my deletion process start to get trapped in molasses, as Eli
Jones mentions above, I have to ask two things again:

1. Is there ANY ANY way to delete all indexes on a given property
name? Without worrying about keeping indexes in order when I'm just
paring them down to 0, I'd just be running through key names and
deleting them. It seems that would be much faster. (If it's any help,
I strongly suspect that most of my key names are globally unique
across all of Google).

2. What is the reason for the slowdown? If I understand his suggestion
to delete every 10th record, Eli Jones seems to suspect that it's
because there's some kind of resource conflict on specific sections of
storage, thus the solution is to attempt to spread your load across
machines. I don't see why that would cause a gradual slowdown. My best
theory is that write-then-delete leaves the index somehow a little
messier (for instance, maybe the index doesn't fully recover the
unused space because it expects you to fill it again) and that when
you do it on a massive scale you get massively messy and slow indexes.
Thus, again, I suspect this question reduces to question 1, although I
guess that if my theory is right a compress/garbage-collect/degunking
call for the indexes would be (for me) second best after a way to nuke
them.




[google-appengine] Re: ~7 GB of ghost data???

2010-03-23 Thread homunq


  Watching my deletion process start to get trapped in molasses, as Eli
  Jones mentions above, I have to ask two things again:

  1. Is there ANY ANY way to delete all indexes on a given property
  name? Without worrying about keeping indexes in order when I'm just
  paring them down to 0, I'd just be running through key names and
  deleting them. It seems that would be much faster. (If it's any help,
  I strongly suspect that most of my key names are globally unique
  across all of Google).

 No - that would violate the constraint that indexes are always kept in sync
 with the data they refer to.


It seems to me that having no index at all is the same situation as if
the property was indexed=False from the beginning. If that's so, it
can't be violating a hard constraint.


  2. What is the reason for the slowdown? If I understand his suggestion
  to delete every 10th record, Eli Jones seems to suspect that it's
  because there's some kind of resource conflict on specific sections of
  storage, thus the solution is to attempt to spread your load across
  machines. I don't see why that would cause a gradual slowdown. My best
  theory is that write-then-delete leaves the index somehow a little
  messier (for instance, maybe the index doesn't fully recover the
  unused space because it expects you to fill it again) and that when
  you do it on a massive scale you get massively messy and slow indexes.
  Thus, again, I suspect this question reduces to question 1, although I
  guess that if my theory is right a compress/garbage-collect/degunking
  call for the indexes would be (for me) second best after a way to nuke
  them.

 Deletes using the naive approach slow down because when a record is deleted
 in Bigtable, it simply inserts a 'tombstone' record indicating the original
 record is deleted - the record isn't actually removed entirely from the
 datastore until the tablet it's on does its next compaction cycle. Until
 then, every subsequent query has to skip over the tombstone records to find
 the live records.

 This is easy to avoid: Use cursors to delete records sequentially. That way,
 your queries won't be skipping the same tombstoned records over and over
 again - O(n) instead of O(n^2)!
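
(For illustration - a minimal sketch of the cursor-walking delete Nick
describes, assuming the Python db API of that era; the helper name
delete_in_batches and the model_class argument are hypothetical. It
would normally be driven from the task queue or the remote_api shell so
that no single request runs too long.)

from google.appengine.ext import db

def delete_in_batches(model_class, batch_size=500):
    # Walk forward with a query cursor so each fetch resumes past the
    # tombstones left behind by the previous batch of deletes.
    cursor = None
    while True:
        q = model_class.all(keys_only=True)
        if cursor:
            q.with_cursor(cursor)   # resume where the last batch ended
        keys = q.fetch(batch_size)
        if not keys:
            break
        db.delete(keys)
        cursor = q.cursor()         # position for the next pass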


Thanks for explaining. Can you say anything about how often the
compaction cycles are? Just an order of magnitude - hours, days, or
weeks?

Thanks,
Jameson




[google-appengine] Re: HELP! 7 GB of ghost data - billable but not in datastore, blobs, or versions.

2010-03-22 Thread homunq
Thanks for responding. I've starred the bug - it seems basic. More
answers below.

On Mar 21, 7:06 pm, Eli Jones eli.jo...@gmail.com wrote:
 Depending on how your Models are defined (how many properties there are and
 what the property types are) and depending on custom indexes, 10 to 1 is
 reasonable.

Even if I have essentially no custom indexes? (Only one ever, on a few
dozen entities with two integers each which handle cron housekeeping).


 For my app, I aggressively set most properties to Indexed=false and I have
 no custom indexes.  I'm still at 2 to 1 for Datastore usage versus Size of
 All Entities.

When you say "datastore usage", you mean "total data", right? Or is
there some other place this number is reported which I've missed?


 To fully explore what is going on, you'd need to post what your Model
 definitions are.. and then post what custom indexes are defined on any of
 the properties.

I have no custom indexes. I have only two entity kinds. 99.9% of the data
is in the Expando, which has about 45 string properties (initially all
indexed, now around 30 are; many frequently the empty string); 4 integer
properties; 4 dates; 4 lists (about 4-8 members typically); from 3-18
expando string lists (of 20 types total; an average of 6 lists with 2
members each); and 2 expando strings (mostly but not always present).

 If you've been creating and then later on deleting custom indexes or
 changing your Models around.. running vacuum indexes might help (but that's
 just a shot in the dark).

I have done nothing with custom indexes, except the one minuscule
housekeeping one on a few entities. For the main entity kind, is there
any way to delete a default index without re-touching all the entities
one-by-one?


 On Sun, Mar 21, 2010 at 7:19 PM, homunq jameson.qu...@gmail.com wrote:
  My app is showing only 739 MB of datastore data, only 1 version
  (150MB), no blobs, a tiny amount of index and memcache use (2K in
  each case) - and yet 7.7GB of billable data! What the heck is going
  on? How can I fix it?

  Last updated: 3:27:42 ago | Total number of entities: 279,689 | Size of all entities: 739 MBytes

  Total Stored Data ($0.005/GByte-day): 86% used, 7.70 of 9.00 GBytes ($0.04 / $0.04)

  This situation has persisted for well over 24 hours now, it's not just
  a figment of the update period. Also, there's negligible non-default
  indexes, and probably a total of around 2 GB if you count all my
  uploaded code versions ever (but since they all had the same version
  string, they should have overwritten). I probably have a fair amount
  of data in the logs - I have a number of cron jobs, collectively they
  run just under twice a minute - but supposedly logs are not billable
  data.

  This is an app I'm developing for a client. Until this issue is fixed,
  I'm certainly not going to bill for my work, which is otherwise done.
  So I'm anxious to fix this ASAP. If there's somewhere else I should/
  could be taking this question (aside from IRC, where I've brought it
  up twice), I'd be happy to learn it. If there's any further info which
  could help resolve this, I'd be happy to share it, too. (My app id is
  vulahealth)

  Thanks,
  Jameson

  (Second post, I think the first post was moderated into oblivion - it
  was my first post on this list.)





[google-appengine] Re: ~7 GB of ghost data???

2010-03-22 Thread homunq
OK, I guess I'm guilty on all counts.

Clearly, I can fix that moving forward, though it will cost me a lot
of CPU to fix the data I've already entered. But as a short-term
stopgap, is there any way to delete entire default indexes for a given
property? (I mean, anything besides setting indexed=False and then
touching each entity one-by-one). You can vacuum custom indexes - can
you do it with indexes created by default?
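
(For illustration - a minimal sketch of the "set indexed=False and
re-put each entity" workaround being discussed here, assuming the
Python db API; the Record kind and the reput_batch helper are
hypothetical, and this is not an official recipe.)

from google.appengine.ext import db

class Record(db.Expando):
    # Hypothetical kind. With indexed=False, re-writing an entity
    # produces no built-in index rows for this property, and its old
    # rows are dropped.
    notes = db.StringProperty(indexed=False)

def reput_batch(cursor=None, batch_size=200):
    # Re-put one batch of entities and return a cursor for the next
    # call (e.g. from a task queue), so the whole kind gets touched
    # without any single request timing out.
    q = Record.all()
    if cursor:
        q.with_cursor(cursor)
    batch = q.fetch(batch_size)
    if not batch:
        return None
    db.put(batch)
    return q.cursor()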

Thanks,
Jameson

On Mar 22, 03:42, Nick Johnson (Google) nick.john...@google.com
wrote:
 Hi,

 The discrepancy between datastore stats volume and stored data is generally
 due to indexing overhead, which is not included in the datastore stats. This
 can be very high for entities with many properties, or with long entity and
 property names or entity keys. Do you have reason to suppose that's not the
 case in your situation?

 -Nick Johnson





 On Sun, Mar 21, 2010 at 3:39 AM, homunq jameson.qu...@gmail.com wrote:
  Something is wrong. My app is showing with 7.42GB of total stored
  data, but only 615 MB of datastore. There is only one version string
  uploaded, which is almost 150MB, and nothing in the blobstore. This
  discrepancy has been getting worse - several hours ago (longer than
  the period since datastore statistics were updated, if you're
  wondering), there were the same 615 MB in the datastore, and only
  3.09GB of total stored data. (at that time, my theory was that it
  was old uploads of tweaks to the same version - but the numbers have
  gone far, far beyond that explanation now.) It's not some exploding
  index; the only non-default index I have is on an entity type with
  just 33 entities.

  Here's the line from my dashboard:
  Total Stored Data ($0.005/GByte-day): 82% used, 7.42 of 9.00 GBytes ($0.04 / $0.04)

  And here is the word from my datastore statistics:
  Last updated: 1:32:13 ago | Total number of entities: 232,867 | Size of all entities: 615 MBytes
  (metadata 11%, if that matters)

  Please, can someone help me figure out this issue? I'd be happy to
  share any info or code which would help track this down. My app id is
  vulahealth.


 --
 Nick Johnson, Developer Programs Engineer, App Engine
 Google Ireland Ltd. :: Registered in Dublin, Ireland, Registration Number:
 368047




[google-appengine] Re: HELP! 7 GB of ghost data - billable but not in datastore, blobs, or versions.

2010-03-22 Thread homunq
Sorry, this thread is a duplicate of that other one:
http://groups.google.com/group/google-appengine/browse_thread/thread/2a116e341b97c6fd/6ad663cd210032b2?lnk=gst
I got impatient and reposted before the other one got moderated.
Useful word from Google, along with my response, is over there -
basically, I ask whether there is any way to bulk-delete default
indexes.

On Mar 22, 05:44, Wooble geoffsp...@gmail.com wrote:
 On Mar 22, 4:20 am, homunq jameson.qu...@gmail.com wrote:

  I have no custom indexes. I have only two entity kinds. 99.9% of the data
  is in the Expando, which has about 45 string properties (initially all
  indexed, now around 30 are; many frequently the empty string); 4 integer
  properties; 4 dates; 4 lists (about 4-8 members typically); from 3-18
  expando string lists (of 20 types total; an average of 6 lists with 2
  members each); and 2 expando strings (mostly but not always present).

 For each indexed property, there will be 2 index entries (one in the
 forward index and one in the reverse index), each of which contains
 the actual data, the name of the property, and the name of your
 application; and if there's a parent entity, information about the
 parent will be in the key as well.  This quickly adds up.
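
(For illustration - a rough back-of-envelope estimate of how this adds
up; the per-field byte counts below are assumptions, not official
figures.)

def estimate_index_bytes(indexed_values,
                         app_id_bytes=20, key_bytes=60,
                         prop_name_bytes=15, value_bytes=20):
    # Very rough guess: each indexed value gets about two index rows
    # (forward and reverse), and each row repeats the app id, entity
    # key, property name and value.
    per_row = app_id_bytes + key_bytes + prop_name_bytes + value_bytes
    return 2 * indexed_values * per_row

# e.g. ~30 indexed strings plus ~12 indexed list members per entity:
print(estimate_index_bytes(42))  # roughly 9.7 KB of index data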




[google-appengine] Re: ~7 GB of ghost data???

2010-03-22 Thread homunq
OK, after hashing it out on IRC, I see that I have to erase my data
and start again. Since it took me 3 days of CPU quota to add the data,
I want to know if I can erase it quickly.

1. Is the overhead for erasing data (and thus whittling down indexes)
over half the overhead from adding it? Under 10%? Or what? (I don't
need exact numbers, just approximations.)

2. If it's more like half - is there some way to just nuke all my data
and start over?

Thanks,
Jameson


On Mar 22, 03:42, Nick Johnson (Google) nick.john...@google.com
wrote:
 Hi,

 The discrepancy between datastore stats volume and stored data is generally
 due to indexing overhead, which is not included in the datastore stats. This
 can be very high for entities with many properties, or with long entity and
 property names or entity keys. Do you have reason to suppose that's not the
 case in your situation?

 -Nick Johnson





 On Sun, Mar 21, 2010 at 3:39 AM, homunq jameson.qu...@gmail.com wrote:
  Something is wrong. My app is showing with 7.42GB of total stored
  data, but only 615 MB of datastore. There is only one version string
  uploaded, which is almost 150MB, and nothing in the blobstore. This
  discrepancy has been getting worse - several hours ago (longer than
  the period since datastore statistics were updated, if you're
  wondering), there were the same 615 MB in the datastore, and only
  3.09GB of total stored data. (at that time, my theory was that it
  was old uploads of tweaks to the same version - but the numbers have
  gone far, far beyond that explanation now.) It's not some exploding
  index; the only non-default index I have is on an entity type with
  just 33 entities.

  Here's the line from my dashboard:
  Total Stored Data ($0.005/GByte-day): 82% used, 7.42 of 9.00 GBytes ($0.04 / $0.04)

  And here is the word from my datastore statistics:
  Last updated: 1:32:13 ago | Total number of entities: 232,867 | Size of all entities: 615 MBytes
  (metadata 11%, if that matters)

  Please, can someone help me figure out this issue? I'd be happy to
  share any info or code which would help track this down. My app id is
  vulahealth.


 --
 Nick Johnson, Developer Programs Engineer, App Engine
 Google Ireland Ltd. :: Registered in Dublin, Ireland, Registration Number:
 368047




[google-appengine] ~7 GB of ghost data???

2010-03-21 Thread homunq
Something is wrong. My app is showing 7.42 GB of total stored
data, but only 615 MB in the datastore. There is only one version
uploaded, which is almost 150 MB, and nothing in the blobstore. This
discrepancy has been getting worse - several hours ago (longer than
the period since datastore statistics were updated, if you're
wondering), there were the same 615 MB in the datastore, and only
3.09 GB of total stored data. (At that time, my theory was that it
was old uploads of tweaks to the same version - but the numbers have
gone far, far beyond that explanation now.) It's not some exploding
index; the only non-default index I have is on an entity type with
just 33 entities.

Here's the line from my dashboard:
Total Stored Data ($0.005/GByte-day): 82% used, 7.42 of 9.00 GBytes ($0.04 / $0.04)

And here is the word from my datastore statistics:
Last updated: 1:32:13 ago | Total number of entities: 232,867 | Size of all entities: 615 MBytes
(metadata 11%, if that matters)

Please, can someone help me figure out this issue? I'd be happy to
share any info or code which would help track this down. My app id is
vulahealth.




[google-appengine] Re: Will GAE ever be open source?

2010-03-21 Thread homunq
There are several open-source efforts to get a drop-in replacement.
[1] It is likely that this will happen before Google open-sources the
code. Still, it would be great to get a commitment from Google that,
at some specific future date (say, in 7 years) they'll open source
their current code. (That is, code that's 7 years old by then -
without the intervening improvements. The open source community could
take it from there.)

But remember that they make heavy internal use of Bigtable and other
bits of code, far beyond App Engine. App Engine is a relatively
peripheral business for them, and probably does not have the clout to
get all that up-to-date code open-sourced in the foreseeable future.
That's why, personally, I'd be happy to settle for a commitment to
open-source old code later.

[1] 
http://blog.notdot.net/2009/04/Announcing-BDBDatastore-a-replacement-datastore-for-App-Engine
is not the freshest blog post, but it covers a number of the efforts.

On Mar 21, 4:28 pm, Josh Rehman j...@joshrehman.com wrote:
 It would be great to run my own app engine, both for development, and
 for production. Writing apps for a proprietary platform like GAE ties
 you to the platform, as I'm sure Google is aware. So, is there any
 chance the App Engine will be open-sourced such that it can be run on
 non-Google hardware?




[google-appengine] HELP! 7 GB of ghost data - billable but not in datastore, blobs, or versions.

2010-03-21 Thread homunq
My app is showing only 739 MB of datastore data, only 1 version
(150MB), no blobs, a tiny amount of index and memcache use (2K in
each case) - and yet 7.7GB of billable data! What the heck is going
on? How can I fix it?

Last updated: 3:27:42 ago | Total number of entities: 279,689 | Size of all entities: 739 MBytes

Total Stored Data ($0.005/GByte-day): 86% used, 7.70 of 9.00 GBytes ($0.04 / $0.04)

This situation has persisted for well over 24 hours now; it's not just
a figment of the update period. Also, there are negligible non-default
indexes, and probably a total of around 2 GB if you count all my
uploaded code versions ever (but since they all had the same version
string, they should have overwritten each other). I probably have a fair amount
of data in the logs - I have a number of cron jobs, collectively they
run just under twice a minute - but supposedly logs are not billable
data.

This is an app I'm developing for a client. Until this issue is fixed,
I'm certainly not going to bill for my work, which is otherwise done.
So I'm anxious to fix this ASAP. If there's somewhere else I should/
could be taking this question (aside from IRC, where I've brought it
up twice), I'd be happy to learn it. If there's any further info which
could help resolve this, I'd be happy to share it, too. (My app id is
vulahealth)

Thanks,
Jameson

(Second post, I think the first post was moderated into oblivion - it
was my first post on this list.)




[google-appengine] Re: HELP! 7 GB of ghost data - billable but not in datastore, blobs, or versions.

2010-03-21 Thread homunq
I know I probably shouldn't reply to myself, but this issue seems to
be a bit more serious than your average newbie question. This is
something approaching a 1000% inflation from my raw data to my
billable data (actually, the CSV data, without redundant field names,
is even smaller, so it would be more like 2000%). Also, I've tried
to google this, and I can't find any reference to similar problems.

Sure, in the end it's probably partly my fault. For instance, my cron
jobs might be considered overactive. But I still deserve some
documentation that would help me understand where the heck my (boss's)
money is (and will be?) going. At the moment, unless my boss wants to
pay for this ghost data forever, I can only advise him to start fresh
with a new app id... and with no guarantee the problem won't repeat.
That is not going to make me popular, and I'd really like to avoid it.

On Mar 21, 5:19 pm, homunq jameson.qu...@gmail.com wrote:
 My app is showing only 739 MB of datastore data, only 1 version
 (150MB), no blobs, a tiny amount of index and memcache use (2K in
 each case) - and yet 7.7GB of billable data! What the heck is going
 on? How can I fix it?

 Last updated: 3:27:42 ago | Total number of entities: 279,689 | Size of all entities: 739 MBytes

 Total Stored Data ($0.005/GByte-day): 86% used, 7.70 of 9.00 GBytes ($0.04 / $0.04)

 This situation has persisted for well over 24 hours now, it's not just
 a figment of the update period. Also, there's negligible non-default
 indexes, and probably a total of around 2 GB if you count all my
 uploaded code versions ever (but since they all had the same version
 string, they should have overwritten). I probably have a fair amount
 of data in the logs - I have a number of cron jobs, collectively they
 run just under twice a minute - but supposedly logs are not billable
 data.

 This is an app I'm developing for a client. Until this issue is fixed,
 I'm certainly not going to bill for my work, which is otherwise done.
 So I'm anxious to fix this ASAP. If there's somewhere else I should/
 could be taking this question (aside from IRC, where I've brought it
 up twice), I'd be happy to learn it. If there's any further info which
 could help resolve this, I'd be happy to share it, too. (My app id is
 vulahealth)

 Thanks,
 Jameson

 (Second post, I think the first post was moderated into oblivion - it
 was my first post on this list.)
