[google-appengine] Re: having trouble with uploading app
My partner and I are both getting this 503-unexpected problem, repeatedly, on different platforms, with 1.3.8. (We're in different countries, too.) We've done nothing with error handlers, and it was working for both of us as of yesterday. -- You received this message because you are subscribed to the Google Groups Google App Engine group. To post to this group, send email to google-appeng...@googlegroups.com. To unsubscribe from this group, send email to google-appengine+unsubscr...@googlegroups.com. For more options, visit this group at http://groups.google.com/group/google-appengine?hl=en.
[google-appengine] Re: ~7 GB of ghost data???
On Mar 22, 3:48 pm, Nick Johnson (Google) nick.john...@google.com wrote On Mon, Mar 22, 2010 at 8:45 PM, homunq jameson.qu...@gmail.com wrote: OK, after hashing it out on IRC, I see that I have to erase my data and start again. Why is that? Wouldn't updating the data be a better option? Because everything about it is wrong for saving space - the key names, the field names, the indexes, and even in one case the fact of breaking a string out into a list. (something I did for better searching in several cases, one of which is not worth it now I realize that 10X is easy to hit.) And because the data import runs smoothly, and I have code for that already. Watching my deletion process start to get trapped in molasses, as Eli Jones mentions above, I have to ask two things again: 1. Is there ANY ANY way to delete all indexes on a given property name? Without worrying about keeping indexes in order when I'm just paring them down to 0, I'd just be running through key names and deleting them. It seems that would be much faster. (If it's any help, I strongly suspect that most of my key names are globally unique across all of Google). 2. What is the reason for the slowdown? If I understand his suggestion to delete every 10th record, Eli Jones seems to suspect that it's because there's some kind of resource conflict on specific sections of storage, thus the solution is to attempt to spread your load across machines. I don't see why that would cause a gradual slowdown. My best theory is that write-then-delete leaves the index somehow a little messier (for instance, maybe the index doesn't fully recover the unused space because it expects you to fill it again) and that when you do it on a massive scale you get massively messy and slow indexes. Thus, again, I suspect this question reduces to question 1, although I guess that if my theory is right a compress/garbage-collect/degunking call for the indexes would be (for me) second best after a way to nuke them. 
[google-appengine] Re: ~7 GB of ghost data???
Watching my deletion process start to get trapped in molasses, as Eli Jones mentions above, I have to ask two things again: 1. Is there ANY ANY way to delete all indexes on a given property name? Without worrying about keeping indexes in order when I'm just paring them down to 0, I'd just be running through key names and deleting them. It seems that would be much faster. (If it's any help, I strongly suspect that most of my key names are globally unique across all of Google).

No - that would violate the constraint that indexes are always kept in sync with the data they refer to.

It seems to me that having no index at all is the same situation as if the property was indexed=False from the beginning. If that's so, it can't be violating a hard constraint.

2. What is the reason for the slowdown? If I understand his suggestion to delete every 10th record, Eli Jones seems to suspect that it's because there's some kind of resource conflict on specific sections of storage, thus the solution is to attempt to spread your load across machines. I don't see why that would cause a gradual slowdown. My best theory is that write-then-delete leaves the index somehow a little messier (for instance, maybe the index doesn't fully recover the unused space because it expects you to fill it again) and that when you do it on a massive scale you get massively messy and slow indexes. Thus, again, I suspect this question reduces to question 1, although I guess that if my theory is right a compress/garbage-collect/degunking call for the indexes would be (for me) second best after a way to nuke them.

Deletes using the naive approach slow down because when a record is deleted, Bigtable simply inserts a 'tombstone' record indicating the original record is deleted - the record isn't actually removed entirely from the datastore until the tablet it's on does its next compaction cycle. Until then, every subsequent query has to skip over the tombstone records to find the live records. This is easy to avoid: Use cursors to delete records sequentially. That way, your queries won't be skipping the same tombstoned records over and over again - O(n) instead of O(n^2)!

Thanks for explaining. Can you say anything about how often the compaction cycles are? Just an order of magnitude - hours, days, or weeks?

Thanks,
Jameson
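To make the O(n) versus O(n^2) point above concrete, here is a small toy model in plain Python. It is only a sketch: it does not use the App Engine or Bigtable APIs, and the `TombstoneTable` class, the `fetch` method, and the integer "cursor" are all hypothetical stand-ins invented for illustration. It simply counts how many rows each strategy has to step over when deleted rows linger as tombstones until a compaction that never happens during the run.

```python
class TombstoneTable:
    """Toy model of a sorted table where a delete only marks a tombstone."""

    def __init__(self, n):
        self.deleted = [False] * n   # one flag per row, in key order
        self.rows_scanned = 0        # rows touched by all queries so far

    def fetch(self, limit, start=0):
        """Return (positions of up to `limit` live rows, position to resume at)."""
        found = []
        pos = start
        while pos < len(self.deleted) and len(found) < limit:
            self.rows_scanned += 1           # tombstones cost a step too
            if not self.deleted[pos]:
                found.append(pos)
            pos += 1
        return found, pos

    def delete(self, positions):
        for p in positions:
            self.deleted[p] = True           # tombstone, not a real removal


def delete_all_naive(table, batch):
    """Re-run the same query from the start for every batch."""
    while True:
        rows, _ = table.fetch(batch)
        if not rows:
            return table.rows_scanned
        table.delete(rows)


def delete_all_with_cursor(table, batch):
    """Resume each query where the previous one stopped."""
    cursor = 0
    while True:
        rows, cursor = table.fetch(batch, start=cursor)
        if not rows:
            return table.rows_scanned
        table.delete(rows)


print(delete_all_naive(TombstoneTable(1000), 100))        # 6500
print(delete_all_with_cursor(TombstoneTable(1000), 100))  # 1000
```

In this model the naive loop re-reads every tombstone it created earlier, so the work grows with the square of the row count, while the cursor loop touches each row once. How closely the real datastore follows this picture depends on compaction timing, which the thread leaves open.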
[google-appengine] Re: HELP! 7 GB of ghost data - billable but not in datastore, blobs, or versions.
Thanks for responding. I've starred the bug - it seems basic. More answers below.

On Mar 21, 7:06 pm, Eli Jones eli.jo...@gmail.com wrote:

Depending on how your Models are defined (how many properties and what the property types are there) and depending on custom indexes, 10 to 1 is reasonable.

Even if I have essentially no custom indexes? (Only one ever, on a few dozen entities with two integers each which handle cron housekeeping).

For my app, I aggressively set most properties to Indexed=false and I have no custom indexes. I'm still at 2 to 1 for Datastore usage versus Size of All Entities.

When you say datastore usage, you mean total data, right? Or is there some other place this number is reported which I've missed?

To fully explore what is going on, you'd need to post what your Model definitions are.. and then post what custom indexes are defined on any of the properties.

I have no custom indexes. I have only two entities. 99.9% of the data is in the Expando which has about 45 string properties (initially all indexed, now around 30 are; many frequently empty string); 4 integer properties; 4 dates; 4 lists (about 4-8 members typically); from 3-18 expando string lists (of 20 types total; an average of 6 lists with 2 members each); and 2 expando strings (mostly but not always present).

If you've been creating and then later on deleting custom indexes or changing your Models around.. running vacuum indexes might help (but that's just a shot in the dark).

I have done nothing with custom indexes, except the one minuscule housekeeping one on a few entities. For the main entity kind, is there any way to delete a default index without re-touching all the entities one-by-one?

On Sun, Mar 21, 2010 at 7:19 PM, homunq jameson.qu...@gmail.com wrote:

My app is showing only 739 MB of datastore data, only 1 version (150MB), no blobs, a tiny amount of index and memcache use (2K in each case) - and yet 7.7GB of billable data! What the heck is going on? How can I fix it?

Last updated    Total number of entities    Size of all entities
3:27:42 ago     279,689                     739 MBytes

Total Stored Data    $0.005/GByte-day    86%    7.70 of 9.00 GBytes    $0.04 / $0.04

This situation has persisted for well over 24 hours now, it's not just a figment of the update period. Also, there's negligible non-default indexes, and probably a total of around 2 GB if you count all my uploaded code versions ever (but since they all had the same version string, they should have overwritten). I probably have a fair amount of data in the logs - I have a number of cron jobs, collectively they run just under twice a minute - but supposedly logs are not billable data.

This is an app I'm developing for a client. Until this issue is fixed, I'm certainly not going to bill for my work, which is otherwise done. So I'm anxious to fix this ASAP. If there's somewhere else I should/could be taking this question (aside from IRC, where I've brought it up twice), I'd be happy to learn it. If there's any further info which could help resolve this, I'd be happy to share it, too. (My app id is vulahealth)

Thanks,
Jameson

(Second post, I think the first post was moderated into oblivion - it was my first post on this list.)
[google-appengine] Re: ~7 GB of ghost data???
OK, I guess I'm guilty on all counts. Clearly, I can fix that moving forward, though it will cost me a lot of CPU to fix the data I've already entered. But as a short-term stopgap, is there any way to delete entire default indexes for a given property? (I mean, anything besides setting indexed=False and then touching each entity one-by-one). You can vacuum custom indexes - can you do it with indexes created by default? Thanks, Jameson On 22 mar, 03:42, Nick Johnson (Google) nick.john...@google.com wrote: Hi, The discrepancy between datastore stats volume and stored data is generally due to indexing overhead, which is not included in the datastore stats. This can be very high for entities with many properties, or with long entity and property names or entity keys. Do you have reason to suppose that's not the case in your situation? -Nick Johnson On Sun, Mar 21, 2010 at 3:39 AM, homunq jameson.qu...@gmail.com wrote: Something is wrong. My app is showing with 7.42GB of total stored data, but only 615 MB of datastore. There is only one version string uploaded, which is almost 150MB, and nothing in the blobstore. This discrepancy has been getting worse - several hours ago (longer than the period since datastore statistics were updated, if you're wondering), there were the same 615 MB in the datastore, and only 3.09GB of total stored data. (at that time, my theory was that it was old uploads of tweaks to the same version - but the numbers have gone far, far beyond that explanation now.) It's not some exploding index; the only non-default index I have is on an entity type with just 33 entities. Here's the line from my dashboard: Total Stored Data $0.005/GByte-day 82% 7.42 of 9.00 GBytes $0.04 / $0.04 And here is the word from my datastore statistics: Last updated Total number of entities Size of all entities 1:32:13 ago 232,867 615 MBytes (metadata 11%, if that matters) Please, can someone help me figure out this issue? 
I'd be happy to share any info or code which would help track this down. My app id is vulahealth.

--
Nick Johnson, Developer Programs Engineer, App Engine
Google Ireland Ltd. :: Registered in Dublin, Ireland, Registration Number: 368047
[google-appengine] Re: HELP! 7 GB of ghost data - billable but not in datastore, blobs, or versions.
Sorry, this thread is a duplicate of that other one (http://groups.google.com/group/google-appengine/browse_thread/thread/2a116e341b97c6fd/6ad663cd210032b2?lnk=gst). I got impatient and reposted before the other one got moderated. Useful word from Google along with my response is over there - basically, I ask, is there any way to bulk delete default indexes.

On 22 Mar, 05:44, Wooble geoffsp...@gmail.com wrote:

On Mar 22, 4:20 am, homunq jameson.qu...@gmail.com wrote:

I have no custom indexes. I have only two entities. 99.9% of the data is in the Expando which has about 45 string properties (initially all indexed, now around 30 are; many frequently empty string); 4 integer properties; 4 dates; 4 lists (about 4-8 members typically); from 3-18 expando string lists (of 20 types total; an average of 6 lists with 2 members each); and 2 expando strings (mostly but not always present).

For each indexed property, there will be 2 index entries (one in the forward index and one in the reverse index), each of which contains the actual data, the name of the property, the name of your application, and, if there's a parent entity, information about the parent will be in the key as well. This quickly adds up.
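The index accounting described in this message can be turned into a rough back-of-envelope estimate. The sketch below is plain Python 3 and uses only figures quoted in the thread; the assumptions are mine and are marked in the comments: every member of a list gets its own pair of index entries, the 4-8 member lists average 6, empty strings are still indexed, 1 GByte is 1024 MBytes, and everything beyond the reported entity size is index data.

```python
# Figures quoted in the thread's dashboard and statistics output.
ENTITIES = 279_689      # "Total number of entities"
ENTITY_MB = 739         # "Size of all entities"
STORED_GB = 7.70        # "Total Stored Data"

# Indexed values per entity, from the model description in the thread.
indexed_values = {
    "string properties (about 30 still indexed)": 30,
    "integer properties": 4,
    "date properties": 4,
    "lists (4 lists x 6 members, assumed midpoint of 4-8)": 4 * 6,
    "expando string lists (average 6 lists x 2 members)": 6 * 2,
    "expando strings": 2,
}

ENTRIES_PER_VALUE = 2   # forward + reverse index entry, per the reply above

values_per_entity = sum(indexed_values.values())
rows_per_entity = values_per_entity * ENTRIES_PER_VALUE
total_rows = rows_per_entity * ENTITIES

# Assumption: all storage beyond the entities themselves is index rows.
overhead_bytes = (STORED_GB * 1024 - ENTITY_MB) * 1024 * 1024
bytes_per_row = overhead_bytes / total_rows

print(rows_per_entity)        # 152
print(total_rows)             # 42512728
print(round(bytes_per_row))   # 176
```

Under those assumptions the gap works out to well under 200 bytes per index entry, which is not an implausible size for a row that repeats the application name, kind, property name, value, and key. This is only an order-of-magnitude check, not a statement of how the datastore actually bills.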
[google-appengine] Re: ~7 GB of ghost data???
OK, after hashing it out on IRC, I see that I have to erase my data and start again. Since it took me 3 days of CPU quota to add the data, I want to know if I can erase it quickly.

1. Is the overhead for erasing data (and thus whittling down indexes) over half the overhead from adding it? Under 10%? Or what? (I don't need exact numbers, just approximations.)

2. If it's more like half - is there some way to just nuke all my data and start over?

Thanks,
Jameson

On 22 Mar, 03:42, Nick Johnson (Google) nick.john...@google.com wrote:

Hi, The discrepancy between datastore stats volume and stored data is generally due to indexing overhead, which is not included in the datastore stats. This can be very high for entities with many properties, or with long entity and property names or entity keys. Do you have reason to suppose that's not the case in your situation? -Nick Johnson

On Sun, Mar 21, 2010 at 3:39 AM, homunq jameson.qu...@gmail.com wrote:

Something is wrong. My app is showing with 7.42GB of total stored data, but only 615 MB of datastore. There is only one version string uploaded, which is almost 150MB, and nothing in the blobstore. This discrepancy has been getting worse - several hours ago (longer than the period since datastore statistics were updated, if you're wondering), there were the same 615 MB in the datastore, and only 3.09GB of total stored data. (at that time, my theory was that it was old uploads of tweaks to the same version - but the numbers have gone far, far beyond that explanation now.) It's not some exploding index; the only non-default index I have is on an entity type with just 33 entities.

Here's the line from my dashboard:

Total Stored Data    $0.005/GByte-day    82%    7.42 of 9.00 GBytes    $0.04 / $0.04

And here is the word from my datastore statistics:

Last updated    Total number of entities    Size of all entities
1:32:13 ago     232,867                     615 MBytes

(metadata 11%, if that matters)

Please, can someone help me figure out this issue? I'd be happy to share any info or code which would help track this down. My app id is vulahealth.
[google-appengine] ~7 GB of ghost data???
Something is wrong. My app is showing with 7.42GB of total stored data, but only 615 MB of datastore. There is only one version string uploaded, which is almost 150MB, and nothing in the blobstore. This discrepancy has been getting worse - several hours ago (longer than the period since datastore statistics were updated, if you're wondering), there were the same 615 MB in the datastore, and only 3.09GB of total stored data. (at that time, my theory was that it was old uploads of tweaks to the same version - but the numbers have gone far, far beyond that explanation now.) It's not some exploding index; the only non-default index I have is on an entity type with just 33 entities.

Here's the line from my dashboard:

Total Stored Data    $0.005/GByte-day    82%    7.42 of 9.00 GBytes    $0.04 / $0.04

And here is the word from my datastore statistics:

Last updated    Total number of entities    Size of all entities
1:32:13 ago     232,867                     615 MBytes

(metadata 11%, if that matters)

Please, can someone help me figure out this issue? I'd be happy to share any info or code which would help track this down. My app id is vulahealth.
[google-appengine] Re: Will GAE ever be open source?
There are several open-source efforts to get a drop-in replacement. [1] It is likely that this will happen before Google open-sources the code.

Still, it would be great to get a commitment from Google that, at some specific future date (say, in 7 years) they'll open source their current code. (That is, code that's 7 years old by then - without the intervening improvements. The open source community could take it from there.)

But remember that they make heavy internal use of Bigtable and other bits of code, far beyond App Engine. App Engine is a relatively peripheral business for them, and probably does not have the clout to get all that up-to-date code open-sourced in the foreseeable future. That's why personally I'd be happy to settle for a commitment to open-source old code later.

[1] http://blog.notdot.net/2009/04/Announcing-BDBDatastore-a-replacement-datastore-for-App-Engine is not the freshest blog post, but it covers a number of the efforts.

On Mar 21, 4:28 pm, Josh Rehman j...@joshrehman.com wrote:

It would be great to run my own app engine, both for development, and for production. Writing apps for a proprietary platform like GAE ties you to the platform, as I'm sure Google is aware. So, is there any chance the App Engine will be open-sourced such that it can be run on non-Google hardware?
[google-appengine] HELP! 7 GB of ghost data - billable but not in datastore, blobs, or versions.
My app is showing only 739 MB of datastore data, only 1 version (150MB), no blobs, a tiny amount of index and memcache use (2K in each case) - and yet 7.7GB of billable data! What the heck is going on? How can I fix it?

Last updated    Total number of entities    Size of all entities
3:27:42 ago     279,689                     739 MBytes

Total Stored Data    $0.005/GByte-day    86%    7.70 of 9.00 GBytes    $0.04 / $0.04

This situation has persisted for well over 24 hours now, it's not just a figment of the update period. Also, there's negligible non-default indexes, and probably a total of around 2 GB if you count all my uploaded code versions ever (but since they all had the same version string, they should have overwritten). I probably have a fair amount of data in the logs - I have a number of cron jobs, collectively they run just under twice a minute - but supposedly logs are not billable data.

This is an app I'm developing for a client. Until this issue is fixed, I'm certainly not going to bill for my work, which is otherwise done. So I'm anxious to fix this ASAP. If there's somewhere else I should/could be taking this question (aside from IRC, where I've brought it up twice), I'd be happy to learn it. If there's any further info which could help resolve this, I'd be happy to share it, too. (My app id is vulahealth)

Thanks,
Jameson

(Second post, I think the first post was moderated into oblivion - it was my first post on this list.)
[google-appengine] Re: HELP! 7 GB of ghost data - billable but not in datastore, blobs, or versions.
I know I probably shouldn't reply to myself, but this issue seems to be a bit more serious than your average newbie question. This is something approaching a 1000% inflation from my raw data to my billable data (actually, the csv data, without redundant field names, is even smaller, so it would be more like 2000%). Also, I've tried to google this, and I can't find any reference to similar problems. Sure, in the end it's probably partly my fault. For instance, my cron jobs might be considered overactive. But I still deserve some documentation that would help me understand where the heck my (boss's) money is (and will be?) going. At the moment, unless my boss wants to pay for this ghost data forever, I can only advise him to start fresh with a new app id... and with no guarantee the problem won't repeat. That is not going to make me popular, and I'd really like to avoid it. On Mar 21, 5:19 pm, homunq jameson.qu...@gmail.com wrote: My app is showing only 739 MB of datastore data, only 1 version (150MB), no blobs, a tiny amount of index and memcache use (2K in each case) - and yet 7.7GB of billable data! What the heck is going on? How can I fix it? Last updated Total number of entities Size of all entities 3:27:42 ago 279,689 739 MBytes Total Stored Data $0.005/GByte-day 86% 7.70 of 9.00 GBytes $0.04 / $0.04 This situation has persisted for well over 24 hours now, it's not just a figment of the update period. Also, there's negligible non-default indexes, and probably a total of around 2 GB if you count all my uploaded code versions ever (but since they all had the same version string, they should have overwritten). I probably have a fair amount of data in the logs - I have a number of cron jobs, collectively they run just under twice a minute - but supposedly logs are not billable data. This is an app I'm developing for a client. Until this issue is fixed, I'm certainly not going to bill for my work, which is otherwise done. So I'm anxious to fix this ASAP. 
If there's somewhere else I should/could be taking this question (aside from IRC, where I've brought it up twice), I'd be happy to learn it. If there's any further info which could help resolve this, I'd be happy to share it, too. (My app id is vulahealth) Thanks, Jameson (Second post, I think the first post was moderated into oblivion - it was my first post on this list.)
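For reference, the "approaching 1000% inflation" figure in this message can be checked directly from the two dashboard numbers it quotes. This is a minimal sketch in plain Python 3; the helper name `inflation` is made up for illustration, and it assumes 1 GByte = 1024 MBytes (with 1000 the result is slightly lower).

```python
MB_PER_GB = 1024  # assumption; the dashboard does not say which unit it uses


def inflation(stored_gb, entities_mb):
    """Return (ratio, percent increase) of billable storage over entity size."""
    ratio = stored_gb * MB_PER_GB / entities_mb
    return ratio, (ratio - 1) * 100


# 7.70 GBytes billable versus 739 MBytes of entities, as quoted above.
ratio, pct = inflation(7.70, 739)
print(f"{ratio:.1f}x, +{pct:.0f}%")   # 10.7x, +967%
```

So the billable total is a little under eleven times the reported entity size, which matches the rough 1000% figure given in the message.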