Earlier this week, the App Engine team held its most recent "Chat Time" IRC office hour session. The next session will take place July 1st from 7:00-8:00 p.m. (PDT) in the #appengine channel on irc.freenode.net. A summary and a transcript of Wednesday's session follow:
Summary:
- Because sessions are backed by the datastore, session details may not be updated properly if the update request fails. In general, up to 0.2% of datastore updates will result in a DatastoreTimeoutException, and because the session code writes to the datastore before memcache, the session data will not be updated in memcache either. If you depend on sessions for maintaining state, you should catch these occasional DatastoreTimeoutExceptions in your code and retry if necessary. [9:02, 9:06, 9:09-9:15]
- Q: Is it best to go with one application per tenant or design a multi-tenant app? A: At the moment, multi-tenant support is lacking; in Python, you can define pre-call and post-call hooks that add and check a per-tenant field on each entity, but access control is not built in, which complicates the issue of per-tenant data security. Until better multi-tenant support is available, we are considering case-by-case exceptions to the terms of service to allow for per-tenant applications. To file such a request, please complete the form at http://code.google.com/support/bin/request.py?contact_type=AppEngineMultiInstanceExceptionRequest. [9:07-9:08, 9:14-9:15, 9:17-9:19]
- Currently, support for wildcard domains and Comet is not on the roadmap, but you can file a request for these in the public issue tracker if you're interested. [9:20-9:21]
- The task queue API is coming soon. "We're whipping our programmers as hard as we can! ;)" [9:23]
- Q: Can anyone provide pointers on how to compute aggregates, e.g. an average of values, when the number of datastore entities is greater than 1,000? A: Calculate aggregates progressively -- instead of generating the data on demand, update another entity representing the aggregated value whenever new entities are added, so it's always current. If you already have a lot of data, you can run various cron jobs (or use the forthcoming task queue API) to calculate the aggregate. [9:28-9:29, 9:34]
- An early version of the data export tool is available for testing with the Python SDK 1.2.2 -- see http://code.google.com/appengine/docs/python/tools/uploadingdata.html#Downloading_Data_from_App_Engine. A similar tool for the Java SDK is forthcoming. [9:36]
- Using entity groups improves read performance since the physical bytes are stored near each other on disk. The issue with entity groups is write performance: only a single entity in a group can be updated at one time, so if you sustain more than about 1 update per second (1 QPS) to a single group, you could begin to experience contention. Try to keep your entity groups small for this reason. [9:37-9:39]
- There is no punch or pie, much less transatlantic pie. :( [9:41]
- Use indexed=False to prevent an index from being generated for a given property. If you don't need to query on a given property, this can save you storage costs and write time, since fewer indexes need to be updated. [9:41-9:42]
- The more properties an entity has, the more put latency you'll see, although latency isn't linear in the number of properties. CPU cost is, however. [10:00]
- Stop by the EuroPython conference in the U.K. and say "Hi" to Nick. [9:52-9:53]
- Online JavaDoc will be updated soon to account for missing methods like setUnindexedProperty -- for now, use the JavaDoc packaged with the SDK. [9:54-9:55]
Transcript:
[Wed Jun 17 2009] [09:00:59] <scudder_google> Hi all, I'd like to begin our hour long chat session. So far from Google we have dan_google, nickjohnson, jason_google, and myself, others may join as we go. [Wed Jun 17 2009] [09:01:11] <scudder_google> The floor is open, ask your questions :-) [Wed Jun 17 2009] [09:01:32] <dan_google> You could access it directly with the user API. If you really want to pass it in, such as for unit testing, you could use middleware to add it to the request.
[Wed Jun 17 2009] [09:01:36] <dfabulich> I have a question apropos http://code.google.com/p/googleappengine/issues/detail?id=1692 [Wed Jun 17 2009] [09:02:01] <dfabulich> Does the Java SaveSessionFilter have any retry handling around writing to the datastore? [Wed Jun 17 2009] [09:02:42] <luddep> The performance improvements scheduled for june 22nd, is there going to be any more info released regarding that? [Wed Jun 17 2009] [09:02:55] <Jason_Google> dfabulich: In general, all datastore writes are retried several times in the case of failure. [Wed Jun 17 2009] [09:03:21] <lent> we're are implementing an enterprise saas product, with potentially thousands of tenants, we're trying to decide between one application per tenant and a multi-tenant approach, any thoughts on this? [Wed Jun 17 2009] [09:03:24] <Jason_Google> dfabulich: Are you seeing this error consistently or was it transient? [Wed Jun 17 2009] [09:03:32] <jaysern> thanks Dan. As a more basic question, how do I peek into that 'request' object and see what else is inside? e.g. what's the equivalent of doing a dir() on that request object via command line? [Wed Jun 17 2009] [09:03:42] <dfabulich> Jason_Google: I get this error in my log dozens of times every day. It's happened 8 times today so far. [Wed Jun 17 2009] [09:04:00] <nickjohnson> jaysern: Do a dir() and log it or return it to the user? :) [Wed Jun 17 2009] [09:04:02] <scudder_google> ah jcgregorio is here from Google too [Wed Jun 17 2009] [09:04:19] * jcgregorio waves [Wed Jun 17 2009] [09:04:21] <dfabulich> Jason_Google: I assume everyone who uses sessions will get this problem if they fall under meaningful load. [Wed Jun 17 2009] [09:04:36] <jaysern> lol .. thanks. I'm new to Python, Django, and App Engine . .trying to learn all at the same time [Wed Jun 17 2009] [09:04:52] <nickjohnson> lent: I would suggest multi-tenant.
It'll be easier to administer and upgrade, and avoids any issues about splitting your traffic across multiple apps [Wed Jun 17 2009] [09:05:18] <nickjohnson> dfabulich: Timeout errors are not dependent on load. [Wed Jun 17 2009] [09:05:32] <nickjohnson> Or rather, the rate of them is not. :) [Wed Jun 17 2009] [09:05:41] Nick joshthecoder_afk is now known as joshthecoder. [Wed Jun 17 2009] [09:06:00] <Jason_Google> dfabulich: In general, somewhere between 0.1 to 0.2% of datastore writes will fail. But sessions are backed by memcache too, so in theory the data should continue to be persisted. [Wed Jun 17 2009] [09:06:01] <dfabulich> nickjohnson: I just meant that it happens sporadically so if you aren't handling too many requests you'll be less likely to see the error. [Wed Jun 17 2009] [09:06:13] <Jason_Google> dfabulich: How often are you writing to the session? [Wed Jun 17 2009] [09:06:25] <dfabulich> Jason_Google: Can you confirm what actually happens to the session data when SSF gets a DTE? Is the work I did during the current request just lost in that case? Or does it get persisted to memcache first? [Wed Jun 17 2009] [09:06:33] <dfabulich> Jason_Google: I write to the session on every request. [Wed Jun 17 2009] [09:07:12] <dfabulich> For those who are interested, and as a shameless plug, my app is at http://www.playalterego.com/ [Wed Jun 17 2009] [09:07:16] <lent> one thing about multi-tenant is that we're concerned about security. in an rdbms we could use views and access control to guarantee that there is no unfiltered access. is there any way to do something similar in GAE? [Wed Jun 17 2009] [09:07:33] <dfabulich> It's a game the user plays in a conversational session with the server. 
[Wed Jun 17 2009] [09:08:30] <nickjohnson> lent: In Python, you can define pre-call and post-call hooks that add and check a per-tenant field on each entity [Wed Jun 17 2009] [09:08:50] <nickjohnson> There's no pre-packaged solution available for this yet, but all the necessary parts are there. [Wed Jun 17 2009] [09:09:20] <dfabulich> More generally, I'm confused about the guidance around DatastoreTimeoutExceptions. I've heard that 0.1% of requests will get DTEs and to put retry logic in my application, but I've also heard remarks (like Jason_Google said earlier) that all datastore writes are automatically retried. What's the real deal here? [Wed Jun 17 2009] [09:09:36] <lent> Is there some equivalent in java? [Wed Jun 17 2009] [09:09:39] <pranny> dfabulich: your app is cool [Wed Jun 17 2009] [09:09:46] <dfabulich> pranny: :-) [Wed Jun 17 2009] [09:09:49] <scudder_google> dfabulich: I'm looking at the code for the App Engine session, and it attempts to write to the datastore first, then puts into memcache [Wed Jun 17 2009] [09:10:12] <scudder_google> dfabulich: so an exception on the datastore write will prevent memcache from being updated [Wed Jun 17 2009] [09:10:30] <dfabulich> scudder_google: :-( Does it do any retries on the application side? Or does it just choke on the first DTE? Best practice is to wrap this in a try-try-again loop, right? [Wed Jun 17 2009] [09:11:07] <nickjohnson> lent: I'm not familiar enough with the Java runtime to say, I'm afraid [Wed Jun 17 2009] [09:11:49] <scudder_google> dfabulich: in the backend datastore writes are retried a few times, if the retries fail, then an exception is raised for you to handle as you will [Wed Jun 17 2009] [09:12:31] <lent> nickjohnson: do you see any performance issue once there are million entities for a kind and every operation will have to filter by tenant id? [Wed Jun 17 2009] [09:12:47] <scudder_google> dfabulich: I'm wondering if your session writes could be experiencing contention. 
If you are writing to the session several times in one second, then the writes could back up and time out [Wed Jun 17 2009] [09:12:59] <nickjohnson> lent: No. Queries scale with number of results returned, not with number of results in the datastore. [Wed Jun 17 2009] [09:13:14] <nickjohnson> Note that every single entity for every single app is stored in a single Bigtable table, for example. :) [Wed Jun 17 2009] [09:13:31] <nickjohnson> This is what you get in return for having to tolerate occasional shortcomings of the non-relational design model. :) [Wed Jun 17 2009] [09:14:06] <chuck> Which languages (if any) are you guys planning on supporting in the future, as I'm sure you're not going to be supporting every single one requested in the issue tracker [Wed Jun 17 2009] [09:14:20] <dfabulich> scudder_google: That seems VERY unlikely to me. My app has no ajax at all; I basically never expect two requests to be in-flight simultaneously. [Wed Jun 17 2009] [09:14:46] <dfabulich> scudder_google: The user does one request at a time over and over again until death :-) [Wed Jun 17 2009] [09:14:55] <lent> nickjohnson: there's been mention that multi-tenant support will be added in the future. do you know any details about this and potentially when this would be released? [Wed Jun 17 2009] [09:15:41] <nickjohnson> lent: I don't know when it'll be released.
I'm fairly certain that you can expect Python support much earlier than anything for Java, though [Wed Jun 17 2009] [09:15:58] <scudder_google> dfabulich: I see, then it sounds like your best bet is to handle the exception, perhaps with a retry, or a write to memcache [Wed Jun 17 2009] [09:16:14] <nickjohnson> scudder_google: The exception is occurring in the session code, outside his control [Wed Jun 17 2009] [09:16:58] <dfabulich> I would happily rewrite the SaveSessionFilter from scratch to fix these problems, if I had access to its source code [Wed Jun 17 2009] [09:17:20] <lent> nickjohnson: putting aside the issue of managing thousands of applications, is a one application per tenant approach okay under GAE's usage agreements? I know that it says you can't split an applicaiton between application ids. [Wed Jun 17 2009] [09:17:36] <dfabulich> I've thought about just ditching Google sessions and re-implementing my own, but I'm optimistic Google will fix SSF at some point so I hesitate to do that work now [Wed Jun 17 2009] [09:17:44] <ericsk> hi, I'm using appengine-python sdk 1.2.2. I found that the admin console cannot process the non-ascii characters [Wed Jun 17 2009] [09:17:54] <nickjohnson> dfabulich: Have you filed a bug yet? :) [Wed Jun 17 2009] [09:18:16] <dfabulich> nickjohnson: I began this chat with the bug number! :-) http://code.google.com/p/googleappengine/issues/detail?id=1692 [Wed Jun 17 2009] [09:18:19] <chuck> Which languages are you guys planning on supporting in the future? :P [Wed Jun 17 2009] [09:18:21] <nickjohnson> lent: I'm afraid I can't say for certain. IANAL, of course. [Wed Jun 17 2009] [09:18:26] <ericsk> I've tried to add `encode ('utf-8')` somewhere in the google/appengine/ext/admin/ datastore_edit.py [Wed Jun 17 2009] [09:18:28] <nickjohnson> dfabulich: osrry, I'd forgotten. 
[Wed Jun 17 2009] [09:18:29] <Jason_Google> lent: We're building better support for multi-tenant apps, but in the meantime, it is OK if you let us know what you're doing ahead of time. We have a form for this, let me try to dig it up. [Wed Jun 17 2009] [09:18:49] <nickjohnson> chuck: We haven't announced plans for any additional languages at the moment. Which doesn't mean that there won't be any, just that we haven't announced any. [Wed Jun 17 2009] [09:19:04] <ericsk> however, it cannot process the utf-8 text in the ListType box [Wed Jun 17 2009] [09:19:24] <nickjohnson> Oops, ignore me, and pay attention to Jason_Google, lent ;) [Wed Jun 17 2009] [09:19:54] <Jason_Google> lent: Here you go: http://code.google.com/support/bin/request.py?contact_type=AppEngineMultiInstanceExceptionRequest [Wed Jun 17 2009] [09:20:23] <chuck> nickjohnson: Okay. I have a few other questions, 1. Is wildcard domain support a feature that you guys will be creating? 2. Will Google App Engine ever support Comet? 3. Will it be possible to delete applications if they were just tests or whatever to free up available slots? [Wed Jun 17 2009] [09:21:00] <lent> Jason_Google: thank you [Wed Jun 17 2009] [09:21:11] <dfabulich> Is it possible for you guys to just send me the source code for SSF? I'd cheerfully submit a patch. (and in the short-term I'd just use my patched version) I imagine it's like 20 lines long :-) [Wed Jun 17 2009] [09:21:29] <nickjohnson> chuck: 1. It's not on the roadmap, but it would certainly be nice to have. 2. Probably not in the forseeable future, 3. Not on the roadmap, but if you need more apps, ask in the group and we'll give you some. [Wed Jun 17 2009] [09:21:31] <lent> nickjohnson: thank you very much for your replies and your patience [Wed Jun 17 2009] [09:21:40] <nickjohnson> lent: Our pleasure [Wed Jun 17 2009] [09:21:56] <dfabulich> I'm not far from Mountain View and have been on Google campus; signed the NDA more times than I can count. How can I help here? 
[Wed Jun 17 2009] [09:22:00] <chuck> aww, okay. [Wed Jun 17 2009] [09:22:01] <nickjohnson> dfabulich: We're planning on releasing the source for the Java SDK as soon as we can. In the meantime, we do fix bugs ourselves. :) [Wed Jun 17 2009] [09:23:01] <picalolabu> hello everyone - [python] i was wondering if you have an update on the ETA for the task queue api? [Wed Jun 17 2009] [09:23:22] <scudder_google> picalolabu: soon :-) [Wed Jun 17 2009] [09:23:23] <luddep> picalolabu: i was just about to ask that, heh. Weeks / months? [Wed Jun 17 2009] [09:23:55] <nickjohnson> We're whipping our programmers as hard as we can! ;) [Wed Jun 17 2009] [09:24:00] Nick dan_google__ is now known as dan_google. [Wed Jun 17 2009] [09:24:13] <lent> [java] on a different topic, when will the service for supporting large files be released and is this going allow application to write to the file system through java.io calls? [Wed Jun 17 2009] [09:24:37] <picalolabu> no pb - thx [Wed Jun 17 2009] [09:24:59] <jdeibele> using Universal Feed Parser a = Article(key_name = mykey) a.content = e.content[0].value a.put() Seems to work fine. When I try to display the post from my app, I get 'ascii' codec can't decode byte 0xe2 in position 83: ordinal not in range(128) datastore viewer says 'ascii' codec can't encode character u'\xa0' in position 309: ordinal not in range(128) Type of a.content is <class 'google.appengine.api.datastore_types.Tex [Wed Jun 17 2009] [09:25:50] <Jason_Google> lent: When it's available, I don't believe the update will enable you to write to the file system, although you will be able to write larger entities to the datastore. [Wed Jun 17 2009] [09:25:52] <nickjohnson> jdeibele: You're trying to convert a Unicode string to a byte string somewhere without specifying an encoding. 
You need to either use Unicode everywhere, or call .encode ('utf8') on your string before using it [Wed Jun 17 2009] [09:25:56] <nickjohnson> The former would be preferable [Wed Jun 17 2009] [09:25:57] <ericsk> jdeibele: try `a.content.encode ('utf-8')` ? [Wed Jun 17 2009] [09:26:13] <dfabulich> nickjohnson: Even once the source for the Java SDK is released, SaveSessionFilter isn't part of the SDK (it's not shipped in the API jars). [Wed Jun 17 2009] [09:26:35] <nickjohnson> dfabulich: Really? Damn. My lack of knowledge of the Java SDK strikes again. [Wed Jun 17 2009] [09:27:29] <ribrdb> dfabulich: I missed your question. What problem are you having with SaveSessionFilter? [Wed Jun 17 2009] [09:28:05] <krisajenkins_> [java] Hi - I'm having trouble coming from a relational database background. Can anyone give me some pointers on how I can compute aggregates, for instance. I'd like to write an app that will record financial data, but I have no idea how I'll compute balances once the number of entries goes over 1000... :-} [Wed Jun 17 2009] [09:28:16] <scudder_google> ribrdb: datastore timeout raises an exception, and it is difficult to handle [Wed Jun 17 2009] [09:28:22] <luddep> [python] Is there any ETA on django 1.0 support, or are you holding out for 1.1 to get out of beta? [Wed Jun 17 2009] [09:28:31] <dieterk> still chat time? [Wed Jun 17 2009] [09:28:33] <nickjohnson> krisajenkins_: Calculate aggregates progressively, and store them against another entity in the same entity group [Wed Jun 17 2009] [09:28:36] <nickjohnson> dieterk: yes [Wed Jun 17 2009] [09:28:41] <dieterk> neat ;) [Wed Jun 17 2009] [09:28:45] <dfabulich> ribrdb: http://code.google.com/p/googleappengine/issues/detail?id=1692 [Wed Jun 17 2009] [09:29:19] <dfabulich> ribrdb: 1) It has no application-side retry-logic. 
2) It writes to the datastore before the memcache, so any DatestoreTimeoutException results in data loss [Wed Jun 17 2009] [09:29:40] <dfabulich> ribrdb: I'd love to just rewrite it, if I could. How long could it be? :-) [Wed Jun 17 2009] [09:29:45] <Jason_Google> krisajenkins_: Right, as Nick said. You would do all the calculation at write time so the data is available on-demand instead of having to iterate over all entities to compute it. [Wed Jun 17 2009] [09:30:06] <dieterk> Is there any plan or any timeframe to go forward with this? http://code.google.com/p/googleappengine/issues/detail?id=1378 [Wed Jun 17 2009] [09:30:17] <lent> Jason_Google: there are items in the Roadmap for January 2009 - June 2009, when are these likely to be released (such as task queues)? [Wed Jun 17 2009] [09:30:25] <luddep> nickjohnson: is there any performance improvement with using entity groups, or would that be for transaction goodness? [Wed Jun 17 2009] [09:30:31] <dfabulich> ribrdb: I see SSF getting DTEs dozens of times a day every day in my logs, and it kills me, because I know that every single one of them resulted in data loss... [Wed Jun 17 2009] [09:30:44] <chuck> What are a few of the issues in the issue tracker that you guys are working on implementing next? [Wed Jun 17 2009] [09:30:55] <nickjohnson> luddep: Purely for transaction semantics. [Wed Jun 17 2009] [09:31:13] <luddep> nickjohnson: ok! [Wed Jun 17 2009] [09:31:16] <dieterk> chuck: hopefully http://code.google.com/p/googleappengine/issues/detail?id=1378 ;) [Wed Jun 17 2009] [09:31:17] <Jason_Google> lent: We're planning at least one more release before the end of the month. Some of the features will likely slip a bit past the end of June, but the team is making good progress. [Wed Jun 17 2009] [09:31:47] <dieterk> Jason_Google: my issue in the next release? 
[Wed Jun 17 2009] [09:32:31] <chuck> This would be cool too: http://code.google.com/p/googleappengine/issues/detail?id=56&colspec=ID%20Type%20Status%20Priority%20Stars%20Owner%20Summary%20Log%20Component [Wed Jun 17 2009] [09:32:57] <lent> Jason_Google: that's good to hear. thanks to all you google guys for the great work. [Wed Jun 17 2009] [09:34:02] <krisajenkins_> Hmm...sorry, I'm being slow on this...so I'll need to know which aggregates I want up-front, at design time? If I get a new requirement once there's lots of data, I'm in trouble, right? [Wed Jun 17 2009] [09:34:19] <nickjohnson> krisajenkins_: That's correct [Wed Jun 17 2009] [09:34:31] <nickjohnson> Well, you can update your schema and run a bulk update of some sort once you already have data [Wed Jun 17 2009] [09:34:45] <nickjohnson> And if you want to do analytics style ad-hoc reporting, you can download your data and process it offline [Wed Jun 17 2009] [09:34:49] Part danmx has left this channel. [Wed Jun 17 2009] [09:34:56] <Jason_Google> dieterk: We're definitely aware of the request, as you can tell by the comments in the issue, but it probably won't make the next cut. [Wed Jun 17 2009] [09:35:29] <jdeibele> nickjohnson: I see ericsk mentioned one of the same issues I'm having (with the datastore viewer). Is there a way to say "it's all unicode"? I have # -*- coding: utf-8 -*- as second line of all programs [Wed Jun 17 2009] [09:36:04] <krisajenkins_> Okay, I can live with that. :-) But it raises the question, when will get data downloads? [Wed Jun 17 2009] [09:36:38] <Jason_Google> krisajenkins_: An early version of the data export tool is already available in the latest Python release. [Wed Jun 17 2009] [09:37:02] <nickjohnson> jdeibele: What issue are you having with the datastore viewer trying to view these entities? [Wed Jun 17 2009] [09:37:06] <krisajenkins_> Jason_Google: Any ETA for Java? 
[Wed Jun 17 2009] [09:37:30] <jdeibele> nickjohnson: datastore viewer says 'ascii' codec can't encode character u'\xa0' in position 309: ordinal not in range(128) [Wed Jun 17 2009] [09:37:32] <nickjohnson> The ultimate problem is that as far as the datastore is concerned, it's all bytes, and there's no way to determine what encoding you decided to use. But if you're seeing errors, that's still a bug. [Wed Jun 17 2009] [09:37:39] <nickjohnson> jdeibele: 'says' it where? [Wed Jun 17 2009] [09:37:43] <lent> we're considering having a large number of entities, maybe up to 10,000 in an entity group to ensure transaction semantics, this will hurt scalability/performance for writes but will there be a significant impact on reads/queries as well? [Wed Jun 17 2009] [09:37:46] <Jason_Google> krisajenkins_: I don't have an ETA, but we're working on reconciling the two SDKs as soon as we can. [Wed Jun 17 2009] [09:37:47] <nickjohnson> And is this on the dev_appserver, or in production? [Wed Jun 17 2009] [09:38:05] <krisajenkins_> Jason_Google: Fair enough. Thanks. [Wed Jun 17 2009] [09:38:08] <jdeibele> When I try to click on the entry in dev_appserver. So I have a list of articles and try to click on one article [Wed Jun 17 2009] [09:38:31] <nickjohnson> lent: The update rate for an entity group is far more important than its size. 10k entities in a group is fine, but doing more than 1QPS of sustained updates to an entity group may cause problems. [Wed Jun 17 2009] [09:38:47] <nickjohnson> jdeibele: Does this happen on production, too? [Wed Jun 17 2009] [09:39:08] <Jason_Google> lent: Using entity groups tends to improve read time, actually since the data is stored physically close together. But, as nick says, updates can be a problem if they're frequent enough (more than 1-2 per second). [Wed Jun 17 2009] [09:39:19] <dieterk> Jason_Google: why is it a new feature? one can set already some Header values. 
I guess you just filter out when people try to set certain Header values. Just stop filtering out the 'Content-Encoding' Header if its value is 'pack200-gzip'. I believe it is somehow frustrating for people if they cannot follow why it is such a big problem to allow it. Perhaps somebody could explain by posting a comment in the issue? [Wed Jun 17 2009] [09:39:21] Nick Lennie is now known as Lennie|Food. [Wed Jun 17 2009] [09:40:13] <lent> nickjohnson, Jason_Google: thank you for your insights [Wed Jun 17 2009] [09:40:20] <dfabulich> I have a question about datastore performance. In my case, it happens that I never need to query my entities; I just need to persist them and restore them by key. However, my objects typically have dozens of columns (80 columns sometimes). Would I improve my datastore performance pre-serializing the data myself and storing the data as a blob? Would I get fewer DatastoreTimeoutExceptions that way? [Wed Jun 17 2009] [09:41:13] <ryan_google> @dfabulich, i assume you're generally seeing timeouts on puts? not ges? try using indexed=False for your properties. [Wed Jun 17 2009] [09:41:25] <ryan_google> [btw, i heard there would be punch and pie?] [Wed Jun 17 2009] [09:41:27] Nick tobyr is now known as tobyr_google. [Wed Jun 17 2009] [09:41:35] <nickjohnson> ryan_google: Really? Is it transatlantic pie? [Wed Jun 17 2009] [09:41:41] <dfabulich> ryan_google: [there isn't any] [Wed Jun 17 2009] [09:41:50] <dankles> ryan_google: could you please explain specifically what indexed=False affects, performance-wise? [Wed Jun 17 2009] [09:41:55] <nickjohnson> Khaaaaaaaa- er, I mean Nooooooooo! [Wed Jun 17 2009] [09:42:13] <ryan_google> [more people will come if we say there's punch and pie!!!]
[Wed Jun 17 2009] [09:42:19] <ryan_google> http://www.urbandictionary.com/define.php?term=punch%20and%20pie [Wed Jun 17 2009] [09:42:22] <nickjohnson> dankles: Every property that's not indexed is (at least) one index entry that doesn't have to be written, updated, or deleted [Wed Jun 17 2009] [09:42:33] <ryan_google> what nick said :P [Wed Jun 17 2009] [09:42:55] <dfabulich> ryan_google: How do I set indexed=False using the Java Datastore API? http://code.google.com/appengine/docs/java/javadoc/com/google/appengine/api/datastore/package-summary.html [Wed Jun 17 2009] [09:42:56] <dankles> is there a difference between an update with indexed=False vs. an update where that specific property is not changed? [Wed Jun 17 2009] [09:43:42] <ryan_google> good question! no, no difference. updates only write index deltas [Wed Jun 17 2009] [09:44:20] <jdeibele> nickjohnson: not sure about production With another app, I use get_or_insert and have no problems with unicode I'm trying to use a = Article(key_name = mykey) because I assume I'm going to hit an article I've already seen so check, put, check, put and then stop both url (of the blog article) and the content are db.TextProperty because I'm not doing any ordering. [Wed Jun 17 2009] [09:44:26] <jaysern> a general direction question: any advice for someone learning Django and App Engine at the same time? I'm using examples from Ayman Hourieh's "Django 1.0 Web Site Development"; looks like he is also a Googler [Wed Jun 17 2009] [09:44:49] <Jason_Google> dieterk: It's a matter of priority at this point. There are a lot of items that the team is working on, as you can see in the product roadmap, but we have acknowledged the issue and there may be a resolution at some point. 
[Wed Jun 17 2009] [09:45:05] <jaysern> so far I'm relying on App Engine's official docs, that book from Ayman Hourieh, App Engine Django Helper's page [Wed Jun 17 2009] [09:45:06] <luddep> How big is the performance difference on a first .put() on an entity with 10 properties vs, lets say, 20 or even 30? [Wed Jun 17 2009] [09:45:09] <nickjohnson> jdeibele: Okay. I'm not quite sure how your access pattern applies to the problem [Wed Jun 17 2009] [09:45:21] <jaysern> and Django's official docs. Any other good resources to suggest? [Wed Jun 17 2009] [09:45:31] <Jason_Google> jaysern: See the articles page -- there are a whole host of articles on using Django and App Engine. http://code.google.com/appengine/articles/ [Wed Jun 17 2009] [09:45:50] <nickjohnson> luddep: You'd have to benchmark to say for certain. It also depends on what composite indexes your entity has [Wed Jun 17 2009] [09:46:36] <dankles> as a followup: what % of the datastore cpu time would you expect to be shaved off *roughly* for an initial put if all fields were unindexed vs. ~10 basic indexed property fields? [Wed Jun 17 2009] [09:46:44] <jaysern> I've seen a few there. Thanks Jason. Will check it out more [Wed Jun 17 2009] [09:47:50] <nickjohnson> dankles: Again, you'd have to benchmark to say for certain. I don't think any of us have those sort of performance figures memorized, and they may vary. :) [Wed Jun 17 2009] [09:47:51] <picalolabu> [python] do you have any recommendations for the maximum size (guesstimate) of a list to reside in memcache? [Wed Jun 17 2009] [09:48:04] <nickjohnson> But the rule of 'fewer index entries are better than more' will always apply ;) [Wed Jun 17 2009] [09:48:04] <dieterk> Jason_Google: when I look at other issues it is often a question of 'nice to have' but one can work around it. This issue is a critical bug from my point of view. It should get a much higher priority. Could we change the priority to Defect? 
[Wed Jun 17 2009] [09:48:15] <nickjohnson> picalolabu: 1MB is the hard limit. Other than that, pickle tends to be slow. [Wed Jun 17 2009] [09:48:31] <picalolabu> nickjohnson: thx [Wed Jun 17 2009] [09:48:55] <jdeibele> nickjohnson: at some point I use what's working for me, which is db.get and db.get_or_insert I'm trying to understand what I could be doing wrong with using the a = Article(key_name = key_name) a.put() syntax, which I haven't used before [Wed Jun 17 2009] [09:49:19] <nickjohnson> jdeibele: Oh, this is unrelated to the encoding issue? [Wed Jun 17 2009] [09:49:28] <lent> [java] for jpa/jdo, is using owned relationships the only way to put entities into the same entity group or is there some other way to do this? [Wed Jun 17 2009] [09:49:46] <nickjohnson> jdeibele: The only thing wrong with the latter approach is the potential for races - two concurrent requests could create an article with the same key_name, and the latter one would win [Wed Jun 17 2009] [09:49:48] <dennis_tw> i was looking around for tips on coding around the datastore timeout problem. found gaeutilities and that's it. found a reference about creating my own wrapper to retry puts. also found a reference saying retrying does not help much (there is already a run_in_transaction_custom_retries). wondering what the current recommendation is for coding to avoid timeouts [Wed Jun 17 2009] [09:51:14] <dfabulich> Sayyy. I've searched all through the Javadoc for the low-level datastore API and can't find any way to make properties unindexed. Is it even possible? [Wed Jun 17 2009] [09:52:20] <dennis_tw> is there a way to avoid timeouts or should we just push these to the end user and have them retry? [Wed Jun 17 2009] [09:52:56] <knoonan> Are any of you going to be at the "EuroPython" conference (in the UK, the week after next)? I'll be there. 
[Wed Jun 17 2009] [09:52:58] <nickjohnson> dennis_tw: Timeouts are sometimes caused by contention, which you can take steps to alleviate, but some timeouts will occur inevitably. You can retry them if you wish, and error out if they persist.
[Wed Jun 17 2009] [09:53:11] <nickjohnson> knoonan: I will be, staffing the Google stand. Stop by and say hi!
[Wed Jun 17 2009] [09:53:50] <dankles> oh - "run_in_transaction_custom_retries" - is this new? i see it here, never noticed: http://code.google.com/appengine/docs/python/datastore/functions.html
[Wed Jun 17 2009] [09:53:50] <jdeibele> nickjohnson: I guess I'm trying to say that this is my third app and I have almost no unicode issues with the other two. I'm trying to understand what's wrong but I've spent too much time on it already. I'll switch to what has worked for me before. Thanks.
[Wed Jun 17 2009] [09:54:36] <maxoizo> Hi google team. Are you planning in foreseeable future "Country-specific Storage"? (issue 193)
[Wed Jun 17 2009] [09:54:48] <dfabulich> oh! I figured it out. There's a "setUnindexedProperty" method on the Entity object, but it's not available in the online javadoc, but only in the Javadoc on my hard drive!
[Wed Jun 17 2009] [09:55:11] <dfabulich> http://code.google.com/appengine/docs/java/javadoc/com/google/appengine/api/datastore/Entity.html is missing setUnindexedProperty
[Wed Jun 17 2009] [09:55:13] <dennis_tw> nickjohnson: so don't need to code anything special for the actual put (other than contention). and just make sure end user can retry via my code?
[Wed Jun 17 2009] [09:55:17] <Jason_Google> dieterk: I can change the issue type to defect, but it likely won't have a huge impact on the priority. As critical as the issue is to you, there don't appear to be too many other developers running up against it, and we have to prioritize the features that will have the largest impact. That's not to say that we won't get around to addressing this issue, but there are other requests in the pipeline that have a higher priority right now
[Wed Jun 17 2009] [09:55:22] <dfabulich> is there a bug I should file here on the documentation?
[Wed Jun 17 2009] [09:55:36] <dfabulich> Is the online javadoc just out-of-date? Is that known?
[Wed Jun 17 2009] [09:55:39] <knoonan> nickjohnson: Great, see you at EuroPython!
[Wed Jun 17 2009] [09:56:52] <scudder_google> dfabulich: Yes we are aware and we will update the javadocs
[Wed Jun 17 2009] [09:58:20] <ribrdb> dieterk: This isn't as simple as allowing you to set the content encoding. Your app never gets the accept encoding header, so you have no way of knowing if it is ok to use pack200-gzip
[Wed Jun 17 2009] [09:58:29] <dankles> There is an argument in http://code.google.com/p/googleappengine/issues/detail?id=1680 about whether or not Task Queue will have 30s limit or not. Is this known? (Sorry if this was covered already)
[Wed Jun 17 2009] [10:00:57] * ryan_google is back and catching up
[Wed Jun 17 2009] [10:00:59] <ryan_google> @luddep, the latency for new put() with 10 vs 20 vs 30 properties will increase with the number of properties, but not linearly. the more important difference is the cpu cost. that does increase linearly. @dankles, not sure, but it should be pretty noticeable.
[Wed Jun 17 2009] [10:01:02] <ryan_google> re timeouts, @dennis_tw, we'd need more details. if you're doing a lot of writes inside a single transaction, that can definitely contribute. we currently do retry writes in the datastore backend, with backoff; in 1.2.3 we're going to start retrying reads too, which should make a noticeable difference.
[Wed Jun 17 2009] [10:01:04] <ryan_google> @dfabulich, sorry for the confusion re unindexed properties in the javadocs. we do know we need to update them.
[Wed Jun 17 2009] [10:01:07] <ryan_google> @dankles, run_in_transaction_custom_retries was added a few releases ago. (glad you're interested!)
[Wed Jun 17 2009] [10:01:09] <ryan_google> @dankles, yes, for the foreseeable future, task queue handlers will still be subject to the 30s limit
[Wed Jun 17 2009] [10:01:18] <dan_google> Task queue tasks will have the 30s limit, but a task can enqueue another task to drive more work.
[Wed Jun 17 2009] [10:02:15] <luddep> ryan_google: ok! thanks.
[Wed Jun 17 2009] [10:02:16] <dennis_tw> ryan_google: i'm just starting coding so wondering if i need to create a wrapper around put. sounds like it won't help
[Wed Jun 17 2009] [10:02:34] <dankles> ok thanks a lot guys for another great chat
[Wed Jun 17 2009] [10:02:48] <ryan_google> er, sorry, to clarify, @dennis_tw re timeouts on txes: we currently retry puts and deletes outside txes in the datastore backend, but not tx commits themselves, since we can't automatically retry read/modify/write patterns in the datastore backend. we retry your code, in userland, to do that.
[Wed Jun 17 2009] [10:03:09] <scudder_google> It's a few minutes past the hour, so I'd like to call our hour long chat time to an end
[Wed Jun 17 2009] [10:03:23] <dfabulich> thanks googlers!
[Wed Jun 17 2009] [10:03:26] <scudder_google> some of us will still be online, so feel free to keep asking away
[Wed Jun 17 2009] [10:03:27] <krisajenkins_> Thanks for the info! :-)
[Wed Jun 17 2009] [10:03:28] <ryan_google> @dennis_tw: it may help a little, but not a lot. reducing the amount of work you do in a single put will also help. timeouts will never go away altogether, though, so handling them gracefully is always a good idea.
[Wed Jun 17 2009] [10:03:33] <ryan_google> thanks scudder!
[Wed Jun 17 2009] [10:03:47] <bthomson> is there any plan for custom # of retry on puts also?
[Wed Jun 17 2009] [10:03:48] <scudder_google> thanks all for the great questions and comments, glad we could "meet" you :-)
[Wed Jun 17 2009] [10:03:56] <ryan_google> @bthomson: sorry, no
[Wed Jun 17 2009] [10:04:02] <jaysern> thanks guys!
[Wed Jun 17 2009] [10:04:07] <dennis_tw> great, that sounds like good advice for my initial release
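
Editor's note: the advice given to dennis_tw in the transcript (retry a timed-out write a bounded number of times, then error out so the user can retry) might look roughly like the sketch below. It is only an illustration under stated assumptions: `DatastoreTimeout`, `retry_on_timeout`, and `make_flaky_put` are hypothetical names and are not part of the App Engine SDK. In a real application you would catch the SDK's own timeout exception around your `put()` call instead of the stand-in class used here.

```python
import time


class DatastoreTimeout(Exception):
    """Hypothetical stand-in for the SDK's datastore timeout exception."""


def retry_on_timeout(operation, attempts=3, initial_delay=0.1, sleep=time.sleep):
    """Call operation(), retrying on DatastoreTimeout with exponential backoff.

    If every attempt times out, the last exception is re-raised so the
    caller can report the failure and let the user try again.
    """
    delay = initial_delay
    for attempt in range(1, attempts + 1):
        try:
            return operation()
        except DatastoreTimeout:
            if attempt == attempts:
                raise  # give up: the timeout persisted
            sleep(delay)
            delay *= 2  # wait twice as long before the next attempt


def make_flaky_put(failures):
    """Build a fake put() that times out `failures` times, then succeeds."""
    state = {"calls": 0}

    def flaky_put():
        state["calls"] += 1
        if state["calls"] <= failures:
            raise DatastoreTimeout("simulated timeout")
        return "saved after %d calls" % state["calls"]

    return flaky_put


if __name__ == "__main__":
    # Two simulated timeouts, then success on the third attempt.
    print(retry_on_timeout(make_flaky_put(2), sleep=lambda seconds: None))
```

As ryan_google notes, a wrapper like this "may help a little, but not a lot", and it is only safe around operations that can be repeated without side effects; read/modify/write logic belongs inside a transaction function that is retried as a whole.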