This past Wednesday, the App Engine team hosted the latest session of
its bimonthly IRC office hours. A transcript of the session and a
summary of the topics covered are provided below. The next session will
take place on Wednesday, November 4th from 7:00-8:00 p.m. PST in the
#appengine channel on irc.freenode.net.

Note that this will be the first Chat Time to occur after daylight
saving time ends in the U.S., which means that it will take place one
hour later than usual in countries or states that don't observe
daylight saving time. Please be aware of this time difference so you
don't inadvertently miss the session.

- Jason


--SUMMARY-----------------------------------------------------------
- Q: Why am I seeing more than 0.1% of datastore operations time out,
and is anything being done to reduce this? A: A certain level of
datastore timeouts is expected (generally between 0.1% and 0.2% of all
datastore operations), but we are actively working on ways to improve
datastore reliability. If you are seeing a much higher rate, be sure to
inspect your data model for write contention, which often manifests as
datastore timeouts. [9:02-9:07]

- Q: What is the recommended approach to datastore capacity planning
ahead of a large bulk upload? A: Entities are stored as protocol
buffers (http://code.google.com/p/protobuf/) -- if you familiarize
yourself with the protobuf specification, you can determine the space
needed to store each entity, minus the datastore overhead, fairly
easily. An article is coming out soon which explains how entities and
indexes are stored in much more detail. [9:04-9:05]

- Q: Can a high level of read operations result in datastore
contention? A: Datastore contention is usually the result of too many
attempted concurrent writes to the same datastore entity or entity
group. Before implementing your data model, consider the expected
read/write access patterns and design your data model accordingly,
sharding entities that you expect to update more than once per second
(http://code.google.com/appengine/articles/scaling/contention.html).
While concurrent writes generally result in contention, concurrent
reads generally result in better performance due to caching.
[9:08-9:09, 9:11-9:13, 9:18]

- Q: Are there any plans to support more file extensions for
attachments to outgoing email, e.g. .doc, .docx, etc.? A: There are no
immediate plans to support these extensions due to the prevalence of
viruses contained in files of these types. In the meantime, you can
include links to the files or share them via Google Docs. [9:14, 9:16,
9:19-9:20]

- Q: What is the recommended approach to paging large data sets in App
Engine? A: The offset approach is *not* recommended because it won't
work for result sets larger than 1,000. Until datastore cursors are
available, the recommended approaches are summarized in
http://code.google.com/appengine/articles/paging.html. [9:21-9:23]

- Q: How can one avoid exploding indexes when using list properties?
A: In general, you should avoid referencing more than one list
property in any query, especially if one or both list properties
contain a large number of elements. Index rows have to be added for
every permutation of values in the lists, which can result in index
"explosion". See the video at
http://sites.google.com/site/io/under-the-covers-of-the-google-app-engine-datastore
to learn more about why exploding indexes occur. [9:22, 9:26,
9:28-9:30, 9:32-9:33, 9:40]

- Q: In Java, can one use sequence methods in JPA to get a sequence of
datastore IDs? A: No, you have to use the low-level datastore API's
allocateIds() method for now. [9:31, 9:33]

- If you're looking to use Google Web Toolkit (GWT) and App Engine
together, there are a number of combo samples available in
http://code.google.com/p/googleappengine/source/browse/#svn/trunk/java/demos
including gwtguestbook, sticky, and taskengine. [9:46, 9:48,
9:50-9:51]

- Q: What is being done to address long initialization times for Java
applications? A: We are definitely aware of the issue and are rolling
out several back-end enhancements over the next few releases to try to
minimize this startup time as much as possible. [9:52-9:53]
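
As an illustration of the sharding advice in the contention answer
above, here is a minimal in-memory sketch. It does not use the App
Engine datastore API; ShardedCounter, num_shards, and the list of
integers are hypothetical stand-ins for a set of small entities, each
of which would live in its own entity group. The idea is only that
writes are spread over several entities while a read sums them.

```python
import random


class ShardedCounter:
    """In-memory stand-in for a counter split across several entities."""

    def __init__(self, num_shards=20):
        # Each slot stands in for one small entity in its own entity group.
        self.shards = [0] * num_shards

    def increment(self):
        # Pick a shard at random so concurrent writers rarely hit the same one.
        index = random.randrange(len(self.shards))
        self.shards[index] += 1

    def value(self):
        # A read has to visit every shard and add the partial counts.
        return sum(self.shards)


counter = ShardedCounter(num_shards=5)
for _ in range(1000):
    counter.increment()
print(counter.value())  # 1000
```

More shards allow a higher write rate at the cost of a more expensive
read, so the shard count is a tuning choice rather than a fixed rule.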

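The paging answer above (remember the sort value of the last entity
seen, then pick up where you left off) can be sketched without the
datastore as follows. The sorted list stands in for a query ordered by
key, and fetch_page and its parameters are hypothetical names chosen
for this example, not part of any App Engine API.

```python
import bisect


def fetch_page(sorted_keys, page_size, last_key=None):
    """Return up to page_size keys that sort strictly after last_key."""
    # bisect_right finds the first position after last_key, which mirrors
    # a "key > last_key" filter on a query ordered by key.
    start = 0 if last_key is None else bisect.bisect_right(sorted_keys, last_key)
    page = sorted_keys[start:start + page_size]
    # A full page may have more results behind it, so hand back a bookmark.
    next_bookmark = page[-1] if len(page) == page_size else None
    return page, next_bookmark


keys = ["key%04d" % i for i in range(2500)]
seen = []
bookmark = None
while True:
    page, bookmark = fetch_page(keys, 1000, bookmark)
    seen.extend(page)
    if bookmark is None:
        break
print(len(seen))  # 2500
```

Because each request starts from a bookmark rather than an offset, the
loop walks past the 1,000-result mark without any special handling.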

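To put rough numbers on the exploding-index answer above, the sketch
below counts one index row per combination of values when two list
properties appear in the same custom index. This is a simplified model
that assumes a plain cross product of the lists; index_rows and the
sample values are made up for illustration.

```python
from itertools import product


def index_rows(*list_properties):
    """Return one tuple per combination of values, one value per list."""
    return list(product(*list_properties))


tags = ["a", "b", "c"]
owners = ["x", "y"]
rows = index_rows(tags, owners)
print(len(rows))  # 6

# 30 values in each of two lists gives 30 * 30 rows for a single entity.
print(len(index_rows(range(30), range(30))))  # 900
```

Under this model the row count is the product of the list lengths, so
adding a third list property multiplies it again.
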
--FULL TRANSCRIPT---------------------------------------------------
[09:01am] scudder_google: Hi all, welcome to another installment of
our hour-long chat time with people on the App Engine team
[09:01am] johnvdenley: Is there any kind of formality to this session?
or is it just a free for all?
[09:01am] moraes: take what you can!
[09:02am] moraes: meh.
[09:02am] Jason_Google_: johnvdenley: It's basically a free-for-all.
[09:02am] scudder_google: so far from Google we have nickjohnson,
Jason_Google and a few others may join as we go
[09:02am] scudder_google: yes, jump right in questions and comments
welcome
[09:02am] mbw: Is anything being done to reduce timeouts?  I am seeing
a lot more than .01% timeouts.  We even use a low level catch and
retry trick to try and reduce its effect.  We saw a huge spike of them
yesterday at one point.
[09:02am] johnvdenley: OK, brb then, just need to move my car!...
[09:03am] scudder_google: mbw are these timeouts with datastore
operations?
[09:03am] mbw: yes
[09:03am] nickjohnson: mbw: We're actively working on datastore
timeouts. Bear in mind that they're frequently highly correlated: When
you see them at all, they come in batches.
[09:04am] brett_ae: heyo
[09:04am] dw: re: idle question from last week, is there any good
advice going on capacity planning for datastore? i note that even very
small entities have a metadata overhead of 100+ bytes, and was
wondering how that metadata number is calculated (is it constant,
dependent on indexed fields, field count, etc.)
[09:04am] scudder_google: ah ok, there are a few things that you can
do but a small percentage of timeouts is currently expected
[09:04am] mbw: we see a steady amount of timeouts during the day.
[09:04am] mbw: i'd be happy with .01% ...
[09:05am] Jason_Google_: dw: I have an article coming out really soon
that explains all this. I'll try to get it published in the next week,
if you can hold out.
[09:05am] nickjohnson: dw: Entities are stored as Protocol Buffers;
the overhead in the datastore stats is simply the total size of the
entity's PB less the space used for each field.
[09:05am] dw: Jason_Google: that's great.  more a curiosity than
anything right now
[09:05am] scudder_google: I'm assuming these are timeouts on writes,
about how many indexes need to be updated with a write
[09:05am] nickjohnson: The simplest way to reduce overhead is to use
shorter field names.
[09:05am] dw: nice
[09:06am] mbw: timeouts happen on reads for us as much as writes.
They don't seem to happen any more on big operations vs. small simple
queries or gets
[09:06am] nickjohnson: You can specify the field name to use
internally to the Property subclass constructor, by the way, so you
don't need to compromise the design of your model.
[09:06am] dw: nickjohnson: +10 points for preempting the evil thoughts
i was having
[09:07am] nickjohnson: mbw: Do you typically tend to write a lot to
the same entity groups?
[09:07am] scudder_google: mbw: ah ok, I'd like to look into this more
specifically for your app, what is the app ID?
[09:07am] mbw: scudder_google: ill PM it
[09:07am] moraes: i was thinking along the lines of 'store everything
in a big pickle property named "a".'
[09:08am] _tmatsuo: Talking of timeouts, if there's too many accesses
to a particular node in a short period of time, could it be a reason
for datastore timeouts?
[09:08am] nickjohnson: moraes: Pickle is, amongst other things,
bulky.
[09:09am] nickjohnson: _tmatsuo: For a given value of 'too many', yes
[09:09am] brett_ae: _tmatsuo: for writes, possibly; for reads, no; it
should actually get faster
[09:09am] brett_ae: because of caching
[09:09am] johnvdenley: whats the status of a local datastore viewer?
[09:09am] moraes: johnvdenley: there's one.
[09:10am] dw: johnvdenley: /_ah/admin/datastore url when running
dev_appserver
[09:11am] Jason_Google_: johnvdenley: A local data viewer was added in
the Java SDK a couple of releases back.
[09:11am] mbw: scudder_google:  did you receive my PM?
[09:11am] _tmatsuo: nick: brett_ae: thanks. In such a case (timeouts on
writes because of massive access), in my opinion, re-partitioning of
data will help reduce timeouts. Is there any mechanism for
re-partitioning of data?
[09:12am] scudder_google: mbw: yes, just replied, apologies for the
delay
[09:12am] nickjohnson: _tmatsuo: That's too general a question to
answer as-is. It depends highly on the data model in question.
[09:12am] nickjohnson: Frequently, simple optimisations do make a big
difference, though
[09:12am] brett_ae: _tmatsuo: you can do a migration (by changing your
schema/entity groups) yourself, which can be difficult; easiest thing
to do is think about your datamodel ahead of time and think of your
read/write access patterns
[09:12am] johnvdenley: ah, apologies, i must have been reading an old
message
[09:13am] brett_ae: _tmatsuo: So if you know you're writing to a
single piece of data more than once per second, maybe split it
somehow?
[09:14am] max-oizo: When I was doing "diff" between versions 1.2.5 and
1.2.6, I found a CompiledQuery. What is it? Part future support
cursors?
[09:14am] Sylvain_: News to support new extensions in the e-mail
service ? and particularly MS Office (Word, Excel) and Open Office
files ? (issue 494).
[09:14am] brett_ae: max:
[09:14am] Sylvain_: For example, We'd like to create a "HR section"
where people can send their résumé (most of the time) : .doc, docx.
We'd like to send them by mail then.
[09:14am] Sylvain_: And if possible not case sensitive (issue 493)
[09:16am] brett_ae: sylvain_: It's something we should support; the
concern thus far has been virus propagation
[09:17am] johnvdenley: AWESOME http://localhost:8080/_ah/admin/datastore
works beautifully  thanks!
[09:18am] _tmatsuo: brett_ae: Ok. That's understandable, but what if
any node which holds my particular entity group also has another
entity-group of other application, which is massively accessed by
other application? Is there anything I could do?
[09:18am] wcr: Good morning folks.
[09:18am] wcr: or afternoon
[09:18am] brett_ae: _tmatsuo: You're confusing bigtable separation of
data (which is transparent to you, the developer) and entity group
separation (which you, as a developer, are in full control of)
[09:18am] brett_ae: bigtable separation should not affect you
[09:19am] brett_ae: some of this is here:
http://code.google.com/appengine/articles/scaling/contention.html
[09:19am] Sylvain_: brett_ae: ok, thank you. I didn't know word,
excel,... was a big source of virus. I can
understand .cmd .bat .vbs, .js,....
[09:19am] nickjohnson: Sylvain_: In the meantime, emailing users links
to download the doc is probably a good plan. That or share it with
them on Google Docs.
[09:19am] Jason_Google: wcr: Hi
[09:20am] moraes: max-oizo: CompiledQuery is used by cursors. res =
query.fetch(10) / cursor = query.cursor() / next_res =
query.with_cursor(cursor).fetch(10) -> last time i checked, was only
working in dev, cursors are always None in production.
[09:20am] brett_ae: sylvain_: Yeah they're pretty notorious
[09:20am] Sylvain_: yes nickjohnson , but my users want to receive a
mail with an attachment not a link
[09:20am] max-oizo: And another, i found that an images API will
support a blob_key in the constructor. When can we expect a support of
"Service for storing and serving large files"?
[09:21am] brett_ae: max: good digging you've done
[09:21am] brett_ae: nothing to announce today, but it's on the roadmap
[09:21am] wcr: What is currently the best method for paging results,
since offset is not an option > 1000.  Someone mentioned something
about sorting by key, anyone have any more details?
[09:21am] practicalint: newbie. Updated eclipse with latest GWT and AE
plugins. Can't get Taskengine demo to run. should it work out of the
box with the latest?
[09:22am] nickjohnson: wcr: The basic technique is to store the value
of the sort field (Which by default is the key) for the last entity
you saw, then pick up where you left off.
[09:22am] max-oizo: 2brett_ae: only diff with winmerge
[09:22am] nickjohnson: There are libraries that will help with this
[09:22am] Aneon: it would be nice with some more
documentation/articles about indexes, especially related to list
properties, as these are so important in app engine. for example, i
would like to know more about when exploding indexes actually becomes a
performance/storage problem
[09:23am] wcr: nickjohnson:  Do you have a blog post about this by any
chance?
[09:23am] scudder_google: There are also several pagination
techniques discussed in this article
http://code.google.com/appengine/articles/paging.html
[09:23am] johnvdenley: practicalint, there is a setting you have to
add to the properties for the java VM, Ill see if I can find the
article about this, unless someone else beats me to it!
[09:23am] nickjohnson: wcr: 
http://code.google.com/appengine/articles/paging.html
[09:23am] nickjohnson: wcr: also
http://appengine-cookbook.appspot.com/recipe/efficient-paging-for-any-query-and-any-model/
[09:23am] nickjohnson: scudder_google: Snap!
[09:24am] wcr: lol
[09:24am] scudder_google: nickjohnson yes jinx!
[09:25am] scudder_google: Aneon: that's a great idea, in fact we've
been thinking about publishing some more datastore related articles in
the not too distant future
[09:26am] max-oizo: 2google_team: in issue 354 niall.kenned wrote
recently: "bslatkin of GAE team confirmed last week he is working on
this feature and will work similar to urlfetch." - is it true?
[09:26am] scudder_google: Aneon: the threshold of pain for exploding
indexes depends in part on how much data you have.
[09:27am] wcr: soon(tm!)
[09:27am] max-oizo: * Issue 354: Feature: DNS Lookup
[09:27am] nickjohnson: max-oizo: What feature?
[09:27am] Aneon: scudder_google: that sounds great! just what i feel
missing. because i feel that it's very hard in the beginning, before
you have your app up and running and can perform tests, to actually
make good judgement on model design, especially related to list
properties.
[09:27am] max-oizo: 2nick: Feature: DNS Lookup
[09:28am] Aneon: scudder_google: how much data you have in the
specific list property you mean? or in total?
[09:29am] scudder_google: Aneon: what I was trying to get at is that
say you had a list property in a model and each entity has just a few
values in it
[09:29am] nickjohnson: Aneon: The 'under the covers' datastore talk is
an excellent one to watch if you want to learn about why exploding
indexes happen
[09:29am] scudder_google: ah yes, good suggestion Nick
[09:29am] Aneon: thanks, i'll make a search for that
[09:30am] Jason_Google: Aneon: There's a reasonable explanation in the
indexes documentation, but you basically want to avoid, as much as
possible, having more than one indexed list property for a given kind,
especially if you plan to store a lot of values in them. Due to the
way that indexes have to be built using the various permutations of
these lists, etc.
[09:30am] _tmatsuo: brett_ae: Thanks. That's good to know developers
shouldn't care about other applications' datastore access.
[09:30am] scudder_google: Aneon: so with just a few values, indexing
all 2 pair combinations might not be too bad, but if you start
indexing three pairs, or you have lists with a large number of values,
the number of index rows per entity increases exponentially
[09:30am] Jason_Google: Jinx x 2
[09:31am] scudder_google: we're on a roll today
[09:31am] wcr: Anyone happen to be using any libraries that handle
paging?  Looking for a recommendation
[09:31am] lent: <java> Is there any way to define and access a
sequence directly in JPA with Appengine?  When allocation of ids came
in with issue 110, Max Ross commented that appengine supports
pm.getSequence() in JDO which allows for accessing sequence directly.
Any thing like this with JPA?  If not, is the only way to directly use
the low level api with allocated ids?
[09:32am] johnvdenley: practicalint, im not sure this is the right
link, but It sounds like the same issue I had:
http://groups.google.com/group/google-appengine-java/browse_frm/thread/3497eec1c4908bbf/14b2963f245a37f4?lnk=gst&q=1.7.1#14b2963f245a37f4
[09:32am] nickjohnson: wcr: My recommendation would be to wait,
personally
[09:32am] moraes: wcr: simply use the one described here
http://code.google.com/appengine/articles/paging.html while cursors
don't come.
[09:32am] wcr: nickjohnson:  wait for what?
[09:32am] max-oizo: 2google_team: So it's true or not? (about bslatkin
and comment into the issue 354:Feature: DNS Lookup)
[09:32am] Aneon: Jason_Google: that's what i was suspecting. so a kind
with two list properties with about 20-40 values each could be a
problem? and is it only a problem when you actually filter on them on
reads (if you disregard the storage costs)?
[09:32am] moraes: cursors!
[09:32am] wcr: oh!
[09:33am] wcr: I haven't read about cursors... someone link me!
[09:33am] tobyr: lent: Yes, currently if you're using JPA, you'll need
to use the low level API to get at "sequences".
[09:33am] nickjohnson: Aneon: Only if they're both used in the same
custom index
[09:33am] nickjohnson: Using the same list property more than once in
a custom index has the same issues
[09:34am] nickjohnson: wcr: There's no public docs yet
[09:34am] moraes: wcr: there's no link, as they were not released.
you'll find hints in google.appengine.ext.db code.
[09:35am] wcr: I'll look forward to seeing this completed sometime
this weekend, thank you.
[09:35am] wcr: But yeah, that's great to hear
[09:35am] practicalint: johnvdenley doesn't look like my problem. Have
RT arg of tasks. I can get it to run to the point of logging in from
web loading the task module, then runtime exception cannot load
module.
[09:36am] johnvdenley: practicalint, i would suggest cutting and
pasting the error into the groups search and seeing what comes up
[09:39am] practicalint: johnvdenley I have - most of the hits are
about files not containing all the proper values which came from the
demo. Had to change one set properties to set configuration properties
due to GWT changes.
[09:39am] _tmatsuo: google_team: Will appengine/java have remote_api
soon?
[09:39am] nickjohnson: _tmatsuo: It's being worked on.
[09:40am] Aneon: nickjohnson: you mean only when i define them in
index.yaml? ah, didn't know the same problem could happen with one
property used two times in one index. i'll try to read up and
experiment more with this, but it would be really cool with some more
in-depth articles and best-practices/example texts than those
available in the official docs. they're nice, but i feel they could be
expanded on a lot as this is such an important area for
[09:40am] Aneon: optimization
[09:40am] practicalint: johnvdenley I suspect other GWT changes not
compatible with demo code, maybe in conjunction with I have ie8 on the
box maybe?
[09:40am] nickjohnson: Aneon: yes
[09:41am] johnvdenley: practicalint, ive only been on GWT/GAE for a
couple of months, your problem seems to be beyond my capabilities!
[09:41am] Aneon: thanks for the response guys, have to go. keep up the
good work, i like app engine a lot!
[09:43am] johnvdenley: Id just like to say a big thanks to all at
google for providing GWT/GAE, Ive only been using it (and java) since
September, and I did a demo of my new application today to my business
partners, who were simply stunned at the functionality and speed that
has been achieved in such a short development period
[09:44am] nickjohnson: johnvdenley: We're the wrong people to thank,
but happy to accept your gratitude anyway
[09:44am] practicalint: johnvdenley OK thanks, I was picking on you
cause you responded. anyone know if re-running the demos would occur
before the plugin release - trying to figure out is it me or is the
demo broken
[09:46am] johnvdenley: nickjohnson, feel free to pass it on to the
"right" people
[09:46am] practicalint: I learn/code best from example and was trying
to use demo as GWT/GAE example to build on. suggestions for another
place to get such an example with code ?
[09:48am] Jason_Google: practicalint: There are a few GWT/GAE sample
apps -- gwtguestbook, stickynotes, etc. Have you looked at these?
[09:48am] Jason_Google: practicalint: The full set of demos is here:
http://code.google.com/p/googleappengine/source/browse/#svn/trunk/java/demos
(stickynotes is actually called sticky)
[09:49am] _tmatsuo: currently, google moderator have many users and
series. Is google moderator billing enabled?
[09:49am] johnvdenley: practicalint, i think i have a fully working
stockwatcher demo which takes some concepts from the
stickynotes/guestbook somewhere which I can send over to you. Sounds
like you learn the same way I do, as I say Im only 2 months ahead of
you and I had real trouble getting the standard demos actually working
fully!
[09:50am] scudder_google: practicalint: the taskengine demo also uses
GWT
[09:50am] practicalint: Jason_Google yes - each different purposes
(and I can run them). I want GWT with GAE as I am trying to
learn/deploy with both
[09:50am] Jason_Google: _tmatsuo: I imagine that Moderator does
require more than the free quotas, but it's also a Google application
so it's not necessarily billed the same as third-party apps.
[09:51am] practicalint: scudder_google thats the one I can't get to
run
[09:51am] Jason_Google: practicalint: After the chat, I'll try to look
into why the other demo isn't working and see if it's an issue in the
demo itself. In the meantime, the other apps that I mentioned and the
updated StockWatcher app should be enough to get you started.
[09:52am] WdWeaver: I'm interested in improvement of spinning up time
for appengine/java. What is the status of that?
[09:52am] _tmatsuo: Jason_Google: Thank you. Could it be possible for
us to know how much it costs if google moderator is third-party app
and billing enabled?
[09:53am] Jason_Google: WdWeaver: We have various improvements that
we're working on, spreading them out over the next few SDK releases.
Hopefully you'll begin to see the startup time improving, slowly but
surely.
[09:53am] max-oizo: 2Jason: You promised to do the off-line
documentation in other languages (ru/etc.). How soon will it happen,
or is it better for me to parse it with a special program?
[09:54am] Jason_Google: _tmatsuo: I don't have this data point
on-hand, but I'll try to find out sometime.
[09:55am] johnvdenley: oh, have we solved the problem with the hosted
browser trying to load 127.0.0.1 instead of the application.html? the
groups suggest its IE caching, but after having this problem hundreds
of times, ive pretty much proved that clearing the IE cache has no
direct effect, it usually is just a matter of waiting a few minutes,
but sometimes even waiting 20 minutes still doesnt clear the problem!
[09:55am] Jason_Google: max: I asked the person who built the script
that automatically packages the documentation set shortly after our
last conversation. They're working on enhancements to automatically
get the intl docs in there also, but he didn't give me an ETA. So it
will probably be a little while, but it will happen.
[09:55am] practicalint: johnvdenley: does your example use GWT?
[09:57am] Jason_Google: johnvdenley: Unfortunately, I'm not a GWT
expert. Best to ask that in the GWT discussion forums.
[09:57am] practicalint: Jason_Google: thanks please let me know what
you find. my gmail is the same nick here.
[09:57am] _tmatsuo: Jason_Google: Thanks. That data could encourage
more developers diving into appengine
[09:58am] max-oizo: 2Jason: thank you very much
[09:59am] wcr: Are there any plans to allow adsense generated revenue
to go into your billing enabled pool, or is that too much to
coordinate ?
[09:59am] johnvdenley: practicalint yes stockwatcher uses GWT & GAE.
Ive been using GWT/GAE as if they are one tool, as such i often get
confused about which question to direct (case in point, my question
above, which appears to be a GWT question rather than a GAE
question!!)
[10:00am] johnvdenley: (Im being surprisingly chatty for someone who
WAS planning on just watching this discussion!)
[10:01am] Jason_Google: wcr: Interesting, I just got that question in
person a few days back. The short answer is, it's not on the
short-term roadmap, so I wouldn't expect anything like this around the
corner. But I do concede that it's an interesting integration angle.
[10:01am] nickjohnson: deferred.defer(developer_chat,
_eta=datetime.datetime(2009, 11, 4, 19, 00, 00))
[10:03am] nickjohnson: In other words, this marks the end of this
week's official developer chat. Some of us will be around for longer,
and there are many enthusiasts in the channel, so feel free to ask
questions any time.
[10:03am] max-oizo: goodbye!