[google-appengine] Chat Time transcript for April 21, 2010

Jason (Google) Wed, 21 Apr 2010 15:20:08 -0700

The high-level summary and complete transcript of the April 21st
edition of the IRC office hours is pasted below. Join us in two weeks,
Wednesday, May 5th from 7:00-8:00 p.m. PST for the next installment.


Note: On the first and third Wednesdays of every month, the App Engine
team signs into the #appengine IRC channel on irc.freenode.net for an
hour-long chat session. On the first Wednesday, we meet in the channel
from 7:00-8:00 p.m. PST (evening hours), and on the third Wednesday,
we're available from 9:00-10:00 a.m. PST (morning hours). Please stop
by!

- Jason


--SUMMARY-----------------------------------------------------------
- There is a slight error in our webapp framework documentation -- if
you're using templates, you should specify the path relative to the
location of the controller, not the root directory. We'll address this
docs bug shortly. [9:03, 9:05, 9:09]

- Discussion on various forthcoming roadmap features (OpenID, OAuth,
reserved instances) [9:05, 9:08, 9:11, 9:13, 9:18, 9:38]

- Q: What is the best way to estimate the total number of entities for
paging display (e.g. 1-10 of 400)? A: Try to avoid using count()
queries if possible. You can use a keys-only query to retrieve the
keys of all entities that match specific criteria and count those, but
if you do this, you should cache the count to avoid re-querying on
every request. You may also consider estimating or otherwise giving a
rough number if the exact count isn't important, similar to Gmail
(e.g. 1-20 of hundreds). [9:17, 9:19-9:32]

- Discussion on cold startup times for Python and Java applications --
all applications are cycled out after a certain period of time when no
requests come in. Startup times are generally better for Python apps
since the Python runtime doesn't have to load a JVM and most Python
frameworks are smaller than comparable Java frameworks. For now, the
only way to ensure that your application remains warm is to have
steady traffic (generally more than one request per minute) from
actual users, not cron pingers. Reserved instances, which are on the
roadmap, will enable you to pay to keep your application loaded in the
absence of traffic. [9:29, 9:34-9:36, 9:38-9:41, 9:43, 9:49-10:04]

- Discussion on backwards-traversing cursors [9:43-9:45]


--FULL TRANSCRIPT---------------------------------------------------
[09:02am] apijason_google: Hi Everyone. Welcome to the second IRC
office hour session of the month. I and a few others will be in the
channel for the next hour to answer any App Engine questions if you've
got any, so fire away.
[09:03am] perlmonkey2: hmm, the GAE webapp docs say that for template
paths, the current dir is the application root directory.  but passing
in templates/mytemplate.html which is under the MyApp directory isn't
working.  Is this a bug or am I not Doing It Right?
[09:04am] perlmonkey2: oh, office hours.  I'll shut up for real
questions
[09:05am] rsaccon:  @apijason_google is there anything you can tell us
about forthcoming OpenId / oAuth  service, beyond what is on the
roadmap ?
[09:05am] apijason_google: perlmonkey2: It works for me. This is the
template path I set in my own apps:  path =
os.path.join(os.path.dirname(__file__), 'templates', 'index.html')
where templates is the name of the directory at the root level.
[09:06am] ikai_google: Hi everybody!
[09:06am] apijason_google: rsaccon: What specifically would you like
to know? The answer is probably not, although I can assure you that
it's coming along.
[09:06am] apijason_google: Hi Ikai!
[09:08am] prencher: apijason_google: can you tell us anything about
the upcoming reserved instances? guessing not, but if so I'd like to
know whether we're talking about N hot spares at all times (including
while there's load), or a minimum number of instances
[09:09am] ikai_google: prencher: We can't give many details right now
about any upcoming features.
[09:09am] perlmonkey2: hmm, outputted the path.  Looks like the docs
might be wrong as the path is from the controller not the root path of
the application.  So if my controllers are in MyApp/lib/controllers,
that is the relative path. Not Myapp/ as the docs say. Or am I doing
it wrong
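Note: The path behavior perlmonkey2 describes can be sketched in plain
Python. This is a minimal illustration, not code from the session. The
helper names template_path and app_root and the example directory
layout are assumptions; only the
os.path.join(os.path.dirname(__file__), ...) pattern comes from the
chat above.

```python
import os


def template_path(controller_file, *parts):
    """Build a path relative to the directory holding the controller module."""
    return os.path.join(os.path.dirname(controller_file), *parts)


def app_root(controller_file, levels_up):
    """Walk up `levels_up` directories from the controller's directory.

    Hypothetical helper for layouts where controllers are nested below
    the application root (for example MyApp/lib/controllers).
    """
    root = os.path.dirname(controller_file)
    for _ in range(levels_up):
        root = os.path.dirname(root)
    return root
```

In a handler module this would be called as
template_path(__file__, 'templates', 'index.html'). If the handlers
live two levels below the application root, app_root(__file__, 2)
recovers the root so the templates directory can be joined onto it.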
[09:10am] perlmonkey2: Nevermind me, these are webapp questions.  Hate
to burn any time on this when there are serious GAE questions.
[09:11am] apijason_google: prencher: The details are still being
worked out, so nothing I say here is canon. But, to my knowledge, the
team has been discussing the second option -- minimum number of
instances. Everything is subject to change until release, however.
[09:11am] yodler12: How's the pagination blog post coming? Any good
pagination libraries yet that you know of based on cursors? Any advice
if I'm going to try to write one?
[09:11am] ikai_google: perlmonkey2: No, it's totally fair game, though
you're generally better asking this in the groups
[09:11am] apijason_google: perlmonkey2: Thanks for the pointer. Can
you file a docs bug in the issue tracker?
[09:11am] ikai_google: yodler12: Haha, I'm working on that. It's
nothing super special though - one way pagination
[09:11am] ikai_google: yodler12: Do a query, get a cursor, move on
[09:12am] perlmonkey2: sure apijason_google
[09:12am] rahulkmr1: My app on my local machine under dev server runs
youtube fine. However when deployed, the layout for youtube comes all
wrong and videos don't play. I am working on a web proxy. I thought
dev server is pretty much an accurate representation of appengine.
Anyone knows why this is happening?
[09:12am] rahulkmr1: I am using urlfetch for proxying
[09:13am] rsaccon: apijason_google: so let's ask something very
simple: will it replace existing login api ? Will customer  need to
add Google App and Appengine app (on google domain administration
interface), if app is running on a Custom Domain ?
[09:14am] Wooble: perlmonkey2: unless you have huge number of
controllers in separate files, it's probably easiest to just keep them
all in your app's root directory.
[09:14am] apijason_google: rahulkmr1: Yes, the local development
server should be as close an approximation to the production system as
possible. Your issues sound strange -- layout shouldn't be affected by
deployments unless a style sheet didn't get uploaded appropriately,
etc.
[09:15am] ikai_google: perlmonkey2: I agree with Wooble. Because if
you start giving everything its own file, suddenly you have Java
[09:15am] apijason_google: rahulkmr1: Or, in your web proxy case,
downloaded from YouTube I suppose.
[09:15am] Wooble: ikai_google: until yesterday, he was using Java so
that's understandable
[09:15am] ikai_google: rahulkmr1: Do you use Firebug?
[09:15am] rahulkmr1: Ok. Let me look into it. Logs for youtube are
huge. I didn't find anything under error or critical.
[09:15am] perlmonkey2: Wooble: ikai_google: Since I'm less than 24
hours into Python, I'll take your expert advice
[09:15am] rahulkmr1: @ikai_google: Yes I do
[09:16am] ikai_google: rahulkmr1: What does Firebug tell you?
[09:16am] ikai_google: rahulkmr1: Are you getting the styles
correctly?
[09:16am] rahulkmr1: Any insights on videos issue? If proxied videos
work fine on devserver, why doesn't it work when deployed?
[09:16am] ikai_google: rahulkmr1: Are you proxying the videos as well?
[09:16am] rahulkmr1: @ikai_google. I didn't look deep enough
[09:16am] rahulkmr1: @ikai_google Not proxying videos. Just the html
source, obj tags etc
[09:17am] ikai_google: rahulkmr1: Well, you probably need to dig a bit
deeper then with Firebug before asking
[09:17am] ikai_google: rahulkmr1: After you have these looked at, post
some screenshots to the groups
[09:17am] yodler12: Cool thanks for the upcoming paging blog. Ok I'm
trying to mimic Google search engine paging, using memcache to store
cursors in the backwards direction. Do you think I should use count()
query to find out how far Gooooooooooooooooogle should extend beyond
the current page?
[09:17am] ikai_google: rahulkmr1: Or StackOverflow. First look at how
the styles bubble down, solve that first, then look at what's
happening with the Flash
[09:17am] rahulkmr1: @ikai_google I will do so. But it's unexpected
when it works on dev server and not on deployed code, right? After
all, the dev server fetches the same youtube page
[09:18am] apijason_google: rsaccon: I haven't seen it in action, but
from what I understand, you should be able to use it in a similar way
to the existing Google Accounts service, but for other OpenID
providers in addition to Google. I'm not sure about the impact on the
domain console -- that may not change, but OAuth request signing/
validation should be much easier or at least not require any external
libraries.
[09:18am] ikai_google: rahulkmr1: Yes it is, but it could be anything.
The dev server is not an exact clone of what's in production. it's a
close approximation using mocked services
[09:19am] rahulkmr1: @ikai_google Ok let me look into it. Will get
back in some time
[09:19am] ikai_google: rahulkmr1: Without knowing what the difference
is and seeing the diff between the dev and production server, we can't
know if it's a bug or coding error
[09:19am] rahulkmr1: @ikai_google sure. makes sense. Let me dig deeper
[09:19am] Wooble: yodler12: count() is slow and won't work at all if
you have more than 1000 results.  google search itself doesn't count,
it magically estimates.
[09:20am] yodler12: So would you recommend a keys_only query instead?
[09:20am] ikai_google: yodler12: Wooble is correct. If you do a
complex labeled search in gmail, on some screens you'll see something
like "1-20 of hundreds" (approximation), but when you click next a few
times it'll figure out that there are only, say, 85 emails
[09:21am] yodler12: Not trying to figure out how many results there
are total, just if there are up to 100 or so past the current result
[09:22am] Wooble: counting might work for that, although if your
cursor goes beyond the amount you can count, things might break a bit.
[09:22am] apijason_google: yodler12: We had a discussion on keys-only
queries for counts in the last office hour session. It's a reasonable
solution, but I would cache this number for some period so you don't
have to continually execute the query on every search. As Wooble and
Ikai note, Google itself doesn't calculate the total number of results
with each query -- that would take too much time, and Google's all
about low latency.
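Note: The counting advice above (cap the count, cache it, and fall
back to an approximate label) can be sketched as follows. This is an
illustration under stated assumptions, not App Engine API code: a
plain dict stands in for memcache, any iterable of keys stands in for
a keys-only query, and every function name here is hypothetical.

```python
import time

_cache = {}  # stand-in for memcache: key -> (expiry_timestamp, value)


def cached(key, ttl_seconds, compute, now=time.time):
    """Return a cached value, recomputing only after ttl_seconds have passed."""
    entry = _cache.get(key)
    if entry is not None and entry[0] > now():
        return entry[1]
    value = compute()
    _cache[key] = (now() + ttl_seconds, value)
    return value


def count_up_to(keys, limit):
    """Count keys, giving up once more than `limit` have been seen.

    Returns (count, exact); exact is False when more than `limit` exist.
    """
    seen = 0
    for _ in keys:
        seen += 1
        if seen > limit:
            return limit, False
    return seen, True


def range_label(first, last, count, exact):
    """Format a label such as '1-20 of 85' or '1-20 of more than 100'."""
    total = str(count) if exact else "more than %d" % count
    return "%d-%d of %s" % (first, last, total)
```

Capping the count keeps the work bounded no matter how many entities
match, and the cache means the query runs at most once per expiry
window rather than on every request.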
[09:24am] yodler12: Ok thanks for your help! Hopefully if I do a good
job on the library I can put it up on GitHub in case others find it
useful/can help improve it
[09:25am] apijason_google: Make sure to submit it to the GAE open
source project page too. Good luck!
[09:27am] prencher: <yodler12> Cool thanks for the upcoming paging
blog <- which blog would this be, yodler12?
[09:28am] ikai_google: prencher: I'm working on a cursors article (for
Java)
[09:28am] prencher: oh, right
[09:28am] ikai_google: but there's not a lot of complex stuff in
there, it's just an expanded example on what's in the datastore
[09:28am] ikai_google: I don't go backwards
[09:28am] prencher: was hoping it'd be about the things also mentioned
for that upcoming talk
[09:28am] ikai_google: there are too many challenges with providing
pagination like the way people expect
[09:29am] ikai_google: with "prev 1 2 3 4 5 next"
[09:29am] lent: ikai_google: I think you mentioned last month in the
GAE java forums that there was some documentation forthcoming in the
area of request processing details (this was in a thread about
sporadic response times).  Has this documentation been put up?
[09:29am] ikai_google: it's closer to something like "1-20 of
HUNDREDS"
[09:29am] ikai_google: lent: What area of request processing?
[09:29am] sar4j: This is about cold startups for java. I run my
website sarathonline.com on appengine py. But I am a java developer.
So I have to get a reasonable understanding of how gae/j works. To
simulate the same performance, I am adding a test.js script to the
existing website. So every time a visit comes to gae/py a hit happens
to gae/j. I still see that gae/j is very aggressively spun down
(given the number of requests is the same). is there any algorithm on how
and
[09:29am] prencher: ikai_google: for generic pagination, yeah,
agreed.. for more fixed pagination, its easy
[09:31am] prencher: ikai_google: my impression is more that people
don't want "prev 12345 next" as much as they want consistent,
bookmarkable url's though
[09:31am] lent: ikai_google: I believe it was supposed to be about
concurrent request processing
[09:31am] prencher: which can be problematic with cursors as well
[09:31am] ikai_google: lent: No, we haven't put up the documentation
[09:32am] ikai_google: prencher: Yep, you're right
[09:33am] lent: ikai_google: is it still expected?  or are things
going to change so much once you guys put in the reserve instances
stuff that it doesn't make sense to put it up?
[09:34am] ikai_google: lent: We're improving the way we do it, but I
don't know that we're planning to release documentation - this is kind
of why we don't like to comment on stuff coming up - it's subject to
change
[09:34am] apijason_google: sar4j: We're definitely aware of the issues
that Java developers are facing with cold startup times, especially
since many Java frameworks exaggerate the loading time. We have
reserved instances on the roadmap, which should allow you to pay to
keep a minimum number of app instances warm, and we're working on
other general improvements to increase performance. Right now, the
only general way to ensure that your application is not cycled out is
to have steady traffic. However, this should be genuine traffic -- our
systems know when you're just using a cron job to keep your app alive.
[09:35am] cgrinds: apijason_google, knows and doesn't keep it hot?
[09:35am] ikai_google: cgrinds: Not necessarily, but we discourage
using cron jobs
[09:36am] sar4j: apijason_google: So you advise me to try a full
blown move to gae/java for a few days and see? Will your algorithm
then spin down the app at longer intervals?
[09:36am] ikai_google: cgrinds: *using cron jobs purely as a mechanism
to keep applications loaded
[09:36am] cgrinds: ikai_google, yep understood
[09:38am] prencher: apijason_google: consider this to be my lobbying
request for you to lobby that it should be hot spares, not minimum
instances (minimum instances just reduces the problem - hot spares
eliminate it, mostly)
[09:38am] apijason_google: sar4j: If you have steady traffic, then
your application should stay loaded. Are you using the Python runtime
already?
[09:39am] sar4j: yes. it is much more realtime than my study on
gae/j
[09:39am] sar4j: my apps : sar (py) and sarjava
[09:40am] apijason_google: prencher: Understood. I think the work for
this is largely underway already, but I'll pass on your request.
[09:40am] sar4j: so now if I want to move to java, I am a little
hesitant only because of cold startups.
[09:41am] apijason_google: sar4j: With steady, real-time traffic, your
Java app should perform just fine and shouldn't be cycled out. I'm
interested in hearing your impressions if you decide to deploy a Java
app vs. your already deployed Python app.
[09:41am] ikai_google: sar4j: You should base your solution on which
language or tool fits you best. For many developers, this is Python.
It sounds like you've got code written already
[09:43am] moraes: will cursors go backwards one day?
[09:43am] ikai_google: moraes: Possibly, I've heard people talking
about how to solve that problem
[09:43am] Wesley_google: sar4j: we're curious... what are the diffs in
startup times b/w your Py vs. Java apps?
[09:44am] ikai_google: moraes: You can kind of do a backwards query
now, it just costs a bit more CPU: find the first Key, then do a
reverse query. save that cursor, etc
[09:44am] ikai_google: moraes: Requires another index going backwards
[09:45am] ikai_google: moraes: only difference is it's not a clean API
[09:45am] moraes: yep
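Note: The workaround ikai_google outlines (take the first key of the
current page, then run a query in the reverse direction) can be
illustrated without the datastore. In this sketch a sorted Python list
stands in for an index, and the function names are hypothetical; a
real app would issue a descending-order query instead, which is why
the extra index is needed.

```python
import bisect


def next_page(sorted_keys, after_key, page_size):
    """Forward query: keys strictly greater than after_key, ascending."""
    start = 0 if after_key is None else bisect.bisect_right(sorted_keys, after_key)
    return sorted_keys[start:start + page_size]


def previous_page(sorted_keys, first_key_of_current_page, page_size):
    """Reverse query: keys strictly less than the current page's first key.

    Mirrors a descending-order query limited to page_size results, then
    flips the result back into ascending order for display.
    """
    end = bisect.bisect_left(sorted_keys, first_key_of_current_page)
    descending = sorted_keys[:end][::-1][:page_size]
    return descending[::-1]
```

The last key of each forward page plays the role of the saved cursor,
and the first key of the current page is all that is needed to step
back one page.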
[09:49am] sar4j: Well, (steady, real-time traffic) is a little
abstract. I get real traffic (regular search generated traffic, etc
about 1 hit every 10 min or less). For my blog and site (they run
resources and dynamic content off of sar.appspot, the python version -
right now). The python runtime gives sub 2 sec response consistently,
and most times sub second. While different versions of java apps
(with, without spring, slim based, and just servlet based) all spin
down
[09:52am] sar4j: Also I noticed that the time taken after the webapp
gets the hit (real time of response) is very little (in msecs for bare
minimum app and about 1-2 secs for a spring app). It's the runtime
bootup that takes most of the time.
[09:52am] apijason_google: sar4j: In that case, I would probably stick
with your Python version, at least until reserved instances are
available. At 1 hit every 10 minutes, it's likely that your Python app
is getting spun down too. But the extra complexity and weight of the
JVM means that Python instances spin up faster than Java instances,
even discounting any frameworks you might be using.
[09:53am] sar4j: apijason_google:thanks. I guess I will do the same.
[09:55am] apijason_google: Five more minutes left -- get your
questions in!
[09:55am] sar4j: However, if I may ask, I would like to know how the
jetty runtime works. Does it create a separate java process every time
spindown/up happens or just the webapp is turned off and on?
[09:56am] sar4j: What takes the most time for the runtime? the
classloading, or the file io for reading the jars, etc. do you have any
such profiling info?
[09:57am] sar4j: the appstats gives profiling AFTER the runtime is up,
[09:58am] frew_google: sar4j: The same process is reused sometimes
(you can see this experimentally by checking whether or not your
global variables are still around from last time).
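Note: The experiment frew_google mentions, checking whether global
state survives between requests, might look like this in Python. The
question was about the Java runtime, so this is only an analogous
sketch; handle_request and the module-level names are hypothetical and
not part of any App Engine API.

```python
import time

# Module-level state lives for as long as the process stays loaded.
_loaded_at = time.time()
_requests_served = 0


def handle_request():
    """Report whether this request hit a freshly started process."""
    global _requests_served
    _requests_served += 1
    return {
        "cold_start": _requests_served == 1,
        "requests_served": _requests_served,
        "process_age_seconds": time.time() - _loaded_at,
    }
```

Logging the returned values on each request shows how often the
counter resets to 1, which indicates that a new process was started
rather than an existing one being reused.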
[09:58am] yodler12: Just wanted to comment that I am very excited for
whatever upcoming new features you'll be announcing at IO. Having you
google nerds working on that stuff reminds me why I chose App Engine -
it's almost like having your own Googlers working for you!
[09:59am] sar4j: I ask because, if it's the classloading or the library
overhead that is slowing down the runtime's bootup, maybe you should
think in the direction of OSGi.. most libraries are same. if an OSGi
model is adopted, the runtime bootup takes less time and less CPU
[09:59am] apijason_google: sar4j: We don't have any public profiling
info. currently. Maybe we can put this together at some point.
[09:59am] sar4j: saves green and saves the hassle of reserved
instances.
[09:59am] Wesley_google: thanks yodler! i don't think you'll be
disappointed.
[10:00am] prencher: apijason_google: quick one, why not generate pycs
server side after deploy, similar to java? it'd help python cold
starts
[10:02am] frew_google: sar4j: You might want to look at
http://www.answercow.com/2010/03/google-app-engine-cold-start-guide-for.html
(not written by a Googler, and I haven't had a chance to verify the
numbers, but the methodology of adding/removing libraries and checking
cold start times is pretty sound if you're worried about class loading
times).
[10:04am] Wesley_google: prencher: that's a good idea, however, it is
a moot point if we are spinning up different instances each time for
your app. the PYCs on one instance are not transferred to another, so
this would have to happen each time your app is spun up anyway.
[10:05am] apijason_google: OK Everyone, thanks for a great session.
It's past 10:00, so today's office hour is officially complete, though
a few of us may stick around for a few more minutes. The next chat
session is in two weeks, Wednesday, May 5th, from 7:00-8:00 p.m. PST.
Cheers!

-- 
You received this message because you are subscribed to the Google Groups 
"Google App Engine" group.
To post to this group, send email to google-appeng...@googlegroups.com.
To unsubscribe from this group, send email to 
google-appengine+unsubscr...@googlegroups.com.
For more options, visit this group at 
http://groups.google.com/group/google-appengine?hl=en.
