
Greg Darke (Google) Wed, 03 Aug 2011 20:28:58 -0700

Chat log from today's App Engine office hours. All times are in AEST (UTC+10).


12:01 -!- Irssi: #appengine: Total of 115 nicks [0 ops, 0 halfops, 2
voices, 113 normal]
12:01 < jwbnyc> Thanks for joining us, Wesley!
12:01 < Wesley_Google> who? where?
12:01 < robertk> Wesley_Google: there ->
12:01 < robertk> ;)
12:02 <+gregdarke> Hey all
12:02 < johnlockwood> howdy
12:02 < mbw> ok, ive got one.  Have you guys made any changes
(improvements) in parallel db.get across entity groups on HRD?   We
had a developer testing async db.get and we were just wondering if
that has gotten better yet.
12:02 < robertk> so what do you work on dave_google?   don't remember
seeing you around here
12:02 < dave_google> i'm on app engine
12:03 < dave_google> wes usually schedules these for when i can't make it. :)
12:03 < Wesley_Google> mbw> you mean batch get by key?
12:03 < jwbnyc> For our part, we're butting up against issues of code
size. We're starting to encounter difficulty due to instance startup
times.
12:03 < mbw> Wesley_Google: yep
12:03 < robertk> yeah which part of AE?  gregdarke is a task queue
guy, wesley is dev rel... just want to know which questions to direct
at you :)
12:04 < dave_google> a bit of this, a bit of that. ask your question,
and the right person will chime in.
12:05 < Wesley_Google> thanks dave!! (no, these are pre-scheduled all
the time... it's either 7-8p PDT or 9-10a PDT)
12:05 < mbw> Wesley_Google: I know we have spoken to you guys about
this before and you said things were being done to improve
performance of that... since it was very slow compared to master
slave (when we first started using it... we haven't done much
benchmarking lately and didn't really notice a change)
12:05 < dave_google> 9am exists as an abstraction.
12:07 < jwbnyc> Has anyone been able to get a handle on where the
initialization time goes? Our efforts have shown wide differences
between the Java dev server and GAE in terms of how our initial
request performs.
12:07 < ronoaldojlp> I'm experiencing some instance startup latency,
and posted an issue today. Is anyone else having the same issue?
12:07 < robertk> mbw: when i was poking around in the sdk i did notice
that they are putting in the code to auto parallelize the rpcs now
12:08 < ronoaldojlp> our Java instances usually take 3-6 sec to
start up, but some of them throw (Hard)DeadlineExceededExceptions
12:09 < jwbnyc> ronoaldojlp: We've been trying to track down a similar
set of issues. What's the URL for the issue you raised?
12:09 < mbw> robertk: right, but thats in the case where it would be
too large to do it in one put right?
12:10 < robertk> mbw: nah, looked like something more general to me.
judging by the comments at least
12:10 < mbw> robertk: which could be a bit scary actually since you
could be writing them all to the same entity group, thus contention
12:10 < ronoaldojlp> jwbnyc:
http://code.google.com/p/googleappengine/issues/detail?id=5477
12:10 < robertk> mbw: haven't dug as deeply as i wanted to yet though
12:10 < mbw> ya, who knows... frigen docs would be nice
12:10 < robertk> mbw: yeah their logic seemed to take entity groups into account
12:11 < dave_google> create an issue for where you want to see docs
expanded. those do get listened to.
12:11 < robertk> mbw: seemed to break up the rpcs based on entity
group  -- there is now an entity_groups_per_rpc (or something similar)
config option
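The batching robertk describes can be illustrated with a minimal sketch. This is not the SDK's code: the function name split_by_entity_group, the max_groups_per_rpc parameter, and the modelling of keys as plain tuples are all assumptions made for illustration.

```python
from collections import OrderedDict


def split_by_entity_group(keys, max_groups_per_rpc):
    """Split keys into batches touching at most max_groups_per_rpc entity groups.

    Hypothetical model: each key is a tuple path such as
    ("Account", 1, "Order", 7), and its entity group is identified by the
    root pair (the first kind and id).
    """
    if max_groups_per_rpc < 1:
        raise ValueError("max_groups_per_rpc must be at least 1")

    # Bucket keys by root, keeping first-seen order so output is deterministic.
    groups = OrderedDict()
    for key in keys:
        groups.setdefault(key[:2], []).append(key)

    # Emit one batch per run of max_groups_per_rpc entity groups.
    batches = []
    roots = list(groups)
    for start in range(0, len(roots), max_groups_per_rpc):
        batch = []
        for root in roots[start:start + max_groups_per_rpc]:
            batch.extend(groups[root])
        batches.append(batch)
    return batches
```

Each batch would then be sent as its own RPC, so keys from the same entity group always travel together.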
12:11 < jwbnyc> Thanks, Ron! I posted to the Appengine for Java group:
https://groups.google.com/d/topic/google-appengine-java/rWfC6cypiwg/discussion
12:12 < robertk> mbw: here is one part of what i saw today:
http://code.google.com/p/googleappengine/source/browse/trunk/python/google/appengine/datastore/datastore_rpc.py#488
12:12 < robertk> dave_google: mbw and i probably dig in a bit deeper
than normal ;)
12:12 < robertk> dave_google: if you could get the internal comments
left in when you push the code that would be great ;)
12:13 < ronoaldojlp> jwbnyc: we are trying to do some lazy
initialization but not sure if there is any good "pattern" in that
area... we are using Google Guice to do a lot of work on that.
12:15 < JasonAtBobber> Anyone have thoughts on how best to profile GAE
apps?  Our initialization times are fairly short when run locally, but
can be crazy-long when on AppEngine.  Anyone use anything other than
Appstats?  AppWrench looked promising, except it appears to be dead.
;)
12:16 < ronoaldojlp> Googlers, is the "min idle instances" knob of the
scheduler available as a preview? We're expecting huge traffic
tomorrow from a TV commercial and we think that it may be a good idea
to leave some instances waiting ... Can I "emulate" it in a good way,
by warming up some instances?
12:17 < robertk> what other scheduler config settings will be rolling
out?  i know min-idle-instances is planned.  what about additional
information to help us make decisions on scheduler settings, will
there be more logging or analysis charts?
12:17 < robertk> and what is being done to increase qps per instance?
12:17 < robertk> 0.2 avg qps with a 100ms latency is a little ... low.
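robertk's figures can be put in perspective with a back-of-the-envelope calculation. The sketch below assumes, which the chat does not state, that an instance serves requests strictly one at a time; the function names are illustrative only.

```python
def utilisation(qps, latency_ms):
    """Fraction of wall-clock time an instance spends serving requests."""
    return qps * (latency_ms / 1000.0)


def max_serial_qps(latency_ms):
    """Upper bound on qps when requests are handled strictly one at a time."""
    return 1000.0 / latency_ms
```

Under that assumption, 0.2 qps at 100 ms latency keeps an instance busy about 2% of the time, against a ceiling of 10 qps.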
12:18 < dave_google> JasonAtBobber: i pretty much use appstats or
hand-rolled timers.
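A hand-rolled timer of the kind dave_google mentions might look like the following minimal sketch. The name timed and the optional results dict are assumptions for illustration, not part of Appstats or the App Engine SDK.

```python
import logging
import time
from contextlib import contextmanager


@contextmanager
def timed(label, results=None):
    """Log how long the enclosed block took, in milliseconds."""
    start = time.time()
    try:
        yield
    finally:
        # Runs even if the block raises, so failed steps are still timed.
        elapsed_ms = (time.time() - start) * 1000.0
        logging.info("%s took %.1f ms", label, elapsed_ms)
        if results is not None:
            results[label] = elapsed_ms
```

Wrapping each initialization step, for example `with timed("load_config"):`, puts one log line per step in the request log, which helps show where a slow loading request spends its wall-clock time.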
12:18 < robertk> ronoaldojlp: always on will have 3 on stand by
12:18 < JasonAtBobber> Thanks Dave
12:20 < ronoaldojlp> robertk: thanks .. we have it enabled, but our
app currently uses ~18 instances at a moderate qps (40), and with the
current Deadline exceptions we guess that it may make sense to
have more than 3. In fact, we expect that the scheduler will start
and keep more instances around to handle the traffic spike.
12:22 < robertk> ronoaldojlp: ah, yeah you're worried about startup
times.  you're using warmup requests to start loading stuff, right?
12:22 < robertk> ronoaldojlp: i've had no issues popping from a couple
hundred qps up to several thousand -- but that was with a short
startup
12:23 < jwbnyc> The warmup is helpful. The problem is that we often
encounter harddeadline exceeded even though on average, the request
takes about 7sec.
12:24 < JasonAtBobber> Anyone else been running into user requests
hanging for 20+ seconds?  I'm looking at our logs, and in the middle
of DataNucleus initialization it pauses for 25 seconds.  This doesn't
happen often, but often enough to be vexing.  Is this just us?
12:25 < ronoaldojlp> robertk: yes ... I'm currently using a context
listener to set up my Guice injector ...
12:25 < robertk> there have been a few people posting to the groups
with strange request times.  several 10s of seconds with no real
explanation
12:26 < johnlockwood> I've had startup request times of 18s
12:26 < robertk> yeah shit like that.
12:26 < mbw> wow
12:26 < robertk> what is your usual startup johnlockwood?
12:27 < jwbnyc> The behavior on our app feels a bit like a
garbage-collection pause. Our app needs about 130K/instance - would it
make sense to allow apps to specify an initial memory allocation?
12:27 < robertk> gregdarke: i heard the task dispatcher is getting
revamped a bit and it will help reduce the frequency of queue-stalls.
 is the taskqueue panel in the dashboard going to be adjusted to
reflect when tasks are put back in the queue?
12:28 < robertk> hi ikai_google.  you're being super quiet tonight.
12:28 < kebomix> guys, i use jdom library and looks like the servlet
doesn't see it ! http://pastebin.com/mmcTXphx , how to solve this
error ?
12:28 < JasonAtBobber> Not 130K, jwb... 130M ;)
12:29 < robertk> any googlers able to comment on my questions (above, ~22:17) ?
12:29 < Wesley_Google> ikai is going to be late or not coming tonight.
or it's not 100% him. LOL
12:29 < robertk> ha ha.  k.
12:30 <+gregdarke> robertk: What do you mean? There was a bug that
partially caused
http://code.google.com/p/googleappengine/issues/detail?id=5471 , but
that is only an issue with long running tasks
12:30 < dave_google> kebomix: is the jdom jar in your WEB-INF/lib?
12:31 < johnlockwood> 200 28134ms 2218cpu_ms 118api_cpu_ms
12:32 < johnlockwood> this is what I just got, it used to be a few seconds max
12:32 < robertk> gregdarke: yeah sometimes even with fast tasks (sub
600ms) i have queues stall out for a while while other queues keep
chugging along
12:32 < robertk> gregdarke: is there any information we could be given
that would help us diagnose slow queues when we encounter them?
12:32 < kebomix> dave_google: thanks i was only referring to it in
another folder, i copied it to WEB-INF and it worked :)
12:33 < kebomix> btw it has been a week now and i still haven't gotten
an email from google to activate my java runtime! is that normal?
12:33 < robertk> gregdarke: yeah from the detail on that issue, what
i'm wondering is how will we know if tasks are getting dispatched then
put back into the queue?
12:34 < robertk> gregdarke: it sounds like there is some change being
made that will allow us to see that information?
12:34 < mbw> Our average for the last 24 hours,
HRD/Python/big-ass-django-app on warmup requests is 4216 ms, not too
bad, but not great either
12:34 < dave_google> kebomix: there's nothing special needed before
deploying an app built with the java sdk. what do you mean by
'runtime'?
12:34 < JasonAtBobber> mbw: Is your info fine-grained enough to detect spikes?
12:35 < JasonAtBobber> I'm curious if the behavior is sporadic...
12:35 <+gregdarke> robertk: You can now see the enforced rate of a
queue, that will allow you to see if your queue has been throttled.
Though other than that, no.
12:35 < mbw> JasonAtBobber: yes, but this chart only shows me the average
12:36 < robertk> mbw: johnlockwood: interesting, my loading times (hr
app) have gone back down to what they were a couple months ago:
12:36 < robertk> ms=766 cpu_ms=324 api_cpu_ms=68 cpm_usd=0.009241
loading_request=1
12:36 < kebomix> dave_google: i mean access to google servers to
deploy my app, i still have no access, working locally only
12:36 < robertk> they were higher, but seem to have gone back down
12:36 < mbw> robertk: show off
12:37 < mbw> dtuckerames1: go for it, i havent asked yet
12:37 < robertk> mbw: now you know why i poo poo django :P
12:37 < johnlockwood> this one is a M/S robertk
12:37 < mbw> robertk: I honestly could care less about 4 measly
seconds on a warmup call
12:38 < dtuckerames1> any updates on the monitoring APIs that we heard
about at I/O?
12:38 < robertk> mbw: but that app is also optimized for loading
times.  has several separate wsgi entry points so it is only loading
the sections needed
12:39 < jwbnyc> mbw: The problem we have isn't a few seconds. We're
seeing a lot of cases where GAE decides to start an instance and then
kills it because the warmup takes >20sec rather than the average 7sec.
12:40 < johnlockwood> I just wonder why 2 seconds of cpu time turn into
28 secs of real time
12:40 < mbw> kills it because its >20s or 30s?
12:40 < mbw> johnlockwood: thats a slow cpu
12:41 < johnlockwood> maybe my app is on  386 or something
12:41 < robertk> ha ha
12:41 < jwbnyc> Sorry, the total time is >30sec, there's an
unexplained period between our filter and the first log entry from the
warmup.
12:42 < robertk> johnlockwood: i've had apps find their way to a 'bad
spot' in the cluster before where my startup times go from 700 or 800
ms to 3+ seconds
12:42 < Wesley_Google> jwbnyc> other users are reporting the same
thing. are you using something big like Spring? do you have a lot of
resource or static files?
12:42 < robertk> johnlockwood: no idea why it happens but eventually
it seems to work itself out..... problem is that sometimes it has
taken several weeks to do so.
12:43 < johnlockwood> I'm seeing a startup call that took 67s
12:43 < jwbnyc> Wesley_Google: We have a large number of static files:
about 1700. We don't use Spring.
12:43 < johnlockwood> from earlier today
12:44 < robertk> johnlockwood: so guy in the groups posted one like
that a couple days ago too
12:44 < robertk> *some
12:44 < johnlockwood> i've been seeing this happen for months
12:44 < johnlockwood> i'm using the builtin django 1.2
12:45 < robertk> ms=13226 cpu_ms=210 api_cpu_ms=0 cpm_usd=0.006032
loading_request=1
12:45 < mbw> johnlockwood: 67s, how is that possible?  task?
12:45 < johnlockwood> in fact it was not long after I started using
the builtin 1.2
12:45 < robertk> avg latency on that app is 138.8 ms
12:45 < robertk> it is a M/S app
12:45 < johnlockwood> at first it was faster, like 2 seconds
12:45 < robertk> serving a steady 10 qps
12:46 < robertk> mbw: no i've seen strange number mismatches like that too.
12:46 < robertk> mbw: i've seen them with silly latencies on
user-facing loading requests
12:46 < johnlockwood> mbw not tasks, these are to the / of the site. I
see 52s, 54s too
12:47 < johnlockwood> last time i brought this up I was told to use
warming requests, but it's silly
12:48 < mbw> johnlockwood: you obviously need to use a warming request
for your warming request.
12:48 < johnlockwood> mbw thanks for the advice
12:48 < johnlockwood> :)
12:48 < ronoaldojlp> Wesley_Google: is the Guice injector loading
(and changing) bytecode a problem for the startup? I guess that it may
try to read some data from the filesystem to set up some dependencies
...
12:48 < robertk> ha ha, Wesley_Google we need warmup warmup requests please ;)
12:49 < mbw> well, no answer on the Monitoring APIs huh?
12:49 < robertk> Wesley_Google: ^^ i would also like to know
12:50 < mbw> Logging improvements? (maybe thats another feature)
12:50 < robertk> gregdarke do you work with the guy who's doing the
monitoring api?  he's based in sydney, right?
12:50 < robertk> (can't recall his name)
12:51 < johnlockwood> the same code on HR has the same cpu usage, but
the realtime is 3s
12:51 < johnlockwood> well I am meaning to move this over to HR
12:52 < mbw> robertk: you should ask njoyce to ask in Australian.
Maybe they don't understand you.
12:52 <+gregdarke> mbw: I believe we are currently in trusted tester
for the monitoring api at the moment, if you are interested can you
give either myself or Wesley your contact details and we can look into
it
12:52 < robertk> mbw: ha ha, need a translator ;)
12:52 < robertk> lets try "ello..."
12:52 < robertk> :P
12:52 <+gregdarke> robertk: s/ello/G'Day/
12:53 < robertk> dtuckerames1: you see gregdarke's comment?
12:53 < mbw> gregdarke: we are interested in pretty much all TT.  Can
you have Chris Schalk contact us about it? (WebFilings)
12:53 < robertk> gregdarke:  awesome :)
12:53 < vbabiy> Hey guys, are there any docs about using GAE from android?
12:53 < dtuckerames1> robertk: ya just saw it - will work through chris
12:53 < robertk> gregdarke: i'm ready to hit the streets of sydney now :)
12:53 < johnlockwood> robertk:  don't you get TT automatically for every
appengine thing?
12:54 <+gregdarke> mbw: Sure, I will see what I can do
12:54 < mbw> TT is another topic... any effort into organizing that
system so that everything can be done on a central site/page/signup,
etc
12:54 < robertk> johnlockwood: apparently not... ha ha ;)
12:54 < robertk> yeah i thought we were going to get a dashboard to
just click what we wanted to participate in?
12:55 < Wesley_Google> mbw> i'll ask chris to look into it for you
12:55 < robertk> :)
12:55 < mbw> Wesley_Google, gregdarke thanks guys.
12:56 < robertk> yeah thanks for your time *_google  ;)
12:56 < johnlockwood> thanks Wesley_Google  and gregdarke
12:56 < vbabiy> Sorry I am trying to find docs about using google auth
from GAE in my android app, does google have any?
12:56 < Wesley_Google> sure no problem... thanks for coming guys, and
thanks for your patience as we were shorthanded tonite
12:58 < vlad__> hi
12:58 < vlad__> is anyone still in?
12:58 < mbw> sure
12:58 < robertk> vbabiy:
http://www.google.com/events/io/2011/sessions/android-app-engine-a-developer-s-dream-combination.html
12:58 < robertk> maybe that will help
12:58 < vlad__> are googlers still in?
12:58 <+gregdarke> vbabiy: There is this ->
http://blog.notdot.net/2010/05/Authenticating-against-App-Engine-from-an-Android-app
and the talk that robertk linked
12:59 < vlad__> i see they are
12:59 < vbabiy> gregdarke: I have not seen that one yet, I will review
12:59 < vlad__> Have a question about recently introduced 'instance' header
12:59 <+gregdarke> vlad__: There are usually a few who idle here, but
the office hours has just finished
12:59 <+gregdarke> vlad__: What about it?
13:00 < vbabiy> I am looking for detailed docs since the session seems
to skip a lot of details about auth key expiration and such.
13:00 < Wesley_Google> so anyone who is running into this where your
instances are getting killed due to long startup times pls add your
comments to that ticket!!
13:00 < Wesley_Google>
http://code.google.com/p/googleappengine/issues/detail?id=5477
13:00 < vlad__> I have a task putting a piece of data in memcache
13:00 < vlad__> then client request comes to pick up that data
13:01 < Wesley_Google> ok, i got to take off... thanks everyone for
coming to office hours!!

[google-appengine] App Engine office hours for 2011/07/03 [PDT]
