[google-appengine] Re: Parallel urlfetch utility class / function.

Joe Bowman Mon, 16 Mar 2009 08:01:48 -0700

Does the batch fetching work on live App Engine applications, or
only on the SDK?


On Mar 16, 10:19 am, David Wilson <d...@botanicus.net> wrote:
> I have no idea how definitive this is, but taken literally it means
> that wall-clock time seems to be how CPU cost is measured. I guess
> this makes sense for a few different reasons.
>
> I found some internal function
> "google3.apphosting.runtime._apphosting_runtime___python__apiproxy.get_request_cpu_usage"
> with the docstring:
>
>     Returns the number of megacycles used so far by this request.
>     Does not include CPU used by API calls.
>
> Calling it, then running time.sleep(5), then calling it again,
> indicates thousands of megacycles used, yet in real terms the CPU was
> probably doing nothing. I guess Datastore CPU, etc., is added on top
> of this, but it seems to suggest to me that if you can drastically
> reduce request time, quota usage should drop too.
>
> I have yet to do any kind of rough measurements of Datastore CPU, so
> I'm not sure how correct this all is.
>
> David.
>
>  - One of the guys on IRC suggested this means that per-request cost
> is scaled during peak usage (and thus internal services running
> slower).
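
A rough sketch of the sleep experiment described above, written for
the modern Python 3 standard library rather than the App Engine
runtime. The internal get_request_cpu_usage call is not available
outside App Engine, so time.process_time stands in as a conventional
CPU-time counter; that substitution and the helper name measure_sleep
are assumptions made purely for illustration.

```python
import time


def measure_sleep(seconds):
    """Return (wall_elapsed, cpu_elapsed), in seconds, across a sleep.

    Hypothetical helper: time.process_time is a stand-in for the
    App Engine internal CPU counter mentioned in the thread.
    """
    wall_start = time.perf_counter()   # wall-clock reference
    cpu_start = time.process_time()    # CPU time used by this process
    time.sleep(seconds)                # the process is idle here
    wall_elapsed = time.perf_counter() - wall_start
    cpu_elapsed = time.process_time() - cpu_start
    return wall_elapsed, cpu_elapsed


wall, cpu = measure_sleep(0.2)
print("wall: %.3fs  cpu: %.3fs" % (wall, cpu))
```

On an ordinary machine the CPU figure stays close to zero while the
wall-clock figure tracks the sleep. A counter that instead grows
during the sleep, as reported above, behaves like a wall-clock
measurement.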
>
> 2009/3/16 peterk <peter.ke...@gmail.com>:
>
> > A couple of questions re. CPU usage..
>
> > "CPU time quota appears to be calculated based on literal time"
>
> > Can you clarify what you mean here? I presume each async request eats
> > into your CPU budget. But you say:
>
> > "since you can burn a whole lot more AppEngine CPU more cheaply using
> > the async api"
>
> > Can you clarify how that's the case?
>
> > I would guess as long as you're being billed for the cpu-ms spent in
> > your asynchronous calls, Google would let you hang yourself with them
> > when it comes to billing.. :) so I presume they'd let you squeeze in
> > as many as your original request, and its limit, will allow for?
>
> > Thanks again.
>
> > On Mar 16, 2:00 pm, David Wilson <d...@botanicus.net> wrote:
> >> It's completely undocumented (at this stage, anyway), but definitely
> >> seems to work. A few notes I've gathered:
>
> >>  - CPU time quota appears to be calculated based on literal time,
> >> rather than e.g. the UNIX concept of "time spent in running state".
>
> >>  - I can fetch 100 URLs in 1.3 seconds from a machine colocated in
> >> Germany using the asynchronous API. I can't begin to imagine how slow
> >> (and therefore expensive in monetary terms) this would be using the
> >> standard API.
>
> >>  - The user-specified callback function appears to be invoked in a
> >> separate thread; the RPC isn't "complete" until this callback
> >> completes. The callback thread is still subject to the request
> >> deadline.
>
> >>  - It's a standard interface, and seems to have no parallel
> >> restrictions at least for urlfetch and Datastore. However, I imagine
> >> that it's possible restrictions may be placed here at some later
> >> stage, since you can burn a whole lot more AppEngine CPU more cheaply
> >> using the async api.
>
> >>  - It's "standard" only insomuch as you have to fiddle with
> >> AppEngine-internal protocolbuffer definitions for each service type.
> >> This mostly means copy-pasting the standard sync call code from the
> >> SDK, and hacking it to use pubsubhubbub's proxy code.
>
> >> Per the last point, you might be better off waiting for an
> >> officially sanctioned API for doing this, although I doubt the
> >> protocolbuffer definitions change all that often.
>
> >> Thanks to Brett Slatkin & co. for doing the digging required to get
> >> the async stuff working! :)
>
> >> David.
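
The point about fetching many URLs at once can be illustrated with a
small sketch. It does not use the App Engine async API or megafetch.py
from the thread: plain threads from the Python 3 standard library stand
in for the asynchronous RPCs, and fake_fetch simulates network latency
with a sleep instead of contacting a server. The names fake_fetch and
fetch_all and the example.com URLs are hypothetical.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def fake_fetch(url, latency=0.05):
    """Hypothetical stand-in for a URL fetch: wait, then return a body."""
    time.sleep(latency)
    return "body of " + url


def fetch_all(urls, max_workers=20):
    """Run fake_fetch for every URL concurrently, keeping input order."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(fake_fetch, urls))


urls = ["http://example.com/%d" % i for i in range(20)]

# One request at a time: total time is roughly the sum of the latencies.
start = time.perf_counter()
serial = [fake_fetch(u) for u in urls]
serial_time = time.perf_counter() - start

# All requests in flight together: total time is roughly one latency.
start = time.perf_counter()
parallel = fetch_all(urls)
parallel_time = time.perf_counter() - start

print("serial: %.2fs  parallel: %.2fs" % (serial_time, parallel_time))
```

If request cost really does follow wall-clock time, as the first note
suggests, the shorter elapsed time of the concurrent version is what
would make it cheaper.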
>
> >> 2009/3/16 peterk <peter.ke...@gmail.com>:
>
> >> > Very neat.. Thank you.
>
> >> > Just to clarify, can we use this for all API calls? Datastore too? I
> >> > didn't look very closely at the async proxy in pubsubhubbub..
>
> >> > Asynchronous calls available on all apis might give a lot to chew
> >> > on.. :) It's been a while since I've worked with async function calls
> >> > or threading, might have to dig up some old notes to see where I could
> >> > extract gains from it in my app. Some common cases might be worth the
> >> > community documenting for all to benefit from, too.
>
> >> > On Mar 16, 1:26 pm, David Wilson <d...@botanicus.net> wrote:
> >> >> I've created a Google Code project to contain some batch utilities I'm
> >> >> working on, based on async_apiproxy.py from pubsubhubbub[0]. The
> >> >> project currently contains just a modified async_apiproxy.py that
> >> >> doesn't require dummy google3 modules on the local machine, and a
> >> >> megafetch.py, for batch-fetching URLs.
>
> >> >>    http://code.google.com/p/appengine-async-tools/
>
> >> >> David
>
> >> >> [0]http://code.google.com/p/pubsubhubbub/source/browse/trunk/hub/async_a...
>
> >> >> --
> >> >> It is better to be wrong than to be vague.
> >> >>   — Freeman Dyson
>
> >> --
> >> It is better to be wrong than to be vague.
> >>   — Freeman Dyson
>
> --
> It is better to be wrong than to be vague.
>   — Freeman Dyson