[google-appengine] Re: Parallel urlfetch utility class / function.

2009-03-16 Thread peterk
Very neat.. Thank you. Just to clarify, can we use this for all API calls? Datastore too? I didn't look very closely at the async proxy in pubsubhubbub.. Asynchronous calls available on all APIs might give a lot to chew on.. :) It's been a while since I've worked with async function calls or thre

[google-appengine] Re: Parallel urlfetch utility class / function.

2009-03-16 Thread David Wilson
It's completely undocumented (at this stage, anyway), but definitely seems to work. A few notes I've gathered: - CPU time quota appears to be calculated based on literal time, rather than e.g. the UNIX concept of "time spent in running state". - I can fetch 100 URLs in 1.3 seconds from a

[google-appengine] Re: Parallel urlfetch utility class / function.

2009-03-16 Thread peterk
A couple of questions re. CPU usage.. "CPU time quota appears to be calculated based on literal time" Can you clarify what you mean here? I presume each async request eats into your CPU budget. But you say: "since you can burn a whole lot more AppEngine CPU more cheaply using the async api" Ca

[google-appengine] Re: Parallel urlfetch utility class / function.

2009-03-16 Thread bFlood
oh my, this is working now?!? I just assumed it would only be available from the next build. great work david! I agree on waiting for the "official" release but it's certainly something that we can test with right now in preparation for the new release. thanks for digging this out (and thanks to

[google-appengine] Re: Parallel urlfetch utility class / function.

2009-03-16 Thread David Wilson
I have no idea how definitive this is, but literally it means wall clock time seems to be how CPU cost is measured. I guess this makes sense for a few different reasons. I found some internal function "google3.apphosting.runtime._apphosting_runtime___python__apiproxy.get_request_cpu_usage" with t
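
The difference between wall-clock time and "time spent in running state" can be shown with a minimal sketch. It uses only the Python 3 standard library and says nothing about how AppEngine actually accounts for quota; the names `measure` and `wait_like_a_fetch` are made up for illustration, and the sleep merely stands in for waiting on a remote fetch.

```python
import time


def measure(fn):
    """Return (wall_seconds, cpu_seconds) spent inside fn()."""
    wall_start = time.perf_counter()   # wall-clock ("literal") time
    cpu_start = time.process_time()    # CPU time used by this process
    fn()
    wall = time.perf_counter() - wall_start
    cpu = time.process_time() - cpu_start
    return wall, cpu


def wait_like_a_fetch():
    # Assumption: sleeping stands in for blocking on network I/O.
    time.sleep(0.2)


wall, cpu = measure(wait_like_a_fetch)
print("wall: %.3fs  cpu: %.3fs" % (wall, cpu))
```

A process that mostly waits accumulates wall-clock time while using almost no CPU time, which is why the two ways of measuring cost can disagree so sharply.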

[google-appengine] Re: Parallel urlfetch utility class / function.

2009-03-16 Thread Joe Bowman
Does the batch fetching work on live appengine applications, or only on the SDK? On Mar 16, 10:19 am, David Wilson wrote: > I have no idea how definitive this is, but literally it means wall > clock time seems to be how CPU cost is measured. I guess this makes > sense for a few different reas

[google-appengine] Re: Parallel urlfetch utility class / function.

2009-03-16 Thread David Wilson
Joe, I've only tested it in production. ;) The code should work serially on the SDK, but I haven't tried yet. David. 2009/3/16 Joe Bowman : > > Does the batch fetching working on live appengine applications, or > only on the SDK? > > On Mar 16, 10:19 am, David Wilson wrote: >> I have no idea

[google-appengine] Re: Parallel urlfetch utility class / function.

2009-03-16 Thread Joe Bowman
Wow that's great. The SDK might be problematic for you, as it appears to be very single threaded, I know for a fact it can't reply to requests to itself. Out of curiosity, are you still using base urlfetch, or is it your own creation? While when Google releases their scheduled tasks functionality

[google-appengine] Re: Parallel urlfetch utility class / function.

2009-03-16 Thread bFlood
@joe - fire/forget - you can just skip the fetcher.wait() call (which calls AsyncAPIProxy.wait). I'm not sure if you would need a valid callback but even if you did it could be a simple stub that does nothing. @david - have you made this work with datastore calls yet? having some issues trying to

[google-appengine] Re: Parallel urlfetch utility class / function.

2009-03-16 Thread David Wilson
I forgot to mention, AppEngine does not close the request until all asynchronous requests have ended. This means it's not truly "fire and forget". Regardless of whether you're waiting for a response or not, if a request is in progress, the HTTP response body is not returned to the client. I creat
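
As an analogy only (this is a standard-library thread pool, not AppEngine's async API proxy), the sketch below shows the same "not truly fire and forget" behaviour: the handler never reads the result of the background call, yet it cannot return until that call has finished. `handle_request` and `slow_background_call` are hypothetical names.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def slow_background_call():
    # Stand-in for an asynchronous API call that takes a while.
    time.sleep(0.2)
    return "done"


def handle_request():
    start = time.perf_counter()
    with ThreadPoolExecutor(max_workers=1) as pool:
        pool.submit(slow_background_call)  # result is never read
        body = "response body"             # the response is ready at once
    # Leaving the with-block waits for all pending work, so the
    # "forgotten" call still delays the return.
    return body, time.perf_counter() - start


body, elapsed = handle_request()
print(body, "after %.2fs" % elapsed)
```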

[google-appengine] Re: Parallel urlfetch utility class / function.

2009-03-16 Thread Joe Bowman
I imagine keeping the request open until everything is done isn't going to go away any time soon; it's how HTTP responses work, and the scheduled tasks on the roadmap would be better suited to supporting that. I also agree on the batch put and get functionality for the most part i

[google-appengine] Re: Parallel urlfetch utility class / function.

2009-03-16 Thread bFlood
thanks david. agreed on datastore except that unlike the current batch calls, you might be able to execute code concurrently on each response and then wait for all the workers' results. to me, and I could be wrong, even a no-op datastore request could serve as a poor man's worker thread. I'll see

[google-appengine] Re: Parallel urlfetch utility class / function.

2009-03-17 Thread David Wilson
2009/3/16 Joe Bowman : > > Wow that's great. The SDK might be problematic for you, as it appears > to be very single threaded, I know for a fact it can't reply to > requests to itself. > > Out of curiosity, are you still using base urlfetch, or is it your own > creation? While when Google releases

[google-appengine] Re: Parallel urlfetch utility class / function.

2009-03-17 Thread Joe Bowman
Thanks, I'm going to give it a go for urlfetch calls for one project I'm working on this week. Not sure when I'd be able to include it in gaeutilities for cron and such, that project is lower on my priority list at the moment, but can't wait until I get a chance to play with it. Anothe

[google-appengine] Re: Parallel urlfetch utility class / function.

2009-03-17 Thread Joe Bowman
This may be a really dumb question, but.. I'm still learning so... Is there a way to do something other than a direct api call asynchronously? I'm writing a script that pulls from multiple sources, sometimes with higher level calls that use urlfetch, such as gdata. Since I'm attempting to pull fr

[google-appengine] Re: Parallel urlfetch utility class / function.

2009-03-18 Thread David Wilson
Hey Joe,

With the gdata package you can do something like this instead:

As usual, completely untested code, but looks about right..

    from youtube import YouTubeVideoFeedFromString

    def get_feeds_async(usernames):
        fetcher = megafetch.Fetcher()
        output = {}

        def cb(username, result):
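
The callback-per-username pattern in the message above can be sketched in a self-contained way. Everything here is an assumption for illustration: `StubFetcher` is a hypothetical stand-in built on a thread pool, not the real `megafetch.Fetcher` (only `Fetcher()` and `wait()` appear in this thread, so the `start(url, callback)` method is a guess), `fake_fetch` replaces the network, and the feed URL is invented. It is written for Python 3 rather than the Python 2.5 runtime of the time.

```python
from concurrent.futures import ThreadPoolExecutor
from functools import partial


class StubFetcher:
    """Hypothetical stand-in for megafetch.Fetcher; not the real class."""

    def __init__(self, fetch_fn, max_workers=8):
        self._fetch_fn = fetch_fn
        self._pool = ThreadPoolExecutor(max_workers=max_workers)
        self._pending = []

    def start(self, url, callback):
        # Queue the fetch now; the callback is invoked later, from wait().
        future = self._pool.submit(self._fetch_fn, url)
        self._pending.append((future, callback))

    def wait(self):
        # Block until every fetch is done, running each callback in turn.
        for future, callback in self._pending:
            callback(future.result())
        self._pending = []
        self._pool.shutdown()


def fake_fetch(url):
    # Assumption: pretend the response body is the URL in upper case.
    return url.upper()


def get_feeds_async(usernames, fetch_fn=fake_fetch):
    fetcher = StubFetcher(fetch_fn)
    output = {}

    def cb(username, result):
        output[username] = result

    for name in usernames:
        url = "http://example.invalid/feeds/%s" % name  # invented URL
        fetcher.start(url, partial(cb, name))  # bind the username now

    fetcher.wait()
    return output


feeds = get_feeds_async(["alice", "bob"])
print(feeds)
```

Binding the username with `functools.partial` avoids the late-binding trap of defining a closure over the loop variable, where every callback would otherwise see the last username.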

[google-appengine] Re: Parallel urlfetch utility class / function.

2009-03-18 Thread Joe Bowman
Ah ha.. thanks David. And for the views, if I really wanted to launch everything at once, I could map my boss, youtube, twitter, etc etc pulls to their own urls, and use megafetch in my master view to pull those urls all at once too. On Mar 18, 5:14 am, David Wilson wrote: > Hey Joe, > > With t

[google-appengine] Re: Parallel urlfetch utility class / function.

2009-03-18 Thread bFlood
hey david, joe - I've got the async datastore Get working but I'm not sure the callbacks are being run on a background thread. they appear to be when you examine something like the thread local storage (hashes are all unique) but then if you insert just a simple time.sleep they appear to run seriall
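
One way to make the time.sleep experiment described above less ambiguous is to record when each callback starts and ends, then check whether any two intervals overlap. The sketch below is a generic standard-library illustration, not AppEngine code; `timed_callback` and `ran_concurrently` are hypothetical helpers, and plain threads stand in for whatever actually invokes the callbacks.

```python
import threading
import time

records = []  # (label, thread id, start, end) per callback


def timed_callback(label, delay=0.1):
    start = time.perf_counter()
    time.sleep(delay)  # the "simple time.sleep" from the experiment
    records.append((label, threading.get_ident(), start, time.perf_counter()))


def ran_concurrently(recs):
    """True if any callback started before an earlier one had finished."""
    ordered = sorted(recs, key=lambda r: r[2])
    return any(nxt[2] < cur[3] for cur, nxt in zip(ordered, ordered[1:]))


# Case 1: callbacks invoked one after another in the calling thread.
for label in ("a", "b", "c"):
    timed_callback(label)
serial = ran_concurrently(records)

# Case 2: the same callbacks, each on its own thread.
records.clear()
threads = [threading.Thread(target=timed_callback, args=(label,))
           for label in ("a", "b", "c")]
for t in threads:
    t.start()
for t in threads:
    t.join()
parallel = ran_concurrently(records)

print("serial overlap:", serial, " threaded overlap:", parallel)
```

Overlapping intervals are direct evidence of concurrency, whereas distinct thread-local hashes alone do not show that the callbacks ever ran at the same time.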

[google-appengine] Re: Parallel urlfetch utility class / function.

2009-03-18 Thread Joe Bowman
Well, you'll never get a true parallel running of the callbacks, based on the fact even if they're running in the same thread as the urlfetch, each fetch will take a different amount of time. Though, I'm not sure if the callbacks would run in the core thread or not. That's where they'd be run if y