Ahh! Yes, by Google search API I meant Twitter search API! I'm using a cron job to trigger a special URL every 5 minutes. Originally I ran this job on my own webhost, but I breached the terms of service because a) the way I update the trend lists can sometimes take a long time, and the very basic PHP fetch I used waited for a return value (which it doesn't really need to do), which caused the CPU limits on my cheap host to be exceeded, and b) my cheap host only allows jobs to be scheduled every 15 minutes!
I ended up with a two-part solution:

1) I use http://www.webcron.org to schedule jobs every 5 minutes. Longer jobs call a URL on my webhost; shorter jobs call GAE directly. Webcron charges by the length of the job, so sub-30 seconds is cheapest (0.0001 Euro cents or 1000 jobs per cent).

2) On my webhost I use cURL instead of a standard PHP fetch (which is how I first did it). This just triggers the job, then terminates the script. GAE will happily continue to execute the job even though the listening party has terminated. I get what I want and my webhost doesn't get upset.

I need to do it in this "2-part" way because webcron won't let you terminate a job after calling it. This achieved what I wanted fairly cheaply.

Here is the PHP script I use. Note the URL doesn't need the http:// part in front of it.

<?php
$url = "myurl.appspot.com/somejob";
$ch = curl_init($url);
// Return the response as a string instead of printing it
curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
// Stop waiting after 2 seconds; GAE carries on with the job
curl_setopt($ch, CURLOPT_TIMEOUT, 2);
$curl_scraped_page = curl_exec($ch);
curl_close($ch);
?>

On Sun, Mar 15, 2009 at 4:52 PM, lock <lachlan.hu...@gmail.com> wrote:
>
> Hi Tim,
>
> Just had a look at Twendly, looks good! I've just got a few quick
> questions, if you wouldn't mind...
>
> 1. By 'google search API' you actually mean 'twitter search API',
> yeah ? ;-)
>
> 2. How do you go about pulling data from twitter every 5 minutes?
> Unless I'm missing something there are no scheduled tasks in
> app engine (yet). Using a cron job on another server to call a
> special URL maybe?
>
> The API key sounds like the proper solution, would be nice if
> there was a solution now though.
>
> Just an idea that probably won't work for most cases. Get the
> client (via javascript) to pull data from twitter and send it on to
> app engine for processing/storage. Not real pretty.
>
> Thanks, lock
>
> On Mar 15, 9:16 am, Tim Bull <tim.b...@binaryplex.com> wrote:
> > Interesting,
> >
> > I have a Twitter app (http://twendly.appspot.com) but I don't seem to be
> > having this issue at the moment. However, while I read information every 5
> > minutes from the google search API (which is rate limited differently) I
> > only send a few messages (no more than 5 or 6 max and usually only 4) as the
> > hour clicks over. Although occasionally this drops a message, it's generally
> > pretty solid. Perhaps because of when I'm sending them, I get in at the
> > start of the allocation.
> >
> > As far as scalability goes, I would say GAE is really suited for its read
> > scalability, so unless your Twitter bot writes are going to be massive,
> > then scalability shouldn't be an issue if you move these writes over to a
> > separate host. I guess a (nasty but possible) pattern would be to have the
> > Twitter interaction come from your host, which could act as a proxy, then use
> > App Engine for all the processing and reporting on the data. At least in my
> > application this would be a potential work-around if this becomes an issue.
> >
> > Cheers
> >
> > Tim

--~--~---------~--~----~------------~-------~--~----~
You received this message because you are subscribed to the Google Groups "Google App Engine" group.
To post to this group, send email to google-appengine@googlegroups.com
To unsubscribe from this group, send email to google-appengine+unsubscr...@googlegroups.com
For more options, visit this group at http://groups.google.com/group/google-appengine?hl=en
-~----------~----~----~----~------~----~------~--~---