Thanks Tim.

I think I've managed to convince myself that I can work around the
lack of built-in scheduled tasks.  Who knows, by the time I manage
to pull together enough motivation Google may have implemented
it already.  Worst case I may be able to use a work server to call
a URL, although it's not a work project.  Hmmm, wonder how that
will go down.  There's also http://schedulerservice.appspot.com/,
which I might try.
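
For the "call a URL from another server" option, something like the
following could be run from cron on that box.  This is only a rough
sketch in Python, not tested against a live app: the names trigger and
with_scheme and the job path are made up for illustration, and it
relies on Tim's observation (quoted below) that GAE keeps running the
job after the caller stops waiting.

```python
import socket
import urllib.request


def with_scheme(url):
    """Prefix http:// when the URL has no scheme (urllib needs one)."""
    return url if "://" in url else "http://" + url


def trigger(url, timeout=2.0, opener=urllib.request.urlopen):
    """Request the job URL and stop waiting after `timeout` seconds.

    Returns "done" if a response arrived in time, "timed out" if the
    wait was cut short, and "failed" for any other network error.
    `opener` is a hypothetical hook so the call can be faked in tests.
    """
    try:
        response = opener(with_scheme(url), timeout=timeout)
        response.close()
        return "done"
    except socket.timeout:
        # Timed out while reading the response.
        return "timed out"
    except OSError as err:
        # urllib wraps some timeouts in URLError, which carries .reason.
        reason = getattr(err, "reason", None)
        return "timed out" if isinstance(reason, socket.timeout) else "failed"
```

A cron entry would then just call trigger("myapp.appspot.com/somejob")
every few minutes and treat "timed out" as acceptable, since the caller
does not need the job's result.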

It's just the Twitter rate limit roulette game I'm worried about now.
It really highlights a lesson from my first project: make sure
you upload and test your app early.



On Mar 16, 7:23 am, Tim Bull <tim.b...@binaryplex.com> wrote:
> Ahh! Yes, by google search API I meant Twitter search API!
>
> I'm using a CRON job to trigger a special URL every 5 minutes.  Originally I
> had this job on my own webhost, but I breached the terms of service because
> a) sometimes the way I update the trend lists can take a long time and the
> very basic PHP fetch I do was waiting for a return value (which it doesn't
> really need to do) - this caused CPU limits on my cheap host to be exceeded
> and b) my cheap host only allows jobs to be scheduled every 15 minutes!
>
> I ended up with a two part solution:
>
> 1) I use http://www.webcron.org to schedule jobs that call a URL on my
> webhost for longer jobs every 5 minutes or direct on GAE for shorter jobs.
> Webcron charges by the length of job so sub-30 seconds is cheapest (0.0001
> Euro cents or 1000 jobs per cent)
>
> 2) On my webhost I use cURL instead of a standard PHP fetch (which is how I
> first did it) - this just triggers the job then terminates the script.  GAE
> will happily continue to execute the job even though the listening party has
> terminated. I get what I want and my webhost doesn't get upset.  I need to
> do it in this "2-part" way because webcron won't let you terminate a job
> after calling it - this achieved what I wanted in a fairly cheap way for me.
>
> Here is the PHP script I use
>
> Note the URL doesn't need the HTTP:// part in front of it.
>
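> <?php
> // Job URL on App Engine; no scheme needed, cURL defaults to HTTP.
> $url = "myurl.appspot.com/somejob";
> $ch = curl_init($url);
> // Capture the response instead of echoing it.
> curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
> // Stop waiting after 2 seconds so the script does not block on the job.
> curl_setopt($ch, CURLOPT_TIMEOUT, 2);
> $curl_scraped_page = curl_exec($ch);
> curl_close($ch);
> ?>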
>
> On Sun, Mar 15, 2009 at 4:52 PM, lock <lachlan.hu...@gmail.com> wrote:
>
> > Hi Tim,
>
> > Just had a look at Twendly, looks good! I've just got a few quick
> > questions, if you wouldn't mind...
>
> > 1. By 'google search API' you actually mean 'Twitter search API',
> > yeah ? ;-)
>
> > 2. How do you go about pulling data from Twitter every 5 minutes?
> > Unless I'm missing something there are no scheduled tasks in
> > App Engine (yet).  Using a cron job on another server to call a
> > special URL maybe?
>
> > The API key sounds like the proper solution, would be nice if
> > there was a solution now though.
>
> > Just an idea that probably won't work for most cases.  Get the
> > client (via JavaScript) to pull data from Twitter and send it on to
> > App Engine for processing/storage.  Not real pretty.
>
> > Thanks, lock
>
> > On Mar 15, 9:16 am, Tim Bull <tim.b...@binaryplex.com> wrote:
> > > Interesting,
>
> > > I have a Twitter app (http://twendly.appspot.com) but I don't seem to be
> > > having this issue at the moment.  However, while I read information every
> > > 5 minutes from the google search API (which is rate limited differently) I
> > > only send a few messages (no more than 5 or 6 max and usually only 4) as
> > > the hour clicks over.  Although occasionally this drops a message, it's
> > > generally pretty solid.  Perhaps because of when I'm sending them, I get
> > > in at the start of the allocation.
>
> > > As far as scalability goes, I would say GAE is really suited for its
> > > read scalability, so unless your Twitter bot writes are going to be
> > > massive, scalability shouldn't be an issue if you move these writes over
> > > to a separate host.  I guess a (nasty but possible) pattern would be to
> > > have the Twitter interaction come from your host which could act as a
> > > proxy, then use App Engine for all the processing and reporting on the
> > > data.  At least in my application this would be a potential work-around if
> > > this becomes an issue.
>
> > > Cheers
>
> > > Tim
--~--~---------~--~----~------------~-------~--~----~
You received this message because you are subscribed to the Google Groups 
"Google App Engine" group.
To post to this group, send email to google-appengine@googlegroups.com
To unsubscribe from this group, send email to 
google-appengine+unsubscr...@googlegroups.com
For more options, visit this group at 
http://groups.google.com/group/google-appengine?hl=en
-~----------~----~----~----~------~----~------~--~---
