Jim, I'd probably add a hook to your blogs' on_save event that pushes the URL onto a queue, then have a simple script pull from the queue and save the content to your static failover. There's no need for a spider/crawler when you just want to grab one page's content on an event trigger.
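Something along these lines would do it. Here's a rough, untested sketch that assumes a Redis list as the queue (the on_save hook would just RPUSH the page URL onto it); the queue key, paths, and helper names are all made up:

import os
from urllib.parse import urlparse

import redis
import requests

FAILOVER_ROOT = "/var/www/failover"  # wherever your static copies live
QUEUE_KEY = "failover:urls"          # the list your on_save hook pushes to

r = redis.Redis()

def local_path(url):
    """Map a URL to a file path under FAILOVER_ROOT."""
    path = urlparse(url).path.lstrip("/") or "index"
    if not os.path.splitext(path)[1]:
        path += ".html"
    return os.path.join(FAILOVER_ROOT, path)

while True:
    # Block until the hook pushes another URL onto the queue.
    _, raw = r.blpop(QUEUE_KEY)
    url = raw.decode("utf-8")
    try:
        resp = requests.get(url, timeout=30)
        resp.raise_for_status()
    except requests.RequestException as exc:
        print("failed to fetch %s: %s" % (url, exc))
        continue
    dest = local_path(url)
    os.makedirs(os.path.dirname(dest), exist_ok=True)
    with open(dest, "wb") as f:
        f.write(resp.content)

Swap Redis for whatever queue you already have around; the point is just that an event-driven fetch of a single known URL doesn't need any crawling machinery.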
Perhaps I'm not understanding why you'd need something as heavy as Scrapy; a 30-line Python program like the sketch above can monitor the queue, requests.get() the page, and save it to the static location.

On Mon, Nov 2, 2015 at 5:16 PM, Jim Priest <[email protected]> wrote:
> I should have provided a bit more info on our use case :)
>
> We have a lot of dynamic content in Drupal, blogs, etc. The failover
> content is static versions of this dynamic content. Currently this is done
> via a rather clunky Akamai tool which we're hoping to replace.
>
> Another goal is to more immediately update this content - i.e. someone
> updates a Drupal page, it is immediately spidered (via an API call or
> something) and that content is then saved to failover.
>
> I could probably cobble something together with wget or some other tool,
> but I'm trying not to reinvent the wheel here as much as possible.
>
> Thanks!
> Jim
>
> On Monday, November 2, 2015 at 7:28:39 AM UTC-5, Jakob de Maeyer wrote:
>>
>> Hey Jim,
>>
>> Scrapy is great at two things:
>> 1. downloading web pages, and
>> 2. extracting unstructured data.
>>
>> In your case, you should already have access to the raw files (via
>> FTP, etc.), as well as to the data in a structured format. It would be
>> possible to do what you're aiming at with Scrapy, but it doesn't seem to be
>> the most elegant solution. What speaks against setting up an rsync cronjob
>> or similar to keep the failover in sync?
>>
>> Cheers,
>> -Jakob
