Hey Bruce, I'm not sure what your exact situation is, and of course Pablo is far more knowledgeable about what Scrapy can and can't do than I am.
But I will say that I have many spiders that crawl AJAX-powered sites.

On Mon, Apr 28, 2014 at 1:30 PM, bruce <[email protected]> wrote:
> bill...
>
> not sure that's the same... ie, I don't think scrapy has a way to
> "wait" for an element to show up on a given page, based on the
> underlying ajax functions...
>
> I had talked to pablo about this awhile ago and he was saying scrapy
> couldn't handle this. Are you saying it now can??
>
> This would be cool if it really can.
>
> On Mon, Apr 28, 2014 at 1:13 PM, Bill Ebeling <[email protected]> wrote:
> > Scrapy sends a request to the ajax address just like it does for the
> > normal webpage. You maintain data from one request to the other with
> > the meta dict.
> >
> > There was a tutorial on it a while back about scraping the nasa
> > website for its pic of the day. Can't seem to find it, now though.
> > If you take a look at the link above, you can read all about it.
> >
> > On Mon, Apr 28, 2014 at 1:01 PM, bruce <[email protected]> wrote:
> >>
> >> I didn't think scrapy had the ability to run remote ajax, similar to
> >> casperjs/phantom/nodejs...
> >>
> >> Does scrapy run a headless browser process to accomplish this??
> >>
> >> thanks
> >>
> >> On Mon, Apr 28, 2014 at 10:17 AM, Bill Ebeling <[email protected]> wrote:
> >> > Hey Mitch,
> >> >
> >> > At the risk of stating the obvious, Scrapy handles dynamic content
> >> > quite well. The general approach is to scrape the page, submit
> >> > requests for the ajax, stitch the item together, submit it to the
> >> > pipeline.
> >> >
> >> > That said, it's not complicated, but not trivial, either.
> >> >
> >> > To your specific point, the solution is either to regex it out, or
> >> > to start fiddling with the underlying html. I would not personally
> >> > download someone else's page and then put it on a server, since the
> >> > js is still going to be running and logging things and all that.
> >> >
> >> > If you want to look into writing a crawler that gets the dynamic
> >> > content, start here:
> >> > http://doc.scrapy.org/en/latest/topics/request-response.html
> >> > and pay special attention to the meta dict.
> >> >
> >> > If you want more help with the specific site, provide a link so we
> >> > can see it.
> >> >
> >> > Hope that helps.
> >> >
> >> > --
> >> > You received this message because you are subscribed to the Google
> >> > Groups "scrapy-users" group.
> >> > To unsubscribe from this group and stop receiving emails from it,
> >> > send an email to [email protected].
> >> > To post to this group, send email to [email protected].
> >> > Visit this group at http://groups.google.com/group/scrapy-users.
> >> > For more options, visit https://groups.google.com/d/optout.
