It depends on the website. Scrapy does follow 301 redirects. What might be happening is that some sites detect the user agent (you can set Scrapy's user agent) and serve a different response. Other sites use JavaScript to populate the prices in the page. In that case you'll need to simulate a browser, since Scrapy neither builds a DOM nor runs JavaScript. For that you can either use Splash (https://github.com/scrapinghub/splash) with Scrapy or try another technology (I like CasperJS for those kinds of sites).
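To check the user-agent theory first, something like the following might help. It is only a rough sketch: `USER_AGENT`, `REDIRECT_ENABLED`, and `REDIRECT_MAX_TIMES` are regular Scrapy settings and `redirect_urls` is the key Scrapy's redirect middleware fills in on `response.meta`, but the browser string is just an example value and the helper name `redirect_chain` is made up for illustration.

```python
# settings.py (sketch): these are plain module-level constants that Scrapy reads.

# Example browser-like user agent; the exact string is an assumption,
# substitute whatever your own browser sends.
USER_AGENT = (
    "Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 "
    "(KHTML, like Gecko) Chrome/120.0 Safari/537.36"
)

# Redirects are followed by default; shown here only to make that explicit.
REDIRECT_ENABLED = True
REDIRECT_MAX_TIMES = 20


def redirect_chain(meta):
    """Return the URLs that were redirected before the final response.

    Hypothetical helper: pass it response.meta inside a spider callback.
    Scrapy's redirect middleware stores the intermediate URLs under the
    "redirect_urls" key; the key is absent when no redirect happened.
    """
    return list(meta.get("redirect_urls", []))
```

In a callback you could log `redirect_chain(response.meta)` next to `response.url` to see whether the 301 still happens with the new user agent. If the redirect goes away but the rates are still missing from `response.body`, the prices are most likely filled in by JavaScript.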
On Tuesday, February 18, 2014 at 10:11:18 UTC-3, Guz Vinueza wrote:
>
> Hi,
>
> I am trying to build a new scraper to get hotel rates. With some sites,
> the response URL is the same as the request URL, so the information I see
> on the website is the same as what I can use for XPath filtering and
> scraping. For some other sites, I get a 301 redirect from the hotel site
> that gives me information about the hotel but leaves out the rates (in
> the response object).
>
> How can I get around this? I am new to Scrapy, so perhaps it is a very
> basic question.
