Hi guys,
I’ve developed a spider that iterates over pages. The issue is that only the
first page that gets processed works. Here’s the code:
# -*- coding: utf-8 -*-
from scrapy import Spider
from scrapy.http import Request, FormRequest


class HomeCareSpider(Spider):
    name = 'home_care'

    def start_requests(self):
        # One request to the landing page per registry number.
        for number in range(1, 2):
            yield Request('https://apps.health.ny.gov/professionals/home_care/registry/home.action',
                          dont_filter=True,
                          meta={'number': str(number)},
                          callback=self.parse_form)

    def parse_form(self, response):
        # Submit the search form, passing along the hidden "hcrt" value.
        yield FormRequest('https://apps.health.ny.gov/professionals/home_care/registry/searchworker.action',
                          formdata={'action:searchbynumber': 'Search by Registry Number',
                                    'ascr.dohRegistryStr': response.meta['number'],
                                    'hcrt': response.xpath('//*[@name="hcrt"]/@value').extract_first()},
                          meta={'number': response.meta['number']},
                          dont_filter=True,
                          callback=self.parse_search_result)

    def parse_search_result(self, response):
        # Open the details page for the worker found by the search.
        yield FormRequest('https://apps.health.ny.gov/professionals/home_care/registry/worker.action',
                          formdata={'resinums[' + response.meta['number'] + ']': '',
                                    'hcrt': response.xpath('//*[@name="hcrt"]/@value').extract_first()},
                          dont_filter=True,
                          callback=self.parse_person_details)

    def parse_person_details(self, response):
        print(response.xpath('//h1/text()').extract_first())

If I run this spider with scrapy runspider foo.py, it works without any
issues. If I change for number in range(1, 2): to something like
for number in range(1, 3):, in other words iterating over two pages, it stops
working. Whichever page gets processed first succeeds, and the other fails
with a Twisted-related error. Has anybody experienced this type of issue
before?
*Scrapy==1.1.1* is the version that I’m using.
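
One thing I have not ruled out: the hcrt value may be tied to the session
cookie, so two chains running at the same time could invalidate each other's
token. That is only an assumption on my part. If it holds, a minimal sketch of
something to try is giving each registry number its own cookie session through
Scrapy's cookiejar meta key. The helper names below are hypothetical, and the
code only builds the meta and formdata dicts, so it runs without Scrapy:

```python
def build_meta(number):
    """Meta for the first request of one registry number.

    'cookiejar' is the meta key Scrapy's cookies middleware uses to keep
    separate cookie sessions; using the number as its value is my own choice.
    """
    return {'number': str(number), 'cookiejar': str(number)}


def carry_meta(meta):
    """Meta for a follow-up request in the same chain.

    The cookiejar key is not sticky, so it has to be copied onto every
    subsequent request. Only the two keys we set ourselves are kept.
    """
    return {'number': meta['number'], 'cookiejar': meta['cookiejar']}


def search_formdata(number, hcrt):
    """Form fields for searchworker.action, as in parse_form above."""
    return {
        'action:searchbynumber': 'Search by Registry Number',
        'ascr.dohRegistryStr': number,
        'hcrt': hcrt,
    }


def worker_formdata(number, hcrt):
    """Form fields for worker.action, as in parse_search_result above."""
    return {'resinums[' + number + ']': '', 'hcrt': hcrt}
```

In the spider this would mean meta=build_meta(number) in start_requests and
meta=carry_meta(response.meta) on both FormRequests. I have not verified that
this fixes the error.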