Hi all,
I am trying to click the "Next" page link using Scrapy + Python + Selenium
WebDriver.
Platform: Python + Scrapy + Selenium WebDriver
Note:
I am trying to get the jobs from all 11 pages, but the spider keeps looping
over the first page's jobs and never crawls the remaining pages.
Here is my HTML code:
<!-- begin snippet: js hide: false -->
<!-- language: lang-html -->
<div id="win0divHRS_APPL_WRK_HRS_LST_NEXT">
<span class="PSHYPERLINK" title="Next In List">
<a id="HRS_APPL_WRK_HRS_LST_NEXT" class="PSHYPERLINK"
href="javascript:submitAction_win0(document.win0,'HRS_APPL_WRK_HRS_LST_NEXT');"
tabindex="74" ptlinktgt="pt_replace"
name="HRS_APPL_WRK_HRS_LST_NEXT">Next</a>
</span>
</div>
<!-- end snippet -->
Here is my spider code:
<!-- begin snippet: js hide: false -->
<!-- language: lang-py -->
def __init__(self):
    self.driver = webdriver.Firefox()

def parse(self, response):
    self.driver.get('https://eapplicant.northshore.org/psc/psapp/EMPLOYEE/HRMS/c/HRS_HRAM.HRS_CE.GBL')
    selector = Selector(response)
    while True:
        try:
            next = self.driver.find_element_by_id('HRS_APPL_WRK_HRS_LST_NEXT')
            links = []
            for link in selector.xpath('.//*[@id="HRS_CE_JO_EXT_I$scroll$0"]'):
                for link in selector.css('span.PSEDITBOX_DISPONLY').re('.*>(\d+)<.*'):
                    #intjid = selector.css('span.PSEDITBOX_DISPONLY').re('.*>(\d+)<.*')
                    abc = 'https://eapplicant.northshore.org/psp/psapp/EMPLOYEE/HRMS/c/HRS_HRAM.HRS_CE.GBL?Page=HRS_CE_JOB_DTL&Action=A&JobOpeningId='+link+'&SiteId=1&PostingSeq=1'
                    #print abc
                    yield Request(abc, callback=self.parse_iframe,
                                  headers={"X-Requested-With": "XMLHttpRequest"}, dont_filter=True)
        except NoSuchElementException:
            break
        next.click()
    #self.driver.close()

def parse_iframe(self, response):
    selector = Selector(response)
    url = selector.xpath('//*[@id="ptifrmtgtframe"]/@src').extract()[0]
    yield Request(url, callback=self.parse_listing_page,
                  headers={"X-Requested-With": "XMLHttpRequest"}, dont_filter=True)
<!-- end snippet -->
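One thing that may explain the repeated first page: in the spider above, `selector` is built once from the Scrapy `response`, not from the page the browser is showing after each click, so every pass of the loop sees the same HTML. Below is a minimal sketch of re-reading the browser's `page_source` on each pass. The helper name `collect_job_ids`, its parameters, and the regular expression for the job ID markup are illustrative assumptions, not something confirmed against the real site. The `driver` argument is expected to behave like a Selenium WebDriver (`page_source`, `find_elements(by, value)`).

```python
import re
import time

NEXT_ID = "HRS_APPL_WRK_HRS_LST_NEXT"

# Assumption: each job ID is the text of a <span class="PSEDITBOX_DISPONLY" ...>.
JOB_ID_RE = re.compile(r'<span[^>]*class="PSEDITBOX_DISPONLY"[^>]*>(\d+)</span>')


def collect_job_ids(driver, max_pages=50, timeout=10.0, poll=0.5):
    """Walk the result pages in the browser and return job IDs in the order seen.

    Hypothetical helper: `driver` only needs `page_source` and
    `find_elements(by, value)`, as a Selenium WebDriver provides.
    """
    seen = []
    for _ in range(max_pages):
        html = driver.page_source  # re-read the live page on every pass
        for job_id in JOB_ID_RE.findall(html):
            if job_id not in seen:
                seen.append(job_id)

        # find_elements returns an empty list (no exception) when nothing matches.
        next_links = driver.find_elements("id", NEXT_ID)
        if not next_links:
            break
        next_links[0].click()

        # Crude wait: poll until the page source differs from the previous page.
        deadline = time.monotonic() + timeout
        while driver.page_source == html and time.monotonic() < deadline:
            time.sleep(poll)
        if driver.page_source == html:
            break  # nothing changed after the click; assume this was the last page
    return seen
```

Inside `parse`, the returned IDs could then be turned into the same `Request` objects as before. Comparing `page_source` strings is only a rough way to detect that the list was replaced; an explicit wait on a specific element would likely be more reliable.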
Please let me know how to iterate over the jobs on the following pages using
Selenium WebDriver + Scrapy.
Can anyone guide me on how to crawl all 11 pages?
Thanks in advance.