Hi all, I have already discussed this on Stack Overflow:
http://stackoverflow.com/questions/27185466/how-to-click-the-search-buttton?noredirect=1#comment42859639_27185466
http://stackoverflow.com/questions/27182350/how-to-get-the-xpath-for-jobpage

Note: full spider code: https://gist.github.com/joshbolt/bd41a7fd17888ecea043

Can anyone help me resolve this? I am able to click the "Search Jobs Now" button, which switches over to another page containing the job-list search. From there I have tried many XPath expressions and frame ids for the "Search" button that should lead to the job-list page, but Selenium never finds the element I expect.

Thanks in advance.

Here is the HTML for the button:

<!-- begin snippet: js hide: false -->

<!-- language: lang-html -->

<a id="Left" class="PSPUSHBUTTON" href="javascript:submitAction_win0(document.win0,'#ICSetFieldHRS_APP_SCHJOB.EOTL_UI_BTN_ID.SEARCHACTIONS#SEARCH');" tabindex="48">
  <span>
    <input class="PSPUSHBUTTON" type="button" tabindex="-1" onclick="javascript:submitAction_win0(document.win0,'#ICSetFieldHRS_APP_SCHJOB.EOTL_UI_BTN_ID.SEARCHACTIONS#SEARCH');" title="" value=" Search " name="SEARCHACTIONS#SEARCH">
  </span>
</a>

<!-- end snippet -->

Spider code:

<!-- begin snippet: js hide: false -->

<!-- language: lang-py -->

class WellsfargocomSpider(Spider):
    name = 'wellsfargo'
    allowed_domains = ['www.wellsfargo.com']
    start_urls = ['https://employment.wellsfargo.com/psp/PSEA/APPLICANT_NW/HRMS/c/HRS_HRAM.HRS_APP_SCHJOB.GBL?FOCUS=']

    #driver = webdriver.Remote('http://127.0.0.1:4444/wd/hub', desired_capabilities=webdriver.DesiredCapabilities.HTMLUNIT)
    # Create a new instance of the Firefox webdriver
    driver = webdriver.Firefox()
    #driver.implicitly_wait(0.5)

    def parse(self, response):
        selector = Selector(response)
        driver = self.driver
        self.driver.get('https://employment.wellsfargo.com/psp/PSEA/APPLICANT_NW/HRMS/c/HRS_HRAM.HRS_APP_SCHJOB.GBL?FOCUS=')
        frame = self.driver.find_element_by_id('ptifrmtgtframe')
        self.driver.switch_to.frame(frame)
        button = driver.find_element_by_id('HRS_CE_WELCM_WK_HRS_CE_WELCM_BTN')
        button.click()
        #frame1 = self.driver.find_element_by_id('ptifrmtemplate')
        #self.driver.switch_to.frame(frame1)
        button1 = driver.find_element_by_xpath('//span[class="PSPUSHBUTTON"]')
        button1.click()
        # Other attempts that also failed:
        #clk1 = self.driver.find_element_by_xpath("//input[@name='SEARCHACTIONS#SEARCH']")
        #clk1.click()
        #self.driver.switch_to.default_content()
        #inputElement = self.driver.find_element_by_css_selector("input.PSPUSHBUTTON")
        #inputElement.submit()
        #inputElement1 = self.driver.find_element_by_xpath("//input[@name='SEARCHACTIONS#SEARCH']")
        #inputElement1.click()
        #while True:
        #    next = self.driver.find_element_by_xpath(".//*[@id='HRS_APPL_WRK_HRS_LST_NEXT']")
        #    try:
        for link in selector.css('span.PSEDITBOX_DISPONLY').re('.*>(\d+)<.*'):
            abc = 'https://employment.wellsfargo.com/psp/PSEA/APPLICANT_NW/HRMS/c/HRS_HRAM.HRS_APP_SCHJOB.GBL?Page=HRS_APP_JBPST&FOCUS=Applicant&SiteId=1&JobOpeningId=' + link + '&PostingSeq=1'
            yield Request(abc, callback=self.parse_iframe,
                          headers={"X-Requested-With": "XMLHttpRequest"},
                          dont_filter=True)
        #    next.click()
        #    except:
        #        break
        #self.driver.close()

    def parse_iframe(self, response):
        selector = Selector(response)
        url = selector.xpath('//*[@id="ptifrmtgtframe"]/@src').extract()[0]
        yield Request(url, callback=self.parse_listing_page,
                      headers={"X-Requested-With": "XMLHttpRequest"},
                      dont_filter=True)

<!-- end snippet -->

Output:

<!-- begin snippet: js hide: false -->

<!-- language: lang-none -->

C:\Users\xxxx\Downloads\wellsfargocom>scrapy crawl wellsfargo
2014-11-28 12:20:05+0530 [scrapy] INFO: Scrapy 0.24.4 started (bot: wellsfargocom)
2014-11-28 12:20:05+0530 [scrapy] INFO: Optional features available: ssl, http11
2014-11-28 12:20:05+0530 [scrapy] INFO: Overridden settings: {'NEWSPIDER_MODULE': 'wellsfargocom.spiders', 'SPIDER_MODULES': ['wellsfargocom.spiders'], 'BOT_NAME': 'wellsfargocom'}
2014-11-28 12:20:05+0530 [scrapy] INFO: Enabled extensions: LogStats, TelnetConsole, CloseSpider, WebService, CoreStats, SpiderState
2014-11-28 12:20:05+0530 [scrapy] INFO: Enabled downloader middlewares: HttpAuthMiddleware, DownloadTimeoutMiddleware, UserAgentMiddleware, RetryMiddleware, DefaultHeadersMiddleware, MetaRefreshMiddleware, HttpCompressionMiddleware, RedirectMiddleware, CookiesMiddleware, ChunkedTransferMiddleware, DownloaderStats
2014-11-28 12:20:05+0530 [scrapy] INFO: Enabled spider middlewares: HttpErrorMiddleware, OffsiteMiddleware, RefererMiddleware, UrlLengthMiddleware, DepthMiddleware
2014-11-28 12:20:05+0530 [scrapy] INFO: Enabled item pipelines:
2014-11-28 12:20:05+0530 [wellsfargo] INFO: Spider opened
2014-11-28 12:20:05+0530 [wellsfargo] INFO: Crawled 0 pages (at 0 pages/min), scraped 0 items (at 0 items/min)
2014-11-28 12:20:05+0530 [scrapy] DEBUG: Telnet console listening on 127.0.0.1:6023
2014-11-28 12:20:05+0530 [scrapy] DEBUG: Web service listening on 127.0.0.1:6080
2014-11-28 12:20:08+0530 [wellsfargo] DEBUG: Redirecting (302) to <GET https://employment.wellsfargo.com/psp/PSEA/APPLICANT_NW/HRMS/c/HRS_HRAM.HRS_APP_SCHJOB.GBL?FOCUS=&> from <GET https://employment.wellsfargo.com/psp/PSEA/APPLICANT_NW/HRMS/c/HRS_HRAM.HRS_APP_SCHJOB.GBL?FOCUS=>
2014-11-28 12:20:08+0530 [wellsfargo] DEBUG: Redirecting (302) to <GET https://employment.wellsfargo.com/psp/PSEA/APPLICANT_NW/HRMS/c/HRS_HRAM.HRS_APP_SCHJOB.GBL?FOCUS=> from <GET https://employment.wellsfargo.com/psp/PSEA/APPLICANT_NW/HRMS/c/HRS_HRAM.HRS_APP_SCHJOB.GBL?FOCUS=&>
2014-11-28 12:20:09+0530 [wellsfargo] DEBUG: Crawled (200) <GET https://employment.wellsfargo.com/psp/PSEA/APPLICANT_NW/HRMS/c/HRS_HRAM.HRS_APP_SCHJOB.GBL?FOCUS=> (referer: None)
2014-11-28 12:20:18+0530 [wellsfargo] ERROR: Spider error processing <GET https://employment.wellsfargo.com/psp/PSEA/APPLICANT_NW/HRMS/c/HRS_HRAM.HRS_APP_SCHJOB.GBL?FOCUS=>
        Traceback (most recent call last):
          File "C:\Python27\lib\site-packages\twisted\internet\base.py", line 824, in runUntilCurrent
            call.func(*call.args, **call.kw)
          File "C:\Python27\lib\site-packages\twisted\internet\task.py", line 638, in _tick
            taskObj._oneWorkUnit()
          File "C:\Python27\lib\site-packages\twisted\internet\task.py", line 484, in _oneWorkUnit
            result = next(self._iterator)
          File "C:\Python27\lib\site-packages\scrapy-0.24.4-py2.7.egg\scrapy\utils\defer.py", line 57, in <genexpr>
            work = (callable(elem, *args, **named) for elem in iterable)
        --- <exception caught here> ---
          File "C:\Python27\lib\site-packages\scrapy-0.24.4-py2.7.egg\scrapy\utils\defer.py", line 96, in iter_errback
            yield next(it)
          File "C:\Python27\lib\site-packages\scrapy-0.24.4-py2.7.egg\scrapy\contrib\spidermiddleware\offsite.py", line 26, in process_spider_output
            for x in result:
          File "C:\Python27\lib\site-packages\scrapy-0.24.4-py2.7.egg\scrapy\contrib\spidermiddleware\referer.py", line 22, in <genexpr>
            return (_set_referer(r) for r in result or ())
          File "C:\Python27\lib\site-packages\scrapy-0.24.4-py2.7.egg\scrapy\contrib\spidermiddleware\urllength.py", line 33, in <genexpr>
            return (r for r in result or () if _filter(r))
          File "C:\Python27\lib\site-packages\scrapy-0.24.4-py2.7.egg\scrapy\contrib\spidermiddleware\depth.py", line 50, in <genexpr>
            return (r for r in result or () if _filter(r))
          File "C:\Users\xxx\Downloads\wellsfargocom\wellsfargocom\spiders\wellsfargo.py", line 53, in parse
            button1 = driver.find_element_by_xpath("//input[@name='SEARCHACTIONS#SEARCH']")
          File "C:\Python27\lib\site-packages\selenium\webdriver\remote\webdriver.py", line 230, in find_element_by_xpath
            return self.find_element(by=By.XPATH, value=xpath)
          File "C:\Python27\lib\site-packages\selenium\webdriver\remote\webdriver.py", line 662, in find_element
            {'using': by, 'value': value})['value']
          File "C:\Python27\lib\site-packages\selenium\webdriver\remote\webdriver.py", line 173, in execute
            self.error_handler.check_response(response)
          File "C:\Python27\lib\site-packages\selenium\webdriver\remote\errorhandler.py", line 166, in check_response
            raise exception_class(message, screen, stacktrace)
        selenium.common.exceptions.NoSuchElementException: Message: Unable to locate element: {"method":"xpath","selector":"//input[@name='SEARCHACTIONS#SEARCH']"}
        Stacktrace:
            at FirefoxDriver.prototype.findElementInternal_ (file:///c:/users/ /appdata/local/temp/tmpxra5mc/extensions/[email protected]/components/driver-component.js:9641:26)
            at FirefoxDriver.prototype.findElement (file:///c:/users/xxxxx/appdata/local/temp/tmpxra5mc/extensions/[email protected]/components/driver-component.js:9650:3)
            at DelayedCommand.prototype.executeInternal_/h (file:///c:/users/ /appdata/local/temp/tmpxra5mc/extensions/[email protected]/components/command-processor.js:11635:16)
            at DelayedCommand.prototype.executeInternal_ (file:///c:/users/ /appdata/local/temp/tmpxra5mc/extensions/[email protected]/components/command-processor.js:11640:7)
            at DelayedCommand.prototype.execute/< (file:///c:/users/xxxxx/appdata/local/temp/tmpxra5mc/extensions/[email protected]/components/command-processor.js:11582:5)
2014-11-28 12:20:18+0530 [wellsfargo] INFO: Closing spider (finished)
2014-11-28 12:20:18+0530 [wellsfargo] INFO: Dumping Scrapy stats:
        {'downloader/request_bytes': 1878,
         'downloader/request_count': 3,
         'downloader/request_method_count/GET': 3,
         'downloader/response_bytes': 7188,
         'downloader/response_count': 3,
         'downloader/response_status_count/200': 1,
         'downloader/response_status_count/302': 2,
         'finish_reason': 'finished',
         'finish_time': datetime.datetime(2014, 11, 28, 6, 50, 18, 647000),
         'log_count/DEBUG': 5,
         'log_count/ERROR': 1,
         'log_count/INFO': 7,
         'response_received_count': 1,
         'scheduler/dequeued': 3,
         'scheduler/dequeued/memory': 3,
         'scheduler/enqueued': 3,
         'scheduler/enqueued/memory': 3,
         'spider_exceptions/NoSuchElementException': 1,
         'start_time': datetime.datetime(2014, 11, 28, 6, 50, 5, 790000)}
2014-11-28 12:20:18+0530 [wellsfargo] INFO: Spider closed (finished)

<!-- end snippet -->

--
You received this message because you are subscribed to the Google Groups "scrapy-users" group.
To unsubscribe from this group and stop receiving emails from it, send an email to [email protected].
To post to this group, send email to [email protected].
Visit this group at http://groups.google.com/group/scrapy-users.
For more options, visit https://groups.google.com/d/optout.

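PPS: the job-id extraction and URL building half of the spider can at least be checked on its own, without a browser. The sample markup and job ids below are made up for illustration; the regex mirrors the one in `parse()`.

```python
import re

# Made-up sample of the listing markup, mimicking the
# span.PSEDITBOX_DISPONLY cells the spider scrapes job ids from.
sample = ('<span class="PSEDITBOX_DISPONLY">1401551</span>'
          '<span class="PSEDITBOX_DISPONLY">1402871</span>')

BASE = ('https://employment.wellsfargo.com/psp/PSEA/APPLICANT_NW/HRMS/c/'
        'HRS_HRAM.HRS_APP_SCHJOB.GBL?Page=HRS_APP_JBPST&FOCUS=Applicant'
        '&SiteId=1&JobOpeningId={jid}&PostingSeq=1')

def job_urls(html):
    """Extract numeric job ids and build posting URLs, as parse() does."""
    ids = re.findall(r'class="PSEDITBOX_DISPONLY">(\d+)<', html)
    return [BASE.format(jid=jid) for jid in ids]

for url in job_urls(sample):
    print(url)
```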