Now I can get the values in scrapy shell, but not when I run scrapy crawl.
Have you encountered this kind of problem before?

Here's the sample code from the spider:

try:
    plot_size = response.xpath(
        u'//span[text()="Plot Size (m x m)"]/preceding::span/text()'
    ).extract()[-1]
except IndexError:
    plot_size = ''

l.add_value('plot_size', plot_size)

It seems like the spider always hits the IndexError branch, but when I check
the pages manually after the scrape, I find that some properties do have a
Plot Size defined, yet it doesn't get scraped.
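
A possible way to narrow this down, as a minimal sketch rather than a confirmed fix: pull every span's text and match the label in Python, so that the '²' and any stray whitespace never have to appear inside the XPath string. The helper names normalize_label and value_before_label and the sample list are made up for illustration, and the sketch assumes the value span sits directly before its label span, as the preceding::span expression above implies.

```python
import unicodedata


def normalize_label(text):
    """Fold compatibility characters (e.g. '²' -> '2') and collapse whitespace."""
    folded = unicodedata.normalize("NFKC", text)
    return " ".join(folded.split()).lower()


def value_before_label(span_texts, label_prefix, default=""):
    """Return the span text just before the first label starting with label_prefix.

    span_texts is assumed to hold every span's text in document order, for
    example the list returned by response.xpath('//span/text()').extract().
    """
    wanted = normalize_label(label_prefix)
    for index, text in enumerate(span_texts):
        if index > 0 and normalize_label(text).startswith(wanted):
            return span_texts[index - 1].strip()
    return default


# Hypothetical sample data shaped like the listing page (value, then label).
sample = [u"3", u"Total Rooms:", u"120", u"Plot Size (m\u00b2)"]
print(value_before_label(sample, u"Plot Size"))    # 120
print(value_before_label(sample, u"Garden Size"))  # empty string (default)
```

In the spider this would be called as value_before_label(response.xpath('//span/text()').extract(), u'Plot Size'). Logging the extracted list whenever the default comes back should show what the crawl actually received for that page.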

On Saturday, 22 August 2015 at 16:16:33 UTC+2, Paul Tremberth wrote:
>
> Try using a Unicode string parameter for xpath():
> response.xpath(u'//span[text()="Plot Size (m²)"]/preceding::span/text()')
> On 22 August 2015 at 15:00, "Mario" <[email protected]> wrote:
>
>> I'm having issues with getting some values from this page:
>>
>>
>> http://www.remax-malta.com/Maisonette-For-Sale-St-Pauls-Bay-North_240041024-145
>>
>> To be more specific, here's a picture of what I can (colored green) and 
>> can't (colored red) scrape:
>>
>> http://i.imgur.com/xT6wTtl.png
>>
>> For example, the XPath for Total Rooms is:
>>
>> response.xpath('//span[text()="Total Rooms:"]/preceding::span/text()').extract()[-1]
>>
>> This prints u'3', which is the value I'm after.
>>
>> But when I try to write the XPath for Plot Size (m²) like this:
>>
>> response.xpath('//span[text()="Plot Size (m²)"]/preceding::span/text()').extract()[-1]
>>
>>
>> I get this error:
>>
>> ValueError: All strings must be XML compatible: Unicode or ASCII, no NULL 
>> bytes or control characters
>>
>>
>> I know I get this because of the '²' character. Can somebody help me 
>> write a proper XPath? Or is there another way of getting the value?
>>
>> -- 
>> You received this message because you are subscribed to the Google Groups 
>> "scrapy-users" group.
>> To unsubscribe from this group and stop receiving emails from it, send an 
>> email to [email protected].
>> To post to this group, send email to [email protected].
>> Visit this group at http://groups.google.com/group/scrapy-users.
>> For more options, visit https://groups.google.com/d/optout.
>>
>
