Re: issue with Xpath in scrapy.

Chetan Motamarri Wed, 26 Nov 2014 13:34:32 -0800

Thanks a lot Paul  :)

On Wednesday, November 26, 2014 3:13:41 AM UTC-7, Paul Tremberth wrote:
>
> Hi Chetan,
>
> Regarding the syntax, it should be parentheses, not square brackets, for 
> selecting text nodes:
>
>
> //div[@class='forum_list_name']/a[contains(.,'Workshop')]/preceding-sibling::text()[1]
>
>
> Now, regarding what you want to select, the count of threads for workshop 
> discussions,
> instead of selecting the element that has "Workshop" and then going 
> backwards in the document tree with preceding-sibling,
> I suggest you use the content condition on the parent div element,
> and then select the element with class "forum_list_postcount" containing 
> the count:
>
> workshopDiscussions = hxs.select("""
>     //div[div[@class='forum_list_name']/a[contains(.,'Workshop')]]
>         /div[@class='forum_list_postcount']
> """).xpath("normalize-space()").extract()
>
>
>
> On Wednesday, November 26, 2014 9:57:42 AM UTC+1, Chetan Motamarri wrote:
>>
>> Hi All,
>>
>> I want to extract the "Workshop Discussions" value from this URL (
>> http://steamcommunity.com/workshop/discussions/?appid=220700). For this 
>> I wrote Scrapy code like the following, but I was unable to extract "70", which 
>> is the Workshop Discussions count.
>>
>> import datetime
>>
>> from scrapy.selector import HtmlXPathSelector
>> from scrapy.spider import BaseSpider
>> from extractDiscussionsCount.items import ExtractdiscussionscountItem
>>
>> class ScrapePriceSpider(BaseSpider):
>>     
>>     name = 'ScrapeDiscussionsCount'     
>>     allowed_domains = ['steamcommunity.com']    
>>     start_urls = ["http://steamcommunity.com/workshop/discussions/?appid=220700"]
>>     
>>     def parse(self, response):
>>            
>>             hxs = HtmlXPathSelector(response)
>>             currentDate = datetime.datetime.now()
>>             currentTime = str(datetime.datetime.now().time())
>>             items = []    
>>             item = ExtractdiscussionscountItem()       
>>             
>> *            workshopDiscussions= 
>> hxs.select("//div[@class='forum_list_name']/a[contains(.,'Workshop')]/preceding-sibling::text[][1]").extract()*
>>                         
>>             item["DiscussionCount"] = str(workshopDiscussions)           
>>   
>>             items.append(item)
>>             return items
>>   
>>
>> I know there is an XPath issue with the red-colored text in the above code. 
>> Please help me fix it. 
>>
>> Thanks
>> Chetan Motamarri
>>
>

