Re: Combining adjacent nodes in xpath for selection list

Jaspreet Singh Sun, 01 Jun 2014 04:23:30 -0700

Thanks a lot. It worked!

On Sunday, June 1, 2014 4:02:45 PM UTC+5:30, Luis Miguel Morillas wrote:
>
> Something like: 
>
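> for newitem in sel.xpath(u'//div[@id="wb_Text4"]//u'):
>     # text inside the <u> element (the heading)
>     print(newitem.xpath(u'./text()').extract())
>     # first text node following the <u> (the adjacent description)
>     print(newitem.xpath(u'(./following-sibling::text())[1]').extract())
>     print()
>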
> Saludos, 
>
> -- luismiguel  (@lmorillas) 
>
>
> 2014-06-01 11:16 GMT+02:00 Nikolaos-Digenis Karagiannis <[email protected]>:
> > I misinterpreted the specification there. Also, in other implementations I
> > found it possible to start with a text node as the context node and select
> > parents, siblings, etc. With lxml.etree, once you select a text node you get a
> > text result and you are done: no more xpath() methods on this object. Scrapy
> > suppresses this
> > https://github.com/scrapy/scrapy/blob/554102fd70b14ee83109003cf77ab3a4f91f4f58/scrapy/selector/unified.py#L88-L92
> > and I didn't notice at first.
> > I'd call this a bug in lxml (or libxml2).
> > 
> > 
> > On Sunday, 1 June 2014 11:18:56 UTC+3, Nikolaos-Digenis Karagiannis wrote:
> >>
> >> Usually you can just count(preceding-sibling::u|self::u) and group them by
> >> this count.
> >> But alas! Here you cannot, because the sibling axis does not work on
> >> text() nodes.
> >> http://www.w3.org/TR/xpath/#node-tests -> Bullet point 3: "For other axes,
> >> the principal node type is element"
> >> Types of nodes: http://www.w3.org/TR/xpath/#data-model
> >> Try counting <u> nodes manually.
> >> 
> >> On Sunday, 1 June 2014 04:57:34 UTC+3, Jaspreet Singh wrote: 
> >>> 
> >>> Hi, 
> >>> 
> >>> I am looking to scrape a page where the required items are adjacent in
> >>> pairs under a single parent node.
> >>>
> >>> The page is http://www.intradaystocktips.org/stocks_to_watch_today.php
> >>>
> >>> I want the XPath to be specified such that "Tata Motors Ltd" and the
> >>> following text, i.e. "Automobile major reported a net profit of Rs 3,920
> >>> crore during Jan-March quarter, down 0.3 per cent, against a net profit of
> >>> Rs 3,931 crore, in the corresponding quarter last fiscal", is the first item.
> >>> Similarly, the second item will be "Trent Ltd" followed by "Undeterred by
> >>> the BJP's apparently unyielding stance on foreign direct investment (FDI) in
> >>> multi-brand retail, Tesco is going ahead with its proposed $110 million
> >>> investment to open stores in a joint venture with Tata's Trent Hypermarket.".
> >>>
> >>> In short, I need to select a node along with its adjacent node (i.e.
> >>> combining adjacent nodes) as a single item of the selection list.
> >>>
> >>> How can I create a selection using an XPath expression for the above rule?
> >>> 
> > -- 
> > You received this message because you are subscribed to the Google Groups
> > "scrapy-users" group.
> > To unsubscribe from this group and stop receiving emails from it, send an
> > email to [email protected].
> > To post to this group, send email to [email protected].
> > Visit this group at http://groups.google.com/group/scrapy-users.
> > For more options, visit https://groups.google.com/d/optout.
>
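
The lxml behaviour mentioned in the quoted thread (a selected text node comes back as a string with no xpath() method) can be seen in a small sketch like the one below. It assumes plain lxml.etree and a made-up XML fragment. By default lxml returns "smart strings" that still know their origin through getparent(); passing smart_strings=False, which is roughly what the linked Scrapy code appears to do, yields plain strings instead.

```python
from lxml import etree

# Hypothetical fragment: descriptions are the tail text of each <u> element.
root = etree.fromstring(
    "<div><u>Tata Motors Ltd</u> profit fell<br/>"
    "<u>Trent Ltd</u> Tesco deal</div>"
)

# Selecting text nodes yields string results, not elements.
tails = root.xpath("//u/following-sibling::text()[1]")
first = tails[0]

# A text result has no xpath() method of its own ...
can_query_further = hasattr(first, "xpath")

# ... but lxml's default "smart string" remembers the element it came from.
owner = first.getparent()   # element that carries this text as its tail
came_from_tail = first.is_tail

# With smart strings disabled the result is a plain string without getparent().
plain = root.xpath("//u/following-sibling::text()[1]", smart_strings=False)[0]

if __name__ == "__main__":
    print(repr(first), can_query_further, owner.tag, came_from_tail)
    print(hasattr(plain, "getparent"))
```

From owner one can continue with ordinary element queries, for example owner.xpath("string(.)") to read the heading that the text belongs to.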

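The counting idea from the quoted thread can also be tried directly. This is only a sketch under assumptions: it uses lxml.etree, a made-up fragment, and a hypothetical helper named group_by_heading. It selects the text children whose number of preceding <u> siblings equals n, which relies on sibling axes being usable from text nodes, as the later correction in the thread suggests.

```python
from lxml import etree


def group_by_heading(parent):
    """Group the text children of `parent` under the <u> sibling preceding them."""
    groups = []
    headings = parent.xpath("u")
    for n, u in enumerate(headings, start=1):
        # Text nodes with exactly n preceding <u> siblings belong to heading n.
        # $n is an XPath variable that lxml fills from the keyword argument.
        texts = parent.xpath("text()[count(preceding-sibling::u) = $n]", n=n)
        # Join the pieces and collapse runs of whitespace.
        body = " ".join(" ".join(texts).split())
        groups.append((u.text, body))
    return groups


# Hypothetical sample: one description is split in two by a <br/>.
SAMPLE = etree.fromstring(
    "<div>intro<u>A Ltd</u> first part<br/> second part<br/>"
    "<u>B Ltd</u> other text</div>"
)

if __name__ == "__main__":
    for heading, body in group_by_heading(SAMPLE):
        print(heading, "->", body)
```

Unlike taking only the first following text node, this variant keeps descriptions that are broken into several text nodes, at the cost of one extra query per heading.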

