Re: how to get elements in original order withing selector.xpath()?

Nikolaos-Digenis Karagiannis Tue, 20 May 2014 02:51:07 -0700

I didn't notice your post earlier.
The W3C recommendation for XPath 1.0 defines node-sets as *unordered*.
Various XPath implementations tend to return them in document order.
*Do not depend on this unless your software documents it as an extension to 
the standard.*


Fortunately, XPath defines axes, and an axis does have an order. *Axes differ 
from node-sets.*
Let us call "siblings" the node-set of nodes nested under the same parent:
>>> (parent,) = sel.xpath('//*[@id="Unique_ID"]')
>>> siblings = parent.xpath('node()')
Now every sibling has a number of siblings preceding it; we can use this 
number to "count" the distance from the beginning:
>>> position_31_maybe = siblings[30].xpath(
...     'count(preceding-sibling::node())').extract()[0]
>>> position_31_maybe
'31.0'
Using this number (a string actually, so convert it to a float first) as a 
key we can sort them:
>>> ordered_siblings = sorted(siblings, key=lambda sib: float(sib.xpath(
...     'count(preceding-sibling::node())').extract()[0]))
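
Note that extract() hands the count back as a string, and strings compare 
character by character, so the key has to become a number for the sort to be 
reliable. A small standalone illustration, using made-up values shaped like 
the extracted counts (they are not taken from any real page):

```python
# Hypothetical values, shaped like the strings extracted from
# count(preceding-sibling::node()) for siblings 10, 2, 0 and 1.
counts = ['10.0', '2.0', '0.0', '1.0']

as_strings = sorted(counts)             # lexicographic comparison
as_numbers = sorted(counts, key=float)  # numeric comparison

print(as_strings)  # '10.0' lands before '2.0'
print(as_numbers)  # numeric order
```

With fewer than ten siblings both orders happen to agree, which is why the 
problem is easy to miss on small test documents.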
I encourage you to experiment with mixed node types 
(elements/text/comments) to understand why I used node() here instead of * 
or no step at all.
Otherwise you may bump into strange bugs.
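
To see what goes wrong with mixed node types, here is a rough sketch that 
uses the standard library's xml.dom.minidom instead of Scrapy. It is only an 
emulation: the helper count_preceding() is a made-up name that walks 
previousSibling to mimic count(preceding-sibling::node()) and, with 
elements_only=True, count(preceding-sibling::*). The sample markup is 
invented for the illustration.

```python
from xml.dom.minidom import parseString

# Invented sample: text, element, comment, text, element under one parent.
doc = parseString('<p>alpha<b>beta</b><!-- note -->gamma<i>delta</i></p>')
children = list(doc.documentElement.childNodes)


def count_preceding(node, elements_only=False):
    """Count preceding siblings, optionally counting element nodes only."""
    total = 0
    sibling = node.previousSibling
    while sibling is not None:
        if not elements_only or sibling.nodeType == sibling.ELEMENT_NODE:
            total += 1
        sibling = sibling.previousSibling
    return total


# Counting every node type gives each child a distinct key.
all_counts = [count_preceding(c) for c in children]

# Counting elements only gives several children the same key.
element_counts = [count_preceding(c, elements_only=True) for c in children]

print(all_counts)
print(element_counts)
```

When only elements are counted, a text node and the element right after it 
share the same key, so sorting on that key cannot tell them apart.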

On Monday, 19 May 2014 17:00:10 UTC+3, jinchao wang wrote:
>
> May I ask a question: how do I get the elements in their original order 
> within selector.xpath()?
> For example, with <ul><li>1</li><li>2</li><li>3</li></ul> I want [1, 2, 3], 
> but it returns [2, 3, 1].
> I use this XPath expression: //ul/li/text()
> Thanks
>

