Hernan, the PharoExtras/XPath repo has a major rewrite of your package that supports all of XPath 1.0 plus XPath 2.0 extensions such as the element() and attribute() type tests and namespace literals in name tests like '{namespaceURI}localName'. A rewrite was needed because the old lib implemented only a small subset of the spec and would loop forever on some inputs.
 
Sent: Thursday, September 01, 2016 at 3:56 PM
From: "Hernán Morales Durand" <hernan.mora...@gmail.com>
To: "Any question about pharo is welcome" <pharo-users@lists.pharo.org>
Subject: Re: [Pharo-users] Coding XPath as Smalltalk
 
 
2016-09-01 16:51 GMT-03:00 PBKResearch <pe...@pbkresearch.co.uk>:

Hi Hernan

 

I don’t understand your first question – I can’t see a connection between SPARQL and what I am doing.

 

 
You could get the Wiktionary data by querying a SPARQL endpoint http://wiktionary.dbpedia.org/sparql instead of scraping web pages (which seems more difficult)
 

 

I downloaded XPath from http://smalltalkhub.com/mc/PharoExtras/XPath/. However, I am probably using a somewhat out-of-date version; I downloaded it about a year ago.

 

 
I don't know about that version. I copied an old version from SqueakSource (with permission) and updated it from time to time, but there is not much to it. There is also an XPath2 repository which you may try.
 
Hernán
 

 

Peter

 

From: Pharo-users [mailto:pharo-users-boun...@lists.pharo.org] On Behalf Of Hernán Morales Durand
Sent: 01 September 2016 18:54
To: Any question about pharo is welcome <pharo-users@lists.pharo.org>
Subject: Re: [Pharo-users] Coding XPath as Smalltalk

 

Hi Peter,

 

2016-09-01 10:26 GMT-03:00 PBKResearch <pe...@pbkresearch.co.uk>:

Hello

 

I am using XPath as a way of dissecting web pages, especially from Wiktionary.

 

Any specific reason to not use the SPARQL endpoint?


 

Generally I get good results, but I could get useful extra flexibility by using the binary Smalltalk operators to represent XPath, as mentioned at the end of the class comment for XPath. However, the description there is very terse, and I am having difficulty seeing how to include more complex expressions, especially attribute tests.

 

Which XPath version are you using? How did you install it?


 

I have put some of my XPath expressions through the XPath compiler and looked at the output, and out of that I have found expressions which work but look very clumsy. As an example, I have used the fragment:

 

document xPath: '//div[@id=''catlinks'']//li//text()'

 

and found that an equivalent is:

 

document //'div' ?? [:node :x :y|(node attributeAt: 'id') = 'catlinks']//'li'//[:n| n isStringNode].

(I had to put two dummy arguments in the three-argument block to get it to work.)

 

Is there a more extensive explanation of the use of these binary operators? If not, could some kind person show me the most concise translation of the sample XPath above, to give me a start in working out more complex cases?

 

Many thanks for any help.

 

Peter Kenny

 

 
