Re: [Pharo-users] [ANN] XMLParserHTML moved to GitHub

2020-01-08 Thread Ben Coman
On Wed, 8 Jan 2020 at 06:32, LawsonEnglish wrote: > “Simple inspect” works fine. > > The trace is: > > UndefinedObject(Object)>>doesNotUnderstand: #new > Message>>sentTo: > UndefinedObject(Object)>>doesNotUnderstand: #new > XMLDocumentHighlightDefaults class(XMLHighlightDefaults >

Re: [Pharo-users] [ANN] XMLParserHTML moved to GitHub

2020-01-07 Thread LawsonEnglish
is defined as a workspace variable as soon as you >>> evaluate >>>- the contents of "ingredientsXML" is preserved over different >>> evaluations within the workspace / playground >>>- you can use only "ingredientsXML" within this pla

Re: [Pharo-users] [ANN] XMLParserHTML moved to GitHub

2020-01-07 Thread Sven Van Caekenberghe
as slipshod in my drafting – I was in a hurry. >> Instead of saying ‘can screw things up’ I should have said ‘can produce >> counter-intuitive results’, as exemplified by the fact that, in your first >> example, ‘ingredientsXML’ can mean different things dependin

Re: [Pharo-users] [ANN] XMLParserHTML moved to GitHub

2020-01-07 Thread LawsonEnglish
time. > > From: Pharo-users On Behalf Of > LawsonEnglish > Sent: 07 January 2020 20:55 > To: Any question about pharo is welcome > Subject: Re: [Pharo-users] [ANN] XMLParserHTML moved to GitHub > > I deleted the playground and entered the text thusly > > ingredie

Re: [Pharo-users] [ANN] XMLParserHTML moved to GitHub

2020-01-07 Thread Sven Van Caekenberghe
n you can just inspect it or >> evaluate the second line in the same playground. >> >> If you like you can open a second playground which can have its own >> "ingredientsXML" workspace variable. >> >> Workspace variables (or "playground variables"

Re: [Pharo-users] [ANN] XMLParserHTML moved to GitHub

2020-01-07 Thread LawsonEnglish
> If you like you can open a second playground which can have its own > "ingredientsXML" workspace variable. > > Workspace variables (or "playground variables") are convenient for > experimenting - as they are preserved - but > yes they might confuse you w

Re: [Pharo-users] [ANN] XMLParserHTML moved to GitHub

2020-01-07 Thread PBKResearch
er you execute it all in one go or a line at a time. From: Pharo-users On Behalf Of LawsonEnglish Sent: 07 January 2020 20:55 To: Any question about pharo is welcome Subject: Re: [Pharo-users] [ANN] XMLParserHTML moved to GitHub I deleted the playground and entered the text

Re: [Pharo-users] [ANN] XMLParserHTML moved to GitHub

2020-01-07 Thread LawsonEnglish
thout the first line. Compare Torsten's > code. > > HTH > > Peter Kenny > > -Original Message- > From: Pharo-users On Behalf Of Torsten > Bergmann > Sent: 07 January 2020 07:47 > To: pharo-users@lists.pharo.org > Cc: pharo-users@lists.pharo.org > Sub

Re: [Pharo-users] [ANN] XMLParserHTML moved to GitHub

2020-01-07 Thread Torsten Bergmann
ing - as they are preserved - but yes they might confuse you when you can't remember what was done with them last. Bye T. > Sent: Tuesday, 07 January 2020 at 09:55 > From: "PBKResearch" > To: "'Any question about pharo is welcome'" > Subject: Re: [Phar

Re: [Pharo-users] [ANN] XMLParserHTML moved to GitHub

2020-01-07 Thread PBKResearch
On Behalf Of Torsten Bergmann Sent: 07 January 2020 07:47 To: pharo-users@lists.pharo.org Cc: pharo-users@lists.pharo.org Subject: Re: [Pharo-users] [ANN] XMLParserHTML moved to GitHub Works without a problem (Pharo 8 on Windows), see attached. So it looks like a local problem. Just check

Re: [Pharo-users] [ANN] XMLParserHTML moved to GitHub

2020-01-06 Thread LawsonEnglish
Torsten Bergmann wrote > Hi, > > > You can load using > >Metacello new > baseline: 'XMLParserHTML'; > repository: 'github://pharo-contributions/XML-XMLParserHTML/src'; > load. > > > Bye > T. Hi, I'm trying to use the sample code in the pharo screen scraping booklet —

Re: [Pharo-users] [ANN] XMLParserHTML moved to GitHub

2019-11-30 Thread PBKResearch
:43 To: pharo-users@lists.pharo.org Subject: Re: [Pharo-users] [ANN] XMLParserHTML moved to GitHub cedreek wrote > To me, far better than using Soup. Ah, interesting! I use Soup almost exclusively. What did you find superior about XMLParserHTML? I may give it a try... cedreek wrote >

Re: [Pharo-users] [ANN] XMLParserHTML moved to GitHub

2019-11-30 Thread Cédrick Béler
I couldn’t get it from Zn as (I think) there are some JS libs that defer the full rendering. I have the same problem with a site in France (leboncoin). They use https://datadome.co to complicate web scraping. So a headless browser is the only solution I know. Cheers,

Re: [Pharo-users] [ANN] XMLParserHTML moved to GitHub

2019-11-30 Thread Esteban Maringolo
Why use Chrome instead of ZnClient? To get a "real" render of the content? (including JS and whatnot). Regards! Esteban A. Maringolo On Sat, Nov 30, 2019 at 8:11 PM Cédrick Béler wrote: > > > > > > Also interesting! Any publicly available examples? How does one load "Google > > chrome pharo

Re: [Pharo-users] [ANN] XMLParserHTML moved to GitHub

2019-11-30 Thread Cédrick Béler
> > Also interesting! Any publicly available examples? How does one load "Google > chrome pharo integration"? https://github.com/astares/Pharo-Chrome https://github.com/akgrant43/Pharo-Chrome Cheers, Cédrick

Re: [Pharo-users] [ANN] XMLParserHTML moved to GitHub

2019-11-30 Thread Cédrick Béler
> cedreek wrote >> To me, far better than using Soup. > > Ah, interesting! I use Soup almost exclusively. What did you find superior > about XMLParserHTML? I may give it a try... > It’s mainly XPath, which I find easier than navigating the HTML tree with Soup or even the xmlHtmlparser. I

Re: [Pharo-users] [ANN] XMLParserHTML moved to GitHub

2019-11-30 Thread Sean P. DeNigris
cedreek wrote > To me, far better than using Soup. Ah, interesting! I use Soup almost exclusively. What did you find superior about XMLParserHTML? I may give it a try... cedreek wrote > Google chrome pharo integration helps to scrape complex full-JS web > sites like Google ;) Also

Re: [Pharo-users] [ANN] XMLParserHTML moved to GitHub

2019-11-29 Thread Esteban Maringolo
Great! I just added a link to the README.md of the project and created a PR, because it is very likely that if you're parsing HTML you're doing some scraping. :-) Esteban A. Maringolo On Fri, Nov 29, 2019 at 2:18 PM Cédrick Béler wrote: > > Stef and others wrote this book a while ago: > >

Re: [Pharo-users] [ANN] XMLParserHTML moved to GitHub

2019-11-29 Thread Cédrick Béler
Stef and others wrote this book a while ago: http://books.pharo.org/booklet-Scraping/html/scrapingbook.html Basically XMLHtmlParser + XPath. To me, far better than using Soup. Google chrome pharo integration helps to scrape complex full-JS web sites like Google ;) Cheers, Cedrick > Le 29

Re: [Pharo-users] [ANN] XMLParserHTML moved to GitHub

2019-11-29 Thread Esteban Maringolo
Thank you Torsten, I wasn't aware of this tool. I'm already using it to scrape content from a website and feed a Pharo-driven system :) The XML integration in the Inspector is great too. Regards! Esteban A. Maringolo On Tue, Nov 19, 2019 at 8:40 AM Torsten Bergmann wrote: > > Hi, > > the STHub

[Pharo-users] [ANN] XMLParserHTML moved to GitHub

2019-11-19 Thread Torsten Bergmann
Hi, the STHub -> PharoExtras project "XMLParserHTML" has now been moved from http://smalltalkhub.com/#!/~PharoExtras/XMLParserHTML to https://github.com/pharo-contributions/XML-XMLParserHTML including the FULL HISTORY. The old STHub repo was marked as obsolete - but is linking to the new one. I've