Hi Torsten and All,

Quick introduction for those not familiar with Pharo-Chrome:
Pharo-Chrome enables Pharo to control and query Chrome / Chromium, in
particular to retrieve the DOM of a page. This is useful because many
modern pages are just a template which then loads some JavaScript to
build the DOM asynchronously, meaning that the ZnEasy / Soap
combination doesn't get the bulk of the information on a page.

Pharo-Chrome is now mostly working, i.e. it is possible to open a
connection to Chrome, navigate to a requested URL, wait for it to load,
retrieve the DOM and then navigate the DOM using a subset of the Soap
API, e.g. #findAllStrings:, #findAllTags:, #attributeAt:, etc.

GoogleChrome class>>exampleNavigation has been updated to retrieve the
DOM from http://pharo.org.

GoogleChrome class>>get: is analogous to ZnEasy class>>get:, although
it returns a ChromeNode, not an HTML string.

I wasn't able to get rid of the delay while waiting for the page to
finish loading. This actually makes sense since, as mentioned above,
many modern pages build the DOM asynchronously, so there's no clear
indication of when it is complete. The default delay is currently 2000
milliseconds, which is about twice the maximum I saw needed (983ms),
but this can be changed (ChromeTabPage>>pageLoadDelay:).

I had three use cases for this library: one which works with
ZnEasy+Soap, one that used to work with ZnEasy+Soap but no longer does
due to a page redesign, and one which I hadn't got working before. All
three are working now.

Unlike Soap, I've currently modelled the nodes as a single class, and
have only implemented a subset of Soap's methods, but it is enough for
what I need.

I've introduced a dependency on the Beacon logging framework. I find it
useful, but can remove it if you don't want the additional dependency.
(I'm planning to add some GoogleChrome-specific logging classes and use
those to better understand what pageLoadDelay should be.)

I was focussed on trying to understand the events that Chrome
generates, so documentation is still lacking (read "missing" :-)).
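
On the fixed page-load delay described above: one possible alternative
is to poll some cheap measure of the DOM and treat the page as loaded
once that measure has stopped changing for a short quiet period, keeping
the existing delay as an upper bound. The sketch below is in Python
rather than Pharo so that it is self-contained. Everything in it is
hypothetical: `wait_until_stable`, `sample`, and the parameter names are
not part of Pharo-Chrome, and `sample` stands in for whatever query
(e.g. a node count) would be cheap to repeat.

```python
import time
from typing import Callable


def wait_until_stable(
    sample: Callable[[], int],
    quiet_ms: int = 300,
    timeout_ms: int = 2000,
    poll_ms: int = 50,
    clock_ms: Callable[[], float] = lambda: time.monotonic() * 1000,
    sleep_ms: Callable[[float], None] = lambda ms: time.sleep(ms / 1000),
) -> bool:
    """Poll sample() until its value is unchanged for quiet_ms.

    Returns True if the value settled, False if timeout_ms elapsed first.
    clock_ms and sleep_ms are injectable so the loop can be tested
    without real waiting.
    """
    start = clock_ms()
    last_value = sample()
    last_change = start
    while True:
        now = clock_ms()
        if now - last_change >= quiet_ms:
            return True   # nothing changed for a full quiet period
        if now - start >= timeout_ms:
            return False  # still changing; caller falls back to its default
        sleep_ms(poll_ms)
        value = sample()
        if value != last_value:
            last_value = value
            last_change = clock_ms()
```

This is only a heuristic: a page that keeps updating (tickers, ads)
would always run to the timeout, so it would shorten the typical wait
without removing the need for a maximum delay.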
I'll generate a pull request after some more testing, tweaking and
documenting, but if you would like to take a look, the code is
available at:

https://github.com/akgrant43/Pharo-Chrome/tree/development

I haven't yet updated BaselineOfChrome with the Beacon dependency.

I did merge in your two commits from May 23.

If you, or anyone else, finds this useful, I welcome any feedback.

P.S. I've just realised that I need to tidy up #sendMessage:,
#sendMessageDictionary and #sendMessageDictionary:wait:. I'll do that
as part of the general tidy up.

Cheers,
Alistair


On Sun, May 21, 2017 at 09:37:56PM +0000, Alistair Grant wrote:
> Hi Torsten,
>
> On Fri, May 19, 2017 at 09:20:48PM +0000, Alistair Grant wrote:
> >
> > On Fri, May 19, 2017 at 10:50:41PM +0200, Torsten Bergmann wrote:
> > > Hi Alistair,
> > >
> > > can't look right now but two things:
> > >
> > > - there are also events in the protocol - if we could hook Pharo
> > >   into them this would solve the problem without abusing delay
> > >   (because then you will get informed when the page loading is
> > >   finished)
> >
> > That would be great.  It will be a while before I get a chance to look
> > at this (I want to finish some proposed changes to the FileSystem
> > packages first), but I'll try and include it then.
>
> I've got basic event listening working.  It requires that all messages
> are read asynchronously, so I'll need to change the interface to handle
> that.
>
> Knowing when a page has finished loading isn't quite as simple as
> looking for an event - a page can consist of multiple frames, and
> notifications are delivered for each frame.  The page I'm interested in
> has around 25 frames.
>
> If anyone has a good design pattern for writing an asynchronous
> WebSocket client please let me know, I don't have anything concrete in
> mind.
>
> Thanks,
> Alistair
>
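
On the question in the quoted message about a pattern for an
asynchronous WebSocket client: one common shape is a single reader loop
that receives every message and routes it. Messages carrying an "id" are
replies and resolve the request that is waiting on that id; messages
without one are events. The sketch below, in Python with asyncio, also
tracks per-frame load events so that "page loaded" means "no frame is
still loading". It is an illustration under assumptions, not
Pharo-Chrome's design: `ProtocolClient` and its methods are hypothetical
names, two in-memory queues stand in for the WebSocket, and the fake
browser in `demo` is invented for the example. The event names
Page.frameStartedLoading and Page.frameStoppedLoading are the ones the
Chrome DevTools Protocol defines.

```python
import asyncio
import json
from typing import Any, Dict, Optional, Set


class ProtocolClient:
    """One reader loop routes replies to waiting requests and events to state."""

    def __init__(self, incoming: asyncio.Queue, outgoing: asyncio.Queue) -> None:
        self._incoming = incoming      # stand-in for websocket receive
        self._outgoing = outgoing      # stand-in for websocket send
        self._next_id = 0
        self._pending: Dict[int, asyncio.Future] = {}
        self._loading_frames: Set[str] = set()
        self._idle = asyncio.Event()   # set while no frame is loading
        self._idle.set()

    async def send(self, method: str, params: Optional[dict] = None) -> Any:
        """Send a request and wait for the reply with the matching id."""
        self._next_id += 1
        msg_id = self._next_id
        future = asyncio.get_running_loop().create_future()
        self._pending[msg_id] = future
        await self._outgoing.put(
            json.dumps({"id": msg_id, "method": method, "params": params or {}}))
        return await future

    async def read_loop(self) -> None:
        """Read every incoming message; None is used here as a close marker."""
        while True:
            raw = await self._incoming.get()
            if raw is None:
                break
            message = json.loads(raw)
            if "id" in message:
                future = self._pending.pop(message["id"], None)
                if future is not None and not future.done():
                    if "error" in message:
                        future.set_exception(RuntimeError(str(message["error"])))
                    else:
                        future.set_result(message.get("result", {}))
            else:
                self._handle_event(message.get("method"), message.get("params", {}))

    def _handle_event(self, method: Optional[str], params: dict) -> None:
        if method == "Page.frameStartedLoading":
            self._loading_frames.add(params["frameId"])
            self._idle.clear()
        elif method == "Page.frameStoppedLoading":
            self._loading_frames.discard(params["frameId"])
            if not self._loading_frames:
                self._idle.set()

    async def wait_for_all_frames(self, timeout: float) -> bool:
        """True once no frame is loading, False if the timeout expires first."""
        try:
            await asyncio.wait_for(self._idle.wait(), timeout)
            return True
        except asyncio.TimeoutError:
            return False


async def demo() -> bool:
    incoming: asyncio.Queue = asyncio.Queue()
    outgoing: asyncio.Queue = asyncio.Queue()
    client = ProtocolClient(incoming, outgoing)
    reader = asyncio.create_task(client.read_loop())

    async def fake_browser() -> None:
        # Invented traffic: two frames start, the reply arrives, both stop.
        request = json.loads(await outgoing.get())
        for frame in ("main", "ad-1"):
            await incoming.put(json.dumps(
                {"method": "Page.frameStartedLoading", "params": {"frameId": frame}}))
        await incoming.put(json.dumps(
            {"id": request["id"], "result": {"frameId": "main"}}))
        for frame in ("ad-1", "main"):
            await incoming.put(json.dumps(
                {"method": "Page.frameStoppedLoading", "params": {"frameId": frame}}))

    browser = asyncio.create_task(fake_browser())
    result = await client.send("Page.navigate", {"url": "http://pharo.org"})
    loaded = await client.wait_for_all_frames(timeout=1.0)
    await browser
    await incoming.put(None)
    await reader
    return loaded and result == {"frameId": "main"}
```

Running `asyncio.run(demo())` exercises the whole round trip. One known
gap in this sketch: if `wait_for_all_frames` is called before any frame
has reported that it started loading, it returns immediately, so a real
client would need a rule for that case (and a timeout remains useful as
a fallback for pages that never go quiet).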