Re: [Factor-talk] Using find-by-class in html.parser.analyzer
2013/7/15 Alex Vondrak ajvond...@gmail.com:

> In general, I'm not sure if html.parser is very mature compared to, say, the XML vocab: http://docs.factorcode.org/content/article-xml.html

The major difference is that html.parser handles invalid HTML, which the XML vocab doesn't. So for parsing pages on the web, html.parser is what you have to use.

It would be great if someone wrote a wrapper for libxml2, which is the best XML/HTML parsing library ever. Then you wouldn't need to use different interfaces for XML and HTML.

--
mvh Björn
Re: [Factor-talk] Using find-by-class in html.parser.analyzer
Hi Alex,

Thanks very much for your reply. I think my issue might come from the fact that find-by-class, or its XML equivalent, requires the entire class string to match, which is technically not correct: an element can have several CSS classes, so it should be a "string contains as a whole word" search, not an exact match.

Is there any function that does a match of this type? I expect it could be written up as a quotation.

Mark
Re: [Factor-talk] Using find-by-class in html.parser.analyzer
On Mon, Jul 15, 2013 at 12:11 PM, Mark Green m...@antelope.nildram.co.uk wrote:

> I expect it could be written up as a quotation.

Maybe something like this? It matches any tag whose class attribute contains the substring:

```
IN: scratchpad "<html><body><div class=\"food is good\"><p>mmm, candy bar</p></div></body></html>" parse-html [ "class" attribute "foo" swap subseq? ] find-all .
{
    {
        2
        T{ tag { name "div" } { attributes H{ { "class" "food is good" } } } }
    }
}
```

Or like this? It splits the class attribute on spaces and matches whole class names only:

```
IN: scratchpad "<html><body><div class=\"food is good\"><p>mmm, candy bar</p></div></body></html>" parse-html [ "class" attribute " " split "food" swap member? ] find-all .
{
    {
        2
        T{ tag { name "div" } { attributes H{ { "class" "food is good" } } } }
    }
}
```

Just some ideas; I'm sure this could/should all be factored out into the proper helper words.

Regards,
--Alex Vondrak
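The second idea above could be factored into helper words roughly as follows. This is only a sketch: `has-class?` and `find-all-by-class` are hypothetical names, not words that exist in html.parser.analyzer, and it assumes `attribute` has the effect ( tag name -- value/f ) used in the snippets in this thread.

```factor
USING: html.parser html.parser.analyzer kernel sequences splitting ;
IN: scratchpad

! Hypothetical helper: true if the tag's class attribute contains
! class-name as a whole, space-separated word.
: has-class? ( tag class-name -- ? )
    swap "class" attribute " " split member? ;

! Hypothetical helper: every { index tag } pair whose tag carries
! class-name, using the existing find-all word.
: find-all-by-class ( vector class-name -- alist )
    [ has-class? ] curry find-all ;
```

With these loaded, `parse-html "food" find-all-by-class .` on the example string should behave like the second quotation, while a search for "foo" should no longer match the "food is good" div.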
[Factor-talk] Using find-by-class in html.parser.analyzer
Hi,

Is there any documentation for find-by-class in html.parser.analyzer? I'm not sure what it does. It doesn't seem to search for elements with a given value in the class attribute, and I'm not sure how it would return them anyway (is it a filter?).

Thanks!
Re: [Factor-talk] Using find-by-class in html.parser.analyzer
No docs apparently (and unfortunately), but I suppose you can get a feel for what the words do by looking at the source + tests. That's what I've done to give you this reply, anyhow. :-)

I'm not sure how you were invoking the word before to not see the desired behavior. I'm guessing probably on a raw HTML string, like

```
"<html><body><div class=\"foo\"><p>bar</p></div></body></html>" "foo" find-by-class
```

which results in a cryptic error. It turns out the word is actually supposed to be called on a vector of `tag` tuples, which you can generate using the `parse-html` word from html.parser:

```
IN: scratchpad "<html><body><div class=\"foo\"><p>bar</p></div></body></html>" parse-html "foo" find-by-class

--- Data stack:
2
T{ tag f "div" H{ ~array~ } f f }
```

Because it uses `find`, this leaves two values on the stack: the index of the element, and the element itself (a `tag` instance). http://docs.factorcode.org/content/word-find,sequences.html

To find *every* tag with the given class, there's apparently no predefined word. But I do spy a `find-all` helper:

```
IN: scratchpad "<html><body><div class=\"foo\"><p class=\"foo\">bar</p></div></body></html>" parse-html [ "class" attribute "foo" = ] find-all .
{
    { 2 T{ tag { name "div" } { attributes H{ { "class" "foo" } } } } }
    { 3 T{ tag { name "p" } { attributes H{ { "class" "foo" } } } } }
}
```

In general, I'm not sure if html.parser is very mature compared to, say, the XML vocab: http://docs.factorcode.org/content/article-xml.html

```
IN: scratchpad "<html><body><div class=\"foo\"><p class=\"foo\">bar</p></div></body></html>" string>xml "foo" "class" deep-tags-with-attr children-tags [ "Found: " write xml>string print ] each
Found: <div class="foo"><p class="foo">bar</p></div>
Found: <p class="foo">bar</p>
```

But there are some answers anyway.

Hope that helps,
--Alex Vondrak

On Sun, Jul 14, 2013 at 3:52 PM, Mark Green m...@antelope.nildram.co.uk wrote:

> Hi, Is there any documentation for find-by-class in html.parser.analyzer? I'm not sure what it does. It doesn't seem to search for elements with a given value in the class attribute and I'm not sure how it would return them anyway (is it a filter?) Thanks!
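Since find-by-class leaves both an index and a tag on the stack, a caller that only wants the tag has to drop the index. A minimal sketch, assuming the `tag` tuple exposes its name through the usual `name>>` accessor and that a match exists (if nothing matches, `find` leaves `f f` and `name>>` would fail):

```factor
USING: accessors html.parser html.parser.analyzer kernel prettyprint ;
IN: scratchpad

"<html><body><div class=\"foo\"><p>bar</p></div></body></html>"
parse-html "foo" find-by-class  ! stack: index tag
nip                             ! drop the index, keep the tag
name>> .                        ! print the matched tag's name
```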