Re: [Factor-talk] Using find-by-class in html.parser.analyzer

2013-07-15 Thread Björn Lindqvist
2013/7/15 Alex Vondrak ajvond...@gmail.com


 In general, I'm not sure if html.parser is very mature compared to, say,
 the XML vocab: http://docs.factorcode.org/content/article-xml.html


The major difference is that html.parser handles invalid HTML, which the xml
vocab doesn't. So for parsing pages on the web, html.parser is what you
have to use. It would be great if someone wrote a wrapper for libxml2, which
is the best XML/HTML parsing library ever. Then you wouldn't need to use
different interfaces for XML and HTML.


--
mvh Björn
--
See everything from the browser to the database with AppDynamics
Get end-to-end visibility with application monitoring from AppDynamics
Isolate bottlenecks and diagnose root cause in seconds.
Start your free trial of AppDynamics Pro today!
http://pubads.g.doubleclick.net/gampad/clk?id=48808831iu=/4140/ostg.clktrk___
Factor-talk mailing list
Factor-talk@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/factor-talk


Re: [Factor-talk] Using find-by-class in html.parser.analyzer

2013-07-15 Thread Mark Green
Hi Alex,

Thanks very much for your reply. I think my issue might come from the fact
that find-by-class and its XML equivalent require the entire class string to
match, which is technically not correct: an element can have several CSS
classes, so it should be a "string contains as a whole word" type of search,
not an exact match. Is there any function that does a match of this type?
I expect it could be written up as a quotation.

Mark


Re: [Factor-talk] Using find-by-class in html.parser.analyzer

2013-07-15 Thread Alex Vondrak
On Mon, Jul 15, 2013 at 12:11 PM, Mark Green m...@antelope.nildram.co.uk wrote:

 I expect it could be written up as a quotation.


Maybe something like this?

```
IN: scratchpad "<html><body><div class=\"food is good\"><p>mmm, candy bar</p></div></body></html>" parse-html
[ "class" attribute "foo" swap subseq? ] find-all .
{
    {
        2
        T{ tag
            { name "div" }
            { attributes H{ { "class" "food is good" } } }
        }
    }
}
```

Or like this?

```
IN: scratchpad "<html><body><div class=\"food is good\"><p>mmm, candy bar</p></div></body></html>" parse-html
[ "class" attribute " " split "food" swap member? ] find-all .
{
    {
        2
        T{ tag
            { name "div" }
            { attributes H{ { "class" "food is good" } } }
        }
    }
}
```

Just some ideas; I'm sure this could/should all be factored out into the
proper helper words.
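
For instance, the whole-word version might be factored out roughly like this. It is only a sketch: `has-class?` and `find-all-by-class` are hypothetical names that do not exist in html.parser.analyzer, and it assumes `attribute` returns `f` for tags without a class attribute, as the examples above suggest.

```factor
USING: html.parser html.parser.analyzer kernel sequences splitting ;
IN: scratchpad

! Hypothetical helper: true if the tag's class attribute contains
! the given class name as a whole space-separated token.
: has-class? ( tag class -- ? )
    swap "class" attribute " " split member? ;

! Hypothetical helper: every { index tag } pair whose tag carries
! the given class, built on the existing find-all word.
: find-all-by-class ( vector class -- alist )
    [ has-class? ] curry find-all ;
```

With these defined, `parse-html "food" find-all-by-class .` on the string above should select the same div, while `"foo"` should no longer match, since it is not one of the whole class tokens.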

Regards,
--Alex Vondrak


[Factor-talk] Using find-by-class in html.parser.analyzer

2013-07-14 Thread Mark Green
Hi,

Is there any documentation for find-by-class in html.parser.analyzer? I'm
not sure what it does. It doesn't seem to search for elements with a given
value in the class attribute and I'm not sure how it would return them
anyway (is it a filter?)

Thanks!


Re: [Factor-talk] Using find-by-class in html.parser.analyzer

2013-07-14 Thread Alex Vondrak
No docs apparently (and unfortunately), but I suppose you can get a feel
for what the words do by looking at the source + tests.  That's what I've
done to give you this reply, anyhow.  :-)

I'm not sure how you were invoking the word before to not see the desired
behavior---I'm guessing probably on a raw HTML string, like

```
"<html><body><div class=\"foo\"><p>bar</p></div></body></html>"
"foo" find-by-class
```

which results in a cryptic error.  It turns out the word is actually
supposed to be called on a vector of `tag` tuples, which you can generate
using the `parse-html` word from html.parser:

```
IN: scratchpad "<html><body><div class=\"foo\"><p>bar</p></div></body></html>" parse-html
"foo" find-by-class

--- Data stack:
2
T{ tag f "div" H{ ~array~ } f f }
```

Because it uses `find`, this leaves two values on the stack: the index of
the element, and the element itself---a `tag` instance.
http://docs.factorcode.org/content/word-find,sequences.html
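
As a quick illustration of that two-value result, here is `find` on a plain array. This is a minimal sketch with made-up data, not something taken from the html.parser source:

```factor
USING: math sequences ;

! find stops at the first element for which the quotation
! returns true, and pushes that element's index and the element.
{ 1 3 4 5 6 } [ even? ] find
! The data stack now holds the index 2 and the element 4.
```

If no element matches, `find` leaves `f f` on the stack instead, so callers that only want the element usually `nip` the index away.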

To find *every* tag with the given class, there's apparently no predefined
word.  But I do spy a `find-all` helper:

```
IN: scratchpad "<html><body><div class=\"foo\"><p class=\"foo\">bar</p></div></body></html>" parse-html
[ "class" attribute "foo" = ] find-all .
{
    {
        2
        T{ tag
            { name "div" }
            { attributes H{ { "class" "foo" } } }
        }
    }
    {
        3
        T{ tag
            { name "p" }
            { attributes H{ { "class" "foo" } } }
        }
    }
}
```

In general, I'm not sure if html.parser is very mature compared to, say,
the XML vocab: http://docs.factorcode.org/content/article-xml.html

```
IN: scratchpad "<html><body><div class=\"foo\"><p class=\"foo\">bar</p></div></body></html>" string>xml
"foo" "class" deep-tags-with-attr children-tags [
    "Found: " write xml>string print
] each
Found: <div class="foo"><p class="foo">bar</p></div>
Found: <p class="foo">bar</p>
```

But there are some answers anyway.

Hope that helps,
--Alex Vondrak

On Sun, Jul 14, 2013 at 3:52 PM, Mark Green m...@antelope.nildram.co.uk wrote:

 Hi,

 Is there any documentation for find-by-class in html.parser.analyzer? I'm
 not sure what it does. It doesn't seem to search for elements with a given
 value in the class attribute and I'm not sure how it would return them
 anyway (is it a filter?)

 Thanks!




