[Factor-talk] HTML parsing vocabulary

2010-03-08 Thread Paul Moore
Is there an HTML parsing vocabulary for Factor? I guess I might be able to get away with read-xml at a pinch, but given the state of most HTML out there, I'd feel safer not assuming anything stronger than I have to... I searched the documentation but couldn't see anything immediately (plenty for

Re: [Factor-talk] HTML parsing vocabulary

2010-03-08 Thread Doug Coleman
Hi Paul, There's an HTML parser with a word that should do exactly what you want already. USE: html.parser.analyzer "http://reddit.com" scrape-html find-hrefs I had a better HTML parser but I reverted the code by accident... Doug On Mar 8, 2010, at 12:56 PM, Paul Moore wrote: Is there a

Re: [Factor-talk] HTML parsing vocabulary

2010-03-08 Thread Doug Coleman
I forgot to mention that there's also a spider library in extra/spider that might be useful. On Mar 8, 2010, at 12:56 PM, Paul Moore wrote: Is there a HTML parsing vocabulary for Factor? I guess I might be able to get away with read-xml at a pinch, but given the state of most HTML out

Re: [Factor-talk] HTML parsing vocabulary

2010-03-08 Thread Samuel Tardieu
2010/3/8 Doug Coleman doug.cole...@gmail.com I had a better html parser but I reverted the code by accident... Incidentally, if you happen to revert some code that you had previously checked in, git reflog will allow you to get back any version for 90 days by default (git never deletes

Re: [Factor-talk] HTML parsing vocabulary

2010-03-08 Thread Paul Moore
On 8 March 2010 19:16, Doug Coleman doug.cole...@gmail.com wrote: Hi Paul, There's an html parser with a word that should do exactly what you want already. USE: html.parser.analyzer "http://reddit.com" scrape-html find-hrefs Ah, got it. That looks like just what I want! Thanks a lot.