2013/9/5 John Porubek <jporu...@gmail.com>:
> I'm re-posting this message since I didn't get an answer the first
> time and I'm nothing if not persistent!
>
> Let me recast the original question differently. Is there a fairly
> easy way, using Factor, to scrape a blog website for a list of blog
> titles? This seems like it would be a really useful tool for finding
> information, assuming the author chose fairly meaningful titles.
>
> I'm no expert in web technology, but it occurs to me, as I think about
> this problem, that it might be kind of difficult in the general case.
> I have no idea how similar different blogs might be. For now, I'm most
> interested in the special case of John Benediktsson's "Re:Factor"
> blog.

Hi John,

Yours truly has written a tutorial on scraping with Factor, available here:

https://github.com/bjourne/playground-factor/wiki/Parsing-gmane-with-factor

It's very much a rough work in progress, though, and I might never
finish it because I get bored quickly. Maybe you can salvage some
information from it. The general strategy for scraping is:

    IN: USE: html.parser.analyzer
    IN: "http://www.factorcode.org/" scrape-html
    ! Get stuff from the tag seq, e.g.
    IN: "title" find-between-first first text>>
    "Factor programming language"
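
If you want the same steps outside the listener, here is a minimal
sketch that wraps them as words. The names title-text and page-title
are made up for illustration and are not part of any vocabulary. The
first snippet parses a literal string, so it can be tried without
network access:

    USING: accessors html.parser html.parser.analyzer kernel
    prettyprint sequences ;

    ! Hypothetical helper: text of the first <title> element in a
    ! vector of parsed tags.
    : title-text ( vector -- str )
        "title" find-between-first first text>> ;

    ! Hypothetical helper: fetch a page and return its title.
    ! scrape-html leaves the response headers and the tag vector on
    ! the stack, so nip drops the headers.
    : page-title ( url -- str )
        scrape-html nip title-text ;

    ! Offline check against a literal HTML string.
    "<html><head><title>Hello</title></head></html>"
    parse-html title-text .

For a list of blog titles you would look for whatever tag or class the
blog wraps its post titles in, which I haven't checked for Re:Factor.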


-- 
mvh/best regards Björn Lindqvist
http://www.bjornlindqvist.se/

_______________________________________________
Factor-talk mailing list
Factor-talk@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/factor-talk
