looks perfect - thanks!

On Thu, Apr 22, 2010 at 10:45 AM, Myles Eftos <[email protected]> wrote:
> Sanitize will do it (It's based on Nokogiri)
>
> http://github.com/rgrove/sanitize/
>
> ----------------------------------------------
> Myles Eftos
> Mobile: +61-409-293-183
>
> MadPilot Productions - Created to be Different
> URL: http://www.madpilot.com.au
> Phone: +618-6424-8234
> Fax: +618-9467-6289
>
> Try our time tracking system: 88 Miles!
> http://www.88miles.net
>
> -----Original Message-----
> From: [email protected] [mailto:[email protected]]
> On Behalf Of Dan Cheail
> Sent: Thursday, 22 April 2010 08:31
> To: [email protected]
> Subject: Re: [rails-oceania] ruby gem/lib for parsing text from html?
>
> I'd say nokogiri or Hpricot (http://hpricot.com/) would be your best bets.
>
> On Thu, Apr 22, 2010 at 10:28 AM, Korny Sietsma <[email protected]> wrote:
> > Hi folks - I'm looking for something that will load a web page and extract
> > just visible text elements from it.
> > I could probably write something using nokogiri (is nokogiri still the best
> > option for html parsing?) but I was wondering if someone had already done
> > something similar.
> >
> > I don't need initially to crawl links, though this might be a later
> > requirement - maybe there's a web crawler that could do the job...
> >
> > - Korny
> >
> > --
> > Kornelis Sietsma  korny at my surname dot com
> > kornys on twitter/fb/gtalk/gwave  www.sietsma.com/korny
> > "Every jumbled pile of person has a thinking part
> > that wonders what the part that isn't thinking
> > isn't thinking of"
> >
> > --
> > You received this message because you are subscribed to the Google Groups
> > "Ruby or Rails Oceania" group.
> > To post to this group, send email to [email protected].
> > To unsubscribe from this group, send email to [email protected].
> > For more options, visit this group at
> > http://groups.google.com/group/rails-oceania?hl=en.
>
> --
> Dan Cheail
> Big Geek,

--
Kornelis Sietsma  korny at my surname dot com
kornys on twitter/fb/gtalk/gwave  www.sietsma.com/korny
"Every jumbled pile of person has a thinking part
that wonders what the part that isn't thinking
isn't thinking of"
