On Wed, Sep 5, 2012 at 3:40 AM, 7stud -- <[email protected]> wrote:
> I'm not at all clear what the *specific* things are that you want to
> extract from the website.
>
> In any case, you need to click on View/Source in your browser and
> examine the raw html to figure out what tags you need to extract and how
> to identify them. Look at the web page in your browser then use Find or
> Search to locate the same text in the raw html.
>
> Then read some basic xpath tutorials starting here:
>
> http://www.engineyard.com/blog/2010/getting-started-with-nokogiri/
More at

http://www.w3schools.com/xpath/
http://www.zvon.org/xxl/XPathTutorial/General/examples.html

> Parsing html requires a good understanding of html structure, e.g.
> parents, children, siblings, etc., and css, e.g. classes, ids, etc. As
> a beginner it is better to take baby steps, not jump in the deep end of
> the pool, so this project may be too hard for you.

When using Firefox there are some useful extensions for XPath testing, namely

https://code.google.com/p/xpathchecker/
http://robertnyman.com/firefinder/ (needs Firebug)

Kind regards

robert

-- 
remember.guy do |as, often| as.you_can - without end
http://blog.rubybestpractices.com/
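
To make the XPath advice above concrete, here is a minimal sketch of the workflow: look at the raw markup, decide which tags and attributes identify the text you want, then write an XPath expression for them. The page fragment, the id "listing", the class "title" and the helper extract_titles are all made up for illustration; they are not from the website discussed in this thread. The sketch uses REXML only so that it runs without installing a gem, which means the input must be well-formed. For real-world html, Nokogiri (as in the tutorial linked above) is the better fit, and the same expression can be passed to its xpath method.

```ruby
require "rexml/document"

# Hypothetical, well-formed page fragment standing in for the raw html
# you would see via View/Source in the browser.
XHTML = <<~HTML
  <html>
    <body>
      <div id="listing">
        <p class="title">First item</p>
        <p class="price">10</p>
        <p class="title">Second item</p>
      </div>
      <div id="footer">
        <p class="title">Not wanted</p>
      </div>
    </body>
  </html>
HTML

# Return the text of every <p class="title"> that is a direct child
# of <div id="listing">. The id and class names are assumptions.
def extract_titles(markup)
  doc = REXML::Document.new(markup)
  nodes = REXML::XPath.match(doc, "//div[@id='listing']/p[@class='title']")
  nodes.map { |node| node.text }
end

puts extract_titles(XHTML)
```

The expression reads as "any div whose id is listing, then its p children whose class is title", which is the parent/child and id/class vocabulary mentioned in the quoted text. The paragraph inside the footer div is skipped because its parent does not match.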
