Re: Parsing Newb Help

Robert Klemme Wed, 05 Sep 2012 01:12:45 -0700

On Wed, Sep 5, 2012 at 3:40 AM, 7stud -- wrote:
> I'm not at all clear what the *specific* things are that you want to
> extract from the website.
>
> In any case, you need to click on View/Source in your browser and
> examine the raw HTML to figure out what tags you need to extract and how
> to identify them.  Look at the web page in your browser, then use Find or
> Search to locate the same text in the raw HTML.
>
> Then read some basic XPath tutorials, starting here:
>
> http://www.engineyard.com/blog/2010/getting-started-with-nokogiri/


More XPath tutorials:
http://www.w3schools.com/xpath/
http://www.zvon.org/xxl/XPathTutorial/General/examples.html
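
To give a feel for what those tutorials cover, here is a minimal sketch of
a few basic XPath expressions.  It uses REXML from Ruby's standard
distribution only so that it runs without installing a gem; the page
fragment, ids and class names are invented for illustration.  Real-world
HTML is often not well-formed XML, so for an actual site a lenient parser
such as Nokogiri is the more likely choice, with the same XPath strings.

```ruby
require 'rexml/document'

# Hypothetical, well-formed page fragment (made up for this example).
html = <<~HTML
  <html>
    <body>
      <div id="content">
        <p class="intro">Hello</p>
        <p class="price">42 USD</p>
        <a href="http://example.com/next">next</a>
      </div>
    </body>
  </html>
HTML

doc = REXML::Document.new(html)

# //p selects every <p> element anywhere in the document.
paragraphs = REXML::XPath.match(doc, '//p').map(&:text)

# A predicate in [...] filters on an attribute value.
price = REXML::XPath.first(doc, '//p[@class="price"]').text

# A path can step from an element with a given id down to a child element.
anchor = REXML::XPath.first(doc, '//div[@id="content"]/a')
link = anchor.attributes['href']

puts paragraphs.inspect
puts price
puts link
```

The id and class values to put into such expressions are exactly what you
find by searching the raw source as described above.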

> Parsing HTML requires a good understanding of HTML structure, e.g.
> parents, children, siblings, etc., and CSS, e.g. classes, ids, etc.  As
> a beginner it is better to take baby steps, not jump in the deep end of
> the pool, so this project may be too hard for you.

When using Firefox, there are some useful extensions for testing XPath expressions, namely
https://code.google.com/p/xpathchecker/
http://robertnyman.com/firefinder/  (needs Firebug)
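
The parent / child / sibling terminology maps directly onto XPath.  The
following is only a sketch under assumptions: the list markup and the
"menu" / "selected" names are hypothetical, and REXML is used merely
because it ships with Ruby.  Expressions like these are the kind of thing
you can try out in the extensions above before putting them into a script.

```ruby
require 'rexml/document'

# Hypothetical markup: one parent <ul> with three <li> children.
html = <<~HTML
  <ul id="menu">
    <li class="item">first</li>
    <li class="item selected">second</li>
    <li class="item">third</li>
  </ul>
HTML

doc = REXML::Document.new(html)

# contains() matches when "selected" is one of several class names.
selected = REXML::XPath.first(doc, '//li[contains(@class, "selected")]')

# ".." steps up to the parent element.
parent = REXML::XPath.first(selected, '..')

# "/li" under the <ul> selects its child elements.
children = REXML::XPath.match(doc, '//ul[@id="menu"]/li').map(&:text)

# The sibling axes select elements sharing the same parent.
following = REXML::XPath.first(selected, 'following-sibling::li').text
preceding = REXML::XPath.first(selected, 'preceding-sibling::li').text

puts "selected:  #{selected.text}"
puts "parent id: #{parent.attributes['id']}"
puts "children:  #{children.inspect}"
puts "siblings:  #{preceding} / #{following}"
```

Note that contains() is a plain substring test, so it would also match a
class such as "unselected"; that is good enough for a sketch but worth
keeping in mind on a real page.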

Kind regards

robert

-- 
remember.guy do |as, often| as.you_can - without end
http://blog.rubybestpractices.com/
