In my experience, the easiest way is to run the website through tidy, load the result into a DOMDocument, and query it with XPath.
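A rough sketch of those three steps, in case it helps. The sample markup, the `scrape_text()` helper name, and the XPath query are made up for illustration; tidy is only used when the extension happens to be installed.

```php
<?php
// Hypothetical sample markup standing in for a fetched page.
$html = '<html><body>'
      . '<div class="item"><a href="/a">First</a></div>'
      . '<div class="item"><a href="/b">Second</a></div>'
      . '</body></html>';

/**
 * Return the trimmed text of every node matched by $query in $html.
 * (Illustrative helper, not part of any library.)
 */
function scrape_text(string $html, string $query): array
{
    // Step 1: clean up the markup with tidy, if the extension is loaded.
    if (function_exists('tidy_repair_string')) {
        $clean = tidy_repair_string($html, ['wrap' => 0], 'utf8');
        if ($clean !== false) {
            $html = $clean;
        }
    }

    // Step 2: load it into a DOMDocument, collecting libxml warnings
    // about sloppy HTML instead of printing them.
    $previous = libxml_use_internal_errors(true);
    $doc = new DOMDocument();
    $doc->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    // Step 3: pull out the nodes you want with XPath.
    $xpath = new DOMXPath($doc);
    $results = [];
    foreach ($xpath->query($query) as $node) {
        $results[] = trim($node->textContent);
    }
    return $results;
}

$titles = scrape_text($html, '//div[@class="item"]/a');
print_r($titles);
```

For a real page you would replace the sample string with the fetched HTML, for example from `file_get_contents($url)` if `allow_url_fopen` is enabled on your setup.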
XPath patterns are SO much easier to read and write than regex, and more resistant to changes to the website (if you write them correctly). You can also use regex within XPath if you ever need it (with PHP's DOMXPath, by exposing preg_match through registerPhpFunctions()).

Alvaro

Nathan Lane wrote:
> I want to make what in effect is a website scraper using PHP, but it isn't
> obvious how this would best be done. I've tried using DOMDocument and I'm
> not sure if that's the best option or not. I'd really like to use something
> where I could use XPath to get the elements out that I want. Recently I
> wrote a similar program in C# that I call HttpAnalyzer. Could I just use
> that with PHP (i.e. call it from PHP) to get what I'm looking for? Any
> suggestions?
