[Tutor] How to read websites - Web Scraping or Parsing in python

2012-06-13 Thread Surya K

Hi, I am trying to write a Python program that reads any webpage's content. Taking a blog as an example, I'd like to read all the content written by its author. Each blog or site will have a different kind of HTML/XML, whether it's Blogger, WordPress, TypePad or anything else. I thought of
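For the first half of the question (getting a page's markup into Python at all), a minimal sketch using only the standard library might look like the following. It assumes Python 3; the function name `fetch_html`, the timeout and the User-Agent string are illustrative choices, not anything taken from the thread.

```python
import urllib.request


def fetch_html(url, timeout=10):
    """Download a page and return its markup as text (hypothetical helper)."""
    # Some sites reject urllib's default User-Agent, so send a generic one.
    request = urllib.request.Request(url, headers={"User-Agent": "Mozilla/5.0"})
    with urllib.request.urlopen(request, timeout=timeout) as response:
        # Use the charset the response declares, falling back to UTF-8.
        charset = response.headers.get_content_charset() or "utf-8"
        return response.read().decode(charset, errors="replace")
```

This only retrieves the raw HTML. Working out which part of that HTML is the author's article is a separate step.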

Re: [Tutor] How to read websites - Web Scraping or Parsing in python

2012-06-13 Thread Alan Gauld

On 13/06/12 10:06, Surya K wrote:
As my target webpage could be any page on the web, and each website's designers could have designed it in their own fashion using different class names, I am unable to figure out how to read the article content in a webpage.

This is always the problem with scraping
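One way to sidestep site-specific class names is to ignore them and look only at generic tags. The sketch below is not the approach recommended in the reply above (which is cut off); it is a rough heuristic that assumes article text mostly lives in <p> elements, and it will also pick up comments or sidebar paragraphs that happen to use <p>. It assumes Python 3, and the names `ParagraphText` and `extract_paragraphs` are made up for illustration.

```python
from html.parser import HTMLParser


class ParagraphText(HTMLParser):
    """Collect the text of every <p> element, whatever its class name."""

    SKIP = {"script", "style"}  # their contents are code, not prose

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.paragraphs = []
        self._buffer = []
        self._in_p = False
        self._skipping = False

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIP:
            self._skipping = True
        elif tag == "p":
            self._flush()  # an unclosed <p> ends where the next one begins
            self._in_p = True

    def handle_endtag(self, tag):
        if tag in self.SKIP:
            self._skipping = False
        elif tag == "p":
            self._flush()
            self._in_p = False

    def handle_data(self, data):
        if self._in_p and not self._skipping:
            self._buffer.append(data)

    def _flush(self):
        # Join the collected pieces and collapse runs of whitespace.
        text = " ".join("".join(self._buffer).split())
        if text:
            self.paragraphs.append(text)
        self._buffer = []


def extract_paragraphs(html):
    """Return the text of each paragraph found in an HTML string."""
    parser = ParagraphText()
    parser.feed(html)
    parser.close()
    parser._flush()  # keep a trailing paragraph that was never closed
    return parser.paragraphs
```

The same function can be run over pages from different blog platforms because it never inspects a class or id attribute. The price is precision: it cannot tell the post body apart from other paragraph text on the page.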

Re: [Tutor] How to read websites - Web Scraping or Parsing in python

2012-06-13 Thread Yashwin Kanchan

Hi Surya,

Have you tried using IE automation (assuming you are using Windows)? I used the library from http://www.mayukhbose.com/python/IEC/index.php

    import IEC
    ie = IEC.IEController()
    ie.Navigate('http://knolzone.com/unlock-hidden-themes-in-windows-7-and-other-useful-tips-part-5-of-7/')