[Tutor] more scraping and saving

Tommy Kaas Mon, 03 Jan 2011 04:34:09 -0800

Hi - I was helped the other day in an attempt to scrape and save a simple
web page. I'm using what I learned and trying another. It should be very
simple, but I only get the first row of names saved.

Can anybody help with an explanation?

(It's a public list of names of doctors with known connections to the
pharmaceutical industry).

This time I try to start at the right table (the third on the page) by using
the class attribute. Does that make sense?

Thanks in advance - here is the code:

 

import urllib2
from BeautifulSoup import BeautifulSoup
import codecs

f = codecs.open("laeger.txt", "w", encoding="Latin-1")

soup = BeautifulSoup(urllib2.urlopen(
    'http://www.laegemiddelstyrelsen.dk/include/8806/tilladelse_laeger.asp').read())

for row in soup('table', {'class' : 'tableLeftRight3030'}):
    tds = row('td')
    output = ";".join(tds[i].string for i in (0, 1, 2, 3, 4))
    f.write(output + '\n')

f.close()
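
A possible explanation, offered as a sketch rather than a confirmed diagnosis: soup('table', {...}) returns the matching tables, not their rows, so the loop body runs once per table and only the first five cells get written. Looping over each table's 'tr' elements should give one output line per row; with BeautifulSoup that would be something like "for row in table('tr')". The example below shows the same row-by-row idea using only Python 3's standard html.parser, so it runs without BeautifulSoup or a network connection. The sample HTML, the RowCollector class and the rows_to_lines function are all made up for illustration; only the class name tableLeftRight3030 comes from the code above.

```python
from html.parser import HTMLParser

# Made-up stand-in for the real page; only the class name comes from the post.
SAMPLE_HTML = """
<table class="other"><tr><td>skip</td></tr></table>
<table class="tableLeftRight3030">
  <tr><td>Name A</td><td>Title A</td><td>City A</td><td>Firm A</td><td>Role A</td></tr>
  <tr><td>Name B</td><td>Title B</td><td>City B</td><td>Firm B</td><td>Role B</td></tr>
</table>
"""


class RowCollector(HTMLParser):
    """Collect <td> text, grouped per <tr>, for tables with a given class.

    Hypothetical helper; it does not handle nested tables.
    """

    def __init__(self, table_class):
        super().__init__()
        self.table_class = table_class
        self.in_table = False
        self.in_cell = False
        self.rows = []

    def handle_starttag(self, tag, attrs):
        if tag == "table":
            # Only tables with the wanted class attribute are collected.
            self.in_table = dict(attrs).get("class") == self.table_class
        elif self.in_table and tag == "tr":
            self.rows.append([])          # start a new row
        elif self.in_table and tag == "td" and self.rows:
            self.rows[-1].append("")      # start a new cell in the current row
            self.in_cell = True

    def handle_endtag(self, tag):
        if tag == "td":
            self.in_cell = False
        elif tag == "table":
            self.in_table = False

    def handle_data(self, data):
        if self.in_table and self.in_cell:
            self.rows[-1][-1] += data.strip()


def rows_to_lines(html, table_class="tableLeftRight3030", columns=5):
    """Return one semicolon-joined line per row that has enough cells."""
    parser = RowCollector(table_class)
    parser.feed(html)
    return [";".join(row[:columns]) for row in parser.rows if len(row) >= columns]


lines = rows_to_lines(SAMPLE_HTML)
```

Each entry in lines could then be written to the output file followed by '\n', as in the original loop. Rows with fewer than five cells (a header row, for instance) are skipped instead of raising an IndexError.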

 

 

Tommy Kaas
Kaas & Mulvad
Lykkesholms Alle 2A, 3.
1902 Frederiksberg C

Mobil: 27268818
Mail: [email protected]
Web: http://www.kaasogmulvad.dk

_______________________________________________
Tutor maillist  -  [email protected]
To unsubscribe or change subscription options:
http://mail.python.org/mailman/listinfo/tutor
