Kent Johnson writes:

> [EMAIL PROTECTED] wrote:
>> Kent Johnson writes:
>>
>>
>>>[EMAIL PROTECTED] wrote:
>>>
>>>
>>>>List of states:
>>>>http://en.wikipedia.org/wiki/U.S._state
>>>>
>>>>: soup = BeautifulSoup(html)
>>>>: # Get the second table (list of states).
>>>>: table = soup.first('table').findNext('table')
>>>>: print table
>>>>
>>>>...
>>>><tr>
>>>><td>WY</td>
>>>><td>Wyo.</td>
>>>><td><a href="/wiki/Wyoming" title="Wyoming">Wyoming</a></td>
>>>><td><a href="/wiki/Cheyenne%2C_Wyoming" title="Cheyenne,
>>>>Wyoming">Cheyenne</a></td>
>>>><td><a href="/wiki/Cheyenne%2C_Wyoming" title="Cheyenne,
>>>>Wyoming">Cheyenne</a></td>
>>>><td><a href="/wiki/Image:Flag_of_Wyoming.svg" class="image" title=""><img
>>>>src="http://upload.wikimedia.org/wikipedia/commons/thumb/b/bc/Flag_of_Wyomin
>>>>
>>>>g.svg/45px-Flag_of_Wyoming.svg.png" width="45" alt="" height="30"
>>>>longdesc="/wiki/Image:Flag_of_Wyoming.svg" /></a></td>
>>>></tr>
>>>></table>
>>>>
>>>>Of each row (tr), I want to get the cells (td): 1,3,4
>>>>(postal,state,capital). But cells 3 and 4 have anchors.
>>>
>>>So dig into the cells and get the data from the anchor.
>>>
>>>cells = row('td')
>>>cells[0].string
>>>cells[2]('a').string
>>>cells[3]('a').string
>>>
>>>Kent
>>>
>>>_______________________________________________
>>>Tutor maillist - Tutor@python.org
>>>http://mail.python.org/mailman/listinfo/tutor
>>
>>
>> for row in table('tr'):
>>     cells = row('td')
>>     print cells[0]
>>
>> IndexError: list index out of range
>
> It works for me:
>
>
> In [1]: from BeautifulSoup import BeautifulSoup as bs
>
> In [2]: soup=bs('''<tr>
>    ...: <td>WY</td>
>    ...: <td>Wyo.</td>
>    ...: <td><a href="/wiki/Wyoming" title="Wyoming">Wyoming</a></td>
>    ...: <td><a href="/wiki/Cheyenne%2C_Wyoming" title="Cheyenne,
>    ...: Wyoming">Cheyenne</a></td>
>    ...: <td><a href="/wiki/Cheyenne%2C_Wyoming" title="Cheyenne,
>    ...: Wyoming">Cheyenne</a></td>
>    ...: <td><a href="/wiki/Image:Flag_of_Wyoming.svg" class="image"
> title=""><img
>    ...:
> src="http://upload.wikimedia.org/wikipedia/commons/thumb/b/bc/Flag_of_Wyomin
>    ...: g.svg/45px-Flag_of_Wyoming.svg.png" width="45" alt="" height="30"
>    ...: longdesc="/wiki/Image:Flag_of_Wyoming.svg" /></a></td>
>    ...: </tr>
>    ...: </table> '''
>    ...:
>    ...:
>    ...:
>    ...: )
>
> In [18]: rows=soup('tr')
>
> In [19]: rows
> Out[19]:
> [<tr>
> <td>WY</td>
> <td>Wyo.</td>
> <td><a href="/wiki/Wyoming" title="Wyoming">Wyoming</a></td>
> <td><a href="/wiki/Cheyenne%2C_Wyoming" title="Cheyenne,
> Wyoming">Cheyenne</a></td>
> <td><a href="/wiki/Cheyenne%2C_Wyoming" title="Cheyenne,
> Wyoming">Cheyenne</a></td>
> <td><a href="/wiki/Image:Flag_of_Wyoming.svg" class="image"
> title=""><img src="http://upload.
>
> g.svg/45px-Flag_of_Wyoming.svg.png" width="45" alt="" height="30"
> longdesc="/wiki/Image:Flag_
> </tr>]
>
> In [21]: cells=rows[0]('td')
>
> In [22]: cells
> Out[22]:
> [<td>WY</td>,
> <td>Wyo.</td>,
> <td><a href="/wiki/Wyoming" title="Wyoming">Wyoming</a></td>,
> <td><a href="/wiki/Cheyenne%2C_Wyoming" title="Cheyenne,
> Wyoming">Cheyenne</a></td>,
> <td><a href="/wiki/Cheyenne%2C_Wyoming" title="Cheyenne,
> Wyoming">Cheyenne</a></td>,
> <td><a href="/wiki/Image:Flag_of_Wyoming.svg" class="image"
> title=""><img src="http://upload
> n
> g.svg/45px-Flag_of_Wyoming.svg.png" width="45" alt="" height="30"
> longdesc="/wiki/Image:Flag_
>
> In [23]: cells[0].string
> Out[23]: 'WY'
>
> In [24]: cells[2].a.string
> Out[24]: 'Wyoming'
>
> In [25]: cells[3].a.string
>
>
> Kent
>
> _______________________________________________
> Tutor maillist - Tutor@python.org
> http://mail.python.org/mailman/listinfo/tutor

Yes, OK. But that way it is only possible to get data from one row (rows[0]):

cells=rows[0]('td')

I want to get the data from all rows. I have tried several 'for' statements, but I cannot get it to work.
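
A likely cause of the IndexError quoted earlier in the thread is that the first row of the table is a header row built from th cells, so row('td') returns an empty list for it. That is an assumption about the Wikipedia page, not something confirmed here. Below is a minimal sketch of looping over every row and skipping any row with too few td cells. It uses the html.parser module from the Python 3 standard library instead of BeautifulSoup so that it runs without extra installs; the sample HTML, the RowCollector class, and the extract_states function are all made up for illustration.

```python
from html.parser import HTMLParser

# Hypothetical sample: a header row (th cells only) followed by two data
# rows, shaped like the table quoted in the thread.
HTML = """
<table>
<tr><th>Postal</th><th>Abbr.</th><th>State</th><th>Capital</th></tr>
<tr><td>WI</td><td>Wis.</td><td><a href="/wiki/Wisconsin">Wisconsin</a></td>
<td><a href="/wiki/Madison%2C_Wisconsin">Madison</a></td></tr>
<tr><td>WY</td><td>Wyo.</td><td><a href="/wiki/Wyoming">Wyoming</a></td>
<td><a href="/wiki/Cheyenne%2C_Wyoming">Cheyenne</a></td></tr>
</table>
"""


class RowCollector(HTMLParser):
    """Collect the text of every <td> cell, grouped by <tr> row."""

    def __init__(self):
        super().__init__()
        self.rows = []      # one list of cell texts per <tr>
        self.in_td = False  # True while between <td> and </td>

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self.rows.append([])
        elif tag == "td" and self.rows:
            self.in_td = True
            self.rows[-1].append("")

    def handle_endtag(self, tag):
        if tag == "td":
            self.in_td = False

    def handle_data(self, data):
        # Text nested inside an <a> within a <td> also arrives here.
        if self.in_td:
            self.rows[-1][-1] += data


def extract_states(html):
    """Return (postal, state, capital) for every row with enough <td> cells."""
    parser = RowCollector()
    parser.feed(html)
    parser.close()
    result = []
    for cells in parser.rows:
        # A header row has no <td> cells; indexing it would raise IndexError.
        if len(cells) < 4:
            continue
        result.append((cells[0].strip(), cells[2].strip(), cells[3].strip()))
    return result


for postal, state, capital in extract_states(HTML):
    print(postal, state, capital)
```

With BeautifulSoup the same guard would presumably be a length check on row('td') inside the for loop, before indexing cells[0], followed by cells[2].a.string and cells[3].a.string as in Kent's session.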