Che M wrote:
> <div class="blueBox">
> <div id="curcondbox">
> <div class="subG b">West of Town, Jamestown, Pennsylvania (PWS)</div>
> <div class="bm10">Updated: <span class="pwsrt"
> pwsid="KPAJAMES1" pwsunit="english" pwsvariable="lu" value="1247814018">3:00
> AM EDT on July 17, 2009</span></div>
> <table cellspacing="0" cellpadding="0" class="full">
> <tr>
> <td class="vaT full">
> <table cellspacing="0" cellpadding="5" class="full">
> <tr>
> <td class="vaM taC"><img
> src="http://icons-pe.wxug.com/i/c/a/nt_clear.gif" width="42" height="42"
> alt="Clear" class="condIcon" /></td>
> <td class="vaM taC full">
> <div style="font-size: 17px;"><span class="pwsrt"
> pwsid="KPAJAMES1" pwsunit="english" pwsvariable="tempf" english="°F"
> metric="°C" value="60.3">
> <span class="nobr"><span class="b">60.3</span> °F</span>
> </span></div>
>
> The 60.3 is the value I want to extract. It appears to be down within a
> hierarchy something like:
>
> <body
> <div class="blueBox">
> <div id="curcondbox">
> <table
> <table
> <div>
> <span class="nobr">
> <span class="b">

You may consider using lxml's cssselect module:

    from lxml import html

    doc = html.parse("http://some/url/to/parse.html")
    # parse() returns an ElementTree; cssselect() lives on the root element
    root = doc.getroot()
    # class names are matched case-sensitively, so it must be "blueBox"
    spans = root.cssselect("div.blueBox > #curcondbox span.b")
    print(spans[0].text)

However, I'd rather go for the other "60.3" value using XPath:

    print(doc.xpath('//sp...@pwsvariable="tempf"]/@value'))

Stefan
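
For anyone who wants to try the attribute lookup without fetching the page or installing lxml, here is a minimal sketch using the standard library's xml.etree.ElementTree instead (a stand-in, not lxml). It assumes the attribute test sits on the <span> element, which is my reading of the markup quoted above and not necessarily the exact expression in the reply. The FRAGMENT string is a trimmed, hand-made copy of the quoted HTML, made well-formed so the XML parser accepts it.

```python
import xml.etree.ElementTree as ET

# Hypothetical, trimmed-down copy of the quoted markup (well-formed XML).
FRAGMENT = """
<div class="blueBox">
  <div id="curcondbox">
    <div style="font-size: 17px;"><span class="pwsrt"
        pwsid="KPAJAMES1" pwsunit="english" pwsvariable="tempf" value="60.3">
      <span class="nobr"><span class="b">60.3</span> F</span>
    </span></div>
  </div>
</div>
"""

root = ET.fromstring(FRAGMENT.strip())

# Attribute route: find the span tagged pwsvariable="tempf", read its value.
span = root.find(".//span[@pwsvariable='tempf']")
temp_f = float(span.get("value"))

# Text route: read the displayed number from the innermost span.
displayed = root.find(".//span[@class='b']").text

print(temp_f, displayed)
```

The attribute route does not depend on how deeply the span is nested in tables and divs, which is presumably why it is the more robust of the two. Real-world pages are rarely well-formed XML, so for the live page an HTML-tolerant parser such as lxml.html is still the better fit.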