On 6 Sep, 17:11, "Jackie Wang" <[EMAIL PROTECTED]> wrote:
>
> I have the following html code:
>
> <td valign="top" headers="col1">
> <font size="2">
> Center Bank
> <br />
> Los Angeles, CA
> </font>
> </td>
>
> <td valign="top" headers="col1">
> <font size="2">
> Salisbury
> Bank and Trust Company
> <font face="arial, helvetica" size="2" color="#0000000">
> <br />
> Lakeville, CT
> </font>
> </font>
> </td>
>
> How should I delete the 'font' tags while keeping the content inside?

This sounds like an editing exercise, really. If you're comfortable learning a new tool, I can recommend XSLT for this kind of job. Here's the stylesheet:

<?xml version="1.0"?>
<xsl:stylesheet version="1.0"
  xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <xsl:template match="font">
    <xsl:apply-templates/>
  </xsl:template>

  <xsl:template match="@*|node()">
    <xsl:copy>
      <xsl:apply-templates select="@*|node()"/>
    </xsl:copy>
  </xsl:template>

</xsl:stylesheet>

This just describes two things: firstly, that you want to recognise font elements and to include their contents, not each element's start and end tags; secondly, that all other parts of the document should be copied.

You can apply stylesheets using a number of XSL processors. The xsltproc program is usually available where libxslt is installed, and although I'm sure others will be along to tell you all about their favourite libraries and tools, here's how I use mine within Python:

# XSLTools: http://www.python.org/pypi/XSLTools
# libxml2dom: http://www.python.org/pypi/libxml2dom

import XSLTools.XSLOutput
import libxml2dom

# If s is the document text...
d = libxml2dom.parseString(s)

# Save the above stylesheet to a file somewhere, then...
proc = XSLTools.XSLOutput.Processor(["/tmp/no-font.xsl"])

# Get the result document
d2 = proc.get_result(d)

Anyway, this is just one option of many to deal with this kind of problem.

Paul
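
For comparison, here is a minimal sketch of the same "copy everything, but unwrap font elements" rule using only the standard library's html.parser module instead of XSLT. It is not the XSLTools approach described above, just an illustration of the idea. It assumes Python 3, and the names FontStripper and strip_font_tags are made up for this example.

```python
from html.parser import HTMLParser


class FontStripper(HTMLParser):
    """Copy the input through unchanged, except for <font> start/end tags.

    Hypothetical helper for illustration; not part of any library.
    """

    def __init__(self):
        # convert_charrefs=False so that references such as &amp; are
        # reported separately and can be copied through verbatim.
        super().__init__(convert_charrefs=False)
        self.parts = []

    def handle_starttag(self, tag, attrs):
        if tag != "font":
            # get_starttag_text() returns the start tag as it was written.
            self.parts.append(self.get_starttag_text())

    def handle_startendtag(self, tag, attrs):
        # Self-closing tags such as <br /> are copied as written.
        if tag != "font":
            self.parts.append(self.get_starttag_text())

    def handle_endtag(self, tag):
        if tag != "font":
            self.parts.append("</%s>" % tag)

    def handle_data(self, data):
        self.parts.append(data)

    def handle_entityref(self, name):
        self.parts.append("&%s;" % name)

    def handle_charref(self, name):
        self.parts.append("&#%s;" % name)

    def handle_comment(self, data):
        self.parts.append("<!--%s-->" % data)

    def handle_decl(self, decl):
        self.parts.append("<!%s>" % decl)


def strip_font_tags(html):
    """Return html with every <font ...> and </font> tag removed."""
    parser = FontStripper()
    parser.feed(html)
    parser.close()
    return "".join(parser.parts)
```

Calling strip_font_tags(s) on the document text returns a new string; the content inside each font element, including nested ones, stays where it was. Unlike the stylesheet, this works on the raw text rather than on a parsed tree, so it does not check that the markup is well-formed.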