Horst Gutmann wrote:

I currently have quite a big problem with minidom and special chars (for example ü) in HTML.

Let's say I have the following input file:
--------------------------------------------------
<?xml version="1.0"?>
<!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.01//EN"
            "http://www.w3.org/TR/html4/strict.dtd">

HTML4 is not an XML application. Even if minidom fetched this DTD and could resolve the character entities, it still might not be able to parse the document, since HTML4 allows markup that is not well-formed XML (unclosed tags, for example).


Any idea how I could solve this problem?

Either don't use minidom, or convert the HTML4 to XHTML and change the doctype declaration to match.
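
A rough sketch of both routes, in case it helps. It assumes a current Python 3 standard library; the sample markup and the names xhtml_text, html_text and TextCollector are made up for illustration. The first function feeds minidom a well-formed document that declares the uuml entity itself in an internal DTD subset, the second skips minidom and lets html.parser decode the entity.

```python
from html.parser import HTMLParser
from xml.dom import minidom

# Hypothetical sample: well-formed markup that declares the one entity it
# uses in an internal DTD subset, so the parser needs no external DTD.
XHTML_SOURCE = """<?xml version="1.0"?>
<!DOCTYPE html [
  <!ENTITY uuml "&#252;">
]>
<html><body><p>M&uuml;nchen</p></body></html>"""

# Hypothetical sample: an HTML fragment using the same named entity.
HTML_SOURCE = "<p>M&uuml;nchen</p>"


def xhtml_text(source):
    """Return the text of the first <p> element, parsed with minidom."""
    dom = minidom.parseString(source)
    paragraph = dom.getElementsByTagName("p")[0]
    # Join all text nodes in case the parser delivered the text in pieces.
    return "".join(
        node.data
        for node in paragraph.childNodes
        if node.nodeType == node.TEXT_NODE
    )


class TextCollector(HTMLParser):
    """Collect character data; named entities are decoded by the parser."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.chunks = []

    def handle_data(self, data):
        self.chunks.append(data)


def html_text(source):
    """Return all character data of an HTML fragment, without minidom."""
    parser = TextCollector()
    parser.feed(source)
    parser.close()  # flush any text still buffered by the parser
    return "".join(parser.chunks)


if __name__ == "__main__":
    print(xhtml_text(XHTML_SOURCE))
    print(html_text(HTML_SOURCE))
```

Writing the character as a numeric reference (&#252;) in the XHTML avoids the entity declaration altogether.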


--
Jarek Zgoda
http://jpa.berlios.de/ | http://www.zgodowie.org/
--
http://mail.python.org/mailman/listinfo/python-list
