On 12/17/05, Daniel Veillard <[EMAIL PROTECTED]> wrote:
> On Fri, Dec 16, 2005 at 07:55:07PM -0500, Jon Smirl wrote:
> > I have things working now.
> >
> > The help/man for --loaddtd wasn't enough for me to figure out what it did.
> >   --loaddtd : fetch external DTD
> >
> > Second the ruby wrapper for loaddtd was broken,
> > XML::Parser::default_load_external_dtd. I've sent a patch to the
> > maintainer. This was causing me a lot of problems, I set the
> > option but it was not getting set because of the breakage in the C
> > wrapper.
> >
> > Needing to install the DTD and use --loaddtd to fix the undefined
> > entity error was not obvious. It might make a good entry for your
> > libxml FAQ. google didn't turn up a ready answer either.
>
>   Hum, how would you phrase this ?

Put the error message into the FAQ so that Google will pick it up:

  root.xml:43: parser error : Entity 'nbsp' not defined

I was searching for variations of "libxml entity not defined" and
couldn't get a good hit. Then add a paragraph about what an external
subset is and why you need to install a DTD in order to resolve the
entities.

Knowledge of external subsets is probably not common. Most XML files
don't use them, and most HTML parsers have the entities built in.
It's only XHTML that commonly needs the external subset.

A warning indicating that a DTD is being fetched over the net instead
of from the catalog would probably be good too. w3.org is so busy that
sometimes the DTDs don't come back. The load may come from web sites
running parsers without knowing that the DTDs are being fetched. With
the ruby wrapper there is no indication that this is happening other
than that your parse is slow.

My current problem is that my XHTML XPath queries aren't matching.
This seems to be because the namespace associated with the query isn't
being set into libxml correctly by the ruby wrapper when the namespace
is the default one. I'm still debugging it.

> > The xhtml 1.1 DTD is huge, fifty files. Is there some way to set
> > things up so that libxml can use a small DTD which only contains the
> > external subset in non-validating mode and then use the full one for
> > validating? I'd rather not parse 10,000 lines of DTD just to read a 20
> > line xhtml file.
>
>   set an entity resolution handler (see the section on I/O in the doc)
> and catch the request for XHTML1.1 then provide a reduced input.
> XHTML-1.1 being a bit nebulous is probably one of the reasons it's
> not very common.

I'll give this a try. A lot of people using XHTML may not have the
knowledge to set something like this up. It would be nice if there
were rpms for xhtml1.0 and xhtml1.1 that set up the DTDs in the
catalog for validation and also set up a minimal external subset for
performance.
You'd have to modify the catalog mechanism to return the full DTD or
minimal subset depending on which mode the parser was in.

>
> Daniel
>
> --
> Daniel Veillard      | Red Hat http://redhat.com/
> [EMAIL PROTECTED]    | libxml GNOME XML XSLT toolkit http://xmlsoft.org/
> http://veillard.com/ | Rpmfind RPM search engine http://rpmfind.net/
>

--
Jon Smirl
[EMAIL PROTECTED]
_______________________________________________
xml mailing list, project page http://xmlsoft.org/
[email protected]
http://mail.gnome.org/mailman/listinfo/xml
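
A footnote on the XPath problem described above, whatever the wrapper
bug turns out to be: in XPath 1.0 an unprefixed name in a query only
matches elements in no namespace, so a query against XHTML needs a
prefix bound to the XHTML namespace even though the document itself
uses a default namespace. The sketch below shows that rule using
Python's standard-library ElementTree, not libxml or the ruby wrapper.
The sample document and the prefix "x" are made up for illustration.
In libxml's C API the equivalent binding is done with
xmlXPathRegisterNs().

```python
import xml.etree.ElementTree as ET

XHTML_NS = "http://www.w3.org/1999/xhtml"

# Made-up sample: XHTML puts every element in a default namespace.
DOC = (
    '<html xmlns="http://www.w3.org/1999/xhtml">'
    "<body><p>one</p><p>two</p></body>"
    "</html>"
)

root = ET.fromstring(DOC)

# Unprefixed name: looks for <p> in no namespace, so nothing matches.
unprefixed = root.findall(".//p")

# Bind an arbitrary prefix ("x" is only a label) to the XHTML namespace
# and use that prefix in the query.
prefixed = root.findall(".//x:p", {"x": XHTML_NS})

print(len(unprefixed), len(prefixed))
print([p.text for p in prefixed])
```

The prefix used in the query does not have to appear in the document.
Only the namespace URI it is bound to has to match.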
