On Tue, 2005-11-08 at 00:44 +0100, Kail wrote:
> I've a problem with an old SGML file.
> It has many format errors; the two most annoying are:
> 
> 1- It has more than one element at the root level

SGML does not allow this.

>      //Start of file
>      <reuters> ........ </reuters>
>      <reuters> ........ </reuters>

As Daniel has suggested, this file must be intended for use as an
external entity.  You are missing the main or "driver" file that
references it.
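
To illustrate what such a driver file might look like, here is a minimal
sketch in Python that generates one.  The root element name "collection",
the entity name "body", and the file name "reuters.sgm" are all
hypothetical; the real driver would use whatever root element and DTD the
original data was written against.

```python
def make_driver(fragment_path: str, root: str = "collection", entity: str = "body") -> str:
    """Return the text of a small driver document.

    The driver declares fragment_path as an external entity and references
    it inside a single root element, so the fragment file itself does not
    have to be modified.  All names here are illustrative assumptions.
    """
    return (
        f"<!DOCTYPE {root} [\n"
        f'<!ENTITY {entity} SYSTEM "{fragment_path}">\n'
        f"]>\n"
        f"<{root}>&{entity};</{root}>\n"
    )


if __name__ == "__main__":
    # Print a driver for a hypothetical fragment file.
    print(make_driver("reuters.sgm"))
```

The parser is then pointed at the driver rather than at the fragment file,
and must be allowed to load external entities.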

> etc.
> This file is 7 years old, but I need to parse it :(

Maybe use osx to convert it to XML -- it's part of OpenSP, from
the OpenJade project, I think these days.

> Is it possible to parse it without adding a node that wraps the
> file from start to end?
> 
> 2- There are also some characters like &#31; that obviously are not
> recognised and generate errors... Is there a way to avoid the errors
> and make the parser treat them as text, either by avoiding the call
> to xmlParseCharRef or by making that function not generate an error?
> (I haven't found an option for it ^_^)

There should be an SGML Declaration that says which characters
are allowed in that SGML document.  It's often considered to be
part of the SGML DTD.
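
If the SGML Declaration is not available, one possible workaround (not
the approach described above, and one that alters the data) is to
pre-filter the text before handing it to an XML parser.  The sketch below
drops numeric character references that fall outside the XML 1.0 Char
range; the function names are hypothetical.

```python
import re

# Matches decimal (&#31;) and hexadecimal (&#x1F;) character references.
_CHAR_REF = re.compile(r"&#(?:[xX]([0-9A-Fa-f]+)|([0-9]+));")


def is_xml_char(cp: int) -> bool:
    """True if the code point is allowed by the XML 1.0 Char production."""
    return (
        cp in (0x9, 0xA, 0xD)
        or 0x20 <= cp <= 0xD7FF
        or 0xE000 <= cp <= 0xFFFD
        or 0x10000 <= cp <= 0x10FFFF
    )


def scrub_char_refs(text: str, replacement: str = "") -> str:
    """Replace character references that XML 1.0 does not allow."""

    def fix(match: re.Match) -> str:
        hex_digits, dec_digits = match.groups()
        cp = int(hex_digits, 16) if hex_digits else int(dec_digits)
        # Keep valid references untouched; substitute the rest.
        return match.group(0) if is_xml_char(cp) else replacement

    return _CHAR_REF.sub(fix, text)


if __name__ == "__main__":
    print(scrub_char_refs("a&#31;b&#65;c"))
```

Whether dropping these references is acceptable depends on what the
control characters meant in the original data.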

Typically you give something like osx the SGML declaration, the
DTD file, and the document, all in one stream.
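
As a rough sketch of that invocation, the Python below builds an osx
command line in which the files are listed in order, so that they are
read as one stream.  The file names are hypothetical, the "prolog"
argument is assumed to be a file holding a complete DOCTYPE declaration,
and osx is assumed to be installed and on the PATH.

```python
import subprocess
from typing import List, Optional


def osx_argv(declaration: str, document: str, prolog: Optional[str] = None) -> List[str]:
    """Build the argument list: declaration first, optional prolog, then document."""
    files = [declaration]
    if prolog:
        files.append(prolog)
    files.append(document)
    return ["osx", *files]


def convert(declaration: str, document: str, out_path: str,
            prolog: Optional[str] = None) -> int:
    """Run osx and write its XML output to out_path; return the exit status."""
    with open(out_path, "wb") as out:
        result = subprocess.run(osx_argv(declaration, document, prolog), stdout=out)
    return result.returncode


if __name__ == "__main__":
    # Show the command that would be run for hypothetical file names.
    print(" ".join(osx_argv("reuters.dcl", "driver.sgm")))
```

osx reports problems on standard error, so a non-zero exit status does
not necessarily mean that no output was produced.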

Liam

-- 
Liam Quin - XML Activity Lead, W3C, http://www.w3.org/People/Quin/
Pictures from old books: http://fromoldbooks.org/
Ankh: irc.sorcery.net irc.gnome.org www.advogato.org



_______________________________________________
xml mailing list, project page  http://xmlsoft.org/
[email protected]
http://mail.gnome.org/mailman/listinfo/xml