Oren Tirosh wrote:

> It all boils down to how you define "the same". Which parts of the XML
> document are meaningful content that needs to be preserved and which
> ones are mere encoding variations that may be omitted from the internal
> representation?
>
> Some relevant references which may be used as guidelines:
>
> * http://www.w3.org/TR/xml-infoset
> The XML infoset defines 11 types of information items including
> document type declaration, notations and other features. It does not
> appear to be suitable for a lightweight API like ElementTree.
>
> * http://www.w3.org/TR/xpath-datamodel
> The XPath data model uses a subset of the XML infoset with "only" seven
> node types.
>
> * http://www.w3.org/TR/xml-c14n
> The canonical XML recommendation is meant to describe a process but it
> also effectively defines a data model: anything preserved by the
> canonicalization process is part of the model. Anything not preserved
> is not part of the model.

you forgot

    http://effbot.org/zone/element-infoset.htm

which describes the 3-node XML infoset subset used by ElementTree.
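
A rough illustration of the "same document" question, using the standard
library's xml.etree.ElementTree. This is only a sketch: same_tree is a
hypothetical helper, not part of the ElementTree API, and treating missing
text (None) as equal to an empty string is an assumption made here for
convenience. Attribute order, quote style, and CDATA versus entity escaping
do not survive parsing, so the first two documents compare equal:

```python
import xml.etree.ElementTree as ET

def same_tree(e1, e2):
    """Hypothetical helper: compare tag, attributes, text, tail and children."""
    return (
        e1.tag == e2.tag
        and e1.attrib == e2.attrib
        # assumption: None and "" are treated as the same (no text)
        and (e1.text or "") == (e2.text or "")
        and (e1.tail or "") == (e2.tail or "")
        and len(e1) == len(e2)
        and all(same_tree(c1, c2) for c1, c2 in zip(e1, e2))
    )

# Same content, different encoding: attribute order, quote style, CDATA.
doc_a = ET.fromstring('<doc a="1" b="2"><item>x &lt; y</item></doc>')
doc_b = ET.fromstring("<doc b='2' a='1'><item><![CDATA[x < y]]></item></doc>")

# Different content: the character data itself differs.
doc_c = ET.fromstring('<doc a="1" b="2"><item>x &gt; y</item></doc>')

print(same_tree(doc_a, doc_b))
print(same_tree(doc_a, doc_c))
```

Whether things like comments or insignificant whitespace should also count
is exactly the kind of decision the data model has to make; the helper above
only looks at what the parsed tree exposes.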

</F>



-- 
http://mail.python.org/mailman/listinfo/python-list
