>   "we don't know what the data are but we expect to process 
> them cleanly"
> this sounds very similar to not knowing the encoding of a 
> piece of text,

Similar, maybe. Though my heuristic would have to answer a question
with far fewer possibilities: does an input buffer contain references
to (pre-)defined entities or not?
I still think it's possible, but it's not efficient (at least not
compared to a developer using the proper call for an actual
piece of input).
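
To make the question concrete, here is a minimal sketch of the kind of
check I have in mind. It is purely lexical: it only looks for something
shaped like "&name;" or a character reference such as "&#60;" / "&#x3C;",
and it does not verify that the entity is actually declared. The function
name buffer_has_entity_ref() and the ASCII-only name rules are my own
simplifications for illustration, not anything taken from libxml2.

```c
#include <ctype.h>
#include <stddef.h>

/* Simplified, ASCII-only approximation of the XML Name rules. */
static int is_name_start(unsigned char c)
{
    return isalpha(c) || c == '_' || c == ':';
}

static int is_name_char(unsigned char c)
{
    return isalnum(c) || c == '_' || c == ':' || c == '-' || c == '.';
}

/* Return 1 if buf[0..len) contains text shaped like an entity or
 * character reference, 0 otherwise. Hypothetical helper. */
int buffer_has_entity_ref(const char *buf, size_t len)
{
    size_t i;

    for (i = 0; i < len; i++) {
        size_t j;

        if (buf[i] != '&')
            continue;
        j = i + 1;
        if (j < len && buf[j] == '#') {
            /* Character reference: &#123; or &#x7B; */
            int hex;
            size_t first_digit;

            j++;
            hex = (j < len && buf[j] == 'x');
            if (hex)
                j++;
            first_digit = j;
            while (j < len && (hex ? isxdigit((unsigned char)buf[j])
                                   : isdigit((unsigned char)buf[j])))
                j++;
            if (j > first_digit && j < len && buf[j] == ';')
                return 1;
        } else if (j < len && is_name_start((unsigned char)buf[j])) {
            /* General entity reference: &name; */
            j++;
            while (j < len && is_name_char((unsigned char)buf[j]))
                j++;
            if (j < len && buf[j] == ';')
                return 1;
        }
    }
    return 0;
}
```

Note that a buffer such as "AT&T" is reported as containing no reference,
while raw text that happens to contain "&amp;" literally would be
misclassified, which is exactly the ambiguity a heuristic cannot resolve.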


> you may try to apply heuristics, but it will come back to bite you 
> no matter what.

Maybe, but I'll bite back... 8)


> In both cases there is only one solution: educate the users/developers.

Well, I'm sure that you're well aware that some (ideal) solutions
are hard to realise...


>   it's perfectly clean to use direct access to the node structure. Just
> check that the target strings are not from the doc 
> dictionary (in which case don't free them)

How do I check this? (OK, I can look it up, but you probably know
it off the top of your head.)

> and otherwise use xmlMalloc/xmlFree to manipulate the target
> text nodes. Of course if you use non-predefined entities, 
> then you will
> have to add entity references to the element children list.

I'll take xmlNodeSetContent() as a template to implement an
"...addContent" with entity support.

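Roughly, the first step of such a function would be to split the new
content into runs of plain text and entity references. The sketch below
shows only that splitting step in plain C; the segment type and
split_on_entity_refs() are hypothetical names of mine, and the name rules
are a simplified ASCII-only approximation. In a real implementation I
would expect each text run to become a text node (e.g. via
xmlNewDocTextLen()) and each reference an entity reference node (e.g. via
xmlNewReference()), appended with xmlAddChild(), but that part is an
assumption about how I'd wire it up, not tested code.

```c
#include <ctype.h>
#include <stddef.h>

/* One run of the input: plain text, or the name of an entity reference
 * (without the surrounding '&' and ';'). Hypothetical type. */
typedef struct {
    int is_ref;
    const char *start;
    size_t len;
} segment;

/* Store a segment if there is room; always return the new total count. */
static size_t emit(segment *out, size_t n, size_t max_out,
                   int is_ref, const char *start, size_t len)
{
    if (n < max_out) {
        out[n].is_ref = is_ref;
        out[n].start = start;
        out[n].len = len;
    }
    return n + 1;
}

/* Split s[0..len) into text runs and "&name;" references. Returns the
 * number of segments found; only the first max_out are stored in out. */
size_t split_on_entity_refs(const char *s, size_t len,
                            segment *out, size_t max_out)
{
    size_t n = 0, text_start = 0, i = 0;

    while (i < len) {
        if (s[i] == '&') {
            size_t j = i + 1;
            unsigned char c = (j < len) ? (unsigned char)s[j] : 0;

            if (isalpha(c) || c == '_' || c == ':') {
                j++;
                while (j < len && (isalnum((unsigned char)s[j]) ||
                                   s[j] == '_' || s[j] == ':' ||
                                   s[j] == '-' || s[j] == '.'))
                    j++;
                if (j < len && s[j] == ';') {
                    /* Flush pending text, then emit the reference name. */
                    if (i > text_start)
                        n = emit(out, n, max_out, 0,
                                 s + text_start, i - text_start);
                    n = emit(out, n, max_out, 1, s + i + 1, j - i - 1);
                    i = j + 1;
                    text_start = i;
                    continue;
                }
            }
        }
        i++;
    }
    if (len > text_start)
        n = emit(out, n, max_out, 0, s + text_start, len - text_start);
    return n;
}
```

A lone '&' that does not start a well-formed reference simply stays part
of the surrounding text run, so the text-node path would still have to
escape it on output.
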
 
Thanks & Ciao, Markus

P.S.: Attached is a diff of tree.c (against the current CVS head)
      which would add some documentation for the
      xmlNode[Set|Add]Content functions.
*** tree.c.orig 2006-11-01 10:29:52.931711300 +0100
--- tree.c      2006-11-01 10:33:00.168482900 +0100
***************
*** 5197,5206 ****
--- 5197,5209 ----
   * xmlNodeSetContent:
   * @cur:  the node being modified
   * @content:  the new value of the content
   *
   * Replace the content of a node.
+  * NOTE: @content is supposed to be a piece of XML CDATA, so it allows entity
+  *       references, but XML special chars need to be escaped first by using
+  *       xmlEncodeEntitiesReentrant() resp. xmlEncodeSpecialChars().
   */
  void
  xmlNodeSetContent(xmlNodePtr cur, const xmlChar *content) {
      if (cur == NULL) {
  #ifdef DEBUG_TREE
***************
*** 5271,5280 ****
--- 5274,5286 ----
   * @cur:  the node being modified
   * @content:  the new value of the content
   * @len:  the size of @content
   *
   * Replace the content of a node.
+  * NOTE: @content is supposed to be a piece of XML CDATA, so it allows entity
+  *       references, but XML special chars need to be escaped first by using
+  *       xmlEncodeEntitiesReentrant() resp. xmlEncodeSpecialChars().
   */
  void
  xmlNodeSetContentLen(xmlNodePtr cur, const xmlChar *content, int len) {
      if (cur == NULL) {
  #ifdef DEBUG_TREE
***************
*** 5342,5351 ****
--- 5348,5360 ----
   * @cur:  the node being modified
   * @content:  extra content
   * @len:  the size of @content
   * 
   * Append the extra substring to the node content.
+  * NOTE: In contrast to xmlNodeSetContentLen(), @content is supposed to be
+  *       raw text, so unescaped XML special chars are allowed, entity
+  *       references are not supported.
   */
  void
  xmlNodeAddContentLen(xmlNodePtr cur, const xmlChar *content, int len) {
      if (cur == NULL) {
  #ifdef DEBUG_TREE
***************
*** 5414,5423 ****
--- 5423,5435 ----
   * xmlNodeAddContent:
   * @cur:  the node being modified
   * @content:  extra content
   * 
   * Append the extra substring to the node content.
+  * NOTE: In contrast to xmlNodeSetContent(), @content is supposed to be
+  *       raw text, so unescaped XML special chars are allowed, entity
+  *       references are not supported.
   */
  void
  xmlNodeAddContent(xmlNodePtr cur, const xmlChar *content) {
      int len;
  
_______________________________________________
xml mailing list, project page  http://xmlsoft.org/
[email protected]
http://mail.gnome.org/mailman/listinfo/xml
