Re: [xml] Adjacent text node merging

Eric S. Eberhard Wed, 08 May 2013 05:01:45 -0700

I think I am missing something ... I don't think it really "merges" ... it is comparing the address of the node elements, e.g.


 if ((parent->last != NULL) && (parent->last->type == XML_TEXT_NODE) &&
     (parent->last->name == cur->name) &&
     (parent->last != cur)) {

/* this "merge" simply puts the content into parent->last and removes the redundant cur ... because the address of the name as well as the type are the same */

So parent->last->name is an address of a string ... and if it is the same as cur->name THEN the content is "merged" -- it is more a check on having the exact same tag in the tree many times (when I say exact same tag I don't mean the tag data or name -- I am talking about the same memory location). If you do as you suggest I would be careful checking for NULL values, because if parent->last->name == cur->name (and parent->last != cur) then cur->name would become NULL when parent->last->name becomes NULL.
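To make the pointer-versus-value distinction concrete, here is a minimal sketch in plain C. It does not use libxml2 at all: struct toy_node and both comparison functions are hypothetical names made up for illustration, and only the idea of comparing name addresses rather than name contents is taken from the test quoted above.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Hypothetical stand-in for a tree node; this is NOT libxml2's xmlNode. */
struct toy_node {
    const char *name;
};

/* True only when both nodes point at the very same name buffer. */
bool same_name_pointer(const struct toy_node *a, const struct toy_node *b)
{
    return a->name == b->name;
}

/* True when the names hold equal characters, wherever they are stored. */
bool same_name_value(const struct toy_node *a, const struct toy_node *b)
{
    return a->name != NULL && b->name != NULL &&
           strcmp(a->name, b->name) == 0;
}

/* Small demonstration of the two comparisons. */
void demo_name_comparison(void)
{
    char buf_a[] = "text";   /* two separate arrays, so two addresses */
    char buf_b[] = "text";

    struct toy_node first = { buf_a };
    struct toy_node alias = { buf_a };   /* shares first's name buffer */
    struct toy_node other = { buf_b };   /* equal text, different buffer */

    assert(same_name_pointer(&first, &alias));   /* same address */
    assert(!same_name_pointer(&first, &other));  /* different address */
    assert(same_name_value(&first, &other));     /* same characters */
}
```

A check written like same_name_pointer only fires for nodes that share one name buffer, which is the situation described above; two nodes whose names merely spell the same thing would not pass it.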

In addition, if you edit text content of a particular text node in the children list ... yet that content is from the same cur->name, I would think other odd problems would happen.

This has come up a few times on this list because people think that adding an empty name is name=NULL (that is a NULL name, name="" is an empty name).

I use xmlNewDocNode which makes new memory for name so the above merge would never happen.

If it were me I'd look at leaving the code alone in libxml2, and then when you need to make this content change, use the SAME test as above before calling the standard routines:


cur->name = xmlStrdup(cur->name);

they will have the same value but DIFFERENT pointers and won't trigger the merge.
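The "same value, different pointer" effect of duplicating a string can be sketched in plain C without libxml2. copy_name below is a hypothetical helper standing in for a string-duplication routine such as xmlStrdup; it is not a libxml2 function, and the sketch does not model how a real xmlNode owns or frees its name field.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper, not a libxml2 call: returns a heap copy of s,
 * or NULL if allocation fails. The caller owns the copy and frees it. */
char *copy_name(const char *s)
{
    size_t len = strlen(s) + 1;   /* include the terminating NUL */
    char *copy = malloc(len);
    if (copy != NULL)
        memcpy(copy, s, len);
    return copy;
}

/* Shows that the copy has equal content but lives at another address. */
void demo_copy_name(void)
{
    static const char shared[] = "text";
    char *copy = copy_name(shared);

    assert(copy != NULL);
    assert(strcmp(copy, shared) == 0);   /* same value */
    assert(copy != shared);              /* different pointer */

    free(copy);
}
```

Because the copy is separately allocated, somebody has to free it; whether that is safe to arrange on a real node depends on how the library releases the name field, so treat this only as an illustration of the pointer behaviour.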

Or if you only notice after the fact then copy the node -- in non-tricky cases a memcpy will work just fine, and then use xmlStrdup on the name, and then


xmlReplaceNode(oldnode,newnode);

FREE the old node :-)  xmlFreeNode(oldnode);

Note I have been stating in previous emails and discussions lately the concept of using one's own wrappers -- never call any of the libxml2 functions directly if you can avoid it (you can carry that too far). This is a perfect example where either your adding of the node or your reading and changing of the content of a node could be well wrapped -- and then just use your wrappers.



E



On 5/4/2013 12:59 AM, Nikolay Sivov wrote:

I think it's more a question for Daniel, but any help is welcome of course. Libxml2 merges text nodes into a single node when you add a text child next to an existing text node, for example, so at least xmlAddNextSibling, xmlAddPrevSibling and xmlAddChild are doing that. For a project where I'm using libxml2 I want all nodes to be preserved as I add them, so for example I can edit the text content of a particular text node in the children list. The question is what could or will potentially break if I use my own versions of these tree manipulation calls that do not perform such merging? e.g. does the lib really expect to have only one text node with no text siblings somewhere in the code, or maybe libxslt does?
P.S. attached patch is just to fix a compiler warning I'm seeing with current git builds, obviously completely unrelated to this topic.
_______________________________________________
xml mailing list, project page  http://xmlsoft.org/
[email protected]
https://mail.gnome.org/mailman/listinfo/xml


--
Eric S. Eberhard
VICS
2933 W Middle Verde Road
Camp Verde, AZ  86322

928-567-3727  work                      928-301-7537  cell

http://www.vicsmba.com/index.html             (our work)
http://www.vicsmba.com/ourpics/index.html     (fun pictures)

