We can make the test-case even shorter:
<test>
<strings>{ for $i in 1 to 2 return "dummy" }</strings>
<texts>{ for $i in 1 to 2 return text { "dummy" } }</texts>
</test>

=> <test><strings>dummy dummy</strings><texts>dummydummy</texts></test>

I believe that this is the specified behavior, from http://www.w3.org/TR/xquery/#id-content (elided for simplicity):
1.e.i: For each adjacent sequence of one or more atomic values returned by an enclosed expression, a new text node is constructed, containing the result of casting each atomic value to a string, with a single space character inserted between adjacent values.
That matches "strings", above.
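A minimal sketch of how this rule plays out, assuming a processor that follows the XQuery 1.0 content rules (the element names a through d are made up for illustration, and this has not been checked against any particular MarkLogic release). The space is only inserted between atomic values that come out of the same enclosed expression, so the separator can be chosen explicitly:

```xquery
(: Hypothetical element names; the results noted here follow the XQuery 1.0 rules :)
(
  (: one enclosed expression, two atomic values: a single space is inserted -> <a>x y</a> :)
  <a>{ "x", "y" }</a>,

  (: two enclosed expressions, each yielding its own text node: no space -> <b>xy</b> :)
  <b>{ "x" }{ "y" }</b>,

  (: join the strings yourself with an empty separator -> <c>xy</c> :)
  <c>{ fn:string-join(("x", "y"), "") }</c>,

  (: a text constructor atomizes its content and joins with a space -> <d>x y</d> :)
  <d>{ text { ("x", "y") } }</d>
)
```

If the space is unwanted, joining the strings explicitly (as in c) or constructing one text node per value avoids relying on how a sequence of atomic values is serialized.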
3. Adjacent text nodes in the content sequence are merged into a single text node by concatenating their contents, with no intervening blanks.
And that matches "texts", above.

-- Mike

Williams, Paul wrote:
Sorry... not an answer, just more on the question...

I reduced the sample code down to what I've included below in order to wrap my head around this a little better. This code shows both results as George described, but it focuses on the piece of the code that seems pertinent. Running this test produces this output...

<test>
  <strings>dummy dummy</strings>
  <texts>dummydummy</texts>
</test>

So why doesn't the explicit text constructor version in the "texts" element produce the same space-joined single text node as the auto-constructed version in the "strings" element?

The "strings" version, I would assume, produces a set of strings first, then decides it needs a text node and must construct it. The "texts" version, I assume, produces a set of text nodes first, then decides they need to be concatenated. But for the "strings" version to end up with the space, it must be converting the set of strings into a set of text nodes and then concatenating into one. So why doesn't that result in the same output as the set of text nodes in the "texts" version?

Hmmm. Curious.

Sample code, try this in CQ ...

----------------------------------------------------------------
<test>
  <strings>{ for $node in (<elem/>,<elem/>) return "dummy" }</strings>
  <texts>{ for $node in (<elem/>,<elem/>) return text{"dummy"} }</texts>
</test>
----------------------------------------------------------------

-- Paul

-----Original Message-----
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Florentine, George
Sent: Thursday, March 13, 2008 7:04 PM
To: [email protected]
Subject: [MarkLogic Dev General] Surprising behavior with text node construction

I've run into an interesting behavior (optimization? bug?) in MarkLogic and wanted to see what others thought of this.

Here's the background - we have some code that dynamically generates content by processing DITA topics.
Depending upon the structure of the content it's possible that our XQuery code may process two sequential elements that would each return a text node from a function. What we see is that in this case, only one text node is returned and its value is the concatenation of the two string values separated by a single space character. This is somewhat in line with the 2003 spec (http://www.w3.org/TR/2003/WD-xquery-20030502/#doc-ComputedTextConstruct, or section 3.7.2.4), which states:

----
The content expression of a text node constructor is processed as follows:

1. Atomization is applied to the value of the content expression, converting it to a sequence of atomic values.
2. If the result of atomization is an empty sequence, no text node is constructed. Otherwise, each atomic value in the atomized sequence is cast into a string.
3. The individual strings resulting from the previous step are merged into a single string by concatenating them with a single space character between each pair. The resulting string becomes the content of the constructed text node.
-----

So it appears that there's some optimization in the output generation of nodes such that two sequential text nodes are collapsed into one.

Below is a concrete code example. If you run the 1st code snippet in CQ, the code generates the output <p>dummy dummy</p>, showing an example of two calls to a function that should return two text nodes but only returns one text node, with the return value of each call ("dummy") concatenated into a single text node with a space character separating the two. If you run the same code (2nd snippet) with the one change that the function transform_dummy returns an explicitly constructed text node, the output is <p>dummydummy</p> (no space character). This is the behavior I was expecting and seems like the right behavior.
Note that the return value in the function signature for the transform_dummy() function is text(), so I would assume that the xs:string "dummy" would be coerced into a text node and that a text node would be returned from this function in all cases. It seems bad that this behavior is different. I'd like to get other perspectives on this.

Thx,
G

-------------------------------
Code snippet 1 - no explicit text constructor in the function transform_dummy, returns <p>dummy dummy</p>
-------------------------------
define function transform_default_element($element as element()) as node()
{
  (: create a new element with the same name and attributes and recurse to travel the subtree. :)
  element {fn:node-name($element)} {$element/@*, transform_template($element/node())}
}

define function transform_dummy($element as element()) as text()
{
  "dummy"
}

define function transform_element($element as element()) as node()*
{
  (: branch to more specialized functions based on the type of element :)
  typeswitch ($element)
    case element(dummy) return transform_dummy($element)
    default return transform_default_element($element)
}

define function transform_template($nodes as node()*) as node()*
{
  for $node in $nodes
  return
    typeswitch ($node)
      case element() return transform_element($node)
      default (: PIs, text and comment nodes are outputted here :) return $node
}

(: module start :)
let $para := xdmp:unquote("<p><dummy/><dummy/></p>")
return transform_template($para/node())

-----------------------------------------
Code snippet 2: explicit creation of text node in transform_dummy, returns <p>dummydummy</p>
------------------------------------------
define function transform_default_element($element as element()) as node()
{
  (: create a new element with the same name and attributes and recurse to travel the subtree. :)
  element {fn:node-name($element)} {$element/@*, transform_template($element/node())}
}

define function transform_dummy($element as element()) as text()
{
  (: explicitly create a text node before returning :)
  text { "dummy" }
}

define function transform_element($element as element()) as node()*
{
  (: branch to more specialized functions based on the type of element :)
  typeswitch ($element)
    case element(dummy) return transform_dummy($element)
    default return transform_default_element($element)
}

define function transform_template($nodes as node()*) as node()*
{
  for $node in $nodes
  return
    typeswitch ($node)
      case element() return transform_element($node)
      default (: PIs, text and comment nodes are outputted here :) return $node
}

(: module start :)
let $para := xdmp:unquote("<p><dummy/><dummy/></p>")
return transform_template($para/node())
------------------------------------------------------------------------

---
George Florentine
[EMAIL PROTECTED]
O: 303.542.2173 C: 303.669.8628 F: 303.544.0522
www.FlatironsSolutions.com
An Inc. 500 Company

_______________________________________________
General mailing list
[email protected]
http://xqzone.com/mailman/listinfo/general
