It seems to me that the real issue here is this function definition from snippet 1:

define function transform_dummy($element as element()) as text()
{
   "dummy"
}

I'm not sure why this does not throw an error: the value actually returned is an xs:string, which does not match the declared return type, text(). Under Saxon, the equivalent function definition in XQuery 1.0 throws a static error; you MUST have

  text { "dummy" }

in the function declaration for it to run.
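
For reference, here is a minimal sketch of that function in XQuery 1.0 syntax. The local: prefix and the name local:transform-dummy are my own adaptation of snippet 1 (XQuery 1.0 wants user functions in a namespace), so treat them as assumptions rather than a drop-in replacement for the MarkLogic code:

```xquery
xquery version "1.0";

(: Hypothetical XQuery 1.0 port of transform_dummy from snippet 1. :)
declare function local:transform-dummy($element as element()) as text()
{
  (: The body has to build a text node; a bare "dummy" is an xs:string
     and does not match the declared return type text(). :)
  text { "dummy" }
};

local:transform-dummy(<dummy/>)
```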

This may be down to differences between the May 2003 draft of XQuery and the current specification. I can't tell for sure from the 2003 draft whether the function conversion rules allow a return value
of xs:string to be converted to text() automatically:

http://www.w3.org/TR/2003/WD-xquery-20030502/#id-function-calls

But it does seem that snippet 1 should either throw an error or
behave like snippet 2.
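
For what it's worth, the XQuery 1.0 rules for element constructor content would produce exactly the two outputs reported. Adjacent atomic values in the content become one text node with a single space between them, while adjacent text nodes are merged with nothing in between. So one possible explanation (an assumption on my part, I have not verified what MarkLogic does internally) is that snippet 1 passes the xs:string values through to the element constructor unconverted. A sketch in XQuery 1.0 syntax, with variable names of my own choosing:

```xquery
xquery version "1.0";

(: Two adjacent atomic values in element content: joined with a space. :)
let $from-strings := element p { ("dummy", "dummy") }

(: Two adjacent text nodes in element content: merged with no space. :)
let $from-text-nodes := element p { (text { "dummy" }, text { "dummy" }) }

return (
  string($from-strings),     (: "dummy dummy" :)
  string($from-text-nodes)   (: "dummydummy" :)
)
```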

On Thu, 13 Mar 2008, Florentine, George wrote:

I've run into an interesting behavior (optimization? bug?) in MarkLogic
and wanted to see what others thought of this.

Here's the background - we have some code that dynamically generates
content by processing DITA topics. Depending upon the structure of the
content it's possible that our XQuery code may process two sequential
elements that would each return a text node from a function. What we see
is that in this case, only one text node is returned and its value is
the concatenation of the two string values separated by a single space
character. This is somewhat in line with the 2003 spec
(http://www.w3.org/TR/2003/WD-xquery-20030502/#doc-ComputedTextConstructor,
section 3.7.2.4), which states:

----
The content expression of a text node constructor is processed as
follows:
1. Atomization is applied to the value of the content expression,
converting it to a sequence of atomic values.
2. If the result of atomization is an empty sequence, no text node is
constructed. Otherwise, each atomic value in the atomized sequence is
cast into a string.
3. The individual strings resulting from the previous step are merged
into a single string by concatenating them with a single space character
between each pair. The resulting string becomes the content of the
constructed text node.
-----
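
A quick way to see these three steps in action is a computed text constructor applied to a sequence. This is only a sketch in XQuery 1.0 syntax; I'm assuming the 2003 rules quoted above behave the same way for these simple cases, and the variable names are just for illustration:

```xquery
xquery version "1.0";

(: Steps 1 and 3: two atomic values become ONE text node whose
   content is the two strings joined by a single space. :)
let $joined := text { ("dummy", "dummy") }

(: Step 2: an empty content sequence constructs no text node at all. :)
let $none := text { () }

return (
  string($joined),          (: "dummy dummy" :)
  string-length($joined),   (: 11 :)
  count($none)              (: 0 :)
)
```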

So it appears that there's some optimization in the output generation of
nodes such that two sequential text nodes are collapsed into one.

Below is a concrete code example. If you run the 1st code snippet in CQ,
the output is <p>dummy dummy</p>. The two calls to transform_dummy should
produce two text nodes, but only one text node comes back, with the
return value of each call ("dummy") concatenated into it and a single
space character separating the two.

If you run the same code (2nd snippet) with the one change that
transform_dummy returns an explicitly constructed text node, the output
is <p>dummydummy</p> (no space character). This is the behavior I was
expecting and seems like the right behavior. Note that the return type
in the function signature for transform_dummy() is text(), so I would
assume that the xs:string "dummy" would be coerced into a text node and
that a text node would be returned from this function in all cases.

It seems bad that this behavior is different. I'd like to get other
perspectives on this.

Thx,

G
-------------------------------

Code snippet 1 - no explicit text constructor in the function
transform_dummy, returns <p>dummy dummy</p>
-------------------------------

define function transform_default_element($element as element()) as node()
{
   (: create a new element with the same name and attributes and
      recurse to travel the subtree. :)
   element
    {fn:node-name($element)}
    {$element/@*,transform_template($element/node())}
}
define function transform_dummy($element as element()) as text()
{
  "dummy"
}
define function transform_element ( $element as element())  as node()*
{
   (: branch to more specialized functions based on the type of element :)
   typeswitch ($element)
       case element(dummy)
           return transform_dummy($element)
       default
           return transform_default_element ($element)
}
define function transform_template ( $nodes as node()* )  as node()*
{

  for $node in $nodes
  return
      typeswitch($node)
          case element()
              return transform_element($node)
           default
               (: PIs, text and comment nodes are outputted here :)
               return $node
}

(: module start :)
let $para := xdmp:unquote("<p><dummy/><dummy/></p>")
return transform_template($para/node())

-----------------------------------------
Code snippet 2: explicit creation of text node in transform_dummy,
returns <p>dummydummy</p>
------------------------------------------

define function transform_default_element($element as element()) as node()
{
   (: create a new element with the same name and attributes and
      recurse to travel the subtree. :)
   element
    {fn:node-name($element)}
    {$element/@*,transform_template($element/node())}
}
define function transform_dummy($element as element()) as text()
{
  (: explicitly create a text node before returning :)
  text { "dummy" }
}
define function transform_element ( $element as element())  as node()*
{
   (: branch to more specialized functions based on the type of element :)
   typeswitch ($element)
       case element(dummy)
           return transform_dummy($element)
       default
           return transform_default_element ($element)
}
define function transform_template ( $nodes as node()* )  as node()*
{

  for $node in $nodes
  return
      typeswitch($node)
          case element()
              return transform_element($node)
           default
               (: PIs, text and comment nodes are outputted here :)
               return $node
}

(: module start :)

let $para := xdmp:unquote("<p><dummy/><dummy/></p>")
return transform_template($para/node())

---------------------------------------------------------------------------
George Florentine

[EMAIL PROTECTED]
 O:  303.542.2173
 C:  303.669.8628
 F:  303.544.0522
 www.FlatironsSolutions.com
An Inc. 500 Company


_______________________________________________
General mailing list
[email protected]
http://xqzone.com/mailman/listinfo/general


--
David Sewell, Editorial and Technical Manager
ROTUNDA, The University of Virginia Press
PO Box 801079, Charlottesville, VA 22904-4318 USA
Courier: 310 Old Ivy Way, Suite 302, Charlottesville VA 22903
Email: [EMAIL PROTECTED]   Tel: +1 434 924 9973
Web: http://rotunda.upress.virginia.edu/