Hi David,

I know the order of attributes makes no semantic difference. I was merely
suggesting it as an attempt to bypass the bug.
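
For illustration, a minimal sketch of that workaround: a variant of Johan's
copy-and-paste.xsl identity transform that copies xml:lang before any other
attribute. It assumes an XSLT 2.0 processor that writes attributes in the
order they are created; the XSLT spec leaves that order
implementation-dependent, so this is a possible way around the symptom, not a
guaranteed one.

```xslt
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
    <!-- Identity transform that emits xml:lang first.
         Assumption: the processor keeps attributes in creation order. -->
    <xsl:template match="@*|node()">
        <xsl:copy>
            <!-- 1. xml:lang, if present, before anything else -->
            <xsl:apply-templates select="@xml:lang"/>
            <!-- 2. all remaining attributes -->
            <xsl:apply-templates select="@* except @xml:lang"/>
            <!-- 3. child nodes -->
            <xsl:apply-templates select="node()"/>
        </xsl:copy>
    </xsl:template>
</xsl:stylesheet>
```

It would be invoked the same way as in Johan's Run 3, for example
xdmp:xslt-invoke("/app/xsl/lang-first.xsl", fn:doc("/test/foo1")), where the
path /app/xsl/lang-first.xsl is hypothetical.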

For the record, I filed bug #34234.

Cheers,
Geert

On 6/24/15, 2:59 PM, "David Lee" <david....@marklogic.com> wrote:

>The order of attributes is explicitly something that should not and
>cannot be used in a way that is semantically different.
>Going down the path of either trying to get the attributes in the 'right'
>order, or 'fixing' things that reorder attributes so they order them
>differently, is futile.
>
>If there is an attribute-related bug here, it would be that an
>attribute which depends on another attribute's value is applied before all
>the attributes of an element are read.
>
>But I can't fully decipher the expected results, since you are using an
>unfiltered query and I don't know the index settings of the database.
>Furthermore, your elements have no content, you're not doing stemmed
>searching, you're using unfiltered searches, and you're searching on empty
>elements (that may or may not have significant whitespace). There are many
>variables here, and according to the docs
>https://docs.marklogic.com/guide/search-dev/languages#id_91703
>there is a constrained range of expected behavior ...
>
>
>xml:lang applies only to 'the text children', so this test case seems
>to fall in a somewhat undefined area:
>
>
>"All of the text node children and text node descendants of an element
>with an xml:lang attribute are treated as the language specified in the
>xml:lang attribute, unless a child element has an xml:lang attribute with
>a different value. If so, any text node children and text node
>descendants are treated as the new language, and so on until no other
>xml:lang attributes are encountered."
>
>"Any content within an element having an xml:lang attribute is indexed in
>that language. Additionally, the xml:lang value is inherited by all of
>the descendants of that element, until another xml:lang value is
>encountered."
>
>--------------------------------------------------------------------------
>---
>David Lee
>Lead Engineer
>MarkLogic Corporation
>d...@marklogic.com
>Phone: +1 812-482-5224
>Cell:  +1 812-630-7622
>www.marklogic.com
>
>-----Original Message-----
>From: general-boun...@developer.marklogic.com
>[mailto:general-boun...@developer.marklogic.com] On Behalf Of Geert Josten
>Sent: Wednesday, June 24, 2015 3:40 AM
>To: MarkLogic Developer Discussion
>Subject: Re: [MarkLogic Dev General] Indexing strategy for attributes
>when using xdmp:xlst-invoke
>
>Hi Johan,
>
>I will file a bug. Can you tell which version of MarkLogic you are
>running exactly?
>
>XSLT transforms like the one below quite commonly reorder attributes. Does it
>make a difference if you try to get the xml:lang attribute first in the
>XSLT output?
>
>Last but not least, is this related to a customer case? It will push up
>priority if it is. (You can let me know offline if necessary.)
>
>Cheers,
>Geert
>
>On 6/22/15, 5:10 PM, "Johan de Boer" <johan.de.b...@hinttech.com> wrote:
>
>>Hi,
>>
>>I have discovered that when you use a stylesheet with xdmp:xslt-invoke
>>to transform your document content, in some circumstances attributes are
>>not indexed as you might expect.
>>
>>- If within an element an attribute x appears before the xml:lang
>>attribute then this attribute x is indexed based on the default
>>language of the database.
>>- If within an element an attribute x appears after the xml:lang
>>attribute then this attribute x is indexed based on the language in
>>this previous xml:lang attribute.
>>
>>Because the default language of the database can differ from the
>>language in the xml:lang attribute, values for attribute x can end up
>>indexed under different languages.
>>
>>After reindexing the database, all these attributes x are indexed
>>according to the xml:lang attribute that appears within the same
>>element.
>>
>>This occurs in both MarkLogic 7 and MarkLogic 8.
>>
>>Although this problem can easily be avoided, does anyone know if a
>>certain option within the stylesheet should be used to avoid this? Or
>>might this perhaps be a bug?
>>
>>An example is given below:
>>
>>xquery version "1.0-ml";
>>declare namespace html = "http://www.w3.org/1999/xhtml";
>>import module namespace search = "http://marklogic.com/appservices/search"
>>    at "/MarkLogic/appservices/search/search.xqy";
>>
>>declare variable $SEARCH-OPTIONS :=
>>    <options xmlns="http://marklogic.com/appservices/search">
>>        <search-option>unfiltered</search-option>
>>        <return-query>true</return-query>
>>        <return-results>true</return-results>
>>
>>        <constraint name="type-de">
>>            <word>
>>                <attribute ns="" name="type"/>
>>                <element ns="" name="bar"/>
>>                <term-option>lang=de</term-option>
>>            </word>
>>        </constraint>
>>        <constraint name="type-en">
>>            <word>
>>                <attribute ns="" name="type"/>
>>                <element ns="" name="bar"/>
>>                <term-option>lang=en</term-option>
>>            </word>
>>        </constraint>
>>    </options>;
>>
>>let $content1 :=
>><foo>
>>   <bar type="abc" xml:lang="de">
>>   </bar>
>></foo>
>>
>>let $content2 :=
>><foo>
>>   <bar xml:lang="de" type="def">
>>   </bar>
>></foo>
>>
>>(: default database language is 'en' :)
>>
>>(: copy-and-paste.xsl is a stylesheet:
>>
>><xsl:stylesheet version="2.0"
>>xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
>>    <xsl:template match="@*|node()">
>>        <xsl:copy>
>>            <xsl:apply-templates select="@*|node()" />
>>        </xsl:copy>
>>    </xsl:template>
>></xsl:stylesheet>
>>:)
>>
>>(: Run 1: I add two documents :)
>>
>>(:
>>let $_ := xdmp:document-insert("/test/foo1",$content1)
>>let $_ := xdmp:document-insert("/test/foo2",$content2)
>>return "inserted documents 1 and 2"
>>:)
>>
>>(: Run 2 : I check the number of documents found in each language after
>>run 1 :)
>>
>>(:
>>let $found-de-abc := search:search("type-de:abc", $SEARCH-OPTIONS)/@total
>>let $found-en-abc := search:search("type-en:abc", $SEARCH-OPTIONS)/@total
>>let $found-de-def := search:search("type-de:def", $SEARCH-OPTIONS)/@total
>>let $found-en-def := search:search("type-en:def", $SEARCH-OPTIONS)/@total
>>return fn:concat("Language 'de'/'abc' : ", $found-de-abc,
>>    " and language 'en'/'abc' : ", $found-en-abc,
>>    " and language 'de'/'def' : ", $found-de-def,
>>    " and language 'en'/'def' : ", $found-en-def)
>>:)
>>
>>(: Run 2 returns:
>>Language 'de'/'abc' : 1 and language 'en'/'abc' : 0 and language
>>'de'/'def' : 1 and language 'en'/'def' : 0
>>:)
>>
>>(: Run 3 : I add two more documents based on the previous documents
>>using xdmp:xslt-invoke and the stylesheet :)
>>
>>(:
>>let $content3 := xdmp:xslt-invoke("/app/xsl/copy-and-paste.xsl",
>>fn:doc("/test/foo1"))
>>let $content4 := xdmp:xslt-invoke("/app/xsl/copy-and-paste.xsl",
>>fn:doc("/test/foo2"))
>>let $_ := xdmp:document-insert("/test/foo3",$content3)
>>let $_ := xdmp:document-insert("/test/foo4",$content4)
>>return "inserted documents 3 and 4"
>>:)
>>
>>(: Run 4 : I check the number of documents found in each language after
>>runs 1 and 3 :)
>>
>>(:
>>let $found-de-abc := search:search("type-de:abc", $SEARCH-OPTIONS)/@total
>>let $found-en-abc := search:search("type-en:abc", $SEARCH-OPTIONS)/@total
>>let $found-de-def := search:search("type-de:def", $SEARCH-OPTIONS)/@total
>>let $found-en-def := search:search("type-en:def", $SEARCH-OPTIONS)/@total
>>return fn:concat("Language 'de'/'abc' : ", $found-de-abc,
>>    " and language 'en'/'abc' : ", $found-en-abc,
>>    " and language 'de'/'def' : ", $found-de-def,
>>    " and language 'en'/'def' : ", $found-en-def)
>>:)
>>
>>(: Run 4 returns:
>>Language 'de'/'abc' : 1 and language 'en'/'abc' : 1 and language
>>'de'/'def' : 2 and language 'en'/'def' : 0
>>:)
>>
>>(: Then I reindex the database :)
>>
>>(: Run 5 : I check the number of documents found in each language after
>>reindex :)
>>
>>let $found-de-abc := search:search("type-de:abc", $SEARCH-OPTIONS)/@total
>>let $found-en-abc := search:search("type-en:abc", $SEARCH-OPTIONS)/@total
>>let $found-de-def := search:search("type-de:def", $SEARCH-OPTIONS)/@total
>>let $found-en-def := search:search("type-en:def", $SEARCH-OPTIONS)/@total
>>return fn:concat("Language 'de'/'abc' : ", $found-de-abc,
>>    " and language 'en'/'abc' : ", $found-en-abc,
>>    " and language 'de'/'def' : ", $found-de-def,
>>    " and language 'en'/'def' : ", $found-en-def)
>>
>>(: Run 5 returns:
>>Language 'de'/'abc' : 2 and language 'en'/'abc' : 0 and language
>>'de'/'def' : 2 and language 'en'/'def' : 0
>>:)
>>
>>
>>Thanks,
>>
>>Johan de Boer
>>_______________________________________________
>>General mailing list
>>General@developer.marklogic.com
>>Manage your subscription at:
>>http://developer.marklogic.com/mailman/listinfo/general
>

