Hi David,

I know the order of attributes carries no semantic meaning. I was merely suggesting it as an attempt to bypass the bug.
For the record, I filed bug #34234.

Cheers,
Geert

On 6/24/15, 2:59 PM, "David Lee" <david....@marklogic.com> wrote:

>The order of attributes is explicitly something that should not and
>cannot be used in a way that is semantically different.
>Going down the path of either trying to get the attributes in the 'right'
>order or 'fixing' things that reorder attributes so they do so
>differently is futile.
>
>If there is an attribute related bug here, it would be that an
>attribute is applied which depends on another attribute value before all
>the attributes of an element are read.
>
>But I can't fully decipher the expected results since you are using an
>unfiltered query and I don't know the index settings of the database.
>Furthermore your elements have no content and you're not doing stemmed
>searching, you're using unfiltered searches, and searching on empty
>elements (that may or may not have significant whitespace) - many
>variables here - and according to the docs
>https://docs.marklogic.com/guide/search-dev/languages#id_91703
>there is a constrained range of expected behavior ...
>
>xml:lang applies only to 'the text children' -- so this test case seems
>to fall in a somewhat undefined area -
>
>"All of the text node children and text node descendants of an element
>with an xml:lang attribute are treated as the language specified in the
>xml:lang attribute, unless a child element has an xml:lang attribute with
>a different value. If so, any text node children and text node
>descendants are treated as the new language, and so on until no other
>xml:lang attributes are encountered."
>
>"Any content within an element having an xml:lang attribute is indexed in
>that language. Additionally, the xml:lang value is inherited by all of
>the descendants of that element, until another xml:lang value is
>encountered."
>
>-----------------------------------------------------------------------------
>David Lee
>Lead Engineer
>MarkLogic Corporation
>d...@marklogic.com
>Phone: +1 812-482-5224
>Cell: +1 812-630-7622
>www.marklogic.com
>
>-----Original Message-----
>From: general-boun...@developer.marklogic.com
>[mailto:general-boun...@developer.marklogic.com] On Behalf Of Geert Josten
>Sent: Wednesday, June 24, 2015 3:40 AM
>To: MarkLogic Developer Discussion
>Subject: Re: [MarkLogic Dev General] Indexing strategy for attributes
>when using xdmp:xlst-invoke
>
>Hi Johan,
>
>I will file a bug. Can you tell which version of MarkLogic you are
>running exactly?
>
>Not uncommonly, XSLT transforms like below reorder attributes. Does it
>make a difference if you try to get the xml:lang attribute first in the
>XSLT output?
>
>Last but not least, is this related to a customer case? It will push up
>priority if it is. (You can let me know offline if necessary.)
>
>Cheers,
>Geert
>
>On 6/22/15, 5:10 PM, "Johan de Boer" <johan.de.b...@hinttech.com> wrote:
>
>>Hi,
>>
>>I have discovered that when you use a stylesheet with xdmp:xslt-invoke
>>to transform your document content, in some circumstances attributes are
>>not indexed as you might expect.
>>
>>- If within an element an attribute x appears before the xml:lang
>>attribute then this attribute x is indexed based on the default
>>language of the database.
>>- If within an element an attribute x appears after the xml:lang
>>attribute then this attribute x is indexed based on the language in
>>this previous xml:lang attribute.
>>
>>Because the default language of the database can differ from the
>>language in the xml:lang attribute, values for attribute x can be found
>>within different languages.
>>
>>After reindexing the database all these attributes x are indexed
>>according to the xml:lang attribute that appears within the same
>>element.
>>
>>This appears in both MarkLogic 7 and MarkLogic 8.
>>
>>Although this problem can easily be avoided, does anyone know if a
>>certain option within the stylesheet should be used to avoid this? Or
>>might this perhaps be a bug?
>>
>>An example is given below:
>>
>>xquery version "1.0-ml";
>>declare namespace html = "http://www.w3.org/1999/xhtml";
>>import module namespace search = "http://marklogic.com/appservices/search"
>>  at "/MarkLogic/appservices/search/search.xqy";
>>
>>declare variable $SEARCH-OPTIONS :=
>>  <options xmlns="http://marklogic.com/appservices/search">
>>    <search-option>unfiltered</search-option>
>>    <return-query>true</return-query>
>>    <return-results>true</return-results>
>>
>>    <constraint name="type-de">
>>      <word>
>>        <attribute ns="" name="type"/>
>>        <element ns="" name="bar"/>
>>        <term-option>lang=de</term-option>
>>      </word>
>>    </constraint>
>>    <constraint name="type-en">
>>      <word>
>>        <attribute ns="" name="type"/>
>>        <element ns="" name="bar"/>
>>        <term-option>lang=en</term-option>
>>      </word>
>>    </constraint>
>>  </options>;
>>
>>let $content1 :=
>><foo>
>>  <bar type="abc" xml:lang="de">
>>  </bar>
>></foo>
>>
>>let $content2 :=
>><foo>
>>  <bar xml:lang="de" type="def">
>>  </bar>
>></foo>
>>
>>(: default database language is 'en' :)
>>
>>(: copy-and-paste.xsl is a stylesheet:
>>
>><xsl:stylesheet version="2.0"
>>    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
>>  <xsl:template match="@*|node()">
>>    <xsl:copy>
>>      <xsl:apply-templates select="@*|node()" />
>>    </xsl:copy>
>>  </xsl:template>
>></xsl:stylesheet>
>>:)
>>
>>(: Run 1: I add two documents :)
>>
>>(:
>>let $_ := xdmp:document-insert("/test/foo1", $content1)
>>let $_ := xdmp:document-insert("/test/foo2", $content2)
>>return "inserted documents 1 and 2"
>>:)
>>
>>(: Run 2: I check the number of documents found in each language after
>>run 1 :)
>>
>>(:
>>let $found-de-abc := search:search("type-de:abc", $SEARCH-OPTIONS)/@total
>>let $found-en-abc := search:search("type-en:abc", $SEARCH-OPTIONS)/@total
>>let $found-de-def := search:search("type-de:def", $SEARCH-OPTIONS)/@total
>>let $found-en-def := search:search("type-en:def", $SEARCH-OPTIONS)/@total
>>return fn:concat("Language 'de'/'abc' : ", $found-de-abc,
>>  " and language 'en'/'abc' : ", $found-en-abc,
>>  " and language 'de'/'def' : ", $found-de-def,
>>  " and language 'en'/'def' : ", $found-en-def)
>>:)
>>
>>(: Run 2 returns:
>>Language 'de'/'abc' : 1 and language 'en'/'abc' : 0 and language
>>'de'/'def' : 1 and language 'en'/'def' : 0
>>:)
>>
>>(: Run 3: I add two more documents based on the previous documents
>>using xdmp:xslt-invoke and the stylesheet :)
>>
>>(:
>>let $content3 := xdmp:xslt-invoke("/app/xsl/copy-and-paste.xsl",
>>  fn:doc("/test/foo1"))
>>let $content4 := xdmp:xslt-invoke("/app/xsl/copy-and-paste.xsl",
>>  fn:doc("/test/foo2"))
>>let $_ := xdmp:document-insert("/test/foo3", $content3)
>>let $_ := xdmp:document-insert("/test/foo4", $content4)
>>return "inserted documents 3 and 4"
>>:)
>>
>>(: Run 4: I check the number of documents found in each language after
>>runs 1 and 3 :)
>>
>>(:
>>let $found-de-abc := search:search("type-de:abc", $SEARCH-OPTIONS)/@total
>>let $found-en-abc := search:search("type-en:abc", $SEARCH-OPTIONS)/@total
>>let $found-de-def := search:search("type-de:def", $SEARCH-OPTIONS)/@total
>>let $found-en-def := search:search("type-en:def", $SEARCH-OPTIONS)/@total
>>return fn:concat("Language 'de'/'abc' : ", $found-de-abc,
>>  " and language 'en'/'abc' : ", $found-en-abc,
>>  " and language 'de'/'def' : ", $found-de-def,
>>  " and language 'en'/'def' : ", $found-en-def)
>>:)
>>
>>(: Run 4 returns:
>>Language 'de'/'abc' : 1 and language 'en'/'abc' : 1 and language
>>'de'/'def' : 2 and language 'en'/'def' : 0
>>:)
>>
>>(: Then I reindex the database :)
>>
>>(: Run 5: I check the number of documents found in each language after
>>reindex :)
>>
>>let $found-de-abc := search:search("type-de:abc", $SEARCH-OPTIONS)/@total
>>let $found-en-abc := search:search("type-en:abc", $SEARCH-OPTIONS)/@total
>>let $found-de-def := search:search("type-de:def", $SEARCH-OPTIONS)/@total
>>let $found-en-def := search:search("type-en:def", $SEARCH-OPTIONS)/@total
>>return fn:concat("Language 'de'/'abc' : ", $found-de-abc,
>>  " and language 'en'/'abc' : ", $found-en-abc,
>>  " and language 'de'/'def' : ", $found-de-def,
>>  " and language 'en'/'def' : ", $found-en-def)
>>
>>(: Run 5 returns:
>>Language 'de'/'abc' : 2 and language 'en'/'abc' : 0 and language
>>'de'/'def' : 2 and language 'en'/'def' : 0
>>:)
>>
>>
>>Thanks,
>>
>>Johan de Boer
>>_______________________________________________
>>General mailing list
>>General@developer.marklogic.com
>>Manage your subscription at:
>>http://developer.marklogic.com/mailman/listinfo/general