Hello,

There is a bug in MarkLogic (ML) when you add an xml:lang attribute
programmatically. In a certain case, text in existing child elements is
still indexed in the previous language.

The following script reproduces the bug:

xquery version "1.0-ml";

(: We insert a test document. The database language has to be "english".
   The <foo> child is important to reproduce the bug (text on the root
level is indexed correctly). :)
xdmp:document-insert("/xml-lang-test.xml", <root><foo>text</foo></root>)
;
(: We expect "1". Correct. :)
"test1: " || count(cts:search(/, cts:word-query("text", ("lang=en"))))
;
(: We add an xml:lang attribute to the root node (xml:lang is inherited by
definition). :)
xdmp:node-insert-child(
  doc("/xml-lang-test.xml")/node(),
  attribute { xs:QName("xml:lang") } {"de"})
;
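(: Optional sanity check, not part of the original report: fn:lang() is the
   standard XPath function that tests the (inherited) xml:lang of a node.
   Assuming the attribute was added as intended, this should report true for
   <foo>, which would suggest that only the index is stale, not the document
   itself. :)
"check: " || fn:lang("de", doc("/xml-lang-test.xml")/root/foo)
;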
(: We search with lang=de and expect again "1" but this time the document
is NOT FOUND - BUG! :)
"test2: " || count(cts:search(/, cts:word-query("text", ("lang=de"))))
;
(: It was not found with lang=de because it is still indexed as English
text, as we learn from this query. :)
"test3: " || count(cts:search(/, cts:word-query("text", ("lang=en"))))
;
(: Workaround: We replace the xml:lang attribute with itself. :)
xdmp:node-replace(
  doc("/xml-lang-test.xml")/node()/@xml:lang,
  attribute { xs:QName("xml:lang") } {"de"})
;
(: Now the search with lang=de works. :)
"test4: " || count(cts:search(/, cts:word-query("text", ("lang=de"))))
;

xdmp:document-delete("/xml-lang-test.xml")

Is this bug known? Can you tell me when it will be fixed?

Thanks,
Andreas

-- 
Andreas Hubmer
IT Consultant

EBCONT enterprise technologies GmbH
Millennium Tower
Handelskai 94-96
A-1200 Vienna

Web: http://www.ebcont.com

OUR TEAM IS YOUR SUCCESS

UID-Nr. ATU68135644
HG St.Pölten - FN 399978 d