I knew there'd be something simple I was missing.

Thank you!

James

On 15 Nov 2013, at 07:40, Alexander Holupirek 
<[email protected]> wrote:

> 
> On 14.11.2013, at 20:39, James Ball <[email protected]> wrote:
> 
>> Hello,
>> 
>> I'm doing some work matching between XML documents - one set has no 
>> characters outside the basic ASCII range while the other has a mix of Ø 
>> and Ö and lots of others. Some are in UPPER case and some in Mixed. I need 
>> to match a "James" in one file to "JAMES" in another and so on. To do the 
>> comparisons I've been looking at BaseX's support for collations.
>> 
>> Following the example in the documentation, this works perfectly:
>> 
>> declare default collation 'http://basex.org/collation?strength=primary';
>> "Straße" = "Strasse",
>> "Jérome" = "Jerome",
>> "James" = "JAMES"
>> 
>> But it doesn't work when testing attribute (or node) values in a statement 
>> like this:
>> 
>> declare default collation 'http://basex.org/collation?strength=primary';
>> let $doc := doc('
>>              <root>
>>                      <test name="Straße">Straße</test>
>>                      <test name="Strasse">Strasse</test>
>>              </root>
>> ')
>> return count($doc/root/test[@name = "Strasse"])
> 
> Applying @name/data() already helps:
> 
> declare default collation 'http://basex.org/collation?strength=primary';
> let $doc := doc('
>               <root>
>                       <test name="Straße">Straße</test>
>                       <test name="Strasse">Strasse</test>
>               </root>
> ')
> return count($doc/root/test[@name/data() = "Strasse"])
> -> 2
> 
>> I would expect that to return a count of 2 but it returns a count of 1.
>> 
>> I can get round this by calling fn:compare() like this but it feels like a 
>> hack:
>> 
>> declare default collation 'http://basex.org/collation?strength=primary';
>> let $doc := doc('
>>              <root>
>>                      <test name="Straße">Straße</test>
>>                      <test name="Strasse">Strasse</test>
>>              </root>
>> ')
>> return count($doc/root/test[0=fn:compare(@name,"Strasse")])
>> 
>> Is this behaviour as intended? I can see that it might make query speed and 
>> indexes much better to ignore collation for = but I couldn't find it stated 
>> in the documentation. My quick read of the specification suggested that the 
>> operation of fn:compare would drive the behaviour of eq, gt, lt etc.
>> 
>> I think that I'm probably doing this completely the wrong way and I should 
>> be using some of the other features of Full-Text but I'm not sure. If anyone 
>> can point me in the right direction I will be very grateful.
>> 
>> Many thanks, James
> 
> 
> 
> 
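For reference, here is a minimal sketch of a variant that does not rely on the default collation declaration: fn:compare() also accepts a collation URI as a third argument. The URI is the one used in the thread. The variable name $collation and the inline document constructor are assumptions made for illustration, and the result depends on how the processor implements the collation.

```xquery
(: Hypothetical variable name; the URI is the BaseX collation used in the thread. :)
declare variable $collation := 'http://basex.org/collation?strength=primary';

let $doc := document {
  <root>
    <test name="Straße">Straße</test>
    <test name="Strasse">Strasse</test>
  </root>
}
(: compare() returns 0 when both strings are equal under the given collation :)
return count($doc/root/test[compare(@name, 'Strasse', $collation) = 0])
```

If the collation treats "ß" and "ss" as equal, as the documentation example quoted in the thread indicates, this should count both test elements.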

_______________________________________________
BaseX-Talk mailing list
[email protected]
https://mailman.uni-konstanz.de/mailman/listinfo/basex-talk
