On 14.11.2013, at 20:39, James Ball <[email protected]> wrote:
> Hello,
>
> I'm doing some work matching between XML documents - one set has no
> characters outside the basic ASCII range while the other has a mix of Ø
> and Ö and lots of others. Some are in UPPER case and some in Mixed. I need to
> match a "James" in one file to "JAMES" in another and so on. To do the
> comparisons I've been looking at BaseX's support for collations.
>
> Following the example in the documentation like this works perfectly:
>
> declare default collation 'http://basex.org/collation?strength=primary';
> "Straße" = "Strasse",
> "Jérome" = "Jerome",
> "James" = "JAMES"
>
> But it doesn't work when testing attribute (or node) values in a statement
> like this:
>
> declare default collation 'http://basex.org/collation?strength=primary';
> let $doc := doc('
> <root>
> <test name="Straße">Straße</test>
> <test name="Strasse">Strasse</test>
> </root>
> ')
> return count($doc/root/test[@name = "Strasse"])

Applying @name/data() already helps:

declare default collation 'http://basex.org/collation?strength=primary';
let $doc := doc('
<root>
<test name="Straße">Straße</test>
<test name="Strasse">Strasse</test>
</root>
')
return count($doc/root/test[@name/data() = "Strasse"])

-> 2

> I would expect that to return a count of 2 but it returns a count of 1.
>
> I can get round this by calling fn:compare() like this but it feels like a
> hack:
>
> declare default collation 'http://basex.org/collation?strength=primary';
> let $doc := doc('
> <root>
> <test name="Straße">Straße</test>
> <test name="Strasse">Strasse</test>
> </root>
> ')
> return count($doc/root/test[0=fn:compare(@name,"Strasse")])
>
> Is this behaviour as intended? I can see that it might make query speed and
> indexes much better to ignore collation for = but I couldn't find it stated
> in the documentation. My quick read of the specification suggested that the
> operation of fn:compare would drive the behaviour of eq, gt, lt etc.
>
> I think that I'm probably doing this completely the wrong way and I should be
> using some of the other features of Full-Text but I'm not sure. If anyone can
> point me in the right direction I will be very grateful.
>
> Many thanks, James

_______________________________________________
BaseX-Talk mailing list
[email protected]
https://mailman.uni-konstanz.de/mailman/listinfo/basex-talk

