David E. Wheeler wrote:
Replying to myself, but I've made some local changes (see other messages) and just wanted to follow up on some of my own comments.

On Jul 2, 2008, at 21:38, David E. Wheeler wrote:

4) Operator = citext_eq is not correct. See comment http://doxygen.postgresql.org/varlena_8c.html#8621d064d14f259c594e4df3c1a64cac

So should citextcmp() call strncmp() instead of varstr_cmp()? The latter is what I saw in varlena.c.

I'm guessing that the answer is "no," since varstr_cmp() uses strncmp() internally, as appropriate to the locale. Correct?

You have to use varstr_cmp() in citextcmp(). Your code is correct, because the
<, <=, >= and > operators need a collation-sensitive function.

You only need to change the citext_cmp function to use strncmp() or to call the texteq function.

There must be a difference between equality and collation. For example, in Czech 'láska' and 'laská' are different words, which means that 'láska' != 'laská', but there is no difference between them in collation order. See the Unicode Collation Algorithm for details.
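
To illustrate the distinction, here is a minimal sketch in plain C. It is not the actual citext or varlena.c code: fold_lower(), sketch_cmp() and sketch_eq() are hypothetical names, the case folding is ASCII-only for brevity, and the strings are assumed to be NUL-terminated. The ordering function goes through strcoll(), so it follows the current locale, while the equality function compares bytes and never treats two different strings as equal.

```c
#include <ctype.h>
#include <stdbool.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: return a lower-cased copy of s (caller frees).
 * Folding is byte-wise, so only ASCII letters change in the C locale. */
static char *fold_lower(const char *s)
{
    size_t len = strlen(s);
    char *out = malloc(len + 1);

    if (out == NULL)
        abort();                /* keep the sketch simple on out-of-memory */
    for (size_t i = 0; i < len; i++)
        out[i] = (char) tolower((unsigned char) s[i]);
    out[len] = '\0';
    return out;
}

/* Ordering (<, <=, >=, >): collation-sensitive, uses the current locale. */
static int sketch_cmp(const char *a, const char *b)
{
    char *la = fold_lower(a);
    char *lb = fold_lower(b);
    int result = strcoll(la, lb);

    free(la);
    free(lb);
    return result;
}

/* Equality (=): byte-wise after folding, independent of collation. */
static bool sketch_eq(const char *a, const char *b)
{
    char *la = fold_lower(a);
    char *lb = fold_lower(b);
    bool equal = (strcmp(la, lb) == 0);

    free(la);
    free(lb);
    return equal;
}
```

With this split, sketch_eq("láska", "laská") is false in every locale, because the folded byte sequences differ, while the result of sketch_cmp() for the same pair depends on the collation in effect.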

I'll leave the collation stuff to the functions I call (*far* from my specialty), but I'll add a test for this and make sure it works as expected. Um, although, with what collation should it be tested? The tests I wrote assume en_US.UTF-8.

I added this test and it passes:

SELECT isnt( 'láska'::citext, 'laská'::citext, 'Different accented characters should not be equivalent' );

I think that this test will always work correctly for en_US.UTF-8. I guess the test makes sense only when the Czech collation (cs_CZ.UTF-8) is selected, but unfortunately you cannot change the collation during your test :(.

I think the best solution for now is to keep the test and add a comment about the recommended collation for it.


                Zdenek

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)