David E. Wheeler wrote:
Replying to myself, but I've made some local changes (see other
messages) and just wanted to follow up on some of my own comments.
On Jul 2, 2008, at 21:38, David E. Wheeler wrote:
4) Operator = citext_eq is not correct. See comment
http://doxygen.postgresql.org/varlena_8c.html#8621d064d14f259c594e4df3c1a64cac
So should citextcmp() call strncmp() instead of varstr_cmp()? The
latter is what I saw in varlena.c.
I'm guessing that the answer is "no," since varstr_cmp() uses strncmp()
internally, as appropriate to the locale. Correct?
You have to use varstr_cmp() in citextcmp(). Your code is correct, because the
< <= >= > operators need a collation-sensitive function.
You only need to change the citext_eq function to use strncmp() or to call the
texteq function.
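
The split described above (byte-wise comparison for equality, collation-aware
comparison for ordering) might look roughly like the sketch below. It is not
the citext source: fold_ascii(), sketch_eq() and sketch_cmp() are hypothetical
names, the case fold is ASCII-only for brevity, and plain strcoll() stands in
for varstr_cmp().

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical ASCII-only case fold (assumption: the real module folds
 * case with locale-aware routines). Returns a malloc'd copy that the
 * caller must free. */
static char *fold_ascii(const char *s)
{
    assert(s != NULL);
    size_t len = strlen(s);
    char *out = malloc(len + 1);
    if (out == NULL)
        abort();
    /* i <= len so that the terminating NUL is copied as well */
    for (size_t i = 0; i <= len; i++) {
        char c = s[i];
        out[i] = (c >= 'A' && c <= 'Z') ? (char) (c - 'A' + 'a') : c;
    }
    return out;
}

/* Equality: byte-for-byte after folding; no collation is involved. */
int sketch_eq(const char *a, const char *b)
{
    char *fa = fold_ascii(a);
    char *fb = fold_ascii(b);
    size_t la = strlen(fa);
    size_t lb = strlen(fb);
    int equal = (la == lb) && memcmp(fa, fb, la) == 0;

    free(fa);
    free(fb);
    return equal;
}

/* Ordering: collation-aware, for the < <= >= > operators.
 * strcoll() is a stand-in for varstr_cmp() here. */
int sketch_cmp(const char *a, const char *b)
{
    char *fa = fold_ascii(a);
    char *fb = fold_ascii(b);
    int result = strcoll(fa, fb);

    free(fa);
    free(fb);
    return result;
}
```

With this split, sketch_eq() never reports two different byte sequences as
equal, whatever the collation says about their order.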
There must be a difference between equality and collation. For example,
in Czech 'láska' and 'laská' are different words, which means
that 'láska' != 'laská'. But there is no difference in collation
order. See the Unicode Collation Algorithm for details.
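
To see the two notions side by side, a small probe along these lines could
compare a pair of strings both byte-wise and under a named collation. This is
only a sketch: bytes_equal() and collate_under_locale() are hypothetical
helpers, and whether a locale such as cs_CZ.UTF-8 is installed (and what
strcoll() returns under it) depends on the system.

```c
#include <assert.h>
#include <locale.h>
#include <stdlib.h>
#include <string.h>

/* Byte-wise equality: the same length and the same bytes. */
int bytes_equal(const char *a, const char *b)
{
    size_t alen = strlen(a);
    size_t blen = strlen(b);

    return alen == blen && memcmp(a, b, alen) == 0;
}

/* Hypothetical helper: compare a and b with strcoll() under the named
 * LC_COLLATE locale, then restore the previous setting.
 * Returns 1 and stores the strcoll() result in *result on success,
 * or 0 if the locale is not available on this system. */
int collate_under_locale(const char *locale_name,
                         const char *a, const char *b, int *result)
{
    assert(result != NULL);

    /* Copy the current setting; the pointer returned by setlocale()
     * may be invalidated by the next setlocale() call. */
    const char *current = setlocale(LC_COLLATE, NULL);
    char *saved = NULL;
    if (current != NULL) {
        saved = malloc(strlen(current) + 1);
        if (saved == NULL)
            return 0;
        strcpy(saved, current);
    }

    if (setlocale(LC_COLLATE, locale_name) == NULL) {
        free(saved);
        return 0;               /* locale not installed */
    }

    *result = strcoll(a, b);

    if (saved != NULL)
        setlocale(LC_COLLATE, saved);
    free(saved);
    return 1;
}
```

Calling collate_under_locale("cs_CZ.UTF-8", ...) on the UTF-8 encodings of
'láska' and 'laská' shows what the Czech collation reports on a given machine,
while bytes_equal() on the same pair returns 0 regardless of locale.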
I'll leave the collation stuff to the functions I call (*far* from my
specialty), but I'll add a test for this and make sure it works as
expected. Um, although, with what collation should it be tested? The
tests I wrote assume en_US.UTF-8.
I added this test and it passes:
SELECT isnt( 'láska'::citext, 'laská'::citext, 'Different accented characters should not be equivalent' );
I think this test will always work correctly under en_US.UTF-8. I
guess the test only makes sense when the Czech collation (cs_CZ.UTF-8) is
selected, but unfortunately you cannot change the collation during your test :(.
I think the best solution for now is to keep the test and add a comment about
the recommended collation for this test.
Zdenek
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)