Re: [HACKERS] PATCH: CITEXT 2.0

David E. Wheeler Fri, 04 Jul 2008 23:12:50 -0700

Replying to myself, but I've made some local changes (see other messages) and just wanted to follow up on some of my own comments.


On Jul 2, 2008, at 21:38, David E. Wheeler wrote:

4) Operator =  citext_eq is not correct. See comment 
http://doxygen.postgresql.org/varlena_8c.html#8621d064d14f259c594e4df3c1a64cac
So should citextcmp() call strncmp() instead of varstr_cmp()? The latter is what I saw in varlena.c.

I'm guessing that the answer is "no," since varstr_cmp() uses strncmp() internally, as appropriate to the locale. Correct?

There must be a difference between equality and collation. For example, in the Czech language 'láska' and 'laská' are different words, which means that 'láska' != 'laská'. But there is no difference in collation order. See Unicode Universal Collation Algorithm for details.
I'll leave the collation stuff to the functions I call (*far* from my specialty), but I'll add a test for this and make sure it works as expected. Um, although, with what collation should it be tested? The tests I wrote assume en_US.UTF-8.


I added this test and it passes:

SELECT isnt( 'láska'::citext, 'laská'::citext, 'Different accented characters should not be equivalent' );
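
For anyone without the test helpers loaded, the same check can be sketched in plain SQL. This assumes the citext type from this patch is installed and the database uses a UTF-8 encoding; the column aliases are made up for illustration.

```sql
-- Sketch only: assumes the citext type is installed and the
-- database encoding is UTF-8. Column aliases are illustrative.
SELECT 'ABC'::citext   =  'abc'::citext   AS case_is_ignored,
       'láska'::citext <> 'laská'::citext AS accents_still_differ;
```

The first column exercises the case-insensitive comparison; the second is the plain-operator form of the isnt() test above.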

5) There are several commented-out lines in the CREATE OPERATOR statements, mostly related to NEGATOR. Is there some reason for that?
I copied it from the original citext.sql. Not sure what effect it has.


I restored these (and one of them was wrong anyway).

Also, OPERATOR || probably has the wrong negator.


Right, good catch.

Stupid question: What would the negation of || actually be? There isn't one, is there?
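
For reference, the PostgreSQL documentation describes a negator only for pairs of operators that both return boolean, so an operator that returns text, as concatenation does, would not have one. A minimal sketch of where NEGATOR does apply follows; citext_eq is the function named earlier in this thread, while citext_ne is an assumed name for its counterpart and may not match the patch.

```sql
-- Sketch only: citext_ne is an assumed function name.
-- Both operators return boolean, so each can name the other as its negator:
-- (a = b) is equivalent to NOT (a <> b).
CREATE OPERATOR = (
    LEFTARG    = citext,
    RIGHTARG   = citext,
    COMMUTATOR = =,
    NEGATOR    = <>,
    PROCEDURE  = citext_eq,
    RESTRICT   = eqsel,
    JOIN       = eqjoinsel
);

CREATE OPERATOR <> (
    LEFTARG    = citext,
    RIGHTARG   = citext,
    COMMUTATOR = <>,
    NEGATOR    = =,
    PROCEDURE  = citext_ne,
    RESTRICT   = neqsel,
    JOIN       = neqjoinsel
);
```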


Thanks!

David
