Reading through this thread... On Sat, Mar 26, 2011 at 12:36 AM, Tom Lane <t...@sss.pgh.pa.us> wrote: > ** Selecting a field from a record-returning function's output. > Currently, we'll use the field's declared collation; except that > if the field has default collation, we'll replace that with the common > collation of the function's inputs, if any. Is either part of that > sane? Do we need to make this work for functions invoked with other > syntax than a plain function call, eg operator or cast syntax?
There were a couple of different ideas about which way we ought to go with this, but I'm happy to defer to what Tom and Martijn hashed out: MO> That seems all a bit weird. I spent some time reading through the SQL MO> spec to see if I could came up with a few ideas about what they thought MO> relevent. I think the gist of it is that I think the result row should MO> have for each column its declared collation in all cases. TL> That interpretation would be fine with me. It would let us get rid of TL> the special-case code at lines 307-324 of parse_collate.c, which I put TL> in only because there are cases in the collate.linux.utf8.sql regression TL> test that fail without it. But I'm perfectly happy to conclude that TL> those test cases are mistaken. I'm not sure whether that's been done, though, or whether we're even going to do it. > ** What to do with domains whose declaration includes a COLLATE clause? > Currently, we'll impute that collation to the result of a cast to the > domain type --- even if the cast's input expression includes an > explicit COLLATE clause. It's not clear that that's per spec. If it > is correct, should we behave similarly for functions that are declared > to return a domain type? Should it matter if the cast-to-domain is > explicit or implicit? Perhaps it'd be best if domain collations only > mattered for columns declared with that domain type. Then we'd have > a general rule that collations only come into play in an expression > as a result of (a) the declared type of a column reference or (b) > an explicit COLLATE clause. I think we had agreement than a cast to a domain type with a collation should stomp on any existing collation on the contained expression. > * In plpgsql, is it OK for declared local variables to inherit the > function's input collation? Should we provide a COLLATE option in > variable declarations to let that be overridden? I think everyone who responded said "yes" to both questions. > * RI triggers should insert COLLATE clauses in generated queries to > satisfy SQL2008 9.13 SR 4a, which says that RI comparisons use the > referenced column's collation. Right now you may get either table's > collation depending on which query type is involved. I think an obvious > failure may not be possible so long as equality means the same thing in > all collations, but it's definitely possible that the planner might > decide it can't use the referenced column's unique index, which would > suck for performance. (Note: this rule seems to prove that the > committee assumes equality can mean different things in different > collations, else they'd not have felt the need to specify.) I tested this and, indeed, if the collations don't match, the index can't be used. CREATE TABLE me (x character varying COLLATE "en_US"); CREATE TABLE me2 (x character varying COLLATE "es_ES"); CREATE TABLE you (x character varying COLLATE "es_ES" NOT NULL, PRIMARY KEY (x)); ALTER TABLE me ADD FOREIGN KEY (x) REFERENCES you(x); ALTER TABLE me2 ADD FOREIGN KEY (x) REFERENCES you(x); SET enable_seqscan=false; SET enable_hashjoin=false; With that setup, this still does a seqscan-and-sort: EXPLAIN select * from me, you where me.x = you.x; But this uses the index: EXPLAIN select * from me2, you where me2.x = you.x; I found another problem, too: rhaas=# insert into you values ('1'); INSERT 0 1 rhaas=# insert into me values ('1'); INSERT 0 1 rhaas=# alter table me alter column x set data type varchar collate "en_GB"; ERROR: could not determine which collation to use for string comparison HINT: Use the COLLATE clause to set the collation explicitly. CONTEXT: SQL statement "SELECT fk."x" FROM ONLY "public"."me" fk LEFT OUTER JOIN ONLY "public"."you" pk ON ( pk."x"::pg_catalog.text OPERATOR(pg_catalog.=) fk."x"::pg_catalog.text) WHERE pk."x" IS NULL AND (fk."x" IS NOT NULL)" > * It'd sure be nice if we had some nontrivial test cases that work in > encodings besides UTF8. I'm still bothered that the committed patch > failed to cover single-byte-encoding cases in upper/lower/initcap. Seems like no one knows how to do this. > * Remove initdb's warning about useless locales? Seems like pointless > noise, or at least something that can be relegated to debug mode. Everyone was in favor of this. > * Is it worth adding a cares-about-collation flag to pg_proc? Probably > too late to be worrying about such refinements for 9.1. Probably too late for this. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers