On Jul 29, 2008, at 14:00, Tom Lane wrote:

Well, a rough estimate of the places where implicit coercion to text
might be relevant to resolving ambiguity is

select proname from pg_proc
 where 'text'::regtype = any(proargtypes)
 group by proname having count(*) > 1;

select oprname from pg_operator
 where oprleft = 'text'::regtype or oprright = 'text'::regtype
 group by oprname having count(*) > 1;

I count 37 functions and 10 operators as of CVS HEAD.  Perhaps not all
would need to be fixed in practical use, but if you wanted seamless
integration of citext it's quite possible that you'd need alias
functions/operators (maybe more than one) in each of those cases.

Well, there are already citext aliases for all of those operators, for this very reason. There are citext aliases for a bunch of the functions, too (ltrim(), substring(), etc.), so I wouldn't worry about adding more. I've added more of them since I last sent a patch, mainly for the regexp functions, replace(), strpos(), etc. I'd guess that I'm about half-way there already, and there probably are a few I wouldn't bother with (like timezone()).

Anyway, would this issue then go away once the type stuff was added and citext was specified as TYPE = 'S'?

[ squint... ]  Actually, this is an underestimate since these queries
aren't finding cases like quote_literal, where there is ambiguity but
only one of the alternatives takes 'text'.  I'm too lazy to work out a
better query though.

Thanks.

Perhaps tangential: What does it mean for a type to be "preferred"?

See the ambiguous-function resolution rules in chapter 10 of the fine
manual ...

I see this:

C. Run through all candidates and keep those that accept preferred types (of the input data type's type category) at the most positions where type conversion will be required. Keep all candidates if none accept preferred types. If only one candidate remains, use it; else continue to the next step.

That doesn't exactly explain what "preferred" means, just that it seems to prioritize the resolution of a function a bit. Which, I guess, is the point.
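
For what it's worth, here is a minimal Python sketch of how I read step C. The type table, the category letters other than 'S', and the function names are all made up for illustration. It also ignores every other step of the resolution rules and the handling of unknown-type literals, so it is a reading aid and not how the parser actually does it.

```python
# Hypothetical toy catalog: type name -> (category code, is_preferred).
# Only the 'S' category code comes from this thread; the rest is invented.
TYPE_INFO = {
    "text": ("S", True),
    "varchar": ("S", False),
    "citext": ("S", False),
    "int4": ("N", False),
    "float8": ("N", True),
}


def preferred_matches(candidate, input_types):
    """Count positions needing conversion where the candidate's parameter
    is a preferred type of the input type's category."""
    count = 0
    for param, arg in zip(candidate, input_types):
        if param == arg:
            continue  # exact match, no conversion required here
        param_cat, param_pref = TYPE_INFO[param]
        arg_cat, _ = TYPE_INFO[arg]
        if param_pref and param_cat == arg_cat:
            count += 1
    return count


def step_c(candidates, input_types):
    """Keep the candidates that accept preferred types at the most
    positions; keep all of them if none accept a preferred type."""
    scores = [preferred_matches(c, input_types) for c in candidates]
    best = max(scores)
    if best == 0:
        return list(candidates)
    return [c for c, s in zip(candidates, scores) if s == best]


# A varchar argument with text and citext candidates: text is preferred.
print(step_c([("text",), ("citext",)], ("varchar",)))
```

So in this toy setup a preferred type simply wins ties among candidates in the same category, which matches my guess above.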

Wouldn't this then limit them to 52 possible categories?

It'd be either 94 - 26 or 94 - 26 - 26 depending on what the policy is
about lower-case letters (and assuming they wanted to stay away from
control characters, which seems like a good idea).  Considering the
world supply of categories up to now has been about ten, it's hard
to imagine that this is really a limitation.
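
The arithmetic above can be checked with a few lines of Python. This assumes the 94 refers to the printable, non-space ASCII characters (0x21 through 0x7E) and that one set of 26 letters is off limits in either case. Which set that is isn't stated here, so treating it as the upper-case letters is a guess.

```python
import string

# Printable ASCII excluding space and control characters: 0x21..0x7E.
printable = {chr(c) for c in range(0x21, 0x7F)}

# Assumption: upper-case letters are unavailable in both scenarios.
without_upper = printable - set(string.ascii_uppercase)

# Stricter policy: lower-case letters are excluded as well.
without_letters = without_upper - set(string.ascii_lowercase)

print(len(printable), len(without_upper), len(without_letters))  # 94 68 42
```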

Okay.

Does that matter? Given your suggestion, I'm assuming that a single
character is somehow more efficient than an enum, yes?

Marginally so; but an enum wouldn't help anyway unless we are prepared
to invent ALTER ENUM.  We'd have to go to an actual new system catalog
if we wanted something noticeably better than the poor-man's-enum
approach, and as I mentioned earlier, that just seems like overkill.
(Besides, we could always add it later if there's suddenly a gold rush
for categories.  The only thing we'd be locking ourselves into, if
we view this as a stopgap implementation, is the need to accept
single-character abbreviations in future, even after the system knows
actual names for categories.)
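
To make the compatibility point concrete, here is a small Python sketch of a lookup that keeps accepting single-character codes even after names exist. The category names, the 'N' code, and the function name are hypothetical; only 'S' is mentioned in this thread, and nothing like this exists in the backend.

```python
# Hypothetical mapping of single-character codes to category names.
CATEGORY_NAMES = {"S": "string", "N": "numeric"}
NAME_TO_CODE = {name: code for code, name in CATEGORY_NAMES.items()}


def resolve_category(spec):
    """Return the one-character code for either a code or a name."""
    if len(spec) == 1:
        # The single-character form stays valid indefinitely, including
        # codes that were never given a name.
        return spec
    try:
        return NAME_TO_CODE[spec.lower()]
    except KeyError:
        raise ValueError(f"unknown category name: {spec!r}")


print(resolve_category("S"), resolve_category("String"), resolve_category("x"))
```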

Makes sense.

Thanks,

David


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)