Re: [GENERAL] PostgreSQL Gotchas

Chris Travers Thu, 13 Oct 2005 14:14:47 -0700

OK, here are some in-depth thoughts after reviewing as many prior threads as I could find in the archives.


Tom Lane wrote:

Chris Travers <[EMAIL PROTECTED]> writes:

Tom Lane wrote:

Since the end reward for all this work would be having to read CATALOGS
WRITTEN IN ALL UPPER CASE, none of the key developers seem very
interested ...

Why would this be required?


If you write, say,

        select max(relpages) from pg_class;

I have gone back and read the previous discussion, and I still do not understand what the real impediment is (at least to putting it on the TODO list). I see a lot of reasons that don't make any sense to me, and have come across one substantial obstacle that, to my knowledge, has not been mentioned yet.

I understood this example from your previous comment. But aside from the aggregate issue, I fail to see why fixing it is a requirement. Perhaps I am being unclear in my thoughts and we are talking past each other.

If I write that statement and get back an error saying that no table is named PG_CLASS, that is my fault as an individual developer. After all, I set the configuration option to fold to uppercase, right?

The relevant question is very simple: where does one draw the line between the responsibility of the programmer and the responsibility of the DBA? Personally, I think it is important to offer modes that provide as much standards compliance as possible.

From the previous discussion, it was mentioned that the backend treats identifiers as quoted. It seems to me that this should make it *easier* rather than *harder* to implement, because this is largely a change regarding what a given SQL statement means. I.e., most of the work I had foreseen has already been done. (My proposal would have been to have the backend treat identifiers used internally as already double-quoted.)

If you are folding to lower case, your example, select max(relpages) from pg_class, is the same as SELECT MAX("relpages") FROM "pg_class";

If you are folding to upper case, your example is the same as SELECT MAX("RELPAGES") FROM "PG_CLASS"; Of course, we don't expect this to give us any results today. In essence, I don't see why we would expect this to return any results. If you issue an incorrect SQL statement, whose fault is that?
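
To make the two folding cases concrete, here is a toy sketch in Python. It is not PostgreSQL's lexer; it only models the rule under discussion (unquoted identifiers are folded, quoted ones are taken as written) against a made-up set of names standing in for today's lower-case catalog entries. The names fold_identifier, resolves, and CATALOG_NAMES are hypothetical and exist only for this illustration.

```python
# Toy model only: not PostgreSQL's lexer, just the folding rule under discussion.

def fold_identifier(token: str, mode: str = "lower") -> str:
    """Return the name that would be looked up for one identifier token."""
    if len(token) >= 2 and token[0] == '"' and token[-1] == '"':
        # Quoted identifier: keep the case as written, unescape doubled quotes.
        return token[1:-1].replace('""', '"')
    if mode == "lower":
        return token.lower()
    if mode == "upper":
        return token.upper()
    raise ValueError(f"unknown folding mode: {mode!r}")


# Hypothetical stand-in for the names as the catalogs store them today.
CATALOG_NAMES = {"pg_class", "relpages", "max"}


def resolves(token: str, mode: str) -> bool:
    """True if the folded name matches an entry in the toy catalog."""
    return fold_identifier(token, mode) in CATALOG_NAMES


# Show how the same tokens fare under each folding mode.
for mode in ("lower", "upper"):
    for token in ("pg_class", "PG_CLASS", '"pg_class"', "max"):
        print(mode, token, "->", fold_identifier(token, mode), resolves(token, mode))
```

In this model, under upper-case folding only the quoted "pg_class" still matches; the unquoted pg_class and max do not, which is the situation the quoted example describes.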

Now, the one place where this might create a problem is in the information_schema. The problem here is not the same as any issue I have seen discussed before, but rather the fact that case folding could create non-standard behavior here absent other changes. The only option I see here is to create a second INFORMATION_SCHEMA with upper case view names.

and the lexer thinks that it should fold unquoted identifiers to upper
case, then the catalog entries defining these names had better read
PG_CLASS, RELPAGES, and MAX, not the lower-case names they contain
today.

Ok, so the only problem out of these three that I see is with MAX. PG_CLASS and RELPAGES are the responsibility of the developer, IMHO. Also, with functions and aggregates, this is not the problem that it is with tables (as the name isn't usually sent back to the client), so I don't know how much logic it would take to differentiate between table/column names, which might need to be folded, and functions/aggregates, which could continue the way that they are currently done, at least for now. It might also be possible to create duplicate entries in the catalogs for builtin functions/aggregates so that this case folding does not cause the same type of problem. However, for builtin functions/aggregates, I am not sure whether this is likely to have any significant performance hit.

 So this wouldn't be something you could flip on-the-fly --- at
the latest, an installation would have to commit to upper or lower case
at initdb time, because the initial contents of all the system catalogs
would need to match the choice.

Ok, so I see the objection basically being that changing the semantics of the SQL statement would need to be done in a way that prevents user-issued queries from having to know which way this is done. There is no way around this objection because it is inconsistent with the very idea of semantic changes to the SQL parser. Yet we have done this in the past in areas I have previously mentioned. So the question really is what is required to make this work in a semantically clean way. IMO, the following requirement is acceptable:


SELECT MAX("relpages") FROM "pg_class"

But the following is not:

SELECT "max"("relpages") FROM "pg_class"

However, I am sure that there will be people who don't see that as OK. So, I would suggest that if they need to avoid quoting relpages and pg_class, then they can create a PG_CATALOG schema and a PG_CLASS view. Maybe this could even be a pgfoundry project. But I don't see it as a requirement of the core team. The bigger issue is with "MAX" v. "max" and with "INFORMATION_SCHEMA" v. "information_schema". This might require duplicate entries in the system catalogs.
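
The "duplicate entries" idea can be sketched the same way. The following toy Python model is an assumption-laden illustration, not a description of how the system catalogs work: it treats the catalog as a plain dictionary and adds an upper-case alias for each existing name, so that a lookup succeeds whichever way unquoted identifiers were folded. The names fold, catalog, and add_upper_aliases are hypothetical.

```python
def fold(name: str, mode: str) -> str:
    """Fold an unquoted identifier the way the chosen mode would."""
    return name.upper() if mode == "upper" else name.lower()


# Hypothetical toy catalog: lookup name -> underlying object.
catalog = {"max": "aggregate max", "pg_class": "table pg_class"}


def add_upper_aliases(entries: dict) -> dict:
    """Return a copy with an upper-case alias pointing at each existing object."""
    aliased = dict(entries)
    for name, obj in entries.items():
        # setdefault keeps any entry that already exists under the upper-case name.
        aliased.setdefault(name.upper(), obj)
    return aliased


aliased_catalog = add_upper_aliases(catalog)

# With aliases in place, both folding modes find the same object.
for mode in ("lower", "upper"):
    print(mode, "->", aliased_catalog.get(fold("max", mode)))
```

The cost in this sketch is simply a doubled set of lookup names, which is the trade-off the paragraph above alludes to.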

Please read the previous discussions on the topic, if you want to
pontificate about it.

Is there still any specific reason why this does not belong on the TODO list?

I am not arguing that this should be a priority in development. I am, however, arguing that since the behavior is non-standard, it might be worth acknowledging this and at least suggesting that it should be fixed at some indefinite point in the future. If this were a requirement for me, I would hire someone to make the changes and submit a patch. It is just an attempt to ensure that it is on the roadmap at this time.


Best Wishes,
Chris Travers
Metatron Technology Consulting

---------------------------(end of broadcast)---------------------------
TIP 2: Don't 'kill -9' the postmaster
