Re: [HACKERS] [SQL] Case Preservation disregarding case sensitivity?

2006-11-14 Thread Bruce Momjian
beau hargis wrote:
 Having installed DB2 Enterprise today and taking it for a spin, it does 
 indeed 
 behave in a similar manner. However, after reading through both 
 specifications, it seems that DB2 follows more of the spec than PostgreSQL. 
 The specifications state that for purpose of comparing identifiers, both 
 shall be converted to upper-case. DB2 displays all identifiers in upper-case 
 whereas PostgreSQL displays all identifiers in lower-case. This alone would 
 be a deviation from the specification. 

True.  We lowercase because historically we have, and because
all-upper-case is hard to read.

-- 
  Bruce Momjian   [EMAIL PROTECTED]
  EnterpriseDBhttp://www.enterprisedb.com

  + If your life is a hard drive, Christ can be your backup. +

---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faq


Re: [HACKERS] [SQL] Case Preservation disregarding case sensitivity?

2006-10-30 Thread beau hargis
On Friday 27 October 2006 19:38, Joe wrote:
 Hi Beau,

 On Fri, 2006-10-27 at 16:23 -0700, beau hargis wrote:
  I am hoping that there is an easy way to obtain case-preservation with
  case-insensitivity, or at the very least, case-preservation and complete
  case-sensitivity, or case-preservation and a consistant case-conversion
  strategy.
 
  The case of the column names need to be preserved because that is the way
  the schema is designed and most importantly (VERY, VERY IMPORTANT),
  column names are used in apps as hash values, or as named references
  which are case sensitive and as such need to be delivered to the client
  in exactly in the manner specified at the time of table creation.
 

 I went through the same issue in my conversion from MySQL to Postgres
 and (since I had a small application) I ended up changing up all my
 tables and columns UserProfile to user_profile.

 I'm afraid however, that it's MySQL that is the odd man out.  I haven't
 researched this completely but I believe PG follows either the FIPS-127
 or SQL-92 standard with respect to what are called delimited
 identifiers.  Basically, this says if you want case sensitivity in
 identifier names, you have to use double quotes wherever you refer to
 the identifier.  Without the double quotes, the SQL implementor can
 either use UPPERCASE (as I believe Oracle and DB2 do) or lowercase (as
 PG does) when it displays those identifiers.

 Joe

Again, I am at the same point I was at when searching and searching for 
information on the problem, and I am not sure what the SQL standard has to 
say about it: I do not, and I dont think that anyone else who is struggling 
to deal with the problem does, care about case-sensitivity. I am interested 
in case preservation of column names. I do, indeed, want identifiers treated 
in a case insensitive way, but I want the case PRESERVED in the table 
definitions and I want that case, as preserved, to be reflected in the field 
names as returned by the server to any client library that connects and 
initiates a query. 

Case-preservation is not the same as case-sensitivity; nor is 
case-normalization the same as case-insensitivity. What PostgreSQL is doing 
is converting any, and all, identifiers to a lower case and then matching 
those against the identifiers (as stored in the table definition) in a 
case-sensitive manner. It 'normalizes' the case of the identifiers so that it 
has a common internal representation; the desires of the programmer and 
database architect be damned. 

Referenced specification details:
From FIPS-127:
===
3. Delimited identifiers. In the previous ANSI SQL specification, it was not 
possible for an application to specify identifiers with spaces or other 
special symbols. Also, it was not possible to protect against future assaults 
on the name space for (identifier) by additions to the (reserved word) list. 
The new facility for (delimited identifier) allows a user to enclose all 
identifiers in double-quotation marks, thereby ensuring that the name defined 
or referenced may contain spaces or other special symbols and will not be 
impacted by future additions to the (reserved word) list.
===

From SQL-92/Sec. 5.2:
===
10)The identifier body of a regular identifier is equivalent
to an identifier body in which every letter that is a lower-
case letter is replaced by the equivalent upper-case letter
or letters. This treatment includes determination of equiva-
lence, representation in the Information and Definition Schemas,
representation in the diagnostics area, and similar uses.

11)The identifier body of a regular identifier (with every
letter that is a lower-case letter replaced by the equivalent
upper-case letter or letters), treated as the repetition of
a character string literal that specifies a character set
specification of SQL_TEXT, shall not be equal, according to
the comparison rules in Subclause 8.2, comparison predicate,
to any reserved word (with every letter that is a lower-case
letter replaced by the equivalent upper-case letter or letters),
treated as the repetition of a character string literal that
specifies a character set specification of SQL_TEXT.

12)Two regular identifiers are equivalent if their identifier
bodys, considered as the repetition of a character string
literal that specifies a character set specification of
SQL_TEXT, compare equally according to the comparison rules
in Subclause 8.2, comparison predicate.

13)A regular identifier and a delimited identifier are equiva-
lent if the identifier body of the regular identifier (with
every letter that is a lower-case letter replaced by the equiva-
lent upper-case letter or letters) and the delimited identifier
body of the delimited identifier (with all occurrences of
quote replaced by quote symbol and all occurrences of dou-
blequote symbol replaced by double quote), considered as
the 

Re: [HACKERS] [SQL] Case Preservation disregarding case sensitivity?

2006-10-30 Thread Tom Lane
beau hargis [EMAIL PROTECTED] writes:
 Considering the differences that already exist between database systems and 
 their varying compliance with SQL and the various extensions that have been 
 created, I do not consider that the preservation of case for identifiers 
 would violate any SQL standard.

That's not how I read the spec.  It is true that we are not 100% spec
compliant, but that isn't a good argument for moving further away from
spec.  Not to mention breaking backwards compatibility with our
historical behavior.  The change you propose would fix your application
at the cost of breaking other people's applications.   Perhaps you
should consider fixing your app instead.

regards, tom lane

---(end of broadcast)---
TIP 4: Have you searched our list archives?

   http://archives.postgresql.org


Re: [HACKERS] [SQL] Case Preservation disregarding case sensitivity?

2006-10-30 Thread Tom Lane
Chuck McDevitt [EMAIL PROTECTED] writes:
 At Teradata, we certainly interpreted the spec to allow case-preserving,
 but case-insensitive, identifiers.

Really?

As I see it, the controlling parts of the SQL spec are (SQL99 sec 5.2)

26) A regular identifier and a delimited identifier are
equivalent if the identifier body of the regular identifier
(with every letter that is a lower-case letter replaced by the
corresponding upper-case letter or letters) and the delimited
identifier body of the delimited identifier (with all
occurrences of quote replaced by quote symbol and all
occurrences of doublequote symbol replaced by double quote),
considered as the repetition of a character string literal
that specifies a character set specification of SQL_IDENTIFIER
and an implementation-defined collation that is sensitive to
case, compare equally according to the comparison rules in
Subclause 8.2, comparison predicate.

27) Two delimited identifiers are equivalent if their delimited
identifier bodys, considered as the repetition of a character
string literal that specifies a character set specification
of SQL_IDENTIFIER and an implementation-defined collation
that is sensitive to case, compare equally according to the
comparison rules in Subclause 8.2, comparison predicate.

Note well the sensitive to case bits there.  Now consider

CREATE TABLE tab (
foobar int,
FooBar timestamp,
FOOBAR varchar(3)
);

We can *not* reject this as containing duplicate column names, else we
have certainly violated rule 27.  Now what will you do with

SELECT fooBar FROM tab;

?  The spec is unquestionably on the side of you selected the varchar
column; historical Postgres practice is on the side of you selected
the int column.  AFAICS a case-insensitive approach would have to
fail with some I can't identify which column you mean error.  I am
interested to see where you find support for that in the spec...

regards, tom lane

---(end of broadcast)---
TIP 4: Have you searched our list archives?

   http://archives.postgresql.org