from:"Lewis, Ian \(Microstar Laboratories\)"

Re: [HACKERS] Cluster wide option to control symbol case folding

2017-01-04 Thread Lewis, Ian (Microstar Laboratories)

Robert Haas [mailto:robertmh...@gmail.com] wrote:
>> Where you get into trouble there is that you might run CREATE
EXTENSION 
>> from that session

Yes. I can see this problem. And, while I can imagine resolving it with
context belonging to the extension, separate from the current session's
context, any resolution gets to be pretty complex. Probably complex
enough that the resolution is worse than living with the problem as part
of the cost of the feature. Which means, it remains one of the arguments
against it.

>> Again, I'm not trying to rain down fire and brimstone
>> on your idea here and I clearly see the utility of it.

I do not feel this at all (though, since this is e-mail, it is helpful
that you state it explicitly). And, I have not felt like this was the
case at any point in these discussions. I have consistently received
thoughtful and remarkably good responses containing solid points. 

While I did not recognize all the impacts, and I probably still do not,
I never thought this was a trivial issue with no arguments against it
(not least, that what I was asking for is not standard compliant). Every
modal behavior of any sort in any software adds pain. At the very least
it increases the regression testing burden. And, this is a mode with
pretty fundamental impact. It has to be worth a lot to somebody to be
worth having. And, of course, it also has to work.

Ian Lewis (www.mstarlabs.com)

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Cluster wide option to control symbol case folding

2017-01-03 Thread Lewis, Ian (Microstar Laboratories)

Tom Lane [mailto:t...@sss.pgh.pa.us] wrote:
>> 2. If the folding mode is chosen through a GUC variable, which
>> is certainly what people would expect, then it turns out that
>> it breaks client libraries/applications *anyway*, because an
>> installation-wide setting could impose itself on a client that
>> hadn't asked for it.

I know that some variables can only be configured at a wide scope, and
not a narrow one. Is there no way to restrict a GUC variable's
configuration scope to session and finer, but force a fixed value at
global scope? 

If it is possible to restrict global configuration, that at least
protects the general purpose administrative tools to a significant
degree.

>> And for libraries, that isn't a great solution because then they're
incompatible with applications that wanted another setting.

Good point. Libraries continue to have problems even with session level
configuration if they are to operate in the context of an application
that reconfigures its session case folding for its own purposes. 

But, that seems like a problem that is much more likely to affect
developers of new systems rather than general users or administrators of
existing database systems. 

If so, it is more of a forward looking problem than a legacy problem in
the sense that the person who encounters it is likely to be in a
position to do something about it. This makes it much less critical to
get every library in the world updated to support all case folding modes
than would be the case for general administrative tools like pgAdmin.

Depending on the nature of the library, a developer would have the
option of using multiple sessions or, perhaps, if it were possible,
modifying the folding configuration when using the library. 

Anyhow, as you say, libraries clearly continue to have issues even with
restricted scope on case folding configuration.

And the session level idea really helps nothing unless the global
default session configuration is fixed.

Ian Lewis (www.mstarlabs.com)

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Cluster wide option to control symbol case folding

2017-01-03 Thread Lewis, Ian (Microstar Laboratories)

Robert Haas [mailto:robertmh...@gmail.com] wrote:
> The issue is, rather, that every extension written for
> PostgreSQL, whether in or out of core, needs to handle this issue and
> every general-purpose client tool (pgAdmin, etc.) needs to be aware of
> it. 

I can see the accuracy of all of the points you make here. And, I
definitely had not thought through the side effects on support tools and
third party libraries of implementing such modal behavior on the server
when I originally asked my question. I did not even understand the
ramifications of upper case folding on the server until Tom pointed out
the earlier conversations on the subject (in my defense, I was not
confused enough to think I had thought through all the effects of a
fundamental change to language recognition based on writing one e-mail
message). 

A fully case sensitive mode, leaving the server catalogs all in lower
case, which is what we would really like to have for our use, still
looks pretty easy to implement on the server. And, it would at least
behave consistently with the lower case folding mode if one quoted all
identifiers, unlike a case preserving, case insensitive mode.

One idea, which would likely be harder to implement on the server, but
that would have less impact on third party tools and libraries, would be
to configure case folding on a session basis. There would have to be
some means to configure a session for the case folding your application
wants to see. And, the general default would have to be the current
PostgreSQL behavior so that an application that was designed for current
behavior would never see a change. 

While not quite obvious to me how one would implement this for all
client environments, it would make such a feature more useful if it
included a means to make the configuration outside of the scope of an
application itself so that one could give an application over which one
has no control the behavior it expects. That is, provide a means to
configure a specific application's session default behavior on the
client. But, provide no means to configure the server's general default
behavior so that the server itself is never modal with respect to case
folding. Only the client session is modal.

It is pretty easy to see the pain of adding symbol case folding modes.
On the other hand, there is no way to know exactly the gain (or loss) in
adoption to providing alternate case folding. So, you have one fact (the
pain) and one speculation (the gain). I can see that makes deciding
whether this is a good or bad idea for the project not at all easy. 

Anyhow, I appreciate the time you, and others, have taken to explain
your thinking and the impacts of adding modal case folding to the
server. 

Ian Lewis (www.mstarlabs.com)

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Cluster wide option to control symbol case folding

2017-01-02 Thread Lewis, Ian (Microstar Laboratories)

From: Robert Haas [mailto:robertmh...@gmail.com] wrote:

> I'm not sure there's any way to split the baby here: tool authors will
obviously prefer that PostgreSQL's behavior in this area be invariable,
while people trying to develop portable database applications will
prefer configurability.
> As far as I can see, this is a zero sum game that is bound to have one
winner and one loser.

Tom is clearly right that such modes make life harder in a fundamental
way for anyone writing only against PostgreSQL. And, excepting the upper
case folding option, which is of no interest at all to me personally - I
do not care which case folding messes up my symbol declarations, it
would move PostgreSQL away from the standard rather than closer to it
(against that, however, PostgreSQL has many features that are not part
of the standard, including its existing lower case folding).

If he is also right that addition of such an option would deteriorate
into a situation where more people think PostgreSQL is broken, rather
than fewer people thinking that, as I think would be the case, I have no
strong argument for why PostgreSQL - as a project - should support such
modal behavior. 

Personally, I believe such an option would increase, not decrease the
number of people who could relatively easily use PostgreSQL. If that is
right it is a strong argument for such a modal behavior in spite of the
obvious real pain. 

And, from what I can see, many, maybe most, general purpose tool authors
target many backends. So, they already have to deal with some
signficiant degree of variation in case folding behavior.

So, I do not really see this as a zero sum game. It is a question of
whether such an option would grow the user base. If not, it is clearly a
bad idea for the project. But, if it would grow the user base
sufficiently, then, yes, there is pain for those who write general
purpose tools aimed only at PostgreSQL. But, such tools gain from a
wider adoption of PostgreSQL.

Ian Lewis (www.mstarlabs.com)

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Cluster wide option to control symbol case folding

2017-01-02 Thread Lewis, Ian (Microstar Laboratories)

gsst...@gmail.com [mailto:gsst...@gmail.com] On Behalf Of Greg Stark
wrote:

> But the problem with configurable quoting rules is a bit different.
> Imagine your application later decides to depend on PostGIS. So you
load the PostGIS extension and perhaps also some useful functions you
found on Stack Overflow for solving some GIS problem you have. Those
extensions will create objects and then work with those objects and may
use CamelCase for clarity -- in > fact I think PostGIS functions are
documented as CamelCase.  The PostGIS extensions might not work on your
system with different case rules if they haven't been 100% consistent
with their camelCasing, and the functions from StackOverflow would be
even less likely to work.

Well, in the case of StackOverflow suggestions, I cannot remember a time
when I did not have to rewrite whatever suggestions I have found and
used. That is not to say that StackOverflow is not useful. It is
incredibly useful at times. But, the suggestions are usually fragments
showing how to do something, not solutions. And, most such suggestions
are small. And, so, relatively easy to understand and patch as needed.
Many such suggestions are not very useful verbatim anyhow. They are
useful exactly because they allow you to understand something that you
were unable to glean from the documentation. Certainly, making symbol
usage consistent is not a hard patch on a small fragment of code that
probably needs help anyhow to bring it to production grade. I would not
consider this a strong argument against having modal symbol recognition.

Your point about PostGIS, and other full or partial solutions for a
complex problem, is a more serious issue. I do not have a strong answer
to this point. However, at the least a CamelCase case defect in a tool
is a pretty easy problem to locate and submit as a patch. (I understand
that your point is not just about PostGIS, but for PostGIS itself I have
read in a few places that they quote everything already. I do not know
whether that is true or not as I have never even looked at the tool.
However, if it is true they quote everything, then they already have
their CamelCase exactly right everywhere. If they did not the symbol
lookup would fail against current PostgreSQL. Any tool that quotes
everything should work the same way against any mode as long as all
modes are case sensitive. It might be ugly, but at least it should
always work no matter what the back end case translation.)

In our own code, I actually would prefer that we were forced to always
use the same case everywhere we refer to a symbol. And a case sensitive
behavior would enforce that at testing. I do not want this because I
want to be able to define symbols that differ only in case. I want it so
that every symbol reference is exactly visually like every other symbol
reference to the same object. Even though the effect is small, I think
such consistency makes it easier to read code. Even in C we almost never
use the ability to overload on case alone except in a few rare - and
localized - cases where the code is actually clearer with such a
notation. For example, in a mathematical implementation, using a
notation where something like t acts as an index and T defines the range
of t the difference in case is very clear. Perhaps more importantly,
this use of overload on case is consistent with conventional
mathematical notation (which, in my opinion is very good where it
belongs). This is not true when dealing with TheLongSymbolWithMixedCase
vs. TheLongSymbolWithMixedcase. The human brain cannot see that
difference easily, while it can see the difference between t and T very
easily, and it can see the relationship between the two symbols more
easily than it can see the relationship between t and tmax, say. Still,
we almost never have such code running on a database server.

Anyhow, you have a good point about third party libraries and tools that
integrate with PostgreSQL. However, I for one would be willing to live
with and address that kind of issue as needed. If the behavior were
controlled at database create time, which, from the articles Tom linked,
seems to be the general consensus as the right time for such a choice
given the current implementation, then one would at least have the
option of having databases with different case rules within a cluster.
Since each session can only connect to one database, this is not a
solution to every such situation, but it would address at least some
such cases.

Ian Lewis (www.mstarlabs.com)

PS. To anyone who might know the answer: My Reply All to this group does
not seem to join to the original thread. All I am doing is Reply All
from Outlook. Is there something else I need to do to allow my responses
to join the original thread?

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Cluster wide option to control symbol case folding

2016-12-25 Thread Lewis, Ian (Microstar Laboratories)

Tom Lane [mailto:t...@sss.pgh.pa.us] wrote:
>  Quite aside from the impact on the server (which would be extensive),
it would break every nontrivial application, and force them all to try
to deal with each possible folding behavior.
I have read through the various threads related to this issue that you
supplied. And, it looks quite clear that a change is unlikely. Maybe it
is even a bad idea, though personally I think it could prove worth the
pain that you obviously anticipate.

However, before fully dropping this issue, I do have one comment I would
like to make (just for the record) on your statement that a change in
the current behavior would break most non-trivial applications.

I assume you are talking about general purpose tools that attempt to
interact with any database in any configuration. Obviously, a purpose
built tool, such as our own internal database applications, would be
designed only for the behavior of the databases it is intended to work
against.

I have, over the past few months, tried quite a large number (10, maybe
a few more than that) of general purpose tools against PostgreSQL
looking for replacements for some of the tools we use against our old
database server. And, almost none of them work well if I quote
identifiers on the server. Almost all work perfectly well if I accept
that all my symbols will be converted to lower case and do not quote the
identifiers. Most fail - often not even in corner cases - when the
catalog (properly) returns a mixed case symbol like WeeklySales. Very
few of the tools I have tried seem to know to quote such a symbol to
preserve the case when generating a query to send back to the server. 

Most of the tools I have tested work through ODBC, though a few connect
directly to PostgreSQL. Both types of tools have exhibited similar
issues. Maybe the ODBC connection is relevant, since it appears that
Microsoft SQL Server does what our current server does and preserves
case, at least in some modes. A good fraction of the tools I have tried,
push their compatibility with SQL Server.

So, the current behavior already breaks many tools unless one accepts
that all symbols on the server are lower case. At root, based on reading
the threads you provided, this probably indicates defects in the tools,
rather than a problem with PostgreSQL. My reading of the standard text
quoted in various places is that any mixed case identifier returned from
the catalog has to be quoted to match in a query (whether you fold to
lower or upper case). 

But, I can easily imagine a good number of people deciding they want
mixed case on the server, and so quoting their identifiers. And, then
deciding PostgreSQL is defective, rather than deciding their favorite
administration or query tool is defective. Almost all of the tools I
tried worked fine when I had all lower case symbols on the server. Based
on observing the generated SQL, most of the tools that failed for me
when I had mixed case symbols on the server would work against a case
preserving mode in PostgreSQL. The tools generally pass through the
catalog reported symbols without manipulation. 

I fully understand your concern to keep the number of modal behaviors to
a minimum. And, the upper case folding looks to have a lot of side
effects, and so is a whole lot harder to implement than I expected.
Preserving case, which is what we actually want, by contrast still looks
pretty easy, though it is clearly not standard compliant. 

I do not have any more to add on the subject. I just wanted to make this
note that a good fraction of third party tools already fail unless one
never quotes identifiers on PostgreSQL. 

Thank you for taking the time to reply to my original inquiry. 

Ian Lewis
www.mstarlabs.com




-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Cluster wide option to control symbol case folding

2016-12-24 Thread Lewis, Ian (Microstar Laboratories)

On Saturday, December 24, 2016 10:49 PM Tom Lane
[mailto:t...@sss.pgh.pa.us] wrote:
> No.  This has been looked into repeatedly in the past, and we simply
don't want to deal with it.  

Fair enough. We will not pursue the issue then. That is why I asked.

Ian Lewis (www.mstarlabs.com)


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Cluster wide option to control symbol case folding

2016-12-24 Thread Lewis, Ian (Microstar Laboratories)

On  December 24, 2016 9:52 PM Craig Ringer
[mailto:craig.rin...@2ndquadrant.com] wrote:
> Personally I can see such an option being ok as an initdb-time setting
or at CREATE DATABASE time. Case folding can know the current db from
global context.
>
> It'd have to be fast though. Very fast.

That seems reasonable. In fact, it is nicer to configure at the database
level, rather than at the cluster level. I did not know it was possible
to tell the database from global context. Since a connection only allows
access to a single database, it makes sense that could be possible.

A check of a single global configuration variable to switch between one
of three handler paths would be very small compared with the
character-by-character checks currently performed by
downcase_identifier(), though, of course, the extra check would not be
free.

However, it would likely be faster to setup up the processing to call
through a global pointer to one of three handler functions. On most
Intel processors, at least, that extra pointer indirection costs little
to nothing. The pointer could be set up during database connect (I do
not know what I am talking about here, but there must be such a process
somewhere).

Presumably, the handler pointer would have to go into the global
database descriptor whatever that is. Or, if you allow use of global
objects for storing information about database scope run-time
configuration, it could just be a function pointer stored with the
handlers. The database initialization processing could call a setup
function when it runs to select the correct handling for its
configuration. The default would be the current downcase_identifier()
handling.

Does this seem like an approach that would meet your "Very fast"
requirement?

Ian Lewis (www.mstarlabs.com)

[HACKERS] Cluster wide option to control symbol case folding

2016-12-24 Thread Lewis, Ian (Microstar Laboratories)

Is there any chance that the PostgreSQL developers would accept a new
cluster wide configuration option to control how the system handles
symbol case folding? 

Currently PostgreSQL folds all un-quoted symbols to lower case. 

We would like to add a global configuration option with a name like
  symbol_casefold
 
with settings to allow folding symbols to
  lc - lower case (default)
  uc   - upper case  
  none - no case folding


USE CASE 1
The option we actually want for our own use is "none". Currently we have
several large bodies of code that work with an SQL Anywhere backend.
This server preserves case and - by a configuration option - performs
case insensitive lookup on all symbols. We are moving to PostgreSQL for
internal applications. We are also considering using PostgreSQL for the
storage backend in a product.

We use camel case for symbol names in all of our application and backend
code (largely written in C++, Object Pascal, and Python). In a language
like Pascal that ignores case, we still maintain consistent use of case
in symbol names for the sake of human readers of the code.

Where we make a connection from application code to a storage backend we
would like to use exactly the same symbol name - including case - for
the symbol in the backend as we use in the corresponding symbol in
application code. For example, if we intend to read a value into a
variable called FirstName we would like the corresponding field in the
database to also be FirstName. The main reason we want this exact match
is so that a human reader sees exactly the same thing in the two places.
This makes it easier to see the connection between the two bodies of
code. However, in places we also automate the connection between client
symbols and server symbols, and in such a case it is useful, though
certainly not necessary, to have an exact match.

While we do not do so at present, where we use a database purely through
code, we can likely add double quotes around all symbols in our
generated SQL, which means we can have an exact match as we want under
PostgreSQL as it currently behaves.

However, in many cases we also access the same database backend through
third party tools and ad hoc queries. 

If we double quote all of our symbols, which is what we have tried to do
in our initial tests, then all such tools and manually written queries
must also double quote all symbols. 

We use several tools and libraries that do not appear to have any way to
properly quote symbols when they are obtained automatically from the
schema. So, for example, if our reporting tool retrieves a field called
FirstName from the database schema, it happily uses that name in its
internally generated SQL. And, on PostgreSQL, of course this fails
because FirstName in script is firstname at the server, which is not
defined if we have double quoted the field name in the table definition.
Since we do not have the code, we cannot work around this in any easy
way.

For ad hoc queries we can double quote all symbol references. But, this
makes the queries noticeably harder for a person to read. And, I do not
believe this is just a matter of "getting used to it". The quotes
clutter the script, and that clutter makes the script fundamentally
harder to understand. Maybe this is small, but anything that makes
comprehension harder is a bad thing. Queries can be hard enough to
understand without extra syntactic clutter.

So, these considerations leave us the option of never quoting symbols
when using PostgreSQL. This works everywhere we have tried it. But, it
is pretty unattractive from the point of view of looking at the symbol
names in the backend if they are to match exactly the symbol names we
use in code. Using a different naming convention than we use everywhere
else in our code (underscore separated all lower case symbol names, say)
is not appealing either. We are very consistent in our symbol name
handling in our code, and breaking our conventions in some relatively
large section of our code is very unattractive. 

In addition, in many places our tools and code use schema supplied field
names to form column titles in a table or in a caption on an edit box
(etc.). The mixed case names are much nicer for this purpose than the
folded names.

For our purposes, these are our arguments for wanting control of how the
server folds case. 


USE CASE 2
Even though we have no use for it, I have included the option "uc"
because, in trying to determine whether PostgreSQL could support our
desired behavior, I found a fairly large number of people who are coming
from a different backend, such as Oracle (from what I see on the
internet - no personal experience), that case fold similarly to
PostgreSQL. However, instead of folding to lower case, it appears a
number of other database servers fold to upper case. 

This leaves people who are moving from these other database systems with
problems in their own code that they have to patch up to be able to make
the port succes

Re: [HACKERS] Cluster wide option to control symbol case folding

Re: [HACKERS] Cluster wide option to control symbol case folding

Re: [HACKERS] Cluster wide option to control symbol case folding

Re: [HACKERS] Cluster wide option to control symbol case folding

Re: [HACKERS] Cluster wide option to control symbol case folding

Re: [HACKERS] Cluster wide option to control symbol case folding

Re: [HACKERS] Cluster wide option to control symbol case folding

Re: [HACKERS] Cluster wide option to control symbol case folding

[HACKERS] Cluster wide option to control symbol case folding

9 matches

Site Navigation

Mail list logo

Footer information