Re: [HACKERS] get rid of SQL_ASCII?

2013-09-06 Thread Craig Ringer
On 09/05/2013 08:47 PM, Peter Eisentraut wrote:
> Other ideas?  Are there legitimate uses for SQL_ASCII?

IMO people who want SQL_ASCII should actually be storing everything in
`bytea`; that's a truer reflection of what they're actually storing,
retrieving, and working with and how they're doing it.

Unfortunately there'll be enough users of it around that I don't think
we can drop it.

What we SHOULD be doing is making it an explicit decision to use
SQL_ASCII, and NEVER creating a cluster or database with that encoding
by default. Ever. If we can't decide what the correct default encoding
is (say, if locale is C) we should error out unless a specific flag is
set.

-- 
 Craig Ringer                   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] get rid of SQL_ASCII?

2013-09-06 Thread Florian Weimer

On 09/06/2013 09:14 AM, Craig Ringer wrote:

> On 09/05/2013 08:47 PM, Peter Eisentraut wrote:
>> Other ideas?  Are there legitimate uses for SQL_ASCII?
>
> IMO people who want SQL_ASCII should actually be storing everything in
> `bytea`; that's a truer reflection of what they're actually storing,
> retrieving, and working with and how they're doing it.


Practically speaking, the escaping gets in the way, and there isn't full 
feature parity with TEXT.  Regular expression matching seems to be 
missing, for instance.


But apart from that, yes, BYTEA would be the more appropriate choice.

--
Florian Weimer / Red Hat Product Security Team




Re: [HACKERS] get rid of SQL_ASCII?

2013-09-06 Thread Tom Lane
Craig Ringer cr...@2ndquadrant.com writes:
> What we SHOULD be doing is making it an explicit decision to use
> SQL_ASCII, and NEVER creating a cluster or database with that encoding
> by default. Ever. If we can't decide what the correct default encoding
> is (say, if locale is C) we should error out unless a specific flag is
> set.

There's a large undercurrent of "I say it's bad for you" in this thread,
with frankly nothing to back it up.  If we try to be as nanny-ish as
you're suggesting here, we'll just annoy users.

And just to push back on the specific point: SQL_ASCII *is* the correct
default encoding for C locale.  Both are agnostic about the meaning of
anything outside the 7-bit ASCII set, while not rejecting such data.
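
To make the "agnostic, but not rejecting" point concrete, here is a minimal
Python sketch. It only models the behaviour described above and is not
PostgreSQL code; both function names are made up for illustration.

```python
# Illustrative model only; the function names are hypothetical.

def accept_sql_ascii_like(raw: bytes) -> bytes:
    """Pass bytes through untouched; nothing above 0x7F is interpreted."""
    return raw


def accept_utf8(raw: bytes) -> bytes:
    """Validate the input as UTF-8; raises UnicodeDecodeError if invalid."""
    raw.decode("utf-8")
    return raw


# A Latin-1 encoded string: the 0xE9 byte is not valid UTF-8 on its own.
latin1_bytes = "café".encode("latin-1")

# The byte-agnostic path stores the data exactly as received.
stored = accept_sql_ascii_like(latin1_bytes)

# The validating path refuses the same input.
try:
    accept_utf8(latin1_bytes)
    utf8_rejected = False
except UnicodeDecodeError:
    utf8_rejected = True
```

The trade-off is visible in the two paths: the first never loses or rejects
data, but it also cannot say what the high-bit bytes mean.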

regards, tom lane




Re: [HACKERS] get rid of SQL_ASCII?

2013-09-06 Thread Robert Haas
On Fri, Sep 6, 2013 at 10:19 AM, Tom Lane t...@sss.pgh.pa.us wrote:
> There's a large undercurrent of "I say it's bad for you" in this thread,
> with frankly nothing to back it up.  If we try to be as nanny-ish as
> you're suggesting here, we'll just annoy users.

+1.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] get rid of SQL_ASCII?

2013-09-06 Thread Kevin Grittner
Robert Haas robertmh...@gmail.com wrote:
> Tom Lane t...@sss.pgh.pa.us wrote:
>> There's a large undercurrent of "I say it's bad for you" in
>> this thread, with frankly nothing to back it up.  If we try to
>> be as nanny-ish as you're suggesting here, we'll just annoy
>> users.
>
> +1.

+1

I can definitely see a place for an ASCII7 encoding which would
reject anything with the high bit set; but there is a clear place
for the current SQL_ASCII, too.  Eliminating it would be much pain
for no discernible gain.
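
As a rough sketch of what a strict ASCII7 check would amount to (Python,
purely illustrative; `is_ascii7` is a hypothetical helper, not an existing
PostgreSQL function):

```python
def is_ascii7(raw: bytes) -> bool:
    """Return True only if no byte has the high bit (0x80) set."""
    return all(byte < 0x80 for byte in raw)


# Plain ASCII passes; anything multi-byte or Latin-1 would be rejected.
ascii_ok = is_ascii7(b"hello, world")
utf8_ok = is_ascii7("é".encode("utf-8"))      # bytes 0xC3 0xA9
latin1_ok = is_ascii7("é".encode("latin-1"))  # byte 0xE9
```

Under this rule the current SQL_ASCII behaviour (accept everything) and the
proposed ASCII7 behaviour (reject high-bit bytes) differ only in what happens
when the check returns False.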

--
Kevin Grittner
EDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] get rid of SQL_ASCII?

2013-09-05 Thread Merlin Moncure
On Thu, Sep 5, 2013 at 7:47 AM, Peter Eisentraut pete...@gmx.net wrote:
> Can we consider getting rid of the SQL_ASCII server-side encoding?  I
> don't see any good use for it, and it's often a support annoyance, and
> it leaves warts all over the code.  This would presumably be a
> multi-release effort.
>
> As a first step in accommodating users who have existing SQL_ASCII
> databases, we could change SQL_ASCII into a real encoding with
> conversion routines to all other encodings that only convert 7-bit ASCII
> characters.  That way, users who use SQL_ASCII as real ASCII or don't
> care could continue to use it.  Others would be forced to either set
> SQL_ASCII as the client encoding or adjust the encoding on the server.
>
> On the client side, the default libpq client encoding SQL_ASCII would
> be renamed to something like "SAME" or whatever, so the behavior would
> stay the same.
>
> Other ideas?  Are there legitimate uses for SQL_ASCII?

performance?

merlin




Re: [HACKERS] get rid of SQL_ASCII?

2013-09-05 Thread Heikki Linnakangas

On 05.09.2013 15:47, Peter Eisentraut wrote:

> Can we consider getting rid of the SQL_ASCII server-side encoding?  I
> don't see any good use for it, and it's often a support annoyance, and
> it leaves warts all over the code.  This would presumably be a
> multi-release effort.

I think "warts all over the code" is an overstatement. There aren't that
many places in the code that care about SQL_ASCII, and they're all
related to encoding conversions.

> As a first step in accommodating users who have existing SQL_ASCII
> databases, we could change SQL_ASCII into a real encoding with
> conversion routines to all other encodings that only convert 7-bit ASCII
> characters.  That way, users who use SQL_ASCII as real ASCII or don't
> care could continue to use it.  Others would be forced to either set
> SQL_ASCII as the client encoding or adjust the encoding on the server.
>
> On the client side, the default libpq client encoding SQL_ASCII would
> be renamed to something like "SAME" or whatever, so the behavior would
> stay the same.
>
> Other ideas?  Are there legitimate uses for SQL_ASCII?


One use is if you want to use some special encoding that's not supported 
by PostgreSQL, and you want PostgreSQL to just regurgitate any strings 
as is. It's not common, but would be strange to remove that capability 
altogether, IMHO.


I agree it would be nice to have a real ASCII encoding, which only 
accepts 7-bit ASCII characters. And it would be nice if SQL_ASCII was 
called something else, like UNDEFINED or BYTE_PER_CHAR, to make the 
meaning more clear. But I'm not in favor of deprecating it altogether.


Also, during backend initialization there is a phase where 
client_encoding has not been set yet, and we don't do any conversions 
yet. That's exactly what SQL_ASCII means, so even if we get rid of 
SQL_ASCII, we'd still need to have some encoding value in the backend to 
mean that intermediate state.


- Heikki




Re: [HACKERS] get rid of SQL_ASCII?

2013-09-05 Thread k...@rice.edu
On Thu, Sep 05, 2013 at 08:47:32AM -0400, Peter Eisentraut wrote:
> Can we consider getting rid of the SQL_ASCII server-side encoding?  I
> don't see any good use for it, and it's often a support annoyance, and
> it leaves warts all over the code.  This would presumably be a
> multi-release effort.
>
> As a first step in accommodating users who have existing SQL_ASCII
> databases, we could change SQL_ASCII into a real encoding with
> conversion routines to all other encodings that only convert 7-bit ASCII
> characters.  That way, users who use SQL_ASCII as real ASCII or don't
> care could continue to use it.  Others would be forced to either set
> SQL_ASCII as the client encoding or adjust the encoding on the server.
>
> On the client side, the default libpq client encoding SQL_ASCII would
> be renamed to something like "SAME" or whatever, so the behavior would
> stay the same.
>
> Other ideas?  Are there legitimate uses for SQL_ASCII?
 
Hi Peter,

Yes, we have processes that insert data from a large number of locales
into the same database and we need to process the information in a
locale-agnostic way, just as a range of bytes. Not to mention how much
faster it can be.

Regards,
Ken




Re: [HACKERS] get rid of SQL_ASCII?

2013-09-05 Thread Alvaro Herrera
Joshua D. Drake wrote:
> On 09/05/2013 09:42 AM, Josh Berkus wrote:
>>> Other ideas?  Are there legitimate uses for SQL_ASCII?
>>
>> Migrating from MySQL.  We've had some projects where we couldn't fix
>> MySQL's non-enforcement text garbage, and had to use SQL_ASCII on the
>> receiving side.  If it hadn't been available, the user would have given
>> up on Postgres.
>
> iconv?

Command Prompt helped a customer normalize encodings in their data,
which was a mixture of Latin1 and UTF8.  PGLoader was used for this, in
two stages; the first run in UTF8 saved the rejected data to a file
which was loaded in the second run as Latin1.  This worked like a charm.
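
The two-stage idea can be sketched in a few lines of Python. This is only an
illustration of the approach described above, not PGLoader's implementation;
`normalize_rows` and its return shape are assumptions made for the example.

```python
def normalize_rows(rows):
    """Two-pass normalization of byte strings with mixed encodings.

    Pass 1 accepts everything that is valid UTF-8 and sets the rest aside.
    Pass 2 decodes the rejected rows as Latin-1.
    Returns (accepted, rescued) as two lists of str.
    """
    accepted, rejected = [], []
    for raw in rows:
        try:
            accepted.append(raw.decode("utf-8"))
        except UnicodeDecodeError:
            rejected.append(raw)

    # Latin-1 maps every byte to a character, so this pass never fails.
    rescued = [raw.decode("latin-1") for raw in rejected]
    return accepted, rescued


mixed = [
    "naïve".encode("utf-8"),
    "café".encode("latin-1"),
    b"plain",
]
accepted, rescued = normalize_rows(mixed)
```

One caveat with this kind of scheme: because Latin-1 accepts any byte
sequence, the second pass cannot detect data that is in some third encoding,
and a Latin-1 string that happens to form valid UTF-8 is taken as UTF-8 in
the first pass.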

-- 
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] get rid of SQL_ASCII?

2013-09-05 Thread Josh Berkus
Peter,

> Other ideas?  Are there legitimate uses for SQL_ASCII?

Migrating from MySQL.  We've had some projects where we couldn't fix
MySQL's non-enforcement text garbage, and had to use SQL_ASCII on the
receiving side.  If it hadn't been available, the user would have given
up on Postgres.

-- 
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com




Re: [HACKERS] get rid of SQL_ASCII?

2013-09-05 Thread k...@rice.edu
On Thu, Sep 05, 2013 at 09:42:17AM -0700, Josh Berkus wrote:
> Peter,
>
>> Other ideas?  Are there legitimate uses for SQL_ASCII?
>
> Migrating from MySQL.  We've had some projects where we couldn't fix
> MySQL's non-enforcement text garbage, and had to use SQL_ASCII on the
> receiving side.  If it hadn't been available, the user would have given
> up on Postgres.

+++1  :)

Ken




Re: [HACKERS] get rid of SQL_ASCII?

2013-09-05 Thread Josh Berkus
On 09/05/2013 10:02 AM, Alvaro Herrera wrote:
> Joshua D. Drake wrote:
>> iconv?
>
> Command Prompt helped a customer normalize encodings in their data,
> which was a mixture of Latin1 and UTF8.  PGLoader was used for this, in
> two stages; the first run in UTF8 saved the rejected data to a file
> which was loaded in the second run as Latin1.  This worked like a charm.

There are certainly alternatives.  But all of the alternatives increase
the cost of the migration (either in staff time or in downtime), which
increases the likelihood that the organization will abandon the migration.

Anyway, I think we've established that there are enough legitimate
uses for SQL_ASCII that we can't casually discard it.

-- 
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com




Re: [HACKERS] get rid of SQL_ASCII?

2013-09-05 Thread k...@rice.edu
On Thu, Sep 05, 2013 at 09:53:18AM -0700, Joshua D. Drake wrote:
> On 09/05/2013 09:42 AM, Josh Berkus wrote:
>> Peter,
>>
>>> Other ideas?  Are there legitimate uses for SQL_ASCII?
>>
>> Migrating from MySQL.  We've had some projects where we couldn't fix
>> MySQL's non-enforcement text garbage, and had to use SQL_ASCII on the
>> receiving side.  If it hadn't been available, the user would have given
>> up on Postgres.
>
> iconv?

Yes, you can use iconv but then you have to check that it generated
values that do not break your system including the application logic.
That can prove a major stumbling block to changing DBs.
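
One example of the kind of check this implies, sketched in Python: converting
Latin-1 data to UTF-8 makes some values longer in bytes, which can break
application logic that assumes a fixed byte width. The helper name
`find_overflows` and the byte limit are hypothetical, chosen only for the
illustration.

```python
def find_overflows(values, max_bytes, source_encoding="latin-1"):
    """Return (original, converted) pairs whose UTF-8 form exceeds max_bytes."""
    overflows = []
    for raw in values:
        # Re-encode the value the way an iconv-style conversion would.
        converted = raw.decode(source_encoding).encode("utf-8")
        if len(converted) > max_bytes:
            overflows.append((raw, converted))
    return overflows


# Both values are 4 bytes in Latin-1; only one still fits after conversion.
values = [b"abcd", "café".encode("latin-1")]
too_long = find_overflows(values, max_bytes=4)
```

This covers only one failure mode; sort order, comparisons, and any
byte-offset arithmetic in the application would need their own checks.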

Regards,
Ken




Re: [HACKERS] get rid of SQL_ASCII?

2013-09-05 Thread Joshua D. Drake


On 09/05/2013 09:42 AM, Josh Berkus wrote:
> Peter,
>
>> Other ideas?  Are there legitimate uses for SQL_ASCII?
>
> Migrating from MySQL.  We've had some projects where we couldn't fix
> MySQL's non-enforcement text garbage, and had to use SQL_ASCII on the
> receiving side.  If it hadn't been available, the user would have given
> up on Postgres.

iconv?


--
Command Prompt, Inc. - http://www.commandprompt.com/  509-416-6579
PostgreSQL Support, Training, Professional Services and Development
High Availability, Oracle Conversion, Postgres-XC, @cmdpromptinc
For my dreams of your image that blossoms
   a rose in the deeps of my heart. - W.B. Yeats

