Re: [GENERAL] ALTER TEXT field to VARCHAR(1024)

2014-09-23 Thread Jeff Janes
On Mon, Sep 22, 2014 at 8:40 AM, John McKown john.archie.mck...@gmail.com
wrote:

 On Mon, Sep 22, 2014 at 10:31 AM, Merlin Moncure mmonc...@gmail.com
 wrote:
  On Fri, Sep 19, 2014 at 7:16 AM, Bill Moran wmo...@potentialtech.com
 wrote:
  On Fri, 19 Sep 2014 09:32:09 +0200
  Marius Grama marius...@gmail.com wrote:
  Can anybody explain me what happens in the background when the alter
  statement is executed? I've tried it out on a small copy of the table
 (70K)
  and the operation completed in 0.2 seconds.
  Will the table be completely locked during the execution of the ALTER
  statement?
 
  I share Gavin's concern that you're fixing this in the wrong place.  I
 expect
  that you'll be better served by configuring the middleware to do the
 right thing.
 
  I'll pile on here: in almost 20 years of professional database
  development I've never had an actual problem that was solved by
  introducing or shortening a length constraint to text columns except
  in cases where overlong strings violate the data model (like a two
  character state code for example).  It's a database equivalent of C
  programmer's disease.  Input checks from untrusted actors should
  happen in the application.
 
  merlin
 

 I do not have your experience level with data bases, but if I may, I
 will make an addition. Input checks should also happen in the RDBMS
 server. I have learned you cannot trust end users _or_ programmers.
 Most are good and conscientious. But there are a few who just aren't.


So fire them.


 And those few seem to be very prolific in making _subtle_ errors. Had
 one person who was really good at replacing every p with a [ and P
 with {


Your solution is what, arbitrarily forbidding the use of '[' when that is
not a logically forbidden character, just because someone might make a
mistake?  What do you do when they wish that someone have a lot of gun on
your vacation?

Nothing ticks me off more than some DBA deciding that it is unreasonable
for my street address to be more than 25 characters long, when obviously
neither I nor the USPS agrees with that arbitrary limitation.  Unless 25 is
the maximum number of characters that physically fit on the mailing label
(and you are sure you will never change label printers), it is not your job
to decide how long my street name can be.  Get over yourself.

If you need to verify that the data is accurate, then implement methods to
verify that.  Verifying that the data is reasonable, according to some
ignorant standard of reasonableness, is not the same thing as verifying
that it is accurate.

More than one company has lost business by refusing to acknowledge that I
might know how to spell my own address.

Cheers,

Jeff, whose street address has 27 characters, whether you like it or not.


Re: [GENERAL] ALTER TEXT field to VARCHAR(1024)

2014-09-22 Thread Merlin Moncure
On Fri, Sep 19, 2014 at 7:16 AM, Bill Moran wmo...@potentialtech.com wrote:
 On Fri, 19 Sep 2014 09:32:09 +0200
 Marius Grama marius...@gmail.com wrote:
 Can anybody explain me what happens in the background when the alter
 statement is executed? I've tried it out on a small copy of the table (70K)
 and the operation completed in 0.2 seconds.
 Will the table be completely locked during the execution of the ALTER
 statement?

 I share Gavin's concern that you're fixing this in the wrong place.  I expect
 that you'll be better served by configuring the middleware to do the right 
 thing.

I'll pile on here: in almost 20 years of professional database
development I've never had an actual problem that was solved by
introducing or shortening a length constraint to text columns except
in cases where overlong strings violate the data model (like a two
character state code for example).  It's a database equivalent of C
programmer's disease.  Input checks from untrusted actors should
happen in the application.

merlin


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] ALTER TEXT field to VARCHAR(1024)

2014-09-22 Thread John McKown
On Mon, Sep 22, 2014 at 10:31 AM, Merlin Moncure mmonc...@gmail.com wrote:
 On Fri, Sep 19, 2014 at 7:16 AM, Bill Moran wmo...@potentialtech.com wrote:
 On Fri, 19 Sep 2014 09:32:09 +0200
 Marius Grama marius...@gmail.com wrote:
 Can anybody explain me what happens in the background when the alter
 statement is executed? I've tried it out on a small copy of the table (70K)
 and the operation completed in 0.2 seconds.
 Will the table be completely locked during the execution of the ALTER
 statement?

 I share Gavin's concern that you're fixing this in the wrong place.  I expect
 that you'll be better served by configuring the middleware to do the right 
 thing.

 I'll pile on here: in almost 20 years of professional database
 development I've never had an actual problem that was solved by
 introducing or shortening a length constraint to text columns except
 in cases where overlong strings violate the data model (like a two
 character state code for example).  It's a database equivalent of C
 programmer's disease.  Input checks from untrusted actors should
 happen in the application.

 merlin


I do not have your experience level with data bases, but if I may, I
will make an addition. Input checks should also happen in the RDBMS
server. I have learned you cannot trust end users _or_ programmers.
Most are good and conscientious. But there are a few who just aren't.
And those few seem to be very prolific in making _subtle_ errors. Had
one person who was really good at replacing every p with a [ and P
with {


-- 
There is nothing more pleasant than traveling and meeting new people!
Genghis Khan

Maranatha! 
John McKown


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] ALTER TEXT field to VARCHAR(1024)

2014-09-22 Thread Rob Sargent

On 09/22/2014 09:40 AM, John McKown wrote:

On Mon, Sep 22, 2014 at 10:31 AM, Merlin Moncure mmonc...@gmail.com wrote:

On Fri, Sep 19, 2014 at 7:16 AM, Bill Moran wmo...@potentialtech.com wrote:

On Fri, 19 Sep 2014 09:32:09 +0200
Marius Grama marius...@gmail.com wrote:

Can anybody explain me what happens in the background when the alter
statement is executed? I've tried it out on a small copy of the table (70K)
and the operation completed in 0.2 seconds.
Will the table be completely locked during the execution of the ALTER
statement?

I share Gavin's concern that you're fixing this in the wrong place.  I expect
that you'll be better served by configuring the middleware to do the right 
thing.

I'll pile on here: in almost 20 years of professional database
development I've never had an actual problem that was solved by
introducing or shortening a length constraint to text columns except
in cases where overlong strings violate the data model (like a two
character state code for example).  It's a database equivalent of C
programmer's disease.  Input checks from untrusted actors should
happen in the application.

merlin


I do not have your experience level with data bases, but if I may, I
will make an addition. Input checks should also happen in the RDBMS
server. I have learned you cannot trust end users _or_ programmers.
Most are good and conscientious. But there are a few who just aren't.
And those few seem to be very prolific in making _subtle_ errors. Had
one person who was really good at replacing every p with a [ and P
with {


You don't want that string to get all the way to the server and fail, 
blow out a transaction and carry that joyous news back to the user who 
now has to start over completely.  Further no mear length constraint is 
going to fix p=[.  Not say the db cannot have the constraint (no [ 
allowed?) but a good app checks input on the fly.






Re: [GENERAL] ALTER TEXT field to VARCHAR(1024)

2014-09-22 Thread Tim Clarke
On 22/09/14 17:18, Rob Sargent wrote:
 On 09/22/2014 09:40 AM, John McKown wrote:
 On Mon, Sep 22, 2014 at 10:31 AM, Merlin Moncure mmonc...@gmail.com wrote:
 On Fri, Sep 19, 2014 at 7:16 AM, Bill Moran wmo...@potentialtech.com 
 wrote:
 On Fri, 19 Sep 2014 09:32:09 +0200
 Marius Grama marius...@gmail.com wrote:
 Can anybody explain me what happens in the background when the alter
 statement is executed? I've tried it out on a small copy of the table 
 (70K)
 and the operation completed in 0.2 seconds.
 Will the table be completely locked during the execution of the ALTER
 statement?
 I share Gavin's concern that you're fixing this in the wrong place.  I 
 expect
 that you'll be better served by configuring the middleware to do the right 
 thing.
 I'll pile on here: in almost 20 years of professional database
 development I've never had an actual problem that was solved by
 introducing or shortening a length constraint to text columns except
 in cases where overlong strings violate the data model (like a two
 character state code for example).  It's a database equivalent of C
 programmer's disease.  Input checks from untrusted actors should
 happen in the application.

 merlin

 I do not have your experience level with data bases, but if I may, I
 will make an addition. Input checks should also happen in the RDBMS
 server. I have learned you cannot trust end users _or_ programmers.
 Most are good and conscientious. But there are a few who just aren't.
 And those few seem to be very prolific in making _subtle_ errors. Had
 one person who was really good at replacing every p with a [ and P
 with {


 You don't want that string to get all the way to the server and fail,
 blow out a transaction and carry that joyous news back to the user who
 now has to start over completely.  Further no mear length constraint
 is going to fix p=[.  Not say the db cannot have the constraint (no
 [ allowed?) but a good app checks input on the fly.




Indeed - both is the answer; back-end (trigger) checks for safety,
front-end application polite messages for clarity and ease of use.

-- 
Tim Clarke



-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] ALTER TEXT field to VARCHAR(1024)

2014-09-22 Thread John McKown
Sorry guess I wasn't being as clear as I thought. To be a bit more
precise, I really think that validation should occur _first_ at the
point of entry (for a web browser, I put in Javascript code to verify
it there as well as in the web service doing the same validation
because some people disable Javascript as a possible security breach
vector), then also do the same, or even more, validation in the back
end server. I.e. don't trust any step of the process which is not
under your immediate control. As the owner of the data base, I want
to validate the data myself according to the proper business rules.
The application developer should also validate the input. What I don't
believe in is a trusted application from which I would accept data
and not validate it before updating the data base. If such an
application were to exist, due to management dictum, I would audit
everything that I could to prove any corruption to the data base was
caused by this can't ever be wrong application. Yes, I am a
paranoid.

On Mon, Sep 22, 2014 at 11:18 AM, Rob Sargent robjsarg...@gmail.com wrote:
 On 09/22/2014 09:40 AM, John McKown wrote:

 On Mon, Sep 22, 2014 at 10:31 AM, Merlin Moncure mmonc...@gmail.com wrote:

 On Fri, Sep 19, 2014 at 7:16 AM, Bill Moran wmo...@potentialtech.com
 wrote:

 On Fri, 19 Sep 2014 09:32:09 +0200
 Marius Grama marius...@gmail.com wrote:

 Can anybody explain me what happens in the background when the alter
 statement is executed? I've tried it out on a small copy of the table (70K)
 and the operation completed in 0.2 seconds.
 Will the table be completely locked during the execution of the ALTER
 statement?

 I share Gavin's concern that you're fixing this in the wrong place.  I
 expect
 that you'll be better served by configuring the middleware to do the right
 thing.

 I'll pile on here: in almost 20 years of professional database
 development I've never had an actual problem that was solved by
 introducing or shortening a length constraint to text columns except
 in cases where overlong strings violate the data model (like a two
 character state code for example).  It's a database equivalent of C
 programmer's disease.  Input checks from untrusted actors should
 happen in the application.

 merlin

 I do not have your experience level with data bases, but if I may, I
 will make an addition. Input checks should also happen in the RDBMS
 server. I have learned you cannot trust end users _or_ programmers.
 Most are good and conscientious. But there are a few who just aren't.
 And those few seem to be very prolific in making _subtle_ errors. Had
 one person who was really good at replacing every p with a [ and P
 with {


 You don't want that string to get all the way to the server and fail, blow
 out a transaction and carry that joyous news back to the user who now has to
 start over completely.  Further no mear length constraint is going to fix
 p=[.  Not say the db cannot have the constraint (no [ allowed?) but a good
 app checks input on the fly.






-- 
There is nothing more pleasant than traveling and meeting new people!
Genghis Khan

Maranatha! 
John McKown


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] ALTER TEXT field to VARCHAR(1024)

2014-09-22 Thread Merlin Moncure
On Mon, Sep 22, 2014 at 10:40 AM, John McKown
john.archie.mck...@gmail.com wrote:
 On Mon, Sep 22, 2014 at 10:31 AM, Merlin Moncure mmonc...@gmail.com wrote:
 I'll pile on here: in almost 20 years of professional database
 development I've never had an actual problem that was solved by
 introducing or shortening a length constraint to text columns except
 in cases where overlong strings violate the data model (like a two
 character state code for example).  It's a database equivalent of C
 programmer's disease.  Input checks from untrusted actors should
 happen in the application.

 merlin


 I do not have your experience level with data bases, but if I may, I
 will make an addition. Input checks should also happen in the RDBMS
 server. I have learned you cannot trust end users _or_ programmers.
 Most are good and conscientious. But there are a few who just aren't.
 And those few seem to be very prolific in making _subtle_ errors. Had
 one person who was really good at replacing every p with a [ and P
 with {

Sure.  The point is distinguishing things which are *demonstrably*
false (like a US VIN must be exactly 17 chars) from those that are
based assumption (such as a cityname must be = 50 characters).  The
former should be validated in the schema and the latter should not be.
If you're paranoid about the user submitting 100mb strings for
username and don't trust the application to deal with that, I'd
maybe consider making a domain 'safetext' which checks length on the
order of a few thousand bytes and using that instead of 'text' and use
it everywhere.  This will prevent the dba from outsmarting the
datamodel which is a *much* bigger problem in practice than the one
length checks attempt to solve.

Domains have certain disadvantages (like no array type) -- be advised.

merlin


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] ALTER TEXT field to VARCHAR(1024)

2014-09-22 Thread David G Johnston
Merlin Moncure-2 wrote
 On Mon, Sep 22, 2014 at 10:40 AM, John McKown
 lt;

 john.archie.mckown@

 gt; wrote:
 On Mon, Sep 22, 2014 at 10:31 AM, Merlin Moncure lt;

 mmoncure@

 gt; wrote:
 I'll pile on here: in almost 20 years of professional database
 development I've never had an actual problem that was solved by
 introducing or shortening a length constraint to text columns except
 in cases where overlong strings violate the data model (like a two
 character state code for example).  It's a database equivalent of C
 programmer's disease.  Input checks from untrusted actors should
 happen in the application.

 merlin


 I do not have your experience level with data bases, but if I may, I
 will make an addition. Input checks should also happen in the RDBMS
 server. I have learned you cannot trust end users _or_ programmers.
 Most are good and conscientious. But there are a few who just aren't.
 And those few seem to be very prolific in making _subtle_ errors. Had
 one person who was really good at replacing every p with a [ and P
 with {
 
 Sure.  The point is distinguishing things which are *demonstrably*
 false (like a US VIN must be exactly 17 chars) from those that are
 based assumption (such as a cityname must be = 50 characters).  The
 former should be validated in the schema and the latter should not be.
 If you're paranoid about the user submitting 100mb strings for
 username and don't trust the application to deal with that, I'd
 maybe consider making a domain 'safetext' which checks length on the
 order of a few thousand bytes and using that instead of 'text' and use
 it everywhere.  This will prevent the dba from outsmarting the
 datamodel which is a *much* bigger problem in practice than the one
 length checks attempt to solve.
 
 Domains have certain disadvantages (like no array type) -- be advised.
 
 merlin

These responses all seem beside the point.  The OP isn't concerned that
too-long data is making it into the database but rather that an unadorned
text type is functionally a CLOB which the application he is using is
treating like a document instead of a smallish text field that would be
treated like any other value.  It's like the difference between choosing
input/text or textarea in HTML.  Now, some tools distinguish between text
and varchar only and the length piece is irrelevant; but whether that
applies here I have no idea.

It might be easier to simply create a view over the table, using the desired
type (truncating the actual value if needed), and feed that view to the
reporting engine.

In the end the two questions are:
1) does adding a length restriction cause a table rewrite?
2) what level of locking occurs while the length check is resolving?

I don't confidently know the answers to those two questions.

David J.






--
View this message in context: 
http://postgresql.1045698.n5.nabble.com/ALTER-TEXT-field-to-VARCHAR-1024-tp5819608p5819939.html
Sent from the PostgreSQL - general mailing list archive at Nabble.com.


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] ALTER TEXT field to VARCHAR(1024)

2014-09-22 Thread Tom Lane
David G Johnston david.g.johns...@gmail.com writes:
 In the end the two questions are:
 1) does adding a length restriction cause a table rewrite?

Yes.  In principle the restriction could be checked with just a scan,
not a rewrite, but ALTER TABLE doesn't currently understand that --- and
in any case a scan would still be potentially a long time.

 2) what level of locking occurs while the length check is resolving?

AccessExclusiveLock.  This would be necessary in any case for a data type
change.

regards, tom lane


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] ALTER TEXT field to VARCHAR(1024)

2014-09-19 Thread Gavin Flower

On 19/09/14 19:32, Marius Grama wrote:

Hello,


i am using Postgres 9.2 and I'd like to perform the following ALTER 
statement on a database table with about 30M entries :


ALTER TABLE images ALTER COLUMN mask_descriptors TYPE VARCHAR(1024);

The mask_descriptors field is currently having the type TEXT.
I want to perform the ALTER due to the fact that it seems that copying 
the contents of the table to a BI SQL Server is done in row by row 
(instead of batch) when handling CLOBs.


From the Postgres documentation I got the following :

http://www.postgresql.org/docs/8.3/static/datatype-character.html

Tip: There are no performance differences between these three
types, apart from increased storage size when using the
blank-padded type, and a few extra cycles to check the length when
storing into a length-constrained column. While character has
performance advantages in some other database systems, it has no
such advantages in PostgreSQL. In most situations text or
character varying should be used instead.



Can anybody explain me what happens in the background when the alter 
statement is executed? I've tried it out on a small copy of the table 
(70K) and the operation completed in 0.2 seconds.
Will the table be completely locked during the execution of the ALTER 
statement?




I am curious as to why you want to change text to VARCHAR(1024), 
especially as I think that the middleware should be controlling how long 
a string is saved in the database rather than end user client code (for 
several reasons., including security concerns).  However, I do not know 
your use cases, nor your overall situation - so my concerns may not 
apply to you.



Cheers,
Gavin




Re: [GENERAL] ALTER TEXT field to VARCHAR(1024)

2014-09-19 Thread Bill Moran
On Fri, 19 Sep 2014 09:32:09 +0200
Marius Grama marius...@gmail.com wrote:

 Hello,
 
 
 i am using Postgres 9.2 and I'd like to perform the following ALTER
 statement on a database table with about 30M entries :
 
 ALTER TABLE images ALTER COLUMN mask_descriptors TYPE VARCHAR(1024);
 
 
 
 The mask_descriptors field is currently having the type TEXT.
 I want to perform the ALTER due to the fact that it seems that copying the
 contents of the table to a BI SQL Server is done in row by row (instead of
 batch) when handling CLOBs.
 
 From the Postgres documentation I got the following :
 
 http://www.postgresql.org/docs/8.3/static/datatype-character.html
 
 Tip: There are no performance differences between these three types, apart
 from increased storage size when using the blank-padded type, and a few
 extra cycles to check the length when storing into a length-constrained
 column. While character has performance advantages in some other database
 systems, it has no such advantages in PostgreSQL. In most situations text
 or character varying should be used instead.
 
 
 
 Can anybody explain me what happens in the background when the alter
 statement is executed? I've tried it out on a small copy of the table (70K)
 and the operation completed in 0.2 seconds.
 Will the table be completely locked during the execution of the ALTER
 statement?

I share Gavin's concern that you're fixing this in the wrong place.  I expect
that you'll be better served by configuring the middleware to do the right 
thing.

However, a more direct answer to your question: VARCHAR and TEXT are _the_same_
internally.  Thus:

ALTER TABLE images ALTER COLUMN mask_descriptors TYPE VARCHAR;

would do nothing more than change the table definition.  There is no need for
that statement to touch any data.

However, adding the length constraint of (1024) will force Postgres to check
every single value to ensure it complies with the constraint.  I believe if
any row is longer than 1024 it will throw an error and abort the entire ATLER.

-- 
Bill Moran
I need your help to succeed:
http://gamesbybill.com


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general