Re: [GENERAL] ALTER TEXT field to VARCHAR(1024)
On Mon, Sep 22, 2014 at 8:40 AM, John McKown john.archie.mck...@gmail.com wrote: On Mon, Sep 22, 2014 at 10:31 AM, Merlin Moncure mmonc...@gmail.com wrote: On Fri, Sep 19, 2014 at 7:16 AM, Bill Moran wmo...@potentialtech.com wrote: On Fri, 19 Sep 2014 09:32:09 +0200 Marius Grama marius...@gmail.com wrote: Can anybody explain me what happens in the background when the alter statement is executed? I've tried it out on a small copy of the table (70K) and the operation completed in 0.2 seconds. Will the table be completely locked during the execution of the ALTER statement? I share Gavin's concern that you're fixing this in the wrong place. I expect that you'll be better served by configuring the middleware to do the right thing. I'll pile on here: in almost 20 years of professional database development I've never had an actual problem that was solved by introducing or shortening a length constraint to text columns except in cases where overlong strings violate the data model (like a two character state code for example). It's a database equivalent of C programmer's disease. Input checks from untrusted actors should happen in the application. merlin I do not have your experience level with data bases, but if I may, I will make an addition. Input checks should also happen in the RDBMS server. I have learned you cannot trust end users _or_ programmers. Most are good and conscientious. But there are a few who just aren't. So fire them. And those few seem to be very prolific in making _subtle_ errors. Had one person who was really good at replacing every p with a [ and P with { Your solution is what, arbitrarily forbidding the use of '[' when that is not a logically forbidden character, just because someone might make a mistake? What do you do when they wish that someone have a lot of gun on your vacation? Nothing ticks me off more than some DBA deciding that it is unreasonable for my street address to be more than 25 characters long, when obviously neither I nor the USPS agrees with that arbitrary limitation. Unless 25 is the maximum number of characters that physically fit on the mailing label (and you are sure you will never change label printers), it is not your job to decide how long my street name can be. Get over yourself. If you need to verify that the data is accurate, then implement methods to verify that. Verifying that the data is reasonable, according to some ignorant standard of reasonableness, is not the same thing as verifying that it is accurate. More than one company has lost business by refusing to acknowledge that I might know how to spell my own address. Cheers, Jeff, whose street address has 27 characters, whether you like it or not.
Re: [GENERAL] ALTER TEXT field to VARCHAR(1024)
On Fri, Sep 19, 2014 at 7:16 AM, Bill Moran wmo...@potentialtech.com wrote: On Fri, 19 Sep 2014 09:32:09 +0200 Marius Grama marius...@gmail.com wrote: Can anybody explain me what happens in the background when the alter statement is executed? I've tried it out on a small copy of the table (70K) and the operation completed in 0.2 seconds. Will the table be completely locked during the execution of the ALTER statement? I share Gavin's concern that you're fixing this in the wrong place. I expect that you'll be better served by configuring the middleware to do the right thing. I'll pile on here: in almost 20 years of professional database development I've never had an actual problem that was solved by introducing or shortening a length constraint to text columns except in cases where overlong strings violate the data model (like a two character state code for example). It's a database equivalent of C programmer's disease. Input checks from untrusted actors should happen in the application. merlin -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
Re: [GENERAL] ALTER TEXT field to VARCHAR(1024)
On Mon, Sep 22, 2014 at 10:31 AM, Merlin Moncure mmonc...@gmail.com wrote: On Fri, Sep 19, 2014 at 7:16 AM, Bill Moran wmo...@potentialtech.com wrote: On Fri, 19 Sep 2014 09:32:09 +0200 Marius Grama marius...@gmail.com wrote: Can anybody explain me what happens in the background when the alter statement is executed? I've tried it out on a small copy of the table (70K) and the operation completed in 0.2 seconds. Will the table be completely locked during the execution of the ALTER statement? I share Gavin's concern that you're fixing this in the wrong place. I expect that you'll be better served by configuring the middleware to do the right thing. I'll pile on here: in almost 20 years of professional database development I've never had an actual problem that was solved by introducing or shortening a length constraint to text columns except in cases where overlong strings violate the data model (like a two character state code for example). It's a database equivalent of C programmer's disease. Input checks from untrusted actors should happen in the application. merlin I do not have your experience level with data bases, but if I may, I will make an addition. Input checks should also happen in the RDBMS server. I have learned you cannot trust end users _or_ programmers. Most are good and conscientious. But there are a few who just aren't. And those few seem to be very prolific in making _subtle_ errors. Had one person who was really good at replacing every p with a [ and P with { -- There is nothing more pleasant than traveling and meeting new people! Genghis Khan Maranatha! John McKown -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
Re: [GENERAL] ALTER TEXT field to VARCHAR(1024)
On 09/22/2014 09:40 AM, John McKown wrote: On Mon, Sep 22, 2014 at 10:31 AM, Merlin Moncure mmonc...@gmail.com wrote: On Fri, Sep 19, 2014 at 7:16 AM, Bill Moran wmo...@potentialtech.com wrote: On Fri, 19 Sep 2014 09:32:09 +0200 Marius Grama marius...@gmail.com wrote: Can anybody explain me what happens in the background when the alter statement is executed? I've tried it out on a small copy of the table (70K) and the operation completed in 0.2 seconds. Will the table be completely locked during the execution of the ALTER statement? I share Gavin's concern that you're fixing this in the wrong place. I expect that you'll be better served by configuring the middleware to do the right thing. I'll pile on here: in almost 20 years of professional database development I've never had an actual problem that was solved by introducing or shortening a length constraint to text columns except in cases where overlong strings violate the data model (like a two character state code for example). It's a database equivalent of C programmer's disease. Input checks from untrusted actors should happen in the application. merlin I do not have your experience level with data bases, but if I may, I will make an addition. Input checks should also happen in the RDBMS server. I have learned you cannot trust end users _or_ programmers. Most are good and conscientious. But there are a few who just aren't. And those few seem to be very prolific in making _subtle_ errors. Had one person who was really good at replacing every p with a [ and P with { You don't want that string to get all the way to the server and fail, blow out a transaction and carry that joyous news back to the user who now has to start over completely. Further no mear length constraint is going to fix p=[. Not say the db cannot have the constraint (no [ allowed?) but a good app checks input on the fly.
Re: [GENERAL] ALTER TEXT field to VARCHAR(1024)
On 22/09/14 17:18, Rob Sargent wrote: On 09/22/2014 09:40 AM, John McKown wrote: On Mon, Sep 22, 2014 at 10:31 AM, Merlin Moncure mmonc...@gmail.com wrote: On Fri, Sep 19, 2014 at 7:16 AM, Bill Moran wmo...@potentialtech.com wrote: On Fri, 19 Sep 2014 09:32:09 +0200 Marius Grama marius...@gmail.com wrote: Can anybody explain me what happens in the background when the alter statement is executed? I've tried it out on a small copy of the table (70K) and the operation completed in 0.2 seconds. Will the table be completely locked during the execution of the ALTER statement? I share Gavin's concern that you're fixing this in the wrong place. I expect that you'll be better served by configuring the middleware to do the right thing. I'll pile on here: in almost 20 years of professional database development I've never had an actual problem that was solved by introducing or shortening a length constraint to text columns except in cases where overlong strings violate the data model (like a two character state code for example). It's a database equivalent of C programmer's disease. Input checks from untrusted actors should happen in the application. merlin I do not have your experience level with data bases, but if I may, I will make an addition. Input checks should also happen in the RDBMS server. I have learned you cannot trust end users _or_ programmers. Most are good and conscientious. But there are a few who just aren't. And those few seem to be very prolific in making _subtle_ errors. Had one person who was really good at replacing every p with a [ and P with { You don't want that string to get all the way to the server and fail, blow out a transaction and carry that joyous news back to the user who now has to start over completely. Further no mear length constraint is going to fix p=[. Not say the db cannot have the constraint (no [ allowed?) but a good app checks input on the fly. Indeed - both is the answer; back-end (trigger) checks for safety, front-end application polite messages for clarity and ease of use. -- Tim Clarke -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
Re: [GENERAL] ALTER TEXT field to VARCHAR(1024)
Sorry guess I wasn't being as clear as I thought. To be a bit more precise, I really think that validation should occur _first_ at the point of entry (for a web browser, I put in Javascript code to verify it there as well as in the web service doing the same validation because some people disable Javascript as a possible security breach vector), then also do the same, or even more, validation in the back end server. I.e. don't trust any step of the process which is not under your immediate control. As the owner of the data base, I want to validate the data myself according to the proper business rules. The application developer should also validate the input. What I don't believe in is a trusted application from which I would accept data and not validate it before updating the data base. If such an application were to exist, due to management dictum, I would audit everything that I could to prove any corruption to the data base was caused by this can't ever be wrong application. Yes, I am a paranoid. On Mon, Sep 22, 2014 at 11:18 AM, Rob Sargent robjsarg...@gmail.com wrote: On 09/22/2014 09:40 AM, John McKown wrote: On Mon, Sep 22, 2014 at 10:31 AM, Merlin Moncure mmonc...@gmail.com wrote: On Fri, Sep 19, 2014 at 7:16 AM, Bill Moran wmo...@potentialtech.com wrote: On Fri, 19 Sep 2014 09:32:09 +0200 Marius Grama marius...@gmail.com wrote: Can anybody explain me what happens in the background when the alter statement is executed? I've tried it out on a small copy of the table (70K) and the operation completed in 0.2 seconds. Will the table be completely locked during the execution of the ALTER statement? I share Gavin's concern that you're fixing this in the wrong place. I expect that you'll be better served by configuring the middleware to do the right thing. I'll pile on here: in almost 20 years of professional database development I've never had an actual problem that was solved by introducing or shortening a length constraint to text columns except in cases where overlong strings violate the data model (like a two character state code for example). It's a database equivalent of C programmer's disease. Input checks from untrusted actors should happen in the application. merlin I do not have your experience level with data bases, but if I may, I will make an addition. Input checks should also happen in the RDBMS server. I have learned you cannot trust end users _or_ programmers. Most are good and conscientious. But there are a few who just aren't. And those few seem to be very prolific in making _subtle_ errors. Had one person who was really good at replacing every p with a [ and P with { You don't want that string to get all the way to the server and fail, blow out a transaction and carry that joyous news back to the user who now has to start over completely. Further no mear length constraint is going to fix p=[. Not say the db cannot have the constraint (no [ allowed?) but a good app checks input on the fly. -- There is nothing more pleasant than traveling and meeting new people! Genghis Khan Maranatha! John McKown -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
Re: [GENERAL] ALTER TEXT field to VARCHAR(1024)
On Mon, Sep 22, 2014 at 10:40 AM, John McKown john.archie.mck...@gmail.com wrote: On Mon, Sep 22, 2014 at 10:31 AM, Merlin Moncure mmonc...@gmail.com wrote: I'll pile on here: in almost 20 years of professional database development I've never had an actual problem that was solved by introducing or shortening a length constraint to text columns except in cases where overlong strings violate the data model (like a two character state code for example). It's a database equivalent of C programmer's disease. Input checks from untrusted actors should happen in the application. merlin I do not have your experience level with data bases, but if I may, I will make an addition. Input checks should also happen in the RDBMS server. I have learned you cannot trust end users _or_ programmers. Most are good and conscientious. But there are a few who just aren't. And those few seem to be very prolific in making _subtle_ errors. Had one person who was really good at replacing every p with a [ and P with { Sure. The point is distinguishing things which are *demonstrably* false (like a US VIN must be exactly 17 chars) from those that are based assumption (such as a cityname must be = 50 characters). The former should be validated in the schema and the latter should not be. If you're paranoid about the user submitting 100mb strings for username and don't trust the application to deal with that, I'd maybe consider making a domain 'safetext' which checks length on the order of a few thousand bytes and using that instead of 'text' and use it everywhere. This will prevent the dba from outsmarting the datamodel which is a *much* bigger problem in practice than the one length checks attempt to solve. Domains have certain disadvantages (like no array type) -- be advised. merlin -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
Re: [GENERAL] ALTER TEXT field to VARCHAR(1024)
Merlin Moncure-2 wrote On Mon, Sep 22, 2014 at 10:40 AM, John McKown lt; john.archie.mckown@ gt; wrote: On Mon, Sep 22, 2014 at 10:31 AM, Merlin Moncure lt; mmoncure@ gt; wrote: I'll pile on here: in almost 20 years of professional database development I've never had an actual problem that was solved by introducing or shortening a length constraint to text columns except in cases where overlong strings violate the data model (like a two character state code for example). It's a database equivalent of C programmer's disease. Input checks from untrusted actors should happen in the application. merlin I do not have your experience level with data bases, but if I may, I will make an addition. Input checks should also happen in the RDBMS server. I have learned you cannot trust end users _or_ programmers. Most are good and conscientious. But there are a few who just aren't. And those few seem to be very prolific in making _subtle_ errors. Had one person who was really good at replacing every p with a [ and P with { Sure. The point is distinguishing things which are *demonstrably* false (like a US VIN must be exactly 17 chars) from those that are based assumption (such as a cityname must be = 50 characters). The former should be validated in the schema and the latter should not be. If you're paranoid about the user submitting 100mb strings for username and don't trust the application to deal with that, I'd maybe consider making a domain 'safetext' which checks length on the order of a few thousand bytes and using that instead of 'text' and use it everywhere. This will prevent the dba from outsmarting the datamodel which is a *much* bigger problem in practice than the one length checks attempt to solve. Domains have certain disadvantages (like no array type) -- be advised. merlin These responses all seem beside the point. The OP isn't concerned that too-long data is making it into the database but rather that an unadorned text type is functionally a CLOB which the application he is using is treating like a document instead of a smallish text field that would be treated like any other value. It's like the difference between choosing input/text or textarea in HTML. Now, some tools distinguish between text and varchar only and the length piece is irrelevant; but whether that applies here I have no idea. It might be easier to simply create a view over the table, using the desired type (truncating the actual value if needed), and feed that view to the reporting engine. In the end the two questions are: 1) does adding a length restriction cause a table rewrite? 2) what level of locking occurs while the length check is resolving? I don't confidently know the answers to those two questions. David J. -- View this message in context: http://postgresql.1045698.n5.nabble.com/ALTER-TEXT-field-to-VARCHAR-1024-tp5819608p5819939.html Sent from the PostgreSQL - general mailing list archive at Nabble.com. -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
Re: [GENERAL] ALTER TEXT field to VARCHAR(1024)
David G Johnston david.g.johns...@gmail.com writes: In the end the two questions are: 1) does adding a length restriction cause a table rewrite? Yes. In principle the restriction could be checked with just a scan, not a rewrite, but ALTER TABLE doesn't currently understand that --- and in any case a scan would still be potentially a long time. 2) what level of locking occurs while the length check is resolving? AccessExclusiveLock. This would be necessary in any case for a data type change. regards, tom lane -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
Re: [GENERAL] ALTER TEXT field to VARCHAR(1024)
On 19/09/14 19:32, Marius Grama wrote: Hello, i am using Postgres 9.2 and I'd like to perform the following ALTER statement on a database table with about 30M entries : ALTER TABLE images ALTER COLUMN mask_descriptors TYPE VARCHAR(1024); The mask_descriptors field is currently having the type TEXT. I want to perform the ALTER due to the fact that it seems that copying the contents of the table to a BI SQL Server is done in row by row (instead of batch) when handling CLOBs. From the Postgres documentation I got the following : http://www.postgresql.org/docs/8.3/static/datatype-character.html Tip: There are no performance differences between these three types, apart from increased storage size when using the blank-padded type, and a few extra cycles to check the length when storing into a length-constrained column. While character has performance advantages in some other database systems, it has no such advantages in PostgreSQL. In most situations text or character varying should be used instead. Can anybody explain me what happens in the background when the alter statement is executed? I've tried it out on a small copy of the table (70K) and the operation completed in 0.2 seconds. Will the table be completely locked during the execution of the ALTER statement? I am curious as to why you want to change text to VARCHAR(1024), especially as I think that the middleware should be controlling how long a string is saved in the database rather than end user client code (for several reasons., including security concerns). However, I do not know your use cases, nor your overall situation - so my concerns may not apply to you. Cheers, Gavin
Re: [GENERAL] ALTER TEXT field to VARCHAR(1024)
On Fri, 19 Sep 2014 09:32:09 +0200 Marius Grama marius...@gmail.com wrote: Hello, i am using Postgres 9.2 and I'd like to perform the following ALTER statement on a database table with about 30M entries : ALTER TABLE images ALTER COLUMN mask_descriptors TYPE VARCHAR(1024); The mask_descriptors field is currently having the type TEXT. I want to perform the ALTER due to the fact that it seems that copying the contents of the table to a BI SQL Server is done in row by row (instead of batch) when handling CLOBs. From the Postgres documentation I got the following : http://www.postgresql.org/docs/8.3/static/datatype-character.html Tip: There are no performance differences between these three types, apart from increased storage size when using the blank-padded type, and a few extra cycles to check the length when storing into a length-constrained column. While character has performance advantages in some other database systems, it has no such advantages in PostgreSQL. In most situations text or character varying should be used instead. Can anybody explain me what happens in the background when the alter statement is executed? I've tried it out on a small copy of the table (70K) and the operation completed in 0.2 seconds. Will the table be completely locked during the execution of the ALTER statement? I share Gavin's concern that you're fixing this in the wrong place. I expect that you'll be better served by configuring the middleware to do the right thing. However, a more direct answer to your question: VARCHAR and TEXT are _the_same_ internally. Thus: ALTER TABLE images ALTER COLUMN mask_descriptors TYPE VARCHAR; would do nothing more than change the table definition. There is no need for that statement to touch any data. However, adding the length constraint of (1024) will force Postgres to check every single value to ensure it complies with the constraint. I believe if any row is longer than 1024 it will throw an error and abort the entire ATLER. -- Bill Moran I need your help to succeed: http://gamesbybill.com -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general