Re: A simple question about text fields

2021-06-18 Thread Laurenz Albe
On Fri, 2021-06-18 at 10:28 +1000, Gavan Schneider wrote:
> On 18 Jun 2021, at 9:34, David G. Johnston wrote:
> > On Thursday, June 17, 2021, Gavan Schneider  
> > wrote:
> > 
> > > My approach is to define such fields as ‘text’ and set a constraint using
> > > char_length(). This allows PG to do the business with the text in native
> > > form, and only imposes the cost of any length check when the field is
> > > updated… best of both worlds.
> > > 
> > 
> > Those are basically the same world…your alternative probably is strictly
> > worse than varchar(n) because of its novel way of implementing the same
> > functionality.
> 
> Not sure if this is strictly true. Novelty per se is not always worse. :)

True in general, but not in this case.

There is no advantage in a "text" with a check constraint on the length,
that is, no added functionality.

And it is worse for these reasons:

- the performance will be worse (big reason)

- the length limit is less obvious if you look at the table definition
  (small reason)

Yours,
Laurenz Albe
-- 
Cybertec | https://www.cybertec-postgresql.com





Re: A simple question about text fields

2021-06-17 Thread Gavan Schneider
On 18 Jun 2021, at 9:34, David G. Johnston wrote:

> On Thursday, June 17, 2021, Gavan Schneider 
> wrote:
>
>>
>> My approach is to define such fields as ‘text’ and set a constraint using
>> char_length(). This allows PG to do the business with the text in native
>> form, and only imposes the cost of any length check when the field is
>> updated… best of both worlds.
>>
>
> Those are basically the same world…your alternative probably is strictly
> worse than varchar(n) because of its novel way of implementing the same
> functionality.
>
Not sure if this is strictly true. Novelty per se is not always worse. :)
The design advantage is in all text fields being defined the same — no built in 
length.
When it becomes apparent a length constraint is needed it can be added for the 
relevant field(s), e.g., when the system does not impose proper restraint at 
the input stage.

> For most text fields any good constraint is going be done in the form of a
> regular expression, one that at minimum prevents non-printable characters
> (linefeed and carriage return being obvious examples).
>
Agree. If the business rules need some additional restrictions they can go here 
as well.
Not so obvious that newlines, etc. are unwelcome. Consider the need for human 
readable text in comment fields (or other annotation where the readability is 
enhanced by such layout). There is always the consideration of SQL injection 
(but it’s mostly too late if we’re leaving this to a constraint) and other 
toxic string sequences. But this is all business logic. The database just needs 
to handle the ‘text’ and we need to design the restraint around the content.

Gavan Schneider
——
Gavan Schneider, Sodwalls, NSW, Australia
Explanations exist; they have existed for all time; there is always a 
well-known solution to every human problem — neat, plausible, and wrong.
— H. L. Mencken, 1920




Re: A simple question about text fields

2021-06-17 Thread David G. Johnston
On Thursday, June 17, 2021, Gavan Schneider 
wrote:

>
> My approach is to define such fields as ‘text’ and set a constraint using
> char_length(). This allows PG to do the business with the text in native
> form, and only imposes the cost of any length check when the field is
> updated… best of both worlds.
>

Those are basically the same world…your alternative probably is strictly
worse than varchar(n) because of its novel way of implementing the same
functionality.

For most text fields any good constraint is going be done in the form of a
regular expression, one that at minimum prevents non-printable characters
(linefeed and carriage return being obvious examples).

David J.


Re: A simple question about text fields

2021-06-17 Thread Gavan Schneider
On 17 Jun 2021, at 1:08, Tom Lane wrote:

> Martin Mueller  writes:
>
>> Are there performance issues with the choice of 'text' vs. varchar and some 
>> character limit?  For instance, if I have a table with ten million records 
>> and text fields that may range in length from 15 to 150, can I expect a 
>> measurable improvement in response time for using varchar(150) or will text  
>>   do just or nearly as well.
>
>  There is no situation where varchar outperforms text in Postgres.
>  If you need to apply a length constraint for application semantic
>  reasons, do so ... otherwise, text is the native type.  It's
>  useful to think of varchar as being a domain over text, though
>  for various reasons it's not implemented quite that way.
>
This reminds of my days converting from MySQL to PostgreSQL. MySQL, along with 
other databases, seemed to have a strong preference for setting a length on 
character strings. And all this from before the advent of UTF encoding which 
has made the concept of string ‘length’ very messy.

Database guru and SQL author Joe Celko asserts in his ’SQL for Smarties’ that 
if he finds a text field without a length limit he will input the Heart Sutra 
(presumably in ASCII :) to demonstrate the design error. (Of course he is 
ignoring the potential for this input to help the database achieve inner 
consistency. :) . But taking Joe’s central point there do seem to be grounds 
for restricting user input text fields to a reasonable length according to the 
business need… if only to limit the damage of a cat sitting on the keyboard.

My approach is to define such fields as ‘text’ and set a constraint using 
char_length(). This allows PG to do the business with the text in native form, 
and only imposes the cost of any length check when the field is updated… best 
of both worlds.

Gavan Schneider
——
Gavan Schneider, Sodwalls, NSW, Australia
Explanations exist; they have existed for all time; there is always a 
well-known solution to every human problem — neat, plausible, and wrong.
— H. L. Mencken, 1920




Re: A simple question about text fields

2021-06-16 Thread Tom Lane
Martin Mueller  writes:
> Are there performance issues with the choice of 'text' vs. varchar and some 
> character limit?  For instance, if I have a table with ten million records 
> and text fields that may range in length from 15 to 150, can I expect a 
> measurable improvement in response time for using varchar(150) or will text   
>  do just or nearly as well. 

There is no situation where varchar outperforms text in Postgres.
If you need to apply a length constraint for application semantic
reasons, do so ... otherwise, text is the native type.  It's
useful to think of varchar as being a domain over text, though
for various reasons it's not implemented quite that way.

regards, tom lane