Re: [HACKERS] [GENERAL] string_to_array with empty input

2009-04-07 Thread Steve Crawford

Did I miss the exciting conclusion or did this drift silently off radar?

I seem to recall three options:

1. Leave as is. Arguments: least effort, no backward compatibility 
issues, since array_to_string evaluate both an array with single empty 
string and an array with no elements to an empty string, string_to_array 
on empty strings is ambiguous so we'll call it null. But: means that the 
result of null input and non-null empty-string both result in null 
output, requires everyone to explicitly handle empty strings (with the 
side effect that they really know what the result will be) instead of 
helping the majority of users. Requires: documentation change to 
accurately describe function's behavior.


2. Change function to return an array. Arguments: Distinguishes null 
from non-null input, easier coding for most cases, perhaps a less 
surprising result. But: not backward compatible, requires somewhat 
arbitrary decision on correct return value. Requires: code 
change/testing, documentation updates.


In scenario 2, there were two options:
2a. Return zero-element array.
2b. Return array with single empty-string element.

My impression was that among the change options, 2b had the most 
support (it is the most useful for the use-cases I've encountered so it 
gets my vote). If the consensus is to change the function, it may be too 
late for 8.4. But the documentation could be updated to reflect current 
and planned behavior.


Cheers,
Steve


--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [HACKERS] [GENERAL] string_to_array with empty input

2009-04-07 Thread justin

Steve Crawford wrote:

Did I miss the exciting conclusion or did this drift silently off radar?


it was pretty well split between the options.  tabled for another time. 


--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [HACKERS] [GENERAL] string_to_array with empty input

2009-04-07 Thread David E. Wheeler

On Apr 7, 2009, at 8:07 AM, Steve Crawford wrote:


In scenario 2, there were two options:
2a. Return zero-element array.
2b. Return array with single empty-string element.

My impression was that among the change options, 2b had the most  
support (it is the most useful for the use-cases I've encountered so  
it gets my vote). If the consensus is to change the function, it may  
be too late for 8.4. But the documentation could be updated to  
reflect current and planned behavior.


+1

David

--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [HACKERS] [GENERAL] string_to_array with empty input

2009-04-02 Thread David E. Wheeler

On Apr 1, 2009, at 12:19 PM, Robert Haas wrote:


my @ints = map { $_ || 0 } split ',', $string;

This ensures that I get the proper number of records in the example  
of something like '1,2,,4'.


I can't see that there's any way to do this in SQL regardless of how
we define this operation.


It's easy enough to write a function to do it:

CREATE OR REPLACE FUNCTION trim_blanks (anyarray) RETURNS anyarray AS $$
SELECT ARRAY(
SELECT CASE WHEN $1[i] IS NULL OR $1[i] = '' THEN '0' ELSE  
$1[i] END

  FROM generate_series(1, array_upper($1, 1)) s(i)
 ORDER BY i
);
$$ LANGUAGE SQL IMMUTABLE;

Best,

David

--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [HACKERS] [GENERAL] string_to_array with empty input

2009-04-02 Thread David E. Wheeler

On Apr 1, 2009, at 2:22 PM, Tom Lane wrote:


Another way to state the point is that we can offer people a choice of
two limitations: string_to_array doesn't work for zero-length lists,
or string_to_array doesn't work for empty strings (except most of the
time, it does).  The former is sounding less likely to bite people
unexpectedly.


Right, very well put.


Or we could stick to the current behavior and say use COALESCE() to
resolve the ambiguity, if you need to.


Steve has a point that leaving it as-is leaves it as impossible to  
tell the difference between string_to_array(NULL, ',') and  
string_to_array('', ','). The former properly handles an unknown  
value, while the latter, where '' is a known value, seems weird to be  
returning NULL.


Best,

David

--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [HACKERS] [GENERAL] string_to_array with empty input

2009-04-02 Thread Steve Crawford

Leif B. Kristensen wrote:

On Thursday 2. April 2009, Steve Crawford wrote:
  

Currently string_to_array(null, ',') yields a null result -
indistinguishable from string_to_array('',','). Wrapping in coalesce
does not help distinguish true null input from empty-string input. I'm
not sure at the moment what other cases exist where non-null input
generates null output.



Somehow this reminds me of the old division by zero problem.

IMO, the proper way to handle this kind of anomaly would be to test if 
the length of the string is non-zero before submitting it to the 
string_to_array() function.
  
Quite the opposite. Where division by zero is simply illegal as is, say, 
string_to_array(1234, ','), string_to_array('', ',') is legal. 
Unfortunately it is legal, and legal and legal with numerous 
reasonable interpretations of which legal is most appropriate/consistent.


I would argue against a change to have string_to_array('',',') throw an 
error.


Cheers,
Steve


--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [HACKERS] [GENERAL] string_to_array with empty input

2009-04-02 Thread Sam Mason
On Thu, Apr 02, 2009 at 09:29:04AM -0700, Steve Crawford wrote:
 Leif B. Kristensen wrote:
 Somehow this reminds me of the old division by zero problem.
 
 IMO, the proper way to handle this kind of anomaly would be to test if 
 the length of the string is non-zero before submitting it to the 
 string_to_array() function.
 
 Quite the opposite. Where division by zero is simply illegal

This is just a matter of definitions; divide by zero is fine in lots of
languages and no exception will be raised.  The fact that you're saying
it's simply illegal means that you've internalised the definition to
such an extent that any alternative appears simply illegal.

It seems reasonable to assume that if, to pick an arbitrary choice,
string_to_array returned a zero element set people would say it was
simply illegal for it to do anything else.  There are choices for
either and a choice needs to be made or the situation should somehow be
made impossible.

 I would argue against a change to have string_to_array('',',') throw an 
 error.

I'd agree, throwing an exception here doesn't seem useful.

-- 
  Sam  http://samason.me.uk/

-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [HACKERS] [GENERAL] string_to_array with empty input

2009-04-02 Thread Robert Haas
On Thu, Apr 2, 2009 at 12:17 PM, David E. Wheeler da...@kineticode.com wrote:
 On Apr 1, 2009, at 2:22 PM, Tom Lane wrote:

 Another way to state the point is that we can offer people a choice of
 two limitations: string_to_array doesn't work for zero-length lists,
 or string_to_array doesn't work for empty strings (except most of the
 time, it does).  The former is sounding less likely to bite people
 unexpectedly.

 Right, very well put.

 Or we could stick to the current behavior and say use COALESCE() to
 resolve the ambiguity, if you need to.

 Steve has a point that leaving it as-is leaves it as impossible to tell the
 difference between string_to_array(NULL, ',') and string_to_array('', ',').
 The former properly handles an unknown value, while the latter, where '' is
 a known value, seems weird to be returning NULL.

*shrug* CASE WHEN blah IS NOT NULL THEN string_to_array(blah, ',') END

More and more I'm leaning toward leaving this alone.  No matter how
you define it, the behavior can be changed to whichever alternative
you prefer with a 1-line case statement.

...Robert

-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [HACKERS] [GENERAL] string_to_array with empty input

2009-04-02 Thread Robert Haas
On Thu, Apr 2, 2009 at 12:10 PM, David E. Wheeler da...@kineticode.com wrote:
 On Apr 1, 2009, at 12:19 PM, Robert Haas wrote:

 my @ints = map { $_ || 0 } split ',', $string;

 This ensures that I get the proper number of records in the example of
 something like '1,2,,4'.

 I can't see that there's any way to do this in SQL regardless of how
 we define this operation.

 It's easy enough to write a function to do it:

 CREATE OR REPLACE FUNCTION trim_blanks (anyarray) RETURNS anyarray AS $$
    SELECT ARRAY(
        SELECT CASE WHEN $1[i] IS NULL OR $1[i] = '' THEN '0' ELSE $1[i] END
          FROM generate_series(1, array_upper($1, 1)) s(i)
         ORDER BY i
    );
 $$ LANGUAGE SQL IMMUTABLE;

Ah!  Thanks for the tip.

...Robert

-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [HACKERS] [GENERAL] string_to_array with empty input

2009-04-02 Thread Tom Lane
David E. Wheeler da...@kineticode.com writes:
 Or we could stick to the current behavior and say use COALESCE() to
 resolve the ambiguity, if you need to.

 Steve has a point that leaving it as-is leaves it as impossible to  
 tell the difference between string_to_array(NULL, ',') and  
 string_to_array('', ','). The former properly handles an unknown  
 value, while the latter, where '' is a known value, seems weird to be  
 returning NULL.

Yeah, COALESCE is an abuse of a convenient notation, which will fall
over if you also want NULL to yield NULL.  A correct fix
outside-the-function would look more like

case when str = '' then '{}'::text[] else string_to_array(str, ',') end

which should correctly yield NULL for NULL input and an empty array
for empty input.  Similarly, if someone wanted to force the
single-empty-string result, they should do

case when str = '' then '{}'::text[] else string_to_array(str, ',') end

which also still yields NULL if str is NULL.

Right at the moment, if we stick with the historical definition
of the function, *both* camps have to write out their choice of
the above.  Seems like this is the worst of all possible worlds.
We should probably pick one or the other.

regards, tom lane

-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [HACKERS] [GENERAL] string_to_array with empty input

2009-04-02 Thread Robert Haas
On Thu, Apr 2, 2009 at 2:04 PM, Tom Lane t...@sss.pgh.pa.us wrote:
 Right at the moment, if we stick with the historical definition
 of the function, *both* camps have to write out their choice of
 the above.  Seems like this is the worst of all possible worlds.
 We should probably pick one or the other.

ISTM there are three camps.

...Robert

-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [HACKERS] [GENERAL] string_to_array with empty input

2009-04-02 Thread Tom Lane
Robert Haas robertmh...@gmail.com writes:
 On Thu, Apr 2, 2009 at 2:04 PM, Tom Lane t...@sss.pgh.pa.us wrote:
 Right at the moment, if we stick with the historical definition
 of the function, *both* camps have to write out their choice of
 the above.  Seems like this is the worst of all possible worlds.
 We should probably pick one or the other.

 ISTM there are three camps.

If there's a camp that actually *wants* a NULL result for this case,
I missed the reasoning.  AFAICS we can either say that every application
is going to have to put in a CASE wrapper around this function, or say
that we'll make it do the right thing for some of them and the rest have
to put the same wrapper around it.

regards, tom lane

-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [HACKERS] [GENERAL] string_to_array with empty input

2009-04-02 Thread Sam Mason
On Thu, Apr 02, 2009 at 02:04:41PM -0400, Tom Lane wrote:
 A correct fix
 outside-the-function would look more like
 
 case when str = '' then '{}'::text[] else string_to_array(str, ',') end
 
 which should correctly yield NULL for NULL input and an empty array
 for empty input.  Similarly, if someone wanted to force the
 single-empty-string result, they should do
 
 case when str = '' then '{}'::text[] else string_to_array(str, ',') end
 
 which also still yields NULL if str is NULL.
 
 Right at the moment, if we stick with the historical definition
 of the function, *both* camps have to write out their choice of
 the above.  Seems like this is the worst of all possible worlds.
 We should probably pick one or the other.

Yes, I'd be tempted to pick one and go with it.  It's seems a completely
arbitrary choice one way or the other but the current behaviour is
certainly wrong.

I'd go with returning a zero element array because it would do
the right thing more often when paired with array_to_string.
I've also been through the first few pages of a Google search for
array_to_string and it seems to do the right thing for the majority
of the cases.

-- 
  Sam  http://samason.me.uk/

-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [HACKERS] [GENERAL] string_to_array with empty input

2009-04-02 Thread Robert Haas
On Thu, Apr 2, 2009 at 2:18 PM, Tom Lane t...@sss.pgh.pa.us wrote:
 Robert Haas robertmh...@gmail.com writes:
 On Thu, Apr 2, 2009 at 2:04 PM, Tom Lane t...@sss.pgh.pa.us wrote:
 Right at the moment, if we stick with the historical definition
 of the function, *both* camps have to write out their choice of
 the above.  Seems like this is the worst of all possible worlds.
 We should probably pick one or the other.

 ISTM there are three camps.

 If there's a camp that actually *wants* a NULL result for this case,
 I missed the reasoning.  AFAICS we can either say that every application
 is going to have to put in a CASE wrapper around this function, or say
 that we'll make it do the right thing for some of them and the rest have
 to put the same wrapper around it.

So that we don't break existing apps because of an issue that is
trivial to work around.

...Robert

-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [HACKERS] [GENERAL] string_to_array with empty input

2009-04-02 Thread Tom Lane
Robert Haas robertmh...@gmail.com writes:
 On Thu, Apr 2, 2009 at 2:18 PM, Tom Lane t...@sss.pgh.pa.us wrote:
 If there's a camp that actually *wants* a NULL result for this case,
 I missed the reasoning.

 So that we don't break existing apps because of an issue that is
 trivial to work around.

We would only be breaking them if a NULL result were actually the
correct behavior for the application's requirements, which seems
a bit unlikely.

regards, tom lane

-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [HACKERS] [GENERAL] string_to_array with empty input

2009-04-02 Thread David E. Wheeler

On Apr 2, 2009, at 11:24 AM, Sam Mason wrote:

Yes, I'd be tempted to pick one and go with it.  It's seems a  
completely

arbitrary choice one way or the other but the current behaviour is
certainly wrong.

I'd go with returning a zero element array because it would do
the right thing more often when paired with array_to_string.
I've also been through the first few pages of a Google search for
array_to_string and it seems to do the right thing for the  
majority

of the cases.


Forgive me if I'm missing something, but it seems to me that  
array_to_string() works either way, no?


try=# select '' || array_to_string('{}'::text[], ',') || ''; ?column?
--
 
(1 row)

Time: 72.129 ms
try=# select '' || array_to_string('{}'::text[], ',') || '';
 ?column?
--
 
(1 row)

Best,

David

--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [HACKERS] [GENERAL] string_to_array with empty input

2009-04-02 Thread Robert Haas
On Thu, Apr 2, 2009 at 2:50 PM, Tom Lane t...@sss.pgh.pa.us wrote:
 Robert Haas robertmh...@gmail.com writes:
 On Thu, Apr 2, 2009 at 2:18 PM, Tom Lane t...@sss.pgh.pa.us wrote:
 If there's a camp that actually *wants* a NULL result for this case,
 I missed the reasoning.

 So that we don't break existing apps because of an issue that is
 trivial to work around.

 We would only be breaking them if a NULL result were actually the
 correct behavior for the application's requirements, which seems
 a bit unlikely.

But that's completely untrue.  If the most useful behavior is either
ARRAY[''] or ARRAY[], then there are presumably lots and lots of
people out there who have apps that do COALESCE(string_to_array(...),
something).  Whichever way you change string_to_array() will break all
of the people doing this who wanted the opposite behavior for no good
reason.

...Robert

-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [HACKERS] [GENERAL] string_to_array with empty input

2009-04-02 Thread Sam Mason
On Thu, Apr 02, 2009 at 12:06:01PM -0700, David E. Wheeler wrote:
 On Apr 2, 2009, at 11:24 AM, Sam Mason wrote:
 Yes, I'd be tempted to pick one and go with it.  It's seems a
 completely arbitrary choice one way or the other but the current
 behaviour is certainly wrong.
 
 I'd go with returning a zero element array because it would do
 the right thing more often when paired with array_to_string.
 I've also been through the first few pages of a Google search for
 array_to_string and it seems to do the right thing for the
 majority of the cases.

 Forgive me if I'm missing something, but it seems to me that
 array_to_string() works either way, no?

Sorry, I meant to type string_to_array but typed array_to_string
instead---after doing exactly the same thing when searching for stuff in
Google!  I think I should be using copy/paste more!

-- 
  Sam  http://samason.me.uk/

-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [HACKERS] [GENERAL] string_to_array with empty input

2009-04-01 Thread Robert Haas
On Tue, Mar 31, 2009 at 10:44 AM, Greg Stark st...@enterprisedb.com wrote:
 On Tue, Mar 31, 2009 at 3:42 PM, Sam Mason s...@samason.me.uk wrote:

  string_to_array('',',')::INT[]  = invalid input syntax for integer: 

 Oof. That's a good point.

+1.  I find this argument much more compelling than anything else
that's been offered up so far.

...Robert

-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [HACKERS] [GENERAL] string_to_array with empty input

2009-04-01 Thread Tom Lane
Robert Haas robertmh...@gmail.com writes:
 On Tue, Mar 31, 2009 at 10:44 AM, Greg Stark st...@enterprisedb.com wrote:
 On Tue, Mar 31, 2009 at 3:42 PM, Sam Mason s...@samason.me.uk wrote:
 string_to_array('',',')::INT[]  = invalid input syntax for integer: 
 
 Oof. That's a good point.

 +1.  I find this argument much more compelling than anything else
 that's been offered up so far.

Yeah.  It seems to me that if you consider only the case where the array
elements are text, there's a weak preference for considering '' to be a
single empty string; but as soon as you think about any other datatype,
there's a strong preference to consider it a zero-element list.  So I
too have come around to favor the latter interpretation.  Do we have
any remaining holdouts?

regards, tom lane

-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [HACKERS] [GENERAL] string_to_array with empty input

2009-04-01 Thread David E. Wheeler

On Apr 1, 2009, at 9:02 AM, Tom Lane wrote:


+1.  I find this argument much more compelling than anything else
that's been offered up so far.


Yeah.  It seems to me that if you consider only the case where the  
array
elements are text, there's a weak preference for considering '' to  
be a
single empty string; but as soon as you think about any other  
datatype,

there's a strong preference to consider it a zero-element list.  So I
too have come around to favor the latter interpretation.  Do we have
any remaining holdouts?


Well, I'd just point out that the return value of string_to_array() is  
text[]. Thus, this is not a problem with string_to_array(), but a  
casting problem from text[] to int[]. Making string_to_array() return  
a NULL for this case to make casting simpler is addressing the problem  
in the wrong place, IMHO. If I want to do this in Perl, for example,  
I'd do something like this:


my @ints = grep { defined $_  $_ ne '' } split ',', $string;

So I split the string into an array, and then remove unreasonable  
values. This also allows me to set defaults:


my @ints = map { $_ || 0 } split ',', $string;

This ensures that I get the proper number of records in the example of  
something like '1,2,,4'.


So I still think that string_to_array('', ',') should return '{}',  
and how casting is handled should be left to the user to flexibly  
handle.


That said, I'm not seeing a simple function for modifying an array.  
I'd have to write one for each specific case. :-(


Best,

David




--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [HACKERS] [GENERAL] string_to_array with empty input

2009-04-01 Thread justin




Tom Lane wrote:

  Robert Haas robertmh...@gmail.com writes:
  
  
On Tue, Mar 31, 2009 at 10:44 AM, Greg Stark st...@enterprisedb.com wrote:


  On Tue, Mar 31, 2009 at 3:42 PM, Sam Mason s...@samason.me.uk wrote:
  
  
string_to_array('',',')::INT[] = invalid input syntax for integer: ""

  
  Oof. That's a good point.
  

  
  
  
  
+1.  I find this argument much more compelling than anything else
that's been offered up so far.

  
  
Yeah.  It seems to me that if you consider only the case where the array
elements are text, there's a weak preference for considering '' to be a
single empty string; but as soon as you think about any other datatype,
there's a strong preference to consider it a zero-element list.  So I
too have come around to favor the latter interpretation.  Do we have
any remaining holdouts?

			regards, tom lane
  


I'm still a hold out, We are taking a string putting it into a array
based on a delimiter. That is very simple and straight forward. Yet
many argue if we want to cast this into another data type the function
should deal with in limited cases. 

string_to_array('',',')::INT[] works as proposed 

But 
string_to_array(',,,', ',' )::INT[] Fails 
or
string_to_array('1,2,,4', ',' )::INT[] Fails .


I'm trying to understand the difference between a empty string to a
string with many blank entries between the delimiter. 
Consider
',,' = '' once the delimiter is removed . Yet Seven zero length
entries were passed. How is that going to
be handled 

In one case it works and yet other cases it fails. This is
inconsistent behavior. Unless all zero length strings are removed or
are treated as NULLs I can't see how casting to another type is going
to work.

If zero length strings are treated as NULLs this creates idea that zero
length strings are = to NULLs. 

The input is a string and the output is text[], casting to another
data type is error prone and should be handled by the programmer. 




Re: [HACKERS] [GENERAL] string_to_array with empty input

2009-04-01 Thread Tom Lane
David E. Wheeler da...@kineticode.com writes:
 Well, I'd just point out that the return value of string_to_array() is  
 text[].

True...

 Thus, this is not a problem with string_to_array(), but a  
 casting problem from text[] to int[].

Nonsense.  The question is whether string_to_array is meant to be useful
for lists of anything except text.  I agree you could argue that it
isn't.  But even in the domain of text it's not all that cut-and-dried
whether string_to_array should return array[] or array[''] for empty
input.  So ISTM we're giving up less than we gain by choosing the former.

regards, tom lane

-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [HACKERS] [GENERAL] string_to_array with empty input

2009-04-01 Thread David E. Wheeler

On Apr 1, 2009, at 10:09 AM, Tom Lane wrote:


Thus, this is not a problem with string_to_array(), but a
casting problem from text[] to int[].


Nonsense.  The question is whether string_to_array is meant to be  
useful

for lists of anything except text.  I agree you could argue that it
isn't.  But even in the domain of text it's not all that cut-and-dried
whether string_to_array should return array[] or array[''] for empty
input.  So ISTM we're giving up less than we gain by choosing the  
former.


Yeah. I'm okay with either, as long as it's consistent. I have a mild  
preference for '{}', but I can live with ARRAY[] instead. As long as  
it's not NULL that gets returned.


Best,

David

--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [HACKERS] [GENERAL] string_to_array with empty input

2009-04-01 Thread David E. Wheeler

On Apr 1, 2009, at 10:05 AM, justin wrote:


string_to_array('',',')::INT[]  works as proposed

But
string_to_array(',,,', ',' )::INT[]   Fails
or
string_to_array('1,2,,4', ',' )::INT[] Fails .


I'm trying to understand the difference between a empty string to a  
string with  many blank entries between  the delimiter.
Consider   ',,'  = ''  once the delimiter is removed .  Yet  
Seven zero length entries were passed.  How is that going to be  
handled


Right, it's making a special case of '', which does seem rather  
inconsistent to me.


Best,

David

--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [HACKERS] [GENERAL] string_to_array with empty input

2009-04-01 Thread Sam Mason
On Wed, Apr 01, 2009 at 10:23:18AM -0700, David E. Wheeler wrote:
 On Apr 1, 2009, at 10:05 AM, justin wrote:
 string_to_array('',',')::INT[]  works as proposed
 
 But
 string_to_array(',,,', ',' )::INT[]   Fails
 or
 string_to_array('1,2,,4', ',' )::INT[] Fails .
 
 
 I'm trying to understand the difference between a empty string to a  
 string with  many blank entries between  the delimiter.
 Consider   ',,'  = ''  once the delimiter is removed .  Yet  
 Seven zero length entries were passed.  How is that going to be  
 handled
 
 Right, it's making a special case of '', which does seem rather  
 inconsistent to me.

Yes it is; but it's a useful special case because it allows:

  string_to_array(array_to_string(col,','),',')

to do the right thing whether it's got zero or more elements in.  With
the current implementation you get a NULL back in the case of zero
elements and the expected array back the rest of the time.

To me, it doesn't really matter whether:

  string_to_array(',', ',' )::INT[]

fails or not; because array_to_string will never generate a string that
looks like this.

-- 
  Sam  http://samason.me.uk/

-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [HACKERS] [GENERAL] string_to_array with empty input

2009-04-01 Thread Greg Stark
On Wed, Apr 1, 2009 at 6:23 PM, David E. Wheeler da...@kineticode.com wrote:
 Right, it's making a special case of '', which does seem rather inconsistent
 to me.


David E. Wheeler da...@kineticode.com writes:

 On Apr 1, 2009, at 10:05 AM, justin wrote:

 string_to_array('',',')::INT[]  works as proposed

 But
 string_to_array(',,,', ',' )::INT[]   Fails
 or
 string_to_array('1,2,,4', ',' )::INT[] Fails .


 I'm trying to understand the difference between a empty string to a  string
 with  many blank entries between  the delimiter.

Well, uh, in one case it's empty and in the other case it's not?

 Consider   ',,'  = ''  once the delimiter is removed .  Yet  Seven zero
 length entries were passed.  How is that going to be  handled

Well it's pretty clear empty delimiters cannot be handled consistently. Some
languages handle them as a special case (splitting every character into a
separate string, for example -- which I'll point out will result in an empty
array as a result for an empty string input) or make it an error.

 Right, it's making a special case of '', which does seem rather  inconsistent
 to me.

It's not a special case -- or it's a special case whichever we choose,
depending on which way you look at it.

What we're talking about here is replacing the blank values in the following
tables. We can get either the first one right in both cases with {} as the
result, or we can get the second one right in the second table with {}.
Either way there is an inconsistency in at least one case.

The existing behaviour of returning NULL is the only consistent choice since
the correct value is unknown. And one could argue that it's easier to
replace NULL with the correct value if the programmer knows using coalesce
than it is to replace either  or {}. But I'm still leaning to thinking
that using an arbitrary choice that at least gets most users intentions is
better.

postgres=# select input,
string_to_array(array_to_string(input,','),',') as output from (values
(array[]::text[]),(array['foo']),(array['foo','bar']),(array['foo','bar','baz']))
as input(input);
 input |output
---+---
 {}|
 {foo} | {foo}
 {foo,bar} | {foo,bar}
 {foo,bar,baz} | {foo,bar,baz}
(4 rows)

postgres=# select input,
string_to_array(array_to_string(input,','),',') as output from (values
(array[]::text[]),(array['']),(array['','']),(array['','',''])) as
input(input);
   input|   output
+
 {} |
 {}   |
 {,}| {,}
 {,,} | {,,}
(4 rows)

-- 
greg

-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [HACKERS] [GENERAL] string_to_array with empty input

2009-04-01 Thread Sam Mason
On Wed, Apr 01, 2009 at 07:40:16PM +0100, Greg Stark wrote:
 The existing behaviour of returning NULL is the only consistent
 choice since the correct value is unknown. And one could argue that
 it's easier to replace NULL with the correct value if the programmer
 knows using coalesce than it is to replace either  or {}.

Couldn't a similar argument be applied for division by zero?  Since it's
not known whether the user wants to get a divide by zero exception or
infinity PG should return NULL and punt the choice to the user.

I think everybody would agree that this would be a bad thing to do!

 But I'm
 still leaning to thinking that using an arbitrary choice that at least
 gets most users intentions is better.

I'd agree; returning NULL and not forcing the user to make a choice
is a bad design decision---the user doesn't need to put a coalesce
in and hence their code will probably break in strange ways when
they're not expecting it.  Nobody suggest adding a third parameter to
string_to_array, please!

The general mantra that seems to apply here is one good option is
better than two bad ones.

-- 
  Sam  http://samason.me.uk/

-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [HACKERS] [GENERAL] string_to_array with empty input

2009-04-01 Thread Robert Haas
On Wed, Apr 1, 2009 at 12:52 PM, David E. Wheeler da...@kineticode.com wrote:
 Well, I'd just point out that the return value of string_to_array() is
 text[]. Thus, this is not a problem with string_to_array(), but a casting
 problem from text[] to int[]. Making string_to_array() return a NULL for
 this case to make casting simpler is addressing the problem in the wrong
 place, IMHO. If I want to do this in Perl, for example, I'd do something
 like this:

 my @ints = grep { defined $_  $_ ne '' } split ',', $string;

I've written code that looks a whole lot like this myself, but there's
no easy way to do that in SQL.  SQL, in particular, lacks closures, so
grep {} and map {} don't exist.  I really, really wish they did, but I
believe that our type system is too woefully pathetic to be up to the
job.  So it seems to me that arguing that SQL (which lacks those
primitives) should match Perl (which has them) isn't really getting us
anywhere.

 my @ints = map { $_ || 0 } split ',', $string;

 This ensures that I get the proper number of records in the example of 
 something like '1,2,,4'.

I can't see that there's any way to do this in SQL regardless of how
we define this operation.

...Robert

-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [HACKERS] [GENERAL] string_to_array with empty input

2009-04-01 Thread Robert Haas
On Wed, Apr 1, 2009 at 1:05 PM, justin jus...@emproshunts.com wrote:
 I'm still a hold out,  We are taking a string putting it into a array based
 on a delimiter.  That is very simple and straight forward.  Yet many argue
 if we want to cast this into another data type the function should deal with
 in limited cases.

 string_to_array('',',')::INT[]  works as proposed

 But
 string_to_array(',,,', ',' )::INT[]   Fails
 or
 string_to_array('1,2,,4', ',' )::INT[] Fails .

But... but... those aren't comma-separated lists of integers.  If they
were, it would work.

string_to_array('cow,dog,horse')::INT[] will also fail.

If you take 0 items of any type whatsoever and join them together with
commas, you will get the empty string.  It is also true that if you
join 1 item together with commas, you will get that item back, and if
that item is the empty string, you will now have the empty string.  I
think it's better to worry more about the first case because it
applies to any type at all, whereas the latter case ONLY applies in
situations where the empty string is a potentially legal value.

...Robert

-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [HACKERS] [GENERAL] string_to_array with empty input

2009-04-01 Thread justin
If someone can show me a real world example  this logic simplifies the 
code and has more uses I'll bite



I just presently can't see how this works better.

--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [HACKERS] [GENERAL] string_to_array with empty input

2009-04-01 Thread Martin Gainty

Split strings into array elements using provided delimiter



string_to_array('xx~^~yy~^~zz', '~^~') 

output: {xx,yy,zz}


http://www.postgresql.org/docs/8.3/interactive/functions-array.html

?
Martin 
__ 
Disclaimer and confidentiality note 
This message is confidential and may be privileged. If you are not the intended 
recipient, we kindly ask you to  please inform the sender. Any unauthorised 
dissemination or copying hereof is prohibited. This message serves for 
information purposes only and shall not have any legally binding effect. Given 
that e-mails can easily be subject to manipulation, we can not accept any 
liability for the content provided.






 Date: Wed, 1 Apr 2009 15:49:42 -0400
 From: jus...@emproshunts.com
 To: robertmh...@gmail.com
 CC: t...@sss.pgh.pa.us; st...@enterprisedb.com; s...@samason.me.uk; 
 pgsql-general@postgresql.org; pgsql-hack...@postgresql.org
 Subject: Re: [HACKERS] [GENERAL] string_to_array with empty input
 
 If someone can show me a real world example  this logic simplifies the 
 code and has more uses I'll bite
 
 
 I just presently can't see how this works better.
 
 -- 
 Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
 To make changes to your subscription:
 http://www.postgresql.org/mailpref/pgsql-general

_
Rediscover Hotmail®: Get e-mail storage that grows with you. 
http://windowslive.com/RediscoverHotmail?ocid=TXT_TAGLM_WL_HM_Rediscover_Storage1_042009

Re: [HACKERS] [GENERAL] string_to_array with empty input

2009-04-01 Thread Justin




Martin Gainty wrote:

  
  Split strings into array elements using provided
delimiter
  
string_to_array('xx~^~yy~^~zz', '~^~') 
output: {xx,yy,zz}
  
http://www.postgresql.org/docs/8.3/interactive/functions-array.html
  

Sorry thats not the question i'm asking. 

We are debating if it makes sense to change the output in certain
cases. 

I'm for not returning nulls or returning zero element array. 

I'm asking how is the other better by giving a real world example??? I
don't see the plus side at the moment. 




Re: [HACKERS] [GENERAL] string_to_array with empty input

2009-04-01 Thread Tom Lane
Robert Haas robertmh...@gmail.com writes:
 If you take 0 items of any type whatsoever and join them together with
 commas, you will get the empty string.  It is also true that if you
 join 1 item together with commas, you will get that item back, and if
 that item is the empty string, you will now have the empty string.  I
 think it's better to worry more about the first case because it
 applies to any type at all, whereas the latter case ONLY applies in
 situations where the empty string is a potentially legal value.

I'm starting to vacillate again.  It's clear that for the purposes
of string_to_array, an empty input string is fundamentally ambiguous:
it could mean a list of no things, or a list of one empty thing.
So the two cases in which an application can safely make use of this
function are
(1) if lists of no things never happen.
(2) if lists never contain empty things.
Either rule allows us to resolve the ambiguity.  We've been discussing
the fact that (2) is an okay assumption for many non-text data types,
but none-the-less string_to_array is in itself a text function and (2)
is not very good for text.  Making this worse, the format *appears*
to work fine for empty strings, so long as you don't have exactly
one of them.  So it seems like applications might be much more likely
to violate (2) than (1).

Another way to state the point is that we can offer people a choice of
two limitations: string_to_array doesn't work for zero-length lists,
or string_to_array doesn't work for empty strings (except most of the
time, it does).  The former is sounding less likely to bite people
unexpectedly.

Or we could stick to the current behavior and say use COALESCE() to
resolve the ambiguity, if you need to.

regards, tom lane

-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [HACKERS] [GENERAL] string_to_array with empty input

2009-04-01 Thread Robert Haas
On Wed, Apr 1, 2009 at 5:22 PM, Tom Lane t...@sss.pgh.pa.us wrote:
 Or we could stick to the current behavior and say use COALESCE() to
 resolve the ambiguity, if you need to.

If there's no consensus on changing the behavior, it's probably better
to be backward compatible than not.

...Robert

-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [HACKERS] [GENERAL] string_to_array with empty input

2009-04-01 Thread Steve Crawford

Tom Lane wrote:


I'm starting to vacillate again.  It's clear that for the purposes
of string_to_array, an empty input string is fundamentally ambiguous:
it could mean a list of no things, or a list of one empty thing.
  
Agreed. Of the two, a list of one empty thing makes string_to_array 
closer to an inverse of array_to_string.



Or we could stick to the current behavior and say use COALESCE() to
resolve the ambiguity, if you need to.

  
Currently string_to_array(null, ',') yields a null result - 
indistinguishable from string_to_array('',','). Wrapping in coalesce 
does not help distinguish true null input from empty-string input. I'm 
not sure at the moment what other cases exist where non-null input 
generates null output.


If the decision is to leave the behavior unchanged, it at least cries 
out for a documentation patch.


Cheers,
Steve


--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [HACKERS] [GENERAL] string_to_array with empty input

2009-04-01 Thread Leif B. Kristensen
On Thursday 2. April 2009, Steve Crawford wrote:
Currently string_to_array(null, ',') yields a null result -
indistinguishable from string_to_array('',','). Wrapping in coalesce
does not help distinguish true null input from empty-string input. I'm
not sure at the moment what other cases exist where non-null input
generates null output.

Somehow this reminds me of the old division by zero problem.

IMO, the proper way to handle this kind of anomaly would be to test if 
the length of the string is non-zero before submitting it to the 
string_to_array() function.
-- 
Leif Biberg Kristensen | Registered Linux User #338009
Me And My Database: http://solumslekt.org/blog/

-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [HACKERS] [GENERAL] string_to_array with empty input

2009-03-31 Thread Brendan Jurd
On Tue, Mar 31, 2009 at 2:26 PM, Tom Lane t...@sss.pgh.pa.us wrote:
 Does anyone want to argue for keeping it the same?  Or perhaps
 argue that a zero-element array is a more sensible result than
 a one-element array with one empty string?  (It doesn't seem
 like it to me, but maybe somebody thinks so.)


My first thought was that it should be a zero-element array, because
then the string_to_array() behaviour would conform to the notion that
it returns an array with 1 element per string fragment bounded by the
delimiter.

However, I note that if you provide an empty delimiter, or one which
doesn't occur anywhere in the source string, you get an array with one
element, being the entire source string.

# select string_to_array('1-2-3', '-');
 {1,2,3}

# select string_to_array('1-2-3', 'x');
 {1-2-3}

Given this behaviour, I would argue for consistent treatment for a
zero-length source string: it should return an array with one element,
being the entire source string, whenever there is no string splitting
to take place.  And if the source string happens to be zero-length,
then the return value would be as expected by the OP.

Cheers,
BJ

-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [HACKERS] [GENERAL] string_to_array with empty input

2009-03-31 Thread Sam Mason
On Tue, Mar 31, 2009 at 05:45:33PM +1100, Brendan Jurd wrote:
 On Tue, Mar 31, 2009 at 2:26 PM, Tom Lane t...@sss.pgh.pa.us wrote:
  Does anyone want to argue for keeping it the same?  Or perhaps
  argue that a zero-element array is a more sensible result than
  a one-element array with one empty string?  (It doesn't seem
  like it to me, but maybe somebody thinks so.)
 
 Given this behaviour, I would argue for consistent treatment for a
 zero-length source string: it should return an array with one element,
 being the entire source string, whenever there is no string splitting
 to take place.  And if the source string happens to be zero-length,
 then the return value would be as expected by the OP.

I'd agree with this as well, just to be verbose:

  string_to_array(NULL,',')   =  NULL
  string_to_array('',',') =  {}
  string_to_array('a',',')=  {a}
  string_to_array('a,',',')   =  {a,}
  string_to_array('a,b',',')  =  {a,b}

However, I can see (nasty and hacky) reasons why the current behaviour
is there.  You'd get the following error if this change was accepted:

  string_to_array('',',')::INT[]  = invalid input syntax for integer: 

Which you don't get at the moment; although you do currently get it in
other common cases such as:

  string_to_array('1,',',')::INT[]

If you want backwards compatible behaviour you could always bung a
NULLIF in there:

  string_to_array(NULLIF('',''),',')::INT[]  = NULL

To aid porting of code and general utility, I'd be tempted to add a pair
of functions like:

  CREATE FUNCTION array_filter_blanks(TEXT[]) RETURNS TEXT[]
  LANGUAGE SQL IMMUTABLE STRICT AS $$
ARRAY(SELECT s FROM unnest($1) AS s WHERE s  '') $$;

  CREATE FUNCTION array_nullif(ANYARRAY,ANYELEMENT) RETURNS ANYARRAY
  LANGUAGE SQL IMMUTABLE AS $$
ARRAY(SELECT NULLIF(s,$2) FROM unnest($1) AS s) $$;

Although, this is obviously going above and beyond what you originally
asked for.

-- 
  Sam  http://samason.me.uk/

-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [HACKERS] [GENERAL] string_to_array with empty input

2009-03-31 Thread Tom Lane
Greg Stark st...@enterprisedb.com writes:
 On Tue, Mar 31, 2009 at 3:42 PM, Sam Mason s...@samason.me.uk wrote:
 string_to_array('',',')::INT[]  = invalid input syntax for integer: 

 Oof. That's a good point.

Isn't that an argument in favor of the zero-size-array definition?

regards, tom lane

-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [HACKERS] [GENERAL] string_to_array with empty input

2009-03-31 Thread David E. Wheeler

On Mar 31, 2009, at 8:34 AM, Sam Mason wrote:


What do you really expect to be returned for things like

select count_elements(string_to_array('butter,tea,milk',','))


3 = {butter,tea,milk}


select count_elements(string_to_array('butter,tea',','))


2 = {butter,tea}


select count_elements(string_to_array('butter',','))


1 = {butter}


select count_elements(string_to_array('',','))


1 = ARRAY['']


I'd expect 3,2,1 and 1.

That's also a disingenuous example; what would you expect back from:

 select count_elements(string_to_array('butter,,milk',','))


3 = ARRAY['butter', '', 'milk']


I think the semantics you want is what you'd get from:

 array_filter_blanks(string_to_array($1,$2))

where I defined array_filter_blanks in my previous post.


Yeah, if I wanted something like that in Perl, I'd do:

  my @stuff = grep { $_ } split /,/, $string;

In no case would I ever expect a NULL, however, unless I was trying to  
split on NULL.


NULL = string_to_array(NULL, ',');

Best,

David


--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [HACKERS] [GENERAL] string_to_array with empty input

2009-03-31 Thread justin

Sam Mason wrote:


  string_to_array('',',')::INT[]  = invalid input syntax for integer: 

Which you don't get at the moment; although you do currently get it in
other common cases such as:

  string_to_array('1,',',')::INT[]

If you want backwards compatible behaviour you could always bung a
NULLIF in there:

  string_to_array(NULLIF('',''),',')::INT[]  = NULL


  


But consider  this fails also

select string_to_array('1, , 3', ',' )::int[] = ERROR:  invalid input 
syntax for integer:  


yet this works

select string_to_array('1, 2, 3',',')::int[]

--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [HACKERS] [GENERAL] string_to_array with empty input

2009-03-30 Thread David E. Wheeler

On Mar 30, 2009, at 8:26 PM, Tom Lane wrote:


Does anyone want to argue for keeping it the same?  Or perhaps
argue that a zero-element array is a more sensible result than
a one-element array with one empty string?  (It doesn't seem
like it to me, but maybe somebody thinks so.)


Hrm. There seems to be some disagreement about this among some  
languages:


% perl -le '@r = split /-/, ; print length @r; print qq{$r[0]}'
1


% irb
 puts ''.split('-')
= nil

So Perl returns a single element as Steve had been expecting, while  
Ruby returns nil. I'm used to the Perl way, but I guess there's room  
for various interpretations, including the current implementation,  
with which Ruby would seem to agree.


Best,

David

--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general