Re: [GENERAL] Very puzzling sort behavior

Ken Tanzer Thu, 10 Sep 2015 13:51:12 -0700

On Thu, Sep 10, 2015 at 12:56 PM, Peter Geoghegan <
peter.geoghega...@gmail.com> wrote:

> On Thu, Sep 10, 2015 at 12:51 PM, Ken Tanzer <ken.tan...@gmail.com> wrote:
> > OK, can one of you help me out in understanding this?  I would have
> thought that given "CLARK," and "CLARKE" that the comma would get compared
> against the E and come first.  End of story, before we even get to anything
> farther in the string.  What am I missing?
>
> That's only how it works with the C locale. Otherwise, there are
> complicated rules to weigh things like space and punctuation (and
> accents/diacritics) less prominently than primary alphabetical
> ordering. This is often useful. Anyway, based on what you say here, I
> think you should actually "ORDER BY name_last, name_first".
>
> --
> Regards,
> Peter Geoghegan
>

Thanks.  A little more help would be appreciated.  First a little context:

What I mailed out what a boiled down example.  In reality, what I have is a
ton of tables with a client_id in them, and a convenience function
client_name(client_id) that returns the name_last, name_first string (plus
an alias if it exists).  client_name is used all over the place in both
views and in an app that uses the database.  There is a similar, also
heavily used, staff_name function.  Eliminating the use of these functions
is a non-starter for me--I'd much rather live with the existing sort
behavior, especially as no one has ever pointed this out despite over a
decade of use.

I'm hoping to sort change this behavior with as minimal a change as
possible (e.g., minimal potential for unexpected side effects or
breakage).  I was hoping to just add a COLLATE "C" within the function:

CREATE OR REPLACE FUNCTION client_name( cid int4 ) RETURNS text AS $$
     -- client is a view that draws from tbl_client.  name_full is the
field with the name_last, name_first data in it
     SELECT name_full COLLATE "C" FROM client WHERE client_id=$1;
$$

but that seems to have no effect.  And sure enough the documentation seems
to back that up. ("The collation assigned to a function or operator's
combined input expressions is also considered to apply to the function or
operator's result, if the function or operator delivers a result of a
collatable data type.")  So this may be wishful thinking, but is there any
other way to specify the collation of a function result?  Specifying the
collation every time the function is used is likely a no-go for me too.

Alternatively, it seems I could create new databases with a C collation and
then move my data into them.  This seems a bit drastic, although possible.
I'd again be worried about the breakage/side effects.  And actually, will
this work?  (i.e., can you use pg_dump to populate a new database with a
different locale?)

Are there any other potential solutions, pitfalls or considerations that
come to mind?  Any thoughts welcome.  And as I said, if there's not a good
way to do this I'll probably leave it alone.  Thanks.

Ken

-- 
AGENCY Software
A Free Software data system
By and for non-profits
*http://agency-software.org/ <http://agency-software.org/>*
*https://agency-software.org/demo/client
<https://agency-software.org/demo/client>*
ken.tan...@agency-software.org
(253) 245-3801

Subscribe to the mailing list
<agency-general-requ...@lists.sourceforge.net?body=subscribe> to
learn more about AGENCY or
follow the discussion.

Re: [GENERAL] Very puzzling sort behavior

Reply via email to