Re: [GENERAL] Case insensitive collation

2013-01-21 Thread Marcel van Pinxteren
As I mentioned in my original post, I don't want to use citext or lower().
I tested on Windows, but as I mentioned in one of my first posts, collation
and case sensitivity are separate things.

With this, we are back at the beginning of the circle, so I'll leave it
there.
Maybe I'll check back in a year or so, to see if case insensitivity has
been implemented.

Good luck


Re: [GENERAL] Case insensitive collation

2013-01-21 Thread Thomas Kellerer

Marcel van Pinxteren, 21.01.2013 13:22:

As I mentioned in my original post, I don't want to use citext or lower().


Why not for the unique index/constraint?







--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Case insensitive collation

2013-01-21 Thread Scott Marlowe
On Mon, Jan 21, 2013 at 5:22 AM, Marcel van Pinxteren
marcel.van.pinxte...@gmail.com wrote:
 As I mentioned in my original post, I don't want to use citext or lower().
 I tested on Windows, but as I mentioned in one of my first posts, collation
 and case sensitivity are separate things.

Wait, is there an actual reason for not using it, or you just don't
want to?  I don't think the postgresql hackers are gonna find I don't
want to do it that way a compelling reason to work on your issue.


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Case insensitive collation

2013-01-21 Thread Marcel van Pinxteren
To be honest, the reason I don't want to use citext and lower(), is me
being lazy. If I have to use these features, there is more work for me
converting from SQL Server to Postgresql. I have to make more changes to my
database, and more to my software.
But, developers are generally lazy, so you could argue that this reason is
compelling.
The other reason, is that I assume that lower() adds overhead, so makes
things slower than they need to be.
Whether that is true, and if that is a compelling reason, I don't know.

Marcel


Re: [GENERAL] Case insensitive collation

2013-01-21 Thread Alban Hertroys
On 21 January 2013 17:25, Marcel van Pinxteren 
marcel.van.pinxte...@gmail.com wrote:

 The other reason, is that I assume that lower() adds overhead, so makes
 things slower than they need to be.
 Whether that is true, and if that is a compelling reason, I don't know.


Case insensitive collation adds overhead too. It wouldn't surprise me if
that were more than lower() adds - collation is complicated stuff.
-- 
If you can't see the forest for the trees,
Cut the trees and you'll see there is no forest.


Re: [GENERAL] Case insensitive collation

2013-01-21 Thread Scott Marlowe
On Mon, Jan 21, 2013 at 9:25 AM, Marcel van Pinxteren
marcel.van.pinxte...@gmail.com wrote:
 To be honest, the reason I don't want to use citext and lower(), is me being
 lazy. If I have to use these features, there is more work for me converting
 from SQL Server to Postgresql. I have to make more changes to my database,
 and more to my software.
 But, developers are generally lazy, so you could argue that this reason is
 compelling.
 The other reason, is that I assume that lower() adds overhead, so makes
 things slower than they need to be.
 Whether that is true, and if that is a compelling reason, I don't know.

Honestly as a lazy DBA I have to say it'd be pretty easy to write a
script to convert any unique text index into a unique text index with
a upper() in it.  As another poster added, collation ain't free
either.  I'd say you should test it to see.  My experience tells me
that having an upper() (or lower()) index is not a big performance
hit.  If the storage of the index would be too much due to large text
fields then make it a md5(lower()) index, which WILL cost more CPU
wise, but allow for  3k or so of text in a column to be indexed and
cost less IO wise.


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Case insensitive collation

2013-01-21 Thread Kevin Grittner
Scott Marlowe wrote:

 Honestly as a lazy DBA I have to say it'd be pretty easy to write a
 script to convert any unique text index into a unique text index with
 a upper() in it. As another poster added, collation ain't free
 either. I'd say you should test it to see. My experience tells me
 that having an upper() (or lower()) index is not a big performance
 hit. If the storage of the index would be too much due to large text
 fields then make it a md5(lower()) index, which WILL cost more CPU
 wise, but allow for  3k or so of text in a column to be indexed and
 cost less IO wise.

Depending on what sort of search you want to do, it might be dead
simple to use tsearch2 (which is case insensitive) or trigram
indexing (for which a similarity search is case insensitive).

-Kevin


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Case insensitive collation

2013-01-21 Thread Thomas Kellerer

Marcel van Pinxteren wrote on 21.01.2013 17:25:

The other reason, is that I assume that lower() adds overhead


It won't add any noticeable overhead for the unique index.





--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Case insensitive collation

2013-01-21 Thread Jeff Janes
On Mon, Jan 21, 2013 at 1:45 PM, Scott Marlowe scott.marl...@gmail.com wrote:
 On Mon, Jan 21, 2013 at 9:25 AM, Marcel van Pinxteren
 marcel.van.pinxte...@gmail.com wrote:
 To be honest, the reason I don't want to use citext and lower(), is me being
 lazy. If I have to use these features, there is more work for me converting
 from SQL Server to Postgresql. I have to make more changes to my database,
 and more to my software.
 But, developers are generally lazy, so you could argue that this reason is
 compelling.
 The other reason, is that I assume that lower() adds overhead, so makes
 things slower than they need to be.
 Whether that is true, and if that is a compelling reason, I don't know.

 Honestly as a lazy DBA I have to say it'd be pretty easy to write a
 script to convert any unique text index into a unique text index with
 a upper() in it.

But changing the application to create queries with upper() in the
queries could be very hard.  And without that, the function index
would be useless.

That objection doesn't apply to citext.  I don't know what object to that is.

Cheers,

Jeff


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Case insensitive collation

2013-01-18 Thread Jasen Betts
On 2013-01-16, Marcel van Pinxteren marcel.van.pinxte...@gmail.com wrote:
 --90e6ba6140da259e8204d36d0fa3
 Content-Type: text/plain; charset=ISO-8859-1

 From the Microsoft site I learned
 http://msdn.microsoft.com/en-us/library/ms188046(v=sql.105).aspx
 that they combine collation and ComparisonStyle to a collation name.

 I thought that case insensitivity had to be built into the collation, but
 apparently MS built case sensitivity in the database engine.
 This would mean that Postgresql would need to build case (in)sensitivity
 into her engine as well.

what result are you tring to get?

-- 
⚂⚃ 100% natural



-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Case insensitive collation

2013-01-18 Thread Marcel van Pinxteren
Desired behaviour:
1. If there is a row with 'ABC' (in a unique column) in the table, a row
with 'abc' should not be allowed
2. If I do SELECT * FROM aTable WHERE aColumn = 'ABC', I should see a row
with 'abc' as well (if there is one in the table)

This has been described in this mailing list a few months ago, in more
detail.


Re: [GENERAL] Case insensitive collation

2013-01-18 Thread Alban Hertroys
You could look into running the DB on an OS that does support case
insensitive collation. It'll likely perform better too.


On 16 January 2013 20:40, Marcel van Pinxteren 
marcel.van.pinxte...@gmail.com wrote:

 From the Microsoft site I learned
 http://msdn.microsoft.com/en-us/library/ms188046(v=sql.105).aspx
 that they combine collation and ComparisonStyle to a collation name.

 I thought that case insensitivity had to be built into the collation, but
 apparently MS built case sensitivity in the database engine.
 This would mean that Postgresql would need to build case (in)sensitivity
 into her engine as well.

 Judging from the small amount of discussion on this subject, I am afraid
 this is not going to happen anytime soon. Alas, we will stay with SQL
 Server then (or maybe MySQL, but I will have to investigate).

 Met vriendelijke groet,

 Marcel van Pinxteren
 --
 Volg onze bouw op http://nieuwekampen.blogspot.com


 On Wed, Jan 16, 2013 at 8:14 PM, Tom Lane t...@sss.pgh.pa.us wrote:

 Marcel van Pinxteren marcel.van.pinxte...@gmail.com writes:
  Therefore the question: are there plans to create a set of case
  insensitive, and maybe also accent insensitive collations in the near
  future?

 Not from the Postgres project -- we just use the collations supplied by
 the operating system.

 regards, tom lane





-- 
If you can't see the forest for the trees,
Cut the trees and you'll see there is no forest.


Re: [GENERAL] Case insensitive collation

2013-01-18 Thread Thomas Kellerer

Marcel van Pinxteren, 18.01.2013 14:13:

Desired behaviour:
1. If there is a row with 'ABC' (in a unique column) in the table, a row with 
'abc' should not be allowed


That's an easy one:

create unique index on foo (lower(the_column));





--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Case insensitive collation

2013-01-18 Thread Alex Hunsaker
On Fri, Jan 18, 2013 at 6:13 AM, Marcel van Pinxteren 
marcel.van.pinxte...@gmail.com wrote:

 Desired behaviour:
 1. If there is a row with 'ABC' (in a unique column) in the table, a row
 with 'abc' should not be allowed
 2. If I do SELECT * FROM aTable WHERE aColumn = 'ABC', I should see a row
 with 'abc' as well (if there is one in the table)

 This has been described in this mailing list a few months ago, in more
 detail.


Have you seen the citext module?:
http://www.postgresql.org/docs/current/static/citext.html.

It does pretty much that (albeit by having columns be the citext type
instead of text or varchar).


[GENERAL] Case insensitive collation

2013-01-16 Thread Marcel van Pinxteren
The subject has been discussed on this mailing list before, recently.
To be able to switch from SQL Server to Postgresql, for me this is
essential.

Therefore the question: are there plans to create a set of case
insensitive, and maybe also accent insensitive collations in the near
future?
I have no idea how complex this is, but it seems to me, looking at MySQL
and SQL Server that it's not such a strange thing to expect from a database
server.
I know I can use lower (even on indexes) and citext, but this feels like
patchwork, and would mean a lot of work when converting our database (with
122 tables).

Regards,
Marcel van Pinxteren


Re: [GENERAL] Case insensitive collation

2013-01-16 Thread Tom Lane
Marcel van Pinxteren marcel.van.pinxte...@gmail.com writes:
 Therefore the question: are there plans to create a set of case
 insensitive, and maybe also accent insensitive collations in the near
 future?

Not from the Postgres project -- we just use the collations supplied by
the operating system.

regards, tom lane


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Case insensitive collation

2013-01-16 Thread Marcel van Pinxteren
From the Microsoft site I learned
http://msdn.microsoft.com/en-us/library/ms188046(v=sql.105).aspx
that they combine collation and ComparisonStyle to a collation name.

I thought that case insensitivity had to be built into the collation, but
apparently MS built case sensitivity in the database engine.
This would mean that Postgresql would need to build case (in)sensitivity
into her engine as well.

Judging from the small amount of discussion on this subject, I am afraid
this is not going to happen anytime soon. Alas, we will stay with SQL
Server then (or maybe MySQL, but I will have to investigate).

Met vriendelijke groet,

Marcel van Pinxteren
--
Volg onze bouw op http://nieuwekampen.blogspot.com


On Wed, Jan 16, 2013 at 8:14 PM, Tom Lane t...@sss.pgh.pa.us wrote:

 Marcel van Pinxteren marcel.van.pinxte...@gmail.com writes:
  Therefore the question: are there plans to create a set of case
  insensitive, and maybe also accent insensitive collations in the near
  future?

 Not from the Postgres project -- we just use the collations supplied by
 the operating system.

 regards, tom lane