Re: (CAY-1210) mysql does not use index for case insensitive searches

Andrus Adamchik Wed, 12 Jan 2011 04:55:52 -0800

On Jan 12, 2011, at 7:01 AM, Aristedes Maniatis wrote:

> On 11/01/11 10:18 PM, Andrew Lindesay wrote:
>> I think the functional indexes do make sense and it seems reasonable that a 
>> DBA would apply them as part of tuning.  Maybe it would be best to have a 
>> page in the manual on this explaining that for MySQL/... CI search it is 
>> best to change the collation on the schema objects and for Oracle/PG/... CI 
>> search it is best to implement functional indexes?
> 
> It is all a bit ugly. Postgresql docs recommend using LOWER() to achieve ci 
> search, whereas Cayenne spits out UPPER(). Hibernate also produces LOWER(). 
> So a db used by different systems is going to need quite a few indexes.

Not ideal, but DB indexes always depend on the data access patterns by the 
clients, not the other way around. 

> On 11/01/11 11:04 PM, Andrus Adamchik wrote:
>> Maybe do a total hack to get us out of this limbo - how about a DI extension 
>> point for CI LIKE SQL generation strategy (or rather a System property 
>> activating this strategy - 'default collation case sensitivity')? This won't 
>> complicate the mapping, won't force us to generalize, and will allow the 
>> same mapping to be used with both kinds of schemas.
> 
> Well, a database-wide property solves my particular use-case. Seems clunky 
> (since it is database-wide and not specific to columns), but if you are 
> convinced that case-sensitivity is not an attribute of the model, then this 
> is the only way. 

From our discussion, it can be a property of the model (either DB-wide or 
per-column), or a property of a specific DB deployment. Also I am not 
completely opposed to mapping case sensitivity settings per DbAttribute, but 
like I said, I am very conservative about adding new mapping abstractions, so 
I'd rather we try it in a less invasive way first (that would allow us to test 
a new CI LIKE strategy) and then decide if it is worthwhile extending 
DbAttribute.

Anyways, I'll add the description of this interim solution to Jira.

> Other than some JDBC specification, why is the choice between BLOB, CLOB and 
> TEXT a modeller property, but the choice between ci-TEXT and cs-TEXT isn't?
> "BLOB values are treated as binary strings (byte strings). They have no 
> character set, and sorting and comparison are based on the numeric values of 
> the bytes in column values. TEXT values are treated as nonbinary strings 
> (character strings). They have a character set, and values are sorted and 
> compared based on the collation of the character set." Effectively BLOBs are 
> just TEXT with collation = none.

There's no TEXT type in JDBC (and Cayenne). The only similarity between BLOB 
and CLOB is that they are sequences of elements. Beyond that they are different 
data type (of course you can represent anything as a BLOB, but that would be 
highly impractical). So not sure where this analogy goes (and like I said above 
I am not completely against an extra CS property for *char* types).

> Out of interest, my googling discovered that Atlassian developers are running 
> into this problem with Hibernate. But they didn't have a nice solution. 
> http://jira.atlassian.com/browse/CONF-10030

Yeah, that shows that we are not inventing the problem at least.

Andrus

Re: (CAY-1210) mysql does not use index for case insensitive searches

Reply via email to