GK wrote: > I've been digging around the archives, web and source code to figure > case insensitive queries using SA, and am posting these notes: > (a) for review and comment, and > (b) to provide some possible hints for others who may follow > > For me, a big benefit of using SA is that it insulates me from > database specifics. Ideally, I wanted a way to handle case > insensitive queries that would work across all supported databases, or > at least SQLite, MySQL and Postgres. Previous messages on this list > [1][2] suggest that won't happen before SA 0.5 (with no guarantee of > that, I guess) [3]. There was a possible answer at [4], but I didn't > like the potential implications for query performance. > > [1] > http://groups.google.com/group/sqlalchemy/browse_frm/thread/ce87bd74fa4311ba/9c06644a94b358b6?lnk=gst&q=case+insensitive+query#9c06644a94b358b6 > > [2] > http://groups.google.com/group/sqlalchemy/browse_frm/thread/cc8c775dd54b7b9c/18e059753d819455?lnk=gst&q=case+insensitive+query#18e059753d819455 > > [3] http://www.sqlalchemy.org/trac/ticket/487 > > [4] > http://groups.google.com/group/sqlalchemy/browse_frm/thread/a7b1c325c267bba8/6ef14c5957b59395?lnk=gst&q=case+insensitive+query#6ef14c5957b59395 > > > > So the solution I've adopted, and tested with SQLite, is to use a > TypeDecorator class, thus: > [[ > class CI_String(sqlalchemy.types.TypeDecorator): > """ > Defines a case insensitive string type using SQLite dialect > options > > TODO: extend case insensitive specification to support other > databases > """ > impl = sqlalchemy.String > > def get_col_spec(self): > return super(CI_String,self).get_col_spec()+" COLLATE NOCASE" > ]] > > Which I use in a Table definition thus: > [[ > def defineAffyLocalTable(tablename, metadata): > """ > Return SQLAlchemy table for Affymetrix local data entry. > """ > table = Table(tablename, metadata, > Column('recordid', Integer, Sequence('recordid_seq'), > primary_key=True), > Column('probeid', CI_String(25), index=True), > Column('aly_mean', Float), > Column('can_mean', Float), > Column('comr_mean', Float), > Column('topi_mean', Float), > Column('zaa_mean', Float), > Column('red_e_mean', Float), > Column('description', Text), > ) > return table > ]] > > Might something like this be a basis for a fairly simple SA-common > type that can be implemented appropriately for each database?
The generic character types in types.py should accept collate and character set options. The only dialect with implementations for these options currently is mysql, and some code & the docstrings can be cherry picked from msyql for use in types.py. DDL generation implementations can go in for the other databases with collation support. But that only gets partway to something like CI_String. SQLite is the only database I know of with a workable across-the-board 'lower' collation. I'm used to seeing collation tied to the character set of the column storage, with more options than a simple binary upper/lower: Latin1_General_BIN Latin1_General_CI_AI_KS Latin1_General_CS_AS_KS_WS SQL_Latin1_General_CP1_CI_AS utf8_general_ci utf8_bin utf8_unicode_ci utf8_spanish_ci utf8_swedish_ci utf8_turkish_ci The default character set used for column storage is usually unknown to SA at DDL time, and, even if known, which collation to pick as insensitive? Collations are database specific and don't always have a 'general' variant. And then there are database such as Postgres that don't yet have collation support and would need queries to be re-written to use lower(col). I'd love to see easy insensitive query support in SA but I don't see an obvious path forward. --~--~---------~--~----~------------~-------~--~----~ You received this message because you are subscribed to the Google Groups "sqlalchemy" group. To post to this group, send email to sqlalchemy@googlegroups.com To unsubscribe from this group, send email to [EMAIL PROTECTED] For more options, visit this group at http://groups.google.com/group/sqlalchemy?hl=en -~----------~----~----~----~------~----~------~--~---