I put up ticket #487 for this. theres a whole batch of sql-related tickets im going to try to line up for a marathon session at some future date in the next few weeks.
On Feb 21, 2007, at 10:25 AM, Troy wrote: > > Mike, > > Based on your excellent feedback and trying a couple different > approaches I came up with the following. I have a diff file, but I'm > new to contributing to open source so I'm not sure the best method to > get this to you (so I'll copy paste below). The overview: > > * added visit_case_insensitive and overrode in PGCompiler to use > "lower" > * sub-classed ClauseList with _OrderByClause and _GroupByClause. > > This was done to add some context as to where in the statement the > clauses belong, because we do not want to mess with the case of the > results being returned, only when the character data is compared at > the server. I need to do some more work to peg HAVING, but this works > well otherwise. > > * added _CaseInsensitiveCompare like you mentioned. > * modified _compare_self instead of __eq__, and added case_compare > param to the String type > > I think this is better then added case_compare at the column level, > because this only affects character data. Dates, Int's, etc do not > have the same collation issues and the issues with Date that do exist > are already addressed. > > Let me know if you need me to run in a different direction with this, > or what I need to do differently to get this considered to be included > in sqlalchemy. > > Troy > > > ====== Diff File ====== > > Index: sqlalchemy/sql.py > =================================================================== > --- sqlalchemy/sql.py (revision 2346) > +++ sqlalchemy/sql.py (working copy) > @@ -351,6 +351,7 @@ > def visit_clauselist(self, list):pass > def visit_calculatedclause(self, calcclause):pass > def visit_function(self, func):pass > + def visit_case_insensitive(self, func):pass > def visit_cast(self, cast):pass > def visit_label(self, label):pass > def visit_typeclause(self, typeclause):pass > @@ -947,6 +948,17 @@ > else: > return False > > +class _OrderByClause(ClauseList): > + def append(self, clause): > + if _is_literal(clause): > + clause = _TextClause(str(clause)) > + elif isinstance(clause,ColumnElement) and not > getattr(clause.type,'compare_case',True): > + clause = _CaseInsensitiveCompare(clause) > + self.clauses.append(clause) > + > +class _GroupByClause(_OrderByClause): > + pass > + > class _CompoundClause(ClauseList): > """represents a list of clauses joined by an operator, such as > AND or OR. > extends ClauseList to add the operator as well as a from_objects > accessor to > @@ -1037,6 +1049,16 @@ > c.accept_visitor(visitor) > visitor.visit_function(self) > > +class _CaseInsensitiveCompare(ColumnElement): > + def __init__(self, clause): > + self.clause = clause > + self.type = clause.type > + def accept_visitor(self, visitor): > + self.clause.accept_visitor(visitor) > + visitor.visit_case_insensitive(self) > + def _get_from_objects(self): > + return self.clause._get_from_objects() > + > class _Cast(ColumnElement): > def __init__(self, clause, totype, **kwargs): > if not hasattr(clause, 'label'): > @@ -1364,18 +1386,18 @@ > return True > def order_by(self, *clauses): > if len(clauses) == 1 and clauses[0] is None: > - self.order_by_clause = ClauseList() > + self.order_by_clause = _OrderByClause() > elif getattr(self, 'order_by_clause', None): > - self.order_by_clause = > ClauseList(*(list(self.order_by_clause.clauses) + list(clauses))) > + self.order_by_clause = > _OrderByClause(*(list(self.order_by_clause.clauses) + list(clauses))) > else: > - self.order_by_clause = ClauseList(*clauses) > + self.order_by_clause = _OrderByClause(*clauses) > def group_by(self, *clauses): > if len(clauses) == 1 and clauses[0] is None: > - self.group_by_clause = ClauseList() > + self.group_by_clause = _GroupByClause() > elif getattr(self, 'group_by_clause', None): > - self.group_by_clause = ClauseList(*(list(clauses) > +list(self.group_by_clause.clauses))) > + self.group_by_clause = _GroupByClause(*(list(clauses) > +list(self.group_by_clause.clauses))) > else: > - self.group_by_clause = ClauseList(*clauses) > + self.group_by_clause = _GroupByClause(*clauses) > def select(self, whereclauses = None, **params): > return select([self], whereclauses, **params) > def _get_from_objects(self): > Index: sqlalchemy/databases/postgres.py > =================================================================== > --- sqlalchemy/databases/postgres.py (revision 2346) > +++ sqlalchemy/databases/postgres.py (working copy) > @@ -463,7 +463,7 @@ > > table.append_constraint(ForeignKeyConstraint(constrained_columns, > refspec, conname)) > > class PGCompiler(ansisql.ANSICompiler): > - > + > def visit_insert_column(self, column, parameters): > # all column primary key inserts must be explicitly present > if column.primary_key: > @@ -502,7 +502,11 @@ > if isinstance(binary.type, sqltypes.String) and > binary.operator == '+': > return '||' > else: > - return ansisql.ANSICompiler.binary_operator_string(self, > binary) > + return ansisql.ANSICompiler.binary_operator_string(self, > binary) > + > + def visit_case_insensitive(self, object): > + self.strings[object] = "lower(%s)" % > self.strings[object.clause] > + > > class PGSchemaGenerator(ansisql.ANSISchemaGenerator): > > Index: sqlalchemy/schema.py > =================================================================== > --- sqlalchemy/schema.py (revision 2346) > +++ sqlalchemy/schema.py (working copy) > @@ -497,6 +497,12 @@ > for constraint in self.constraints: > constraint.accept_schema_visitor(visitor, > traverse=True) > visitor.visit_column(self) > + > + def _compare_self(self): > + if isinstance(self.type,types.String) and not > self.type.compare_case: > + return sql._CaseInsensitiveCompare(self) > + else: > + return self > > > class ForeignKey(SchemaItem): > Index: sqlalchemy/types.py > =================================================================== > --- sqlalchemy/types.py (revision 2346) > +++ sqlalchemy/types.py (working copy) > @@ -169,11 +169,18 @@ > return super(String, cls).__new__(cls, *args, **kwargs) > else: > return super(String, TEXT).__new__(TEXT, *args, **kwargs) > - def __init__(self, length = None): > + def __init__(self, length = None, compare_case = False): > + """Initializes new string instance > + > + compare_case - determines if case affects comparisons, > sorting, etc > + """ > self.length = length > + self.compare_case = compare_case > def adapt(self, impltype): > return impltype(length=self.length) > def convert_bind_param(self, value, dialect): > + if not self.compare_case and hasattr(value,'lower'): > + value = value.lower() > if not dialect.convert_unicode or value is None or not > isinstance(value, unicode): > return value > else: > @@ -200,7 +207,7 @@ > return value.decode(dialect.encoding) > else: > return value > - > + > class Integer(TypeEngine): > """integer datatype""" > def get_dbapi_type(self, dbapi): > Index: sqlalchemy/ansisql.py > =================================================================== > --- sqlalchemy/ansisql.py (revision 2346) > +++ sqlalchemy/ansisql.py (working copy) > @@ -225,6 +225,10 @@ > else: > self.strings[column] = > self.preparer.format_column_with_table(column) > > + def visit_case_insensitive(self, object): > + #override this in databases that default to case insesitive > collation > + self.strings[object] = > self.strings[object.clause] > + > def visit_fromclause(self, fromclause): > self.froms[fromclause] = fromclause.name > > @@ -257,7 +261,7 @@ > self.strings[compound] = "(" + s + ")" > else: > self.strings[compound] = s > - > + > def visit_clauselist(self, list): > if list.parens: > self.strings[list] = "(" + string.join([s for s in > [self.get_str(c) for c in list.clauses] if s is not None], ', ') + ")" > > > > > > > > > > On Feb 19, 12:17 pm, "Michael Bayer" <[EMAIL PROTECTED]> wrote: >> On Feb 19, 2:32 pm, "Troy" <[EMAIL PROTECTED]> wrote: >> >>> Even though MySQL allows the setting of COLLATE, it does not support >>> functional indexes, so if your code explicitly calls lower you >>> technically now have code that will work for both MySQL and >>> Postgres, >>> but MySQL is going to take a big performance hit and perform table >>> scans regardless of any created indexes. Realistically, I can't see >>> anyone setting MySQL collate rules to case-sensitive without the >>> support of functional indexes, which MySQL does not have. >> >> right...so some explicitness is required, somewhere (since SA cant >> just put lower() across the board). >> >> >> >>> In simple terms, if the Postgres dialect supported >>> compare_insensitive=True|False|[upper|lower]? >> >> on a Column-by-Column or even per-operation context is probably more >> useful. >> >>> sqlalchemy easily support Postgres and MySQL with the same code >>> in out- >>> of-the-box configurations. But even if it didn't, if there was a >>> way >>> to override the default postgres dialect I'd be a happy camper. >>> Infact, that is exactly what I have done. I added a dialect called >>> "lowergres", but I'm stuck because I can not seem to find the >>> appropriate hook to alter the column to a func.lower when the column >>> is not part of the selected column list. >> >> format_column looks to me >> >>> like the right place to do it without converting a column to a >>> function, but the column object in that function has no context >>> as to >>> where it lies in the sql statement. >> >> for this approach, id advise setting "state" on the compiler when you >> are processing the column clause of a SELECT (or dealing with insert/ >> update column lists too), vs. when you are processing WHERE, ORDER >> BY, >> etc. that flag can be used to determine where youre traversing. the >> compiler knows the context since its the one generating the >> statement. >> >> a more accurate way to do is to wrap the Column itself, as it goes >> into a comparison operation or ORDER BY, in a new construct such as >> "CaseInsensitive"...heres some pseudo-ish code based on the __eq__() >> method you see in sqlalchemy/sql.py _CompareMixin (assume "self" is a >> Column): >> >> def __eq__(self, other): >> return _BooleanExpression(CaseInsensitive(self), other, '==') >> >> Better yet its configurable on Column: >> >> def __eq__(self, other): >> if self.case_insensitive_compare: >> return _BooleanExpression(CaseInsensitive(self), other, '==') >> else: >> return _BooleanExpression(self, other, '==') >> >> CaseInsensitive looks a lot like _Function and is just: >> >> class CaseInsensitive(_CalculatedClause): >> def __init__(self, target): >> self.target = target >> def accept_visitor(self, visitor): >> self.target.accept_visitor(visitor) >> visitor.visit_case_insensitive(self) >> >> ansicompiler provides the string representation of the underlying >> target with no modification: >> >> def visit_case_insensitive(self, object): >> self.strings[object] = self.strings[object.target] >> >> postgres compiler with case_insensitive provides it as: >> >> def visit_case_insensitive(self, object): >> self.strings[object] = "lower(%s)" % self.strings[object.target] >> >> other dialects can have whatever flags to turn on/off the "lower()" >> wrapping as well. >> >> what we're really talking about here is a func.lower() that has >> special meaning to the compiler, i.e. that it should be conditionally >> applied based on dialect flags. i think the flag on Column to have >> the wrapper applied might be pretty slick. >> >> >> >>> I'm curious as to others experiences with writing an app that >>> supports >>> both Postgres and MySQL with sqlalchemy, because if someone else is >>> doing this then I must be missing something, or maybe not. At >>> first, >>> our unit tests all passed, then when we added real world data with >>> mixed case, tests started to fail on everything doing sorts and >>> where's on character data. >> >> yeah i dont think anyone has gotten too far with this issue, also ive >> no idea what the giant universe of Hibernate users do either (i think >> there just arent db-platform-neutral J2EE apps). >> >> >> >>> How about a Pepsi (at PyCon)? >> >> sure thing ! > > > > --~--~---------~--~----~------------~-------~--~----~ You received this message because you are subscribed to the Google Groups "sqlalchemy" group. To post to this group, send email to sqlalchemy@googlegroups.com To unsubscribe from this group, send email to [EMAIL PROTECTED] For more options, visit this group at http://groups.google.com/group/sqlalchemy?hl=en -~----------~----~----~----~------~----~------~--~---