[sqlalchemy] Re: Differentiate ANSIIdentifierPreparer.format_column select clause from where clause??

Troy Wed, 21 Feb 2007 07:29:36 -0800

Mike,

Based on your excellent feedback and trying a couple different
approaches I came up with the following.  I have a diff file, but I'm
new to contributing to  open source so I'm not sure the best method to
get this to you (so I'll copy paste below).  The overview:


* added visit_case_insensitive and overrode in PGCompiler to use
"lower"
* sub-classed ClauseList with _OrderByClause and _GroupByClause.

This was done to add some context as to where in the statement the
clauses belong, because we do not want to mess with the case of the
results being returned, only when the character data is compared at
the server.  I need to do some more work to peg HAVING, but this works
well otherwise.

* added _CaseInsensitiveCompare like you mentioned.
* modified _compare_self instead of __eq__, and added case_compare
param to the String type

I think this is better then added case_compare at the column level,
because this only affects character data.  Dates, Int's, etc do not
have the same collation issues and the issues with Date that do exist
are already addressed.

Let me know if you need me to run in a different direction with this,
or what I need to do differently to get this considered to be included
in sqlalchemy.

Troy


====== Diff File ======

Index: sqlalchemy/sql.py
===================================================================
--- sqlalchemy/sql.py   (revision 2346)
+++ sqlalchemy/sql.py   (working copy)
@@ -351,6 +351,7 @@
     def visit_clauselist(self, list):pass
     def visit_calculatedclause(self, calcclause):pass
     def visit_function(self, func):pass
+    def visit_case_insensitive(self, func):pass
     def visit_cast(self, cast):pass
     def visit_label(self, label):pass
     def visit_typeclause(self, typeclause):pass
@@ -947,6 +948,17 @@
         else:
             return False

+class _OrderByClause(ClauseList):
+    def append(self, clause):
+        if _is_literal(clause):
+            clause = _TextClause(str(clause))
+        elif isinstance(clause,ColumnElement) and not
getattr(clause.type,'compare_case',True):
+            clause = _CaseInsensitiveCompare(clause)
+        self.clauses.append(clause)
+
+class _GroupByClause(_OrderByClause):
+    pass
+
 class _CompoundClause(ClauseList):
     """represents a list of clauses joined by an operator, such as
AND or OR.
     extends ClauseList to add the operator as well as a from_objects
accessor to
@@ -1037,6 +1049,16 @@
             c.accept_visitor(visitor)
         visitor.visit_function(self)

+class _CaseInsensitiveCompare(ColumnElement):
+    def __init__(self, clause):
+        self.clause = clause
+        self.type = clause.type
+    def accept_visitor(self, visitor):
+        self.clause.accept_visitor(visitor)
+        visitor.visit_case_insensitive(self)
+    def _get_from_objects(self):
+        return self.clause._get_from_objects()
+
 class _Cast(ColumnElement):
     def __init__(self, clause, totype, **kwargs):
         if not hasattr(clause, 'label'):
@@ -1364,18 +1386,18 @@
         return True
     def order_by(self, *clauses):
         if len(clauses) == 1 and clauses[0] is None:
-            self.order_by_clause = ClauseList()
+            self.order_by_clause = _OrderByClause()
         elif getattr(self, 'order_by_clause', None):
-            self.order_by_clause =
ClauseList(*(list(self.order_by_clause.clauses) + list(clauses)))
+            self.order_by_clause =
_OrderByClause(*(list(self.order_by_clause.clauses) + list(clauses)))
         else:
-            self.order_by_clause = ClauseList(*clauses)
+            self.order_by_clause = _OrderByClause(*clauses)
     def group_by(self, *clauses):
         if len(clauses) == 1 and clauses[0] is None:
-            self.group_by_clause = ClauseList()
+            self.group_by_clause = _GroupByClause()
         elif getattr(self, 'group_by_clause', None):
-            self.group_by_clause = ClauseList(*(list(clauses)
+list(self.group_by_clause.clauses)))
+            self.group_by_clause = _GroupByClause(*(list(clauses)
+list(self.group_by_clause.clauses)))
         else:
-            self.group_by_clause = ClauseList(*clauses)
+            self.group_by_clause = _GroupByClause(*clauses)
     def select(self, whereclauses = None, **params):
         return select([self], whereclauses, **params)
     def _get_from_objects(self):
Index: sqlalchemy/databases/postgres.py
===================================================================
--- sqlalchemy/databases/postgres.py    (revision 2346)
+++ sqlalchemy/databases/postgres.py    (working copy)
@@ -463,7 +463,7 @@
 
table.append_constraint(ForeignKeyConstraint(constrained_columns,
refspec, conname))

 class PGCompiler(ansisql.ANSICompiler):
-
+
     def visit_insert_column(self, column, parameters):
         # all column primary key inserts must be explicitly present
         if column.primary_key:
@@ -502,7 +502,11 @@
         if isinstance(binary.type, sqltypes.String) and
binary.operator == '+':
             return '||'
         else:
-            return ansisql.ANSICompiler.binary_operator_string(self,
binary)
+            return ansisql.ANSICompiler.binary_operator_string(self,
binary)
+
+    def visit_case_insensitive(self, object):
+        self.strings[object] = "lower(%s)" %
self.strings[object.clause]
+

 class PGSchemaGenerator(ansisql.ANSISchemaGenerator):

Index: sqlalchemy/schema.py
===================================================================
--- sqlalchemy/schema.py        (revision 2346)
+++ sqlalchemy/schema.py        (working copy)
@@ -497,6 +497,12 @@
             for constraint in self.constraints:
                 constraint.accept_schema_visitor(visitor,
traverse=True)
         visitor.visit_column(self)
+
+    def _compare_self(self):
+        if isinstance(self.type,types.String) and not
self.type.compare_case:
+            return sql._CaseInsensitiveCompare(self)
+        else:
+            return self


 class ForeignKey(SchemaItem):
Index: sqlalchemy/types.py
===================================================================
--- sqlalchemy/types.py (revision 2346)
+++ sqlalchemy/types.py (working copy)
@@ -169,11 +169,18 @@
             return super(String, cls).__new__(cls, *args, **kwargs)
         else:
             return super(String, TEXT).__new__(TEXT, *args, **kwargs)
-    def __init__(self, length = None):
+    def __init__(self, length = None, compare_case = False):
+        """Initializes new string instance
+
+        compare_case - determines if case affects comparisons,
sorting, etc
+        """
         self.length = length
+        self.compare_case = compare_case
     def adapt(self, impltype):
         return impltype(length=self.length)
     def convert_bind_param(self, value, dialect):
+        if not self.compare_case and hasattr(value,'lower'):
+            value = value.lower()
         if not dialect.convert_unicode or value is None or not
isinstance(value, unicode):
             return value
         else:
@@ -200,7 +207,7 @@
              return value.decode(dialect.encoding)
          else:
              return value
-
+
 class Integer(TypeEngine):
     """integer datatype"""
     def get_dbapi_type(self, dbapi):
Index: sqlalchemy/ansisql.py
===================================================================
--- sqlalchemy/ansisql.py       (revision 2346)
+++ sqlalchemy/ansisql.py       (working copy)
@@ -225,6 +225,10 @@
             else:
                 self.strings[column] =
self.preparer.format_column_with_table(column)

+    def visit_case_insensitive(self, object):
+        #override this in databases that default to case insesitive
collation
+        self.strings[object] =
self.strings[object.clause]
+
     def visit_fromclause(self, fromclause):
         self.froms[fromclause] = fromclause.name

@@ -257,7 +261,7 @@
             self.strings[compound] = "(" + s + ")"
         else:
             self.strings[compound] = s
-
+
     def visit_clauselist(self, list):
         if list.parens:
             self.strings[list] = "(" + string.join([s for s in
[self.get_str(c) for c in list.clauses] if s is not None], ', ') + ")"









On Feb 19, 12:17 pm, "Michael Bayer" <[EMAIL PROTECTED]> wrote:
> On Feb 19, 2:32 pm, "Troy" <[EMAIL PROTECTED]> wrote:
>
> > Even though MySQL allows the setting of COLLATE, it does not support
> > functional indexes, so if your code explicitly calls lower you
> > technically now have code that will work for both MySQL and Postgres,
> > but MySQL is going to take a big performance hit and perform table
> > scans regardless of any created indexes.  Realistically, I can't see
> > anyone setting MySQL collate rules to case-sensitive without the
> > support of functional indexes, which MySQL does not have.
>
> right...so some explicitness is required, somewhere (since SA cant
> just put lower() across the board).
>
>
>
> > In simple terms, if the Postgres dialect supported
> > compare_insensitive=True|False|[upper|lower]?
>
> on a Column-by-Column or even per-operation context is probably more
> useful.
>
> > sqlalchemy easily support Postgres and MySQL with the same code in out-
> > of-the-box configurations.  But even if it didn't, if there was a way
> > to override the default postgres dialect I'd be a happy camper.
> > Infact, that is exactly what I have done.  I added a dialect called
> > "lowergres", but I'm stuck because I can not seem to find the
> > appropriate hook to alter the column to a func.lower when the column
> > is not part of the selected column list.
>
> format_column looks to me
>
> > like the right place to do it without converting a column to a
> > function, but the column object in that function has no context as to
> > where it lies in the sql statement.
>
> for this approach, id advise setting "state" on the compiler when you
> are processing the column clause of a SELECT (or dealing with insert/
> update column lists too), vs. when you are processing WHERE, ORDER BY,
> etc.  that flag can be used to determine where youre traversing.  the
> compiler knows the context since its the one generating the statement.
>
> a more accurate way to do is to wrap the Column itself, as it goes
> into a comparison operation or ORDER BY, in a new construct such as
> "CaseInsensitive"...heres some pseudo-ish code based on the __eq__()
> method you see in sqlalchemy/sql.py _CompareMixin (assume "self" is a
> Column):
>
> def __eq__(self, other):
>     return _BooleanExpression(CaseInsensitive(self), other, '==')
>
> Better yet its configurable on Column:
>
> def __eq__(self, other):
>     if self.case_insensitive_compare:
>         return _BooleanExpression(CaseInsensitive(self), other, '==')
>     else:
>         return _BooleanExpression(self, other, '==')
>
> CaseInsensitive looks a lot like _Function and is just:
>
> class CaseInsensitive(_CalculatedClause):
>     def __init__(self, target):
>         self.target = target
>     def accept_visitor(self, visitor):
>         self.target.accept_visitor(visitor)
>         visitor.visit_case_insensitive(self)
>
> ansicompiler provides the string representation of the underlying
> target with no modification:
>
> def visit_case_insensitive(self, object):
>     self.strings[object] = self.strings[object.target]
>
> postgres compiler with case_insensitive provides it as:
>
> def visit_case_insensitive(self, object):
>     self.strings[object] = "lower(%s)" % self.strings[object.target]
>
> other dialects can have whatever flags to turn on/off the "lower()"
> wrapping as well.
>
> what we're really talking about here is a func.lower() that has
> special meaning to the compiler, i.e. that it should be conditionally
> applied based on dialect flags.  i think the flag on Column to have
> the wrapper applied might be pretty slick.
>
>
>
> > I'm curious as to others experiences with writing an app that supports
> > both Postgres and MySQL with sqlalchemy, because if someone else is
> > doing this then I must be missing something, or maybe not.  At first,
> > our unit tests all passed, then when we added real world data with
> > mixed case, tests started to fail on everything doing sorts and
> > where's on character data.
>
> yeah i dont think anyone has gotten too far with this issue, also ive
> no idea what the giant universe of Hibernate users do either (i think
> there just arent db-platform-neutral J2EE apps).
>
>
>
> > How about a Pepsi (at PyCon)?
>
> sure thing !


--~--~---------~--~----~------------~-------~--~----~
You received this message because you are subscribed to the Google Groups 
"sqlalchemy" group.
To post to this group, send email to sqlalchemy@googlegroups.com
To unsubscribe from this group, send email to [EMAIL PROTECTED]
For more options, visit this group at 
http://groups.google.com/group/sqlalchemy?hl=en
-~----------~----~----~----~------~----~------~--~---

[sqlalchemy] Re: Differentiate ANSIIdentifierPreparer.format_column select clause from where clause??

Reply via email to