Re: [sqlalchemy] SQLite: OperationalError when decoding 0x92 in TEXT column

2014-02-09 Thread Erich Blume
Thanks for the help on this, everyone! I found two ways to deal with this
and figured I should share in case it comes up in the future.

The first approach, the one I went with (because in my case, fidelity was
not as important) was to alter the 'text_factory' the sqlite3 uses. One
trick here was that to access this parameter for a sqlalchemy connection
object 'conn' as returned from an engine, I had to do it like so:

conn.connection.connection.text_factory

The extra indirection on .connection* was because the connection's internal
dbapi connection was in fact a proy '_ConnectionFairy' instance. A small
issue but one that's difficult to notice at first as it raises no error to
set a new property on an object that will never use that property.

The second approach I found was more involved but would be more appropriate
for large projects that didn't want to have this behavior on every column.
In this approach you still change the text_factory of sqlite3's connection,
but instead change it to `bytes` (or perhaps an identity? ie lambda x: x -
not sure what is best). This will cause sqlite3 to return encoded utf-8
bytes instead of unicode strings.

Then, you have to tell SQLAlchemy to convert these strings to unicode. I
did not persue this approach far enough to find the right set of arguments
but I imagine this would be very simple - set 'force_unicode' to True, I
suspect, would be all you would need.

Finally, for the column with the invalid utf-8 sequences, just also set the
`unicode_error` to your preferred resolution strategy - usually 'ignore' or
'replace'.

I suppose it is possible that this could incur a performance penalty - the
sqlite3 de/encoding process is done in a compiled C module and as such
could possibly be faster than using native python for the task. I suspect
though that the module is just calling to the usual Python library
functions for encoding/decoding (but did not check). So that may be of a
concern if you follow this approach.

Thanks again!


On Thu, Feb 6, 2014 at 4:19 PM, Michael Bayer mike...@zzzcomputing.comwrote:


 On Feb 6, 2014, at 6:59 PM, Erich Blume blume.er...@gmail.com wrote:

  Hmm, this one has me stumped. As best I can tell after poking at it
 using the column_reflect event, a custom dialect, etc. - the issue here is
 that in pysqlite.py we (in my Python 3.3 install) are selecting
 `sqlite3.dbapi2` as the dbapi interface, but we aren't telling sqlite3
 anything about how to treat unicode errors. From what I am reading (but it
 seems inconsistent, maybe?) sqlite3 automatically decodes all database
 retrieved values from their bytes for text fields, returning unicode
 strings. Except... that doesn't always seem to be true. I hex-edited a db
 file to change the utf-8 string hello to hell + 0x92 and sqlite3
 switched from returning hello to bhell\x92, or something like that -
 I've been poking at this for so long I've lost track of that transcript.
 
  One can override sqlite3's text factory, apparently, with (for instance)
 `sqlite3.text_factory = lambda x: x.decode('utf-8', errors='ignore')`.
 Maybe the key is to try and find a way to trigger that from sqlalchemy? I
 tried and failed, maybe someone else can point me back to the path?
 
  Just to re-summarize the problem: In python 3, I'm getting errors trying
 to read a row from a sqlite database that has a TEXT column with an invalid
 utf-8 sequence (specifically, the singleton bye '0x92'). I'd love to just
 have sqlalchemy move along and ignore the byte, but I'm not clear how to do
 that.

 Pysqlite (e.g. sqlite3 module) returns TEXT as Python unicode out of the
 gate.  That exception message is being raised by sqlite3 itself, SQLAlchemy
 is just a pass-through, as the string type knows on sqlite that the value
 is already unicode.

 you might need to CAST() the value as BINARY perhaps, not sure.   You’d
 first want to get a plain sqlite3 script to do what you want.   Setting a
 “text_factory” at the module level of sqlite3 is certainly easy enough but
 that seems way too broad.   Ideally you’d want to be able to get the value
 on a per-column basis.




-- 
You received this message because you are subscribed to the Google Groups 
sqlalchemy group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to sqlalchemy+unsubscr...@googlegroups.com.
To post to this group, send email to sqlalchemy@googlegroups.com.
Visit this group at http://groups.google.com/group/sqlalchemy.
For more options, visit https://groups.google.com/groups/opt_out.


Re: [sqlalchemy] SQLite: OperationalError when decoding 0x92 in TEXT column

2014-02-09 Thread Michael Bayer

On Feb 9, 2014, at 5:34 AM, Erich Blume blume.er...@gmail.com wrote:

 
 Then, you have to tell SQLAlchemy to convert these strings to unicode. I did 
 not persue this approach far enough to find the right set of arguments but I 
 imagine this would be very simple - set 'force_unicode' to True, I suspect, 
 would be all you would need.

if the SQLite connection is set up to return bytes ahead of when SQLAlchemy 
does anything with the connection, this will be automatic.  DBAPIs vary so much 
in this regard that we test the connection when the dialect first connects.   
but if you’re using connection events to achieve this, the event needs to be 
set in a certain way at the moment to make sure you get the connection before 
sqlalchemy does anything with it, there’s a ticket to document that.

 
 Finally, for the column with the invalid utf-8 sequences, just also set the 
 `unicode_error` to your preferred resolution strategy - usually 'ignore' or 
 'replace'.
 
 I suppose it is possible that this could incur a performance penalty - the 
 sqlite3 de/encoding process is done in a compiled C module and as such could 
 possibly be faster than using native python for the task.

SQLAlchemy’s C extensions do the encoding/decoding and this process has been 
enhanced in 0.9 to also take the job of an expensive and sometimes-necessary 
“check if it’s already unicode” step.I’ve already observed that SQLA’s C 
exts seem to be faster than MySQLdb’s “use_unicode”, Postgresqls unicode 
extension (which is unfortunate, we use that anyway) and using a unicode type 
with a cx_oracle outputtypehandler (which we’ve also stopped using as users 
complained about performance).


signature.asc
Description: Message signed with OpenPGP using GPGMail


[sqlalchemy] Many to many association record orphaned when deleting parent of parent of relation

2014-02-09 Thread Jay Pipes
Hi all,

I have some models with the following relationship:

An Organization can have many Groups
A Group belongs to one Organization
A User can belong to many Groups
A Group can have many Users

I use a mapping model called UserGroupMembership that stores the user_id, 
group_id relation of the many-to-many user to groups relationship.

I cannot seem to get cascading deletes to work properly for the mapping 
table.

If I create an organization, a group under that organization, a user, and 
add a record to the mapping table relating the user to the group, I would 
expect that deleting the organization would delete the group record, as 
well as the records in the mapping table that correspond to the group.

Unfortunately, what actually happens is that the organization is deleted, 
along with the group record. But the mapping table records corresponding to 
that group are not deleted -- leaving orphan records.

Here is code that demonstrates this:

http://paste.openstack.org/show/63736/

If you place the above code into a file called manydelete.py and execute 
it, you will see an assertion raised:

$ python manydelete.py 
Traceback (most recent call last):
  File manydelete.py, line 127, in module
assert len(recs) == 0
AssertionError

Note: In case anyone is wondering, the reason I use sess.commit() a number 
of times is to emulate what my code is doing (in multiple processes hitting 
the database in different API calls...)

I'm wondering if I have set up the cascading bits or the relation()s 
improperly? Any help would be most appreciated!

Best,
-jay

-- 
You received this message because you are subscribed to the Google Groups 
sqlalchemy group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to sqlalchemy+unsubscr...@googlegroups.com.
To post to this group, send email to sqlalchemy@googlegroups.com.
Visit this group at http://groups.google.com/group/sqlalchemy.
For more options, visit https://groups.google.com/groups/opt_out.


[sqlalchemy] Re: Many to many association record orphaned when deleting parent of parent of relation

2014-02-09 Thread Jay Pipes
Also note that I have tried this variation of code as well, using 
secondary=MAPPING_TABLE, with the same result:

http://paste.openstack.org/show/63752/

-jay

On Sunday, February 9, 2014 10:45:55 PM UTC-5, Jay Pipes wrote:

 Hi all,

 I have some models with the following relationship:

 An Organization can have many Groups
 A Group belongs to one Organization
 A User can belong to many Groups
 A Group can have many Users

 I use a mapping model called UserGroupMembership that stores the user_id, 
 group_id relation of the many-to-many user to groups relationship.

 I cannot seem to get cascading deletes to work properly for the mapping 
 table.

 If I create an organization, a group under that organization, a user, and 
 add a record to the mapping table relating the user to the group, I would 
 expect that deleting the organization would delete the group record, as 
 well as the records in the mapping table that correspond to the group.

 Unfortunately, what actually happens is that the organization is deleted, 
 along with the group record. But the mapping table records corresponding to 
 that group are not deleted -- leaving orphan records.

 Here is code that demonstrates this:

 http://paste.openstack.org/show/63736/

 If you place the above code into a file called manydelete.py and execute 
 it, you will see an assertion raised:

 $ python manydelete.py 
 Traceback (most recent call last):
   File manydelete.py, line 127, in module
 assert len(recs) == 0
 AssertionError

 Note: In case anyone is wondering, the reason I use sess.commit() a number 
 of times is to emulate what my code is doing (in multiple processes hitting 
 the database in different API calls...)

 I'm wondering if I have set up the cascading bits or the relation()s 
 improperly? Any help would be most appreciated!

 Best,
 -jay


-- 
You received this message because you are subscribed to the Google Groups 
sqlalchemy group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to sqlalchemy+unsubscr...@googlegroups.com.
To post to this group, send email to sqlalchemy@googlegroups.com.
Visit this group at http://groups.google.com/group/sqlalchemy.
For more options, visit https://groups.google.com/groups/opt_out.


Re: [sqlalchemy] Many to many association record orphaned when deleting parent of parent of relation

2014-02-09 Thread Michael Bayer
the problem is this:

org = Organization(name='my org')
sess.add(org)

group = Group(name='my group')
group.organization_id = org.id
sess.add(group)

when you assign group.organization_id = org.id, org has not been flushed yet so 
org.id is None.  The org is then never associated with the group.  Setting 
echo=True is the best way to quickly analyze these issues.

It’s a good idea to work with object references if at all possible to avoid 
issues like these:

org = Organization(name='my org')
sess.add(org)

group = Group(name='my group')
group.organization = org
sess.add(group)

user = User(name='my user')
sess.add(user)
sess.commit()

group_membership = UserGroupMembership(user=user, group=group)
sess.add(group_membership)
sess.commit()

assert sess.query(UserGroupMembership).count() == 1

sess.delete(org)
sess.commit()

assert sess.query(UserGroupMembership).count() == 0




On Feb 9, 2014, at 10:45 PM, Jay Pipes jaypi...@gmail.com wrote:

 Hi all,
 
 I have some models with the following relationship:
 
 An Organization can have many Groups
 A Group belongs to one Organization
 A User can belong to many Groups
 A Group can have many Users
 
 I use a mapping model called UserGroupMembership that stores the user_id, 
 group_id relation of the many-to-many user to groups relationship.
 
 I cannot seem to get cascading deletes to work properly for the mapping table.
 
 If I create an organization, a group under that organization, a user, and add 
 a record to the mapping table relating the user to the group, I would expect 
 that deleting the organization would delete the group record, as well as the 
 records in the mapping table that correspond to the group.
 
 Unfortunately, what actually happens is that the organization is deleted, 
 along with the group record. But the mapping table records corresponding to 
 that group are not deleted -- leaving orphan records.
 
 Here is code that demonstrates this:
 
 http://paste.openstack.org/show/63736/
 
 If you place the above code into a file called manydelete.py and execute it, 
 you will see an assertion raised:
 
 $ python manydelete.py 
 Traceback (most recent call last):
   File manydelete.py, line 127, in module
 assert len(recs) == 0
 AssertionError
 
 Note: In case anyone is wondering, the reason I use sess.commit() a number of 
 times is to emulate what my code is doing (in multiple processes hitting the 
 database in different API calls...)
 
 I'm wondering if I have set up the cascading bits or the relation()s 
 improperly? Any help would be most appreciated!
 
 Best,
 -jay
 
 -- 
 You received this message because you are subscribed to the Google Groups 
 sqlalchemy group.
 To unsubscribe from this group and stop receiving emails from it, send an 
 email to sqlalchemy+unsubscr...@googlegroups.com.
 To post to this group, send email to sqlalchemy@googlegroups.com.
 Visit this group at http://groups.google.com/group/sqlalchemy.
 For more options, visit https://groups.google.com/groups/opt_out.



signature.asc
Description: Message signed with OpenPGP using GPGMail


[sqlalchemy] What is the point to re-define each and every column in __init__ in a class?

2014-02-09 Thread Bao Niu
I'm new to sqlalchemy. I tried to search this question but didn't come up 
with accurate search terms. So I come here for some help from real people 
instead of search engine. For the following code:

 class Places(Base):

 

 __tablename__ = 'moz_places'

  

 id = Column(Integer, primary_key=True)

 url = Column(String)

 title = Column(String)

 rev_host = Column(String)

 visit_count = Column(Integer)

 hidden = Column(Integer)

 typed = Column(Integer)

  

 #--

 def __init__(self, id, url, title, rev_host, visit_count,

  hidden, typed):

 

 self.id = id

 self.url = url

 self.title = title

 self.rev_host = rev_host

 self.visit_count = visit_count

 self.hidden = hidden

 self.typed = typed


If I already defined each column names as class variable,  why do I still 
need to re-define each of them as instance variables? I just can't 
understand the reason. Any hint would be highly appreciated. Thanks.

-- 
You received this message because you are subscribed to the Google Groups 
sqlalchemy group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to sqlalchemy+unsubscr...@googlegroups.com.
To post to this group, send email to sqlalchemy@googlegroups.com.
Visit this group at http://groups.google.com/group/sqlalchemy.
For more options, visit https://groups.google.com/groups/opt_out.


Re: [sqlalchemy] What is the point to re-define each and every column in __init__ in a class?

2014-02-09 Thread Aseem Mohanty
As far as I understand it, its just a helper to initialize a new object.
Some things to keep in mind:

- You don't /have/ to provide all parameters, if some of your columns have
default values you can only provide the required set of arguments
- You don't /have/ to do that but then when you want to setup a new model
object you will have to initialize them one by one:
places = Places()
places.url = ...
places.title = ...
 and so on and so forth.

HTH
AM


On Sun, Feb 9, 2014 at 11:13 PM, Bao Niu niuba...@gmail.com wrote:

 I'm new to sqlalchemy. I tried to search this question but didn't come up
 with accurate search terms. So I come here for some help from real people
 instead of search engine. For the following code:

 class Places(Base):

 

 __tablename__ = 'moz_places'



 id = Column(Integer, primary_key=True)

 url = Column(String)

 title = Column(String)

 rev_host = Column(String)

 visit_count = Column(Integer)

 hidden = Column(Integer)

 typed = Column(Integer)




 #--

 def __init__(self, id, url, title, rev_host, visit_count,

  hidden, typed):

 

 self.id = id

 self.url = url

 self.title = title

 self.rev_host = rev_host

 self.visit_count = visit_count

 self.hidden = hidden

 self.typed = typed


 If I already defined each column names as class variable,  why do I still
 need to re-define each of them as instance variables? I just can't
 understand the reason. Any hint would be highly appreciated. Thanks.

 --
 You received this message because you are subscribed to the Google Groups
 sqlalchemy group.
 To unsubscribe from this group and stop receiving emails from it, send an
 email to sqlalchemy+unsubscr...@googlegroups.com.
 To post to this group, send email to sqlalchemy@googlegroups.com.
 Visit this group at http://groups.google.com/group/sqlalchemy.
 For more options, visit https://groups.google.com/groups/opt_out.


-- 
You received this message because you are subscribed to the Google Groups 
sqlalchemy group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to sqlalchemy+unsubscr...@googlegroups.com.
To post to this group, send email to sqlalchemy@googlegroups.com.
Visit this group at http://groups.google.com/group/sqlalchemy.
For more options, visit https://groups.google.com/groups/opt_out.