[sqlalchemy] Re: setting table and mapper arguments with Declarative
On Sep 18, 2:59 pm, Michael Bayer [EMAIL PROTECTED] wrote:
> Well, I would think __table_args__ is the only argument you'd really want to propagate in that way, and this is an inconvenience I've also had, so perhaps we'll do something about it... I would propose a default_table_args keyword arg to declarative_base().

Thanks, I would appreciate a default_table_args parameter to declarative_base().
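For context, a minimal sketch of the repetition being discussed; note that default_table_args is only a proposal in this thread and does not exist in SQLAlchemy, so the last line is hypothetical:

    from sqlalchemy import Column, Integer
    from sqlalchemy.ext.declarative import declarative_base

    Base = declarative_base()

    # Today, table-level options must be repeated on every class:
    class User(Base):
        __tablename__ = 'users'
        __table_args__ = {'mysql_engine': 'InnoDB'}
        id = Column(Integer, primary_key=True)

    class Address(Base):
        __tablename__ = 'addresses'
        __table_args__ = {'mysql_engine': 'InnoDB'}  # repeated again
        id = Column(Integer, primary_key=True)

    # The proposal would hoist the default into the base (hypothetical API):
    # Base = declarative_base(default_table_args={'mysql_engine': 'InnoDB'})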
[sqlalchemy] Re: Queries very slow for parsing Wikipedia dump -- any ideas for speeding it up?
This is a general programming approach, not SQL specific. With 7 million objects, you have to try some vertical (wrong term, probably) layer-splitting of the data. Imagine the objects as rectangles on a horizontal line, each containing the same layers. Right now you walk the rectangles as "for each in X: walk Y"; you need to try to walk as "for layer in Y: walk all X". That is, if you have an object A that maps to tables tA and tB, and you process a million objects one at a time, that is a million repetitions of (INSERT INTO tA, INSERT INTO tB, and maybe an UPDATE of a foreign key). Try to come at it the other way around: one bulk insert of a million rows into tA, another bulk insert of a million rows into tB, and so on. If it's still slow, or the links are hard to knit together, you could generate some DB-specific dump/replication format and import from there. You could go for the latter approach even now, especially if your data does not change (or pick the 99% constant part, import that, then add the rest slowly and dynamically).

ciao
svilen
www.svilendobrev.com

On Tuesday 23 September 2008 06:20:06 CodeIsMightier wrote:
> Hi: I am working on an open source project to write a search engine / data-mining framework of sorts for Wikipedia, and one of the first things I need to do is to parse the Wikipedia dumps into an SQL database. I am using sqlalchemy to do this, but it is very slow (at the current rate, 130 days!!!). I am sure that I am doing something wrong since I am new at this, and am wondering whether any sqlalchemy veterans can offer their insights. The code can be found here: http://wikiminer.googlecode.com/svn/trunk/wikipedia_miner.py
>
> The critical part of the code is this:
>
>     for link_label, link_dest_title, dest_frag in self.parse_links(self.text):
>         print 'LINK from:', repr(self.title), 'to', repr(link_dest_title + '#' + dest_frag), 'label', repr(link_label)
>         try:
>             link_dest = session.query(Article).filter_by(title=link_dest_title).one()
>         except sqlalchemy.orm.exc.NoResultFound:
>             link_dest = None
>         print link_dest
>         session.add(Link(self, link_label, link_dest, dest_frag))
>
> Basically what this does is parse the links in a page, look each one up in the DB to resolve the reference, and then insert a Link into the DB. The problem is that the articles table is over 7 million rows and there are maybe 50 million links. I have tried using both SQLite and Postgres as the database. Postgres EXPLAIN ANALYZE claims that the above statements should take only around 25 ms! I think I am doing something wrong with sqlalchemy, maybe I am creating too many objects? Any help would be very appreciated.
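A minimal sketch of the lump-insert idea above, using SQLAlchemy's expression layer; the table, column names and DSN are invented for illustration, and parsed_links stands in for the output of the poster's parse_links():

    from sqlalchemy import MetaData, Table, Column, Integer, String, create_engine

    engine = create_engine('postgres://user:pass@localhost/wiki')  # hypothetical DSN
    metadata = MetaData()
    links = Table('links', metadata,
        Column('id', Integer, primary_key=True),
        Column('src_title', String(255)),
        Column('dest_title', String(255)),
        Column('label', String(255)))
    metadata.create_all(bind=engine)

    # One bulk executemany per table, instead of one INSERT per object:
    rows = [{'src_title': s, 'dest_title': d, 'label': l}
            for (s, d, l) in parsed_links]
    engine.execute(links.insert(), rows)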
[sqlalchemy] Problems with join query, can't figure out what's wrong. Noob alert!
I've recently started using SQLAlchemy and am a newbie, though I have good SQL knowledge. My first project with SQLAlchemy is a big one where I integrate SQLAlchemy into a Plone application. So far it has worked great for queries against a single table, but queries with joins across two or more tables are proving much more difficult. I've tried to join two tables with two different queries, without success.

The query:

    objects = db.session.query(Sak).join(Prosjekt, Sak.prosjektid).filter(Prosjekt.kundeid==1532).all()

returns this error:

    InvalidRequestError: Mapper 'Mapper|Sak|sak' has no property '<class 'Products.TikkTakk.model.Prosjekt.Prosjekt'>'

The query:

    objects = db.session.query(Sak).from_statement("SELECT s.saksnr, s.tittel FROM sak s INNER JOIN prosjekt p ON s.prosjektid = p.prosjektid WHERE p.kundeid = 1532").all()

returns this error:

    NoSuchColumnError: Could not locate column in row for column 'sak.prosjektid'

I have some trouble figuring out what I'm doing wrong. The NoSuchColumnError is especially frustrating, as the column prosjektid exists in the table sak; it's not a typo either ;)

The mapping looks like this:

    mappers['prosjekt'] = mapper(Prosjekt, tables['prosjekt'], properties = {
        'sak': relation(Sak)
    })
    mappers['sak'] = mapper(Sak, tables['sak'], properties = {
        'prosjekt': relation(Prosjekt),
    })

And the models like this:

    class Prosjekt(Entity):
        """Prosjekt entity map"""
        prosjektid = Field(mysql.MSInteger, primary_key=True)
        p_prosjektid = ManyToOne('Prosjekt')
        sak = OneToMany('Sak')
        using_options(shortnames=True)

        def __init__(self, data):
            """Objects are initialized using form data"""

        def __repr__(self):
            return 'Prosjekt (%d)' % (self.prosjektid)

    class Sak(Entity):
        """Sak entity map"""
        prosjektid = ManyToOne('Prosjekt', primary_key=True)
        saksnr = Field(mysql.MSInteger, primary_key=True)
        using_options(shortnames=True)

        def __init__(self, data):
            """Objects are initialized using form data"""

        def __repr__(self):
            return 'Sak (%d, %d)' % (self.prosjektid, self.saksnr)
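For reference, a sketch of the join style the 0.4/0.5-era Query API typically expects: join() takes the name of a mapped relation, not a target class plus a column. A minimal guess at the intended query, assuming the mappers above:

    objects = (db.session.query(Sak)
               .join('prosjekt')                 # name of the relation on Sak
               .filter(Prosjekt.kundeid == 1532)
               .all())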
[sqlalchemy] Re: Queries very slow for parsing Wikipedia dump -- any ideas for speeding it up?
Just a couple of thoughts that might help you out:

1) I would profile the code. It seems to me that running a regular expression over an entire Wikipedia article would be a VERY expensive operation.

2) Did the first pass succeed, and how long did it take?

3) Taking a quick look at http://wikiminer.googlecode.com/svn/trunk/wikipedia_miner.py, it seems to me that the second pass through the data would create a second set of Article objects that are never saved to the database. Therefore, the 'self' reference in:

    session.add(Link(self, link_label, link_dest, dest_frag))

would refer to an object that is never saved. I guess this would not matter since the id field is correct (since you set it explicitly), but it might be better (faster) to just read through the articles table for pass two instead of re-parsing the XML, something similar to:

    # delete previous data
    redirects_table.drop(bind=engine)
    redirects_table.create(bind=engine)
    links_table.drop(bind=engine)
    links_table.create(bind=engine)

    for article in session.query(Article):
        article.parse_text(session)

It's pretty late, so I may have missed something. Hope the above helps.

Shawn Church
I/S Consultant
shawn at SChurchComputers.com
[sqlalchemy] Re: cascades over many-to-many
On Sep 23, 2008, at 7:58 AM, [EMAIL PROTECTED] wrote:
> hi, I can't figure out how to configure cascades over many-to-many relations. In my case these are implicit m2m (via a secondary table), being treated as either versioned one2many or versioned many2one. Let's say there is a versioned A having many versioned Bs (' and '' denote version numbers). A1 has versions: A1'.bs = [B1', B2'], and at another timestamp A1''.bs = [B1'', B2']. A2 has versions which keep the same B2, one of which also has B3: A2'.bs = [B2'], and at another timestamp A2''.bs = [B2', B3']. Deleting A1' should delete B1' and B1''; deleting A2' should not delete B2'; only deleting all of A2', A2'' and A1' should delete B2'. Is this 'all, delete-orphan' given to the A.relation(B)? Or only 'delete-orphan'? How about backrefs (if any)?

delete-orphan pretty much needs delete to work correctly. But for the many-to-many use case, I'm not sure if SQLA does the right thing and cross-references every relationship - these cascades were designed more for the o2m direction (I think Hibernate supports them only for o2m relations). I'd work up some simple test cases to see what it has to say in this regard (it would have to actively load in other objects to see if an object is truly an orphan, for example; I'm not sure if it's going to try that).
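A minimal sketch of the kind of configuration under discussion; the tables, classes and the choice of cascade are invented for illustration, and as noted above, orphan detection across a secondary table may not behave as hoped:

    from sqlalchemy import MetaData, Table, Column, Integer, ForeignKey
    from sqlalchemy.orm import mapper, relation

    metadata = MetaData()
    a_table = Table('a', metadata, Column('id', Integer, primary_key=True))
    b_table = Table('b', metadata, Column('id', Integer, primary_key=True))
    a_b = Table('a_b', metadata,
        Column('a_id', Integer, ForeignKey('a.id')),
        Column('b_id', Integer, ForeignKey('b.id')))

    class A(object): pass
    class B(object): pass

    mapper(B, b_table)
    # 'all, delete-orphan' asks SQLA to delete Bs that lose their last A;
    # these cascades were designed for one-to-many, so test carefully here.
    mapper(A, a_table, properties={
        'bs': relation(B, secondary=a_b, cascade='all, delete-orphan')})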
[sqlalchemy] Re: cascades over many-to-many
On Tuesday 23 September 2008 16:06:07 Michael Bayer wrote:
> delete-orphan pretty much needs delete to work correctly. But for the many-to-many use case, I'm not sure if SQLA does the right thing and cross-references every relationship [...] I'd work up some simple test cases to see what it has to say in this regard.

So far I've used 'all' and it seems to work for one single owner, but I haven't really tried the multiple case. The whole thing seems to me like a split reference-counting mechanism, one side being a weakref, the other not.
[sqlalchemy] Re: ordering_list performance
My use case is a bit different: new entries can be placed anywhere in the existing SA list, not only at the end (it actually depends on the entry date).

On 22 sep, 21:20, jason kirtland [EMAIL PROTECTED] wrote:
> Ah, looking more closely I see you're replacing self.entries with a list, not insorting into an SA list collection - that's totally OK. It might squeeze a little more speed out to do:
>
>     updated_entries = list(self.entries) + new_entries
>     base = len(self.entries)
>     for idx, entry in enumerate(new_entries):
>         entry.position = base + idx
>     self.entries = updated_entries
>
> orderinglist's extend method could be made to do something much like the above quite efficiently.

Earlier, jason kirtland wrote:
> A warning: that depends on a bug* in the C version of bisect. When given a list subclass, it mistakenly ignores the subclass method implementations. The below will break if and when that's fixed to match the pure Python implementation in the standard lib. Calling list.extend(account_entries, new_entries) is probably a safe alternative.
>
> * http://bugs.python.org/issue3935

jean-philippe dutreve wrote:
> What I've done is something like this:
>
>     account_entries = self.entries[:]
>     for entry in new_entries:
>         insort_right(account_entries, entry)
>     for i, entry in enumerate(account_entries):
>         entry.position = i
>     self.entries = account_entries
>
> Don't know if it's the right way to do it, but it's much faster.

On 22 sep, 18:41, jason kirtland [EMAIL PROTECTED] wrote:
> I'm sure there is potential for improvement in the current orderinglist code - please feel free to send a patch with the optimizations you've found to the SA trac. The orderinglist hasn't changed much since 0.3, but with 0.5 there may be entirely new implementations possible. For example, I could imagine one that defers calculation and manipulation of the positioning information until a before_flush hook. That may perform better, with the trade-off that the position attribute can't be trusted to be in sync with the list order.

jean-philippe dutreve wrote:
> Below is the profiling of code that added 1200 items into an ordering_list relation. I had to bypass the ordering_list stuff for bulk additions in order to get better performance (down to 2 seconds). Hope this post helps to improve this part (using 0.5.0rc1, python 2.5, linux i686, 1.5GB RAM). SA is rocking!
>
> jean-philippe
>
>     Time elapsed: 48.4475638866 s
>     8875046 function calls (8869157 primitive calls) in 48.443 CPU seconds
>     Ordered by: internal time, call count
>     List reduced from 390 to 10 due to restriction <10>
>
>     ncalls           tottime  percall  cumtime  percall  filename:lineno(function)
>     1292937/1292250  7.879    0.000    12.134   0.000    attributes.py:132(__get__)
>     1241013          7.662    0.000    39.836   0.000    orderinglist.py:221(_order_entity)
>     1241013          5.870    0.000    16.916   0.000    orderinglist.py:202(_get_order_value)
>     440809           4.522    0.000    9.527    0.000    attributes.py:394(set)
>     1236             4.198    0.003    44.025   0.036    orderinglist.py:208(reorder)
>     1299736/1299048  3.752    0.000    4.373    0.000    attributes.py:310(get)
>     448225           3.337    0.000    5.157    0.000    identity.py:208(modified_event)
>     437061           2.704    0.000    14.331   0.000    orderinglist.py:205(_set_order_value)
>     440809           2.225    0.000    11.752   0.000    attributes.py:126(__set__)
>     448225           1.775    0.000    1.812    0.000    attributes.py:958(modified_event)
>     Function was called by...
>
>     attributes.py:132(__get__)             <- domain.py:200(addEntry) (1236)  46.741
>                                               domain.py:248(__init__) (1236)  47.832
>                                               domain.py:272(get) (49452)  0.609
>                                               orderinglist.py:202(_get_order_value) (1241013)  16.916
>     orderinglist.py:221(_order_entity)     <- orderinglist.py:208(reorder) (1240326)  44.025
>                                               orderinglist.py:232(append) (687)  0.013
>     orderinglist.py:202(_get_order_value)  <- orderinglist.py:221(_order_entity) (1241013)  39.836
>     attributes.py:394(set)                 <- attributes.py:126(__set__) (440809)  11.752
>     orderinglist.py:208(reorder)           <- orderinglist.py:266(__setslice__) (1236)  44.061
>     attributes.py:310(get)                 <- attributes.py:132(__get__) (1292937)  12.134
>                                               attributes.py:347(get_committed_value) (1)  0.000
>                                               attributes.py:500(set) (3708)  0.367
>                                               attributes.py:837(value_as_iterable) (3090)  0.108
>     identity.py:208(modified_event)        <- attributes.py:394(set) (440809)  9.527
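A generic version of the bulk-append snippet quoted above, written as a standalone helper; the attribute names are parameters since they vary per model, and this is an untested sketch rather than part of orderinglist itself:

    def bulk_extend(obj, new_entries, attr='entries', position_attr='position'):
        # Assign positions in one pass and replace the collection wholesale,
        # avoiding orderinglist's per-item reordering on every insert.
        existing = list(getattr(obj, attr))
        base = len(existing)
        for idx, entry in enumerate(new_entries):
            setattr(entry, position_attr, base + idx)
        setattr(obj, attr, existing + list(new_entries))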
[sqlalchemy] Cyclic references and Nullable
Suppose I have two classes of objects which hold references to each other:

    class A: b
    class B: a

Both references are mandatory, so nullable=False. I use post_update=True in the relation() function and use_alter=True in the ForeignKey constructor. After that I try to add two objects:

    session = Session()
    a = A()
    b = B()
    a.b = b
    b.a = a
    session.commit()

and catch an exception, because with post_update=True a NULL is inserted into the DB; but without post_update=True I can't add two circularly dependent objects at all. How do I add such a pair of objects to the DB?
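For concreteness, a sketch of the configuration being described (table, class and constraint names invented); the nullable=False columns are exactly what conflicts with post_update's two-step INSERT-then-UPDATE:

    from sqlalchemy import MetaData, Table, Column, Integer, ForeignKey
    from sqlalchemy.orm import mapper, relation

    metadata = MetaData()
    a_table = Table('a', metadata,
        Column('id', Integer, primary_key=True),
        Column('b_id', Integer,
               ForeignKey('b.id', use_alter=True, name='fk_a_b'),
               nullable=False))
    b_table = Table('b', metadata,
        Column('id', Integer, primary_key=True),
        Column('a_id', Integer, ForeignKey('a.id'), nullable=False))

    class A(object): pass
    class B(object): pass

    mapper(A, a_table, properties={
        'b': relation(B, primaryjoin=a_table.c.b_id == b_table.c.id,
                      post_update=True)})
    mapper(B, b_table, properties={
        'a': relation(A, primaryjoin=b_table.c.a_id == a_table.c.id,
                      post_update=True)})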
[sqlalchemy] Re: Relations - How?
Thnx a lot Alex! I already love Pylons and SQLAlchemy!

On Sep 23, 12:16, Alex K [EMAIL PROTECTED] wrote:

Hello, here is the answer:

    # -*- coding: utf-8 -*-
    from sqlalchemy import create_engine
    from sqlalchemy import Table, Column, Integer, String, Unicode, MetaData, ForeignKey
    from sqlalchemy.orm import sessionmaker, mapper, dynamic_loader, backref, relation, composite, comparable_property, contains_eager, aliased
    from sqlalchemy.sql import text, and_
    from orm.object.factory import make_papped_class
    from sqlalchemy.orm import PropComparator
    from common import Application

    app = Application()
    session = app.session
    metadata = app.metadata

    # Book        : isbn - integer; translations - many Translations; prices - many Prices
    # Translation : book - FK to Book; language - FK to Language (one-to-one); title - string
    # Language    : code - string; name - string
    # Currency    : code - string; name - string
    # Price       : currency - FK; book - FK; brutto - int; netto - int

    book_table = Table('book', metadata,
        Column('isbn', Integer, primary_key=True)
    )
    language_table = Table('language', metadata,
        Column('code', String(5), primary_key=True),
        Column('name', Unicode(20)),
    )
    currency_table = Table('currency', metadata,
        Column('code', String(5), primary_key=True),
        Column('name', Unicode(20)),
    )
    translation_table = Table('translation', metadata,
        Column('book_id', Integer, ForeignKey(book_table.c.isbn)),
        Column('language_id', String(5), ForeignKey(language_table.c.code)),
        Column('title', Unicode(512)),
    )
    price_table = Table('price', metadata,
        Column('currency_id', String(5), ForeignKey(currency_table.c.code)),
        Column('book_id', Integer, ForeignKey(book_table.c.isbn)),
        Column('brutto', Integer),
        Column('netto', Integer)
    )
    metadata.create_all()

    # create objects
    class Book(object):
        def __init__(self, isbn):
            self.isbn = isbn

    class Language(object):
        def __init__(self, code, name):
            self.code = code
            self.name = name

    class Currency(object):
        def __init__(self, code, name):
            self.code = code
            self.name = name

    class Translation(object):
        def __init__(self, book_id, language_id, title):
            self.book_id = book_id
            self.language_id = language_id
            self.title = title

    class Price(object):
        def __init__(self, currency_id, book_id, brutto, netto):
            self.currency_id = currency_id
            self.book_id = book_id
            self.brutto = brutto
            self.netto = netto

    mapper(Book, book_table)
    mapper(Language, language_table)
    mapper(Currency, currency_table)
    mapper(Translation, translation_table, properties = {
        'book': relation(Book,
            lazy = False,  # 1: note lazy=False here; it means eager loading (more details in the docs)
            backref = backref('translations', lazy = False)),
        'language': relation(Language,
            uselist = False,  # 2: note uselist; it means one-to-one instead of one-to-many
            lazy = False),
        },
        # an explicit primary key is needed when SQLA cannot assemble one for you automatically
        primary_key = [translation_table.c.book_id, translation_table.c.language_id]
    )
    mapper(Price, price_table, properties = {
        'currency': relation(Currency, lazy = False),
        'book': relation(Book, lazy = False,
            backref = backref('prices', lazy = False))
        },
        primary_key = [price_table.c.book_id, price_table.c.currency_id]
    )

    session = app.session
    if False:  # change this to True to issue the add statements
        session.add(Language('en', u'English'))
        session.add(Language('ru', u'Русский'))
        session.add(Currency('usd', u'Dollar'))
        session.add(Currency('rub', u'Рубль'))
        session.add(Book(1))
        session.add(Book(2))
        session.flush()
        session.add(Translation(book_id = 1, language_id = 'en', title = u'The book'))
        session.add(Translation(book_id = 1, language_id = 'ru', title = u'Книжка'))
        session.add(Translation(book_id = 2, language_id = 'en', title = u'Book'))
        session.add(Translation(book_id = 2, language_id = 'ru', title = u'Книжка2'))
        session.add(Price(book_id = 1, currency_id = 'usd', brutto = 12, netto = 20))
        session.add(Price(book_id = 1, currency_id = 'rub', brutto = 250, netto = 500))
        session.add(Price(book_id = 2, currency_id = 'usd', brutto = 10, netto = 18))
        session.add(Price(book_id = 2, currency_id = 'rub', brutto = 200, netto = 440))

would like now to get books that: isbn1 translation.title in
[sqlalchemy] Re: Cyclic references and Nullable
And how can you do this via explicit SQL?

On Sep 23, 18:32, mraer [EMAIL PROTECTED] wrote:
> Suppose I have two classes of objects which hold references to each other... [full question quoted above; trimmed]
[sqlalchemy] Re: Cyclic references and Nullable
I think it depends on the specific DB. In some DBs you can postpone constraint checking until the transaction is committed, I think.

On Sep 23, 6:58 pm, Alex K [EMAIL PROTECTED] wrote:
> And how can you do this via explicit SQL?
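PostgreSQL, for example, supports this via deferrable constraints. A minimal sketch in raw SQL issued through an existing engine (table and constraint names invented); the constraint must be declared DEFERRABLE for the deferral to apply:

    conn = engine.connect()  # 'engine' assumed to exist
    conn.execute("""
        ALTER TABLE a ADD CONSTRAINT fk_a_b
            FOREIGN KEY (b_id) REFERENCES b (id)
            DEFERRABLE INITIALLY DEFERRED
    """)
    trans = conn.begin()
    # Neither row satisfies the FK on its own; both do at commit time.
    conn.execute("INSERT INTO a (id, b_id) VALUES (1, 1)")
    conn.execute("INSERT INTO b (id, a_id) VALUES (1, 1)")
    trans.commit()  # constraints are checked only here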
[sqlalchemy] multiprocessing best practices
Hello there, I am developing an application that uses sqlalchemy and the Python processing package. My question is this: what is the best practice for using sessions in this type of app? Each subprocess needs to access my db to get work, so currently I am starting a scoped session in the run method of the class and using that session for all db work within that subprocess. Is that the optimal way, or is there a better way of working?

thanks,
Matt
[sqlalchemy] Re: multiprocessing best practices
On Sep 23, 4:20 pm, mg [EMAIL PROTECTED] wrote:
> Each subprocess needs to access my db to get work, so currently I am starting a scoped session in the run method of the class and using that session for all db work within that subprocess. Is that the optimal way, or is there a better way of working?

Use one scoped session + engine per process. Engines cannot be shared across processes. The scoped session machinery will give you thread-local connections. If you are not using threads you do not strictly need scoped sessions, but using them now gives you the flexibility to use them should you choose to at some point in the future.

Laurence
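A minimal sketch of the one-engine-per-process pattern described above, using the multiprocessing module for illustration (the thread used the older 'processing' package, but the idea is the same; the DSN and worker body are invented):

    import multiprocessing
    from sqlalchemy import create_engine
    from sqlalchemy.orm import sessionmaker, scoped_session

    def worker(n):
        # Create the engine and session factory inside the child process,
        # never in the parent, so no connections cross the fork boundary.
        engine = create_engine('postgres://user:pass@localhost/work')
        Session = scoped_session(sessionmaker(bind=engine))
        session = Session()
        try:
            pass  # ... fetch work, process it, session.commit() ...
        finally:
            session.close()
            engine.dispose()

    if __name__ == '__main__':
        procs = [multiprocessing.Process(target=worker, args=(i,)) for i in range(4)]
        for p in procs: p.start()
        for p in procs: p.join()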
[sqlalchemy] Binding different engines to a session object
Hi: I have been having problems attempting to rebind a given Session object/class to a different engine instance. It seems that it can only be bound once; subsequent SQL statements post to the original binding. I am using sqlalchemy 0.4.6 with elixir. Here is an example:

    __session__ = scoped_session(sessionmaker(...))
    __session__.configure(bind=engine1)
    # do sql ops here on engine1 via elixir
    __session__.configure(bind=engine2)
    # do sql ops here against engine2 via elixir
    # sql ops here get written to the database that engine1 is pointing to

The two engines I am referring to have identical schemas in different databases. In fact there may be multiple databases (with the same schema) in which I need to set the database context before performing SQL operations. I thought __session__.configure(bind=...) would do the trick, but it doesn't seem to be working. Can anyone provide some insight into how I can use sqlalchemy to write to multiple databases with the same schema? As shown, this app is multithreaded (Zope). I have received some advice to create a new session instance per schema (e.g. sess = Session()), but elixir only refers to a single session object/class (__session__), which makes things difficult. Do I need to dump elixir and just use sqlalchemy to get this functionality? Thanks!
[sqlalchemy] Re: BINARY Columns in MS SQL
So I'm working a bit on this.

> If base64 encoding within the bind_processor() can fix MS-SQL for now, I'd say that would be the approach for the time being.

It turns out base64 encoding is problematic: it requires a corresponding decode on data retrieval, which makes it effectively useless for legacy databases with extant values already stored. I think an out-only encoding, as follows, would be all that's needed for now:

    class MSBinary(sqltypes.Binary):
        def bind_processor(self, dialect):
            def process(value):
                return '0x' + value.encode('hex')
            return process

...the issue is that the emitted value is getting quoted somewhere after the type conversion:

    INSERT INTO binary_table VALUES(, '0x6df02da', ...)

but MSSQL's parser can recognize the '0x' prefix and wants the value unquoted:

    INSERT INTO binary_table VALUES(, 0x6df02da, ...)

So how do I get the Dialect to drop the quoting behavior for the return from bind_processor()?

> I guess you've never gotten testtypes.py BinaryTest to run with MS-SQL?

Nope. I run so far only with pymssql, which has never had a chance of passing with binary data.

> I would definitely want some test coverage in test/dialect/mssql.py for this; I'm pretty shocked nobody has had this problem before.

If this works, the regular BinaryTest unit test should pass. There are a few other basic tests in typetest.py that still fail for MSSQL. Someday, in a dream world where I have time to look at them and fix them, I will.
[sqlalchemy] Re: Binding different engines to a session object
I believe I have fixed my problem. Instead of binding/rebinding a given engine instance to the Session, I bind to the metadata object, which solves the problem. I found that there are numerous references to being able to rebind metadata, but not sessions.

On Sep 23, 9:03 am, Andrew [EMAIL PROTECTED] wrote:
> I have been having problems attempting to rebind a given Session object/class to a different engine instance... [full question quoted above; trimmed]
[sqlalchemy] Re: cascades over many-to-many
On Sep 23, 2008, at 9:06 AM, [EMAIL PROTECTED] wrote:
> So far I've used 'all' and it seems to work for one single owner, but I haven't really tried the multiple case. The whole thing seems to me like a split reference-counting mechanism, one side being a weakref, the other not.

It's not going to work for multiple owners, since cascades don't take backrefs into account. m2m cascade is not a use case that's met here, as it has more ambiguous behavior and complexity, and is something you'd have to roll yourself.
[sqlalchemy] Re: Cyclic references and Nullable
On Sep 23, 2008, at 10:32 AM, mraer wrote:
> Suppose I have two classes of objects which hold references to each other... [full question quoted above; trimmed]

This is only possible if you can configure your database to not enforce foreign key constraints until transaction commit time. It's something I have never done myself, but I understand it is possible with some databases.
[sqlalchemy] Re: Binding different engines to a session object
On Sep 23, 2008, at 11:03 AM, Andrew wrote:
> I have been having problems attempting to rebind a given Session object/class to a different engine instance... [full question quoted above; trimmed]

This is the wrong approach. configure() binds the overall scoped_session (which is not itself a session, but rather a container of many sessions as well as a factory for new sessions) to a new engine, and this only takes effect upon the next creation of a Session. configure() is only used by an application which would like to establish a scoped_session but does not yet have an available engine.

To bind a single session to an engine, simply assign the bind attribute:

    mysession.bind = someengine

When using a scoped_session(), the current session is available by calling it:

    mysession = __session__()
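Putting those two pieces together, a sketch of pointing the current contextual session at different engines (engine URLs invented for illustration):

    from sqlalchemy import create_engine
    from sqlalchemy.orm import scoped_session, sessionmaker

    engine1 = create_engine('postgres://localhost/db_one')
    engine2 = create_engine('postgres://localhost/db_two')
    Session = scoped_session(sessionmaker())

    session = Session()       # the current contextual session
    session.bind = engine1    # operations now go to db_one
    # ... do work, commit ...
    session.bind = engine2    # subsequent operations go to db_two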
[sqlalchemy] Re: BINARY Columns in MS SQL
On Sep 23, 2008, at 4:24 PM, Rick Morrison wrote:
>     class MSBinary(sqltypes.Binary):
>         def bind_processor(self, dialect):
>             def process(value):
>                 return '0x' + value.encode('hex')
>             return process
>
> ...the issue is that the emitted value is getting quoted somewhere after the type conversion [...] So how do I get the Dialect to drop the quoting behavior for the return from bind_processor()?

SQLA doesn't quote bind values; it passes bind parameters, so this is likely pyodbc applying the quoting. The first step here is to get a full binary round trip working with only pyodbc, no SQLA in use. The dialect can then be adjusted to do whatever is needed in that regard.
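A sketch of what such a pyodbc-only round trip might look like; the DSN is invented, and the exact binary wrapper may vary by pyodbc version:

    import pyodbc

    conn = pyodbc.connect('DSN=mssql_test;UID=user;PWD=secret')  # hypothetical DSN
    cursor = conn.cursor()
    cursor.execute("CREATE TABLE binary_test (data VARBINARY(100))")
    payload = '\x6d\xf0\x2d\xa0'
    # The value travels as a bind parameter, so no quoting or 0x hex literals
    # are involved; buffer() is the Python 2 DB-API way to mark binary data.
    cursor.execute("INSERT INTO binary_test VALUES (?)", buffer(payload))
    cursor.execute("SELECT data FROM binary_test")
    assert str(cursor.fetchone()[0]) == payload
    conn.close()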
[sqlalchemy] Re: Queries very slow for parsing Wikipedia dump -- any ideas for speeding it up?
On Sep 22, 2008, at 11:20 PM, CodeIsMightier wrote:
>     for link_label, link_dest_title, dest_frag in self.parse_links(self.text):
>         ...
>         session.add(Link(self, link_label, link_dest, dest_frag))
>
> Basically what this does is parse the links in a page, look each one up in the DB to resolve the reference, and then insert a Link into the DB. The problem is that the articles table is over 7 million rows and there are maybe 50 million links.

Commenting only on this snippet:

1. Throwing/catching exceptions in Python is time-consuming, so I would not rely upon one() to achieve this; instead, say all() and look to see if rows are present.

2. If the point of session.add(Link(..)) is that a new Link object is added, you *definitely* need to flush every 100 objects or so. When flush() proceeds, the Session has to sort its entire contents of dirty objects, so this takes increasingly longer as the number of unflushed objects grows; the list should be kept small.

3. The ORM's creation of objects is time-consuming, so I would look at selecting individual columns instead of objects, i.e. sess.query(Article.id).filter(...).all().

4. The ORM overall is designed to simplify a complex set of operations over a complex object graph, and is not optimized for raw speed as much as plain SQL expressions are. Since this is a very simplistic operation, you'll get much better performance using plain SQL expressions (see the tutorial at http://www.sqlalchemy.org/docs/05/sqlexpression.html). I wouldn't use the ORM at all for this particular operation.

5. The whole pattern of fetch a row, doesn't exist, then insert, is itself slow. If you're using MySQL, consider using the REPLACE construct, which removes the need to check for a row first.

6. Depending on the database in use, you probably want to commit your transaction every X number of rows; an exceedingly large number of changes in one transaction can also put stress on the operation.

7. If the full database of articles is 7 million rows, you might even try loading the full list of article IDs and titles into an in-memory dictionary (only the integer id and string title, *not* full ORM objects) and operating that way. It would use a few gigs of RAM but would only take a few minutes to try out to see if it's feasible. Things like memcached and DBM might be of use here too.
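A sketch combining several of these points (3, 4, 6 and 7): plain SQL expressions, a column-only lookup, batched executemany inserts, and one commit per run. The engine, articles_table and links_table objects are assumed to exist, the link column names are invented, and parse_links(), text and article_id stand in for the poster's parser:

    from sqlalchemy import select

    conn = engine.connect()

    # (7) one in-memory title -> id map, built from plain column tuples:
    title_to_id = dict(conn.execute(
        select([articles_table.c.title, articles_table.c.id])))

    trans = conn.begin()
    buf = []
    for link_label, dest_title, dest_frag in parse_links(text):
        buf.append({'src_id': article_id,               # invented columns
                    'dest_id': title_to_id.get(dest_title),
                    'label': link_label,
                    'fragment': dest_frag})
        if len(buf) >= 1000:
            conn.execute(links_table.insert(), buf)     # batched executemany
            buf = []
    if buf:
        conn.execute(links_table.insert(), buf)
    trans.commit()                                      # (6) commit in bulk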
[sqlalchemy] Re: Relations - How?
One more thing. If I have just Book and Translation, and 1 Book has 3 translations, this SQL will create 3 rows (due to the join):

    isbn  translation_code  translation_text
    1     en                The Book
    1     ru                Книжка
    1     pl                Ksiazka

So if I have 10 books I will have 30 rows. This is NxM. What if I just want to join by:

    LEFT OUTER JOIN translation AS translation_1
        ON (book.isbn = translation_1.book_id AND translation_1.code = 'en')

so that I only get the required information? Limiting the results will also be a problem, because a simple LIMIT will not be enough (consider having 30 multiple relations; it would generate a huge amount of data, M*n*x*...).

On Sep 23, 16:37, g00fy [EMAIL PROTECTED] wrote:
> Thnx a lot Alex! I already love Pylons and SQLAlchemy! [earlier messages quoted above; trimmed]
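One way to express that kind of filtered join in the 0.5-style Query API is an explicit ON clause combined with contains_eager(), so the collection is populated from the join itself rather than by a separate NxM eager load. A sketch against the mapping from the earlier post; note that Book.translations would then hold only the matching rows, which is the filtering asked about but means the collection no longer reflects the full database state:

    from sqlalchemy.sql import and_
    from sqlalchemy.orm import contains_eager

    books = (session.query(Book)
             .outerjoin((Translation,
                         and_(Book.isbn == Translation.book_id,
                              Translation.language_id == 'en')))
             .options(contains_eager('translations'))
             .all())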
[sqlalchemy] Performance problem related to saving newly created objects.
Here is the scenario. I'm using SA as the data abstraction/access layer between a desktop application and a postgresql database. The user interacts with the application primarily by browsing large numbers of records (on the order of tens of thousands at once). Occasionally the user needs to create new records. After the user has interacted with the application for a while, there may be over 100,000 objects that have been fetched into memory. These objects appear in tables that the user sorts and searches, drop-down lists for changing foreign key relations, and so on.

The problem appears when it comes time to create and save a new record. Saving a new record often takes more than 20 seconds, which is an awfully long time for the user to wait while looking at a spinning beach ball. In some extreme cases, saving a new record can take minutes.

Now, my question is: am I using SA the way it was intended to be used? For example, I create a single session when the application starts up and use that single session throughout for all interactions with the database. The reason is that the objects used to populate drop-down lists and look-up tables in the user interface need to persist; when editing existing records or adding new ones, I don't want to have to merge all of those objects into a new session all the time (there are a LOT of user interface elements that contain references to instances of SA objects) just to make a simple edit to a record.

I posted a Trac ticket, which was immediately closed as invalid because, I was told, I should be flush()'ing the session every 1,000 objects or so. I turned on autoflush in the sessionmaker() as a test and have not seen any improvement in performance. Furthermore, the behavior of autoflush is incompatible with the way users interact with the application: with autoflush on, the session attempts to flush incomplete objects to the database, which ends up violating not-null rules, foreign key constraints and whatnot.

Is there a particular idiom that I'm missing when it comes to using SA as the DAL for a desktop application?
[sqlalchemy] Re: Performance problem related to saving newly created objects.
I misread your ticket, and the resolution has been corrected. The commit() operation expires all objects present in the session, as described at http://www.sqlalchemy.org/docs/05/session.html#unitofwork_using_committing. Turn off expire_on_commit to disable the expiration step, which normally loops through all 100K objects and sets an expiration flag on all of their attributes. Keep in mind that if any concurrent transactions take place subsequent to the end of your transaction, with expire_on_commit=False you may be operating with stale data.
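For reference, a sketch of the setting in question (the engine is assumed from the application's existing setup):

    from sqlalchemy.orm import sessionmaker

    # With expire_on_commit=False, commit() no longer marks every one of the
    # ~100K loaded objects as expired, at the cost of possibly stale data if
    # other transactions modify the database afterwards.
    Session = sessionmaker(bind=engine, expire_on_commit=False)
    session = Session()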
[sqlalchemy] Re: BINARY Columns in MS SQL
On Sep 23, 2008, at 7:17 PM, Rick Morrison wrote:
>> The first step here is to get a full binary round trip working with only pyodbc, no SQLA in use.
>
> Well, that's how I got as far as I did, but that was with straight text, no bind params.
>
>> so this is likely pyodbc assigning quoting.
>
> OK then, dead end #2. I'll return to this when I've got a few more minutes.

Other than this seeming to be a bug in pyodbc, we're obviously probably going to have to go with the CONVERT() thing. The path to that, since I wanted this to work for Postgres GIS functions as well, is a new method added to TypeEngine, called something like bind_converter(); MSBinary provides func.convert() from its bind_converter() method, and compiler.visit_bindparam() calls upon this method. A little bit of plumbing would need to exist so that MSBinary can determine exactly how the func.convert() arguments are passed. This is something that can be worked out as a patch to be reviewed; we should add a trac ticket for it.
[sqlalchemy] Re: BINARY Columns in MS SQL
Michael Bayer wrote:
> SQLA doesn't quote bind values. It passes bind parameters, so this is likely pyodbc assigning quoting. The first step here is to get a full binary round trip working with only pyodbc, no SQLA in use. The dialect can then be adjusted to do whatever is needed in that regard.

Along these lines, I have created a post [1] on the pyodbc Help forum regarding the unit tests failing for BINARY and a few other types. I haven't heard anything back yet; I'll file a bug soon if nothing is forthcoming.

-John

[1] http://sourceforge.net/forum/forum.php?thread_id=2262548&forum_id=550700