Mati Skiva wrote: > I believe I took these issues into account. > But just to be sure, maybe you can highlight something for me. > > I assume the followings about the process of session.add: > * after session.add is called, the objects are placed in a to-do pool > (maybe) > * for self-generating-id objects, a special process is used, one that > fetches the generated id after the insert > * for no self-generating-id objects, a regular process of insert is used > > I came to this conclusion, because otherwise, after each insert all the > data of the row is retrieved and placed inside the object. Which covers > the newly generated id.
Since you're interested in hacking SQLAlchemy internals, it would helpful if you could familiarize yourself fully with the SQL expression and execution API, all of which is public and documented. "id generation" is a standard feature of this system and occurs outside of the ORM. Fetching the ID for a new row with the SQLA expression API is a matter of calling a ResultProxy method which the ORM only calls if it doesn't already know the full set of primary key columns for a specific object. The backend implementation of "id fetching" varies wildly across all dialects as well as versions of each database in use. The flush procedure updates or expires those fields of the object which were known to be generated by the database. Fields which are immediately updated include some or all of the primary key columns. Fields which are expired include any other columns which were generated via server-side SQL expressions or known schema-level defaults - these are fetched when the object's expired attributes are next accessed, if at all. The SQL compiler procedure is aware of these columns when it generates an individual insert() construct, given a list of parameters which are to be bound literals and others which are to be embedded SQL expressions. When executed, the ResultProxy API provides these details fully - you should familiarize yourself with all of its methods. I would also point you to http://www.sqlalchemy.org/trac/ticket/1518 , which suggests that we should open up Mapper to being subclassable (for much more exotic purposes than what you have here). However, this subclassing is possible now, which is where you can have your "insertmany" functionality rolled in a rudimental way for your immediate use case without burdening SQLAlchemy core with the need to have a fully tested/supported/performant feature. See the attached example. -- You received this message because you are subscribed to the Google Groups "sqlalchemy" group. To post to this group, send email to sqlalch...@googlegroups.com. To unsubscribe from this group, send email to sqlalchemy+unsubscr...@googlegroups.com. For more options, visit this group at http://groups.google.com/group/sqlalchemy?hl=en.
from sqlalchemy.orm.mapper import Mapper class OptimizedInsertMapper(Mapper): def _save_obj(self, states, uowtransaction, postupdate=False, post_update_cols=None, single=False): # determine which of the incoming "states" are inserts inserts = [s for s in states if not s.key] # split off into inserts, updates states = list(set(states).difference(inserts)) # get max ID of the table max_id = uowtransaction.session.scalar(select([func.max(self.primary_key[0])])) if not max_id: max_id = 0 # insertmany(). assume the incoming states are all of the same format. uowtransaction.session.execute( self.local_table.insert(), [s.dict for s in inserts] ) # guess ids - establish state on the objects. for i, s in enumerate(inserts): s.dict['id'] = max_id + i + 1 # send the updates to the regular _save_obj Mapper._save_obj(self, states, uowtransaction, postupdate=postupdate, post_update_cols=post_update_cols, single=single) if __name__ == '__main__': from sqlalchemy import * from sqlalchemy.orm import * from sqlalchemy.ext.declarative import declarative_base Base = declarative_base(mapper=OptimizedInsertMapper) class Widget(Base): __tablename__ = 'widget' id = Column(Integer, primary_key=True) data = Column(String) related = relation("Related") def __init__(self, data): self.data = data self.related = [Related() for i in xrange(5)] class Related(Base): __tablename__ = 'related' id = Column(Integer, primary_key=True) widget_id = Column(Integer, ForeignKey('widget.id')) e = create_engine('sqlite://', echo=True) Base.metadata.create_all(e) sess = sessionmaker(e)() widgets = [Widget(data="widget %d" % i) for i in xrange(10)] sess.add_all(widgets) sess.commit() # assert everything was generated for w in widgets: assert w.id for r in w.related: assert r.id assert r.widget_id == w.id assert w in sess w1 = sess.query(Widget).filter_by(data="widget 5").one() w1.data = 'widget 5 modified' sess.add_all([Widget(data="widget %d" % i) for i in range(11, 20)]) sess.commit()