[sqlalchemy] Re: Lazy ID Fetching/generation
> yeah SQLAlchemy started, like Rick was getting at, with a much "closer
> to the metal" idea than that, that if you made a new object and put it
> in the session, youd "know" that it wasnt flushed yet. My experience
> with hibernate is identical, actually, nothing gets generated or
> anything in our environment over here until the flush happens.
>
> But another thing, is that the whole idea of
> "save/update/save-or-update", which we obviously got from hibernate,
> is something ive been considering ditching, in favor of something more
> oriented towards a "container" like add(). since i think even
> hibernate's original idea of save/update has proven to be naive (for
> example, this is why they had to implement saveOrUpdate()). we like to
> keep things explicit as much as possible since thats a central
> philosophical tenet of Python.

To be honest, I think a lot of people aren't thrilled with that aspect of Hibernate. Actually, I don't think I've ever used update() or saveOrUpdate(), since everything else seems to "just work" (i.e. changes made to entities loaded from the DB get automatically saved when the transaction commits).

So back to my original thoughts regarding the ability to fetch db-generated primary keys (and other DefaultGenerator things) after a save() but before an explicit/implicit flush... I would think it could be an option on the Table (err, Column) itself and not a mapper option. Something like pending_flush=True (which can default to False to keep the current behavior... though that name probably isn't very good, but flush_when_pending_on_read is too long ;).

Does that sound like a reasonable idea? You don't have to commit to any work if you don't want, I wouldn't mind trying my hand at it if necessary.

Thanks,

-Adam Batkin

--~--~-~--~~~---~--~~
You received this message because you are subscribed to the Google Groups "sqlalchemy" group.
To post to this group, send email to sqlalchemy@googlegroups.com To unsubscribe from this group, send email to [EMAIL PROTECTED] For more options, visit this group at http://groups.google.com/group/sqlalchemy?hl=en -~--~~~~--~~--~--~---
[sqlalchemy] Re: Lazy ID Fetching/generation
Rick Morrison wrote:
> Wouldn't a flavor of .save() that always flush()'ed work for this case?
>
> say, Session.persist(obj)
>
> Which would then chase down the relational references and persist the
> object graph of that object...and then add the now-persisted object to
> the identity map.
>
> ...something like a 'mini-flush'.

Almost, except I would want it to only flush if I tried to access a db-generated attribute. The normal "lazy" behavior otherwise makes perfect sense to me.

-Adam Batkin
[sqlalchemy] Re: Lazy ID Fetching/generation
Michael Bayer wrote:
>
> On Dec 10, 2007, at 4:12 PM, Adam Batkin wrote:
>
>> No, I strongly disagree. Once you save() an object, there are
>> absolutely no guarantees about when it will be flushed (other than
>> that it will happen when the transaction is actually committed).
>
> you can guarantee an object is flushed by saying flush(). or if you
> just retrieved it from a Query, youre similarly guaranteed that its
> flushed. whats the specific use case you have that youre concerned
> about ? nobodys ever had this concern before AFAIK.
>
>> I can't find any place where sqlalchemy makes any guarantees regarding
>> the transition from Pending to Persistent state. Which is why I think
>> that objects in the Pending state should function (to the best of
>> sqlalchemy's ability) in as close to the same way as possible.
>
> why is flush() not a guarantee ? it seems really simple to me...

Sorry, I should have been clearer. Obviously once you've actually flush()'d there are no problems, and the docs make it clear that a flush() does in fact ensure that everything transitions from Pending to Persistent. My mental model has always been that once you save() something, it will be at _least_ Pending, but might transition at any moment to Persistent. This is only important because in my opinion, once you save() an instance, if you then poke at its id attribute, you probably shouldn't be disappointed if the id is suddenly not None even though you never issued an explicit flush() (for example, if you issue a query that involves that table).

Regarding the other e-mail you just responded to...

>> I know for a fact that Hibernate does it this way (not that sqlalchemy
>> has to do everything Hibernate does), and I can't imagine a use case
>> where doing what's needed to retrieve database-generated fields on an
>> as-needed basis would be considered incorrect behavior.
>
> ive never seen that one before, i.e. the id generator being called in
> direct reaction to calling myentity.getId(). In our hibernate apps we
> often check for the id being null in order to check if the entity is
> persisted yet. if you only mean that hibernate generates its own IDs
> externally to INSERTs, yes thats true. SQLAlchemy would not use that

Here's what I have seen in Hibernate (at least on Oracle, and I assume Postgres would be similar, assuming db sequences are used for pk generation): Create an object, save it. No db activity yet. Call object.getId(). Nothing is inserted, but you can see a value IS pulled from the database sequence. I should test Hibernate with mysql to see if it actually performs a full insert of the row.

> 2. I can show you the less-than-public API we use to put "value
> generation" callables on attributes, i.e. the same one that issues
> deferred loads and lazy relation loads. this would be a little more
> like the start of an actual feature.
> 3. building on #2 would be some feature to SA called "eager-fetch-ids"
> or something like that, which looks at the Column for a
> DefaultGenerator that is executable (Sequences, for example, are
> executable DefaultGenerator objects). it would probably just take a
> list of keys and apply to any columns you want, not just pk cols.

That'd be lovely. Although I don't often use databases that lack sequences, perhaps someone who does would also find it useful, so it might be interesting to see what the implications are of having to flush a whole object to the DB if a DefaultGenerator column is accessed after a save() but before a flush() (or an implicit flush due to a query).

-Adam Batkin
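A rough sketch of the sequence-style behavior described in the message above: reading the id pulls a value from a sequence without inserting anything, and the later flush reuses that value as the primary key. Everything here is an assumption for illustration: `itertools.count` stands in for a database sequence (SQLite has none), and `ToySession`, `save`, `flush` and this `Something` class are hypothetical toy names, not SQLAlchemy's or Hibernate's API.

```python
import itertools
import sqlite3

# Stand-in for a database sequence such as Oracle or Postgres provide (assumption).
something_id_seq = itertools.count(1)


class Something:
    def __init__(self, name):
        self.name = name
        self._id = None

    @property
    def id(self):
        # Pull the next sequence value on first access; no row is written here.
        if self._id is None:
            self._id = next(something_id_seq)
        return self._id


class ToySession:
    """Hypothetical minimal session: save() queues, flush() INSERTs."""

    def __init__(self, conn):
        self.conn = conn
        self.pending = []

    def save(self, obj):
        self.pending.append(obj)

    def flush(self):
        for obj in self.pending:
            # The pre-fetched id (or a freshly drawn one) becomes the primary key.
            self.conn.execute(
                "INSERT INTO something (id, name) VALUES (?, ?)",
                (obj.id, obj.name),
            )
        self.pending = []


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE something (id INTEGER PRIMARY KEY, name TEXT)")
session = ToySession(conn)

s = Something("foo")
session.save(s)
early_id = s.id  # value taken from the "sequence"; still no row in the table
rows_before = conn.execute("SELECT COUNT(*) FROM something").fetchone()[0]
session.flush()
rows_after = conn.execute("SELECT id, name FROM something").fetchall()
```

This only works where ids can be generated apart from the INSERT. On a database without sequences, the id would not exist until the row itself is written.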
[sqlalchemy] Re: Lazy ID Fetching/generation
Rick Morrison wrote:
>
> > Having to
> > call flush before doing anything that might require the ID seems
> > excessive and too low-level for code like that.
>
> Why? To me, having to work around the implications of an implicit
> persisting of the object for nothing more than a simple attribute access
> is much worse. For example, I have code that examines the instance id
> attribute of such objects to determine, for example, whether an item
> discount needs to be recalculated (not saved) or a differential discount
> needs to be calculated (item already saved). It's natural to simply
> examine the PK attribute to see if the item is persisted to make the
> decision. In this case, an object.is_saved() method could stand in, but
> imagine passing your object instance to some other Python function that
> just happened to sniff around attributes -- such behavior could cause
> the item to flush() without explicit permission? Yuck.

I think you missed the part where I said that the object in question was already save()'d, so at any moment it could be flushed and the id magically filled in. Once you save() an object, you never know when it will be flushed.

> > "As far as the application is concerned, objects in the Pending and
> > Persistent states should function identically."
>
> To me, this is a fallacy of how ORMs work, and ignores the particulars
> of what happens, or what could happen, during a database save
> round-trip. You could have default columns. You could have triggers.
> You could have DRI violations. The database engine could do implicit
> type conversion. You simply cannot expect an unchanged object on a
> round-trip to a relational database in a real-world case. To always
> expect this is to invite huge complexity issues that have been the
> downfall of other ORM attempts. For pure unchanged round-trip behavior,
> you want a real OO database, not an ORM.

No, I strongly disagree. Once you save() an object, there are absolutely no guarantees about when it will be flushed (other than that it will happen when the transaction is actually committed).

An example: In sqlalchemy, if I create an object and save() it, it won't be flushed (yet, probably). If I then execute some arbitrary query (say, session.query(Something).filter(Something.c.name=='foo').all()) then sqlalchemy WILL flush that object I just saved! And that's the correct behavior, because it would be nearly impossible for sqlalchemy to correctly determine whether the object I just saved should be returned by the query. In other words, I never flushed, but it got flushed anyway.

I'm not saying that sqlalchemy should always flush every object once you start touching any of its attributes. But attributes that we KNOW need to be fetched should be treated exactly the same way that arbitrary queries treat Pending objects.

I can't find any place where sqlalchemy makes any guarantees regarding the transition from Pending to Persistent state. Which is why I think that objects in the Pending state should function (to the best of sqlalchemy's ability) in as close to the same way as possible.

Just some thoughts,

-Adam Batkin
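The "I never flushed, but it got flushed anyway" behavior can be modelled with a toy session that flushes pending objects before running any query. This is a minimal sketch under stated assumptions, not SQLAlchemy's implementation: `AutoflushSession`, `query_ids_by_name` and this `Something` class are hypothetical names, and `sqlite3` stands in for the database.

```python
import sqlite3


class Something:
    def __init__(self, name):
        self.name = name
        self.id = None  # unknown until the row is inserted


class AutoflushSession:
    """Hypothetical toy session; not SQLAlchemy's implementation."""

    def __init__(self, conn):
        self.conn = conn
        self.pending = []

    def save(self, obj):
        # Only queue the object; no SQL is emitted here.
        self.pending.append(obj)

    def flush(self):
        for obj in self.pending:
            cur = self.conn.execute(
                "INSERT INTO something (name) VALUES (?)", (obj.name,)
            )
            obj.id = cur.lastrowid
        self.pending = []

    def query_ids_by_name(self, name):
        # Flush first, so rows that are still pending can match the query.
        self.flush()
        cur = self.conn.execute(
            "SELECT id FROM something WHERE name = ?", (name,)
        )
        return [row[0] for row in cur.fetchall()]


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE something (id INTEGER PRIMARY KEY, name TEXT)")
session = AutoflushSession(conn)

obj = Something("foo")
session.save(obj)
id_before_query = obj.id                      # still None: nothing flushed yet
matches = session.query_ids_by_name("foo")
id_after_query = obj.id                       # set as a side effect of the query
```

The id changes from None to a real value without any explicit flush() call, which is the transition the message argues an application already has to tolerate.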
[sqlalchemy] Re: Lazy ID Fetching/generation
>> I hate to disagree here, and I can see what you're getting at, but
>> honestly, the "INSERT on save()" approach is exactly the naive
>> active-record-like pattern that SQLAlchemy's ORM was designed to get
>> away from.
>>
>> The way the unit of work functions, we dont generate ids until a flush
>> occurs. Flushes dont occur unless you say flush(), or if you have
>
> I'm not saying flush on save. I'm saying flush at the last possible
> moment (which is what it does now) but I want "last possible moment" to
> include "program tried to access a database-generated field"
>
> s1 = Something('foo1')
> session.save(s1)
> s2 = Something('foo2')
> session.save(s2)
> # Nothing flushed yet
> s3 = Something('foo3')
> session.save(s3)
> url_for_foo = "/something?id=%d" % s3.id
> # s3 should be flushed, nothing else though (since s3.id was accessed)

Can you suggest other alternatives for the above use case? Having to call flush before doing anything that might require the ID seems excessive and too low-level for code like that.

I know for a fact that Hibernate does it this way (not that sqlalchemy has to do everything Hibernate does), and I can't imagine a use case where doing what's needed to retrieve database-generated fields on an as-needed basis would be considered incorrect behavior.

I think what I'm asking for can be summarized this way:

"As far as the application is concerned, objects in the Pending and Persistent states should function identically."

(it's possible that this feature would be difficult to implement, in which case that's a good answer and maybe it can go on to a far-off wishlist, or I can try to implement it or something, I just don't see a way for it to be considered incorrect behavior)

Thanks,

-Adam Batkin
[sqlalchemy] Re: Lazy ID Fetching/generation
> I hate to disagree here, and I can see what you're getting at, but
> honestly, the "INSERT on save()" approach is exactly the naive
> active-record-like pattern that SQLAlchemy's ORM was designed to get
> away from.
>
> The way the unit of work functions, we dont generate ids until a flush
> occurs. Flushes dont occur unless you say flush(), or if you have

I'm not saying flush on save. I'm saying flush at the last possible moment (which is what it does now) but I want "last possible moment" to include "program tried to access a database-generated field"

    s1 = Something('foo1')
    session.save(s1)
    s2 = Something('foo2')
    session.save(s2)
    # Nothing flushed yet
    s3 = Something('foo3')
    session.save(s3)
    url_for_foo = "/something?id=%d" % s3.id
    # s3 should be flushed, nothing else though (since s3.id was accessed)

-Adam Batkin
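One way the requested "flush on read" behavior could look is a Python descriptor on the id attribute that inserts only the owning row when a pending object's id is first read. This is a sketch of the idea, not an existing SQLAlchemy feature: `FlushOnRead`, `ToySession`, `flush_one` and the `_session` back-reference are all hypothetical, and `sqlite3` stands in for the database.

```python
import sqlite3


class FlushOnRead:
    """Descriptor that flushes a pending owner when the value is first read."""

    def __set_name__(self, owner, name):
        self.slot = "_" + name

    def __get__(self, obj, objtype=None):
        if obj is None:
            return self
        value = getattr(obj, self.slot, None)
        session = getattr(obj, "_session", None)
        if value is None and session is not None:
            # Pending object: INSERT just this row to learn the value.
            session.flush_one(obj)
            value = getattr(obj, self.slot, None)
        return value

    def __set__(self, obj, value):
        setattr(obj, self.slot, value)


class Something:
    id = FlushOnRead()

    def __init__(self, name):
        self.name = name


class ToySession:
    """Hypothetical minimal session used only for this illustration."""

    def __init__(self, conn):
        self.conn = conn
        self.pending = []

    def save(self, obj):
        obj._session = self
        self.pending.append(obj)

    def flush_one(self, obj):
        cur = self.conn.execute(
            "INSERT INTO something (name) VALUES (?)", (obj.name,)
        )
        obj.id = cur.lastrowid
        self.pending.remove(obj)


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE something (id INTEGER PRIMARY KEY, name TEXT)")
session = ToySession(conn)

s1, s2, s3 = Something("foo1"), Something("foo2"), Something("foo3")
for s in (s1, s2, s3):
    session.save(s)

url_for_foo = "/something?id=%d" % s3.id   # reading s3.id inserts s3 only
still_pending = [s.name for s in session.pending]
```

An object that was never saved has no session, so its id simply reads as None. The sketch ignores relationships and insert ordering, which a real unit of work would have to respect.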
[sqlalchemy] Re: Lazy ID Fetching/generation
>> My thought is that sqlalchemy should force the object to be flushed
>> (or whatever must be done to determine the ID, possibly just selecting
>> the next value from a sequence) when the id property is retrieved.
>
> can't be done for mysql, sqlite, MSSQL, others, without issuing an
> INSERT. you cant INSERT on __init__ since not every attribute may be
> populated on the object, and additionally our session doesnt generally
> like to do things "automatically", with the exception of the
> "autoflush" feature. also we don't emit any modifying SQL externally
> to the flush. if youre using a database like postgres or oracle,
> you're free to execute the sequence yourself and apply the new value
> to the primary key attribute of your object, and it will be used as
> the primary key value when the INSERT does actually occur.

Ahh, but session.save() was already called, so trying to fetch a database-generated attribute (such as the primary key in my case) should trigger a flush of the row itself. That can be done with any database. It wouldn't be done on __init__, nor would it be done on save(). It would be done only once you tried to fetch the id property (and only for objects in the Pending state).

Okay, as an example. Let's say you have:

    something_table = Table('something', metadata,
        Column('id', Integer, primary_key=True),
        Column('name', String)
    )

    class Something(object):
        def __init__(self, name):
            self.name = name
        def __repr__(self):
            # The format string was lost in the archive; one using %d
            # for the id is assumed here.
            return "<Something id=%d name=%s>" % (self.id, self.name)

    mapper(Something, something_table)

    obj = Something('blah')
    session.save(obj)
    print "Look ma, a something: %s" % obj

In theory that will throw an exception, since Something's __repr__ will have None for the id property because the id was never retrieved from the database.

(if I just did:

    obj = Something('blah')
    print "Bad idea: %s" % obj

then I would expect an exception, since it's not saved)

Does this description make more sense than what I said before?

-Adam Batkin
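The failure described above can be reproduced without any database. This minimal sketch assumes the `__repr__` format string used `%d` for the id (the exact string did not survive in the archive, so the one below is hypothetical), and it uses Python 3 syntax rather than the `print` statement of the original.

```python
class Something:
    def __init__(self, name):
        self.name = name
        self.id = None  # not flushed, so no database-generated id yet

    def __repr__(self):
        # Assumed format string; %d requires a number, and id is still None.
        return "<Something id=%d name=%s>" % (self.id, self.name)


obj = Something("blah")
try:
    text = "Look ma, a something: %s" % obj
    raised = False
except TypeError:
    raised = True
```

Formatting the id with `%s` or `%r` instead would avoid the exception, but the output would still show None until a flush happens.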
[sqlalchemy] Lazy ID Fetching/generation
If I create an object, then save() it, potentially the object won't actually be persisted until sqlalchemy decides that it needs to be (for example on flush/commit, or when some query involving Something's table gets executed), which is good. But (in my opinion) the laziness is a bit too lazy when it comes to autogenerated primary keys:

    t = Something('foo')
    session.save(t)
    assert t.id is None

but if I then:

    session.flush()
    assert t.id is not None

My thought is that sqlalchemy should force the object to be flushed (or whatever must be done to determine the ID, possibly just selecting the next value from a sequence) when the id property is retrieved.

Thoughts?

-Adam Batkin
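For context on why the id stays None until the flush: with an autoincrement-style primary key, the database generates the value as part of the INSERT itself. A small illustration with Python's standard `sqlite3` module (the table and column names simply mirror the example and are otherwise arbitrary):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE something (id INTEGER PRIMARY KEY AUTOINCREMENT, name TEXT)"
)

# Before the INSERT there is no row, hence no generated key to read.
count_before = conn.execute("SELECT COUNT(*) FROM something").fetchone()[0]

# The key is generated by the INSERT and reported afterwards via lastrowid.
cur = conn.execute("INSERT INTO something (name) VALUES (?)", ("foo",))
generated_id = cur.lastrowid
```

Databases with sequences can hand out a key before the row exists, which is why the two cases behave differently.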