[web2py] Re: ORM (?) : A Revisit, NOT a Rebuttal
Arnon, I'm sure you are well-meaning, and I appreciate your enthusiasm, but at this point, I think we're going to have to move on. I was encouraged when you started providing examples and we had something concrete to discuss, and I think that has clarified some issues, but it appears this is no longer productive. You began by proposing that we merely build something like the SQLA ORM. Now that even SQLA doesn't meet your needs, you have progressed to requesting a hypothetical ORM that would not only be substantially more sophisticated than even SQLA but also, it appears, logically impossible to implement (at least in the general case). You seem to think you can query the session without having a way to uniquely associate objects with their RDBMS records, and that you can create dummy partial-objects and somehow associate and merge them with real database records after the fact. And all this for what will likely be modest performance improvements in limited use cases where alternative strategies would be much simpler to implement. Since you do know how to program and have apparently worked out the logical details of your desired functionality, I would suggest you should therefore be able to implement at least a basic proof-of-concept on your own. Maybe see how that goes and get back to us. Anthony -- --- You received this message because you are subscribed to the Google Groups web2py-users group. To unsubscribe from this group and stop receiving emails from it, send an email to web2py+unsubscr...@googlegroups.com. For more options, visit https://groups.google.com/groups/opt_out.
Re: [web2py] Re: ORM (?) : A Revisit, NOT a Rebuttal
Hi Arnon, On 03/05/2013 23:16, Arnon Marcus wrote: Granted, you could use the ID in a DAL context, in a similar way that you would use an ORM-class-instance within the ORM context, but there are other things that an ORM-object can do within an ORM context (aside from these examples), that an ID or a Rows object can not do in the DAL context. OK, well I understand the parts of your reply where you have been explicit, but my original point was that: - I truly do not understand what problem the DAL fails to solve, and I don't expect to without some much-less-abstract discussion. ...and I'm still none the wiser. As a database developer with a set-oriented mind-set, the direct mapping to the db and explicitness of db I/O (as described by Massimo in his recent response) make web2py's approach an elegant and powerful solution to managing relational data in a relational manner. Perhaps my RDBMS mind-set prevents me from understanding the virtues of what you are proposing. Given the complexities of implementing an ORM (not to mention your assertion that it is a solved problem), what was your take on the suggestion that a sensible approach would be to write an adapter for SQLA (or some other ORM) for the DAL? It's not something that I would use myself, but I can see how it would fit into web2py, and surely that would be a more practical approach than developing an ORM for web2py from scratch? -- Regards, PhilK e: p...@xfr.co.uk - m: 07775 796 747 'work as if you lived in the early days of a better nation' - alasdair gray -- --- You received this message because you are subscribed to the Google Groups web2py-users group. To unsubscribe from this group and stop receiving emails from it, send an email to web2py+unsubscr...@googlegroups.com. For more options, visit https://groups.google.com/groups/opt_out.
Re: [web2py] Re: ORM (?) : A Revisit, NOT a Rebuttal
Yes, I have actually suggested this myself. SQLA is layered, so it has it's ORM layer built on-top. It very well may be a viable option. However, it is not entierly clear how complex it could be, since the SQLA core is heavilly integrated into the ORM layer. Now, since it is all ope-source code, another option would be to try to extract code-chunks out of it, and reintegrate it with web2py. That way, there may be more freedom in defining the points of intersection, as well as th overall API that would be exposed to the developer. A for why it would be beneficial, I tried many approaches of explaining that, but it starting to be clear to me that a lot of people here have extensive experience in bare-SQL, so you are so used to this mined-set that it is difficult for you to get out of it. It feals akin to people that have extensive experience with managed-code, that they are so used to doing mallocs, that after they discover some API that simplifies their usual manual memomry-management skills, that they can't fathom the benefits of having a memory-manager - it is difficult for them to trust if to be efficient enough. Or another example would be static-vs-dynamic languages. It is the tradeoff between software-performance, and software-developement speed and ease. What can I say, an ORM may not be for everyone... That said, there have been many such threads in this group over the years, so alought it seems in this thread that I am the only advocate of this aaproach, I know for a fact that it is not the case (at least it not has been the case). So, either the other proponents of this approach, are keeping silent this time around, or they have left web2py behind and moved to another framework, once they saw it isn't gonna happen here. The third option would be that they have been 'converted' to the DAL's way of thinking. The reason this option is highly unlikely, is that if that were the case, I would have gotten plenty comments here already, in the form of ... I used to be a proponent of ORM myself, and here is why I converted I have not gotten a single response like that as of yet... -- --- You received this message because you are subscribed to the Google Groups web2py-users group. To unsubscribe from this group and stop receiving emails from it, send an email to web2py+unsubscr...@googlegroups.com. For more options, visit https://groups.google.com/groups/opt_out.
Re: [web2py] Re: ORM (?) : A Revisit, NOT a Rebuttal
Also, come to think of it, it seems extremely bizarre to me, that this community is not receptive to the ORM concept, and here is why: Web2py is already doing some radically-unorthodox things, architecturally, sometimes paying heavy-prices for it, all in order to gain more conciseness of code - compared to many other frameworks. So much so, in fact, that it historically have been a very tough sell for the python community at large, to accept the web2py-way of doing things - sometimes to the extreme of catching fire from people in the python community, not merely criticizing web2py but downright scorning it's developers. So, it is hard for me to imagine web2py developers as these heavily-orthodox bulk-heads - it just wouldn't add up... But perhaps I have it the other-way around - perhaps an ORM IS the orthodox-way of doing things, and the DAL is the unorthodox way... -- --- You received this message because you are subscribed to the Google Groups web2py-users group. To unsubscribe from this group and stop receiving emails from it, send an email to web2py+unsubscr...@googlegroups.com. For more options, visit https://groups.google.com/groups/opt_out.
[web2py] Re: ORM (?) : A Revisit, NOT a Rebuttal
Technically, it would be db.Country(Name=countryName).City(db.City.Name== cityName).select().first() . 10x for correcting me. The way you have coded it, it is actually 6 round trips -- there are 2 selects per function -- one for the country, and a second for the city. But that's not how you would do it in web2py anyway. Instead, you would issue no selects and instead do it with just a single update -- so a total of 2 round trips to the db (i.e., 2 updates) : def a_child_was_born_in(countryName, cityName): query = (db.City.Name == cityName) (db.City.Country.belongs(db. Country.Name == countryName)) db(query).update(Population=db.City.Population + 1) Obviously I could have done it this way, but I wanted an example that I can use to illustrate the differences. Your suggestion is a circumstantial optimization to an example that is meant to show something else. It is a testament to my poor example, more than it is for ORM's weaknesses. It is before the optimization process has begun. Lets see where we go from here. So, in order to do the update, we do not first have to query the database to retrieve the record. This is actually an advantage over the ORM, which requires that you first retrieve the record before updating it. That is not true. I could have done it otherwise - this is just for the sake of the example. There is nothing architecturally preventing an ORM from issuing an update without a select. It's an supplementation detail, not an architectural one. You may have encountered ORMs that can't support that, but that doesn't mean that the problem is in the architecture. The ORM will issue two queries to get the record if lazy loading is used, or one if eager loading, a join, or a subquery is used. Correct. Again, for the sake of this example. Furthermore, because web2py doesn't need to retrieve the records, it also has a processing and memory advantage over the ORM, which must create the record object, add it to the session, and hold it in memory. That again is a circumstantial issue, pertaining to this example. In most cases, there would be more reuse of the objects, so creating them would be beneficial. Also, I don't see how in web2py's DAL it is any different - all of the objects you are using in the query/update are objects that has to be created for you to use them... In fact, in an ORM, only the objects that are needed for the query may be created for each transaction, but due to how web2py is executing, you are actually having to create the entire-schema of objects from scratch at every request, so I can't see how web2py would create less-objects - it actually would create more... Now, lets say we want to optimize that, so we do a Lazy version of those functions. There's not much to optimize here. If you don't know ahead of time that you will be making two updates to the same record (which may possibly negate each other), No. My point here was that a lazy-query might be beneficial in some cases, so it could be integrated into the functions, at least optionally, and be chosen by the caller, based on the circumstances. It also has nothing to do with anticipating the override-back-to-default. You are conflating 2 different mechanisms that are at play here. Making it lazy, is mechanism that is being enabled by the Identity Mapper in this case, as these are two name-spaces that need to re-use the same object. Stick to the original example if you want to judge the line-of-thought - obviously it would brake down once you optimize it the way you have, but that's irrelevant to the example at hand. I think the minimum number of db hits is two. You could retrieve the record twice, defer the first update, recognize that the second update cancels the first, and then make no update -- which is still 2 hits (well, 1 hit if you cache the query). Again, you are completely changing the example, in order to optimize the use-case differently. Obviously that is possible, but still irrelevant, as the example is a mere means-to-en-end of explaining something else. All you are saying here, is that it is a bad example, and that may be so, but it has nothing to do with alternative optimization approaches. Or you could just make the 2 updates (as above). In any case, I believe the ORM actually requires a minimum of 4 hits (see below), so web2py is still doing a lot better. Wrong. (see below) Assuming this is SQLA, I don't think that's quite the right syntax -- it appears you are creating object instances rather than issuing queries. I believe it should be something like this: This isn't SQLA - it isn't anything at this point - just a suggestion. def a_child_was_born_in(countryName, cityName): city = session.query(City).join(Country)\ .filter(Country.Name == countryName)\ .filter(City.Name == cityName).first() city.Population += 1 The above
[web2py] Re: ORM (?) : A Revisit, NOT a Rebuttal
It is also worth noting that when using SQLA in a web application, each SQLA session lasts for only a single web request. So, any benefits you may get from having the ORM maintain its own transaction-like session with identity-mapping, etc. are limited to the timescale of a single request. This approach is an improvement over the ActiveRecord style ORM, but it doesn't appear to offer much, if any, benefit over the web2py DAL, which does not suffer from the ActiveRecord problems. For example, in SQLA, you might tout the ability to make multiple updates to a record during a session but have the changes deferred to a single database update. However, you can already do this in web2py by making multiple updates to a Row object and then calling .update_record() to send all the changes to the database in a single update. Of course, in web2py, if you obtain the record via different queries in different parts of the code, then you will get different Row objects (unlike in SQLA, which will ultimately give you back the same object), so you will have to make separate updates to the database for each change. However, this is also true in SQLA -- if you make a separate query that happens to retrieve a record you have previously updated in the session, the previous update will first be flushed to the database before the later query -- so you still end up with the same number of updates as in web2py. Moreover, web2py has an additional advantage -- in web2py, if you happen to make a query in between making two changes to the same Row object, the first change to the Row object is not flushed to the database, so you can still limit yourself to a single db update for that Row. In SQLA, the intervening query would cause a flush, so you end up with more updates than you need. Yes, you can manually prevent the flush in SQLA if you know it will not do any harm, but then there's no benefit over the explicitness of the DAL. Also, it's true that in SQLA, if you do two separate queries that happen to retrieve some of the same records, it will still only hold one copy of each unique record object in memory. But it still needs to pull the duplicate data from the database and therefore hold it in memory for some time before releasing it. On the other hand, in web2py you can update a record without first retrieving it from the database, which saves a database hit, memory, and processing time relative to a SQLA update. Anthony On Saturday, May 4, 2013 12:36:50 AM UTC-4, Anthony wrote: def a_child_was_born_in(countryName, cityName): city = db.Country(Name=countryName).City(Name=cityName).select(). first() city.update_record(Population=city.Population + 1) def a_person_has_died_in(countryName, cityName): city = db.Country(Name=countryName).City(Name=cityName).select(). first() city.update_record(Population-city.Population - 1) Technically, it would be db.Country(Name=countryName).City(db.City.Name== cityName).select().first() . # In context 1: a_child_was_born_in('France', 'Paris') ... # In context 2: a_person_has_died_in('France', 'Paris') This would issue 4 round-trips to the database - 2 selects and 2 updates. The way you have coded it, it is actually 6 round trips -- there are 2 selects per function -- one for the country, and a second for the city. But that's not how you would do it in web2py anyway. Instead, you would issue no selects and instead do it with just a single update -- so a total of 2 round trips to the db (i.e., 2 updates) : def a_child_was_born_in(countryName, cityName): query = (db.City.Name == cityName) (db.City.Country.belongs(db. Country.Name == countryName)) db(query).update(Population=db.City.Population + 1) So, in order to do the update, we do not first have to query the database to retrieve the record. This is actually an advantage over the ORM, which requires that you first retrieve the record before updating it. The ORM will issue two queries to get the record if lazy loading is used, or one if eager loading, a join, or a subquery is used. Furthermore, because web2py doesn't need to retrieve the records, it also has a processing and memory advantage over the ORM, which must create the record object, add it to the session, and hold it in memory. Now, lets say we want to optimize that, so we do a Lazy version of those functions. There's not much to optimize here. If you don't know ahead of time that you will be making two updates to the same record (which may possibly negate each other), I think the minimum number of db hits is two. You could retrieve the record twice, defer the first update, recognize that the second update cancels the first, and then make no update -- which is still 2 hits (well, 1 hit if you cache the query). Or you could just make the 2 updates (as above). In any case, I believe the ORM actually requires a minimum of 4 hits (see below), so web2py is
[web2py] Re: ORM (?) : A Revisit, NOT a Rebuttal
On Saturday, May 4, 2013 6:07:21 AM UTC-7, Anthony wrote: It is also worth noting that when using SQLA in a web application, each SQLA session lasts for only a single web request. So, any benefits you may get from having the ORM maintain its own transaction-like session with identity-mapping, etc. are limited to the timescale of a single request. This approach is an improvement over the ActiveRecord style ORM, but it doesn't appear to offer much, if any, benefit over the web2py DAL, which does not suffer from the ActiveRecord problems. For example, in SQLA, you might tout the ability to make multiple updates to a record during a session but have the changes deferred to a single database update. However, you can already do this in web2py by making multiple updates to a Row object and then calling .update_record() to send all the changes to the database in a single update. Of course, in web2py, if you obtain the record via different queries in different parts of the code, then you will get different Row objects (unlike in SQLA, which will ultimately give you back the same object), so you will have to make separate updates to the database for each change. However, this is also true in SQLA -- if you make a separate query that happens to retrieve a record you have previously updated in the session, the previous update will first be flushed to the database before the later query -- so you still end up with the same number of updates as in web2py. Not i you deactivate auto-flushing. Moreover, web2py has an additional advantage -- in web2py, if you happen to make a query in between making two changes to the same Row object, the first change to the Row object is not flushed to the database, so you can still limit yourself to a single db update for that Row. In SQLA, the intervening query would cause a flush, so you end up with more updates than you need. Yes, you can manually prevent the flush in SQLA if you know it will not do any harm, but then there's no benefit over the explicitness of the DAL. You will not get the benefit of auto-flushing for these operations, obviousely, because you turned it off explicitly. But it is EXACTLY in order to GAIN a benefit of Lazy-Updates ACROSS NAME-SAPCES - that you temporarily turned it odd - which is something that web2py CAN NOT DO. Ultimately, get a SINGLE-DATABASE hit in this example, whereas using web2py you would get TWO hits. Also, it's true that in SQLA, if you do two separate queries that happen to retrieve some of the same records, it will still only hold one copy of each unique record object in memory. But it still needs to pull the duplicate data from the database and therefore hold it in memory for some time before releasing For this example, yes, but it is worth is because you are saving a round-trip to the database, when compared to web2py. For other cases, you may still do an update-without-select in SQLA also, using the SQLA-Core directly. You wont get the benefits of an ORM this way, but you will also loose the ability so save yourself a database-hit, and will be back in web2py's DAL land, so you didn't WIN anything from an ORM like that, but you also didn't lose anything that web2py can do that SQLA can't. it. On the other hand, in web2py you can update a record without first retrieving it from the database, which saves a database hit, memory, and processing time relative to a SQLA update. It saves a hit to the database, in exactly the same manner that an SQLA-Lazy-Update would do (with auto-flushing temporarily turned off). But the Unit-Of-Work pattern, in conjunction with the Identity-Map, then saves you ANOTHER hit to the database, in this case, when it is concluded that no change is needed, THAT you CAN NOT do in web2py in this case, since the two Rows object do not know about each other, and so the back-to-previouse-value check thing can not be detected. One more thing - the auto-flush disabling does not have to be decided ahead of time at session-creation (though you can also do that). You can always just tick the flag on and off in runtime, so you could potentially get the nest of both worlds even within a transaction, if you are doing other things in it befire this example occurs, that benefit from auto-flushing. Anthony On Saturday, May 4, 2013 12:36:50 AM UTC-4, Anthony wrote: def a_child_was_born_in(countryName, cityName): city = db.Country(Name=countryName).City(Name=cityName).select(). first() city.update_record(Population=city.Population + 1) def a_person_has_died_in(countryName, cityName): city = db.Country(Name=countryName).City(Name=cityName).select(). first() city.update_record(Population-city.Population - 1) Technically, it would be db.Country(Name=countryName).City(db.City.Name== cityName).select().first() . # In context 1: a_child_was_born_in('France', 'Paris') ... # In context
[web2py] Re: ORM (?) : A Revisit, NOT a Rebuttal
def a_child_was_born_in(countryName, cityName): query = (db.City.Name == cityName) (db.City.Country.belongs(db. Country.Name == countryName)) db(query).update(Population=db.City.Population + 1) Obviously I could have done it this way, but I wanted an example that I can use to illustrate the differences. Your suggestion is a circumstantial optimization to an example that is meant to show something else. Arnon, we have repeatedly asked you to offer use cases where the ORM will be either easier or more efficient than the DAL. In this case, you have concocted an example that makes it appear that the ORM has advantages, but only because you have used the DAL in the least efficient way possible. If your requirement in making comparisons is that the DAL must be required to do everything in the same fashion as the ORM, then this is nonsensical. The question should be whether the DAL can achieve the same outcome just as easily (i.e., code that is similarly easy to produce, understand, test, debug, etc.) and with similar efficiency. There's no reason the DAL code should therefore have to superficially resemble the ORM code or precisely replicate its operations. Now, you might argue that this is just a bad example, and that there is some real-world example where the web2py DAL will be forced to do something like you have coded here, and that's where the ORM will have an advantage. If that's the case, then you should have no problem presenting such an example. If you cannot do so, then it is hard to take this seriously. So, in order to do the update, we do not first have to query the database to retrieve the record. This is actually an advantage over the ORM, which requires that you first retrieve the record before updating it. That is not true. I could have done it otherwise - this is just for the sake of the example. There is nothing architecturally preventing an ORM from issuing an update without a select. It's an supplementation detail, not an architectural one. You may have encountered ORMs that can't support that, but that doesn't mean that the problem is in the architecture. As far as I can tell, if you want to update a record in SQLA, you must first retrieve it from the database (if using the ORM). If you believe that is not the case, then please show the code for how you would do this in the ORM. As for the architectural issue, I suppose in principle an ORM could update a record without a prior select, but then you lose the other benefits of the ORM, as you will not have an object representation of that updated record in Python. At this point, you might as well have a DAL. In other words, if both the ORM and the DAL handle this case via direct database updates, then this example is irrelevant for establishing the supposed advantages of an ORM. Furthermore, because web2py doesn't need to retrieve the records, it also has a processing and memory advantage over the ORM, which must create the record object, add it to the session, and hold it in memory. That again is a circumstantial issue, pertaining to this example. In most cases, there would be more reuse of the objects, so creating them would be beneficial. In web2py, you create the objects when you need them. You do not always need them. For example, if you are updating a record, you might retrieve it in one request, then update it in a second request -- no need for another retrieval in the second request (unless you want to check for intervening changes). Also, I don't see how in web2py's DAL it is any different - all of the objects you are using in the query/update are objects that has to be created for you to use them... No, in web2py, you do not need to create a Row object in order to update a record in the database. In fact, in an ORM, only the objects that are needed for the query may be created for each transaction, but due to how web2py is executing, you are actually having to create the entire-schema of objects from scratch at every request, so I can't see how web2py would create less-objects - it actually would create more... Not sure what you're talking about here. If you are talking about defining tables in web2py at every request, it is not required that you define all the tables for the entire schema -- you can define conditionally depending on what you need. If you are talking about creating instances of Row objects, then you are mistaken -- only those you explicitly create are created, and exactly when you explicitly create them. Now, lets say we want to optimize that, so we do a Lazy version of those functions. There's not much to optimize here. If you don't know ahead of time that you will be making two updates to the same record (which may possibly negate each other), No. My point here was that a lazy-query might be beneficial in some cases, so it could be integrated into the functions, at least
[web2py] Re: ORM (?) : A Revisit, NOT a Rebuttal
In web2py we assume the code within one transaction is executed sequentially. When you say being used by different contexts within the same transaction you seem two assume different threads may act on data within the same transaction. I think going that rout is dangerous and should not be encouraged. On Friday, 3 May 2013 18:41:47 UTC-5, Arnon Marcus wrote: Here is a use-case that can benefit from an ORM. Let's say we have 2 functions that manipulate the same field in 2 different functions. Here is how it would be done using the DAL: def a_child_was_born_in(countryName, cityName): city = db.Country(Name=countryName).City(Name=cityName).select().first () city.update_record(Population=city.Population + 1) def a_person_has_died_in(countryName, cityName): city = db.Country(Name=countryName).City(Name=cityName).select().first () city.update_record(Population-city.Population - 1) Now, let's say that both functions are being used by different contexts within the same transaction (hypothetically, say, from some different functions, way deep in the call stack). # In context 1: a_child_was_born_in('France', 'Paris') ... # In context 2: a_person_has_died_in('France', 'Paris') This would issue 4 round-trips to the database - 2 selects and 2 updates. Now, lets say we want to optimize that, so we do a Lazy version of those functions. How would we go about doing that? Well, we could replace the .update_record with an .update. def a_child_was_born_in(countryName, cityName): city = db.Country(Name=countryName).City(Name=cityName).select().first () city.update(Population=city.Population + 1) def a_person_has_died_in(countryName, cityName): city = db.Country(Name=countryName).City(Name=cityName).select().first () city.update(Population-city.Population - 1) Would that work? Well, let's see, assuming the initial population value of Paris, is 2 million. When a child is born, the value would get incremented locally. But the Row object of the 'city' variable is not persisted in memory when the functions return. So we need to commit the transaction after each call. But wait a minute, that would get us back to 4 operations... Might as well leave the update_record the way it was. What do we do? Well, we could make the laziness optional, and call the first one eagerly and the second lazily. Yes, we would need to keep track of our ordering of calling them, but if we do it right, we could get it down to 3 operations (2 selects and one update). Would that work? Well, no, because then we would loose the second update once the second function returns... Can we still do something? Well, yes, we can activate caching on the City field, so it's internal-values would survive across transactions - given that we give the cache a long-enough time-out. This may not help us in the updates, but it could nock-off a the second query (he select operation in the second function) So the best we get is 3 operations - 1 select and 2 updates. Now, here is the same code, using an ORM: def a_child_was_born_in(countryName, cityName): city = Country(Name=countryName).City(Name=cityName) city.Population += 1 def a_person_has_died_in(countryName, cityName): city = Country(Name=countryName).City(Name=cityName) city.Population -= 1 The syntactic difference is small, but the semantic implication is profound. The automatic cache-mechanism in the ORM will detect that we are querying the same record, and so would not query the database in the second function - just return the same object already in memory. So now we're down to 3 actual operations - 1 select and 2 updates. But it doesn't stop there... In the DAL case, we cached the values inside the city field, but the 'city' variable in the first function, is still a separate Rows object from the 'city' object in the second function, so we couldn't do Lazy updates. But an ORM can have an Identity Mapper, that would make sure they the same object would be returned, It would be bound to two different name-spaces, but it would be the same object. Now we could implement a Truely lazy update. The increment that is done in the first function, would be reflected in the second one, because the same object would be returned, So now we're sown to 2 operations - one select, and one update - the update would automatically be issued for us at transaction-commit time, as it would be balled pending by the time it get's there, using the Unit-of-Work pattern.. But it doesn't have to even stop there... The Unit-of-Work mechanism has this dirty label, which signifies that the current value within a record-object has different value from the one in the database. Now, it may be implemented poorly, and just get flagged as dirty on any update to it, or it could store the original value, and have the dirty-check deferred to the last minute - in which
[web2py] Re: ORM (?) : A Revisit, NOT a Rebuttal
However, this is also true in SQLA -- if you make a separate query that happens to retrieve a record you have previously updated in the session, the previous update will first be flushed to the database before the later query -- so you still end up with the same number of updates as in web2py. Not i you deactivate auto-flushing. Yes, but it's on by default for a reason. You have to be careful about when you disable auto-flushing, or otherwise switch to manual flushing everywhere. You will not get the benefit of auto-flushing for these operations, obviousely, because you turned it off explicitly. But it is EXACTLY in order to GAIN a benefit of Lazy-Updates ACROSS NAME-SAPCES - that you temporarily turned it odd - which is something that web2py CAN NOT DO. Ultimately, get a SINGLE-DATABASE hit in this example, whereas using web2py you would get TWO hits. Well, 3 database hits for the ORM, counting the 2 selects. But how do you know you can turn off auto-flushing in this example? What if a record was inserted in between the first function call and the second, and that's the record that needs to be updated in the second function? In that case, the second call will fail to make its update. Anyway, if you have a scenario like this where you know you're likely to be making multiple updates to the same record within the same web request, but you have to retrieve that record via separate queries in different parts of your code, I suppose you could build a basic identity map that stores queried rows by PK and does a lookup upon subsequent queries, just like SQLA. This doesn't require an ORM. I'm not sure you would want to do it automatically all the time, though, as there would be memory and processing overhead associated with keeping everything in such a structure, which would not be used most of the time. Also, it's true that in SQLA, if you do two separate queries that happen to retrieve some of the same records, it will still only hold one copy of each unique record object in memory. But it still needs to pull the duplicate data from the database and therefore hold it in memory for some time before releasing For this example, yes, but it is worth is because you are saving a round-trip to the database, when compared to web2py. You claimed that keeping only one copy of the object means you save memory -- I was just pointing out that at least the data used to create the object does in fact get duplicated in memory for a time. Furthermore, SQLA will keep *all* objects in memory throughout the session, whereas web2py may release objects when they go out of scope and are no longer needed. But the Unit-Of-Work pattern, in conjunction with the Identity-Map, then saves you ANOTHER hit to the database, in this case, when it is concluded that no change is needed, THAT you CAN NOT do in web2py in this case, since the two Rows object do not know about each other, and so the back-to-previouse-value check thing can not be detected. Does SQLA do a back-to-previous-value check -- I haven't seen anything about that? Anyway, at best that gets you down to 2 database hits, assuming you can solve the flushing problem (see above). Anthony -- --- You received this message because you are subscribed to the Google Groups web2py-users group. To unsubscribe from this group and stop receiving emails from it, send an email to web2py+unsubscr...@googlegroups.com. For more options, visit https://groups.google.com/groups/opt_out.
[web2py] Re: ORM (?) : A Revisit, NOT a Rebuttal
No, I am not assuming that. Different contexes can, and actually often do, exisr within a serially run single-thrrad. The difference in contexts lie in the different namespaces being generated by the mere action of calling a function. I am not edorcing multythreading by any means, but even if I was, that kind of worrie would still be unjustified, as the different thread would use a different transaction. -- --- You received this message because you are subscribed to the Google Groups web2py-users group. To unsubscribe from this group and stop receiving emails from it, send an email to web2py+unsubscr...@googlegroups.com. For more options, visit https://groups.google.com/groups/opt_out.
[web2py] Re: ORM (?) : A Revisit, NOT a Rebuttal
No, I am not assuming that. Different contexes can, and actually often do, exisr within a serially run single-thrrad. The difference in contexts lie in the different namespaces being generated by the mere action of calling a function. I am not edorcing multythreading by any means, but even if I was, that kind of worrie would still be unjustified, as the different thread would use a different transaction. -- --- You received this message because you are subscribed to the Google Groups web2py-users group. To unsubscribe from this group and stop receiving emails from it, send an email to web2py+unsubscr...@googlegroups.com. For more options, visit https://groups.google.com/groups/opt_out.
[web2py] Re: ORM (?) : A Revisit, NOT a Rebuttal
No, I am not assuming that. Different contexes can, and actually often do, exisr within a serially run single-thrrad. The difference in contexts lie in the different namespaces being generated by the mere action of calling a function. I am not edorcing multythreading by any means, but even if I was, that kind of worrie would still be unjustified, as the different thread would use a different transaction. -- --- You received this message because you are subscribed to the Google Groups web2py-users group. To unsubscribe from this group and stop receiving emails from it, send an email to web2py+unsubscr...@googlegroups.com. For more options, visit https://groups.google.com/groups/opt_out.
[web2py] Re: ORM (?) : A Revisit, NOT a Rebuttal
Now, you might argue that this is just a bad example, and that there is some real-world example where the web2py DAL will be forced to do something like you have coded here, and that's where the ORM will have an advantage. If that's the case, then you should have no problem presenting such an example. If you cannot do so, then it is hard to take this seriously. My difficulty/reluctance to finding more/better examples, are a testament to my poor expertise Nothing more, nothing less. As far as I can tell, if you want to update a record in SQLA, you must first retrieve it from the database (if using the ORM). If you believe that is not the case, then please show the code for how you would do this in the ORM. I am already stretched way beyond my expertise, here, which is an interesting exerciser, but I can't stretch that to the point you are asking me to. I see no logical-fault in my suggestion, so irrespective of where this logic is implemented, irrespective of weather it is implemented in SQLA, and irrespectively of my ability to implement a prototype that would do that, the logic appears to be sound. There is no logical-reason you could not generate an object-graph out of the query you are constructing, Only the benefits of that are questionable, and that may be extremely circumstantial and complicated to benchmark, so U am not going to do that. There are other people (presumably in this group as well), who are much more proficient and experienced in doing so. I will elaborate on where I think my logic is sound, further on, and show you what you are missing. As for the architectural issue, I suppose in principle an ORM could update a record without a prior select, but then you lose the other benefits of the ORM, as you will not have an object representation of that updated record in Python. You are assuming a circumstancial-implementation that is not doing what I have suggested. If the update operation would generate an object-graph, representing what it is assuming is existing in the database schema wise, then I see no reason why it should not be able to store the data tha it is udpating (or even inserting). Remember, we are talking about a context of an atomic transaction, so an ORM is logically safe in assuming that the transaction will complete successfully - it's called optimistic updates. The reason for this safety, is that the object-graph that is being created in the transaction is being tracked, so a rollback, in an event of transaction commit-failure, is going to dump these object anyways... More so, the ORM is aware of the database schema, so it could infet the validity of the structure of the object-graph it would generate. At this point, you might as well have a DAL. In other words, if both the ORM and the DAL handle this case via direct database updates, then this example is irrelevant for establishing the supposed advantages of an ORM. True for SQLA. Not true in what I am proposing. In web2py, you create the objects when you need them. You do not always need them. For example, if you are updating a record, you might retrieve it in one request, then update it in a second request -- no need for another retrieval in the second request (unless you want to check for intervening changes). What I meant by that is that this example is not showing much reuse within the same transaction, so your argument that object-creation, may not be that beneficial, is valid in this example. However, the whole point in having an ORM, is for cases you DO have many reuses of record-objects within a transaction. In these cases, an ORM is more beneficial. In other cases, thr DAL would be more beneficial. The mere fact that even with an ORM, there would still exist circumstances in which it would be less efficient, is not a testament to the ORM's uslessness... By that logic, the mere existence of circumstances in which walking would be more efficient than driving a car, is somehow a testament to a car's uselessness... For circumstances in which the DAL would be more efficient - by all means use it - it is not going anywhere (!) That's what SQLA users are doing (at leas I hope...) Again, it is only a testament to the poorness of my example - that's it. No, in web2py, you do not need to create a Row object in order to update a record in the database. I meant the objects that the DAL is creating for it's internal functionality - on the fly - they are akin to the objects that an ORM would create to implement it's functionality - on the fly... The fact that an ROM would create slightly more record-objects than web2py would create, it not necessarily a bad point for an ORM - again, you can dissect any framework and find inefficiencies linke that everywhere, if you like - it may mean something and may not - there are multiple trade-off being taken in multiple levels, for the greater good so to speak - you need to
Re: [web2py] Re: ORM (?) : A Revisit, NOT a Rebuttal
Sorry, but what should do .list() ? it is a query? it is pre-fetched or cached? The whole point of having an ORM, is for you to not have to worry about the answer to that question - it's gonna do the optimal thing it can, based on the circumstances in which it is called. As I described, like 10 times already, ORMs are statefull - they have cache-management built-in. The question of how these cache-management is working, and so whether or not you can trust it, is a long and complex question. Suffice it to say, that there has been more than a decade of research into this issue, so it is solved. Just as example, if the list has objects in it, it will check that what it has is up-to-date. If it is called within the context of a transaction that is in progress, and it had already did a query for that beforehand within the same transaction, than by ACID's C of Consistency rule, the code should assume that it's cached-data is up-to-date. If between the previously-called query, and this call, a transaction was ended and a new one started, then according to ACID's C of Consistency rule, the code should not assume that the data is up-to-date, and issue a query. ORMs like SQLA's ORM do a 'Unit-Of-Work' patterns, in which these caches are managed for you. Any transaction-commit invalidates the caches, so you can guarantee consistency across transactions. What about if I need the cities of countries where main language is not Spanish, and population is above 1 millon? ORMs allow you to attache filters for that, it really depends on the implementation, but let's think of some options, shell we? The most straight-forward approach I would suggest, is the layered-approach - meaning, the ORM should use the DAL for that internally. It should allow you to construct the filters yourself using the DAL, and have mechanisms to facilitate attaching of filters to these attributes. We can go into a discussion of how it will look like in an ORM in web2py in each of the options - which would be the fun-part, I think - so let's see: First, let's extend the schema and ORM example-definitions to include that: db.define_table('Language', Field('Name', 'String')) db.define_table('Continent', Field('Name', 'string')) db.define_table('Country', Field('Name', 'string'), Field('Continent', db.Continent)) db.define_table('City', Field('Name', 'string'), Field('Country', db.Country), Field('Language', db.Language), Field('Population', 'Integer')) @ORM(db.Language) class Language: pass @ORM(db.City) class City: pass @ORM(db.Country) class Country: pass @ORM(db.Continent) class Continent: pass ... spanish = Language(Name='Spanish') french = Language(Name='French') france.City.Language = french ... Then, we could devise to do any number of things: We can build basic stuff like .isGraterThan() into the ORM classes, and do: ... [city for city in fance.City.list if \ city.Language is not spanish and city.Population.isGraterThan(100)] [City Name:Paris] It almost reads like plain English... Beautiful (!) Alternatively, we could have a set of operators that we can give to a filter function: ... from ORM.operators import not, moreThan france.City.where( Language=not(spanish), Population=moreThan(100)) [City Name:Paris] Lastly, we could let the developer filter objects using the DAL indirectly: ... france.City.FilterBy( (City.Languase != spanish) (City.Population 100)) [City Name:Paris] Please, note that you are mixing a declarative query syntax (DAL), with an imperative one (ORM) No, I am LAYERING imperative-syntax on-top of declarative one. I would like a elegant declarative syntax similar to LINQ, but in python: [city for db.City.ALL in db.City, db.Country if db.Country.Name == 'France' and db.Country.id == db.City.country] My first suggestion was way more readable. Sadly, there is NO possible way to archive a pythonic list comprenhension syntax that query the objects in the server side AFAIK Not sure what you mean... Some libraries uses dis (python disassembler) and other internal dirty python hacks to do something similar, please, see: http://www.aminus.net/geniusql/ http://www.aminus.org/blogs/index.php/2008/04/22/linq-in-python?blog=2 http://www.aminus.net/geniusql/chrome/common/doc/trunk/managing.html Note that web2py is much closer to the pythonic expressions, but without the early / late binding and other issues described there ;-) I will look into that. This would require caching or storing previous queried record in memory (something like a singleton), or is will not work as you are expecting (it is for checking identity)... Exactly! This is exactly what SQLA is already doing - I explained that before - it uses what it calls 'Identity Mapping' internally: http://www.youtube.com/watch?v=woKYyhLCcnU#t=6382shttp://www.youtube.com/watch?v=woKYyhLCcnU ** Time-coded link - watch the
[web2py] Re: ORM (?) : A Revisit, NOT a Rebuttal
Just to clarify, again, the following syntax: [city for city in fance.City.list if \ city.Language is not spanish and city.Population.isGraterThan(100)] Is not a pie-in-the-sky dream-API - there are implementational details that are viable for each section. First, the reason the is not would work, is that there would exist an implementation of 'Identity Mapping that would take care of having a singleton for each ORM-object representing each row. Second, the access to france.City.list is viable, as an attribute-access is totally customization in python, so we can devise anything we want for the access to do. For example, it could return an iterator, if the City attribute of the france ORM-object is valid (meaning, it is cached and not invalidated by a previous transaction-commit), and do a query right-there on the spot, and return an iterator, if the City attribute of the france ORM-object is invalid at the time of the list attribute access. Lastly, the .isGraterThan() can easily be implemented in the ORM-class that would be generated out of the @ORM() class-decorator. It is used for 3 purposes: 1. Readability 2. Beauty. 3. Conciseness. When comparing this: city.Language is not spanish and city.Population.isGraterThan(100) With this: db.Country.Name http://db.country.name/ == 'France' and db.Country.idhttp://db.country.id/ == db.City.country The Zen of Python comes to mind: Beautiful is better than ugly. -- --- You received this message because you are subscribed to the Google Groups web2py-users group. To unsubscribe from this group and stop receiving emails from it, send an email to web2py+unsubscr...@googlegroups.com. For more options, visit https://groups.google.com/groups/opt_out.
Re: [web2py] Re: ORM (?) : A Revisit, NOT a Rebuttal
The whole point of having an ORM, is for you to not have to worry about the answer to that question - it's gonna do the optimal thing it can, based on the circumstances in which it is called. Has anyone given you any Kool-Aid to drink recently? The question of how these cache-management is working, and so whether or not you can trust it, is a long and complex question. Suffice it to say, that there has been more than a decade of research into this issue, so it is solved. Famous last words. What about if I need the cities of countries where main language is not Spanish, and population is above 1 millon? ORMs allow you to attache filters for that, it really depends on the implementation, but let's think of some options, shell we? db.define_table('Language', Field('Name', 'String')) db.define_table('Continent', Field('Name', 'string')) db.define_table('Country', Field('Name', 'string'), Field('Continent', db.Continent)) db.define_table('City', Field('Name', 'string'), Field('Country', db. Country), Field('Language', db.Language), Field('Population', 'Integer')) spanish = Language(Name='Spanish') french = Language(Name='French') france.City.Language = french What does france.City.Language = french do? france.City refers to all cities in France, so does this assign French as the language for all cities? Does SQLA employ that syntax? In web2py, this would be: french = db.Language.insert(Name='French') db.Country(france).City.update(Language=french) Looks a lot like your example (maybe even a bit more explicit about what's going on). (Note, Mariano's example actually assumed language to be a country-level field.) [city for city in fance.City.list if \ city.Language is not spanish and city.Population.isGraterThan(100)] [City Name:Paris] The problem here is that you are doing all the filtering in Python rather than in the database. Not a big deal in this example, but with a large initial set of records, this is inefficient because you will return many unnecessary records from the database and Python will likely be slower to do the filtering. It almost reads like plain English... Beautiful (!) There are differing opinions on this. Some prefer symbols like != and over plain English like is not and isGreaterThan because it is easier to scan the line and quickly discern the comparisons being made. In particular, I would much prefer both typing and reading rather than .isGreaterThan(). france.City.where( Language=not(spanish), Population=moreThan(100)) france.City.FilterBy( (City.Languase != spanish) (City.Population 100)) In web2py, you can already do: db.Country(france).City( (db.City.Language != spanish) (db.City.Population 100)).select() So far, for every example you have shown, web2py has fairly similar syntax. Sure, we can quibble over which is more beautiful, but I haven't seen anything to justify building a massive ORM. And the downside of adding an ORM (aside from the time cost of development and maintenance) is that you would then have to learn yet another abstraction and syntax. That could be also archived hacking Row, but you should use == in python for this kind of comparision (equality) Not is SQLA :) Well, if you're actually making such a comparison, you generally would only be interested in ==, not is -- it just so happens that is would also be True in SQLA because of the identity mapping. Web2py *could *do that too. Sure, but it could do it at the DAL level, without an ORM. So far, though, you haven't made a strong case for the utility of such a feature (I don't doubt that it can be helpful -- the question is how helpful). An ORM is MUCH MORE intuitive, as my examples had demonstrated Maybe I missed an example, but I don't think I saw any that were more intuitive at all, let alone MUCH MORE intuitive. Anthony -- --- You received this message because you are subscribed to the Google Groups web2py-users group. To unsubscribe from this group and stop receiving emails from it, send an email to web2py+unsubscr...@googlegroups.com. For more options, visit https://groups.google.com/groups/opt_out.
[web2py] Re: ORM (?) : A Revisit, NOT a Rebuttal
When comparing this: city.Language is not spanish and city.Population.isGraterThan(100) With this: db.Country.Name http://db.country.name/ == 'France' and db.Country.idhttp://db.country.id/ == db.City.country Well, we should compare your first line to the DAL equivalent: (db.City.Language != spanish) (db.City.Population 100) I actually find that latter easier to process. The parentheses and make it easier to see there are two separate conditions, and the != and are easier to pick out and comprehend than is not and .isGreaterThan(). A non-programmer may have an easier time with the more English-like version (assuming they happen to speak English, of course), but I think it's reasonable to expect even novice programmers to understand the basic boolean operators. Whatever your opinion on the beauty of one over the other, though, surely this doesn't justify the massive undertaking of building an ORM, particularly since you would still have to know and use the underlying DAL syntax in addition anyway. Anthony -- --- You received this message because you are subscribed to the Google Groups web2py-users group. To unsubscribe from this group and stop receiving emails from it, send an email to web2py+unsubscr...@googlegroups.com. For more options, visit https://groups.google.com/groups/opt_out.
Re: [web2py] Re: ORM (?) : A Revisit, NOT a Rebuttal
Am Freitag, 3. Mai 2013 13:44:20 UTC+2 schrieb Arnon Marcus: Sorry, but what should do .list() ? it is a query? it is pre-fetched or cached? The whole point of having an ORM, is for you to not have to worry about the answer to that question - it's gonna do the optimal thing it can, based on the circumstances in which it is called. Here's the problem. In a perfect world you don't have to worry about what's going on behind the scenes. In a real application you need to know what exactly the ORM is doing, otherwise you'll run into problems sooner or later. I had to work with an ORM (Hibernate) for many years. The ORM is very convenient but the magic which is going on in the background will cause you trouble (at least performance-wise). I'm really happy to use web2py and the DAL for my own project, I always know what exactly is going on and I never encountered a strange bug where I could not figure out what's going on. Not so with the ORM where I lost many hours debugging unexcepted behavior. In my opinion it does not make sense to define how references are fetched on an object level (at least that's how it is done in Hibernate, don't know about SQLA). E.g. when I query a City object I can't say in advance if I always want to access the Country reference as well. In Hibernate single references have eager loading so they are always fetched. That's bad because when you don't need the reference there is unnecessary data fetched (and of course this is recursive, e.g. if Country has a single reference itself this will also be loaded). With lazy loading you often run into the problem that the session or transaction is not active anymore and the references cannot be accessed (just google for Lazy Loading Exception). Therefor we often loaded all those references when the object is queried to avoid running into a lazy loading exception later on - which is of course also not very good for performance. With the DAL I say exactly what I need (joining the tables I really need for my use case) and when, that's so much better and easier than depending on some ORM magic. For me all the added internal complexity of an ORM is not worth the effort. The complexity is just shifted to a different layer. Maybe the videos and presentations of SQLA look nice and promising but you'll only find out about the disadvantages (which are not mentioned in the videos of course) once you develop a real world application. Alex -- --- You received this message because you are subscribed to the Google Groups web2py-users group. To unsubscribe from this group and stop receiving emails from it, send an email to web2py+unsubscr...@googlegroups.com. For more options, visit https://groups.google.com/groups/opt_out.
Re: [web2py] Re: ORM (?) : A Revisit, NOT a Rebuttal
Has anyone given you any Kool-Aid to drink recently? I have no idea what you mean by that... Famous last words. I have no idea what you mean by that... What does france.City.Language = french do? france.City refers to all cities in France, so does this assign French as the language for all cities? Does SQLA employ that syntax? Ooops, cought me there... :) Obviously I meant: fance.City.Paris.Language = french The .Paris part can be implemented via* __getAttr__* The .Language = french part can be implemented via a *Property*. It would obviously treat the primary-key internally, for relationships - but that's a pluming-level thing so it's implicit as a convenience layer. But I like your idea - it could be implemented also! :) In web2py, this would be: french = db.Language.insert(Name='French') db.Country(france).City.update(Language=french) Looks a lot like your example (maybe even a bit more explicit about what's going on). There is a hidden fundamental difference - in my approach it is actually making object-references (alongside the database insertion) so that can be used further-on in the code, at the very leas within the same transaction, but even across transactions. Again, this is a domain-model, not just a a database-model. Internally, the values may get invalidated on transaction-sommit, but the object-references would persist within the current runtime. Again, you have to break-away the stateless mind-set and appreciate statefullness. This assumes a very different execution-model than what you are used to in web2py. It is something that would happen within a *module*, not within a *controller-action,* so it is saved within the module-object across transactions/requests/sessions. (Note, Mariano's example actually assumed language to be a country-level field.) [city for city in fance.City.list if \ city.Language is not spanish and city.Population.isGraterThan(100)] [City Name:Paris] The problem here is that you are doing all the filtering in Python rather than in the database. Not a big deal in this example, but with a large initial set of records, this is inefficient because you will return many unnecessary records from the database and Python will likely be slower to do the filtering. Well, that depends a lot on the context in which this is executed. If you have many queries similar to your example, before/around this line of code, that may reuse the same objects for other purposes (which is not uncommon) It may in fact be slower to do it your way in many circumstances, because every specialized-query you are making is another round-trip to the database, which would be orders-of-magnitude slower than doing an eager-loading up-front, and filtering in python. Also, you need to keep in mind that this is assuming a long-lasting set of objects that out-live a single transaction-operation. It almost reads like plain English... Beautiful (!) There are differing opinions on this. Some prefer symbols like != and over plain English like is not and isGreaterThan because it is easier to scan the line and quickly discern the comparisons being made. In particular, I would much prefer both typing and reading rather than .isGreaterThan(). You are right - there are different opinions, but the Zen of Python is conclusive. :) Also, there are both performance AND memory benefits to using is not. An object-id check is much faster that an equality check, and having the same object referenced by different names instead of having copies of it that need to be equality-tested, may save tons of memory. But if you insist in using an ugly form, than in my example you may still do that - it would work just as well - while having the same memory-footprint benefits, just not the performance-benefits. :) france.City.where( Language=not(spanish), Population=moreThan(100)) france.City.FilterBy( (City.Languase != spanish) (City.Population 100)) In web2py, you can already do: db.Country(france).City( (db.City.Language != spanish) (db.City.Population 100)).select() So far, for every example you have shown, web2py has fairly similar syntax. But with radically-different semantics (!!!) Sure, we can quibble over which is more beautiful, but I haven't seen anything to justify building a massive ORM. It is not just more beautiful - it is also faster for long-transactions, and takes less memory. Also it results in developer code being much more readable and concise and hence much more maintainable. And the downside of adding an ORM (aside from the time cost of development and maintenance) is that you would then have to learn yet another abstraction and syntax. This doesn't deter people from using SQLA - how do you account for that? Well, if you're actually making such a comparison, you generally would only be interested in ==, not is -- it just so happens that is would
[web2py] Re: ORM (?) : A Revisit, NOT a Rebuttal
I actually find that latter easier to process. The parentheses and make it easier to see there are two separate conditions, and the != and are easier to pick out and comprehend than is not and .isGreaterThan(). A non-programmer may have an easier time with the more English-like version (assuming they happen to speak English, of course), but I think it's reasonable to expect even novice programmers to understand the basic boolean operators. Whatever your opinion on the beauty of one over the other, though, surely this doesn't justify the massive undertaking of building an ORM, particularly since you would still have to know and use the underlying DAL syntax in addition anyway. Anthony Again: There are both performance AND memory benefits to using is not. An object-id check is much faster that an equality check, and having the same object referenced by different names instead of having copies of it that need to be equality-tested, may save tons of memory. But if you insist in using an ugly form, than in my example you may still do that - it would work just as well - while having the same memory-footprint benefits, just not the performance-benefits. :) -- --- You received this message because you are subscribed to the Google Groups web2py-users group. To unsubscribe from this group and stop receiving emails from it, send an email to web2py+unsubscr...@googlegroups.com. For more options, visit https://groups.google.com/groups/opt_out.
Re: [web2py] Re: ORM (?) : A Revisit, NOT a Rebuttal
I know a lot of people have been burned in production by many ORMs in the past - and those scares are what anyone suggesting an ORM today has to deal with. SQLA is doing a very good job at that - have you seen the videos I posted here? They are talking about Hibernate, as well as the Active-Record pattern. The guy who wrote it has mentioned all of the worries you are talking about in those videos. For example: Eager-vsLazy loading is configurable - both are supported, and you can use each in you core, depending on circumstances. As I said in the previouse message (I updated it since you posted this one). it depends - the trande-offs are circumstancial - that's why you need both approaches, and the ability to concigure each object to use one or the other in different circumstances - and that's what SQLA provides. As for recursive-queries, it does not occur in eager-loading in SQLA. SQLA features what it called a cascade, which means it traverses through the attribute-accesses, and only fetches what you actually asked for (well, it's not really traversing, there are events that are triggered on each object's attribute-access...) this is for both lazy and eager loading configurations. The cascade makes sure that you only get what you explicitly asked for, no more, no less. As for Magic, he is talking about that also. SQLA is not hiding it's internals like many other ORMs are doing - it is highly configurable, with sane-defaults, so you can go as deep as you like and re-configure things the way you like. In fact, this is not only possible, but rather advised explicitly - at some areas it is even mandatory. The explicit-support for configurability comes in the form of tools for automating manual-configuration, through meta-classes and mix-ins. -- --- You received this message because you are subscribed to the Google Groups web2py-users group. To unsubscribe from this group and stop receiving emails from it, send an email to web2py+unsubscr...@googlegroups.com. For more options, visit https://groups.google.com/groups/opt_out.
Re: [web2py] Re: ORM (?) : A Revisit, NOT a Rebuttal
Has anyone given you any Kool-Aid to drink recently? I have no idea what you mean by that... http://en.wikipedia.org/wiki/Drinking_the_Kool-Aid Famous last words. I have no idea what you mean by that either... http://idioms.yourdictionary.com/famous-last-words french = db.Language.insert(Name='French') db.Country(france).City.update(Language=french) Looks a lot like your example (maybe even a bit more explicit about what's going on). It is not more explicit - it has the same level of explicitness of what's going on - just less pluming-level explicitness. More explicit because it makes it clear we are doing an update. But the important distinction is that there is a hidden fundamental difference - in my approach it is actually making object-references (alongside the database insertion) so that can be used further-on in the code, You can save the object and refer to it later in the code in web2py as well. Again, you have to break-away the stateless mind-set and appreciate statefullness. I understand the distinction. You simply haven't yet demonstrated a compelling use case for the latter. This assumes a very different execution-model than what you are used to in web2py. It is something that would happen within a *module*, not within a *controller-action,* so it is saved within the module-object across transactions/requests/sessions. No, in web applications, SQLA sessions last only as long as a single web request -- basically the same as in web2py. [city for city in fance.City.list if \ city.Language is not spanish and city.Population.isGraterThan(100)] [City Name:Paris] The problem here is that you are doing all the filtering in Python rather than in the database. Not a big deal in this example, but with a large initial set of records, this is inefficient because you will return many unnecessary records from the database and Python will likely be slower to do the filtering. Well, that depends a lot on the context in which this is executed. If you have many queries similar to your example, before/around this line of code, that may reuse the same objects for other purposes (which is not uncommon) It may in fact be slower to do it your way in many circumstances, because every specialized-query you are making is another round-trip to the database, which would be orders-of-magnitude slower than doing an eager-loading up-front, and filtering in python. Yes, and in that case, you can do something exactly like your code above in web2py (i.e., filtering in Python) -- so what? Also, you need to keep in mind that this is assuming a long-lasting set of objects that out-live a single transaction-operation. Not typically. In most cases, you will probably have one transaction per request in both SQLA and web2py. Furthermore, even if you have multiple transactions within a request, SQLA will expire the state of any instances whenever a transaction is committed. Also, bare in mind that I am not suggesting to replace the DAL, only to augment it with a statefull layer on-top. Why not augment it with statefulness at the DAL level? Why do you need a layer on top? The benefits/trade-offs are not absolute/constant - they vary circumstantially. Yes, it would help to understand the circumstances in which your preferred features offer substantial benefits. It almost reads like plain English... Beautiful (!) There are differing opinions on this. Some prefer symbols like != and over plain English like is not and isGreaterThan because it is easier to scan the line and quickly discern the comparisons being made. In particular, I would much prefer both typing and reading rather than .isGreaterThan(). You are right - there are different opinions, but the Zen of Python is conclusive. :) Where in the Zen of Python does it say that English words are more beautiful than boolean operators when expressing boolean logic? Also, there are both performance AND memory benefits to using is not. An object-id check is much faster that an equality check, and having the same object referenced by different names instead of having copies of it that need to be equality-tested, may save tons of memory. If you're talking about building queries, your point is moot -- the operations happen in the database, not Python. As for comparisons in Python, in web2py, you wouldn't be testing equality of a whole object/record -- typically it would be a scalar (e.g., the integer ID). And you wouldn't have multiple copies of records in memory either. france.City.where( Language=not(spanish), Population=moreThan(100)) france.City.FilterBy( (City.Languase != spanish) (City.Population 100)) In web2py, you can already do: db.Country(france).City( (db.City.Language != spanish) (db.City.Population 100)).select() So far, for every example you have shown, web2py has fairly
Re: [web2py] Re: ORM (?) : A Revisit, NOT a Rebuttal
Sadly, there is NO possible way to archive a pythonic list comprenhension syntax that query the objects in the server side AFAIK Some libraries uses dis (python disassembler) and other internal dirty python hacks to do something similar, please, see: http://www.aminus.net/geniusql/ http://www.aminus.org/blogs/index.php/2008/04/22/linq-in-python?blog=2 http://www.aminus.net/geniusql/chrome/common/doc/trunk/managing.html Note that web2py is much closer to the pythonic expressions, but without the early / late binding and other issues described there ;-) I didn't know about this, but it's not what I meant. It might be a cool concept, but I'm not sure how viable it would be in production. I haven't read everything yet, I'll read it some other time. In my example, there was no magic involved, beyond what I've described: Just to clarify, again, the following syntax: [city for city in fance.City.list if \ city.Language is not spanish and city.Population.isGraterThan(100)] Is not a pie-in-the-sky dream-API - there are implementational details that are viable for each section. First, the reason the is not would work, is that there would exist an implementation of 'Identity Mapping that would take care of having a singleton for each ORM-object representing each row. Second, the access to france.City.list is viable, as an attribute-access is totally customization in python, so we can devise anything we want for the access to do. For example, it could return an iterator, if the City attribute of the france ORM-object is valid (meaning, it is cached and not invalidated by a previous transaction-commit), and do a query right-there on the spot, and return an iterator, if the City attribute of the france ORM-object is invalid at the time of the list attribute access. it was a simple list-comprehension. I assumes the data is there - if it isn't, it goes and fetches it. Here is a quote from one of the comments in your second link, that I really resonate with: I think the distinction between: Customer.select and source.Customers is quite important as the developer may be thinking quite differently in each. In the first, I am thinking that I'm grabbing objects from some store that is linked with the class Customer via an ORM. In the latter, I may think the same way, but I personally like to think of it as I've got this object that has many customers and now I can play with them. That collection may be backed in a database, or it may just be a collection. None of this is not already accomplished either in Python (generators or list comprehensions) or via the many community modules that address object-relational mapping or extend what itertools already gives you. However, what I can't find in Python is a unified API regardless of what the underlying mechanism of keeping my data is. To me, it doesn't matter if I'm calling into a for loop or generating a monster SQL statement. I'd rather write the same code regardless of which of these is going to happen under the hood. As for the performance-issue, I've later said this: Well, that depends a lot on the context in which this is executed. If you have many queries similar to your example, before/around this line of code, that may reuse the same objects for other purposes (which is not uncommon) It may in fact be slower to do it your way in many circumstances, because every specialized-query you are making is another round-trip to the database, which would be orders-of-magnitude slower than doing an eager-loading up-front, and filtering in python. Also, you need to keep in mind that this is assuming a long-lasting set of objects that out-live a single transaction-operation. And this: Eager-vs-Lazy loading is configurable - both are supported, and you can use each in you core, depending on circumstances. As I said in the previouse message (I updated it since you posted this one). it depends - the trande-offs are circumstancial - that's why you need both approaches, and the ability to concigure each object to use one or the other in different circumstances - and that's what SQLA provides. -- --- You received this message because you are subscribed to the Google Groups web2py-users group. To unsubscribe from this group and stop receiving emails from it, send an email to web2py+unsubscr...@googlegroups.com. For more options, visit https://groups.google.com/groups/opt_out.
[web2py] Re: ORM (?) : A Revisit, NOT a Rebuttal
I actually find that latter easier to process. The parentheses and make it easier to see there are two separate conditions, and the != and are easier to pick out and comprehend than is not and .isGreaterThan(). A non-programmer may have an easier time with the more English-like version (assuming they happen to speak English, of course), but I think it's reasonable to expect even novice programmers to understand the basic boolean operators. Whatever your opinion on the beauty of one over the other, though, surely this doesn't justify the massive undertaking of building an ORM, particularly since you would still have to know and use the underlying DAL syntax in addition anyway. Anthony Again: There are both performance AND memory benefits to using is not. An object-id check is much faster that an equality check, and having the same object referenced by different names instead of having copies of it that need to be equality-tested, may save tons of memory. But if you insist in using an ugly form, than in my example you may still do that - it would work just as well - while having the same memory-footprint benefits, just not the performance-benefits. :) OK, then, again: If you're talking about building queries, your point is moot -- the operations happen in the database, not Python. As for comparisons in Python, in web2py, you wouldn't be testing equality of a whole object/record -- typically it would be a scalar (e.g., the integer ID). And you wouldn't have multiple copies of records in memory either. -- --- You received this message because you are subscribed to the Google Groups web2py-users group. To unsubscribe from this group and stop receiving emails from it, send an email to web2py+unsubscr...@googlegroups.com. For more options, visit https://groups.google.com/groups/opt_out.
Re: [web2py] Re: ORM (?) : A Revisit, NOT a Rebuttal
Eager-vsLazy loading is configurable - both are supported, and you can use each in you core, depending on circumstances. As I said in the previouse message (I updated it since you posted this one). it depends - the trande-offs are circumstancial - that's why you need both approaches, and the ability to concigure each object to use one or the other in different circumstances - and that's what SQLA provides. In web2py, we have lazy loading for individual references and referencing sets, and when that is inefficient, we would just do an explicit join. I think an eager loading option on the individual references would be a cool addition, though that wouldn't require an ORM layer. As for recursive-queries, it does not occur in eager-loading in SQLA. SQLA features what it called a cascade, which means it traverses through the attribute-accesses, and only fetches what you actually asked for (well, it's not really traversing, there are events that are triggered on each object's attribute-access...) this is for both lazy and eager loading configurations. The cascade makes sure that you only get what you explicitly asked for, no more, no less. When you eager load in SQLA, it doesn't know ahead of time what attributes you will access, so it fetches everything (though in just one, or possibly two queries, depending on the eager method used). Lazy loading requires a query for each attribute accessed, just as in web2py. Anthony -- --- You received this message because you are subscribed to the Google Groups web2py-users group. To unsubscribe from this group and stop receiving emails from it, send an email to web2py+unsubscr...@googlegroups.com. For more options, visit https://groups.google.com/groups/opt_out.
[web2py] Re: ORM (?) : A Revisit, NOT a Rebuttal
If you're talking about building queries, your point is moot -- the operations happen in the database, not Python. I don't know what you mean by this... For complex-queries, I would still be using web2py's DAL layer, even if I had this ORM on top. I would either use them outside the ORM, or insert them into the ORM. My example was not meant to show that you should do complex queries in Python - that would obviously be absurd, and you don't even need a relational database for that - any NoSQL one would do fine. I was giving an example to a case in which I have a simple comparison to make, and want to reuse the ORM objects I already have for it. The DAL is not built for that. As for comparisons in Python, in web2py, you wouldn't be testing equality of a whole object/record -- typically it would be a scalar (e.g., the integer ID). That would still be much slower than an equality-test - especially for large data-sets. And you wouldn't have multiple copies of records in memory either. How so? If I do this: row1 = db.Country[1] ... row2 = db.Country[1] Would I then get: row1 is row2 == True ? How about: row1.Name is row2.Name == True ? I would be surprised to find out this is the case... -- --- You received this message because you are subscribed to the Google Groups web2py-users group. To unsubscribe from this group and stop receiving emails from it, send an email to web2py+unsubscr...@googlegroups.com. For more options, visit https://groups.google.com/groups/opt_out.
Re: [web2py] Re: ORM (?) : A Revisit, NOT a Rebuttal
In web2py, we have lazy loading for individual references and referencing sets, and when that is inefficient, we would just do an explicit join. I think an eager loading option on the individual references would be a cool addition, though that wouldn't require an ORM layer. We've been though this already, Anthony, Laziness in web2py has a different meaning - it uses the same word, but it means something completely different. For example, you statement: Lazy loading requires a query for each attribute accessed, just as in web2py. Is plain false. There is another mechanism that exists in SQLA that you are not accounting for, and it is statefullness - a caching-mechanism, using the 'Unit of Work' pattern. I don't know how many times and in how many forms I need to say this. In SQLA, using a LazyLoading configuration on an attribute, would mean that on the first time (within a transaction) that your code accesses this attribute, it would issue a query to the database. The Lazyness here is defined in terms of when you FIRST access the attribute, and NOT BEFORE - but as to what happens AFTER that, THEN the caching-mechanism kicks in - if the attribute has not been invalidated within the same transaction, then the LazyLoading is NOT EVEN ACTIVATED (!) You get the value IMMEDIATELY from the attribute-cache. EagerLoading is therefore a way to configure the attribute to be queried from the database EXPLICITLY, but even in THAT case, that would only apply ONCE within a transaction. Every subsequent accesses to that attribute would use the cached-value just as well. As for recursive-queries, it does not occur in eager-loading in SQLA. When you eager load in SQLA, it doesn't know ahead of time what attributes you will access, so it fetches everything (though in just one, or possibly two queries, depending on the eager method used). You are right about that - my mistake - I meant LazyLoading there. But if I understood correctly, than since the Lazynes vs. Eagerness can be configured on a per-attribute basis, you don't necessarily have to load the whole table - you could configure just the fields you want. If you need a complex query for eager-loading, you can simply execute it at the beginning of your transaction - explicitly. You don't even have to store the results anywhere yourself - they are automatically filled-in within their respective attribute-caches across the object-graph. Every access to those attributes thereafter, within the same transaction, would just use those values - regardless of the Lazyness/Eagerness configurations. -- --- You received this message because you are subscribed to the Google Groups web2py-users group. To unsubscribe from this group and stop receiving emails from it, send an email to web2py+unsubscr...@googlegroups.com. For more options, visit https://groups.google.com/groups/opt_out.
[web2py] Re: ORM (?) : A Revisit, NOT a Rebuttal
If you're talking about building queries, your point is moot -- the operations happen in the database, not Python. I don't know what you mean by this... For complex-queries, I would still be using web2py's DAL layer, even if I had this ORM on top. I would either use them outside the ORM, or insert them into the ORM. Not sure what you mean here -- you can do simple queries in the db as well as complex filtering in Python -- they are orthogonal considerations. My example was not meant to show that you should do complex queries in Python - that would obviously be absurd, and you don't even need a relational database for that - any NoSQL one would do fine. Now you've really lost me -- what does any of this have to do with RDBMS vs. NoSQL? And why shouldn't you do complex filtering in Python? I was giving an example to a case in which I have a simple comparison to make, and want to reuse the ORM objects I already have for it. The DAL is not built for that. Not sure what you mean. The DAL returns a Rows object. You can filter it with any simple or complex comparison you like. It even has a .find() method to simplify the process: db.Country(france).select().find(lambda r: r.Language != spanish and r. Population 100) As for comparisons in Python, in web2py, you wouldn't be testing equality of a whole object/record -- typically it would be a scalar (e.g., the integer ID). That would still be much slower than an equality-test - especially for large data-sets. OK, please provide some benchmarks. What percentage decrease in CPU usage can we expect if we compare object identities rather than integer equivalencies? And you wouldn't have multiple copies of records in memory either. How so? If I do this: row1 = db.Country[1] ... row2 = db.Country[1] And why are you doing that? Anthony -- --- You received this message because you are subscribed to the Google Groups web2py-users group. To unsubscribe from this group and stop receiving emails from it, send an email to web2py+unsubscr...@googlegroups.com. For more options, visit https://groups.google.com/groups/opt_out.
Re: [web2py] Re: ORM (?) : A Revisit, NOT a Rebuttal
In web2py, we have lazy loading for individual references and referencing sets, and when that is inefficient, we would just do an explicit join. I think an eager loading option on the individual references would be a cool addition, though that wouldn't require an ORM layer. We've been though this already, Anthony, Laziness in web2py has a different meaning - it uses the same word, but it means something completely different. I'm not going to repeat what I already said -- please go back and read it earlier in this thread. In my statement above, the term lazy means* exactly* the same thing as it does in SQLA and that you mean when you say it -- the *database query* is deferred until you access the attribute. Lazy loading requires a query for each attribute accessed, just as in web2py. Is plain false. Or, it's plain true. In SQLA, if you specify lazy loading of relationships (which is the default), the query is deferred until the first time you access the attribute, and there is therefore a query for each attribute accessed. This is in contrast to eager loading, which does a single query to populate all attributes (whether or not they are ever accessed). There is another mechanism that exists in SQLA that you are not accounting for, and it is statefullness - a caching-mechanism, using the 'Unit of Work' pattern. I don't know how many times and in how many forms I need to say this. No more times and in no more forms, because nothing I have said has contradicted this. Anthony -- --- You received this message because you are subscribed to the Google Groups web2py-users group. To unsubscribe from this group and stop receiving emails from it, send an email to web2py+unsubscr...@googlegroups.com. For more options, visit https://groups.google.com/groups/opt_out.
[web2py] Re: ORM (?) : A Revisit, NOT a Rebuttal
Not sure what you mean here -- you can do simple queries in the db as well as complex filtering in Python -- they are orthogonal considerations. I CAN, but I SHOULDN'T want to... RDBMS are built for complex filtering - this is (part) of what SQL is allabout - I wouldn't want to dismiss that - it would be a bad choice all-around. Conversely, simple-filtering is way too verbose using the DAL - it's an overkill for that, and makes the code much less readable. Again, it's a circumstancial-trade-off: For complex filtering, the price of verbosity in using the DAL is worth it, because the performance-benefits are just too high. For simple filtering, well, I'd rather do it in python and get readability, becuase the performance-benefits are negligible. Now you've really lost me -- what does any of this have to do with RDBMS vs. NoSQL? And why shouldn't you do complex filtering in Python? See above. Not sure what you mean. The DAL returns a Rows object. You can filter it with any simple or complex comparison you like. It even has a .find() method to simplify the process: db.Country(france).select().find(lambda r: r.Language != spanish and r. Population 100) That's actully pretty nice - I didn't know I can do that - but what would the france and spanish objects be in this case? Ids? OK, please provide some benchmarks. What percentage decrease in CPU usage can we expect if we compare object identities rather than integer equivalencies? Really? You think I need to? Identity-checking is a built-in and operates within the VM-level. Equality-checks, even for the simplest of objects, are much more complex internally, and are not exclusively operating on the VM-core level - there are object-attributes and value-type-checking involved, etc. -- --- You received this message because you are subscribed to the Google Groups web2py-users group. To unsubscribe from this group and stop receiving emails from it, send an email to web2py+unsubscr...@googlegroups.com. For more options, visit https://groups.google.com/groups/opt_out.
[web2py] Re: ORM (?) : A Revisit, NOT a Rebuttal
RDBMS are built for complex filtering - this is (part) of what SQL is allabout - I wouldn't want to dismiss that - it would be a bad choice all-around. A complex filter on a small set of items might be faster in Python than doing another database hit. And a simple filter might belong in the db if it has to go over lots of records. As I said, these are orthogonal considerations. Conversely, simple-filtering is way too verbose using the DAL - it's an overkill for that, and makes the code much less readable. Don't know why you think that. For simple filtering, well, I'd rather do it in python and get readability, becuase the performance-benefits are negligible. But I thought you were a fan of achieving negligible performance benefits at great cost (see below). Now you've really lost me -- what does any of this have to do with RDBMS vs. NoSQL? And why shouldn't you do complex filtering in Python? See above. Still don't know why you would want a NoSQL database or what it has to do with this topic. db.Country(france).select().find(lambda r: r.Language != spanish and r. Population 100) That's actully pretty nice - I didn't know I can do that - but what would the france and spanish objects be in this case? Ids? Well, you've sure made a lot of claims about what web2py needs without knowing much about what it already has. Those are ids. If they were rows, then you would just do france.id and spanish.id. OK, please provide some benchmarks. What percentage decrease in CPU usage can we expect if we compare object identities rather than integer equivalencies? Really? You think I need to? Yes, I think you need to. If this is only going to save a half a second of CPU time per day, I'm not going to build an ORM to get it. The question isn't how much faster the identity check is (and I don't think it's that much faster) -- the question is how much of your overall application CPU time is spent doing this kind of thing? Anthony -- --- You received this message because you are subscribed to the Google Groups web2py-users group. To unsubscribe from this group and stop receiving emails from it, send an email to web2py+unsubscr...@googlegroups.com. For more options, visit https://groups.google.com/groups/opt_out.
Re: [web2py] Re: ORM (?) : A Revisit, NOT a Rebuttal
Or, it's plain true. In SQLA, if you specify lazy loading of relationships (which is the default), the query is deferred until the first time you access the attribute, and there is therefore a query for each attribute accessed. This is in contrast to eager loading, which does a single query to populate all attributes (whether or not they are ever accessed). I was interpreting your statement of every access to mean every time you access the same attribute and not to mean every attribute you access. Anthony, your use of language is sometimes ambiguous - it is not the first time I have misunderstood you - please be more specific next time. 10x :) As for LazyLoading in web2py: We discussed this in the context of virtual-fields, and we've established that Lazyness in that context was NOT a deferred-access to the database, but a deferred-computation of the results within the run-time heap. Are you now referring to Laziness in web2py within a different context? Say, like in: db.Country(france).City.find(...) ? Because I don't understand how you would consider to see that as Lazy - it is implicit, yes, but a Lazy access to the database can only have meaning within the context of a statefull framework. If every query is accessing the database ANYWAYS, then where is the *laziness* in that case? You mean a LazySet ? Like in Recursive-selects? person = db.person(id) for thing in person.thing.select(orderby=db.thing.name): print person.name, 'owns', thing.name Well, again, it IS implicit, but why call it Lazy? The so-called *LazySet* is in person.thing ? If so, than it isn't Lazy, just implicit and that's a bad choice of name - good thing it isn't in the documentation, as it would have generated even more confusion than already exists there... If not, then where is it? The thing.name ? If so, than It may be legitimate to call the person.thing *lazy, *but then I wouldn't want to use it. The way I see it, the only meaning the term Lazy has for accessing-the-database, can exist within a statefull framework - which is, in relation to eager-loading that can only exist there. -- --- You received this message because you are subscribed to the Google Groups web2py-users group. To unsubscribe from this group and stop receiving emails from it, send an email to web2py+unsubscr...@googlegroups.com. For more options, visit https://groups.google.com/groups/opt_out.
[web2py] Re: ORM (?) : A Revisit, NOT a Rebuttal
A complex filter on a small set of items might be faster in Python than doing another database hit. And a simple filter might belong in the db if it has to go over lots of records. As I said, these are orthogonal considerations. Perhaps, but again, we are talking about a context of a statefull system - we might already have some data on our object-graph - so it's more complicated then that - If we're talking about the first query of a transaction, we need to think about the context of that whole transaction - will we be using all of the fields in the subsequent attribute-accesses? How about the of the records? Do we need all of them for our access-pattern later on? How should we construct our query so it's optimal for re-use of the results in subsequent attribute-accesses of that same transaction? Such considerations do not even exist in a stateless system like web2py's DAL - it doesn't have the same kind of re-usability of returned data. For example - If I am writing a code for a transaction that would later need to do a simple-filter on a large data-set, but that I also know I'm gonna need some of that data, and also some-other related data for some other attribute-access, than I should construct the first query in the transaction so it would do complex filtering, even if I have a large data-set, so I can have the data cached for me, for when I do the simple-filtering of that data in-python afterwards. As for that extra-data-reuse, it might not pertain to all the records I got, but it might do-pertain to more fields than my simple filtering needed. So in that case, I might do the simple-filtering in python, even if a large-data-set is involved, because I am optimizing the number of queries for a wider-context. Conversely, simple-filtering is way too verbose using the DAL - it's an overkill for that, and makes the code much less readable. Don't know why you think that. Because it is. For simple filtering, well, I'd rather do it in python and get readability, becuase the performance-benefits are negligible. But I thought you were a fan of achieving negligible performance benefits at great cost (see below). Now you're being cynical... Still don't know why you would want a NoSQL database or what it has to do with this topic. You're barking on the wrong tree - It does not relate to this discussion, and I didn't say I need a NoSQL database - I don't. I meant it as a hypothetical-alternative to an imaginary scenario of me doing ALL the filtering in python - for THAT I said well I *might-as-well* use NoSql as I would then not muster the benefits of a relational database. It was statement to emphesize why I wouldn't want to do complex filtering in Python in general - obviousely there are edge-cases as you alluded, and then there's the additional complexity of decision-making as I alluded, due to the introduction of statefull-caching/reuse of results. Well, you've sure made a lot of claims about what web2py needs without knowing much about what it already has. Those are ids. If they were rows, then you would just do france.id and spanish.id. I was simply avoiding making assumptions in that example, as there was no context for these variables in it. Yes, I think you need to. If this is only going to save a half a second of CPU time per day, I'm not going to build an ORM to get it. The question isn't how much faster the identity check is (and I don't think it's that much faster) -- the question is how much of your overall application CPU time is spent doing this kind of thing? Fine, don't make the is not usage a reason for an ORM - you may still benefit from an Identity Mapper in an ORM, in terms of memory-efficiency, even if you stick to your ugly !=s and ==s I wouldn't make my decision of having an Identity Mapper only for the usage of is and is not - in fact, it is rarely used even in SQLA - it was just an example of readability that can be harnesses in addition to the memory efficiency that an Identity-Mapper is providing. For benchmarks on THAT, you may look for SQLAlchemy vs DJango if you like... I don't really case much for that - I just know it is obviousely better... -- --- You received this message because you are subscribed to the Google Groups web2py-users group. To unsubscribe from this group and stop receiving emails from it, send an email to web2py+unsubscr...@googlegroups.com. For more options, visit https://groups.google.com/groups/opt_out.
Re: [web2py] Re: ORM (?) : A Revisit, NOT a Rebuttal
Hi Arnon, I really don't see that your requirements in this example are not a good for web2py's DAL. I'd really like to understand what, if anything, I'm missing here. I'm originally an RDBMS developer, and what I think of as the set-oriented approach of web2py's DAL is a much better fit for my mental model of what my application is doing. Conversely, I find many ORM solutions (and Active Record, come to that) to be an anti-pattern [1]. FWIW, my data is typically consumed by more than one application (e.g. not just web2py), and I've tended to let web2py create and manage the schema in its own way, and work in the db with views etc. to address more complex or performance-critical functions, so perhaps my concerns are different to yours. However, the examples you give don't seem to take best advantage of the DAL to my eye. Anthony provided some examples as to how one might do that, but you have not responded directly to that post (apologies if I've missed your response if it was elsewhere in the thread). I'd like to take the liberty of quitting Anthony's responses and asking for your take on them. On 02/05/13 22:20, Arnon Marcus wrote: Using the DAL, the best you might get is: [city for city in db.City.Country.select() if city.Country.Name == ''France'] [Row Name:Paris, Row Name:Nice] Anthony's suggestion: - db.Country(name='France').City.select() europe = Continent(Name='Europe') france = Country(Name='France', Continent=europe) paris = City(Name='Paris', Country=france) nice = City(Name='Nice', Country=france) europe.Country(Name='France') is france True france.City(Name='Paris') is paris True europe.Country(Name='France').City.list [City Name:Paris, City Name:Nice] Anthony's suggestion: - europe = db.Continent.insert(Name='Europe') france = db.Country.insert(Name='France', Continent=europe) paris = db.City.insert(Name='Paris', Country=france) nice = db.City.insert(Name='Nice', Country=france) db.Country(france).City.select() [Row Name:Paris, Row Name:Nice] This would be so intuitive and easy to use... Can you expand on this statement and help me understand how this proposed syntax is any more intuitive or simpler than the web2py suggestions, which to my eye are clearer by dint of being more explict? Am I missing something here? [1] http://seldo.com/weblog/2011/08/11/orm_is_an_antipattern -- Regards, PhilK e: p...@xfr.co.uk - m: 07775 796 747 'work as if you lived in the early days of a better nation' - alasdair gray -- --- You received this message because you are subscribed to the Google Groups web2py-users group. To unsubscribe from this group and stop receiving emails from it, send an email to web2py+unsubscr...@googlegroups.com. For more options, visit https://groups.google.com/groups/opt_out.
Re: [web2py] Re: ORM (?) : A Revisit, NOT a Rebuttal
Or, it's plain true. In SQLA, if you specify lazy loading of relationships (which is the default), the query is deferred until the first time you access the attribute, and there is therefore a query for each attribute accessed. This is in contrast to eager loading, which does a single query to populate all attributes (whether or not they are ever accessed). I was interpreting your statement of every access to mean every time you access the same attribute and not to mean every attribute you access. I said, Lazy loading requires a query for each attribute accessed That is a fairly precise statement. It is you who made an error. Anthony, your use of language is sometimes ambiguous - it is not the first time I have misunderstood you - please be more specific next time. 10x :) Interesting that you attribute your misunderstandings to my lack of clarity. Is that the only possibility? As for LazyLoading in web2py: We discussed this in the context of virtual-fields, and we've established that Lazyness in that context was NOT a deferred-access to the database, but a deferred-computation of the results within the run-time heap. No, we didn't establish that at all. I refer you to the written record. Are you now referring to Laziness in web2py within a different context? No. Say, like in: db.Country(france).City.find(...) ? Because I don't understand how you would consider to see that as Lazy - That is not proper web2py code, so I don't see it as anything. but a Lazy access to the database can only have meaning within the context of a statefull framework. Nope. In web2py, if a Row object contains a reference attribute, a query is fired when the attribute is accessed, not when the Row object was first created. This is lazy. person = db.person(id) for thing in person.thing.select(orderby=db.thing.name): print person.name, 'owns', thing.name Well, again, it IS implicit, but why call it Lazy? It's actually not even implicit there -- you explicitly call .select(). Anyway, thing is an attribute of the person object, but there is no database query to retrieve the things when the person object is first created -- instead, the query for things is deferred until actually needed. The so-called *LazySet* is in person.thing ? Yes. If so, than it isn't Lazy, just implicit and that's a bad choice of name... Says you. - good thing it isn't in the documentation, as it would have generated even more confusion than already exists there... More confusion among whom? The way I see it, the only meaning the term Lazy has for accessing-the-database, can exist within a statefull framework - which is, in relation to eager-loading that can only exist there. No, web2py could implement eager loading of these Reference and LazySet attributes by doing a single query and filling in all the attributes with the retrieved values. This would be in contrast to the current behavior, which is lazy (i.e., the values are not retrieved when the Row objects are initially created but at some later time upon access/explicit request). Anthony -- --- You received this message because you are subscribed to the Google Groups web2py-users group. To unsubscribe from this group and stop receiving emails from it, send an email to web2py+unsubscr...@googlegroups.com. For more options, visit https://groups.google.com/groups/opt_out.
[web2py] Re: ORM (?) : A Revisit, NOT a Rebuttal
Let's take a look at identity comparison vs integer comparison... import timeit setup = def isequalItems(itemone, itemtwo): return itemone is itemtwo def isequalInts(itemone, itemtwo): return itemone == itemtwo def testOne(): a = 1 b = 2 isequalItems(a,b) def testTwo(): a = 1 b = 2 isequalInts(a,b) print isequalitems, timeit.timeit(stmt=testOne(), setup=setup, number=1000) print isequalints, timeit.timeit(stmt=testTwo(), setup=setup, number=1000) I get isequalitems 2.77487170111 isequalints 2.73482146489 So, integer comparison is faster. This is with Python 2.7. PyPy 1.9... isequalitems 0.067024457849 isequalints 0.0263884617855 Integer comparison is still faster. On Friday, May 3, 2013 10:36:10 AM UTC-7, Anthony wrote: RDBMS are built for complex filtering - this is (part) of what SQL is allabout - I wouldn't want to dismiss that - it would be a bad choice all-around. A complex filter on a small set of items might be faster in Python than doing another database hit. And a simple filter might belong in the db if it has to go over lots of records. As I said, these are orthogonal considerations. Conversely, simple-filtering is way too verbose using the DAL - it's an overkill for that, and makes the code much less readable. Don't know why you think that. For simple filtering, well, I'd rather do it in python and get readability, becuase the performance-benefits are negligible. But I thought you were a fan of achieving negligible performance benefits at great cost (see below). Now you've really lost me -- what does any of this have to do with RDBMS vs. NoSQL? And why shouldn't you do complex filtering in Python? See above. Still don't know why you would want a NoSQL database or what it has to do with this topic. db.Country(france).select().find(lambda r: r.Language != spanish and r. Population 100) That's actully pretty nice - I didn't know I can do that - but what would the france and spanish objects be in this case? Ids? Well, you've sure made a lot of claims about what web2py needs without knowing much about what it already has. Those are ids. If they were rows, then you would just do france.id and spanish.id. OK, please provide some benchmarks. What percentage decrease in CPU usage can we expect if we compare object identities rather than integer equivalencies? Really? You think I need to? Yes, I think you need to. If this is only going to save a half a second of CPU time per day, I'm not going to build an ORM to get it. The question isn't how much faster the identity check is (and I don't think it's that much faster) -- the question is how much of your overall application CPU time is spent doing this kind of thing? Anthony -- --- You received this message because you are subscribed to the Google Groups web2py-users group. To unsubscribe from this group and stop receiving emails from it, send an email to web2py+unsubscr...@googlegroups.com. For more options, visit https://groups.google.com/groups/opt_out.
[web2py] Re: ORM (?) : A Revisit, NOT a Rebuttal
A complex filter on a small set of items might be faster in Python than doing another database hit. And a simple filter might belong in the db if it has to go over lots of records. As I said, these are orthogonal considerations. Perhaps, but again, we are talking about a context of a statefull system - we might already have some data on our object-graph - so it's more complicated then that... No, your statement had nothing to do with any of this. Anyway, sounds like you now agree with me that it depends on the context. - If we're talking about the first query of a transaction, we need to think about the context of that whole transaction - will we be using all of the fields in the subsequent attribute-accesses? How about the of the records? Do we need all of them for our access-pattern later on? How should we construct our query so it's optimal for re-use of the results in subsequent attribute-accesses of that same transaction? Such considerations do not even exist in a stateless system like web2py's DAL - it doesn't have the same kind of re-usability of returned data. I think it would help if you are more precise about what you mean by stateless and stateful (preferably with code examples). I don't see why these considerations would not be applicable to web2py. I also don't know why you say web2py lacks re-usability of returned data. Perhaps you could offer some examples. Conversely, simple-filtering is way too verbose using the DAL - it's an overkill for that, and makes the code much less readable. Don't know why you think that. Because it is. OK, apparently your assertions no longer require arguments and evidence. Perhaps you could have started and ended this entire thread with a much more simple, Hey, could you people please build an ORM, because I have determined that *it is* better. Then we would all know the truth and could get to coding. For simple filtering, well, I'd rather do it in python and get readability, becuase the performance-benefits are negligible. But I thought you were a fan of achieving negligible performance benefits at great cost (see below). Now you're being cynical... A bit sarcastic, but a serious point -- in one breath you claim to care deeply about what is probably a negligible performance benefit, and in the next you are willing to tolerate some inefficiency. It appears you are being disagreeable for the sake of being disagreeable rather than trying to progress the discussion. I meant it as a hypothetical-alternative to an imaginary scenario of me doing ALL the filtering in python - for THAT I said well I * might-as-well* use NoSql OK, got it. Well, you've sure made a lot of claims about what web2py needs without knowing much about what it already has. Those are ids. If they were rows, then you would just do france.id and spanish.id. I was simply avoiding making assumptions in that example, as there was no context for these variables in it. I was referring to the fact that you didn't know about the .find() method. You also didn't seem to know much about recursive selects, virtual and method fields, etc. you may still benefit from an Identity Mapper in an ORM, in terms of memory-efficiency Why do you need an ORM to have an identity mapper? And how much benefit are you expecting here? Do you have an example of where this would create big savings? even if you stick to your ugly !=s and ==s Ouch, you better tell Guido to change the equals and not equals operators in Python. I wouldn't make my decision of having an Identity Mapper only for the usage of is and is not - in fact, it is rarely used even in SQLA What do you mean? Since != and == are so obviously ugly, aren't all the SQLA users doing is and is not everywhere, you know, because they can? - it was just an example of readability that can be harnesses is not is very readable within English prose, but it is not more readable in code (though it can be fairly readable with proper syntax highlighting). Anthony -- --- You received this message because you are subscribed to the Google Groups web2py-users group. To unsubscribe from this group and stop receiving emails from it, send an email to web2py+unsubscr...@googlegroups.com. For more options, visit https://groups.google.com/groups/opt_out.
[web2py] Re: ORM (?) : A Revisit, NOT a Rebuttal
OK, Arnon, now you have to change all your SQLA code to compare id's instead of objects. Bummer. On Friday, May 3, 2013 3:24:22 PM UTC-4, Derek wrote: Let's take a look at identity comparison vs integer comparison... import timeit setup = def isequalItems(itemone, itemtwo): return itemone is itemtwo def isequalInts(itemone, itemtwo): return itemone == itemtwo def testOne(): a = 1 b = 2 isequalItems(a,b) def testTwo(): a = 1 b = 2 isequalInts(a,b) print isequalitems, timeit.timeit(stmt=testOne(), setup=setup, number=1000) print isequalints, timeit.timeit(stmt=testTwo(), setup=setup, number=1000) I get isequalitems 2.77487170111 isequalints 2.73482146489 So, integer comparison is faster. This is with Python 2.7. PyPy 1.9... isequalitems 0.067024457849 isequalints 0.0263884617855 Integer comparison is still faster. On Friday, May 3, 2013 10:36:10 AM UTC-7, Anthony wrote: RDBMS are built for complex filtering - this is (part) of what SQL is allabout - I wouldn't want to dismiss that - it would be a bad choice all-around. A complex filter on a small set of items might be faster in Python than doing another database hit. And a simple filter might belong in the db if it has to go over lots of records. As I said, these are orthogonal considerations. Conversely, simple-filtering is way too verbose using the DAL - it's an overkill for that, and makes the code much less readable. Don't know why you think that. For simple filtering, well, I'd rather do it in python and get readability, becuase the performance-benefits are negligible. But I thought you were a fan of achieving negligible performance benefits at great cost (see below). Now you've really lost me -- what does any of this have to do with RDBMS vs. NoSQL? And why shouldn't you do complex filtering in Python? See above. Still don't know why you would want a NoSQL database or what it has to do with this topic. db.Country(france).select().find(lambda r: r.Language != spanish and r. Population 100) That's actully pretty nice - I didn't know I can do that - but what would the france and spanish objects be in this case? Ids? Well, you've sure made a lot of claims about what web2py needs without knowing much about what it already has. Those are ids. If they were rows, then you would just do france.id and spanish.id. OK, please provide some benchmarks. What percentage decrease in CPU usage can we expect if we compare object identities rather than integer equivalencies? Really? You think I need to? Yes, I think you need to. If this is only going to save a half a second of CPU time per day, I'm not going to build an ORM to get it. The question isn't how much faster the identity check is (and I don't think it's that much faster) -- the question is how much of your overall application CPU time is spent doing this kind of thing? Anthony -- --- You received this message because you are subscribed to the Google Groups web2py-users group. To unsubscribe from this group and stop receiving emails from it, send an email to web2py+unsubscr...@googlegroups.com. For more options, visit https://groups.google.com/groups/opt_out.
[web2py] Re: ORM (?) : A Revisit, NOT a Rebuttal
Well, no, since as I said, SQLA's Identity Map is actually equating the primary-keys of the object, so I guess they got it right! :P Anyway, I haven't seen this identity-checking actually being used in SQLA code-examples, it was just a side-benefit I though could be cool that you could do that, to make code more readable. It IS weird thought that it's slower - quite surprising. Still, it's not THAT much slower, so if it was me I would still be using identity-checks - you are talking about a difference of 0.01% - whose meddling in negligible stuff now...? :P -- --- You received this message because you are subscribed to the Google Groups web2py-users group. To unsubscribe from this group and stop receiving emails from it, send an email to web2py+unsubscr...@googlegroups.com. For more options, visit https://groups.google.com/groups/opt_out.
[web2py] Re: ORM (?) : A Revisit, NOT a Rebuttal
For me it has been very worthwhile to learn the DAL because it abstracts away most of the problems of switching databases. However, I believe it is still of fundamental importance that we can see how it maps into SQL. The idea of adding an ORM on top of DAL is one abstraction too far for me. My experience is that the further your data travels from home (the DB) the more difficult it is to manage. It somehow reminds me of taking the plates off the table and spinning them on sticks. It looks spectacular and clever at first, but of course the smallest problem makes them all crash. I believe that this link summarises many of my thoughts in a nice readable way: http://seldo.com/weblog/2011/06/15/orm_is_an_antipattern -- --- You received this message because you are subscribed to the Google Groups web2py-users group. To unsubscribe from this group and stop receiving emails from it, send an email to web2py+unsubscr...@googlegroups.com. For more options, visit https://groups.google.com/groups/opt_out.
[web2py] Re: ORM (?) : A Revisit, NOT a Rebuttal
Well, no, since as I said, SQLA's Identity Map is actually equating the primary-keys of the object, so I guess they got it right! :P Wasn't saying the SQLA ORM code needs to be changed -- I was talking about your application code comparing object identities rather than comparing primary keys. And of course, it was a joke. whose meddling in negligible stuff now...? :P See http://www.thefreedictionary.com/sense+of+humor. Anthony -- --- You received this message because you are subscribed to the Google Groups web2py-users group. To unsubscribe from this group and stop receiving emails from it, send an email to web2py+unsubscr...@googlegroups.com. For more options, visit https://groups.google.com/groups/opt_out.
[web2py] Re: ORM (?) : A Revisit, NOT a Rebuttal
It may be 0.01% difference on Python 2.7, but on PyPy, that's a 3x difference in speed there! On Friday, May 3, 2013 2:02:24 PM UTC-7, Arnon Marcus wrote: Well, no, since as I said, SQLA's Identity Map is actually equating the primary-keys of the object, so I guess they got it right! :P Anyway, I haven't seen this identity-checking actually being used in SQLA code-examples, it was just a side-benefit I though could be cool that you could do that, to make code more readable. It IS weird thought that it's slower - quite surprising. Still, it's not THAT much slower, so if it was me I would still be using identity-checks - you are talking about a difference of 0.01% - whose meddling in negligible stuff now...? :P -- --- You received this message because you are subscribed to the Google Groups web2py-users group. To unsubscribe from this group and stop receiving emails from it, send an email to web2py+unsubscr...@googlegroups.com. For more options, visit https://groups.google.com/groups/opt_out.
Re: [web2py] Re: ORM (?) : A Revisit, NOT a Rebuttal
On 02/05/13 22:20, Arnon Marcus wrote: Using the DAL, the best you might get is: [city for city in db.City.Country.select() if city.Country.Name == ''France'] [Row Name:Paris, Row Name:Nice] Anthony's suggestion: - db.Country(name='France').City.select() It is a comparison between 2 options to do the same thing in web2py, not a comparison between a web2py way of doing it, vs. my suggested-API's way of doing it. Yes, Anthony is right - there is a better-looking way of doing that compared to what I thought would be the best you could get from the DAL. But it is not a testament to web2py's superioriy/equivalence to my suggestions (only to my poof familiarity with some web2py DAL usages), as there is none in that case. If you would construct an equivalence in my suggestion in this case, it would look like this: Country(name='France').City.list vs. db.Country(name='France').City.select() Now, there are syntactic as well as semantic differences here. Syntactically is it shorted and more concise. Semantically, it returns very different kinds of objects. They both may represent rows in the the same table in the database, but they would be fundamentally diffenet in their utility - which would be a function of their API-context. A sequence of ORM objects is no the same thing as a ROWS object - they differ less in what they contain, and more with what you can do with them. The exact same difference goes for the second comparison: europe = Continent(Name='Europe') france = Country(Name='France', Continent=europe) paris = City(Name='Paris', Country=france) nice = City(Name='Nice', Country=france) europe.Country(Name='France') is france True france.City(Name='Paris') is paris True europe.Country(Name='France').City.list [City Name:Paris, City Name:Nice] Anthony's suggestion: - europe = db.Continent.insert(Name='Europe') france = db.Country.insert(Name='France', Continent=europe) paris = db.City.insert(Name='Paris', Country=france) nice = db.City.insert(Name='Nice', Country=france) db.Country(france).City.select() [Row Name:Paris, Row Name:Nice] In my suggestions, you get an ORM instance, which can do more things. In the first example, you get a Rows object, and in the second you get an ID. Granted, you could use the ID in a DAL context, in a similar way that you would use an ORM-class-instance within the ORM context, but there are other things that an ORM-object can do within an ORM context (aside from these examples), that an ID or a Rows object can not do in the DAL context. -- --- You received this message because you are subscribed to the Google Groups web2py-users group. To unsubscribe from this group and stop receiving emails from it, send an email to web2py+unsubscr...@googlegroups.com. For more options, visit https://groups.google.com/groups/opt_out.
Re: [web2py] Re: ORM (?) : A Revisit, NOT a Rebuttal
I would just add that I often do:: City = db.define_table('City', Field('Name', 'string'), Field('Country', db.Country)) paris = City.insert(Name='Paris', Country=france) apart for the .insert this is the same as what Arnon suggests. The lack of .insert in ORMs: paris = City(Name='Paris', Country=france) make the time when the data is actually stored in the database undefined (deferred). This is why Django and Active Records need a City.save(). The developer loses controls about it and then the internal consistency problem arises. When some people like about ORMs is also their most criticized feature: You do not know when they do DB IO. In web2py DAL we managed to achieve a notation that is very similar to ORMs while preserving the 1-to-1 mapping from API and SQL. We may extend this but I would not want to lose this explicit mapping. In fact in web2py DAL insert, delete, update, selet are the only methods (apart for auxiliar ones like record_update, record_update, tables, files). Almost everything else is achieved by operator overloading. On Thursday, 2 May 2013 18:14:17 UTC-5, Anthony wrote: db.define_table('Continent', Field('Name', 'string')) db.define_table('Country', Field('Name', 'string'), Field('Continent', db.Continent)) db.define_table('City', Field('Name', 'string'), Field('Country', db. Country)) Using an ORM, you could do something like: Country(Name='France').City.list [City Name:Paris, City Name:Nice] Using the DAL, the best you might get is: [city for city in db.City.Country.select() if city.Country.Name == ''France'] [Row Name:Paris, Row Name:Nice] Actually, you can do this with the DAL: db.Country(name='France').City.select() Almost identical to your ORM code (not that I think it needs to be similar looking code to be useful). europe = Continent(Name='Europe') france = Country(Name='France', Continent=europe) paris = City(Name='Paris', Country=france) nice = City(Name='Nice', Country=france) europe.Country(Name='France').City.list [City Name:Paris, City Name:Nice] In the DAL, you can do (I think I got this right): europe = db.Continent.insert(Name='Europe') france = db.Country.insert(Name='France', Continent=europe) paris = db.City.insert(Name='Paris', Country=france) nice = db.City.insert(Name='Nice', Country=france) db.Country(france).City.select() [Row Name:Paris, Row Name:Nice] Of course, you don't get the is equivalencies, but your example doesn't actually require that. Anthony -- --- You received this message because you are subscribed to the Google Groups web2py-users group. To unsubscribe from this group and stop receiving emails from it, send an email to web2py+unsubscr...@googlegroups.com. For more options, visit https://groups.google.com/groups/opt_out.
[web2py] Re: ORM (?) : A Revisit, NOT a Rebuttal
Here is a use-case that can benefit from an ORM. Let's say we have 2 functions that manipulate the same field in 2 different functions. Here is how it would be done using the DAL: def a_child_was_born_in(countryName, cityName): city = db.Country(Name=countryName).City(Name=cityName).select().first() city.update_record(Population=city.Population + 1) def a_person_has_died_in(countryName, cityName): city = db.Country(Name=countryName).City(Name=cityName).select().first() city.update_record(Population-city.Population - 1) Now, let's say that both functions are being used by different contexts within the same transaction (hypothetically, say, from some different functions, way deep in the call stack). # In context 1: a_child_was_born_in('France', 'Paris') ... # In context 2: a_person_has_died_in('France', 'Paris') This would issue 4 round-trips to the database - 2 selects and 2 updates. Now, lets say we want to optimize that, so we do a Lazy version of those functions. How would we go about doing that? Well, we could replace the .update_record with an .update. def a_child_was_born_in(countryName, cityName): city = db.Country(Name=countryName).City(Name=cityName).select().first() city.update(Population=city.Population + 1) def a_person_has_died_in(countryName, cityName): city = db.Country(Name=countryName).City(Name=cityName).select().first() city.update(Population-city.Population - 1) Would that work? Well, let's see, assuming the initial population value of Paris, is 2 million. When a child is born, the value would get incremented locally. But the Row object of the 'city' variable is not persisted in memory when the functions return. So we need to commit the transaction after each call. But wait a minute, that would get us back to 4 operations... Might as well leave the update_record the way it was. What do we do? Well, we could make the laziness optional, and call the first one eagerly and the second lazily. Yes, we would need to keep track of our ordering of calling them, but if we do it right, we could get it down to 3 operations (2 selects and one update). Would that work? Well, no, because then we would loose the second update once the second function returns... Can we still do something? Well, yes, we can activate caching on the City field, so it's internal-values would survive across transactions - given that we give the cache a long-enough time-out. This may not help us in the updates, but it could nock-off a the second query (he select operation in the second function) So the best we get is 3 operations - 1 select and 2 updates. Now, here is the same code, using an ORM: def a_child_was_born_in(countryName, cityName): city = Country(Name=countryName).City(Name=cityName) city.Population += 1 def a_person_has_died_in(countryName, cityName): city = Country(Name=countryName).City(Name=cityName) city.Population -= 1 The syntactic difference is small, but the semantic implication is profound. The automatic cache-mechanism in the ORM will detect that we are querying the same record, and so would not query the database in the second function - just return the same object already in memory. So now we're down to 3 actual operations - 1 select and 2 updates. But it doesn't stop there... In the DAL case, we cached the values inside the city field, but the 'city' variable in the first function, is still a separate Rows object from the 'city' object in the second function, so we couldn't do Lazy updates. But an ORM can have an Identity Mapper, that would make sure they the same object would be returned, It would be bound to two different name-spaces, but it would be the same object. Now we could implement a Truely lazy update. The increment that is done in the first function, would be reflected in the second one, because the same object would be returned, So now we're sown to 2 operations - one select, and one update - the update would automatically be issued for us at transaction-commit time, as it would be balled pending by the time it get's there, using the Unit-of-Work pattern.. But it doesn't have to even stop there... The Unit-of-Work mechanism has this dirty label, which signifies that the current value within a record-object has different value from the one in the database. Now, it may be implemented poorly, and just get flagged as dirty on any update to it, or it could store the original value, and have the dirty-check deferred to the last minute - in which the current value would be compared to the original-stored one, and only be deemed dirty if there's a mis-match. In this case, we incremented once, and decremented once, so the value goes back to what it was, ans do the dirty-check would fail and yield a clean flag - so the entire update operation would not even occur on transaction-end time. So we are down to 1 - a single operation - the first select. These are the kinds of benefits an ORM may have.
[web2py] Re: ORM (?) : A Revisit, NOT a Rebuttal
def a_child_was_born_in(countryName, cityName): city = db.Country(Name=countryName).City(Name=cityName).select().first () city.update_record(Population=city.Population + 1) def a_person_has_died_in(countryName, cityName): city = db.Country(Name=countryName).City(Name=cityName).select().first () city.update_record(Population-city.Population - 1) Technically, it would be db.Country(Name=countryName).City(db.City.Name == cityName).select().first(). # In context 1: a_child_was_born_in('France', 'Paris') ... # In context 2: a_person_has_died_in('France', 'Paris') This would issue 4 round-trips to the database - 2 selects and 2 updates. The way you have coded it, it is actually 6 round trips -- there are 2 selects per function -- one for the country, and a second for the city. But that's not how you would do it in web2py anyway. Instead, you would issue no selects and instead do it with just a single update -- so a total of 2 round trips to the db (i.e., 2 updates) : def a_child_was_born_in(countryName, cityName): query = (db.City.Name == cityName) (db.City.Country.belongs(db.Country .Name == countryName)) db(query).update(Population=db.City.Population + 1) So, in order to do the update, we do not first have to query the database to retrieve the record. This is actually an advantage over the ORM, which requires that you first retrieve the record before updating it. The ORM will issue two queries to get the record if lazy loading is used, or one if eager loading, a join, or a subquery is used. Now, lets say we want to optimize that, so we do a Lazy version of those functions. There's not much to optimize here. If you don't know ahead of time that you will be making two updates to the same record (which may possibly negate each other), I think the minimum number of db hits is two. You could retrieve the record twice, defer the first update, recognize that the second update cancels the first, and then make no update -- which is still 2 hits (well, 1 hit if you cache the query). Or you could just make the 2 updates (as above). In any case, I believe the ORM actually requires a minimum of 4 hits (see below), so web2py is still doing a lot better. Now, here is the same code, using an ORM: def a_child_was_born_in(countryName, cityName): city = Country(Name=countryName).City(Name=cityName) city.Population += 1 def a_person_has_died_in(countryName, cityName): city = Country(Name=countryName).City(Name=cityName) city.Population -= 1 Assuming this is SQLA, I don't think that's quite the right syntax -- it appears you are creating object instances rather than issuing queries. I believe it should be something like this: def a_child_was_born_in(countryName, cityName): city = session.query(City).join(Country)\ .filter(Country.Name == countryName)\ .filter(City.Name == cityName).first() city.Population += 1 The above does a join and therefore gets it down to a single query for the select. Otherwise, you could just query for the country, then access the City attribute, which would lazily issue a second query (though only when the first function is called). The syntactic difference is small, but the semantic implication is profound. Yes, but not quite in the way you think. The automatic cache-mechanism in the ORM will detect that we are querying the same record, and so would not query the database in the second function - just return the same object already in memory. Again, assuming this is SQLA, that's not how it works. SQLA does not cache queries -- when you run a query, it doesn't know what record will be retrieved, so it doesn't know whether it already has the associated object in the session. Hence, it will re-run the query both times. (The exception to this is when you use .get() to fetch a record by primary key, which we are not doing here.) But an ORM can have an Identity Mapper, that would make sure they the same object would be returned, It would be bound to two different name-spaces, but it would be the same object. Now we could implement a Truely lazy update. The increment that is done in the first function, would be reflected in the second one, because the same object would be returned, Another problem here. Whenever you execute a new query, SQLA flushes the pending changes. So, when you run the query in the second function, it will first issue the update to the database from the first change. Once it has done that, it will ultimately also have to issue the update from the second function (though perhaps at some later time) in order to have the correct value in the database. So, I believe we have a minimum of 4 database hits with the ORM (5 if you lazy load the cities when running the initial query). To summarize: - The ORM doesn't do direct updates to the database, so it must first select the records before updating them,
Re: [web2py] Re: ORM (?) : A Revisit, NOT a Rebuttal
Hi, On 01/05/13 22:07, Michele Comitini wrote: Why not write a driver for SQLA that speaks DAL instead of a sql dialect? +1 -- Regards, PhilK 'a bell is a cup...until it is struck' -- --- You received this message because you are subscribed to the Google Groups web2py-users group. To unsubscribe from this group and stop receiving emails from it, send an email to web2py+unsubscr...@googlegroups.com. For more options, visit https://groups.google.com/groups/opt_out.
Re: [web2py] Re: ORM (?) : A Revisit, NOT a Rebuttal
Hi, On 01/05/13 22:27, Cliff Kachinske wrote: I would propose that the best way to get others on board would be to channel the energy being burned on this thread into an implementable design or even a set of specific software requirements or pseudo code. +1 This has been a very interesting thread, but the length (and passion!) of some of the posts has made it a long hard read, and despite my best efforts I truly do not understand what problem the DAL fails to solve, and I don't expect to without some much-less-abstract discussion. -- Regards, PhilK 'a bell is a cup...until it is struck' -- --- You received this message because you are subscribed to the Google Groups web2py-users group. To unsubscribe from this group and stop receiving emails from it, send an email to web2py+unsubscr...@googlegroups.com. For more options, visit https://groups.google.com/groups/opt_out.
[web2py] Re: ORM (?) : A Revisit, NOT a Rebuttal
On Thursday, May 2, 2013 5:17:41 AM UTC+3, Anthony wrote: Although ORM's may do that, such a feature is not unique to the ORM pattern. In the web2py DAL, for example, in a Row object with a reference to another table, the reference field is actually a DAL.Reference object, not a scalar value (it includes the scalar value but also allows access to related records in the referenced table). In this case it does not reference a set of DAL fields. I'm not sure what you mean. A reference field references records, not fields. The operative word in my comment is *SET *not *FIELD. * I may have gotten the Field-vs-Record terminology in this sentence, but that is irrelevant to what I was referring to. My point was that the assertion of wikipedia that ORM can have references to non-scalar values, applies in your example, as the existence of intermediary-objects on the way to get to the value, does not grant the attribute non-scalar status - only a reference to a *SEQUENCE *of objects can do that. Similarly, a Row object from a table that is referenced by another table includes an attribute that is a DAL.LazySet object (also not a scalar), which allows access to the records in the referencing table that reference the current Row object. I did not know that - what form of *Layziness* are we talking about here? Will it generate a query to fill-up the target rows? In any case, it is stil a reference to something the WOULD generate a Rows object - it is not a reference to an already-exising domain-object (which may then have references to othe domain-objects, etc. - all already within memory) object as is in ORMS Are you saying that when you select a set of records that include reference fields, the ORM automatically selects all the referenced records (and any records they may reference, and so on) and stores them in memory, even if you have not requested that? That sounds inefficient. No. that is not what I am saying, although that might occur if you configure it to behave like that. But the flaw in your reasoning is that you are trying to apply a stateless-mid-set to a statefull-system. I'm not sure why you are doing that. When you say automatically selects you are meaning to say automatically generate queries, because within a stateless system there is no record-cache-management, so any access to an attribute IS a select from the database. But that is not the case with statefull-ORMs - they may issue selects if-and-only-if the requested value is invalid (meaning, it was either never queried at all as of yet, or was invalidated by a previous transaction-commit). Attribute-values along the object-graph, may-or-may-not be valid, and so may-of-may-not require a select. When you use a statefully-automatic system, you are deliberately relinquishing (at-least-some) control over such matters, in favor of not having to worry about whether a given attribute-access would need to occur or not. You make an object-attribute access, and the ORM is traversing the object-graph that is linked to this attribute, in order to get a value at some point. The ORM is doing the traversal for you. If all of the objects and attributes that are traversed over, are valid (a.k.a exist and are up-to-date), then no select would have to be sent to the database. Again, in an ORM you are building your object-graph up-front, using domain-classes. It just get's inhabited and updated for you, as you use it. So saying that any attribute-access would necessarily-always generate a select is inaccurate. Also, it is not necessarily inefficient, as the idea in an ORM is that it figures-out the minimal-required database-access in order to get you the data you ask. It may NOT query entire sets of records, if you only need some of them (here a DAL layer underneath is taking care of that, generating optimized queries for you, an/or using queries that you have defined for it to use). And again, in subsequent access to the same attribute, it would again traverse the same object-graph linked to it, and this time, all the objects/attributes that where previously missing and queried, are now caches-and-valid, so no database-access is performed. The DAL also has list:-type fields, whose values are lists, including lists of DAL.Reference objects in the case of list:reference fields. That's interesting, but that is not exactl the same - list-fields need to be supported in the database, but in any case, it is not comparable to being linked to relation ally-stored primary-keys - which would be how it would be implemented in an ORM. No, list fields do not have to be supported in the database (they are stored as strings) -- they are an abstraction provided by the DAL. list:reference fields do in fact store a list of primary keys (in fact, a list of objects that include the primary keys and know how to retrieve the associated records). web2py also has
[web2py] Re: ORM (?) : A Revisit, NOT a Rebuttal
In this case it does not reference a set of DAL fields. I'm not sure what you mean. A reference field references records, not fields. The operative word in my comment is *SET *not *FIELD. * I may have gotten the Field-vs-Record terminology in this sentence, but that is irrelevant to what I was referring to. I don't see how that is irrelevant, as it makes your statement incomprehensible. I can't infer what you really meant. My point was that the assertion of wikipedia that ORM can have references to non-scalar values, applies in your example, as the existence of intermediary-objects on the way to get to the value, does not grant the attribute non-scalar status - only a reference to a *SEQUENCE *of objects can do that. First, in this case I was talking about a reference field, which references a *single* foreign record -- there is no sequence of objects in this case, not even in an ORM. If you go in the other direction, however, a DAL LazySet does in fact reference a sequence of records. In any case, your reasoning applies just as well to an ORM -- the attributes that are objects are themselves merely intermediate objects on the way to scalar values stored in the database -- so I guess we cannot grant them non-scalar status either. But the flaw in your reasoning is that you are trying to apply a stateless-mid-set to a statefull-system. I'm not sure why you are doing that. Well, I'm not doing that. When you say automatically selects you are meaning to say automatically generate queries, because within a stateless system there is no record-cache-management, so any access to an attribute IS a select from the database. But that is not the case with statefull-ORMs - they may issue selects if-and-only-if the requested value is invalid (meaning, it was either never queried at all as of yet, or was invalidated by a previous transaction-commit). Yes, but you act as if all the data are always magically there without any database queries. There has to be an initial query (or set of queries) to get the data from your database in order to populate the instance objects to begin with. The SQLA session lasts only for a single web request, so this has to happen at every request anyway. Furthermore, if you need to issue a query that happens to require the same records you already have in the session, it still needs to do a database select again (unless you query specifically by primary key) -- the DAL, on the other hand, can cache queries, even across multiple web requests. Yes, in web2py, if you need to apply the same recursive select twice, it will hit the database twice (unless you cache or store the result), whereas SQLA will generally do only one select. On the other hand, in web2py, you might simply do a join, in which case, you have all the related data and don't need any subsequent selects. You make an object-attribute access, and the ORM is traversing the object-graph that is linked to this attribute, in order to get a value at some point. The ORM is doing the traversal for you. Note, this is not unique to an ORM and can be done in a DAL as well. The web2py DAL does this, albeit exclusively with lazy loading, though presumably an eager loading option could be implemented as well. So saying that any attribute-access would necessarily-always generate a select is inaccurate. Right, so good thing I didn't say that. Also, it is not necessarily inefficient, as the idea in an ORM is that it figures-out the minimal-required database-access in order to get you the data you ask. It's not quite that simple. You have to make some decisions, and there are tradeoffs. By default, the SQLA ORM does lazy loading of reference records (i.e., it issues a separate query for each reference attribute accessed) -- same as web2py recursive selects. This is more efficient if you only need to do this on one or a few instances. The ORM can also optionally do eager loading, either with a join or a second subquery, both of which have their advantages and disadvantages. In any case, you are not guaranteed to get the minimum-required database access automatically. No, list fields do not have to be supported in the database (they are stored as strings) -- they are an abstraction provided by the DAL. list:reference fields do in fact store a list of primary keys (in fact, a list of objects that include the primary keys and know how to retrieve the associated records). web2py also has JSON fields, which I would say does not count as a scalar either. It is amazing how far the DAL features have gone to mimic an ROM on the surface, while not being the real thing underneath You are basically suggesting bypassing the relational-functionality of the database, and do it in the DAL, all just to achieve appearance of an ORM... I don't even know where to place that... It's absurd... No, the purpose of list:-type
[web2py] Re: ORM (?) : A Revisit, NOT a Rebuttal
We can continie bickering about who-said-what, and in referrence to what. Instead, I'll just refrase the statement: According to wikipidia, DAL-object-attributes are not referencing an object-graph in memory, as ORM ones do. You may say anything to avoid admitting that, but it would sill be the case. As for lazy-set objects, existing in row-objects and pointing to a backwards-foreign-key records, I dont remember seeing anything about that in the documentation. Can you post a link? -- --- You received this message because you are subscribed to the Google Groups web2py-users group. To unsubscribe from this group and stop receiving emails from it, send an email to web2py+unsubscr...@googlegroups.com. For more options, visit https://groups.google.com/groups/opt_out.
[web2py] Re: ORM (?) : A Revisit, NOT a Rebuttal
According to wikipidia, DAL-object-attributes are not referencing an object-graph in memory, as ORM ones do. You may say anything to avoid admitting that, but it would sill be the case. First, as far as I can tell, Wikipedia does not say that about DAL's (it doesn't say all DAL's *do* have object attributes that reference an object-graph in memory, but nor does it say they don't or can't *in principle* have such attributes). Furthermore, it doesn't even appear to say that about ORM's. Second, even if Wikipedia did say that, it would have no bearing on this discussion, as we have not been discussing Wikipedia's definitions of DAL and ORM, but rather the actual web2py DAL and the actual SQLA ORM. web2py does have Reference and LazySet fields that allow you to traverse the relations. Their retrieved values do not stay in memory, but that doesn't mean such persistence could not in principle be implemented. I won't speak for all DAL's, but what you describe could be implemented at least in the web2py DAL. You may say anything to avoid admitting that, but it will still be the case. As for lazy-set objects, existing in row-objects and pointing to a backwards-foreign-key records, I dont remember seeing anything about that in the documentation. Can you post a link? http://web2py.com/books/default/chapter/29/06#Recursive-selects Anthony -- --- You received this message because you are subscribed to the Google Groups web2py-users group. To unsubscribe from this group and stop receiving emails from it, send an email to web2py+unsubscr...@googlegroups.com. For more options, visit https://groups.google.com/groups/opt_out.
[web2py] Re: ORM (?) : A Revisit, NOT a Rebuttal
The reference to wikipedia was done by Derek, not me. I was just reacting to this reference to falsify his claim that I am re-inventing my own definitions. You will see in my comment to him where it defines an ORM to be an object-graph. I iterpreted the Wikipedia DAL definition, of scalar-vs-non scalar, to mean an absent of an object-graph, since a reference of an object-attribute to a sequence, is a non-scalar reference - and an object-graph may contain such references, in a way a DAL api can-not, according to the scalarity-definition of the DAL. And again, all of this is in the context of my reaction to Derek's claim that the wikipedia-definitions say otherwize. As to LazySets, no-where in the documentation is that term mentioned. As for Recursive-selects, we've been through this already - twice actually - it is an explicit 'select' statement that generates a query - it is not a reference to a sequence object that exists in memory. Also, it can't be used for complex relational-graphs, only for single-foreign-key relations. -- --- You received this message because you are subscribed to the Google Groups web2py-users group. To unsubscribe from this group and stop receiving emails from it, send an email to web2py+unsubscr...@googlegroups.com. For more options, visit https://groups.google.com/groups/opt_out.
[web2py] Re: ORM (?) : A Revisit, NOT a Rebuttal
I iterpreted the Wikipedia DAL definition, of scalar-vs-non scalar, to mean an absent of an object-graph, since a reference of an object-attribute to a sequence, is a non-scalar reference - and an object-graph may contain such references, in a way a DAL api can-not, according to the scalarity-definition of the DAL. And again, all of this is in the context of my reaction to Derek's claim that the wikipedia-definitions say otherwize. I don't know why this matters to you, but you clearly have not read the Wikipedia definitions. I will let Derek speak for himself. As to LazySets, no-where in the documentation is that term mentioned. It doesn't mention the term LazySet, as that is an internal class, but the functionality is documented. As for Recursive-selects, we've been through this already - twice actually - it is an explicit 'select' statement that generates a query - it is not a reference to a sequence object that exists in memory. Stipulated, but so is a reference attribute in an ORM until you actually access the attribute and run the query. The only difference is that the ORM keeps the value in the object after the initial query. As I have mentioned, there is no reason why this could not be done in the DAL as well. It is not a feature unique to ORM's. Actually, even now you can do: db.define_table('person', Field('name')) db.define_table('dog', Field('name'), Field('owner', 'reference person')) Bob = db(db.person.name == 'Bob').first() Bob.dog = Bob.dog.select() # db query here print Bob.dog # no db query here Now Bob.dog is an attribute holding Bob's dog records in memory -- subsequent references will not trigger any db queries. Not as automatic as in SQLA, but it could be made more automatic. Also, it can't be used for complex relational-graphs, only for single-foreign-key relations. Agreed, though in principle this could be implemented, so again, not an ORM-specific feature. Keep in mind, I have not claimed that the DAL can do everything the SQLA ORM can do. You have been arguing largely against straw men. Anthony -- --- You received this message because you are subscribed to the Google Groups web2py-users group. To unsubscribe from this group and stop receiving emails from it, send an email to web2py+unsubscr...@googlegroups.com. For more options, visit https://groups.google.com/groups/opt_out.
Re: [web2py] Re: ORM (?) : A Revisit, NOT a Rebuttal
I agree. Here is a more concrete explanation. Given the following tables: db.define_table('Continent', Field('Name', 'string')) db.define_table('Country', Field('Name', 'string'), Field('Continent', db.Continent)) db.define_table('City', Field('Name', 'string'), Field('Country', db.Country )) Using an ORM, you could do something like: Country(Name='France').City.list [City Name:Paris, City Name:Nice] Using the DAL, the best you might get is: [city for city in db.City.Country.select() if city.Country.Name == ''France'] [Row Name:Paris, Row Name:Nice] In an ORM on-top of the DAL, I would like to be able to do something like: @ORM(db.City) class City: pass @ORM(db.Country) class Country: pass @ORM(db.Continent) class Continent: pass europe = Continent(Name='Europe') france = Country(Name='France', Continent=europe) paris = City(Name='Paris', Country=france) nice = City(Name='Nice', Country=france) europe.Country(Name='France') is france True france.City(Name='Paris') is paris True europe.Country(Name='France').City.list [City Name:Paris, City Name:Nice] paris.Country is france True france.Continent is europe True This would be so intuitive and easy to use... -- --- You received this message because you are subscribed to the Google Groups web2py-users group. To unsubscribe from this group and stop receiving emails from it, send an email to web2py+unsubscr...@googlegroups.com. For more options, visit https://groups.google.com/groups/opt_out.
Re: [web2py] Re: ORM (?) : A Revisit, NOT a Rebuttal
db.define_table('Continent', Field('Name', 'string')) db.define_table('Country', Field('Name', 'string'), Field('Continent', db.Continent)) db.define_table('City', Field('Name', 'string'), Field('Country', db. Country)) Using an ORM, you could do something like: Country(Name='France').City.list [City Name:Paris, City Name:Nice] Using the DAL, the best you might get is: [city for city in db.City.Country.select() if city.Country.Name == ''France'] [Row Name:Paris, Row Name:Nice] Actually, you can do this with the DAL: db.Country(name='France').City.select() Almost identical to your ORM code (not that I think it needs to be similar looking code to be useful). europe = Continent(Name='Europe') france = Country(Name='France', Continent=europe) paris = City(Name='Paris', Country=france) nice = City(Name='Nice', Country=france) europe.Country(Name='France').City.list [City Name:Paris, City Name:Nice] In the DAL, you can do (I think I got this right): europe = db.Continent.insert(Name='Europe') france = db.Country.insert(Name='France', Continent=europe) paris = db.City.insert(Name='Paris', Country=france) nice = db.City.insert(Name='Nice', Country=france) db.Country(france).City.select() [Row Name:Paris, Row Name:Nice] Of course, you don't get the is equivalencies, but your example doesn't actually require that. Anthony -- --- You received this message because you are subscribed to the Google Groups web2py-users group. To unsubscribe from this group and stop receiving emails from it, send an email to web2py+unsubscr...@googlegroups.com. For more options, visit https://groups.google.com/groups/opt_out.
Re: [web2py] Re: ORM (?) : A Revisit, NOT a Rebuttal
On Thu, May 2, 2013 at 6:20 PM, Arnon Marcus a.m.mar...@gmail.com wrote: I agree. Here is a more concrete explanation. Given the following tables: db.define_table('Continent', Field('Name', 'string')) db.define_table('Country', Field('Name', 'string'), Field('Continent', db.Continent)) db.define_table('City', Field('Name', 'string'), Field('Country', db.Country)) Using an ORM, you could do something like: Country(Name='France').City.list [City Name:Paris, City Name:Nice] Sorry, but what should do .list() ? it is a query? it is pre-fetched or cached? What about if I need the cities of countries where main language is not Spanish, and population is above 1 millon? Please, note that you are mixing a declarative query syntax (DAL), with an imperative one (ORM) Using the DAL, the best you might get is: [city for city in db.City.Country.select() if city.Country.Name == ''France'] [Row Name:Paris, Row Name:Nice] I would like a elegant declarative syntax similar to LINQ, but in python: [city for db.City.ALL in db.City, db.Country if db.Country.Name == 'France' and db.Country.id == db.City.country] Sadly, there is NO possible way to archive a pythonic list comprenhension syntax that query the objects in the server side AFAIK Some libraries uses dis (python disassembler) and other internal dirty python hacks to do something similar, please, see: http://www.aminus.net/geniusql/ http://www.aminus.org/blogs/index.php/2008/04/22/linq-in-python?blog=2 http://www.aminus.net/geniusql/chrome/common/doc/trunk/managing.html Note that web2py is much closer to the pythonic expressions, but without the early / late binding and other issues described there ;-) In an ORM on-top of the DAL, I would like to be able to do something like: @ORM(db.City) class City: pass @ORM(db.Country) class Country: pass @ORM(db.Continent) class Continent: pass europe = Continent(Name='Europe') france = Country(Name='France', Continent=europe) paris = City(Name='Paris', Country=france) nice = City(Name='Nice', Country=france) europe.Country(Name='France') is france True This would require caching or storing previous queried record in memory (something like a singleton), or is will not work as you are expecting (it is for checking identity)... That could be also archived hacking Row, but you should use == in python for this kind of comparision (equality) This would be so intuitive and easy to use... For you that are creating it, but I will not understand it in a first sight, so I prefer the DAL syntax that is a bit verbose but more uniform (at least I know what I'm doing in each step, or can get a SQL book and see how to pythonize the expression) A ORM is not more intuitive and will have a higher learning curve, and no formal / logical / math justification theory BTW I think that a more elegant way for web2py would be research how to put bussiness rules (and not only tables), in the models. This could be done right now to some extent with virtual fields predefined queries (we already have represent, field validators, etc). Implementing stored procedures and triggers in web2py also could be interesting (and you could win a lot of speed), for example, PostgreSQL supports PL/Python for that. Best regards, Mariano Reingart http://www.sistemasagiles.com.ar http://reingart.blogspot.com -- --- You received this message because you are subscribed to the Google Groups web2py-users group. To unsubscribe from this group and stop receiving emails from it, send an email to web2py+unsubscr...@googlegroups.com. For more options, visit https://groups.google.com/groups/opt_out.
[web2py] Re: ORM (?) : A Revisit, NOT a Rebuttal
I didn't say there were ORM features in the DAL, just that it includes features that you might otherwise expect to find in an ORM Well, it seems like a semantic-issue. DAL and ORM are pretty abstract-terms. Here is how interpret them: DAL - A way to construct schemas and queries without writing SQL or DBAPI calls. ORM - A way to construct domain-models using a DAL in a statefull manner. The DAL is a way of saying: Hey, here's a bunch of objects and methods, please generate an SQL out of them, send them to the database for me and give me results The ORM is a way of saying: Hey, here's a bunch of classes and attributes, please wire them up so their instances would communicate their state to each other, optimizing my transaction-operations for me as I use them Generally, as Massimo confirmed, the DAL is purely stateless. It only returns dictionary-like immediate-results. ...migrations, automatic file uploads/retrieval, recursive selects, automatic results serialization into HTML, virtual fields, computed fields, validators, field representations, field labels, field comments, table labels, list:-type fields, JSON fields, export to CSV, smart queries, callbacks, record versioning, common fields, multi-tenancy, common filters, GAE support, and MongoDB support? That's a mouth-full... Let's brake it down, shell we?: *Multi-Tenancy, Common-Filters, Smart-Queries:* These are SQL-related features - meaning, DAL-features, not ORM ones. *Common Fields, **Automatic-Migrations, CSV/HTML/XML-Exports:* These are schema-related features - meaning, DAL/framework-features, not ORM ones * * *Labels, Comments:* These are schema-metadata-related features - meaning, DAL-features, not ORM ones. *GAE/MongoDB:* Target database-support is a low-level DAL-feature - The DAL may or may not support specific targets - but it's not an ORM feature. *JSON/List fields:* These are database-related feaatures - they are adapters for data-types that may or may not be supported in your target-database. The DAL may or may not support them, but they are not ORM features either way. *Validators, Upload/Retrieval, **Record-Versioning, Callbacks, Record-** Representations**:* These are not DAL *nor *ORM features - they are framework-features. They are input/output adapters. It is a way of saying: Hey, when you get results back, run them through these operations. *Virtual/Computed fields:* These are kinda-tricky to classify. Computed-Fields are for automating input-transformations. They are a way of saying: Hey, take these values that I'm *already** inserting to *these other fields, *run them through this function*, and *store the result in* that other field. Virtual-Fields are for automating output-transformations. They are a way of saying: Hey, take these values that I'm *already **getting from* these other fields, *run them through this function*, and *produce the results as* that other field. The distinctions between these features vs. the ORM-equivalent ones, are quite subtle and illusive, but profound. The first difference is of scope - Virtual/Computed-fields can only be applied to other fields of the same Table. In (some) ORMs they are not, because an ORM class does not necessarily have to be a 1:1 representation of a table. The whole point of an ORM is to be able to construct domain-models, not mere class-forms of Table-descriptions. In the DAL, Virtual/Computed-fields can NOT generate implicit calls to foreign-table-fields. The second difference is of statelessness-vs-statfullness - The DAL is stateless, so it can-not give values from a previous query. ORMs are statefull in nature, so: - For output, Virtual-fields can use *values **already stored* in those * other-field's* cache, and *not even query* the database. - For input, Computed-Fields can use* values **already stored* in those * other-field's* cache, and *not even insert* them to the database. The DAL is stateless in nature, so: - For output, Virtual-fields *must* *query **values* from those *other-field *, in order to invoke the functionality of the automated-output. - For input, Computed-Fields *must insert values* to those *other fields,* in order to invoke the functionality of the automated-input. ** There is also cache for the DAL, but it's time-based, and not transaction-based.* * * As for *Lazy*-virtual-fields, they are not the sort of laziness that an ORM has - it's a deferred-execution of the automation that is defined to run on the records *after* the query has returned. In ORMs there are deferred-queries for laziness. * * *Recursive-Selects:** * The documentation on this feature is not cleat - it seems that it generates queries on your behalf, but is only useful for single-record queries, as it's like an Active-Record pattern - it doesn't do any transaction-level operation-optimizations within loops (as there are non deferred-queries). But if you use an ORM built on top of the DAL, you won't be
[web2py] Re: ORM (?) : A Revisit, NOT a Rebuttal
I didn't say there were ORM features in the DAL, just that it includes features that you might otherwise expect to find in an ORM Well, it seems like a semantic-issue. DAL and ORM are pretty abstract-terms. Here is how interpret them: DAL - A way to construct schemas and queries without writing SQL or DBAPI calls. ORM - A way to construct domain-models using a DAL in a statefull manner. I don't think you are understanding me, so let me try to be more clear. Let's say an ORM is a particular design pattern for modeling data, and a DAL is a different design pattern for modeling data. Each of those different design patterns can nevertheless be used to implement similar types of features. For example, you might want to query the database and return a results set. This can be done in an ORM, and it can be done in a DAL. The implementation and the syntax will be different in each case, but they are both implementing a common feature. So, when I say the DAL implements features that might otherwise be found in a typical ORM, I am not saying the DAL implements an ORM design pattern, just that it replicates functionality for which you might otherwise use an ORM. For example, in an ORM, you can define a method in a class that returns a value calculated from the fields of a database record. In the web2py DAL, this same functionality can be achieved using a virtual field or lazy field. I don't know if the SQLA CORE has virtual fields, but if it doesn't, I would suppose it leaves this kind of functionality to the ORM. The point is, many features found in DAL's and ORM's are not unique or specific to the DAL or ORM design pattern. Each design pattern can be used to implement many common types of functionality (the functionality may not be identical, but at least similar, and used to satisfy the same goals). The ORM is a way of saying: Hey, here's a bunch of classes and attributes, please wire them up so their instances would communicate their state to each other, optimizing my transaction-operations for me as I use them That looks like the definition of the SQLA ORM, not ORM's in general. ...migrations, automatic file uploads/retrieval, recursive selects, automatic results serialization into HTML, virtual fields, computed fields, validators, field representations, field labels, field comments, table labels, list:-type fields, JSON fields, export to CSV, smart queries, callbacks, record versioning, common fields, multi-tenancy, common filters, GAE support, and MongoDB support? That's a mouth-full... Let's brake it down, shell we?: No, let's not. My point is not that any of those items properly belong to either a DAL or an ORM, or that they can only be implemented with either a DAL or an ORM design pattern. Rather, you had claimed that the SQLA CORE is equivalent to the web2py DAL and that all 20,000+ lines of SQLA ORM code must therefore be providing unique functionality not available in the DAL (thus implying that the ORM must be useful). I was just suggesting that the DAL might be doing more than the SQLA CORE (at least in some areas), and that the DAL might possibly be offering some features for which you would otherwise need the SQLA ORM. Regarding all the features you claim are inherently DAL features and not ORM features, I disagree. Any one of those features could rightly be part of either a DAL or an ORM. They are simply features you might want to implement within any data modeling abstraction, whatever the design pattern. The first difference is of scope - Virtual/Computed-fields can only be applied to other fields of the same Table. No, they can also be applied to the results of joins (not sure if that's typically as easy to do in an ORM) -- see http://web2py.com/books/default/chapter/29/06#Old-style-virtual-fields. In the DAL, Virtual/Computed-fields can NOT generate implicit calls to foreign-table-fields. Yes, they can with recursive selects. The second difference is of statelessness-vs-statfullness - The DAL is stateless, so it can-not give values from a previous query. ORMs are statefull in nature, so: - For output, Virtual-fields can use *values **already stored* in those * other-field's* cache, and *not even query* the database. - For input, Computed-Fields can use* values **already stored* in those * other-field's* cache, and *not even insert* them to the database. The DAL is stateless in nature, so: - For output, Virtual-fields *must* *query **values* from those * other-field*, in order to invoke the functionality of the automated-output. - For input, Computed-Fields *must insert values* to those *other fields,* in order to invoke the functionality of the automated-input. ** There is also cache for the DAL, but it's time-based, and not transaction-based.* I'm not quite sure what you mean here. Even in an ORM, in order to calculate the value of a virtual field, you first have to retrieve
[web2py] Re: ORM (?) : A Revisit, NOT a Rebuttal
You don't get to define terms any way you see fit. DAL and ORM have specific meanings. DAL is a TLA (three letter acronym) for Database Abstraction Layer. ORM is a TLA for Object Relational Mapping. So, what does a DAL do? Wikipedia tells us that it ... is an application programming interfacehttp://en.wikipedia.org/wiki/Application_programming_interface which unifies the communication between a computer application and databaseshttp://en.wikipedia.org/wiki/Database such as SQL Server http://en.wikipedia.org/wiki/MSSQL, DB2http://en.wikipedia.org/wiki/IBM_DB2 , MySQL http://en.wikipedia.org/wiki/MySQL, PostgreSQLhttp://en.wikipedia.org/wiki/PostgreSQL , Oracle http://en.wikipedia.org/wiki/Oracle_database or SQLitehttp://en.wikipedia.org/wiki/SQLite. Traditionally, all database vendors provide their own interface tailored to their products which leaves it to the application programmer to implement code for all database interfaces he or she would like to support. Database abstraction layers reduce the amount of work by providing a consistent API to the developer and hide the database specifics behind this interface as much as possible. There exist many abstraction layers with different interfaces in numerous programming languages. What does an ORM do? Wikipedia tells us that it ... is a programminghttp://en.wikipedia.org/wiki/Computer_programming technique for converting data between incompatible type systemshttp://en.wikipedia.org/wiki/Type_system in object-oriented http://en.wikipedia.org/wiki/Object-oriented programming languages. This creates, in effect, a virtualobject databasehttp://en.wikipedia.org/wiki/Object_database that can be used from within the programming language. So, the two terms are not mutually exclusive, but they handle different domains. It may be interesting to have an ORM on top of DAL, but I personally feel that creating YAORM (Yet Another Object Relational Mapping) is counter-productive especially when you could bypass the DAL and just use SQLA which you yourself say is the best ORM there is. Now, perhaps what may be beneficial is to separate the DAL from the HTML generation and data validation logic. That way, you could plug in SQLA and yet your SMARTGRID and FORMs would work with all the bells and whistles. On Wednesday, May 1, 2013 7:15:11 AM UTC-7, Arnon Marcus wrote: I didn't say there were ORM features in the DAL, just that it includes features that you might otherwise expect to find in an ORM Well, it seems like a semantic-issue. DAL and ORM are pretty abstract-terms. Here is how interpret them: DAL - A way to construct schemas and queries without writing SQL or DBAPI calls. ORM - A way to construct domain-models using a DAL in a statefull manner. The DAL is a way of saying: Hey, here's a bunch of objects and methods, please generate an SQL out of them, send them to the database for me and give me results The ORM is a way of saying: Hey, here's a bunch of classes and attributes, please wire them up so their instances would communicate their state to each other, optimizing my transaction-operations for me as I use them Generally, as Massimo confirmed, the DAL is purely stateless. It only returns dictionary-like immediate-results. ...migrations, automatic file uploads/retrieval, recursive selects, automatic results serialization into HTML, virtual fields, computed fields, validators, field representations, field labels, field comments, table labels, list:-type fields, JSON fields, export to CSV, smart queries, callbacks, record versioning, common fields, multi-tenancy, common filters, GAE support, and MongoDB support? That's a mouth-full... Let's brake it down, shell we?: *Multi-Tenancy, Common-Filters, Smart-Queries:* These are SQL-related features - meaning, DAL-features, not ORM ones. *Common Fields, **Automatic-Migrations, CSV/HTML/XML-Exports:* These are schema-related features - meaning, DAL/framework-features, not ORM ones * * *Labels, Comments:* These are schema-metadata-related features - meaning, DAL-features, not ORM ones. *GAE/MongoDB:* Target database-support is a low-level DAL-feature - The DAL may or may not support specific targets - but it's not an ORM feature. *JSON/List fields:* These are database-related feaatures - they are adapters for data-types that may or may not be supported in your target-database. The DAL may or may not support them, but they are not ORM features either way. *Validators, Upload/Retrieval, **Record-Versioning, Callbacks, Record-** Representations**:* These are not DAL *nor *ORM features - they are framework-features. They are input/output adapters. It is a way of saying: Hey, when you get results back, run them through these operations. *Virtual/Computed fields:* These are kinda-tricky to classify. Computed-Fields are for automating input-transformations. They are a way of saying: Hey, take these values
Re: [web2py] Re: ORM (?) : A Revisit, NOT a Rebuttal
Why not write a driver for SQLA that speaks DAL instead of a sql dialect? 2013/5/1 Derek sp1d...@gmail.com You don't get to define terms any way you see fit. DAL and ORM have specific meanings. DAL is a TLA (three letter acronym) for Database Abstraction Layer. ORM is a TLA for Object Relational Mapping. So, what does a DAL do? Wikipedia tells us that it ... is an application programming interfacehttp://en.wikipedia.org/wiki/Application_programming_interface which unifies the communication between a computer application and databaseshttp://en.wikipedia.org/wiki/Database such as SQL Server http://en.wikipedia.org/wiki/MSSQL, DB2http://en.wikipedia.org/wiki/IBM_DB2 , MySQL http://en.wikipedia.org/wiki/MySQL, PostgreSQLhttp://en.wikipedia.org/wiki/PostgreSQL , Oracle http://en.wikipedia.org/wiki/Oracle_database or SQLitehttp://en.wikipedia.org/wiki/SQLite. Traditionally, all database vendors provide their own interface tailored to their products which leaves it to the application programmer to implement code for all database interfaces he or she would like to support. Database abstraction layers reduce the amount of work by providing a consistent API to the developer and hide the database specifics behind this interface as much as possible. There exist many abstraction layers with different interfaces in numerous programming languages. What does an ORM do? Wikipedia tells us that it ... is a programminghttp://en.wikipedia.org/wiki/Computer_programming technique for converting data between incompatible type systemshttp://en.wikipedia.org/wiki/Type_system in object-oriented http://en.wikipedia.org/wiki/Object-oriented programming languages. This creates, in effect, a virtualobject databasehttp://en.wikipedia.org/wiki/Object_database that can be used from within the programming language. So, the two terms are not mutually exclusive, but they handle different domains. It may be interesting to have an ORM on top of DAL, but I personally feel that creating YAORM (Yet Another Object Relational Mapping) is counter-productive especially when you could bypass the DAL and just use SQLA which you yourself say is the best ORM there is. Now, perhaps what may be beneficial is to separate the DAL from the HTML generation and data validation logic. That way, you could plug in SQLA and yet your SMARTGRID and FORMs would work with all the bells and whistles. On Wednesday, May 1, 2013 7:15:11 AM UTC-7, Arnon Marcus wrote: I didn't say there were ORM features in the DAL, just that it includes features that you might otherwise expect to find in an ORM Well, it seems like a semantic-issue. DAL and ORM are pretty abstract-terms. Here is how interpret them: DAL - A way to construct schemas and queries without writing SQL or DBAPI calls. ORM - A way to construct domain-models using a DAL in a statefull manner. The DAL is a way of saying: Hey, here's a bunch of objects and methods, please generate an SQL out of them, send them to the database for me and give me results The ORM is a way of saying: Hey, here's a bunch of classes and attributes, please wire them up so their instances would communicate their state to each other, optimizing my transaction-operations for me as I use them Generally, as Massimo confirmed, the DAL is purely stateless. It only returns dictionary-like **immediate-results. ...migrations, automatic file uploads/retrieval, recursive selects, automatic results serialization into HTML, virtual fields, computed fields, validators, field representations, field labels, field comments, table labels, list:-type fields, JSON fields, export to CSV, smart queries, callbacks, record versioning, common fields, multi-tenancy, common filters, GAE support, and MongoDB support? That's a mouth-full... Let's brake it down, shell we?: *Multi-Tenancy, Common-Filters, Smart-Queries:* These are SQL-related features - meaning, DAL-features, not ORM ones. *Common Fields, **Automatic-Migrations, CSV/HTML/XML-Exports:* These are schema-related features - meaning, DAL/framework-**features, not ORM ones * * *Labels, Comments:* These are schema-metadata-related features - meaning, DAL-features, not ORM ones. *GAE/MongoDB:* Target database-support is a low-level DAL-feature - The DAL may or may not support specific targets - but it's not an ORM feature. *JSON/List fields:* These are database-related feaatures - they are adapters for data-types that may or may not be supported in your target-database. The DAL may or may not support them, but they are not ORM features either way. *Validators, Upload/Retrieval, **Record-Versioning, Callbacks, Record-** Representations**:* These are not DAL *nor *ORM features - they are framework-features. They are input/output adapters. It is a way of saying: Hey, when you get results back, run them through these operations. *Virtual/Computed fields:* These are kinda-tricky to classify.
[web2py] Re: ORM (?) : A Revisit, NOT a Rebuttal
I started doing this stuff hand hacking SELECT, INSERT, DELETE and UPDATE statements and feeding them into MySQL via PHP. It was tedious, boring and error prone. This is probably not a typical background, but it is the one I have. Given my experience, the Web2py DAL feels great. The whole concept behind it is very straightforward: get a request from the controller, manipulate the data and pass a response back. Also, working with the DAL is more akin to configuration than it is to programming. It's a little faster and way less error prone. In my brief encounters with ORMs, they have seemed squishy and obtuse; overly complicated. So I don't see the value of fitting an ORM layer on top of the DAL. But some might. I would propose that the best way to get others on board would be to channel the energy being burned on this thread into an implementable design or even a set of specific software requirements or pseudo code. On Saturday, April 27, 2013 9:18:45 AM UTC-4, Arnon Marcus wrote: I am in the process of researching ways to improve the structure of my web2py-app's code, simplifying usage of certain areas, and enabling RPC-like interface for external programs. I use web2py for over 3 years now, and love every aspect of it - especially the DAL (!) However, as the code grew larger, and as hierarchical domain-model-patterns started to emerge, I started to look for alternative ways of accessing and using the portion of the data-model that is strictly hierarchical in nature. It is a huge controversial issue with relational-data-models which contain hierarchies. I don't intend to open a large discussion about this here. Suffice it to say, that even the most die-hard SQL lover, would admit it's shortcomings when hierarchies are introduces into the data-mode. It is an unsolved (probably unsolvable) problem in data-model theory. So it is no a matter of looking fot the best solution, because there can not exist such a concept - even in theory. It is a matter of looking for the most-fitting set of trade-offs for the problem at hand. That said, some projects are large and/or varied enough, that they *DO *include both *highly-relational* areas*,* *as well as* *highly-hierarchical* areas *- within the same data-model (!)* For such use-cases, a more flexible/hybrid approach is beneficial. You don't expect to have to choose either/or relational-models vs. hierarchical-models - you expect your framework to include and facilitate support for both approaches for the same database. You would use the relational-features of the framework for when it is most suited for, and hierarchical-features for when IT makes better sense. Ideally, your framework would be built in an integrated-yet-layered design, that would make it easy for you to accomplish both approaches in a synergetic manner. My research has led me through ZODB and SQLAlchemy, just to get a feel for what an ORM could provide. Aside from reading a lot and watching a lot of lectures about these technologies, as well as general opinions about them, I have also taken the time to *really go through tons of threads in this group about these issues. as well as the web2py documentation.* Bottom-line, my current feelings about this issue, is that there is still something missing in web2py to facilitate the construction of higher-levels of abstractions, that are more focused on business-logic than database-schema. I also feel that there are dogmatic sentiments being thrown from both sides of the fence in this flame-fest fiasco. I think this hurts us - a lot. I think a more constructive approach would be to acknowledge that there are different use-cases that can benefit from different approaches, and that this leads to opposing opinions regarding certain trade-off that are being sought after. I think that web2py has taken an approach that is still too narrow-minded when it comes to supporting multiple-approaches, and that a layered-design could be beneficial here. Case in point, the philosophy and design of SQLAlchemy: http://www.youtube.com/watch?v=uvFBXxftvN4 Now, just to be clear, I think that the web2py-DAL's API is much cleaner, simpler, and more easy and fun to use than SQA's API, at least for the SQL-Expression layer. But I also think that SQA's is a more flexible approach - it can target a more varied set of use-cases. Contrary to most of what I've read about SQA in this group, it's ORM is NOT mandatory, nor is it a necessarily more-restrictive/less-performant way of using the database. I think most criticisms I've seen here of it, are ill-informed, and have a somewhat prima-facie smell to them. They mainly attack the ORM concept in it's Active-Record form, which is NOT what SQA has. They also don't consider the layered-architecture of SQA, and compare the DAL with different implementations of ORMs that ARE more restrictive and
[web2py] Re: ORM (?) : A Revisit, NOT a Rebuttal
I am not re-defining terms - I understand them correctly. An ORM is a Mapping between Objects, and Relations. Here are the parts of the wikipedia pages, that are actually-relevant* * to this discussion (important parts in bold-text): *ORM:* ...Data management http://en.wikipedia.org/wiki/Data_management tasks in object-oriented (OO) programming are typically implemented by manipulating * objects http://en.wikipedia.org/wiki/Object_(computer_science) that are almost always non-scalar http://en.wikipedia.org/wiki/Scalar_(computing)** values.* For example, consider an address book entry that represents *a single person along with zero or more phone numbers and zero or more addresses*. This could be modeled in an object-oriented implementation by a Person object http://en.wikipedia.org/wiki/Object_(computer_science) with attributes/fields http://en.wikipedia.org/wiki/Attribute_(computing) to hold each data item that the entry comprises: the person's name, *a list of phone numbers, and a list of addresses.* The list of phone numbers would itself *contain PhoneNumber objects *and so on. The address book entry is treated as a single object by the programming language (it can be referenced by a single variable containing a pointer to the object, for instance). *Various methods can be associated with the object, such as a method to return the preferred phone number, the home address*, and so on. The first thing to notice here, is that an ORM object-attribute can contain NON-SCALAR values - meaning, a link to a list of other objects. There is no feature in web2py that generates such an object. The second thing to notice here, is that the attributes of an ORM object usually contain child-objects (plural) that represent fields from a different table than the parent-object. Again, there is no feature in web2py that can generate such an object. A JOIN operation may return row objects, each of which may contain sub-attributes that hold A SINGLE field-value from a foreign-table, but it is a *scalar-value* - NOT another * domain-entity-object* (with it's own attributes, etc.), NOR a SEQUENCE of *domain-entity objects* Here are some diagrams that presents it really well: http://www.jfwk.com/images/p2.gif http://software-carpentry.org/3_0/summary/orm.png http://www.visual-paradigm.com/VPGallery/img/orm/ERDiagramAndClassDiagramSynchronization/ER-Diagram-and-Class-Diagram-Synchronization-Sample.png The crucial thing to notice here is that an ORM object-attribute can contain NON-SCALAR values - meaning, a link to a list of other objects, which themselves may contain links to other objects/sequences-of-objects, etc. As for DAL, here is the part of the wilipedia page, that is relevant to this discussion: ...*Popular use for database abstraction layers* are among object-oriented programming http://en.wikipedia.org/wiki/Object-oriented_programming languages, which are similar to API level abstraction layers. In an object oriented language like C++ or Java, *a database can be represented through an ** object* http://en.wikipedia.org/wiki/Object_(computer_science)*, whose methods and members* (or the equivalent thereof in other programming languages)* represent various functionalities of the database.* They also share the same advantages and disadvantages as API level interfaces. As you can see, even wikipedia says that there is more to a DAL than just the SQL translation. Here is another usage of the same Three Letter Acronym (DAL), that represents how an ORM is layered on-top of a DAL: http://vaadin.com/download/book-of-vaadin/vaadin-7/html/img/jpacontainer/jpa-mapping-graphic-lo.png http://vaadin.com/download/book-of-vaadin/vaadin-7/html/img/jpacontainer/three-layer-architecture-lo.png Obviously, THIS dal is not an abstraction-layer, but an access-layer, but it could very-well be substituted by an abstraction-layer that does the same thing. Granted, this is a different form of an ORM, directly mapping class-attributes to table-fields, but in principal it is the same - an ORM is a layer on-top a DAL, that uses a DAL. -- --- You received this message because you are subscribed to the Google Groups web2py-users group. To unsubscribe from this group and stop receiving emails from it, send an email to web2py+unsubscr...@googlegroups.com. For more options, visit https://groups.google.com/groups/opt_out.
[web2py] Re: ORM (?) : A Revisit, NOT a Rebuttal
I am not as learned in these things but it seems to me that a DAL is a small black box, whilst an ORM is a larger black box. If I look inside the DAL 'black box', I can just about figure out what's going on. If I look inside an ORM black box, it is already too complex (for me). If we add the two black boxes together, it would only be maintainable by someone of Massimo's skill level (and even he thought it was too complex for his needs, hence the DAL!). I cannot imagine that anyone would commit themselves to such a project. Perhaps we should try to list some of the benefits? Otherwise we shall remain theorists in pursuit of a hypothetical abstraction - which never looks good on a CV. On Wednesday, May 1, 2013 11:02:38 PM UTC+1, Arnon Marcus wrote: I am not re-defining terms - I understand them correctly. An ORM is a Mapping between Objects, and Relations. Here are the parts of the wikipedia pages, that are actually-relevant* * to this discussion (important parts in bold-text): *ORM:* ...Data management http://en.wikipedia.org/wiki/Data_management tasks in object-oriented (OO) programming are typically implemented by manipulating *objectshttp://en.wikipedia.org/wiki/Object_(computer_science) that are almost always non-scalarhttp://en.wikipedia.org/wiki/Scalar_(computing) ** values.* For example, consider an address book entry that represents *a single person along with zero or more phone numbers and zero or more addresses*. This could be modeled in an object-oriented implementation by a Person object http://en.wikipedia.org/wiki/Object_(computer_science) with attributes/fieldshttp://en.wikipedia.org/wiki/Attribute_(computing) to hold each data item that the entry comprises: the person's name, *a list of phone numbers, and a list of addresses.* The list of phone numbers would itself *contain PhoneNumber objects *and so on. The address book entry is treated as a single object by the programming language (it can be referenced by a single variable containing a pointer to the object, for instance). *Various methods can be associated with the object, such as a method to return the preferred phone number, the home address*, and so on. The first thing to notice here, is that an ORM object-attribute can contain NON-SCALAR values - meaning, a link to a list of other objects. There is no feature in web2py that generates such an object. The second thing to notice here, is that the attributes of an ORM object usually contain child-objects (plural) that represent fields from a different table than the parent-object. Again, there is no feature in web2py that can generate such an object. A JOIN operation may return row objects, each of which may contain sub-attributes that hold A SINGLE field-value from a foreign-table, but it is a *scalar-value* - NOT another * domain-entity-object* (with it's own attributes, etc.), NOR a SEQUENCE of *domain-entity objects* Here are some diagrams that presents it really well: http://www.jfwk.com/images/p2.gif http://software-carpentry.org/3_0/summary/orm.png http://www.visual-paradigm.com/VPGallery/img/orm/ERDiagramAndClassDiagramSynchronization/ER-Diagram-and-Class-Diagram-Synchronization-Sample.png The crucial thing to notice here is that an ORM object-attribute can contain NON-SCALAR values - meaning, a link to a list of other objects, which themselves may contain links to other objects/sequences-of-objects, etc. As for DAL, here is the part of the wilipedia page, that is relevant to this discussion: ...*Popular use for database abstraction layers* are among object-oriented programming http://en.wikipedia.org/wiki/Object-oriented_programming languages, which are similar to API level abstraction layers. In an object oriented language like C++ or Java, *a database can be represented through an ** object* http://en.wikipedia.org/wiki/Object_(computer_science)*, whose methods and members* (or the equivalent thereof in other programming languages)* represent various functionalities of the database.* They also share the same advantages and disadvantages as API level interfaces. As you can see, even wikipedia says that there is more to a DAL than just the SQL translation. Here is another usage of the same Three Letter Acronym (DAL), that represents how an ORM is layered on-top of a DAL: http://vaadin.com/download/book-of-vaadin/vaadin-7/html/img/jpacontainer/jpa-mapping-graphic-lo.png http://vaadin.com/download/book-of-vaadin/vaadin-7/html/img/jpacontainer/three-layer-architecture-lo.png Obviously, THIS dal is not an abstraction-layer, but an access-layer, but it could very-well be substituted by an abstraction-layer that does the same thing. Granted, this is a different form of an ORM, directly mapping class-attributes to table-fields, but in principal it is the same - an ORM is a layer on-top a DAL, that uses a DAL. -- --- You
[web2py] Re: ORM (?) : A Revisit, NOT a Rebuttal
The first thing to notice here, is that an ORM object-attribute can contain NON-SCALAR values - meaning, a link to a list of other objects. There is no feature in web2py that generates such an object. The second thing to notice here, is that the attributes of an ORM object usually contain child-objects (plural) that represent fields from a different table than the parent-object. Again, there is no feature in web2py that can generate such an object. A JOIN operation may return row objects, each of which may contain sub-attributes that hold A SINGLE field-value from a foreign-table, but it is a *scalar-value* - NOT another * domain-entity-object* (with it's own attributes, etc.), NOR a SEQUENCE of *domain-entity objects* ... The crucial thing to notice here is that an ORM object-attribute can contain NON-SCALAR values - meaning, a link to a list of other objects, which themselves may contain links to other objects/sequences-of-objects, etc. Although ORM's may do that, such a feature is not unique to the ORM pattern. In the web2py DAL, for example, in a Row object with a reference to another table, the reference field is actually a DAL.Reference object, not a scalar value (it includes the scalar value but also allows access to related records in the referenced table). Similarly, a Row object from a table that is referenced by another table includes an attribute that is a DAL.LazySet object (also not a scalar), which allows access to the records in the referencing table that reference the current Row object. The DAL also has list:-type fields, whose values are lists, including lists of DAL.Reference objects in the case of list:reference fields. Row objects can also include methods (i.e., lazy virtual fields). Anthony -- --- You received this message because you are subscribed to the Google Groups web2py-users group. To unsubscribe from this group and stop receiving emails from it, send an email to web2py+unsubscr...@googlegroups.com. For more options, visit https://groups.google.com/groups/opt_out.
[web2py] Re: ORM (?) : A Revisit, NOT a Rebuttal
Well, it seems like a semantic-issue. DAL and ORM are pretty abstract-terms. Here is how interpret them: DAL - A way to construct schemas and queries without writing SQL or DBAPI calls. ORM - A way to construct domain-models using a DAL in a statefull manner. I don't think you are understanding me, so let me try to be more clear. Let's say an ORM is a particular design pattern for modeling data, and a DAL is a different design pattern for modeling data. Each of those different design patterns can nevertheless be used to implement similar types of features. For example, you might want to query the database and return a results set. This can be done in an ORM, and it can be done in a DAL. The implementation and the syntax will be different in each case, but they are both implementing a common feature. So, when I say the DAL implements features that might otherwise be found in a typical ORM, I am not saying the DAL implements an ORM design pattern, just that it replicates functionality for which you might otherwise use an ORM. No, it does not do that. It implements very different functionality, that may have a similar API and the same terminology used, which honestly I find quite confusing - border-line misleading. I have given a very specific and granular description of the differences of such functionalists between an ORM and web2py. For example, in an ORM, you can define a method in a class that returns a value calculated from the fields of a database record. In the web2py DAL, this same functionality can be achieved using a virtual field or lazy field. There are no lazy-fields in web2py, and I find the terminology misleading - as I said - *lazyness *in the context of *database-access*, is a * deferred-query* - NOT a *deferred-calculation* of the *results *of a query. The difference is to profound to overlook. Deferred calculations of field-results are generally useless - web-applications are generally I/O-Bound much more than CPU-Bound - so the benefits of deferring is mute in post-query calculations compared to benefits in deferred-queries that are used within the context of transaction-operation-optimizations - which is the context most people would thing of whenever they here the term *Lazy* thrown about a database-context, I don't know if the SQLA CORE has virtual fields, but if it doesn't, I would suppose it leaves this kind of functionality to the ORM. That's irellevant to the comparison of SQLA-Core vs. web2py-DAL, since I am not suggesting using the SQLA-Core and dumping it's ORM - quite the opposite - and since virtual-fields are actually much more beneficial when used within an ORM layer, as opposed to a DAL one. The only relevance for this point to this discussion, is the comparison of the sized of the code-bases. I get that this was what you meant. That looks like the definition of the SQLA ORM, not ORM's in general. Not quite - look ar my comment to Derek down below. No, let's not. My point is not that any of those items properly belong to either a DAL or an ORM, or that they can only be implemented with either a DAL or an ORM design pattern. Rather, you had claimed that the SQLA CORE is equivalent to the web2py DAL and that all 20,000+ lines of SQLA ORM code must therefore be providing unique functionality not available in the DAL (thus implying that the ORM must be useful). I was just suggesting that the DAL might be doing more than the SQLA CORE (at least in some areas), and that the DAL might possibly be offering some features for which you would otherwise need the SQLA ORM. You are saying that a lot of web2py's extra-features that extend on-top of the DAL, might not be included in SQLA's Core, but rather may represent a big portion of the 20K lines of code of the ORM, which would then suggest that the features I was excited about may actually represent a much minor portion of the 20K code-base, which would then suggest that they may be small, and therefore legitimate for being considered useless. You could have said so more clearly (like I just did) and prevent the confusion. Now, if you would have seen the lecture I gave Massimo the link to watch, you would have seen how complex these features might be, so I doubt they are implemented within a small code-base. But if they do, this would degrade your argument that this is such a substantial-investment as you called it... But In either case, I would go for trying to use what they wrote, long before I would consider re-inventing it... The point is, many features found in DAL's and ORM's are not unique or specific to the DAL or ORM design pattern. Each design pattern can be used to implement many common types of functionality (the functionality may not be identical, but at least similar, and used to satisfy the same goals). Regarding all the features you claim are inherently DAL features and not ORM features, I
[web2py] Re: ORM (?) : A Revisit, NOT a Rebuttal
On Wednesday, May 1, 2013 4:39:49 PM UTC-7, Anthony wrote: The first thing to notice here, is that an ORM object-attribute can contain NON-SCALAR values - meaning, a link to a list of other objects. There is no feature in web2py that generates such an object. The second thing to notice here, is that the attributes of an ORM object usually contain child-objects (plural) that represent fields from a different table than the parent-object. Again, there is no feature in web2py that can generate such an object. A JOIN operation may return row objects, each of which may contain sub-attributes that hold A SINGLE field-value from a foreign-table, but it is a *scalar-value* - NOT another * domain-entity-object* (with it's own attributes, etc.), NOR a SEQUENCE of *domain-entity objects* ... The crucial thing to notice here is that an ORM object-attribute can contain NON-SCALAR values - meaning, a link to a list of other objects, which themselves may contain links to other objects/sequences-of-objects, etc. Although ORM's may do that, such a feature is not unique to the ORM pattern. In the web2py DAL, for example, in a Row object with a reference to another table, the reference field is actually a DAL.Reference object, not a scalar value (it includes the scalar value but also allows access to related records in the referenced table). In this case it does not reference a set of DAL fields. Similarly, a Row object from a table that is referenced by another table includes an attribute that is a DAL.LazySet object (also not a scalar), which allows access to the records in the referencing table that reference the current Row object. I did not know that - what form of *Layziness* are we talking about here? Will it generate a query to fill-up the target rows? In any case, it is stil a reference to something the WOULD generate a Rows object - it is not a reference to an already-exising domain-object (which may then have references to othe domain-objects, etc. - all already within memory) object as is in ORMS The DAL also has list:-type fields, whose values are lists, including lists of DAL.Reference objects in the case of list:reference fields. That's interesting, but that is not exactl the same - list-fields need to be supported in the database, but in any case, it is not comparable to being linked to relation ally-stored primary-keys - which would be how it would be implemented in an ORM. Row objects can also include custom methods (i.e., lazy virtual fields) as well as virtual fields, which can contain complex objects. Relates to the comment I gave you a couple of minutes ago... These are complementary-auxiliary features (with in the web2py-implementation case, have questionable real-world-utility) which while they do go beyond a simple value, they are still scalar, as they ultimately result in a reference to a scalar-value - not a reference to a sequence of objects. Anthony -- --- You received this message because you are subscribed to the Google Groups web2py-users group. To unsubscribe from this group and stop receiving emails from it, send an email to web2py+unsubscr...@googlegroups.com. For more options, visit https://groups.google.com/groups/opt_out.
[web2py] Re: ORM (?) : A Revisit, NOT a Rebuttal
So, when I say the DAL implements features that might otherwise be found in a typical ORM, I am not saying the DAL implements an ORM design pattern, just that it replicates functionality for which you might otherwise use an ORM. No, it does not do that. It implements very different functionality, that may have a similar API and the same terminology used, which honestly I find quite confusing - border-line misleading. If by functionality you mean doing the exact same thing in the exact same way, then there aren't even two ORM's that can be said to have the same functionality. So what's your point -- that no two distinct software libraries can be said to have similar functionality? I don't know how many ways I can try to make the same point, so I guess I'll try one more time. By functionality, I'm thinking of things like querying a database to retrieve some records and converting them to a format that can be used in Python, and inserting records in a database, and updating records in a database. These are things you can do with a DAL and things you can do with an ORM. There are many other such things. They are not implemented in the same fashion, nor are they executed using the same abstractions within the application code, but they achieve similar goals. So, you can use the web2py DAL to do things that you might otherwise do with an ORM. For example, in an ORM, you can define a method in a class that returns a value calculated from the fields of a database record. In the web2py DAL, this same functionality can be achieved using a virtual field or lazy field. There are no lazy-fields in web2py, and I find the terminology misleading - as I said - *lazyness *in the context of *database-access*, is a * deferred-query* - NOT a *deferred-calculation* of the *results *of a query. In the context of Row fields, the term lazy means that the value is filled in sometime after creation (typically at access time). Whether that lazy filling in involves database access or not depends on the nature of the field. If the field is simply a virtual field calculated based on the values of other fields in the Row, then there is no sense in which you would be deferring database access, as the value is not stored in the database. In that case, you are simply deferring calculation. On the other hand, reference fields do allow one to access the referenced record, and the database access in that case is in fact deferred. Likewise, Row objects can include LazySet attributes that defer database access of referencing records. Finally, a web2py virtual field or method field can do whatever you want it to do, including deferred database queries of any sort. So yes, there *are* lazy fields in web2py, both of the deferred-calculation type and the deferred-database-access type. In any case, they're not officially called lazy fields in the API -- that's just a term that is commonly used. Deferred calculations of field-results are generally useless - web-applications are generally I/O-Bound much more than CPU-Bound - so the benefits of deferring is mute in post-query calculations compared to benefits in deferred-queries that are used within the context of transaction-operation-optimizations Deferring calculations is certainly not useless, and you may even care more about being CPU-bound than I/O-bound if you're using an async web server, but even if we stipulate the above, that still doesn't change the definition of the word lazy. which is the context most people would thing of whenever they here the term *Lazy* thrown about a database-context I don't know -- sounds like an empirical question. At least within the web2py community, though, I think the term is understood. I don't know if the SQLA CORE has virtual fields, but if it doesn't, I would suppose it leaves this kind of functionality to the ORM. That's irellevant to the comparison of SQLA-Core vs. web2py-DAL, since I am not suggesting using the SQLA-Core and dumping it's ORM I completely agree, which is why that point had nothing to do with a comparison of the DAL to SQLA Core. If you will recall, the point was that the DAL includes some functionality for which you might otherwise use an ORM. If the SQLA Core doesn't have virtual fields, then you need to jump to the ORM for that functionality. Hence, some of the 20,000+ lines of SQLA ORM code are for generating functionality already available via other means in the DAL -- hence, the sheer size of the SQLA ORM does not necessarily imply a high degree of usefulness over and above what you can do with the DAL. - quite the opposite - and since virtual-fields are actually much more beneficial when used within an ORM layer, as opposed to a DAL one. I have no idea how you can justify that claim. The only relevance for this point to this discussion, is the comparison of the sized of the code-bases. I get that this was what
[web2py] Re: ORM (?) : A Revisit, NOT a Rebuttal
Although ORM's may do that, such a feature is not unique to the ORM pattern. In the web2py DAL, for example, in a Row object with a reference to another table, the reference field is actually a DAL.Reference object, not a scalar value (it includes the scalar value but also allows access to related records in the referenced table). In this case it does not reference a set of DAL fields. I'm not sure what you mean. A reference field references records, not fields. Similarly, a Row object from a table that is referenced by another table includes an attribute that is a DAL.LazySet object (also not a scalar), which allows access to the records in the referencing table that reference the current Row object. I did not know that - what form of *Layziness* are we talking about here? Will it generate a query to fill-up the target rows? In any case, it is stil a reference to something the WOULD generate a Rows object - it is not a reference to an already-exising domain-object (which may then have references to othe domain-objects, etc. - all already within memory) object as is in ORMS Are you saying that when you select a set of records that include reference fields, the ORM automatically selects all the referenced records (and any records they may reference, and so on) and stores them in memory, even if you have not requested that? That sounds inefficient. The DAL also has list:-type fields, whose values are lists, including lists of DAL.Reference objects in the case of list:reference fields. That's interesting, but that is not exactl the same - list-fields need to be supported in the database, but in any case, it is not comparable to being linked to relation ally-stored primary-keys - which would be how it would be implemented in an ORM. No, list fields do not have to be supported in the database (they are stored as strings) -- they are an abstraction provided by the DAL. list:reference fields do in fact store a list of primary keys (in fact, a list of objects that include the primary keys and know how to retrieve the associated records). web2py also has JSON fields, which I would say does not count as a scalar either. Row objects can also include custom methods (i.e., lazy virtual fields) as well as virtual fields, which can contain complex objects. Relates to the comment I gave you a couple of minutes ago... These are complementary-auxiliary features (with in the web2py-implementation case, have questionable real-world-utility) which while they do go beyond a simple value, they are still scalar, as they ultimately result in a reference to a scalar-value - not a reference to a sequence of objects. No, you can define a virtual field whose value is any custom complex Python object you like, with its own methods, that may do or return whatever you like. They need not reference or return a scalar value. This is not a mere auxiliary feature but something that allows you to replicate functionality you might otherwise find in an ORM class. Anthony -- --- You received this message because you are subscribed to the Google Groups web2py-users group. To unsubscribe from this group and stop receiving emails from it, send an email to web2py+unsubscr...@googlegroups.com. For more options, visit https://groups.google.com/groups/opt_out.
[web2py] Re: ORM (?) : A Revisit, NOT a Rebuttal
So, let me see if I get this straight - I'll describe how I understand things, so stop me when I start speaking non-sense: Web2py's DAL is stateless, and hens is not designed for object-oriented style of data-modeling. Whenever you create an object to represent a record, you are creating a statefull scenario - you now have to manage the internal-state of that object, and keep it synchronized with the rest of the transaction. By restricting the support of the DAL from using such a pattern, web2py avoids a whole class of problems prevalent in ORMs that are implemented via the Active-Record pattern. Rows are not comparable to ORM-result-objects, as they are not supposed to be treated as statefull - they are mere convenience-wrappers over a dictionary for attribute-access usage for getting field-results. This goes well with web2py's execution model, as it does not intend to be statefull, in the sense that models and controllers are 'executed' on each request-session, and hence Rows are not meant to be treated as objects that could persist in memory across request-sessions. However, additional convenience-functions are present in a Row object, for thread-local changes that can be done on it, without it updating-back to the current transaction automatically. An explicit-update is thereupon expected to be taken by the controller-code, in order to update the changes for the current database-transaction. Using the DAL, transaction-creation is handled implicitly by the database-itself (if there are no active-transactions in the current db-connection-session and then the connection is used), but committing of a transaction is handled explicitly by the DAL, either automatically at request-session-end, and/or manually by a call to do so on the db-connection object. By designing it this way, web2py is leaving the job of transaction-management to the database itself, and avoids many issues that might arise if it would have tried to manage the transaction itself. Given all that, the DAL's design is ill-suited for an OOP style of coding the data-models, as whenever an object is instantiated, it becomes statefull, and hence requires some mechanism to manage it's state - even if it's lifetime is bound to a single transaction and/or connection-session - a mechanism that web2py's DAL is not providing - by design - in order to avoid a slew of issues that could arise from mismanaging these object's state. The Row object may seem as though it has it's own internal-state-management functionality, but this is only for short-lived manipulations, for aggregating changes into a single update operation within the transaction, for performance improvement. It's state-management is not meant to be used like it is used in ORMs. Am I with you so far? Anything I got wrong? -- --- You received this message because you are subscribed to the Google Groups web2py-users group. To unsubscribe from this group and stop receiving emails from it, send an email to web2py+unsubscr...@googlegroups.com. For more options, visit https://groups.google.com/groups/opt_out.
Re: [web2py] Re: ORM (?) : A Revisit, NOT a Rebuttal
I love the DAL since I find it's functional programming approach very lean without the need to switch back and forth from OO to relational reasoning :-) Moving from a mapping object (the row) to a complex custom object is much easier than it seems at the python level. ORMs on the other side tend to build objects their own style and you have to live with it. In static languages (java, C#, C++) ORM [anti]pattern actually seems a benefit. On dynamic languages like python there is no need of ORM, but there are many since it is easier and fun to develop them. Implementing *your* ORM for *your* application can be done in few lines of code, very readable and easy to maintain. The following blog entry shares much of my idea of ORM in real applications with large http://seldo.com/weblog/2011/06/15/orm_is_an_antipattern My 4 points where I think that DAL can be (and so has to be ;-) ) improved: 1. lazy rows being backed by a sql cursor when requested by the user; 2. since point 1. is not always the solution, row object instance creation must be way faster; 3. a simple way to support db indexes even at a basic level; 4. sql view/readonly tables support (this could be related to cacheable tables). mic 2013/4/30 Massimo Di Pierro massimo.dipie...@gmail.com Hello Arnon, In web2py all DAL operations are wrapped in a transaction which can be committed explicitly via db.commit() or automatically at the end of an http request. It does not suffer from the problems of Active Records mentioned in the slides where each object is committed separately and thus things can get out of sync. In web2py objects are not proxies, as they are in SQLA. The DAL is simply an abstraction to generate SQL for you. the DAL api (insert/delete/update/select) are mapped into the corresponding SQL statements and executed when called, not deferred. Transactions are handled at the database level. This has pros and cons. The pro is that the developer knows when the DB IO is performed. In the SQLA ORM because objects are proxies and DB IO is always lazy, you do not know when DB IO is performed. I do not like that. Yes this is cool from an implementation point of view but still I need to understand what benefit it provides. It does not seem to provide a performance benefit. It does provide a consistency benefit but the consistency problem only exist if the object persists beyond the session in which it is defined. In DAL records are not objects, they are dictionaries, and they only exist for the life of an HTTP request therefore it is not clear one gains anything from the extra complexity. In other words, in my view, the consistency problem is a database problem. ACID and Transactions solve it. Some ORM move the problem at the application level because of the imperfect map between SQL API and Object API. This creates a problem and they have to jump hoops to solve it. Sessions partially solve the problem the ORM created. Perhaps I am missing something. Massimo On Monday, 29 April 2013 20:17:44 UTC-5, Arnon Marcus wrote: So, clean-slake here - who is excited about SQLA's unit-of-work/session implementation? http://youtu.be/uAtaKr5HOdAhttp://pyvideo.org/video/1767/the-sqlalchemy-session-in-depth-0 * The slides for it are here: http://techspot.zzzeek.org/**files/2012/session.key.pdfhttp://techspot.zzzeek.org/files/2012/session.key.pdf * Thw second-half shows an interactive walk-though of it, using HTM5, which you can manually interact with yourself using this: http://techspot.zzzeek.org/**files/2012/session.html.tar.gzhttp://techspot.zzzeek.org/files/2012/session.html.tar.gz How much of that is the DAL doing? How does it map to it? Would it be correct to say that a db-connection is akin to an SQLA-session? I have gone through the DAL documentation again, and I've seen glimpses of parts of this, but the whole auto-querying-on-attribute-**access - with implicit-transaction-caching - is a really cool feature. Can I do db.person[1].name and have it query the database if and only if this value was not already queried in the current transaction? I saw that it auto-commits at the end of each request/response session, right? So, this is the DB-transaction view that is committed, right? So, if I manually-commit - it automatically starts a new transaction? If I get a row object, then run the same query again - will I get the same row-object internally? I mean, does the DAL do this cool identity-map thing internally? I'm thinking about this whole dirty-checking/invalidation thing - it seems crucial for enabling orm-like access to the records (meaning, auto-query-on-access). We could emulate that in an abstraction-layer - I think this is what I am after. Am I being more clear now? With these features in the DAL, I we can pass-around the db object from controller-actions to custom-modules, instantiating it's classes with it - which would be the equivalent of passing the
Re: [web2py] Re: ORM (?) : A Revisit, NOT a Rebuttal
put them on the roadmap ^_^ as for 3. it's just a matter of finding the right syntax for every db engine. I'm not sure about 4., it's available right now with migrate=False ? On Tuesday, April 30, 2013 12:50:04 PM UTC+2, Michele Comitini wrote: I love the DAL since I find it's functional programming approach very lean without the need to switch back and forth from OO to relational reasoning :-) Moving from a mapping object (the row) to a complex custom object is much easier than it seems at the python level. ORMs on the other side tend to build objects their own style and you have to live with it. In static languages (java, C#, C++) ORM [anti]pattern actually seems a benefit. On dynamic languages like python there is no need of ORM, but there are many since it is easier and fun to develop them. Implementing *your* ORM for *your* application can be done in few lines of code, very readable and easy to maintain. The following blog entry shares much of my idea of ORM in real applications with large http://seldo.com/weblog/2011/06/15/orm_is_an_antipattern My 4 points where I think that DAL can be (and so has to be ;-) ) improved: 1. lazy rows being backed by a sql cursor when requested by the user; 2. since point 1. is not always the solution, row object instance creation must be way faster; 3. a simple way to support db indexes even at a basic level; 4. sql view/readonly tables support (this could be related to cacheable tables). mic 2013/4/30 Massimo Di Pierro massimo@gmail.com javascript: Hello Arnon, In web2py all DAL operations are wrapped in a transaction which can be committed explicitly via db.commit() or automatically at the end of an http request. It does not suffer from the problems of Active Records mentioned in the slides where each object is committed separately and thus things can get out of sync. In web2py objects are not proxies, as they are in SQLA. The DAL is simply an abstraction to generate SQL for you. the DAL api (insert/delete/update/select) are mapped into the corresponding SQL statements and executed when called, not deferred. Transactions are handled at the database level. This has pros and cons. The pro is that the developer knows when the DB IO is performed. In the SQLA ORM because objects are proxies and DB IO is always lazy, you do not know when DB IO is performed. I do not like that. Yes this is cool from an implementation point of view but still I need to understand what benefit it provides. It does not seem to provide a performance benefit. It does provide a consistency benefit but the consistency problem only exist if the object persists beyond the session in which it is defined. In DAL records are not objects, they are dictionaries, and they only exist for the life of an HTTP request therefore it is not clear one gains anything from the extra complexity. In other words, in my view, the consistency problem is a database problem. ACID and Transactions solve it. Some ORM move the problem at the application level because of the imperfect map between SQL API and Object API. This creates a problem and they have to jump hoops to solve it. Sessions partially solve the problem the ORM created. Perhaps I am missing something. Massimo On Monday, 29 April 2013 20:17:44 UTC-5, Arnon Marcus wrote: So, clean-slake here - who is excited about SQLA's unit-of-work/session implementation? http://youtu.be/uAtaKr5HOdAhttp://pyvideo.org/video/1767/the-sqlalchemy-session-in-depth-0 * The slides for it are here: http://techspot.zzzeek.org/**files/2012/session.key.pdfhttp://techspot.zzzeek.org/files/2012/session.key.pdf * Thw second-half shows an interactive walk-though of it, using HTM5, which you can manually interact with yourself using this: http://techspot.zzzeek.org/**files/2012/session.html.tar.gzhttp://techspot.zzzeek.org/files/2012/session.html.tar.gz How much of that is the DAL doing? How does it map to it? Would it be correct to say that a db-connection is akin to an SQLA-session? I have gone through the DAL documentation again, and I've seen glimpses of parts of this, but the whole auto-querying-on-attribute-**access - with implicit-transaction-caching - is a really cool feature. Can I do db.person[1].name and have it query the database if and only if this value was not already queried in the current transaction? I saw that it auto-commits at the end of each request/response session, right? So, this is the DB-transaction view that is committed, right? So, if I manually-commit - it automatically starts a new transaction? If I get a row object, then run the same query again - will I get the same row-object internally? I mean, does the DAL do this cool identity-map thing internally? I'm thinking about this whole dirty-checking/invalidation thing - it seems crucial for enabling orm-like access to the records (meaning,
Re: [web2py] Re: ORM (?) : A Revisit, NOT a Rebuttal
The following blog entry shares much of my idea of ORM in real applications with large http://seldo.com/weblog/2011/06/15/orm_is_an_antipattern I read this article about a week ago. It is interesting, but I find 2 fundamental problems with it - it assumes premises that are termed as an either-or-type scenarios: - You either have a relational-data-model, or you don't. - You either use an abstraction-layer, or you don't. These are idealistic-premises that do not map to reality. In reality, you have a hybrid-data-model, with some areas that are more relational than others - all co-existing within a single database. In reality, you almost always end-up dealing with a framework using multiple abstraction layers, at different areas of your application, depending on how well a given use-case maps to higher-layers of abstractions that the framework provides. Another MAJOR issue, with basically every critique I have read about ORMs, is that they almost EXCLUSIVELY target the Active-Record pattern. The entire first half of the SQLA lecture I posted here, is talking explicitly about this, and goes into great detail in describing the problems inherited in the Active-Record pattern. Another SQLA lecture I put here, is discussion exactly about the reality of abstraction-layer-usages that I have described, in terms of a hybrid-approach. It also shows how a layered-approach has benefits that surpass the shortcomings of having to learn multiple layers (a problem that the article touches upon). Contrary to ANY ORM out there, SQLA provides BOTH a relational-layer AS-WELL as an ORM-layer on-top, both of which use the same SQL-Expression syntax, using python-objects: - No need for a 3'rd language like Java-Hibernate's HQL. - No need for manual SQL witing. - SQLA-DAL (SQL-Expression layer) is EXPLICITLY uses throughout the build-up of ORM classes by the developer, and is NOT assumed to be treated as a hidden layer. - SQLA-ORM is NOT using the Active-Record pattern (so basically ALL criticisms you generally read about ORMs, do NOT apply...) - SQLA-ORM layer is OPTIONAL(!), and even if you DON'T use it, you STILL DON'T have to write SQL(!) You use SQL-Expressions, which is ALMOST IDENTICAL to web2py's DAL (!) Basically, almost all of that article is completely irrelevant to SQLA, and hence to the model that I am proposing here. As to OOP, again, it is almost exclusively-criticized in respect of the Active-Record pattern (which again, is irrelevant here...) SQLA's ORM, and hence what I am proposing, goes in line with the article's proposals near the end, which is: For relational-models, go with a more direct-SQL layer for using the data, and build business-logic layers on-top. Most of what SQLA's ORM is providing, is automation-tools, for helping the build-up of business-logic classes on-top of a pythonic-SQL-abstraction-layer. My 4 points where I think that DAL can be (and so has to be ;-) ) improved: 1. lazy rows being backed by a sql cursor when requested by the user; 2. since point 1. is not always the solution, row object instance creation must be way faster; 3. a simple way to support db indexes even at a basic level; 4. sql view/readonly tables support (this could be related to cacheable tables). I would add: 5. A configurable mechanism based on the Unit-Of-Work pattern. -- --- You received this message because you are subscribed to the Google Groups web2py-users group. To unsubscribe from this group and stop receiving emails from it, send an email to web2py+unsubscr...@googlegroups.com. For more options, visit https://groups.google.com/groups/opt_out.
[web2py] Re: ORM (?) : A Revisit, NOT a Rebuttal
This is nice, but you're still talking about very general features. It would be more helpful if you could explain why we need those features. Perhaps you could show in SQLA how you would specify and interact with a particular data model, and then explain why you think that is better/easier than how you would do it using the DAL. Anthony On Monday, April 29, 2013 9:17:44 PM UTC-4, Arnon Marcus wrote: So, clean-slake here - who is excited about SQLA's unit-of-work/session implementation? http://youtu.be/uAtaKr5HOdAhttp://pyvideo.org/video/1767/the-sqlalchemy-session-in-depth-0 * The slides for it are here: http://techspot.zzzeek.org/files/2012/session.key.pdf * Thw second-half shows an interactive walk-though of it, using HTM5, which you can manually interact with yourself using this: http://techspot.zzzeek.org/files/2012/session.html.tar.gz How much of that is the DAL doing? How does it map to it? Would it be correct to say that a db-connection is akin to an SQLA-session? I have gone through the DAL documentation again, and I've seen glimpses of parts of this, but the whole auto-querying-on-attribute-access - with implicit-transaction-caching - is a really cool feature. Can I do db.person[1].name and have it query the database if and only if this value was not already queried in the current transaction? I saw that it auto-commits at the end of each request/response session, right? So, this is the DB-transaction view that is committed, right? So, if I manually-commit - it automatically starts a new transaction? If I get a row object, then run the same query again - will I get the same row-object internally? I mean, does the DAL do this cool identity-map thing internally? I'm thinking about this whole dirty-checking/invalidation thing - it seems crucial for enabling orm-like access to the records (meaning, auto-query-on-access). We could emulate that in an abstraction-layer - I think this is what I am after. Am I being more clear now? With these features in the DAL, I we can pass-around the db object from controller-actions to custom-modules, instantiating it's classes with it - which would be the equivalent of passing the session object in SQLA. This way, we can build classes that provide attribute-accessors that proxy DAL-SET-objects, and include implicit-cashing with a memoizer, and even go farther and do lazy-loaders with deferred-query classes. What do you say? -- --- You received this message because you are subscribed to the Google Groups web2py-users group. To unsubscribe from this group and stop receiving emails from it, send an email to web2py+unsubscr...@googlegroups.com. For more options, visit https://groups.google.com/groups/opt_out.
[web2py] Re: ORM (?) : A Revisit, NOT a Rebuttal
Yes. In other words: we do not try introduce problems for the purpose of solving them. ;-) In my opinion, the job of keeping a consistent state is with the database (and its acid model) not with the API. On Tuesday, 30 April 2013 05:11:56 UTC-5, Arnon Marcus wrote: So, let me see if I get this straight - I'll describe how I understand things, so stop me when I start speaking non-sense: Web2py's DAL is stateless, and hens is not designed for object-oriented style of data-modeling. Whenever you create an object to represent a record, you are creating a statefull scenario - you now have to manage the internal-state of that object, and keep it synchronized with the rest of the transaction. By restricting the support of the DAL from using such a pattern, web2py avoids a whole class of problems prevalent in ORMs that are implemented via the Active-Record pattern. Rows are not comparable to ORM-result-objects, as they are not supposed to be treated as statefull - they are mere convenience-wrappers over a dictionary for attribute-access usage for getting field-results. This goes well with web2py's execution model, as it does not intend to be statefull, in the sense that models and controllers are 'executed' on each request-session, and hence Rows are not meant to be treated as objects that could persist in memory across request-sessions. However, additional convenience-functions are present in a Row object, for thread-local changes that can be done on it, without it updating-back to the current transaction automatically. An explicit-update is thereupon expected to be taken by the controller-code, in order to update the changes for the current database-transaction. Using the DAL, transaction-creation is handled implicitly by the database-itself (if there are no active-transactions in the current db-connection-session and then the connection is used), but committing of a transaction is handled explicitly by the DAL, either automatically at request-session-end, and/or manually by a call to do so on the db-connection object. By designing it this way, web2py is leaving the job of transaction-management to the database itself, and avoids many issues that might arise if it would have tried to manage the transaction itself. Given all that, the DAL's design is ill-suited for an OOP style of coding the data-models, as whenever an object is instantiated, it becomes statefull, and hence requires some mechanism to manage it's state - even if it's lifetime is bound to a single transaction and/or connection-session - a mechanism that web2py's DAL is not providing - by design - in order to avoid a slew of issues that could arise from mismanaging these object's state. The Row object may seem as though it has it's own internal-state-management functionality, but this is only for short-lived manipulations, for aggregating changes into a single update operation within the transaction, for performance improvement. It's state-management is not meant to be used like it is used in ORMs. Am I with you so far? Anything I got wrong? -- --- You received this message because you are subscribed to the Google Groups web2py-users group. To unsubscribe from this group and stop receiving emails from it, send an email to web2py+unsubscr...@googlegroups.com. For more options, visit https://groups.google.com/groups/opt_out.
Re: [web2py] Re: ORM (?) : A Revisit, NOT a Rebuttal
2013/4/30 Niphlod niph...@gmail.com put them on the roadmap ^_^ as for 3. it's just a matter of finding the right syntax for every db engine. I'm not sure about 4., it's available right now with migrate=False ? 4. migrate False is on the DDL side and you are right about that, but I'd like it also on DML part i.e. in postgresql doing an INSERT on a VIEW is not allowed. On Tuesday, April 30, 2013 12:50:04 PM UTC+2, Michele Comitini wrote: I love the DAL since I find it's functional programming approach very lean without the need to switch back and forth from OO to relational reasoning :-) Moving from a mapping object (the row) to a complex custom object is much easier than it seems at the python level. ORMs on the other side tend to build objects their own style and you have to live with it. In static languages (java, C#, C++) ORM [anti]pattern actually seems a benefit. On dynamic languages like python there is no need of ORM, but there are many since it is easier and fun to develop them. Implementing *your* ORM for *your* application can be done in few lines of code, very readable and easy to maintain. The following blog entry shares much of my idea of ORM in real applications with large http://seldo.com/weblog/2011/**06/15/orm_is_an_antipatternhttp://seldo.com/weblog/2011/06/15/orm_is_an_antipattern My 4 points where I think that DAL can be (and so has to be ;-) ) improved: 1. lazy rows being backed by a sql cursor when requested by the user; 2. since point 1. is not always the solution, row object instance creation must be way faster; 3. a simple way to support db indexes even at a basic level; 4. sql view/readonly tables support (this could be related to cacheable tables). mic 2013/4/30 Massimo Di Pierro massimo@gmail.com Hello Arnon, In web2py all DAL operations are wrapped in a transaction which can be committed explicitly via db.commit() or automatically at the end of an http request. It does not suffer from the problems of Active Records mentioned in the slides where each object is committed separately and thus things can get out of sync. In web2py objects are not proxies, as they are in SQLA. The DAL is simply an abstraction to generate SQL for you. the DAL api (insert/delete/update/select) are mapped into the corresponding SQL statements and executed when called, not deferred. Transactions are handled at the database level. This has pros and cons. The pro is that the developer knows when the DB IO is performed. In the SQLA ORM because objects are proxies and DB IO is always lazy, you do not know when DB IO is performed. I do not like that. Yes this is cool from an implementation point of view but still I need to understand what benefit it provides. It does not seem to provide a performance benefit. It does provide a consistency benefit but the consistency problem only exist if the object persists beyond the session in which it is defined. In DAL records are not objects, they are dictionaries, and they only exist for the life of an HTTP request therefore it is not clear one gains anything from the extra complexity. In other words, in my view, the consistency problem is a database problem. ACID and Transactions solve it. Some ORM move the problem at the application level because of the imperfect map between SQL API and Object API. This creates a problem and they have to jump hoops to solve it. Sessions partially solve the problem the ORM created. Perhaps I am missing something. Massimo On Monday, 29 April 2013 20:17:44 UTC-5, Arnon Marcus wrote: So, clean-slake here - who is excited about SQLA's unit-of-work/session implementation? http://youtu.be/uAtaKr5HOdAhttp://pyvideo.org/video/1767/the-sqlalchemy-session-in-depth-0 * The slides for it are here: http://techspot.zzzeek.org/**fil**es/2012/session.key.pdfhttp://techspot.zzzeek.org/files/2012/session.key.pdf * Thw second-half shows an interactive walk-though of it, using HTM5, which you can manually interact with yourself using this: http://techspot.zzzeek.org/**fil**es/2012/session.html.tar.gzhttp://techspot.zzzeek.org/files/2012/session.html.tar.gz How much of that is the DAL doing? How does it map to it? Would it be correct to say that a db-connection is akin to an SQLA-session? I have gone through the DAL documentation again, and I've seen glimpses of parts of this, but the whole auto-querying-on-attribute-**acc**ess - with implicit-transaction-caching - is a really cool feature. Can I do db.person[1].name and have it query the database if and only if this value was not already queried in the current transaction? I saw that it auto-commits at the end of each request/response session, right? So, this is the DB-transaction view that is committed, right? So, if I manually-commit - it automatically starts a new transaction? If I get a row object, then run the same query again - will I get the same row-object internally? I
Re: [web2py] Re: ORM (?) : A Revisit, NOT a Rebuttal
so just a read_me_only kind of check where if you even try to load a record in an edit form web2py complains ? or just complain before any insert(), update() is tried on any read_me_only fake table (i.e. do those always in a try:except block or do before an assert is_not_fake before write operations)? not sure cause I never had that requirement (and I don't have python right now), but wouldn't the same thing being accomplished by writable=False on all columns and using validate_and_update(), validate_and_insert() - as you should if you're unsure about the table model, i.e. you don't know if it's a table or a view ? I'm a little unsure about the performance penalty on checking this kind of things every time when they can be prevented just knowing that that particular collection of entities is not writable ^_^ PS: some backends allow specially coded views to be updateable and deletable.. On Tuesday, April 30, 2013 5:35:09 PM UTC+2, Michele Comitini wrote: 2013/4/30 Niphlod nip...@gmail.com javascript: put them on the roadmap ^_^ as for 3. it's just a matter of finding the right syntax for every db engine. I'm not sure about 4., it's available right now with migrate=False ? 4. migrate False is on the DDL side and you are right about that, but I'd like it also on DML part i.e. in postgresql doing an INSERT on a VIEW is not allowed. -- --- You received this message because you are subscribed to the Google Groups web2py-users group. To unsubscribe from this group and stop receiving emails from it, send an email to web2py+unsubscr...@googlegroups.com. For more options, visit https://groups.google.com/groups/opt_out.
Re: [web2py] Re: ORM (?) : A Revisit, NOT a Rebuttal
2013/4/30 Niphlod niph...@gmail.com so just a read_me_only kind of check where if you even try to load a record in an edit form web2py complains ? or just complain before any insert(), update() is tried on any read_me_only fake table (i.e. do those always in a try:except block or do before an assert is_not_fake before write operations)? not sure cause I never had that requirement (and I don't have python right now), but wouldn't the same thing being accomplished by writable=False on all columns and using validate_and_update(), validate_and_insert() - as you should if you're unsure about the table model, i.e. you don't know if it's a table or a view ? I'm a little unsure about the performance penalty on checking this kind of things every time when they can be prevented just knowing that that particular collection of entities is not writable ^_^ After thinking a bit maybe more useful thing is to have a new property in the table to allow simple introspection. for t in db.tables: if db[t].table_type == view: print %s is not writable! PS: some backends allow specially coded views to be updateable and deletable. PostgreSQL allows you to do the worst things by defining rules ... alas real world is a nasty place... ;-) On Tuesday, April 30, 2013 5:35:09 PM UTC+2, Michele Comitini wrote: 2013/4/30 Niphlod nip...@gmail.com put them on the roadmap ^_^ as for 3. it's just a matter of finding the right syntax for every db engine. I'm not sure about 4., it's available right now with migrate=False ? 4. migrate False is on the DDL side and you are right about that, but I'd like it also on DML part i.e. in postgresql doing an INSERT on a VIEW is not allowed. -- --- You received this message because you are subscribed to the Google Groups web2py-users group. To unsubscribe from this group and stop receiving emails from it, send an email to web2py+unsubscr...@googlegroups.com. For more options, visit https://groups.google.com/groups/opt_out. -- --- You received this message because you are subscribed to the Google Groups web2py-users group. To unsubscribe from this group and stop receiving emails from it, send an email to web2py+unsubscr...@googlegroups.com. For more options, visit https://groups.google.com/groups/opt_out.
[web2py] Re: ORM (?) : A Revisit, NOT a Rebuttal
On Tuesday, April 30, 2013 3:42:23 PM UTC+3, Anthony wrote: This is nice, but you're still talking about very general features. It would be more helpful if you could explain why we need those features. Perhaps you could show in SQLA how you would specify and interact with a particular data model, and then explain why you think that is better/easier than how you would do it using the DAL. Well, I propose adding a set of tools for building business-model classes on-top of the DAL. I basically mean implementing a 'Unif-Of-Work', and an 'identity-map' as explained in the link(s) I provided. Now, what does this mean... Well, here is what it does NOT mean - it should NOT lead the developer to implement an Active-Record model. Class-attributes representing records, should NOT 'save()' whenever you set them, and should NOT *automatically* 'load()' whenever you access them. That said, the developer should also NOT have to manage the 'order-of-operations' by himself - this should be automated (read: auto-inferred from the schema'). Instead, when an attribute is being set, it should marks itself as 'dirty', and would be 'pushed' to the database later. There should also be a way to tell the class-attributes, which other attributes of which other classes would be affected by it being modified (SQLA calls it the 'relationship-object'). Then, there should be a procedure that uses this information, and while keeping track of what's going on inside the transaction, whenever a record is changed somehow, it 'discovers' which other records participating in the current transaction, might be effected by that change to that attribute, and invalidates their caches. The idea is that there is a mechanism that makes sure that if there are pending-changes of other attributes that needs to be pushed 'before' the attribute-being accessed is valid, then the push-operations are executed before the attribute refreshes itself. This is done by linking events between attributes, and propagating-them on attribute-access (SQLA calls this a 'cascade'). ** The process of marking relationships may be further automated even beyond what exists in SQLA, by analyzing the db-schema itself, and 'inferring' relationships bi-directionally, and wiring all the events automatically at class-instantiation-time.* If there are no such pending-changes, than the attribute being accessed should be assumed to be valid (so it's caches value is returned), unless it was previously invalidated by a transaction-commit. Every transaction-commit should invalidate all existing records that are in memory. This is basically the 'unit-of-work' pattern (if I understood correctly) ** But you should really watch the lectures I posted - they probably explain this much better than I did...* What are the benefits? You should ask? Well, there are performance-improvements with the caching mechanism, and with the aggregation of operations - certain change-operations are only pushed to the transaction-view, when they are actually needed, or at the end of the transaction. There may also be consistency-benefits, by insuring correct order-of-operations. Lastly, the main benefit is automatic-handling of caching and ordering or operations, that the developer no-longer needs to take care of himself. The benefit is not just for simple-data-modes, on the contrary - the more complex the data-mode, the more this becomes beneficial, as the automatic-detection of relationship-dependencies, and auto-cascade of operations, can both save brain-cycles from the developer trying to hold the whole schema in his head and make sure he pushes things in the correct order, and also prevent human-errors that can mess-up the database in cases that constrains are insufficient, and the developer overlooked some relationship-dependencies and was using stale-data without knowing it. Then there is the 'identity-map': This is a complementary-needed feature, to maintain consistency across results of the same records taken in different places in the code within the same transaction. It makes sure that there can be one-and-only-one instance or a record in-memory withing a single transaction of a single db-session. What are the benefits here? You ask? First of all, there's a consistency issue that can arise without such a system, if the developer is manually controlling the order-of-operations, and makes a mistake. This is not an issue that is exclusive for Active-Record patterns - it can also happen in web2py's DAL. For example, lets take the following code: def setItemName(id, name): row = db.item[id].select() row.update_record(name=name) def getItemByName(name): return db.item(db.item.name==name).select() id = 42 ... row = db.item[id].select() row.name='some name' # or row[name]='some name' or row.update(name='some name') ... def setItemName(id, 'some other name') getItemByName(row.name) As you can see, the last function-call would
[web2py] Re: ORM (?) : A Revisit, NOT a Rebuttal
What are the benefits? You should ask? Well, there are performance-improvements with the caching mechanism, and with the aggregation of operations - certain change-operations are only pushed to the transaction-view, when they are actually needed, or at the end of the transaction. I suppose there could be in some cases, but to avoid multiple updates to the same record, you can apply changes to the Row object and then call row.update_record() when finished. On the other hand, there may be cases where you want separate updates in order to trigger separate database triggers or DAL callbacks. Anyway, before investing lots of time building (and subsequently maintaining) a rather sophisticated ORM, it would be nice to know what kind of performance improvements can be expected and in what circumstances. In other words, how often do cases come up where the DAL is much less efficient than SQLA, and how difficult would it be to write equally efficient DAL code in those cases. Lastly, the main benefit is automatic-handling of caching and ordering or operations, that the developer no-longer needs to take care of himself. The benefit is not just for simple-data-modes, on the contrary - the more complex the data-mode, the more this becomes beneficial, as the automatic-detection of relationship-dependencies, and auto-cascade of operations, can both save brain-cycles from the developer trying to hold the whole schema in his head and make sure he pushes things in the correct order, and also prevent human-errors that can mess-up the database in cases that constrains are insufficient, and the developer overlooked some relationship-dependencies and was using stale-data without knowing it. Do you have an example of the above (i.e., where use of the DAL is complicated and prone to error but SQLA handles things automatically)? For example, lets take the following code: def setItemName(id, name): row = db.item[id].select() row.update_record(name=name) def getItemByName(name): return db.item(db.item.name==name).select() id = 42 ... row = db.item[id].select() row.name='some name' # or row[name]='some name' or row.update(name='some name') ... def setItemName(id, 'some other name') getItemByName(row.name) As you can see, the last function-call would either fail, or silently return something other than what was asked for. So, obviously you wouldn't do it that way with the DAL, as that isn't how it works. But how important is it to have this pattern? Once you've finished manipulating the record, you can do row.update_record(), and then any subsequent select will see the changes. It seems like this would be useful only if you need to do a read in between two separate updates to the record and you need the read to see the first update (and for some reason you can't just look at the existing Row object itself). Is this a common enough use case to justify replicating the SQLA ORM on top of the DAL? Do you have a real-world example where this would be a critical feature? Anthony -- --- You received this message because you are subscribed to the Google Groups web2py-users group. To unsubscribe from this group and stop receiving emails from it, send an email to web2py+unsubscr...@googlegroups.com. For more options, visit https://groups.google.com/groups/opt_out.
[web2py] Re: ORM (?) : A Revisit, NOT a Rebuttal
Agreed, you'd never do it this way. First of all, you have the ID already, so why would you use 'getItemByName'. Second, the getItemByName is flawed, it will return a set, not a single row. Third, if you are selecting anything, before you perform an action on it, you should at least verify that you have records to work on. Fourth, if 'setItemName' is an internal function, then you wouldn't want it to call update_record since you may have other updates that need to happen. Developer stupidity is a problem, but you aren't going to solve that with code. On Tuesday, April 30, 2013 10:45:04 AM UTC-7, Anthony wrote: What are the benefits? You should ask? Well, there are performance-improvements with the caching mechanism, and with the aggregation of operations - certain change-operations are only pushed to the transaction-view, when they are actually needed, or at the end of the transaction. I suppose there could be in some cases, but to avoid multiple updates to the same record, you can apply changes to the Row object and then call row.update_record() when finished. On the other hand, there may be cases where you want separate updates in order to trigger separate database triggers or DAL callbacks. Anyway, before investing lots of time building (and subsequently maintaining) a rather sophisticated ORM, it would be nice to know what kind of performance improvements can be expected and in what circumstances. In other words, how often do cases come up where the DAL is much less efficient than SQLA, and how difficult would it be to write equally efficient DAL code in those cases. Lastly, the main benefit is automatic-handling of caching and ordering or operations, that the developer no-longer needs to take care of himself. The benefit is not just for simple-data-modes, on the contrary - the more complex the data-mode, the more this becomes beneficial, as the automatic-detection of relationship-dependencies, and auto-cascade of operations, can both save brain-cycles from the developer trying to hold the whole schema in his head and make sure he pushes things in the correct order, and also prevent human-errors that can mess-up the database in cases that constrains are insufficient, and the developer overlooked some relationship-dependencies and was using stale-data without knowing it. Do you have an example of the above (i.e., where use of the DAL is complicated and prone to error but SQLA handles things automatically)? For example, lets take the following code: def setItemName(id, name): row = db.item[id].select() row.update_record(name=name) def getItemByName(name): return db.item(db.item.name==name).select() id = 42 ... row = db.item[id].select() row.name='some name' # or row[name]='some name' or row.update(name='some name') ... def setItemName(id, 'some other name') getItemByName(row.name) As you can see, the last function-call would either fail, or silently return something other than what was asked for. So, obviously you wouldn't do it that way with the DAL, as that isn't how it works. But how important is it to have this pattern? Once you've finished manipulating the record, you can do row.update_record(), and then any subsequent select will see the changes. It seems like this would be useful only if you need to do a read in between two separate updates to the record and you need the read to see the first update (and for some reason you can't just look at the existing Row object itself). Is this a common enough use case to justify replicating the SQLA ORM on top of the DAL? Do you have a real-world example where this would be a critical feature? Anthony -- --- You received this message because you are subscribed to the Google Groups web2py-users group. To unsubscribe from this group and stop receiving emails from it, send an email to web2py+unsubscr...@googlegroups.com. For more options, visit https://groups.google.com/groups/opt_out.
[web2py] Re: ORM (?) : A Revisit, NOT a Rebuttal
And keep in mind that the ORM alone in SQLA is over 20,000 lines of code (by contrast, the DAL is now around 10,000 loc). Even if we could get away with a modest fraction of that, this would be a significant undertaking. So, we really need compelling evidence that there are common use cases where this would make a notable difference. Also, some of these features may not require an ORM per se -- it may be possible to add some of this functionality within the context of the DAL. Anthony On Tuesday, April 30, 2013 1:45:04 PM UTC-4, Anthony wrote: What are the benefits? You should ask? Well, there are performance-improvements with the caching mechanism, and with the aggregation of operations - certain change-operations are only pushed to the transaction-view, when they are actually needed, or at the end of the transaction. I suppose there could be in some cases, but to avoid multiple updates to the same record, you can apply changes to the Row object and then call row.update_record() when finished. On the other hand, there may be cases where you want separate updates in order to trigger separate database triggers or DAL callbacks. Anyway, before investing lots of time building (and subsequently maintaining) a rather sophisticated ORM, it would be nice to know what kind of performance improvements can be expected and in what circumstances. In other words, how often do cases come up where the DAL is much less efficient than SQLA, and how difficult would it be to write equally efficient DAL code in those cases. Lastly, the main benefit is automatic-handling of caching and ordering or operations, that the developer no-longer needs to take care of himself. The benefit is not just for simple-data-modes, on the contrary - the more complex the data-mode, the more this becomes beneficial, as the automatic-detection of relationship-dependencies, and auto-cascade of operations, can both save brain-cycles from the developer trying to hold the whole schema in his head and make sure he pushes things in the correct order, and also prevent human-errors that can mess-up the database in cases that constrains are insufficient, and the developer overlooked some relationship-dependencies and was using stale-data without knowing it. Do you have an example of the above (i.e., where use of the DAL is complicated and prone to error but SQLA handles things automatically)? For example, lets take the following code: def setItemName(id, name): row = db.item[id].select() row.update_record(name=name) def getItemByName(name): return db.item(db.item.name==name).select() id = 42 ... row = db.item[id].select() row.name='some name' # or row[name]='some name' or row.update(name='some name') ... def setItemName(id, 'some other name') getItemByName(row.name) As you can see, the last function-call would either fail, or silently return something other than what was asked for. So, obviously you wouldn't do it that way with the DAL, as that isn't how it works. But how important is it to have this pattern? Once you've finished manipulating the record, you can do row.update_record(), and then any subsequent select will see the changes. It seems like this would be useful only if you need to do a read in between two separate updates to the record and you need the read to see the first update (and for some reason you can't just look at the existing Row object itself). Is this a common enough use case to justify replicating the SQLA ORM on top of the DAL? Do you have a real-world example where this would be a critical feature? Anthony -- --- You received this message because you are subscribed to the Google Groups web2py-users group. To unsubscribe from this group and stop receiving emails from it, send an email to web2py+unsubscr...@googlegroups.com. For more options, visit https://groups.google.com/groups/opt_out.
[web2py] Re: ORM (?) : A Revisit, NOT a Rebuttal
I admit it is a very silly example, I didn't give much thought to it - I just looked for something to exemplify a problem that might occur when no identity-mapping exists. The fact that I could not find a better example, is not a testament to say that such scenarios don't (or can't) occur. There where better example in one of the talks I posted here... -- --- You received this message because you are subscribed to the Google Groups web2py-users group. To unsubscribe from this group and stop receiving emails from it, send an email to web2py+unsubscr...@googlegroups.com. For more options, visit https://groups.google.com/groups/opt_out.
[web2py] Re: ORM (?) : A Revisit, NOT a Rebuttal
Here is a really awesome talk about SALQ's Core/ORM dichotomy, featuring diagrams that overview the architecture of the ORM, and how it implements the Unit-of-Work pattern: http://lanyrd.com/2011/pygotham/shpkm/#link-fkfb If you take into account what's said in this talk - that basically the ORM and the CORE in SQLA are 2 separate beasts - than if follows what I've been saying here, that the web2py-DAL is equivalent just to the CORE, and has no features that the ORM provides whatsoever. That leves you with 20K lines of code that are not represented at all inside the 10K lines of code of web2py's DAL. I think this would be a solid assertion. That means that it would be a substantial undertaking to replicate the SQLA-ORM over web2py. But it also means something else - that saying that it's benefits are questionable, is akin to saying that a 20k lines-of-code of the coolest ORM in existence, is a useless-piece of software... Useless in the sense that it takes a lot and gives very little (if at all) Somehow I doubt that assertion... Lastly, what this *may* all mean, is that SQLA-ORM may be re-tooled to be layered on-top of web2py's DAL instead of it's CORE. This should be especially easier using what he called in the talk the classical mapping pattern, in which your domain-model classes and your CORE/DAL objects are built-up completely separately, and a 3'rd separate mapper objects takes care of instrumenting your domain-classes with the CORE/DAL objects to essentially build-up and form the ORM out of mearging the two via what he called formal-monkey-patching (which I assume he means some sort of dependency-injection). Given all that, all you would need to do is sub-class the mapper and redactor it to instrument domain-classes using web2py's DAL instead. This may be a substantially-less of an undertaking, and would basically achieve the same thing. Since these are both open-source projects, both written in pure-python, it should be at the very-least possible to do something like that. I had seen past-experiments of doing the opposite - mapping the DAL to layer on-top of SQLA models - but nowhere was it suggested to do it the other way around with SALQ's ORM... I thing it is even a more logical rout to take, as you don't need SQLA's CORE, we already have a full equivalent for that... -- --- You received this message because you are subscribed to the Google Groups web2py-users group. To unsubscribe from this group and stop receiving emails from it, send an email to web2py+unsubscr...@googlegroups.com. For more options, visit https://groups.google.com/groups/opt_out.
[web2py] Re: ORM (?) : A Revisit, NOT a Rebuttal
Hi Massimo, You should really watch this lecture: http://lanyrd.com/2011/pygotham/shpkm/#link-fkfb I think you will like it, even if you son't like ORMs. It shows some architectural diagrams that you might find interesting, for both the Core layer and ORM layer of SQLAlchemy. I think this would make clear the dichotomy between the two. As the lecture shows, the Core and the ORM are actually separate code-bases (though the ORM is built on-top of Core), so that people can actually not use the ORM at all if they don't like to, and what they will be left eith, is basically equivalent to what web2py provides - a DAL. Since these are separate layers, I think what might be an interesting research-project, is not re-inventing the wheel, but instead try to layer SQLA's ORM on-top of web2py's DAL. The past-experiments I've seen where trying to do the opposite, but I actually think that this makes more sense. The modern way of generating an ORM class, is called the declarative-mapping approach, in which you build the schema right into the python classes (decoratively). This would be way to difficult to re-instrument. However, there still exist the 'older', more dis-jointed way of mapping that they call classical-mapping, which maps a completely regular python classes to a 'Core' schema-object, and generates an ORM-class out of the two using a 3'rd object called a mapper. It is basically taking a regular python class, and a schema that was built using Core (in an almost identical fashion as it is done in the web2py-DAL) and using a separate mapper-object to 'instrument' the schema into the class, and thus generating an ORM class out of the merge of the two. So I was thinking it might be interesting research-project, is to just try to sub-class the mapper-object, and write a web2py-DAL-mapper variation, that basically generates the same ORM, using the same kind of instrumentation, but just using the web2py-DAL schema-object instead of the SQLA-Core one. What do you say? Is that feasible? -- --- You received this message because you are subscribed to the Google Groups web2py-users group. To unsubscribe from this group and stop receiving emails from it, send an email to web2py+unsubscr...@googlegroups.com. For more options, visit https://groups.google.com/groups/opt_out.
[web2py] Re: ORM (?) : A Revisit, NOT a Rebuttal
I will watch it asap. Anyway, I am not trying to discourage you or oppose to it. If you have resources and want to start building something like this, you have all my support. Some people may love it. What I cannot promise is that it will be included in web2py until I see a real benefit. On Tuesday, 30 April 2013 15:49:53 UTC-5, Arnon Marcus wrote: Hi Massimo, You should really watch this lecture: http://lanyrd.com/2011/pygotham/shpkm/#link-fkfb I think you will like it, even if you don't like ORMs. It shows some architectural diagrams that you might find interesting, for both the Core layer and ORM layer of SQLAlchemy. I think this would make clear the dichotomy between the two. As the lecture shows, the Core and the ORM are actually separate code-bases (though the ORM is built on-top of Core), so that people can actually not use the ORM at all if they don't like to, and what they will be left eith, is basically equivalent to what web2py provides - a DAL. Since these are separate layers, I think what might be an interesting research-project, is not re-inventing the wheel, but instead try to layer SQLA's ORM on-top of web2py's DAL. The past-experiments I've seen where trying to do the opposite, but I actually think that this makes more sense. The modern way of generating an ORM class, is called the declarative-mapping approach, in which you build the schema right into the python classes (decleratively). This would be way to difficult to re-instrument. However, there still exist the 'older', more dis-jointed way of mapping that they call classical-mapping, which maps a completely regular python classes to a 'Core' schema-object, and generates an ORM-class out of the two using a 3'rd object called a mapper. It is basically taking a regular python class, and a schema that was built using Core (in an almost identical fashion as it is done in the web2py-DAL) and using a separate mapper-object to 'instrument' the schema into the class, and thus generating an ORM class out of the merge of the two. So I was thinking it might be interesting research-project, is to just try to sub-class the mapper-object, and write a web2py-DAL-mapper variation, that basically generates the same ORM, using the same kind of instrumentation, but just using the web2py-DAL schema-object instead of the SQLA-Core one. What do you say? Is that feasible? -- --- You received this message because you are subscribed to the Google Groups web2py-users group. To unsubscribe from this group and stop receiving emails from it, send an email to web2py+unsubscr...@googlegroups.com. For more options, visit https://groups.google.com/groups/opt_out.
[web2py] Re: ORM (?) : A Revisit, NOT a Rebuttal
Have you not tried just importing sqla in your 0.py model, and writing your models and code as you see fit? You can certainly bypass the DAL if you want. On Tuesday, April 30, 2013 1:49:53 PM UTC-7, Arnon Marcus wrote: Hi Massimo, You should really watch this lecture: http://lanyrd.com/2011/pygotham/shpkm/#link-fkfb I think you will like it, even if you don't like ORMs. It shows some architectural diagrams that you might find interesting, for both the Core layer and ORM layer of SQLAlchemy. I think this would make clear the dichotomy between the two. As the lecture shows, the Core and the ORM are actually separate code-bases (though the ORM is built on-top of Core), so that people can actually not use the ORM at all if they don't like to, and what they will be left eith, is basically equivalent to what web2py provides - a DAL. Since these are separate layers, I think what might be an interesting research-project, is not re-inventing the wheel, but instead try to layer SQLA's ORM on-top of web2py's DAL. The past-experiments I've seen where trying to do the opposite, but I actually think that this makes more sense. The modern way of generating an ORM class, is called the declarative-mapping approach, in which you build the schema right into the python classes (decleratively). This would be way to difficult to re-instrument. However, there still exist the 'older', more dis-jointed way of mapping that they call classical-mapping, which maps a completely regular python classes to a 'Core' schema-object, and generates an ORM-class out of the two using a 3'rd object called a mapper. It is basically taking a regular python class, and a schema that was built using Core (in an almost identical fashion as it is done in the web2py-DAL) and using a separate mapper-object to 'instrument' the schema into the class, and thus generating an ORM class out of the merge of the two. So I was thinking it might be interesting research-project, is to just try to sub-class the mapper-object, and write a web2py-DAL-mapper variation, that basically generates the same ORM, using the same kind of instrumentation, but just using the web2py-DAL schema-object instead of the SQLA-Core one. What do you say? Is that feasible? -- --- You received this message because you are subscribed to the Google Groups web2py-users group. To unsubscribe from this group and stop receiving emails from it, send an email to web2py+unsubscr...@googlegroups.com. For more options, visit https://groups.google.com/groups/opt_out.
[web2py] Re: ORM (?) : A Revisit, NOT a Rebuttal
If you take into account what's said in this talk - that basically the ORM and the CORE in SQLA are 2 separate beasts - than if follows what I've been saying here, that the web2py-DAL is equivalent just to the CORE, and has no features that the ORM provides whatsoever. I'm not sure that follows. The web2py DAL has a lot of features that might otherwise be found in an ORM. I'm not very familiar with SQLA, but I suspect the DAL has some features not present in CORE but similar to functionality included in the ORM. But it also means something else - that saying that it's benefits are questionable, is akin to saying that a 20k lines-of-code of the coolest ORM in existence, is a useless-piece of software... Anything is possible. Feel free to make the case with some real examples. You might also consider trying SQLA directly in place of the DAL, and maybe using something like WTForms if you need more forms automation. Anthony -- --- You received this message because you are subscribed to the Google Groups web2py-users group. To unsubscribe from this group and stop receiving emails from it, send an email to web2py+unsubscr...@googlegroups.com. For more options, visit https://groups.google.com/groups/opt_out.
[web2py] Re: ORM (?) : A Revisit, NOT a Rebuttal
On Tuesday, April 30, 2013 2:57:34 PM UTC-7, Derek wrote: Have you not tried just importing sqla in your 0.py model, and writing your models and code as you see fit? You can certainly bypass the DAL if you want. I know that, but well, you see, there lies the problem - I DON'T want... I love the DAL too much - wouldn't change it for anything! :) (not even SQLA-Core...;) ) I even considered using web2py's DAL outside of web2py as well - in all the plug-ins I plan to write for desktop-applications. It's an amazing and elegant piece of software! Besides, it's not feasible for us anymore anyways, as I said, we already have thousands of lines of code built using the DAL - switching it to something else would be a nightmare, and way too costly. we have DAL-based code more than any other python code - hell, we would be switching a web-framework before we consider replacing the DAL... :) All I want is a decent ORM on-top to structure all that wonderful DAL code we have into... But it has to be statefull to be worth-it, so I couldn't write my own, and SQLA-ORM is the best ORM in python I currently know of... I am looking at Storm also, but it currently seems to be less-modular that SQLA... That's from just a glance, though... -- --- You received this message because you are subscribed to the Google Groups web2py-users group. To unsubscribe from this group and stop receiving emails from it, send an email to web2py+unsubscr...@googlegroups.com. For more options, visit https://groups.google.com/groups/opt_out.
[web2py] Re: ORM (?) : A Revisit, NOT a Rebuttal
On Tuesday, April 30, 2013 3:18:50 PM UTC-7, Anthony wrote: If you take into account what's said in this talk - that basically the ORM and the CORE in SQLA are 2 separate beasts - than if follows what I've been saying here, that the web2py-DAL is equivalent just to the CORE, and has no features that the ORM provides whatsoever. I'm not sure that follows. The web2py DAL has a lot of features that might otherwise be found in an ORM. I'm not very familiar with SQLA, but I suspect the DAL has some features not present in CORE but similar to functionality included in the ORM. I wouldn't be so sure about that... You should really check out the links I've posted in this thread - SQLA-Core is a fully-fledged DAL. Reddit is using it alone, without any of the ORM level... And again, it may be a matter of semantics, but there is no ORM feature in the DAL - I've been going over the documentation and asking lost of questing talking to Massimo - it's pretty conclussive - iweb2py's DAL is exclusively-stateless... You might also consider trying SQLA directly in place of the DAL shameless copy-paste from my answer to Derek: I know that, but well, you see, there lies the problem - I DON'T want... I love the DAL too much - wouldn't change it for anything! :) (not even SQLA-Core...;) ) I even considered using web2py's DAL outside of web2py as well - in all the plug-ins I plan to write for desktop-applications. It's an amazing and elegant piece of software! Besides, it's not feasible for us anymore anyways, as I said, we already have thousands of lines of code built using the DAL - switching it to something else would be a nightmare, and way too costly. we have DAL-based code more than any other python code - hell, we would be switching a web-framework before we consider replacing the DAL... :) All I want is a decent ORM on-top to structure all that wonderful DAL code we have into... But it has to be statefull to be worth-it, so I couldn't write my own, and SQLA-ORM is the best ORM in python I currently know of... I am looking at Storm also, but it currently seems to be less-modular that SQLA... That's from just a glance, though. -- --- You received this message because you are subscribed to the Google Groups web2py-users group. To unsubscribe from this group and stop receiving emails from it, send an email to web2py+unsubscr...@googlegroups.com. For more options, visit https://groups.google.com/groups/opt_out.
[web2py] Re: ORM (?) : A Revisit, NOT a Rebuttal
If you take into account what's said in this talk - that basically the ORM and the CORE in SQLA are 2 separate beasts - than if follows what I've been saying here, that the web2py-DAL is equivalent just to the CORE, and has no features that the ORM provides whatsoever. I'm not sure that follows. The web2py DAL has a lot of features that might otherwise be found in an ORM. I'm not very familiar with SQLA, but I suspect the DAL has some features not present in CORE but similar to functionality included in the ORM. I wouldn't be so sure about that... You should really check out the links I've posted in this thread - SQLA-Core is a fully-fledged DAL. Reddit is using it alone, without any of the ORM level... Well, I did say I suspect, not that I'm sure, but does it have things like migrations, automatic file uploads/retrieval, recursive selects, automatic results serialization into HTML, virtual fields, computed fields, validators, field representations, field labels, field comments, table labels, list:-type fields, JSON fields, export to CSV, callbacks, record versioning, common fields, multi-tenancy, common filters, GAE support, and MongoDB support? And again, it may be a matter of semantics, but there is no ORM feature in the DAL - I've been going over the documentation and asking lost of questing talking to Massimo - it's pretty conclussive - iweb2py's DAL is exclusively-stateless... I didn't say there were ORM features in the DAL, just that it includes features that you might otherwise expect to find in an ORM (e.g., something like virtual fields). In other words, some of what you get with that 20,000+ lines of ORM code might be functionality that is in fact available in the web2py DAL. Also, note that it might be the case that some of the features of the SQLA ORM that you find appealing don't necessarily require an ORM pattern per se but could possibly be implemented within the context of a DAL (i.e., just because SQLA does it via an ORM doesn't mean that's the only way). You might also consider trying SQLA directly in place of the DAL I know that, but well, you see, there lies the problem - I DON'T want... I love the DAL too much - wouldn't change it for anything! :) (not even SQLA-Core...;) ) But if you use an ORM built on top of the DAL, you won't be using the DAL API anyway, so what would you miss? Or are you saying you would still want to use the DAL for a significant portion of code and only move to the ORM for select parts of the application? Anyway, it may nevertheless be a useful exercise to start by using SQLA for some project just to see if it really delivers on the promises you believe it is making. Anthony -- --- You received this message because you are subscribed to the Google Groups web2py-users group. To unsubscribe from this group and stop receiving emails from it, send an email to web2py+unsubscr...@googlegroups.com. For more options, visit https://groups.google.com/groups/opt_out.
[web2py] Re: ORM (?) : A Revisit, NOT a Rebuttal
On Monday, April 29, 2013 3:59:40 AM UTC+3, Anthony wrote: I think you're arguing against a bit of a straw man here. Past resistance to the idea of an ORM has usually been in the context of suggestions for an ORM *instead of* a DAL, not as an additional abstraction on top of the DAL for particular use cases. As Massimo noted, there have already been some efforts at the latter, though none generated enough interest to persist. If you can identify a common use case where some kind of abstraction on top of the DAL would make development much easier and/or execution more efficient, and you can clearly articulate the nature of that abstraction, I don't think anyone would object. It might help to see something concrete, though. Let's see an example data model, how it would be easily implemented in SQA or the hypothetical web2py ORM, and how that is a clear improvement over how you would otherwise have to implement it using the DAL I haven't given a DAL-alternative to what I am after, because I am not as familiar with it's advanced-usage-patterns as some of you may be. I am presenting a set of needs, and asking for a way to meet them. If you go through the links I have been pin-pointing here, you would find many concrete examples of data-models and how they are used in SQLA's ORM. You are basically asking me to get much more proficient in the DAL, as well as in SQLA, and to design the solution myself and present it. This is a valid expectation in a developers-group, not a users-group. The whole point of sharing ideas and expertise, is that someone can suggest a conceptual/abstract direction, and this entices someone else who is much more proficient, to come-up with a concrete solution. We share each-other's expertise also, not just ideas. I am inclined to hoped I would not be needing to become an expert myself, in order to propose an idea for improvement. I think it is a legitimate inclination. -- --- You received this message because you are subscribed to the Google Groups web2py-users group. To unsubscribe from this group and stop receiving emails from it, send an email to web2py+unsubscr...@googlegroups.com. For more options, visit https://groups.google.com/groups/opt_out.
[web2py] Re: ORM (?) : A Revisit, NOT a Rebuttal
they don't build up magically by default (though the cascade parameter allows the same flexibility as the one present in your backend) but given that all the six important events are available and fired accordingly, you can hook up whatever business logic you like. That answer is too vague for me... Can you elaborate? What cascade-parameter? What do you mean by your backend? What six important event? Does this answer my questions? HTTP is (ATM) a stateless world. Yep, cookies are used to store something from different requests, but that's business logic of your app, not a database concern. Even with SQLA you DON'T want to keep alive a session between multiple requests, as you can very easily end up in lockings. I was referring to stateless-vs-statefull as a metaphor - an analogy. The statefullness I was trying to convey, that exist in all other frameworks except web2py, is in the ORM: SQLA assumes that the model-code is not executed for each request. It is imported once at server-launch-time. It's ORM provides a set of classes that are instantiated on-the-fly, but not in a session-lexical manner - If a request from a different session asks for a table that has an already-existing ORM-class-instance for it in memory, it will be reused. The data it contains, however, will have already been invalidated (dirty-marked) by the last database-connection-session-closing. Connection-pooling means that it re-uses connection to the database, not re-create them. A database-session-close from the point of view of the ORM, is not really a connection-close in the core-layer - it is just a release of a connection-object back to the connection pool. Then on the ORM-level, a transaction-end is not really a session-end in terms of the life-time of the class-instance-objects of the ORM classes. They persist in memory, they just get invalidated by the transaction-end. Again, it is a very different execution model from web2py - and it has nothing to do with HTTP, keep-alive, and long-lived-connections with the client - it has to do with the way the web-framework is implemented: SQLA's ORM-class-instances are reused across requests. Web2py's DAL instances are not. If I wanted to emulate a SQLA-ORM in web2py, I would have to put it all in custom-modules, that would persist class-instances in memory, and do a lot of these invalidation/dirty-chacking stuff myself. I think a way around this might be to keep the DAL-layer separate, in that instances of ORM-classes may persist in memory across requests in a custom module that is not reloaded, but that they would be dynamically re-linked to DAL instances that would still be re-generated for each request by executing the models as usual. This is what I mean by another layer on-top of the DAL. How to implement such a thing? This is way over my league... And what's different than looping over all the referenced tables and joining them ? Do you really need a 10-lines code snippet to accomplish that in DAL ? This can go in the please add features to DAL chapter, as other things mentioned before, but don't really count for an ORM. In respect to my previous section, the auto-generation of query-objects, should be done within the ORM layer, that just uses the DAL instances. The reason for this is that behavioral-characteristics for domain-model classes, go in the ORM layer - that's what it's for. That means exactly what I meant to explain before: you define how it should work beforehand and you need to review the model to understand what is firing under the hood. The DAL is an abstraction-layer. It's purpose is that you should not need to go beneath that layer, unless things go horribly wrong. This means, that for 99% of the usage, you shouldn't care what goes on underneath in the low-level SQL. Otherwise, there really is no need for a DAL at all... The same would go for an ORM layer. It should help you facilitate the mapping of your business-logic classes to database-classes, and define automation of database-layer-behavioral-patterns, as a function of business-logic behaviors, within you your business-logic layer, and not the database-layer. This is in a nutshell what I am after, and where I think an ORM layer on-top of the DAL could be most beneficial. So, you added a lot of statements and pieces of code in your model to define the relationship, and SQLA replays it figuring out the most optimized way possible. My point of I end up with a nice schema to play with may be throwing me ahead of road-blocks that you faced, but is it really that difficult to code a relationship and figuring out how to query it ? I'm not sure I understand what you are saying here, but it seems to pertain to my last comment in the previous section. -- --- You received this message because you are subscribed to the Google Groups web2py-users group. To unsubscribe from this group and stop receiving
[web2py] Re: ORM (?) : A Revisit, NOT a Rebuttal
On Monday, April 29, 2013 11:47:41 AM UTC+2, Arnon Marcus wrote: they don't build up magically by default (though the cascade parameter allows the same flexibility as the one present in your backend) but given that all the six important events are available and fired accordingly, you can hook up whatever business logic you like. That answer is too vague for me... Can you elaborate? What cascade-parameter? What do you mean by your backend? What six important event? Does this answer my questions? So, a small recap. Don't take this in the wrong way but: - I need to watch 2 hours of presentations that you found interesting to be able to reply - you expect us (by us I mean web2py developers) to figure out YOUR business requirement because you liked a presentation about SQLA and you want that feature(s) on top of DAL, without providing your use-case and the possible benefits - you expect web2py developers to come up with an API that even you can't figure out - you don't want to raise a finger providing a real usecase scenario cause you're a user, not a developer - you can't take a look at the current DAL implementation to see what are the proposed solutions to your (ATM imaginary) problems If these are the requirements of your discussion-environment, don't be surprised if nobody provides code for you. Docs are there. Given that you're using time to look at other python modules/packages, at least be documented on what you want to use as a starting-point (DAL) for comparisons. -- --- You received this message because you are subscribed to the Google Groups web2py-users group. To unsubscribe from this group and stop receiving emails from it, send an email to web2py+unsubscr...@googlegroups.com. For more options, visit https://groups.google.com/groups/opt_out.
Re: [web2py] Re: ORM (?) : A Revisit, NOT a Rebuttal
This has turned into a meta-discussion, but I get the criticism (I think it is exaggerated, though...). Yes, I should read more about new DAL features, and I will, I just assumed that creative-usage-patters would not necessarily be as well documented as the bare features themselves, and that some people here - you maybe - would be more proficient than me regarding such creative usage-patterns that I might not think of. Just pointing out features of the DAL, is insufficient, without some toy-examples... Perhaps it is my way of asking for help that you don't like, but you should at least admit that I had to go quite far just to convince you that my proposal is even potentially beneficial at all... We never even got to the point in which you gave me concrete alternative usage-patterns of the DAL - you consistently reject the proposition I present in principal. This is not helping. As for use-cases, I wanted you to see someone else explain things better then me - so I gave you some links. It is an almost three-hour lecture, so I pin-pointed specific minutes that I think where relevant, and gave time-coded links. Is that too much to ask? As for the other 50min lecture, I could narrow it down even more, but the point was that I felt that there was a mismatch in perception of philosophies between how you perceive SQLA's philosophy (based on your prior perception of ORMs in general), and the actual philosophy of SQLA, as it's creator is presenting it, and I wanted you to see that. As for API design, I am a user of web2py. Meaning, I AM a developer, but a developer of applications - not frameworks - there is a crucial distinction to make here. You wouldn't expect users of your framework to suggest concrete-implementations of infrastructure-improvements, would you? I am not expecting any of you to figure out my business requirements (I have no idea what gave you that impression...). I am expecting a helping hand of expertise regarding existing usage-patterns, as well as have an architectural/abstract discussion of a proposed additional-abstraction-layer, that might require proficiency in the inner-workings of the DAL, which is beyond my expertise. How would you suggest I should ask for such a thing? On Mon, Apr 29, 2013 at 1:47 PM, Niphlod niph...@gmail.com wrote: On Monday, April 29, 2013 11:47:41 AM UTC+2, Arnon Marcus wrote: they don't build up magically by default (though the cascade parameter allows the same flexibility as the one present in your backend) but given that all the six important events are available and fired accordingly, you can hook up whatever business logic you like. That answer is too vague for me... Can you elaborate? What cascade-parameter? What do you mean by your backend? What six important event? Does this answer my questions? So, a small recap. Don't take this in the wrong way but: - I need to watch 2 hours of presentations that you found interesting to be able to reply - you expect us (by us I mean web2py developers) to figure out YOUR business requirement because you liked a presentation about SQLA and you want that feature(s) on top of DAL, without providing your use-case and the possible benefits - you expect web2py developers to come up with an API that even you can't figure out - you don't want to raise a finger providing a real usecase scenario cause you're a user, not a developer - you can't take a look at the current DAL implementation to see what are the proposed solutions to your (ATM imaginary) problems If these are the requirements of your discussion-environment, don't be surprised if nobody provides code for you. Docs are there. Given that you're using time to look at other python modules/packages, at least be documented on what you want to use as a starting-point (DAL) for comparisons. -- --- You received this message because you are subscribed to a topic in the Google Groups web2py-users group. To unsubscribe from this topic, visit https://groups.google.com/d/topic/web2py/vyOEkUeCNj4/unsubscribe?hl=en. To unsubscribe from this group and all its topics, send an email to web2py+unsubscr...@googlegroups.com. For more options, visit https://groups.google.com/groups/opt_out. -- --- You received this message because you are subscribed to the Google Groups web2py-users group. To unsubscribe from this group and stop receiving emails from it, send an email to web2py+unsubscr...@googlegroups.com. For more options, visit https://groups.google.com/groups/opt_out.
[web2py] Re: ORM (?) : A Revisit, NOT a Rebuttal
I am interested in these discussions but it seems to me that far too much is being written and I apologise if I haven't read all the references which have been proposed. The fact is that Hierarchical data does not fit at all well into a RDBMS. The main issue is therefore to decide what 'kludge' to use. The best kludge will vary case by case so the ORM would have to be incredibly sophisticated to always choose an optimised kludge without the developer specifying exactly what he wants. I do not know how SQLA achieves this, but I cannot imagine the DAL being developed to cover all cases in the best possible way. Nevertheless it would indeed be nice to cover at least one common use-case of hierarchical data and I would suggest that someone should propose a specific case. Here is a simple example: geographical areas. I would propose either a nested set (or perhaps simply a materialised path, which I like very much for its simplicity). I think such an example would be very useful. A few simple functions to manipulate this included in the DAL would be really great. I believe it is only with a specific proposal in mind that this discussion will progress. A theoretical discussion will remain just that and nothing practical will result. There are truckloads of stuff already written on this topic and I cannot imagine that this thread would add anything new to that. -- --- You received this message because you are subscribed to the Google Groups web2py-users group. To unsubscribe from this group and stop receiving emails from it, send an email to web2py+unsubscr...@googlegroups.com. For more options, visit https://groups.google.com/groups/opt_out.
[web2py] Re: ORM (?) : A Revisit, NOT a Rebuttal
This is actually really close to what I am after - 10x. Alas, it seems DOA... I have contacted the creator. -- --- You received this message because you are subscribed to the Google Groups web2py-users group. To unsubscribe from this group and stop receiving emails from it, send an email to web2py+unsubscr...@googlegroups.com. For more options, visit https://groups.google.com/groups/opt_out.
[web2py] Re: ORM (?) : A Revisit, NOT a Rebuttal
Anthony found something close to what I'm after: https://code.google.com/p/web2pyorm/ But what exactly an I expected to propose that would be concrete? An API? Must I be an API coder in order to propose a feature-set? And about use-cases - I gave mine, as well as links to others. What else should I do? -- --- You received this message because you are subscribed to the Google Groups web2py-users group. To unsubscribe from this group and stop receiving emails from it, send an email to web2py+unsubscr...@googlegroups.com. For more options, visit https://groups.google.com/groups/opt_out.
[web2py] Re: ORM (?) : A Revisit, NOT a Rebuttal
I was disappointed to discover that this example is proorly-formed. The comparison ther, (as well as in other places, like this: http://web2py.com/examples/static/sqla2.html) feels somewhere between ill-informed to disingenuous. The original link states the following: The web2py database Abstraction layer can read most Django models and SQLAlchemy models...This is experimental and needs more testing. This is not recommended nor suggested but it was developed as a convenience tool to port your existing python apps to web2py. It was not intended as a comparison between the DAL and SQA nor as a comprehensive representation or implementation of SQA's features, just as a convenience for using existing SQA model code if you happen to have some. And the link you included above starts with the following: We translated all the examples from the SQLAlchmey tutorialhttp://www.sqlalchemy.org/docs/05/ormtutorial.html...This is not a feature comparison but more of a legend for new web2py users with SQLAlchemy experience... It is literally just showing how to do what is shown in the SQA tutorial using the web2py DAL. I don't see how either of these posts can be construed as ill-informed or disingenuous -- they are quite clear about their scope and intention. Anthony -- --- You received this message because you are subscribed to the Google Groups web2py-users group. To unsubscribe from this group and stop receiving emails from it, send an email to web2py+unsubscr...@googlegroups.com. For more options, visit https://groups.google.com/groups/opt_out.