[google-appengine] Re: PolyModel needs to be rethought
Yes, very helpful. i appreciate the thoughtful responses and clarifications. i can accept that, for scalability, it must be this way. And yes, i do hail from a traditional SQL (relational) background. i'll need to reread the paper on BigTable to keep me from veering back to that way of thinking. it is a bit of a leap for me.

Next time i encounter something that strikes me as odd, i'll just assume the GAE team has already given it much thought and that the tenets (i.e. scalability) have taken precedence. Hopefully that will shut up ol' doubting Thomas.

Thanks again,
Rein

On Mar 30, 5:57 pm, Rafe wrote:
> Rein,
>
> Thanks for taking the time to think about this stuff. It's
> important to think about alternatives to any design offered by the App
> Engine team, even if it means it leads to merely clearing up some
> misunderstandings about the underlying implementation. I think Andy's
> response sums up anything I was going to say and is correct.
> Properties that don't exist on leaf classes are not stored whatsoever
> in the datastore and are also not loaded when returned by a get or
> query fetch.
>
> One thing to realize about the difference between Datastore and the
> Datastore Model API that sits on top of it is that Datastore does not
> actually understand or care about the structure of the data that it
> stores in each Kind (what you refer to as "Table"). In fact, in order
> to make the model API behave more like the actual underlying Datastore
> implementation, we had to add a modified subclass of Model similar to
> PolyModel. If you have not had a chance, take a look at Expando.
>
> http://code.google.com/appengine/docs/python/datastore/expandoclass.html
>
> Notice how each instance of an Expando kind can have a unique
> "schema" when compared with all other instances. You can have:
>
> instance1.prop1 = 'a string'
> instance2.prop2 = 'a different property'
> instance3.prop1 = 1024
>
> instance2 does not have prop1, while instance3 has a different type
> for prop1.
> It seems like under the hood Datastore acts more like a
> dict than a traditional object.
>
> Turns out not to be so far from the truth. If you look further you
> will see that the datastore.Entity class, which is where db.Model
> reads and writes info from the datastore, actually inherits from dict.
>
> This is all ok. It turns out that Datastore handles these cases
> very well.
>
> I'm guessing you might be more experienced with how SQL based object
> relational mapping algorithms work, such as Java's Hibernate.
> Hibernate allows you to map different class structures to either
> single tables (much like how PolyModel does it) and multiple tables as
> you have described. This makes good sense in an environment like SQL
> where joins are permitted.
>
> That's not to say that there will never be a case in App Engine
> where someone would benefit more from a polymorphic model based on
> multiple Kinds rather than a single kind. Most of those cases can be
> handled relatively well using db.ReferenceProperty. But an easier to
> use and more intuitive Python class that encapsulates this concept
> might be neat. You should see PolyModel as a reference implementation
> of a polymorphic class, but is not necessarily definitive. Everyone
> has different needs and the hope is that variations like Expando and
> PolyModel give people good starting points to build the best possible
> systems.
>
> Hope this is useful.
>
> - Rafe Kaplan
>
> On Mar 28, 9:43 am, Andy Freeman wrote:
>
> > > The problem with this is that if you have an inheritance
> > > hierarchy in which your outermost descendant(s) extend several fields,
> > > you've loaded your base entities with a mass of irrelevant fields
> > > containing an unusual <missing> (i've never seen that before).
>
> > I'm pretty sure that the base entities (the things in the datastore)
> > don't have those fields.
>
> > The "missing" message comes from a tool that is looking at instances
> > pulled from the datastore.
> > It expects to see things and when it
> > doesn't, it says <missing>.
>
> > > if you are building an application that, say for example, reads the
> > > fields (columns?) in the entity for display to the user (like the
> > > Dashboard's Data Viewer), you end up presenting irrelevant fields.
>
> > That happens only if your application doesn't understand polymodels.
> > I'm still working through the details because "not polymodels" don't
> > support class_name() and polymodels don't support anything like
> > db.class_for_kind(). However, properties() appears to do the right
> > thing.
>
> > One good reason for implementing Polymodels with a single entity type
> > is that implementing them with multiple entity types would make
> > queries a lot more expensive.
>
> > If polymodels were implemented with multiple entity types, each
> > application-level query would need to make multiple datastore queries,
> > one for each applicable datastore entity type. If those queries are
[google-appengine] Re: PolyModel needs to be rethought
Rein,

Thanks for taking the time to think about this stuff. It's important to think about alternatives to any design offered by the App Engine team, even if it only leads to clearing up some misunderstandings about the underlying implementation. I think Andy's response sums up everything I was going to say and is correct. Properties that don't exist on leaf classes are not stored in the datastore at all and are also not loaded when the entity is returned by a get or query fetch.

One thing to realize about the difference between Datastore and the Datastore Model API that sits on top of it is that Datastore does not actually understand or care about the structure of the data that it stores in each Kind (what you refer to as "Table"). In fact, in order to make the model API behave more like the actual underlying Datastore implementation, we had to add a modified subclass of Model similar to PolyModel. If you have not had a chance, take a look at Expando.

http://code.google.com/appengine/docs/python/datastore/expandoclass.html

Notice how each instance of an Expando kind can have a unique "schema" when compared with all other instances. You can have:

instance1.prop1 = 'a string'
instance2.prop2 = 'a different property'
instance3.prop1 = 1024

instance2 does not have prop1, while instance3 has a different type for prop1. It seems like under the hood Datastore acts more like a dict than a traditional object.

Turns out that's not so far from the truth. If you look further you will see that the datastore.Entity class, which is what db.Model uses to read and write info from the datastore, actually inherits from dict.

This is all ok. It turns out that Datastore handles these cases very well.

I'm guessing you might be more experienced with how SQL based object-relational mapping tools work, such as Java's Hibernate. Hibernate allows you to map different class structures to either single tables (much like how PolyModel does it) or multiple tables as you have described.
This makes good sense in an environment like SQL where joins are permitted.

That's not to say that there will never be a case in App Engine where someone would benefit more from a polymorphic model based on multiple Kinds rather than a single kind. Most of those cases can be handled relatively well using db.ReferenceProperty. But an easier to use and more intuitive Python class that encapsulates this concept might be neat. You should see PolyModel as a reference implementation of a polymorphic class, not necessarily the definitive one. Everyone has different needs and the hope is that variations like Expando and PolyModel give people good starting points to build the best possible systems.

Hope this is useful.

- Rafe Kaplan

On Mar 28, 9:43 am, Andy Freeman wrote:
> > The problem with this is that if you have an inheritance
> > hierarchy in which your outermost descendant(s) extend several fields,
> > you've loaded your base entities with a mass of irrelevant fields
> > containing an unusual <missing> (i've never seen that before).
>
> I'm pretty sure that the base entities (the things in the datastore)
> don't have those fields.
>
> The "missing" message comes from a tool that is looking at instances
> pulled from the datastore. It expects to see things and when it
> doesn't, it says <missing>.
>
> > if you are building an application that, say for example, reads the
> > fields (columns?) in the entity for display to the user (like the
> > Dashboard's Data Viewer), you end up presenting irrelevant fields.
>
> That happens only if your application doesn't understand polymodels.
> I'm still working through the details because "not polymodels" don't
> support class_name() and polymodels don't support anything like
> db.class_for_kind(). However, properties() appears to do the right
> thing.
>
> One good reason for implementing Polymodels with a single entity type
> is that implementing them with multiple entity types would make
> queries a lot more expensive.
>
> If polymodels were implemented with multiple entity types, each
> application-level query would need to make multiple datastore queries,
> one for each applicable datastore entity type. If those queries are
> done in the datastore (as with "in" queries), the datastore needs to
> know some things that it doesn't currently know and the "30 datastore
> queries per application query" limit comes into play. If those
> queries are done in the application run time, limits (and maybe
> transactions) will behave differently.
>
> I think that __key__ can be made to work across multiple entity types,
> but wouldn't be surprised if implementing PolyModel with multiple
> multiple entity types raised issues there as well.
>
> BTW - entity groups for transaction purposes have nothing to do with
> entity types or polymodels. They're really entity instance groups
> because they are defined wrt key name hierarchies.
>
> On Mar 27, 11:54 pm, Rein Petersen wrote:
>
> > Hi,
>
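Rafe's point that each Expando instance can carry its own "schema", and that the underlying entity behaves like a dict, can be illustrated with a minimal sketch in plain Python. This is not the App Engine SDK: Entity and ExpandoLike below are hypothetical stand-ins assumed only for illustration, and the real db.Expando and datastore.Entity do considerably more.

```python
class Entity(dict):
    """Hypothetical stand-in for a stored entity: a kind name plus a dict of properties."""

    def __init__(self, kind, **props):
        super().__init__(**props)
        self.kind = kind


class ExpandoLike:
    """Hypothetical Expando-style wrapper: attributes live in the entity dict."""

    def __init__(self, kind="ExpandoLike"):
        # Bypass our own __setattr__ so the entity is stored as a normal attribute.
        object.__setattr__(self, "_entity", Entity(kind))

    def __setattr__(self, name, value):
        # Any assigned attribute becomes a property of this one entity only.
        self._entity[name] = value

    def __getattr__(self, name):
        # Called only when normal lookup fails, i.e. for dynamic properties.
        try:
            return self._entity[name]
        except KeyError:
            raise AttributeError(name) from None

    def properties(self):
        # Property names actually present on this instance.
        return sorted(self._entity)


instance1, instance2, instance3 = ExpandoLike(), ExpandoLike(), ExpandoLike()
instance1.prop1 = 'a string'
instance2.prop2 = 'a different property'
instance3.prop1 = 1024

print(instance1.properties())           # ['prop1']
print(hasattr(instance2, 'prop1'))      # False
print(type(instance3.prop1).__name__)   # int
```

Nothing is reserved for a property an instance never set: instance2 simply has no prop1 key, which mirrors the claim that properties absent from a class are neither stored nor loaded.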
[google-appengine] Re: PolyModel needs to be rethought
> The problem with this is that if you have an inheritance
> hierarchy in which your outermost descendant(s) extend several fields,
> you've loaded your base entities with a mass of irrelevant fields
> containing an unusual <missing> (i've never seen that before).

I'm pretty sure that the base entities (the things in the datastore) don't have those fields.

The "missing" message comes from a tool that is looking at instances pulled from the datastore. It expects to see things and, when it doesn't, it says <missing>.

> if you are building an application that, say for example, reads the
> fields (columns?) in the entity for display to the user (like the
> Dashboard's Data Viewer), you end up presenting irrelevant fields.

That happens only if your application doesn't understand polymodels. I'm still working through the details because "not polymodels" don't support class_name() and polymodels don't support anything like db.class_for_kind(). However, properties() appears to do the right thing.

One good reason for implementing Polymodels with a single entity type is that implementing them with multiple entity types would make queries a lot more expensive.

If polymodels were implemented with multiple entity types, each application-level query would need to make multiple datastore queries, one for each applicable datastore entity type. If those queries are done in the datastore (as with "in" queries), the datastore needs to know some things that it doesn't currently know and the "30 datastore queries per application query" limit comes into play. If those queries are done in the application run time, limits (and maybe transactions) will behave differently.

I think that __key__ can be made to work across multiple entity types, but I wouldn't be surprised if implementing PolyModel with multiple entity types raised issues there as well.

BTW - entity groups for transaction purposes have nothing to do with entity types or polymodels.
They're really entity instance groups because they are defined with respect to key name hierarchies.

On Mar 27, 11:54 pm, Rein Petersen wrote:
> Hi,
>
> At risk of sounding too critical (believe me - i am so very
> appreciative of GAE), something about PolyModel has been bothering me
> enough that i must get it off my chest in the hope that it might be
> reworked. The PolyModel was created to allow developers to access all
> child entities through the base. i think it is an important ability
> for the GAE and PolyModel addresses the need.
>
> But, (here it comes) every extension (field) is stored in the same
> table? (entity group) along with an extra field (list) that stores the
> class names. The problem with this is that if you have an inheritance
> hierarchy in which your outermost descendant(s) extend several fields,
> you've loaded your base entities with a mass of irrelevant fields
> containing an unusual <missing> (i've never seen that before).
>
> if you are building an application that, say for example, reads the
> fields (columns?) in the entity for display to the user (like the
> Dashboard's Data Viewer), you end up presenting irrelevant fields. How
> the unnecessary fetching of these extension fields affects performance
> i cannot say but i would guess it is not helpful.
>
> i tend to think that each subclass entity justifies its own distinct
> table? / entity ? and would only contain the fields which extend its
> base class which resides in the same entity group as the parent.
>
> PolyModel could then, instead, walk the parent chain adding fields as
> necessary until it reaches the root entity. The subclassed entity
> would be presented as a composite of all the fields within itself and
> its ascendants.
> since the superclass entities are assigned to the
> subclass entities by parent, they reside in the same entity group and
> shouldn't be an impediment to scalability or within a transaction
>
> it seems to make so much sense to me that i can only guess there was
> some obstacle preventing it being so and the only solution was what we
> have now. if that is the case, please tell me so i can just give up on
> this... otherwise comments are always welcome -
>
> thanks :)
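Andy's cost argument (one kind plus a stored list of class names needs a single query, while one kind per subclass needs one query per applicable kind) can be sketched with a toy in-memory store. Everything here is an assumption made for illustration: FakeDatastore, the class_names property, and the Animal/Cat/Lion hierarchy are hypothetical and are not the App Engine API or PolyModel's actual storage format.

```python
class FakeDatastore:
    """Hypothetical in-memory store; each query targets one kind and is counted."""

    def __init__(self):
        self.entities = []
        self.query_count = 0

    def put(self, kind, **props):
        # Only the properties passed in are stored; nothing is padded out.
        self.entities.append({"kind": kind, **props})

    @staticmethod
    def _matches(stored, wanted):
        # A list property matches if any of its elements equals the filter value.
        if isinstance(stored, list):
            return wanted in stored
        return stored == wanted

    def query(self, kind, **filters):
        self.query_count += 1
        return [
            e for e in self.entities
            if e["kind"] == kind
            and all(self._matches(e.get(k), v) for k, v in filters.items())
        ]


# Layout 1: a single kind, with the class hierarchy stored as a list property.
single = FakeDatastore()
single.put("Animal", class_names=["Animal"], name="generic")
single.put("Animal", class_names=["Animal", "Cat"], name="tom", lives=9)
single.put("Animal", class_names=["Animal", "Cat", "Lion"], name="leo", lives=9, mane=True)
cats_single = single.query("Animal", class_names="Cat")

# Layout 2: one kind per class; the application must know every subclass kind.
SUBCLASS_KINDS = {"Animal": ["Animal", "Cat", "Lion"], "Cat": ["Cat", "Lion"], "Lion": ["Lion"]}
multi = FakeDatastore()
multi.put("Animal", name="generic")
multi.put("Cat", name="tom", lives=9)
multi.put("Lion", name="leo", lives=9, mane=True)
cats_multi = []
for kind in SUBCLASS_KINDS["Cat"]:
    cats_multi.extend(multi.query(kind))

print(single.query_count)          # 1
print(multi.query_count)           # 2
print(sorted(single.entities[0]))  # ['class_names', 'kind', 'name']
```

In this sketch the base entity stores only its own properties even in the single-kind layout, and the multi-kind layout pays one query per subclass, which is where a per-request query limit or differing transaction behavior would start to matter.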