[google-appengine] Re: PolyModel needs to be rethought

2009-03-30 Thread Rein Petersen

Yes, very helpful. i appreciate the thoughtful responses and
clarifications.

i can accept that, for scalability, it must be this way. And yes, i do
hail from a traditional SQL (relational) background. i'll need to
reread the paper on BigTable to keep me from veering back to that way
of thinking. it is a bit of a leap for me.

Next time i encounter something that strikes me as odd, i'll just
assume the GAE team has already given it much thought and that the
tenets (i.e. scalability) have taken precedence - hopefully that will
shut up ol' doubting Thomas.

Thanks again, Rein


[google-appengine] Re: PolyModel needs to be rethought

2009-03-30 Thread Rafe

  Rein,

  Thanks for taking the time to think about this stuff.  It's
important to think about alternatives to any design offered by the App
Engine team, even if it means it leads to merely clearing up some
misunderstandings about the underlying implementation.  I think Andy's
response sums up anything I was going to say and is correct.
Properties that a class does not define (for example, ones declared
only on a leaf subclass) are not stored in the datastore at all for
entities of that class, and are not loaded when those entities are
returned by a get or query fetch.

  One thing to realize about the difference between Datastore and the
Datastore Model API that sits on top of it is that Datastore does not
actually understand or care about the structure of the data that it
stores in each Kind (what you refer to as "Table").  In fact, in order
to make the model API behave more like the actual underlying Datastore
implementation, we had to add a modified subclass of Model similar to
PolyModel.  If you have not had a chance, take a look at Expando.

http://code.google.com/appengine/docs/python/datastore/expandoclass.html

  Notice how each instance of an Expando kind can have a unique
"schema" when compared with all other instances.  You can have:

  instance1.prop1 = 'a string'
  instance2.prop2 = 'a different property'
  instance3.prop1 = 1024

  instance2 does not have prop1, while instance3 has a different type
for prop1.  It seems like under the hood Datastore acts more like a
dict than a traditional object.

  Turns out not to be so far from the truth.  If you look further you
will see that the datastore.Entity class, which is where db.Model
reads and writes info from the datastore, actually inherits from dict.
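
  The dict comparison can be illustrated with a minimal sketch in plain
Python. This is only an analogy: FakeEntity and the kind name 'Thing' are
made up for illustration, and nothing here is the App Engine API.

```python
# Plain-Python analogy only: FakeEntity is a made-up stand-in, not
# google.appengine.api.datastore.Entity.
class FakeEntity(dict):
    """A dict of properties tagged with a kind name."""

    def __init__(self, kind, **props):
        super().__init__(**props)
        self.kind = kind


# Three entities of the same kind, each with its own "schema".
instance1 = FakeEntity('Thing', prop1='a string')
instance2 = FakeEntity('Thing', prop2='a different property')
instance3 = FakeEntity('Thing', prop1=1024)

for name, entity in [('instance1', instance1),
                     ('instance2', instance2),
                     ('instance3', instance3)]:
    # Only the properties that were actually set are present.
    print(name, entity.kind, sorted(entity.items()))
```

  Nothing is stored for a property that was never set: instance2 simply
has no 'prop1' key, rather than a 'prop1' holding an empty value.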

  This is all ok.  It turns out that Datastore handles these cases
very well.

  I'm guessing you might be more experienced with how SQL based object
relational mapping algorithms work, such as Java's Hibernate.
Hibernate allows you to map different class structures to either a
single table (much like how PolyModel does it) or multiple tables, as
you have described.  This makes good sense in an environment like SQL
where joins are permitted.

  That's not to say that there will never be a case in App Engine
where someone would benefit more from a polymorphic model based on
multiple Kinds rather than a single kind.  Most of those cases can be
handled relatively well using db.ReferenceProperty.  But an easier to
use and more intuitive Python class that encapsulates this concept
might be neat.  You should see PolyModel as a reference implementation
of a polymorphic class, but not necessarily a definitive one.  Everyone
has different needs and the hope is that variations like Expando and
PolyModel give people good starting points to build the best possible
systems.

  Hope this is useful.

  - Rafe Kaplan



[google-appengine] Re: PolyModel needs to be rethought

2009-03-28 Thread Andy Freeman

> The problem with this is that if you have an inheritance
> hierarchy in which your outermost descendant(s) extend several fields,
> you've loaded your base entities with a mass of irrelevant fields
> containing an unusual <missing> (i've never seen that before).

I'm pretty sure that the base entities (the things in the datastore)
don't have those fields.

The "missing" message comes from a tool that is looking at instances
pulled from the datastore.  It expects to see things and when it
doesn't, it says <missing>.

> if you are building an application that, say for example, reads the
> fields (columns?) in the entity for display to the user (like the
> Dashboard's Data Viewer), you end up presenting irrelevant fields.

That happens only if your application doesn't understand polymodels.
I'm still working through the details because "not polymodels" don't
support class_name() and polymodels don't support anything like
db.class_for_kind().  However, properties() appears to do the right
thing.

One good reason for implementing Polymodels with a single entity type
is that implementing them with multiple entity types would make
queries a lot more expensive.

If polymodels were implemented with multiple entity types, each
application-level query would need to make multiple datastore queries,
one for each applicable datastore entity type.  If those queries are
done in the datastore (as with "in" queries), the datastore needs to
know some things that it doesn't currently know and the "30 datastore
queries per application query" limit comes into play.  If those
queries are done in the application run time, limits (and maybe
transactions) will behave differently.
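
The query fan-out described above can be sketched with a toy in-memory
model. Everything here is hypothetical (FakeDatastore, the Animal/Dog/Cat
hierarchy, the per-kind scan); it only counts how many kind-level queries
each layout needs and says nothing about real datastore costs or limits.

```python
# Toy in-memory model; FakeDatastore and its methods are hypothetical.
class FakeDatastore:
    def __init__(self):
        self.entities = []      # list of (kind, properties) pairs
        self.query_count = 0    # number of kind-level queries issued

    def put(self, kind, **props):
        self.entities.append((kind, props))

    def query(self, kind, predicate=lambda props: True):
        """One query scans exactly one kind."""
        self.query_count += 1
        return [p for k, p in self.entities if k == kind and predicate(p)]


# Layout A: one kind for the whole hierarchy, with a 'class' list
# property recording the class chain (the single-kind approach).
single = FakeDatastore()
single.put('Animal', **{'class': ['Animal'], 'name': 'generic'})
single.put('Animal', **{'class': ['Animal', 'Dog'], 'name': 'rex'})
single.put('Animal', **{'class': ['Animal', 'Cat'], 'name': 'tom'})

# Layout B: one kind per class.
multi = FakeDatastore()
multi.put('Animal', name='generic')
multi.put('Dog', name='rex')
multi.put('Cat', name='tom')

# "All animals" in layout A: a single query on one kind.
all_single = single.query('Animal', lambda p: 'Animal' in p['class'])

# "All animals" in layout B: one query per applicable kind, merged
# by the application.
all_multi = []
for kind in ('Animal', 'Dog', 'Cat'):
    all_multi.extend(multi.query(kind))

print(single.query_count, multi.query_count)
```

Both layouts return the same three entities, but the per-class layout
issues one query per kind in the hierarchy, so the count grows with the
number of subclasses.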

I think that __key__ can be made to work across multiple entity types,
but wouldn't be surprised if implementing PolyModel with multiple
entity types raised issues there as well.

BTW - entity groups for transaction purposes have nothing to do with
entity types or polymodels.  They're really entity instance groups
because they are defined wrt key name hierarchies.
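
A minimal sketch of that last point, assuming keys are modelled as plain
tuples of (kind, name) pairs from root to leaf. This is a simplification
for illustration, not the real db.Key class, and the Customer/Order/LineItem
names are invented.

```python
# Keys are modelled as tuples of (kind, name) pairs from root to leaf.
# This is a simplified model, not the real db.Key class.
def entity_group(key_path):
    """Return the root element of a key path; it identifies the group."""
    return key_path[0]


order = (('Customer', 'alice'), ('Order', 'o1'))
item = (('Customer', 'alice'), ('Order', 'o1'), ('LineItem', 'i1'))
other_order = (('Customer', 'bob'), ('Order', 'o1'))

# Different kinds, same root: same entity group.
print(entity_group(order) == entity_group(item))
# Same kind, different root: different entity groups.
print(entity_group(order) == entity_group(other_order))
```

Group membership follows from the root of each instance's key path, so
two entities of the same kind can sit in different groups and entities of
different kinds can share one.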

On Mar 27, 11:54 pm, Rein Petersen wrote:
> Hi,
>
> At risk of sounding too critical (believe me - i am so very
> appreciative of GAE), something about PolyModel has been bothering me
> enough that i must get it off my chest in the hope that it might be
> reworked. The PolyModel was created to allow developers to access all
> child entities through the base. i think it is an important ability
> for the GAE and PolyModel addresses the need.
>
> But, (here it comes) every extension (field) is stored in the same
> table? (entity group) along with an extra field (list) that stores the
> class names. The problem with this is that if you have an inheritance
> hierarchy in which your outermost descendant(s) extend several fields,
> you've loaded your base entities with a mass of irrelevant fields
> containing an unusual <missing> (i've never seen that before).
>
> if you are building an application that, say for example, reads the
> fields (columns?) in the entity for display to the user (like the
> Dashboard's Data Viewer), you end up presenting irrelevant fields. How
> the unnecessary fetching of these extension fields affects performance
> i cannot say but i would guess it is not helpful.
>
> i tend to think that each subclass entity justifies its own distinct
> table? / entity ? and would only contain the fields which extend its
> base class which resides in the same entity group as the parent.
>
> PolyModel could then, instead, walk the parent chain adding fields as
> necessary until it reaches the root entity. The subclassed entity
> would be presented as a composite of all the fields within itself and
> its ascendants. since the superclass entities are assigned to the
> subclass entities by parent, they reside in the same entity group and
> shouldn't be an impediment to scalability or within a transaction.
>
> it seems to make so much sense to me that i can only guess there was
> some obstacle preventing it being so and the only solution was what we
> have now. if that is the case, please tell me so i can just give up on
> this... otherwise comments are always welcome -
>
> thanks :)