As developers, we have various limitations to work within on App
Engine. And if you are performance-conscious, then you have even
more ;p

Two limits that one often has to work around are the 1MB limit on
Datastore calls and the ListProperty size limits.

A common pattern that I've used to overcome these is to use "related
entities". I use these related entities to store properties that I
want to search on -- separate from the entities that I want to
retrieve in order to display the results back to the user.

I was pleased to see App Engine developer Brett Slatkin officially
encouraging the use of this pattern in his I/O talk:

* http://code.google.com/events/io/sessions/BuildingScalableComplexApps.html

He calls them "relation index entities" and explains it better than I
can, so here is a slightly adapted excerpt from pages 23-25 of his
presentation PDF:

----------------

Problem: Scalably "delivering" a Twitter-esque post/message

Solution: Split the message into 2 entities:
* Message model contains the info we care about
* MessageIndex has only relationships for querying

>>> class Message(db.Model):
...   sender = db.StringProperty()
...   body = db.TextProperty()

>>> class MessageIndex(db.Model):
...   receivers = db.StringListProperty()

When writing, put entities in the same entity group for transactions:
* That is, make the Message entity be the parent for the MessageIndex entity
* You can of course have multiple MessageIndex entities per Message to
scale up...

And for queries:
* Do a key-only query to fetch the MessageIndexes
* Transform returned keys to retrieve parent entity
* Fetch Message entities in batch

>>> indexes = db.GqlQuery(
...     "SELECT __key__ FROM MessageIndex "
...     "WHERE receivers = :1", me)

>>> keys = [k.parent() for k in indexes]
>>> messages = db.get(keys)

----------------
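
To make the data flow in the excerpt above concrete, here is a rough,
in-memory sketch in plain Python. It does not use the App Engine SDK:
the Key tuple, the two dicts, and the send/inbox/chunk helpers are all
hypothetical stand-ins, and max_per_index=2 is an arbitrary number
chosen only to force more than one MessageIndex per Message.

```python
from collections import namedtuple

# Hypothetical stand-in for a Datastore key: a kind, an id and an optional parent.
Key = namedtuple("Key", ["kind", "id", "parent"])

messages = {}         # Message key -> message dict
message_indexes = {}  # MessageIndex key -> list of receivers


def chunk(items, size):
    """Split items into consecutive lists of at most `size` elements."""
    return [items[i:i + size] for i in range(0, len(items), size)]


def send(msg_id, sender, body, receivers, max_per_index=2):
    """Store one Message plus one child MessageIndex per chunk of receivers."""
    msg_key = Key("Message", msg_id, None)
    messages[msg_key] = {"sender": sender, "body": body}
    for n, part in enumerate(chunk(receivers, max_per_index)):
        # The Message key is the parent of every MessageIndex key.
        message_indexes[Key("MessageIndex", n, msg_key)] = part
    return msg_key


def inbox(receiver):
    """Run the three query steps against the toy storage."""
    # 1. Keys-only "query": which MessageIndex entries list this receiver?
    index_keys = [k for k, rcv in message_indexes.items() if receiver in rcv]
    # 2. Turn each index key into its parent Message key, dropping repeats.
    parent_keys = []
    for k in index_keys:
        if k.parent not in parent_keys:
            parent_keys.append(k.parent)
    # 3. Batch "get" of the Message entities.
    return [messages[k] for k in parent_keys]


send(1, "tav", "Hello World", ["alice", "bob", "ryan"])
send(2, "alice", "Hi tav", ["tav", "ryan"])
ryan_inbox = inbox("ryan")
```

Note that inbox() never reads the receiver lists of the messages it
returns; it only touches index keys and then the Message entities
themselves, which is the point of the split.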

Now, if you start using this pattern heavily, you'll realise that it
should be trivial for the App Engine devs to turn the three-step
process of:

* Do a key-only query to fetch the MessageIndexes
* Transform returned keys to retrieve parent entity
* Fetch Message entities in batch

into just a single step:

* Do a query on MessageIndex entities which actually returns the
related Message entities

That is, the query would be done on the MessageIndex entities, but
instead of returning those entities, the related Message entities
would be returned.

This would save both us and App Engine an additional Datastore request
and key computation!! Given the 30-second request limit, it also means
that you could do up to twice the amount of querying in the same time!
So it helps even more!

So how would this work?

Well, let's say that we had the following Message:

>>> msg = Message(sender='tav', body='Hello World')
>>> msg_key = msg.put()  # put() saves the entity and returns its key

And the following related MessageIndex entities:

>>> rcv1 = MessageIndex(parent=msg_key, receivers=['alice', 'bob'],
...                     __index__=msg_key)

>>> rcv2 = MessageIndex(parent=msg_key, receivers=['ryan'], __index__=msg_key)

The presence of the newly proposed __index__ property would alter the
behaviour of how the Datastore indexes the rcv1/rcv2 entities. Instead
of doing the current:

  rcv1:receivers:alice    <rcv1_key>
  rcv1:receivers:bob      <rcv1_key>
  rcv2:receivers:ryan     <rcv2_key>

It would look like:

  rcv1:receivers:alice    <msg_key>
  rcv1:receivers:bob      <msg_key>
  rcv2:receivers:ryan     <msg_key>
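
The difference between the two layouts can be sketched as a small toy
function in plain Python. This is only an illustration of the
proposal, not how the Datastore actually builds its indexes: the
index_rows helper and its index_key parameter (standing in for the
proposed __index__ property) are made up for this example.

```python
def index_rows(entity_name, entity_key, properties, index_key=None):
    """Return (index_entry, stored_key) rows for one entity.

    index_key plays the role of the proposed __index__ property: when it
    is given, it is stored in place of the entity's own key.
    """
    stored_key = index_key if index_key is not None else entity_key
    rows = []
    for name, values in properties.items():
        for value in values:
            rows.append(("%s:%s:%s" % (entity_name, name, value), stored_key))
    return rows


props = {"receivers": ["alice", "bob"]}

# Current behaviour: each row points back at the MessageIndex itself.
current = index_rows("rcv1", "<rcv1_key>", props)

# Proposed behaviour: each row points at the Message named by __index__.
proposed = index_rows("rcv1", "<rcv1_key>", props, index_key="<msg_key>")
```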

[Note for those not familiar with the App Engine Datastore -- the App
Engine guys don't store the full entity in all of the index tables.
Instead, the <entity_key> is stored, and once the relevant entities
have been identified it is used to load the complete entity with all
of its properties in order to answer the query.]

That is, with __index__, the Datastore would behave as if it were
actually indexing additional properties of the Message entity!! So
when a query comes in, it can run the query as if for a MessageIndex,
but then return the Message entity pointed to by the key!!
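
On the read side, the single-step query might look roughly like this
toy lookup in plain Python. Again, this is a sketch under assumptions:
index_table, entities and query are hypothetical names, and the
"msg_key" string simply stands in for a real Key.

```python
# Toy index table in the proposed layout: (property, value) -> stored key.
index_table = [
    (("receivers", "alice"), "msg_key"),
    (("receivers", "bob"), "msg_key"),
    (("receivers", "ryan"), "msg_key"),
]

# Toy entity storage; only the Message needs to be loadable here.
entities = {
    "msg_key": {"kind": "Message", "sender": "tav", "body": "Hello World"},
}


def query(prop, value):
    """Scan the toy index, then load whatever entity each stored key points to."""
    keys = [key for entry, key in index_table if entry == (prop, value)]
    return [entities[key] for key in keys]


# The filter is on a MessageIndex property, but a Message comes back.
results = query("receivers", "alice")
```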

Now, sadly, I fear I've explained all this quite poorly, but I do
believe that it's a brilliantly simple/powerful idea.

All it needs is for an optional __index__ property on entities to be
treated specially and for it to be a Key if present (not a
ReferenceProperty) -- in which case, the key referred to in the
__index__ property is used to "index" the entity in question rather
than the entity's own key().


Anyways, hope I've made some sense and provided something of value --
let me know if I can provide any further clarifications. Thanks!

-- 
love, tav

plex:espians/tav | t...@espians.com | +44 (0) 7809 569 369
http://tav.espians.com | http://twitter.com/tav | skype:tavespian
