[google-appengine] Tuples, Exploding Index, IO talk

hawkett Sat, 20 Jun 2009 07:05:56 -0700

Hi,

   I was watching Brett's IO talk re. using 'Relational Index Tables',
and there were a few hints of things in there, and I just wanted to
check I got it all correctly -


1.  Lists are good for tuples - a use case I see is an entity being
tagged, and having a state within that tag - so the tuples might be
('tagA', 'PENDING') , ('tagB', 'ACCEPTED'), ('tagC', 'DENIED') etc. -
so the list structures would be

class Thing(db.Model):
  name = db.StringProperty()
  tags = db.ListProperty(str, default=[])
  states = db.ListProperty(str, default=[])

with their contents tags = ['tagA', 'tagB', 'tagC'], states =
['PENDING', 'ACCEPTED', 'DENIED']

and as data comes and goes you maintain both lists to ensure you
record the correct state for the correct tag by matching their list
position.

2.  Relational Index Tables are good for exploding index problems - so
the query here might be -
"get me all the 'Things' which have 'tagA' and which are 'PENDING' in
that tag" - i.e. all records with the tuple ('tagA, 'PENDING'), which
would be a composite index over two list properties - an exploding
index.

So assuming I've got the above right, I'm trying to work out a few
things

a.  Without relational index tables, what is the best way to construct
the query - e.g.

things = db.GqlQuery(
  "SELECT * FROM Thing "
  "WHERE tags = :1 AND states = :2", 'tagA', 'PENDING')

which would get me anything that had 'tagA' at any point in the tags
list, and anything that had a 'PENDING' at any point in the states
list. This is potentially many more records than those that match the
tuple.  So then I have to do an in-memory cull of those records
returned and work out which ones actually conform to the tuple?  Just
wondering if I am missing something here, because it seems like a
great method for storing a tuple, but complex to query for that same
tuple?

b.  If I am going to use relational index tables, to avoid the
exploding index that the above query could generate -

class Thing(db.model):
  name = db.StringProperty()

class ThingTagIndex(db.Model)
  tags = db.ListProperty(str, default=[])

class ThingStateIndex(db.Model)
  states = db.ListProperty(str, default=[])

then am I right in thinking that my query would be performed as

tagIndexKeys = db.GqlQuery(
  "SELECT __key__ FROM ThingTagIndex "
  "WHERE tags = :1", 'tagA')

# All the things that have 'tagA' in their tags list
thingTagKeys = [k.parent() for k in tagIndexKeys]

stateIndexKeys = db.GqlQuery(
  "SELECT __key__ FROM ThingStateIndex "
  "WHERE states = :1 AND ANCESTOR IN :2", 'PENDING', thingTagKeys)

# All the things that have both 'tagA' and 'PENDING' (but not
necessarily as a tuple)
thingKeys = [k.parent() for k in stateIndexKeys]

things = db.get(thingKeys)

# Oops - I need the lists to do the culling part of my tuple query
from (a)

So I have avoided the exploding index by performing two separate
queries, but I could have achieved much the same result without the
index tables - i.e. by performing separate queries and avoiding the
composite index.  Just wondering if I am seeing the tuple situation
correctly - i.e. there is no way to query them that doesn't require
some in-memory culling?  Thanks,

Colin

--~--~---------~--~----~------------~-------~--~----~
You received this message because you are subscribed to the Google Groups 
"Google App Engine" group.
To post to this group, send email to google-appengine@googlegroups.com
To unsubscribe from this group, send email to 
google-appengine+unsubscr...@googlegroups.com
For more options, visit this group at 
http://groups.google.com/group/google-appengine?hl=en
-~----------~----~----~----~------~----~------~--~---

[google-appengine] Tuples, Exploding Index, IO talk

Reply via email to