Hi hawkett,

I presume by "key hash", you mean the string obtained by stringifying a Key object (e.g., str(key)). This is not a hash, but rather a base64 encoding of the Key Protocol Buffer. We don't necessarily guarantee this encoding scheme will not change (though it's rather unlikely), but we do guarantee that db.Key(str(key)) == key - that is, that you'll always be able to reconstruct a key from its string form.
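To illustrate why a base64 string can be turned back into a key while a hash cannot, here is a minimal sketch in plain Python. It does not use the App Engine SDK: fake_key_bytes is a made-up stand-in for the serialized Key Protocol Buffer, encode_key and decode_key are hypothetical helper names, and the exact base64 variant the datastore uses is not assumed here.

```python
import base64
import hashlib

# Hypothetical stand-in for the bytes of a serialized Key protocol buffer.
fake_key_bytes = b"Thing\x00parent-name\x00child-name"


def encode_key(raw):
    """Encode raw bytes as a base64 string (reversible)."""
    return base64.urlsafe_b64encode(raw).decode("ascii")


def decode_key(text):
    """Recover the original bytes from the base64 string."""
    return base64.urlsafe_b64decode(text.encode("ascii"))


encoded = encode_key(fake_key_bytes)
restored = decode_key(encoded)

# A hash, by contrast, is one-way: the digest cannot be decoded back
# into the bytes it was computed from.
digest = hashlib.sha1(fake_key_bytes).hexdigest()

print(restored == fake_key_bytes)
```

With the real SDK, the equivalent round trip is the db.Key(str(key)) == key guarantee described above, so storing str(key) is enough to get the key back later.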
-Nick Johnson

On Mon, Jun 22, 2009 at 1:48 PM, hawkett <hawk...@gmail.com> wrote:
>
> This actually leads to another question that has been on my mind - how
> reliable is a key's string representation? If one part of my tuple is
> a key, can I store its hash and reliably turn that into the actual
> key for the object at a (much) later date, or do I need to store the
> key_name and path? Put another way, does Google guarantee that the
> Key hashing algorithm will never change? Is the hash produced for a
> given key_name and path guaranteed to be the same on both the SDK and
> the live environment?
>
> Just wanting to get a clear idea of the best approach to serialising
> keys. Thanks,
>
> Colin
>
> On Jun 22, 11:10 am, "Nick Johnson (Google)" <nick.john...@google.com>
> wrote:
> > Hi hawkett,
> >
> > On Sat, Jun 20, 2009 at 3:05 PM, hawkett <hawk...@gmail.com> wrote:
> >
> > > Hi,
> > >
> > > I was watching Brett's IO talk re. using 'Relational Index Tables',
> > > and there were a few hints of things in there, and I just wanted to
> > > check I got it all correctly -
> > >
> > > 1. Lists are good for tuples - a use case I see is an entity being
> > > tagged, and having a state within that tag - so the tuples might be
> > > ('tagA', 'PENDING'), ('tagB', 'ACCEPTED'), ('tagC', 'DENIED') etc. -
> > > so the list structures would be
> > >
> > > class Thing(db.Model):
> > >     name = db.StringProperty()
> > >     tags = db.ListProperty(str, default=[])
> > >     states = db.ListProperty(str, default=[])
> > >
> > > with their contents tags = ['tagA', 'tagB', 'tagC'], states =
> > > ['PENDING', 'ACCEPTED', 'DENIED']
> > >
> > > and as data comes and goes you maintain both lists to ensure you
> > > record the correct state for the correct tag by matching their list
> > > position.
> >
> > A much better approach is to use a single ListProperty, and serialize
> > your tuples to it - using Pickle, JSON, CSV, etc - whatever suits. If
> > you want, you can easily write a custom Datastore property class to
> > make this easier. This allows you to do everything you outlined below
> > without extra effort.
> >
> > -Nick Johnson
> >
> > > 2. Relational Index Tables are good for exploding index problems - so
> > > the query here might be -
> > > "get me all the 'Things' which have 'tagA' and which are 'PENDING' in
> > > that tag" - i.e. all records with the tuple ('tagA', 'PENDING'), which
> > > would be a composite index over two list properties - an exploding
> > > index.
> > >
> > > So assuming I've got the above right, I'm trying to work out a few
> > > things
> > >
> > > a. Without relational index tables, what is the best way to construct
> > > the query - e.g.
> > >
> > > things = db.GqlQuery(
> > >     "SELECT * FROM Thing "
> > >     "WHERE tags = :1 AND states = :2", 'tagA', 'PENDING')
> > >
> > > which would get me anything that had 'tagA' at any point in the tags
> > > list, and anything that had a 'PENDING' at any point in the states
> > > list. This is potentially many more records than those that match the
> > > tuple. So then I have to do an in-memory cull of those records
> > > returned and work out which ones actually conform to the tuple? Just
> > > wondering if I am missing something here, because it seems like a
> > > great method for storing a tuple, but complex to query for that same
> > > tuple?
> > >
> > > b. If I am going to use relational index tables, to avoid the
> > > exploding index that the above query could generate -
> > >
> > > class Thing(db.Model):
> > >     name = db.StringProperty()
> > >
> > > class ThingTagIndex(db.Model):
> > >     tags = db.ListProperty(str, default=[])
> > >
> > > class ThingStateIndex(db.Model):
> > >     states = db.ListProperty(str, default=[])
> > >
> > > then am I right in thinking that my query would be performed as
> > >
> > > tagIndexKeys = db.GqlQuery(
> > >     "SELECT __key__ FROM ThingTagIndex "
> > >     "WHERE tags = :1", 'tagA')
> > >
> > > # All the things that have 'tagA' in their tags list
> > > thingTagKeys = [k.parent() for k in tagIndexKeys]
> > >
> > > stateIndexKeys = db.GqlQuery(
> > >     "SELECT __key__ FROM ThingStateIndex "
> > >     "WHERE states = :1 AND ANCESTOR IN :2", 'PENDING', thingTagKeys)
> > >
> > > # All the things that have both 'tagA' and 'PENDING' (but not
> > > # necessarily as a tuple)
> > > thingKeys = [k.parent() for k in stateIndexKeys]
> > >
> > > things = db.get(thingKeys)
> > >
> > > # Oops - I need the lists to do the culling part of my tuple query
> > > # from (a)
> > >
> > > So I have avoided the exploding index by performing two separate
> > > queries, but I could have achieved much the same result without the
> > > index tables - i.e. by performing separate queries and avoiding the
> > > composite index. Just wondering if I am seeing the tuple situation
> > > correctly - i.e. there is no way to query them that doesn't require
> > > some in-memory culling? Thanks,
> > >
> > > Colin
> >
> > --
> > Nick Johnson, App Engine Developer Programs Engineer
> > Google Ireland Ltd. :: Registered in Dublin, Ireland, Registration
> > Number: 368047

--
Nick Johnson, App Engine Developer Programs Engineer
Google Ireland Ltd. :: Registered in Dublin, Ireland, Registration Number: 368047
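The suggestion quoted in this thread, serializing each (tag, state) tuple into one string and keeping all of them in a single ListProperty, can be sketched in plain Python as follows. This is only an illustration under assumptions: pack and unpack are hypothetical helper names, JSON is one of the formats mentioned (Pickle or CSV would do as well), and the plain list tag_states stands in for the contents of a db.ListProperty(str) rather than using the SDK.

```python
import json


def pack(tag, state):
    """Serialize one (tag, state) tuple to a single string."""
    return json.dumps([tag, state])


def unpack(value):
    """Turn a serialized string back into a (tag, state) tuple."""
    tag, state = json.loads(value)
    return (tag, state)


# Stand-in for the values stored in one ListProperty(str) on a Thing.
tag_states = [
    pack("tagA", "PENDING"),
    pack("tagB", "ACCEPTED"),
    pack("tagC", "DENIED"),
]

# An equality filter on a list property matches when any element equals
# the given value, so filtering on this one string selects the exact
# tuple rather than "tagA somewhere" and "PENDING somewhere".
wanted = pack("tagA", "PENDING")
matches = wanted in tag_states

print(matches, [unpack(v) for v in tag_states])
```

Because each tuple is stored as one value, a query only needs a single equality filter on one list property, which avoids both the composite index over two lists and the in-memory culling step.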