[google-appengine] Re: Datastore DB Design

2009-08-26 Thread Philippe
thanks for the clarification On Aug 26, 11:44 am, "Nick Johnson (Google)" wrote: > On Wed, Aug 26, 2009 at 10:28 AM, Philippe wrote: > > > Nick, what about list deserialisation ? > > no impact on large data set with 30 elements list ? > > There's a per-element cost to deserializing entities, ye

[google-appengine] Re: Datastore DB Design

2009-08-26 Thread Gary
Thanks for your help Nick, in that case I will push on uploading all the values in the original method and see how it goes. Thanks, Gary On Aug 26, 7:22 pm, "Nick Johnson (Google)" wrote: > Hi Gary, > The time a query takes to execute is proportional to the number of results > returned, not th

[google-appengine] Re: Datastore DB Design

2009-08-26 Thread Nick Johnson (Google)
On Wed, Aug 26, 2009 at 10:28 AM, Philippe wrote: > > Nick, what about list deserialisation ? > no impact on large data set with 30 elements list ? There's a per-element cost to deserializing entities, yes - so larger lists will take longer. 30 elements is not a particularly large list in the g

[google-appengine] Re: Datastore DB Design

2009-08-26 Thread Philippe
Nick, what about list deserialisation? Is there no impact on a large data set with 30-element lists? On Aug 26, 11:22 am, "Nick Johnson (Google)" wrote: > Hi Gary, > The time a query takes to execute is proportional to the number of results > returned, not the number in the datastore. I suspect your test

[google-appengine] Re: Datastore DB Design

2009-08-26 Thread Nick Johnson (Google)
Hi Gary, The time a query takes to execute is proportional to the number of results returned, not the number in the datastore. I suspect your tests have returned more results when you had more entries (because there were more matches, and you were asking for them all)? -Nick Johnson On Wed, Aug 2
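Nick's point about cost tracking the number of results can be illustrated with a toy sorted index. This is only a sketch under assumptions: the index is modelled as a sorted Python list of (keyword, record) rows, and `build_index`, `query` and the sample rows are hypothetical names, not the datastore's implementation. After a binary search for the first matching row, the loop visits one row per result, so adding non-matching rows does not add work to the scan.

```python
import bisect


def build_index(rows):
    # rows: iterable of (keyword, record_key); a sorted list stands in for the index.
    return sorted(rows)


def query(index, keyword):
    # Binary search to the first row for this keyword, then walk matching rows only.
    i = bisect.bisect_left(index, (keyword,))
    results = []
    while i < len(index) and index[i][0] == keyword:
        results.append(index[i][1])
        i += 1
    return results


# Hypothetical data: the same 50 matches inside a small and a large data set.
matches = [("holiday", "rec%d" % n) for n in range(50)]
small = build_index(matches + [("news", "other%d" % n) for n in range(100)])
large = build_index(matches + [("news", "other%d" % n) for n in range(10000)])

small_hits = query(small, "holiday")
large_hits = query(large, "holiday")
```

Both calls walk the same 50 rows even though `large` holds far more entries. In this model a query slows down only when more rows match, or when no limit is placed on how many are fetched.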

[google-appengine] Re: Datastore DB Design

2009-08-26 Thread Philippe
If I remember a presentation from the last Google I/O correctly, deserialising your lists adds some CPU time, which may be why you are seeing high CPU time. If you can query your keyword list without fetching the data (that means, fetching the key only or directly the reccor
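The difference between fetching keys only and fetching whole entities can be sketched with a toy store that counts how many list elements get decoded. Everything here is an assumption made for illustration: `ToyStore`, its methods and the JSON encoding are hypothetical stand-ins, not how the datastore serialises entities.

```python
import json


class ToyStore:
    """Toy model: entities are stored serialised, with a separate keyword index."""

    def __init__(self):
        self._blobs = {}           # record key -> serialised entity
        self._index = {}           # keyword -> list of record keys
        self.decoded_elements = 0  # list elements decoded so far

    def put(self, key, keywords):
        self._blobs[key] = json.dumps({"keywords": keywords})
        for kw in keywords:
            self._index.setdefault(kw, []).append(key)

    def keys_only(self, keyword):
        # Answered from the index alone; no entity is decoded.
        return list(self._index.get(keyword, []))

    def fetch(self, keyword):
        # Full fetch: every matching entity is decoded, including its whole list.
        out = []
        for key in self._index.get(keyword, []):
            entity = json.loads(self._blobs[key])
            self.decoded_elements += len(entity["keywords"])
            out.append((key, entity))
        return out


store = ToyStore()
for n in range(50):
    # 30 keywords per record, matching the average Gary mentions in the thread.
    store.put("rec%d" % n, ["holiday"] + ["kw%d" % j for j in range(29)])

keys = store.keys_only("holiday")
decoded_after_keys_only = store.decoded_elements
entities = store.fetch("holiday")
decoded_after_fetch = store.decoded_elements
```

In this model the keys-only call decodes nothing, while the full fetch decodes all 30 list elements for each of the 50 matching records.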

[google-appengine] Re: Datastore DB Design

2009-08-26 Thread Gary
Hi Philippe, On average around 30. Thanks, Gary On Aug 26, 3:43 pm, Philippe wrote: > 1 more question: how many keywords do you have usually for one > record ? > > On Aug 26, 2:44 am, Gary wrote: > > > Thanks for your help Philippe, > > > So does anyone have an idea of how to store this dat

[google-appengine] Re: Datastore DB Design

2009-08-25 Thread Philippe
One more question: how many keywords do you usually have for one record? On Aug 26, 2:44 am, Gary wrote: > Thanks for your help Philippe, > > So does anyone have an idea of how to store this data in a scalable > manner? I was sure Wooble's first answer was going to be the fix for > this but it ap

[google-appengine] Re: Datastore DB Design

2009-08-25 Thread Gary
Thanks for your help Philippe, So does anyone have an idea of how to store this data in a scalable manner? I was sure Wooble's first answer was going to be the fix for this, but it appears it doesn't scale. CPU time is increasing to 3 seconds with just 32,000 records - what will it be about 1 millio

[google-appengine] Re: Datastore DB Design

2009-08-25 Thread Philippe
Sad, I did not think about that :/ On Aug 25, 2:49 pm, Gary wrote: > Hi, > > This solution doesn't seem to work because > > ". in addition, the key_name of a keyword must be the actual > keyword" isn't possible as I will have duplicate keywords and the key > name is actually just used as part of

[google-appengine] Re: Datastore DB Design

2009-08-25 Thread Gary
Hi, This solution doesn't seem to work because ". in addition, the key_name of a keyword must be the actual keyword" isn't possible as I will have duplicate keywords and the key name is actually just used as part of the key so data gets converted to a string such as "gZid29zZW1yGQsSBkRvbWFpbiINYn

[google-appengine] Re: Datastore DB Design

2009-08-25 Thread Gary
Thanks Philippe, This sounds like a better way, but there will also be a requirement to list the keywords related to a value. Is that possible? Is it possible to do a search on the Keyword parent? Thanks, Gary On Aug 25, 9:18 pm, Philippe wrote: > another option could be: > Record(db.Model): >

[google-appengine] Re: Datastore DB Design

2009-08-25 Thread Philippe
Another option could be:
    class Record(db.Model):
        value = db.StringProperty()
    class Keyword(db.Model):
        value = db.StringProperty()  # this value is not necessary, but I do not know if you can have a Model without properties
The idea is that for one Record, you input several keywords as record children
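The parent/child layout can be sketched without the datastore by treating a key as the full path from the parent Record down to the Keyword child. This is a toy model under assumptions: `record_key`, `keyword_key`, `keywords_of`, `records_with` and the sample domains are hypothetical, and a plain dict stands in for storage. It shows that the same keyword can be used as a key name under different parents, and that listing the keywords of one record amounts to selecting the keys that start with that record's path.

```python
# Toy model: a key is the full path from the root entity down to the child.
store = {}


def record_key(name):
    return ("Record", name)


def keyword_key(parent, keyword):
    # The keyword is used as the child's key name under its parent record.
    return parent + ("Keyword", keyword)


def add_record(name, keywords):
    parent = record_key(name)
    store[parent] = {"value": name}
    for kw in keywords:
        store[keyword_key(parent, kw)] = {}


def keywords_of(name):
    # Like an ancestor query: keep the Keyword keys that start with the parent path.
    parent = record_key(name)
    return sorted(k[3] for k in store
                  if len(k) == 4 and k[:2] == parent and k[2] == "Keyword")


def records_with(keyword):
    # The key name alone does not identify an entity without its parent,
    # so this toy version has to look at every Keyword key.
    return sorted(k[1] for k in store
                  if len(k) == 4 and k[2] == "Keyword" and k[3] == keyword)


add_record("example.com", ["holiday", "travel"])
add_record("example.org", ["holiday"])

com_keywords = keywords_of("example.com")
holiday_records = records_with("holiday")
```

Here "holiday" exists twice as a key name without conflict, because each full key also contains the parent. Going the other way, from a keyword to its records, is the awkward direction in this model, which may be one reason to keep the keyword in an indexed property as well.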

[google-appengine] Re: Datastore DB Design

2009-08-25 Thread Gary
Sorry Wooble I forgot to thank you for your help. Here is my code, I've been using the batch updater to populate my records, I have about 32,000 records now. http://pastie.org/593968 A query on that code from the dash logs, there's about 50 domains that have the keyword 'holiday' /?keyword=h

[google-appengine] Re: Datastore DB Design

2009-08-25 Thread Gary
Great, this is how I've done it. I've used the filter to do some testing, and with only 32,424 values and approximately 500,000 keywords a search is taking 3000ms CPU - will this stay the same as my datastore expands to 4.5 million values and potentially 100 million keywords? Gary On Aug 18, 10:4

[google-appengine] Re: Datastore DB Design

2009-08-18 Thread Wooble
You can do it like this (in Python):
    class Record(db.Model):
        value = db.StringProperty()
        keywords = db.StringListProperty()
No IN query is necessary to find values with a specific keyword or even multiple keywords, just do:
    filteredquery = Record.all().filter("keywords =", "mykeyword").filter ("key
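Why chained equality filters on a list property can replace an IN query is easier to see with a small in-memory model. The sketch below is an assumption-laden illustration, not the datastore's implementation: `ListPropertyIndex`, `put`, `filter_all` and the sample records are hypothetical. Each list element gets its own index row, and filtering on several keywords keeps only the records present in every row set.

```python
from collections import defaultdict


class ListPropertyIndex:
    """Toy model of an index over a list property (not the real datastore)."""

    def __init__(self):
        # keyword -> set of record keys whose list contains that keyword
        self._rows = defaultdict(set)

    def put(self, record_key, keywords):
        # One index row per distinct list element.
        for kw in set(keywords):
            self._rows[kw].add(record_key)

    def filter_all(self, *keywords):
        # Chained equality filters behave like an intersection of the row sets.
        if not keywords:
            return []
        sets = [self._rows.get(kw, set()) for kw in keywords]
        return sorted(set.intersection(*sets))


index = ListPropertyIndex()
index.put("example.com", ["holiday", "travel"])
index.put("example.org", ["holiday", "hotel"])
index.put("example.net", ["news"])

holiday = index.filter_all("holiday")
holiday_travel = index.filter_all("holiday", "travel")
```

A record with around 30 keywords adds around 30 rows in this model, so index size grows with the total number of keywords rather than with the number of records.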