Hi Philippe,

On average around 30.

Thanks,

Gary

On Aug 26, 3:43 pm, Philippe <philippe.cr...@gmail.com> wrote:
> One more question: how many keywords do you usually have for one
> record?
>
> On Aug 26, 2:44 am, Gary <gbre...@gmail.com> wrote:
>
> > Thanks for your help Philippe,
>
> > So does anyone have an idea of how to store this data in a scalable
> > manner? I was sure Wooble's first answer was going to be the fix for
> > this, but it appears it doesn't scale: CPU time is increasing to 3
> > seconds with just 32,000 records. What will it be at about 1 million
> > records?
>
> > Would be interested to hear from anyone who has successfully stored
> > and queried a large dataset on appengine.
>
> > Thanks,
>
> > Gary
>
> > On Aug 26, 12:16 am, Philippe <philippe.cr...@gmail.com> wrote:
>
> > > sad, I did not think about that :/
>
> > > On Aug 25, 2:49 pm, Gary <gbre...@gmail.com> wrote:
>
> > > > Hi,
>
> > > > This solution doesn't seem to work because
>
> > > > ". in addition, the key_name of a keyword must be the actual
> > > > keyword" isn't possible as I will have duplicate keywords and the key
> > > > name is actually just used as part of the key so data gets converted
> > > > to a string such as "gZid29zZW1yGQsSBkRvbWFpbiINYnVpbHR3aXRoLmNvbQw"
> > > > as a key value.
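
Gary's objection can be illustrated with a toy model in plain Python. It assumes, as the message above suggests, that the key name is only one component of a key that also includes the parent's path. The `store` dict and the helper names are hypothetical and are not part of the App Engine API.

```python
# Toy model, not the datastore API: a key is the full ancestor path,
# written as a tuple of (kind, name) pairs.
store = {}


def put(path, entity):
    """Store an entity under its full path."""
    store[tuple(path)] = entity


def get_by_key_name(kind, key_name, parent=()):
    """Look up one entity; the parent path is part of the key."""
    return store.get(tuple(parent) + ((kind, key_name),))


def parents_with_keyword(keyword):
    """Scan every key for Keyword children named `keyword`."""
    return sorted(path[:-1] for path in store if path[-1] == ("Keyword", keyword))


# Hypothetical sample data: two records that share the keyword "python".
for name in ("example.com", "example.org"):
    record = (("Record", name),)
    put(record, {"value": name})
    put(record + (("Keyword", "python"),), {})
```

In this model the same key name can exist under several parents without colliding, but a lookup by key name alone finds nothing, so locating every record for a keyword falls back to a scan over all keys.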
>
> > > > Gary
>
> > > > On Aug 25, 9:18 pm, Philippe <philippe.cr...@gmail.com> wrote:
>
> > > > > another option could be:
> > > > > class Record(db.Model):
> > > > >     value = db.StringProperty()
>
> > > > > class Keyword(db.Model):
> > > > >     # This value is not necessary, but I do not know if you can
> > > > >     # have a Model without properties.
> > > > >     value = db.StringProperty()
>
> > > > > The idea is that for one Record, you store several keywords as the
> > > > > record's children. In addition, the key_name of a keyword must be the
> > > > > actual keyword.
> > > > > Then you can simply get_by_key_name() the keyword, with a
> > > > > keys_only=true,
> > > > > and then ask for key.parent() to get the correct Record.key().
> > > > > Finally, db.get(that record key).
>
> > > > > A db.get() takes less than 100 ms. I read somewhere that a get(keys) is
> > > > > always faster and will not get slower as your number of entities
> > > > > increases.
>
> > > > > On Aug 25, 11:52 am, Gary <gbre...@gmail.com> wrote:
>
> > > > > > Great,
>
> > > > > > This is how I've done it. I've used the filter to do some testing,
> > > > > > and with only 32,424 values and approximately 500,000 keywords a
> > > > > > search is taking 3000 ms of CPU. Will this stay the same as my
> > > > > > datastore expands to 4.5 million values and potentially 100 million
> > > > > > keywords?
>
> > > > > > Gary
>
> > > > > > On Aug 18, 10:42 pm, Wooble <geoffsp...@gmail.com> wrote:
>
> > > > > > > You can do it like this (in Python):
>
> > > > > > > from google.appengine.ext import db
>
> > > > > > > class Record(db.Model):
> > > > > > >     value = db.StringProperty()
> > > > > > >     keywords = db.StringListProperty()
>
> > > > > > > No IN query is necessary to find values with a specific keyword or
> > > > > > > even multiple keywords; just do:
> > > > > > > filteredquery = (Record.all()
> > > > > > >                  .filter("keywords =", "mykeyword")
> > > > > > >                  .filter("keywords =", "myotherkeyword"))  # ...
>
> > > > > > > You can basically add as many .filter()s as you want thanks to
> > > > > > > merge join.  Normalizing would require an exploding index if you
> > > > > > > ever want to search records having more than 1 given keyword. You
> > > > > > > might, however, want to optimize a bit for space by saving lists of
> > > > > > > integers and mapping those integers to keywords, although you'll
> > > > > > > end up with a performance hit when you search, and storage is
> > > > > > > relatively cheap.
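
The merge-join idea mentioned above can be sketched in plain Python. This is only an illustration of intersecting per-keyword sorted lists, under the assumption that each keyword maps to a sorted list of record ids; it is not the datastore's actual implementation, and `build_index`, `merge_join`, and the sample records are hypothetical.

```python
from bisect import bisect_left


def build_index(records):
    """Map each keyword to a sorted list of record ids."""
    index = {}
    for record_id, keywords in records.items():
        for keyword in set(keywords):
            index.setdefault(keyword, []).append(record_id)
    for ids in index.values():
        ids.sort()
    return index


def merge_join(index, wanted):
    """Return the ids that appear in the list of every wanted keyword."""
    lists = [index.get(keyword, []) for keyword in wanted]
    if not lists:
        return []
    positions = [0] * len(lists)
    result = []
    while all(pos < len(ids) for pos, ids in zip(positions, lists)):
        current = [ids[pos] for pos, ids in zip(positions, lists)]
        target = max(current)
        if min(current) == target:
            # Every list agrees on this id, so it matches all keywords.
            result.append(target)
            positions = [pos + 1 for pos in positions]
        else:
            # Skip each list forward to its first id >= target.
            positions = [bisect_left(ids, target, pos)
                         for pos, ids in zip(positions, lists)]
    return result


# Hypothetical sample data.
records = {
    1: ["python", "django"],
    2: ["python", "jquery"],
    3: ["python", "django", "jquery"],
    4: ["django"],
}
index = build_index(records)
print(merge_join(index, ["python", "django"]))
```

Each extra keyword adds one more sorted list to step through, which is why no combined index over keyword pairs is needed in this sketch.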
>
> > > > > > > On Aug 18, 2:26 am, garazy <gbre...@gmail.com> wrote:
>
> > > > > > > > Hi,
>
> > > > > > > > I want to store 4.5 million data records, each of which has a
> > > > > > > > string identifier and 50 keywords associated with it.
>
> > > > > > > > For example,
>
> > > > > > > > value (string), keyword1 (string) ... keyword50 (string)
>
> > > > > > > > In a traditional database (SQL Server 2008), the best way that I
> > > > > > > > found was to store this in 3NF, such as
>
> > > > > > > > value_id (int)
> > > > > > > > value (string)
> > > > > > > > |
> > > > > > > > |
> > > > > > > > value_id (int)
> > > > > > > > keyword_id (int)
> > > > > > > > |
> > > > > > > > |
> > > > > > > > keyword_id (int)
> > > > > > > > keyword_value (string)
>
> > > > > > > > The data is mostly static. I typically use queries to find
> > > > > > > > values which use specific keywords, which in SQL uses IN
> > > > > > > > queries; however, I'm not sure if this is the best approach for
> > > > > > > > the App Engine datastore.
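
For reference, the 3NF layout and IN-style lookup described above might look roughly like the following sketch. It uses Python's built-in `sqlite3` rather than SQL Server, and the table names, the `find_values` helper, and the sample rows are assumptions made for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical table names mirroring the three-table layout in the message.
conn.executescript("""
    CREATE TABLE records (value_id INTEGER PRIMARY KEY, value TEXT);
    CREATE TABLE keywords (keyword_id INTEGER PRIMARY KEY,
                           keyword_value TEXT UNIQUE);
    CREATE TABLE record_keywords (
        value_id INTEGER REFERENCES records(value_id),
        keyword_id INTEGER REFERENCES keywords(keyword_id),
        PRIMARY KEY (value_id, keyword_id)
    );
""")
conn.executemany("INSERT INTO records VALUES (?, ?)",
                 [(1, "example.com"), (2, "example.org"), (3, "example.net")])
conn.executemany("INSERT INTO keywords VALUES (?, ?)",
                 [(1, "python"), (2, "django"), (3, "jquery")])
conn.executemany("INSERT INTO record_keywords VALUES (?, ?)",
                 [(1, 1), (1, 2), (2, 1), (2, 3), (3, 1), (3, 2), (3, 3)])


def find_values(conn, wanted):
    """Return values tagged with every keyword in `wanted` (no duplicates)."""
    placeholders = ",".join("?" * len(wanted))
    sql = f"""
        SELECT r.value
        FROM records r
        JOIN record_keywords rk ON rk.value_id = r.value_id
        JOIN keywords k ON k.keyword_id = rk.keyword_id
        WHERE k.keyword_value IN ({placeholders})
        GROUP BY r.value_id
        HAVING COUNT(DISTINCT k.keyword_id) = ?
        ORDER BY r.value
    """
    return [row[0] for row in conn.execute(sql, (*wanted, len(wanted)))]


print(find_values(conn, ["python", "django"]))
```

The `HAVING` clause is what turns "any of these keywords" into "all of these keywords"; without it the IN query returns records that match only one of them.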
>
> > > > > > > > Do people recommend storing the data as
>
> > > > > > > > value, keyword0, keyword1, keyword2 etc..
>
> > > > > > > > or in the same sort of way as I store it in SQL Server?
>
> > > > > > > > What would the performance of the App Engine datastore be for
> > > > > > > > this kind of store if I wanted to find values which have, say,
> > > > > > > > 2 specific keywords?
>
> > > > > > > > Thanks,
>
> > > > > > > > Gary
