Thanks for the clarification.

On Aug 26, 11:44 am, "Nick Johnson (Google)" <nick.john...@google.com>
wrote:
> On Wed, Aug 26, 2009 at 10:28 AM, Philippe <philippe.cr...@gmail.com> wrote:
>
> > Nick, what about list deserialisation?
> > Is there no impact on a large data set with 30-element lists?
>
> There's a per-element cost to deserializing entities, yes - so larger lists
> will take longer. 30 elements is not a particularly large list in the grand
> scheme of things, however.
>
> -Nick Johnson
>
> > On Aug 26, 11:22 am, "Nick Johnson (Google)" <nick.john...@google.com>
> > wrote:
> > > Hi Gary,
> > > The time a query takes to execute is proportional to the number of
> > > results returned, not the number in the datastore. I suspect your tests
> > > have returned more results when you had more entries (because there were
> > > more matches, and you were asking for them all)?
>
> > > -Nick Johnson
>
> > > On Wed, Aug 26, 2009 at 1:44 AM, Gary <gbre...@gmail.com> wrote:
>
> > > > Thanks for your help Philippe,
>
> > > > So does anyone have an idea of how to store this data in a scalable
> > > > manner? I was sure Wooble's first answer was going to be the fix for
> > > > this, but it appears it doesn't scale: CPU time is increasing to 3
> > > > seconds with just 32,000 records. What will it be at about 1 million
> > > > records?
>
> > > > Would be interested to hear from anyone who has successfully stored
> > > > and queried a large dataset on appengine.
>
> > > > Thanks,
>
> > > > Gary
>
> > > > On Aug 26, 12:16 am, Philippe <philippe.cr...@gmail.com> wrote:
> > > > > sad, I did not think about that :/
>
> > > > > On Aug 25, 2:49 pm, Gary <gbre...@gmail.com> wrote:
>
> > > > > > Hi,
>
> > > > > > > This solution doesn't seem to work because
>
> > > > > > > ". in addition, the key_name of a keyword must be the actual
> > > > > > > keyword" isn't possible, as I will have duplicate keywords, and the
> > > > > > > key name is actually just used as part of the key, so data gets
> > > > > > > converted to a string such as
> > > > > > > "gZid29zZW1yGQsSBkRvbWFpbiINYnVpbHR3aXRoLmNvbQw"
> > > > > > > as a key value.
>
> > > > > > Gary
>
> > > > > > On Aug 25, 9:18 pm, Philippe <philippe.cr...@gmail.com> wrote:
>
> > > > > > > > another option could be:
> > > > > > > > class Record(db.Model):
> > > > > > > >     value = db.StringProperty()
>
> > > > > > > > class Keyword(db.Model):
> > > > > > > >     # this value is not necessary, but I do not know if you
> > > > > > > >     # can have a Model without properties
> > > > > > > >     value = db.StringProperty()
>
> > > > > > > > the idea is that for one Record, you input several keywords as
> > > > > > > > Record children. in addition, the key_name of a keyword must be
> > > > > > > > the actual keyword.
> > > > > > > > then, you can simply get_by_key_name() the keyword, with
> > > > > > > > keys_only=True,
> > > > > > > > and thus ask the key.parent() to get the correct Record.key().
> > > > > > > > finally, db.get(that record key).
>
> > > > > > > > a db.get() takes less than 100ms. I read somewhere that a
> > > > > > > > get(keys) is always faster and will not get slower if your number
> > > > > > > > of entities increases.
>
> > > > > > > On Aug 25, 11:52 am, Gary <gbre...@gmail.com> wrote:
>
> > > > > > > > Great,
>
> > > > > > > > > This is how I've done it. I've used the filter to do some
> > > > > > > > > testing, and with only 32,424 values and approximately 500,000
> > > > > > > > > keywords a search is taking 3000ms CPU - will this stay the same
> > > > > > > > > as my datastore expands to 4.5 million values and potentially
> > > > > > > > > 100 million keywords?
>
> > > > > > > > Gary
>
> > > > > > > > On Aug 18, 10:42 pm, Wooble <geoffsp...@gmail.com> wrote:
>
> > > > > > > > > You can do it like (in python):
>
> > > > > > > > > > class Record(db.Model):
> > > > > > > > >     value = db.StringProperty()
> > > > > > > > >     keywords = db.StringListProperty()
>
> > > > > > > > > > No IN query is necessary to find values with a specific keyword
> > > > > > > > > > or even multiple keywords, just do:
> > > > > > > > > > filteredquery = Record.all().filter("keywords =", "mykeyword")
> > > > > > > > > >     .filter("keywords =", "myotherkeyword")...
>
> > > > > > > > > > You can basically add as many .filter()s as you want thanks to
> > > > > > > > > > merge join.  Normalizing would require an exploding index if you
> > > > > > > > > > ever want to search records having more than 1 given keyword. You
> > > > > > > > > > might, however, want to optimize a bit for space by saving lists
> > > > > > > > > > of integers and mapping those integers to keywords, although
> > > > > > > > > > you'll end up with a performance hit when you search and storage
> > > > > > > > > > is relatively cheap.
>
> > > > > > > > > > On Aug 18, 2:26 am, garazy <gbre...@gmail.com> wrote:
>
> > > > > > > > > > Hi,
>
> > > > > > > > > > > I want to store 4.5 million data records, each of which has a
> > > > > > > > > > > string identifier and 50 keywords associated with it.
>
> > > > > > > > > > For example,
>
> > > > > > > > > > value (string), keyword1 (string) ... keyword50 (string)
>
> > > > > > > > > > > In SQL Server 2008, a traditional database, the best way that
> > > > > > > > > > > I found was to store this in 3NF, such as
>
> > > > > > > > > > value_id (int)
> > > > > > > > > > value (string)
> > > > > > > > > > |
> > > > > > > > > > |
> > > > > > > > > > value_id (int)
> > > > > > > > > > keyword_id (int)
> > > > > > > > > > |
> > > > > > > > > > |
> > > > > > > > > > keyword_id (int)
> > > > > > > > > > keyword_value (string)
>
> > > > > > > > > > > The data is mostly static. I typically use queries to find
> > > > > > > > > > > values which use specific keywords, which in SQL uses IN
> > > > > > > > > > > queries; however, I'm not sure if this is the best approach for
> > > > > > > > > > > the app engine datastore.
>
> > > > > > > > > > Do people recommend the data stored as-
>
> > > > > > > > > > value, keyword0, keyword1, keyword2 etc..
>
> > > > > > > > > > or in the same sort of way as I store it in SQL Server?
>
> > > > > > > > > > > What would the performance be using the app engine for this
> > > > > > > > > > > sort of datastore if I wanted to find values which have, say,
> > > > > > > > > > > 2 specific keywords?
>
> > > > > > > > > > Thanks,
>
> > > > > > > > > > Gary
>
> > > --
> > > Nick Johnson, Developer Programs Engineer, App Engine
>
> --
> Nick Johnson, Developer Programs Engineer, App Engine
