I suggest you watch the IO talk where Brett Slatkin discusses Merge
Joins and pre-computing ranges.

http://www.youtube.com/watch?v=AgaL6NGpkB8

Watch the second half (from about 34 minutes in), and pay particular
attention to the section that comes just after that (around 41 minutes).

This implies you do not need composite indexes (or to create any new
indexes beyond the default ones) for all sorts of queries if you
construct your data in the right way.
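
Roughly what I have in mind, as a plain-Python sketch rather than real
datastore code. The thresholds, the year_month and day_tags names, and the
helper functions are all made up for illustration. On App Engine, day_tags
would presumably be a list property, and I have not yet verified how the
indexes behave. The idea is that "June > 15" turns into an equality test
against a tag computed at write time, so the query only has equality filters.

```python
# Sketch only: simulates "pre-computed ranges" with in-memory dicts.
# THRESHOLDS, year_month and day_tags are hypothetical names, not an API.

THRESHOLDS = (5, 10, 15, 20, 25)


def range_tags(day):
    """Return the 'greater than' tags that apply to a day of the month."""
    return ["gt%d" % t for t in THRESHOLDS if day > t]


def make_record(year, month, day, value):
    """Build one stats record with its range tags computed at write time."""
    return {
        "year_month": "%04d-%02d" % (year, month),
        "day": day,
        "day_tags": range_tags(day),
        "value": value,
    }


def equality_query(records, year_month, tag):
    """Stand-in for a query using equality filters only (no '>')."""
    return [
        r for r in records
        if r["year_month"] == year_month and tag in r["day_tags"]
    ]


if __name__ == "__main__":
    records = [make_record(2009, 6, d, d * 10) for d in (3, 15, 16, 30)]
    records.append(make_record(2009, 7, 21, 381))
    for r in equality_query(records, "2009-06", "gt15"):
        print(r["year_month"], r["day"], r["value"])
```

The trade-off is that you can only ask about the cut-off points you chose
up front, and every entity carries a few extra values.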

I will test this out tonight to provide a proof of concept.



On Nov 3, 10:12 am, Tim Hoffman <zutes...@gmail.com> wrote:
> Hi
>
> On Nov 3, 10:26 pm, Eli Jones <eli.jo...@gmail.com> wrote:
>
>
>
>
>
> > I haven't done any testing on this yet since I'd have to fill up tens
> > of gigs of information to see real live performance numbers.
>
> > I'm hoping the implicit partitioning makes it so that one doesn't need
> > manually created indexes (just the default ones).
>
> > The example I showed would be a schema for storing a daily int statistic.
>
> > The 'June' column entries would show the day of that month and the
> > 'y2009' column would have the 6 value since June is the 6th month of
> > the year.
>
> > If I wanted stats for June, my select would look like this:
>
> > Select * From meStats Where y2009 = 6 AND June > 15
>
> But the minute you do this ">" you will then need an index that looks
> like
>
> - kind: meStats
>   properties:
>   - name: y2009
>   - name: June
>
> and so on for every year month combination where you do a >
> comparison.
>
> I think you should have a read about how indexes are created and
> accessed before you try optimising something that probably doesn't
> need it.
>
> Note the rules from the index definition doc:
> http://code.google.com/appengine/docs/python/datastore/queriesandinde...
>
> Other forms of queries require their indexes to be specified in
> index.yaml, including:
>
>     * queries with multiple sort orders
>     * queries with a sort order on keys in descending order
>     * queries with one or more inequality filters on a property and
> one or more equality filters over other properties
>     * queries with inequality filters and ancestor filters
>
> You fall under the third rule, which, as I said earlier, will mean you
> need to manually specify a massive number of indexes in index.yaml.
>
> Rgds
>
> T
>
>
>
> > This would/should implicitly hit the june rows for 2009 and get the
> > stats for every day after the 15th.
>
> > You could munge around your column names and the values inserted to
> > get different data reporting behaviour..
>
> > The main, potential value is the implicit partitioning (where you
> > don't need to manually define a bunch of schemas up front).
>
> > On 11/3/09, Tim Hoffman <zutes...@gmail.com> wrote:
>
> > > Hi
>
> > > Have you tried this?
>
> > > For starters you can't assign values to numbers.
>
> > > ie no matter what you do you can't assign 2009 = 'abc'
>
> > > You would need to use some other identifier as you mentioned and then
> > > specify something like
> > > year_2009 = db.IntegerProperty(name='2009') or something similar.
>
> > > I also see a problem with this strategy with regard to index
> > > definitions.
> > > Whilst running the SDK the indexes will get created as you define data
> > > however once you are running
> > > in real google environment you will need to make sure you have already
> > > defined all possible indexes that you
> > > plan to use before you create any new data (or reindex everything),
> > > which means indexes for all years you plan to hold data for and
> > > search,
> > > and months, and combinations of the two.
>
> > > I am not sure this is a particularly good approach, but then I am not
> > > sure I get what you are actually doing.
>
> > > Have you compared the performance of lookups between the two
> > > strategies, also remembering if you are actually interested in year/
> > > month then you are
> > > actually using composite indexes,  I wonder if you will ever use the
> > > month only index (apart from comparing months with months for all
> > > years in no particular order)
>
> > > Rgds
>
> > > T
>
> > > On Nov 3, 12:22 am, Eli <eli.jo...@gmail.com> wrote:
> > >> Here's something I've been wondering about Expando.
>
> > >> Say you define an Expando model like so:
>
> > >> class meStats(db.Expando):
> > >>     meNumber = db.IntegerProperty(required=True)
>
> > >> And, then you begin populating it like so:
>
> > >> meEntity1 = meStats(meNumber = 200,
> > >>                                 June          = 14,
> > >>                                 2009          = 6)
>
> > >> meEntity1.put()
>
> > >> meEntity2 = meStats(meNumber = 381,
> > >>                                 July           = 21,
> > >>                                 2009          = 7)
>
> > >> meEntity2.put()
>
> > >> ..and so on.
>
> > >> The "July" column only has indexes for entities that have "July"
> > >> defined.. correct?  So, in effect, I am creating a partitioned index
> > >> for a table that can grow indefinitely.. and each time I get to a new
> > >> year/month combo, I am inserting into new indexes..? (instead of
> > >> inserting into an ever increasing, monolithic "Month" column index..)
>
> > >> Mainly, I'm packing the pertinent information into the column names
> > >> and column values (instead of making the column name just some dummy
> > >> value like "Month").. this allows me to implicitly create the
> > >> partitioned table/index (I think of it as a partitioned index since it
> > >> is, schematically [as far as I'm concerned], one table.)
>
> > >> You could give the columns better names.. maybe "June_Day" and maybe
> > >> "2009_Month" if you wanted...
>
> > >> Does this make sense?  Have I misunderstood how Expando handles
> > >> indexes?
>
> > >> Another way to word this question would be:
>
> > >> Is there a difference between the indexes created for the June and
> > >> July entries in the above Expando model and the below Model models:
>
> > >> class meJune09Stats(db.Model):
> > >>     meNumber = db.IntegerProperty(required=True)
> > >>     June = db.IntegerProperty(required=True)
> > >>     2009 = db.IntegerProperty(required=True)
>
> > >> class meJuly09Stats(db.Model):
> > >>     meNumber = db.IntegerProperty(required=True)
> > >>     July = db.IntegerProperty(required=True)
> > >>     2009 = db.IntegerProperty(required=True)
>
> > >> Thanks for any information.
>
> > --
> > Sent from my mobile device
