[google-appengine] Re: Query with >1000 matches

2009-06-22 Thread Nick Johnson (Google)
Hi herbie,

The first 1000 results of a query are the ones returned. If you do not
specify a sort order, entities are returned sorted by their keys.

-Nick Johnson

On Mon, Jun 22, 2009 at 1:42 PM, herbie <4whi...@o2.co.uk> wrote:

>
> I know that if there are more than 1000 entities that match a query,
> then only 1000 will be returned by fetch().  But my question is which
> 1000? The last 1000 added to the datastore?  The first 1000 added to
> the datastore? Or is it undefined?
>
> Thanks
> Ian
>
> >
>


-- 
Nick Johnson, App Engine Developer Programs Engineer
Google Ireland Ltd. :: Registered in Dublin, Ireland, Registration Number:
368047

--~--~-~--~~~---~--~~
You received this message because you are subscribed to the Google Groups 
"Google App Engine" group.
To post to this group, send email to google-appengine@googlegroups.com
To unsubscribe from this group, send email to 
google-appengine+unsubscr...@googlegroups.com
For more options, visit this group at 
http://groups.google.com/group/google-appengine?hl=en
-~--~~~~--~~--~--~---



[google-appengine] Re: Query with >1000 matches

2009-06-22 Thread herbie


So to be sure to get the latest 1000 entities, I should add a datetime
property to my entity model and filter and sort on that?






[google-appengine] Re: Query with >1000 matches

2009-06-22 Thread Nick Johnson (Google)
Correct. Are you sure you need 1000 entities, though? Your users probably
won't read through all 1000.

-Nick Johnson




[google-appengine] Re: Query with >1000 matches

2009-06-22 Thread herbie

No, the users won't need to read 1000 entities, but I want to calculate
the average of a property from the latest 1000 entities.





[google-appengine] Re: Query with >1000 matches

2009-06-22 Thread Nick Johnson (Google)
Consider precalculating this data and storing it against another entity.
This will save a lot of work on requests.

-Nick Johnson




[google-appengine] Re: Query with >1000 matches

2009-06-22 Thread herbie

OK. Say I have many (>1000) Model entities with two properties, 'x'
and 'date'.  What is the most efficient query to fetch, say, the
latest 200 entities where x > 50?  I don't care what their 'date's
are, as long as I get the latest and x > 50.

Thanks again for your help.





[google-appengine] Re: Query with >1000 matches

2009-06-22 Thread Tony

You could accomplish this task like so:

xlist = []
query = Foo.all().filter("property_x >", 50).order("-timestamp")
for q in query:
    xlist.append(q.property_x)
avg = sum(xlist) / len(xlist)

What Nick is saying, I think, is that fetching 1000 entities is going
to be very resource-intensive, so a better way to do it is to
calculate this data at write-time instead of read-time.  For example,
every time you add an entity, you could update a separate entity that
has a property like "average = db.FloatProperty()" with the current
average, and then you could simply fetch that entity and get the
current running average.




[google-appengine] Re: Query with >1000 matches

2009-06-22 Thread Tony

I should clarify, I'm not saying move the 1000-entity fetch to the
write process, but instead keep a running total of sum and count that
you can increment and use to calculate the average, rather than having
to fetch entities.  This doesn't solve the use case of "average of
last x entities", though (I made the assumption that you'd prefer to
have the average of >1000 entities if possible) - for that you could
use a list property of length x as a queue and use the sum() and len()
functions to get the average.
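
As a rough illustration of the fixed-length queue idea, here is a minimal
in-memory sketch in plain Python. It does not touch the Datastore API:
RecentAverage and its window parameter are hypothetical names, and in a real
app the values would presumably live in a list property on a stored entity.

```python
from collections import deque


class RecentAverage(object):
    """Keeps only the most recent `window` values (hypothetical helper)."""

    def __init__(self, window):
        # A deque with maxlen discards the oldest value automatically.
        self.values = deque(maxlen=window)

    def add(self, x):
        self.values.append(x)

    def average(self):
        if not self.values:
            return None  # no data yet
        return float(sum(self.values)) / len(self.values)


recent = RecentAverage(window=3)
for x in [10, 20, 30, 40]:
    recent.add(x)
print(recent.average())  # average of the last three values: 20, 30, 40
```

The read side then only needs the one stored list, not a query over
the underlying entities.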




[google-appengine] Re: Query with >1000 matches

2009-06-22 Thread herbie

I tried your query below but I get "BadArgumentError: First ordering
property must be the same as inequality filter property, if specified
for this query;"
Does this mean I have to order on 'x' first, then order on 'date'?
Will this still return the latest 200 of all entities with x > 50 if
I  call query.fetch(200)?


I take your and Nick's point about keeping a 'running average'.  But in
my example the user can change the 'x' value, so the average has to be
recalculated from the latest entities.





[google-appengine] Re: Query with >1000 matches

2009-06-22 Thread Tony

Yes, that is what it means.  I forgot about that restriction.

I see what you mean about changing 'x' values.  Perhaps consider
keeping two counts - a running sum and a running count (of the # of x
properties).  If a user modifies an 'x' value, you can adjust the sum
up or down accordingly.
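
A minimal plain-Python sketch of that bookkeeping, assuming the old value is
known when an edit happens. RunningStats and its method names are
hypothetical; in a real app the two counters would be properties on a stored
entity, and the update would presumably need to run in a transaction.

```python
class RunningStats(object):
    """Running sum and count, maintained at write time (hypothetical helper)."""

    def __init__(self):
        self.total = 0.0
        self.count = 0

    def add(self, x):
        # A new entity was written.
        self.total += x
        self.count += 1

    def modify(self, old_x, new_x):
        # An existing value changed: shift the sum by the difference.
        # The count stays the same.
        self.total += new_x - old_x

    def average(self):
        if self.count == 0:
            return None  # nothing recorded yet
        return self.total / self.count


stats = RunningStats()
for x in [60, 80, 100]:
    stats.add(x)
stats.modify(old_x=60, new_x=90)  # a user edits one value
print(stats.average())  # (90 + 80 + 100) / 3
```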




[google-appengine] Re: Query with >1000 matches

2009-06-23 Thread herbie


So will this:
query = Foo.all().filter("property_x >", 50).order("property_x").order("-timestamp")
results = query.fetch(200)

...get the latest entities where property_x > 50?  Or will it get the
200 entities with the largest 'property_x', which are then ordered
by 'timestamp'?  A subtle but important difference.

As I said, I need to make sure I get the latest entities.





[google-appengine] Re: Query with >1000 matches

2009-06-23 Thread Nick Johnson (Google)
On Tue, Jun 23, 2009 at 12:42 PM, herbie <4whi...@o2.co.uk> wrote:

>
>
> So will this:
> query = Foo.all().filter("property_x >", 50).order("property_x").order("-timestamp")
> results = query.fetch(200)
>
> ...get the latest entities where property_x > 50?  Or will it get the
> 200 entities with the largest 'property_x', which are then ordered
> by 'timestamp'?  A subtle but important difference.


It will get the 200 entities with the smallest property_x greater than 50
(since you're filtering >50 and ordering first by property_x). If two
entities have the same value for property_x, they will be sorted by
timestamp, descending.
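
To make the ordering concrete, here is a small plain-Python simulation of that
sort on made-up data. It only mimics the behaviour described above with a
list sort and does not call the Datastore; Entity and smallest_x_first are
hypothetical names.

```python
from collections import namedtuple

Entity = namedtuple("Entity", ["property_x", "timestamp"])

# Made-up sample data; a larger timestamp means a newer entity.
entities = [
    Entity(property_x=90, timestamp=1),
    Entity(property_x=51, timestamp=2),
    Entity(property_x=40, timestamp=3),
    Entity(property_x=75, timestamp=4),
    Entity(property_x=51, timestamp=5),
]


def smallest_x_first(items, threshold, limit):
    """Mimic: filter x > threshold, order by x ascending, then timestamp descending."""
    matches = [e for e in items if e.property_x > threshold]
    matches.sort(key=lambda e: (e.property_x, -e.timestamp))
    return matches[:limit]


result = smallest_x_first(entities, threshold=50, limit=2)
print(result)
```

With a limit of 2, both results have property_x == 51, and the entity with
timestamp 2 is returned while the newer entity with timestamp 4 is left out.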

If you need the latest, and your threshold of 50 is a constant, you can add
a BooleanProperty to your entity encoding the condition 'is greater
than 50', and filter on that using an equality filter.
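
A plain-Python sketch of why the flag helps, assuming the threshold really is
fixed: with an equality test on a precomputed boolean there is no inequality
filter, so the results can be ordered by timestamp alone. This is an in-memory
simulation, not Datastore code; over_threshold, make_entity and
latest_over_threshold are hypothetical names.

```python
from collections import namedtuple

Entity = namedtuple("Entity", ["property_x", "timestamp", "over_threshold"])

THRESHOLD = 50  # assumed constant, as in the suggestion above


def make_entity(property_x, timestamp):
    # The flag is computed once, at write time.
    return Entity(property_x, timestamp, property_x > THRESHOLD)


# Made-up sample data; a larger timestamp means a newer entity.
entities = [
    make_entity(90, 1),
    make_entity(51, 2),
    make_entity(40, 3),
    make_entity(75, 4),
    make_entity(51, 5),
]


def latest_over_threshold(items, limit):
    """Mimic: equality filter on the flag, then order by timestamp descending."""
    matches = [e for e in items if e.over_threshold]
    matches.sort(key=lambda e: e.timestamp, reverse=True)
    return matches[:limit]


latest = latest_over_threshold(entities, limit=2)
print([(e.property_x, e.timestamp) for e in latest])
```

If a user later edits property_x, the flag has to be recomputed in the same
write.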

-Nick Johnson



[google-appengine] Re: Query with >1000 matches

2009-06-23 Thread herbie

Thanks for your help, Nick.

No, my threshold value 'x' isn't constant. I still haven't got my head
round this yet! Can you tell me how to get the latest entities
(assuming I don't want all of them) out of the datastore and filter
on another property?

For example: get the latest 200 entities where x > 50. I don't
care what their dates are, as long as I get the latest and x > 50.


On Jun 23, 1:16 pm, "Nick Johnson (Google)" wrote:
> On Tue, Jun 23, 2009 at 12:42 PM, herbie <4whi...@o2.co.uk> wrote:
>
> > So will this:
> > query = Foo.all().filter("property_x >", 50).order("property_x").order("-timestamp")
> > results = query.fetch(200)
>
> > ...get the latest entities where property_x > 50? Or will it get the
> > 200 entities with the largest 'property_x', which are then ordered
> > by 'timestamp'? A subtle but important difference.
>
> It will get the 200 entities with the smallest property_x greater than 50
> (since you're filtering >50 and ordering first by property_x). If two
> entities have the same value for property_x, they will be sorted by
> timestamp, descending.
>
> If you need the latest, and your threshold of 50 is a constant, you can add
> a BooleanProperty to your entity group encoding the condition 'is greater
> than 50', and filter on that using an equality filter.
>
> -Nick Johnson

[google-appengine] Re: Query with >1000 matches

2009-06-23 Thread Nick Johnson (Google)
Hi herbie,

If your query includes an inequality (such as x>50), then your first sort
order has to be on the same property as that inequality, which means you
can't (directly) fetch the most recent 200 results with x>50. You either
need to change your query to use only equality filters, or you need to fetch
extra results, then sort them in memory and only take the most recent ones.

-Nick Johnson
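
The second option above (fetch extra results, sort them in memory, keep the most recent) can be sketched in plain Python. This is only an illustration, not App Engine code: `Reading`, `inequality_query` and `latest_matching` are hypothetical names, and the list-based `inequality_query` merely imitates what a query filtered on x > threshold and ordered by x would return.

```python
from collections import namedtuple
from operator import attrgetter

# Hypothetical stand-in for a datastore entity; not part of the App Engine API.
Reading = namedtuple("Reading", ["x", "timestamp"])


def inequality_query(entities, threshold, limit):
    """Imitate a query filtered on x > threshold, ordered by x, limited to `limit`."""
    matches = sorted((e for e in entities if e.x > threshold), key=attrgetter("x"))
    return matches[:limit]


def latest_matching(entities, threshold, want, fetch_limit=1000):
    """Fetch up to fetch_limit matches, then keep the `want` most recent ones."""
    candidates = inequality_query(entities, threshold, fetch_limit)
    # Re-sort in memory, newest first, because the query itself had to sort by x.
    candidates.sort(key=attrgetter("timestamp"), reverse=True)
    return candidates[:want]


readings = [Reading(x=i % 100, timestamp=i) for i in range(500)]
newest = latest_matching(readings, threshold=50, want=5)
print([r.timestamp for r in newest])
```

Note that the result is only guaranteed to be the true "latest" set when fetch_limit covers every matching entity. If more entities match than are fetched, the query hands back the ones with the smallest x, and newer entities with larger x are silently missed.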


[google-appengine] Re: Query with >1000 matches

2009-06-23 Thread herbie

Oh, really? That limits my app somewhat. I assume that if I have no
inequality filters and order by date, I will get the latest entities?

I could then filter these in memory for x > threshold.



[google-appengine] Re: Query with >1000 matches

2009-06-23 Thread Nick Johnson (Google)
On Tue, Jun 23, 2009 at 3:27 PM, herbie <4whi...@o2.co.uk> wrote:

>
> Oh, really? That limits my app somewhat. I assume that if I have no
> inequality filters and order by date, I will get the latest entities?
>
> I could then filter these in memory for x > threshold.


Correct.

-Nick Johnson
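
The approach confirmed here (order by date with no inequality filter, then filter in memory) might look roughly like the following. It is a plain-Python sketch under assumptions, not App Engine code: `Reading`, `newest_first_query` and `latest_above` are hypothetical names, and the list-based query only imitates a fetch ordered by descending timestamp.

```python
from collections import namedtuple
from operator import attrgetter

# Hypothetical stand-in for a datastore entity; not part of the App Engine API.
Reading = namedtuple("Reading", ["x", "timestamp"])


def newest_first_query(entities, limit):
    """Imitate a query ordered by descending timestamp, limited to `limit`."""
    return sorted(entities, key=attrgetter("timestamp"), reverse=True)[:limit]


def latest_above(entities, threshold, want, batch_size=1000):
    """Take the newest batch, then keep up to `want` entities with x > threshold."""
    batch = newest_first_query(entities, batch_size)
    # The threshold test runs in memory, so it can change per request.
    return [e for e in batch if e.x > threshold][:want]


readings = [Reading(x=i % 100, timestamp=i) for i in range(2000)]
print([r.timestamp for r in latest_above(readings, threshold=50, want=3)])
print(len(latest_above(readings, threshold=95, want=200, batch_size=10)))
```

The trade-off is that the function can return fewer than `want` entities when few of the fetched batch pass the threshold, as the second call shows. A larger batch or a further fetch would be needed in that case.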

