Hi, I was wondering if anyone can help me with this problem. We have an idea we'd like to implement, and we're currently unable to do this efficiently.
I've anonymised the data as best as possible, but the structure is the same. We have two entities, Car and CarJourney. Each Car has 0 to many CarJourney's. Each Car Journey has (amongst other properties) a date associated with it - the date the journey was started. I wish to query by time over car journeys. I'll have two times, a start date and an end date, where start date <= endDate, and I want to receive the most recently started journey in that period. So, if I had a particular car in mind, say car 123, I'd write a query that limits by Car.key and Car.startDate, where Car.key == 123 and Journey.startDate >= startDate and Journey.startDate <= endDate with an ordering on Journey.startDate descending and a limit of 1. e.g. Car A has 3 journeys, taken on 1st, 2nd and the 3rd of the month. The query start date is 1st and the query end date is the 2nd. The result of this query would be one Car journey, the 2nd. Once the result of that query is returned, a very small amount of processing is done to return a result to the user. That's the easy bit. But, instead of over 1 Car, I want a list of cars, where the list contains N keys to cars. So, I want to run the above query N times, once for every car. And I want the latest journey for each car. Because the time range is flexible (and thus can't be known beforehand) we can't implement a "isMostRecent" flag, because while it might be the most recent for now, it might not be the most recent for the specified date parameters. We also need to ensure that this returns promptly (current queries are around the 3-5 second mark for a small set of data) as this goes straight back to the user. This means that we can't use task queues, and because the specified dates are arbitrary we can't implement mass indexing of "isWithinDate" fields. We tried using an async query, but because the amount of processing is negligible the bottleneck is still the queries on the datastore (because the async api still sends the requests synchronously, it just doesn't block). Ideally, we'd implement this as a select on car journeys ordered by startDate where the Car.key is distinct, but we can't seem to pull this off in GAE. There are numerous small optimisations we can make (for example, some MemCaching of repeated queries) but none have made a significant dent in our query time. And MemCaching can only help for a maximum of 1-2 minutes (due to the inevitable forward march of time!) Any ideas are most welcome and highly appreciated. Thanks, Ed -- You received this message because you are subscribed to the Google Groups "Google App Engine" group. To post to this group, send email to google-appengine@googlegroups.com. To unsubscribe from this group, send email to google-appengine+unsubscr...@googlegroups.com. For more options, visit this group at http://groups.google.com/group/google-appengine?hl=en.