Re: [google-appengine] Re: How to write optimal queries?
Slide 80: - If you write N entities that are all part of 1 entity group, it counts as 1 write.

Does this mean that a batch write in a transaction is charged as 1 datastore operation? Is a batch get or query that reads many items from 1 entity group also charged as 1 datastore operation?

2011/7/8 Ikai Lan (Google) ika...@google.com

Before I try to answer this question, can you take a look at these slides? Hopefully these should clarify why things work the way they work:

http://www.slideshare.net/ikailan/introducing-the-app-engine-datastore

Ideally, these will raise new questions about how entity groups, index scans, etc. work.

Ikai Lan
Developer Programs Engineer, Google App Engine
Blog: http://googleappengine.blogspot.com
Twitter: http://twitter.com/app_engine
Reddit: http://www.reddit.com/r/appengine

On Thu, Jul 7, 2011 at 7:37 PM, Pol i...@pol-online.net wrote:

I'm a little confused by your answer and the concept of zigzag joins. In practice, assuming the result set will end up being the same, is either of these 2 queries faster / better, and why exactly?

    query = db.GqlQuery("SELECT * FROM Photo WHERE ANCESTOR IS :1 AND prime = TRUE ORDER BY timestamp DESC", self.user.key())

    query = db.GqlQuery("SELECT * FROM Photo WHERE ANCESTOR IS :1 AND initialized = TRUE AND prime = TRUE ORDER BY timestamp DESC", self.user.key())

Regarding the 2nd set of queries I asked about, all things being equal, is it better to do the query with or without the ANCESTOR? Intuitively, I would expect the ANCESTOR version to perform faster as it would only run on 1 machine / entity group, but is this true? In the data model, Photo is a child of User and there are 1000's more photos than users.

Otherwise, I haven't measured performance in the app yet, I'd rather rely on some official best practices for now :)

On Jul 7, 6:25 pm, Ikai Lan (Google) ika...@google.com wrote:

Ancestor queries don't add significant overhead, so I'm not going to consider that factor.

As far as other queries go, in general fewer indexes mean better performance if you are doing a zigzag join. However, if an index contains a small number of results, the query will obviously return faster because there are simply fewer results. Otherwise, you end up zigzagging across more indexes, which will probably result in slower queries. It really depends on the shape of your data. What have your observations been?

Ikai Lan
Developer Programs Engineer, Google App Engine
Blog: http://googleappengine.blogspot.com
Twitter: http://twitter.com/app_engine
Reddit: http://www.reddit.com/r/appengine

On Thu, Jul 7, 2011 at 6:21 PM, Pol i...@pol-online.net wrote:

Hi,

Assuming the app runs on an HR database and that the number of indexes is not a problem:

Say that because of the data model, these queries are equivalent (i.e. return the exact same results), which one should be used to get the best performance?

    query = db.GqlQuery("SELECT * FROM Photo WHERE ANCESTOR IS :1 AND prime = TRUE ORDER BY timestamp DESC", self.user.key())

    query = db.GqlQuery("SELECT * FROM Photo WHERE ANCESTOR IS :1 AND initialized = TRUE AND prime = TRUE ORDER BY timestamp DESC", self.user.key())

Does the answer change if somewhere else in the code, there is also this query (because of shared indexes or something)?

    query = db.GqlQuery("SELECT * FROM Photo WHERE ANCESTOR IS :1 AND initialized = TRUE ORDER BY timestamp DESC", self.user.key())

Same question, this time with these 2 other queries:

    query = db.GqlQuery("SELECT * FROM Photo WHERE event = :1 AND prime = TRUE AND hidden = FALSE ORDER BY timestamp DESC", event.key())

    query = db.GqlQuery("SELECT * FROM Photo WHERE ANCESTOR IS :1 AND event = :2 AND prime = TRUE AND hidden = FALSE ORDER BY timestamp DESC", event.key().parent(), event.key())

I understand the 2nd version is an ancestor query which should return consistent results (right?) but in this case, it's ok if the results are a bit stale.

Thanks!
--
You received this message because you are subscribed to the Google Groups Google App Engine group. To post to this group, send email to google-appengine@googlegroups.com. To unsubscribe from this group, send email to google-appengine+unsubscr...@googlegroups.com. For more options, visit this group at http://groups.google.com/group/google-appengine?hl=en.
Re: [google-appengine] Re: How to write optimal queries?
Before I try to answer this question, can you take a look at these slides? Hopefully these should clarify why things work the way they work:

http://www.slideshare.net/ikailan/introducing-the-app-engine-datastore

Ideally, these will raise new questions about how entity groups, index scans, etc. work.

Ikai Lan
Developer Programs Engineer, Google App Engine
Blog: http://googleappengine.blogspot.com
Twitter: http://twitter.com/app_engine
Reddit: http://www.reddit.com/r/appengine

On Thu, Jul 7, 2011 at 7:37 PM, Pol i...@pol-online.net wrote:

I'm a little confused by your answer and the concept of zigzag joins. In practice, assuming the result set will end up being the same, is either of these 2 queries faster / better, and why exactly?

    query = db.GqlQuery("SELECT * FROM Photo WHERE ANCESTOR IS :1 AND prime = TRUE ORDER BY timestamp DESC", self.user.key())

    query = db.GqlQuery("SELECT * FROM Photo WHERE ANCESTOR IS :1 AND initialized = TRUE AND prime = TRUE ORDER BY timestamp DESC", self.user.key())

Regarding the 2nd set of queries I asked about, all things being equal, is it better to do the query with or without the ANCESTOR? Intuitively, I would expect the ANCESTOR version to perform faster as it would only run on 1 machine / entity group, but is this true? In the data model, Photo is a child of User and there are 1000's more photos than users.

Otherwise, I haven't measured performance in the app yet, I'd rather rely on some official best practices for now :)

On Jul 7, 6:25 pm, Ikai Lan (Google) ika...@google.com wrote:

Ancestor queries don't add significant overhead, so I'm not going to consider that factor.

As far as other queries go, in general fewer indexes mean better performance if you are doing a zigzag join. However, if an index contains a small number of results, the query will obviously return faster because there are simply fewer results. Otherwise, you end up zigzagging across more indexes, which will probably result in slower queries. It really depends on the shape of your data. What have your observations been?

Ikai Lan
Developer Programs Engineer, Google App Engine
Blog: http://googleappengine.blogspot.com
Twitter: http://twitter.com/app_engine
Reddit: http://www.reddit.com/r/appengine

On Thu, Jul 7, 2011 at 6:21 PM, Pol i...@pol-online.net wrote:

Hi,

Assuming the app runs on an HR database and that the number of indexes is not a problem:

Say that because of the data model, these queries are equivalent (i.e. return the exact same results), which one should be used to get the best performance?

    query = db.GqlQuery("SELECT * FROM Photo WHERE ANCESTOR IS :1 AND prime = TRUE ORDER BY timestamp DESC", self.user.key())

    query = db.GqlQuery("SELECT * FROM Photo WHERE ANCESTOR IS :1 AND initialized = TRUE AND prime = TRUE ORDER BY timestamp DESC", self.user.key())

Does the answer change if somewhere else in the code, there is also this query (because of shared indexes or something)?

    query = db.GqlQuery("SELECT * FROM Photo WHERE ANCESTOR IS :1 AND initialized = TRUE ORDER BY timestamp DESC", self.user.key())

Same question, this time with these 2 other queries:

    query = db.GqlQuery("SELECT * FROM Photo WHERE event = :1 AND prime = TRUE AND hidden = FALSE ORDER BY timestamp DESC", event.key())

    query = db.GqlQuery("SELECT * FROM Photo WHERE ANCESTOR IS :1 AND event = :2 AND prime = TRUE AND hidden = FALSE ORDER BY timestamp DESC", event.key().parent(), event.key())

I understand the 2nd version is an ancestor query which should return consistent results (right?) but in this case, it's ok if the results are a bit stale.

Thanks!
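
For anyone else puzzled by the "zigzagging across more indexes" remark above, here is a rough, self-contained sketch of the general idea behind a zigzag merge join. It is only an illustration under simplifying assumptions, not the datastore's actual implementation: each equality filter (for example prime = TRUE or initialized = TRUE) is modelled as a plain sorted Python list of entity keys, the keys are small integers in ascending order rather than timestamps in descending order, and the function name zigzag_merge_join and the seek counter are made up for this example.

```python
from bisect import bisect_left


def zigzag_merge_join(indexes):
    """Intersect several sorted lists of unique entity keys, zigzag style.

    Each list stands in for one index scan (hypothetical model), and all
    lists are sorted in the same order. Returns (matching_keys, seeks).
    """
    if not indexes or any(len(ix) == 0 for ix in indexes):
        return [], 0

    results = []
    seeks = 0
    candidate = indexes[0][0]
    agreed = 0   # consecutive indexes found to contain the candidate
    i = 0        # position of the index being probed

    while True:
        ix = indexes[i]
        # Seek to the first entry >= candidate in this index.
        pos = bisect_left(ix, candidate)
        seeks += 1
        if pos == len(ix):
            break  # this index is exhausted, so no further matches exist
        if ix[pos] == candidate:
            agreed += 1
            if agreed >= len(indexes):
                # Every index contains the candidate: it is a result.
                results.append(candidate)
                if pos + 1 == len(ix):
                    break
                candidate = ix[pos + 1]
                agreed = 1  # the current index contains the new candidate
        else:
            # This index skipped past the candidate: jump ahead to its
            # entry and make the other indexes catch up (the "zigzag").
            candidate = ix[pos]
            agreed = 1
        i = (i + 1) % len(indexes)

    return results, seeks


if __name__ == "__main__":
    prime = [1, 3, 5, 7, 9]        # keys where prime = TRUE (made-up data)
    initialized = [3, 4, 7, 10]    # keys where initialized = TRUE (made-up data)
    matches, seeks = zigzag_merge_join([prime, initialized])
    print(matches, seeks)
```

In this toy model every extra equality filter adds another list that has to agree on each candidate, which is one way to read the remark that more indexes usually mean more work, while a very selective list lets the scan jump ahead and finish early. How that plays out for real data is something only measurement in the app can show.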