[google-appengine] Re: Query filter based on the Longitude attribute of a db.GeoPt property
Thanks for the follow-up. I guess one could still use the geotoken (my own term) idea, but only save one token (map-graph-square) per saved point. Then, when someone searches on a given point, tokenize the search point, search for anything that shares the same token, and then go back for data in the surrounding tokens. In my example, that means 9 DB calls instead of one, but that's pretty manageable...

I don't know how it scales if you want to have greater variability in your search range, but you could either define your tokens to cover more ground, or increase the number of tokens searched.

Looking at the current quota limits, with 2.5 million datastore requests per day, I could handle roughly 270,000 searches per day (272,000 * 9 = about 2.45 million), which leaves room for 50,000 posts per day (one datastore operation per post), all under quota. Once this turns into a paid service, if I have to pay for more than that, I think I'll be happy to do so.

-B

On Sep 9, 2:52 pm, uprise78 [EMAIL PROTECTED] wrote:
> The exploding index issue can occur readily with list properties. Properties with multiple values, such as a list value or a ListProperty, store each value as a separate entry in an index. With a 9-element list you just created 9 index entries. If you expand the box to get some more resolution, such as the 13-box example, you will then have 13 index entries on that property alone. You can see where this is going. It gets compounded if you have even a single other list property in your model. If you had another list property with just 2 values you would have a total of 20 index entries (18 for the 2 lists and then 2 for the basic indexing).
>
> This causes a number of issues, including the two you pointed out: updating and data storage. Updating has to hit each and every index entry, so updates would be a huge bottleneck if the model is updated often.
>
> The other issue (and correct me if I'm wrong on this, anyone with more experience) is that because lists are stored as separate entities, Bigtable has to do multiple queries to return your result set. A bit of testing would probably be a good idea to see what the speed is. If you have a small dataset, you definitely won't have an issue.
>
> Check out this: http://code.google.com/appengine/docs/datastore/queriesandindexes.htm...

--~--~-~--~~~---~--~~
You received this message because you are subscribed to the Google Groups Google App Engine group.
To post to this group, send email to google-appengine@googlegroups.com
To unsubscribe from this group, send email to [EMAIL PROTECTED]
For more options, visit this group at http://groups.google.com/group/google-appengine?hl=en
-~--~~~~--~~--~--~---
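The quota estimate in the message above can be worked through with a few lines of Python. This is only a back-of-the-envelope sketch: the 2.5 million daily quota, 9 calls per search and 50,000 single-call posts come from the message, while the function name and the 5x5 (25-token) variant are assumptions added for illustration.

```python
def max_searches_per_day(daily_quota, posts_per_day, calls_per_search,
                         calls_per_post=1):
    """Return how many searches fit in the quota once posts are paid for."""
    remaining = daily_quota - posts_per_day * calls_per_post
    return max(remaining, 0) // calls_per_search


DAILY_QUOTA = 2500000    # datastore requests per day, as quoted in the thread
POSTS_PER_DAY = 50000    # one datastore operation per post

# 3x3 neighborhood: the search token plus its eight neighbors.
searches_3x3 = max_searches_per_day(DAILY_QUOTA, POSTS_PER_DAY, 9)

# Hypothetical wider search: a 5x5 neighborhood costs 25 calls per search.
searches_5x5 = max_searches_per_day(DAILY_QUOTA, POSTS_PER_DAY, 25)

print(searches_3x3, searches_5x5)
```

Widening the neighborhood makes each search more expensive quadratically, which is one argument for making the tokens themselves cover more ground instead.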
[google-appengine] Re: Query filter based on the Longitude attribute of a db.GeoPt property
Since starting this thread, I've come up with something slightly similar to what you are suggesting here. Rather than store metadata about both lat and lng, I store a ListProperty of lngs to differing degrees of accuracy, and still use inequalities for latitude. So if you have a point with Lat, Lng = 12.12345678, 40.12345678, my stored lng-list property would look like: [40, 40.1, 40.12, 40.123].

This way, based on zoom level (well, actually just based on the size of the viewport, calculated when the query is run), I can query for all of the strips of longitude within my viewport, and then still include a minLat < lat < maxLat filter. This results in a single query up front, but multiple queries behind the scenes, since the viewport will have many longitude strips and thus I will be comparing the lng-list to an array of values. I decide how to divide up the viewport based on the difference in longitude between either edge, and limit the number of strips to 10 (using the larger strips in order to decrease the number).

Here is the code for encoding and generating the array for the query:

    import decimal
    from decimal import Decimal as d
    import logging

    def LngList(lng):
        """returns array of truncated lng strings"""
        lnglist = []
        lnglist.append(__truncate(lng, 0))
        lnglist.append(__truncate(lng, 1))
        lnglist.append(__truncate(lng, 2))
        lnglist.append(__truncate(lng, 3))
        return lnglist

    def ViewportLngs(minLng, maxLng):
        """returns array of lngs that exist in the viewport, with the
        right degree of precision"""
        vislngs = []
        dif = abs(maxLng - minLng)
        if minLng > maxLng:
            # The viewport straddles the dateline: build both halves.
            logging.info("minLng: %s maxLng: %s" % (minLng, maxLng))
            dif = (180 - minLng) + abs(-180 - maxLng)
            if dif < .01:
                vislngs = __makelist(3, minLng, 180) + __makelist(3, -180, maxLng)
            elif dif < .1:
                vislngs = __makelist(2, minLng, 180) + __makelist(2, -180, maxLng)
            elif dif < 1:
                vislngs = __makelist(1, minLng, 180) + __makelist(1, -180, maxLng)
            else:
                logging.error("Viewing both sides of the dateline, zoomed out.")
        elif dif < .01:
            vislngs = __makelist(3, minLng, maxLng)
        elif dif < .1:
            vislngs = __makelist(2, minLng, maxLng)
        elif dif < 1:
            vislngs = __makelist(1, minLng, maxLng)
        else:
            # zoomed out case
            logging.error("The LngList method is being used on a viewport "
                          "with a range larger than 1 degree.")
        return vislngs

    def __makelist(level, minLng, maxLng):
        """actually populates the list of longs, based on conditions passed
        in, and accounts for extreme near-zero cases."""
        inc = 10**-level
        lng = __truncate(minLng, level)
        maxL = __truncate(maxLng, level)
        vislngs = [lng]
        while lng != maxL:
            # Near-zero guard; note lng is a string here, so this test
            # against the integer 0 does not fire as written.
            if lng == 0:
                return __makelist(level + 1, minLng, maxLng)
            lng = __truncate(float(lng) + inc, level)
            vislngs.append(lng)
        return vislngs

    def __truncate(num, prec):
        """truncates the float passed in to _prec_ places after the
        decimal, and returns it as a string"""
        result = unicode(d(str(num)).quantize(d("1e%d" % (-prec)),
                                              decimal.ROUND_DOWN))
        # Normalize negative zero so both sides of 0 share one strip name.
        if result == u'-0.0':
            result = u'0.0'
        if result == u'-0.00':
            result = u'0.00'
        return result

This method seems to perform pretty well, though I don't yet have gobs of data. I am able to run a query that returns 300-400 markers and load them all up in less than a second.

Unfortunately, this doesn't account for Polylines and Polygons. We're still working on an elegant solution for those, probably similar to your method of storing the grid-boxes each one overlaps. Exploding indexes, here I come.

Nevin

On Sep 10, 10:11 am, [EMAIL PROTECTED] wrote:
> Thanks for the follow-up. I guess one could still use the geotoken (my own term) idea, but only save one token (map-graph-square) per saved point. [...]
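For anyone wanting to experiment with the longitude-strip idea outside App Engine, here is a small sketch of it. It is not Nevin's code: the function names are made up for illustration, it ignores the dateline case, and it steps through the strips with Decimal arithmetic (an assumption of this sketch) so that the strip names do not depend on how a float sum happens to be rendered as a string.

```python
from decimal import Decimal, ROUND_DOWN


def truncate(num, prec):
    """Truncate num toward zero to prec decimal places, as a Decimal."""
    quantum = Decimal("1e%d" % -prec)
    result = Decimal(str(num)).quantize(quantum, rounding=ROUND_DOWN)
    # Normalize negative zero so "-0.0" and "0.0" name the same strip.
    return result.copy_abs() if result.is_zero() else result


def lng_list(lng):
    """Strings to store with a point: its longitude at 0 to 3 decimals."""
    return [str(truncate(lng, prec)) for prec in range(4)]


def pick_level(min_lng, max_lng):
    """Choose the strip precision from the viewport width (None if too wide)."""
    dif = abs(Decimal(str(max_lng)) - Decimal(str(min_lng)))
    if dif < Decimal("0.01"):
        return 3
    if dif < Decimal("0.1"):
        return 2
    if dif < 1:
        return 1
    return None


def viewport_strips(min_lng, max_lng):
    """Names of every longitude strip touching the viewport."""
    level = pick_level(min_lng, max_lng)
    if level is None:
        return []
    step = Decimal("1e%d" % -level)
    current = truncate(min_lng, level)
    last = truncate(max_lng, level)
    strips = []
    while current <= last:
        strips.append(str(current))
        current += step
    return strips


stored = lng_list(40.12345678)
strips = viewport_strips(40.10, 40.14)
print(stored)
print(strips)
print(any(s in stored for s in strips))
```

A stored point matches when any of its strings equals one of the viewport's strip names, which is the equality-on-a-list comparison described in the message; the latitude inequality would still be applied on top of that.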
[google-appengine] Re: Query filter based on the Longitude attribute of a db.GeoPt property
I've got a better solution than geohashing. I break down the grid into sub-degree squares by truncating after the first decimal place, and then when I save something that I need to find on the map later, I save metadata with that point indicating the surrounding grid squares. So if I have a point with long -122.123123123 and lat 35.56565, that's in a grid square called -122.1x35.5, and it's surrounded as follows:

    [-122.2x35.6][-122.1x35.6][-122.0x35.6]
    [-122.2x35.5][-122.1x35.5][-122.0x35.5]
    [-122.2x35.4][-122.1x35.4][-122.0x35.4]

Those are all represented in my object as a list (db.StringListProperty, so you have to do the right permutations to make them into strings), and because of the way lists work, if a point that you're searching on is in any of the grid squares associated with a saved point, that saved point will come up.

To wit, if you have saved the point above, and someone comes in searching on an address that corresponds to LONG -122.0857 LAT 35.6, that corresponds to grid square '-122.0x35.6', which is in your upper right-hand corner. Thus, if you search for something like:

    square = '-122.0x35.6'
    points = db.GqlQuery("SELECT * FROM Locations WHERE gridList = :1", square)

...you'll find that the original point we saved above will be returned.

It's all about metadata. Don't think in terms of inequalities and boundary conditions, think in terms of inclusive ranges.

Best,
Ben

On Jul 17, 2:26 pm, Nevin Freeman [EMAIL PROTECTED] wrote:
> I'm trying to figure out how to incorporate a filter based on the longitude attribute of an Entity's db.GeoPt property. I want to do something like this:
>
>     GeoDataObj_query = GeoDataObj.gql("WHERE latlng.lat < :1 AND latlng.lat > :2 AND latlng.lon < :3 AND latlng.lon > :4", maxLat, minLat, maxLng, minLng)
>
> latlng is my db.GeoPt property, and you can't just do latlng.lat or latlng.lon. How do I access that longitude attribute, and can I do it within the bounds of the rule that limits inequality filters to a single property?
>
> Thanks,
> Nevin
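As a rough illustration of the grid-square bookkeeping described above, the following sketch builds the nine-element list for a saved point and the single square for a search point. The helper names are hypothetical and nothing here touches the datastore; it assumes one-decimal squares named by truncation toward zero, as in the worked example, and it does not handle wrap-around at the dateline or the poles.

```python
from decimal import Decimal


def _cell_index(value):
    """Coordinate in tenths of a degree, truncated toward zero."""
    return int(Decimal(str(value)) * 10)


def _cell_name(index):
    """Format a tenths index as text, e.g. -1221 -> '-122.1'."""
    sign = "-" if index < 0 else ""
    whole, tenth = divmod(abs(index), 10)
    return "%s%d.%d" % (sign, whole, tenth)


def square(lng, lat):
    """Name of the grid square containing the point."""
    return "%sx%s" % (_cell_name(_cell_index(lng)),
                      _cell_name(_cell_index(lat)))


def grid_list(lng, lat):
    """The point's own square plus its eight neighbors, top row first.

    Caveat: because names are truncated toward zero, the '0.0' cell spans
    both sides of zero and is twice as wide as the others.
    """
    x, y = _cell_index(lng), _cell_index(lat)
    return ["%sx%s" % (_cell_name(x + dx), _cell_name(y + dy))
            for dy in (1, 0, -1)
            for dx in (-1, 0, 1)]


saved = grid_list(-122.123123123, 35.56565)   # stored with the saved point
wanted = square(-122.0857, 35.6)              # computed at search time
print(wanted, wanted in saved)
```

Storing `saved` in a string list property and filtering on equality with `wanted` is the lookup the message describes; the cost is nine index entries per saved point rather than nine queries per search.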
[google-appengine] Re: Query filter based on the Longitude attribute of a db.GeoPt property
On Tue, Sep 9, 2008 at 8:57 AM, [EMAIL PROTECTED] wrote:
> I've got a better solution than geohashing. I break down the grid into sub-degree squares by truncating after the first decimal point, and then when I save something that I need to find on the map later, I save metadata with that point indicating the surrounding grid squares. [...]

Interesting, basically a different approach to geohash (I believe). However... how does your solution deal with zoom ranges?

Best,
Jose
[google-appengine] Re: Query filter based on the Longitude attribute of a db.GeoPt property
I just came up with this last night, and have only done limited testing. I can guess what you mean by the exploding index problem, but are there any particulars you can give? Does the exploding index problem mostly impact total data storage, time to index, or performance of the GQL queries? Or some combination of the above?

I definitely mean to do some testing, but my assumption--tell me if I'm on the wrong track--is that the indexing is primarily a challenge when adding new data, and maybe in data storage. But if I'm right here, then it's not too much of an issue for me, as I imagine that my app will mostly see incremental additions of data to be indexed; I won't be doing bulk loads, which I can imagine could bog things down significantly.

Any links to other threads on this issue would be appreciated.

-Ben (readyassist)

On Sep 9, 8:52 am, uprise78 [EMAIL PROTECTED] wrote:
> Won't having a list property like that get you the exploding index problem? Have you done any testing with a large subset of data?
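One way to get a feel for the question is a toy model of index entries. The sketch below is not how the datastore is implemented; it only assumes that a single list property contributes one entry per value, and that an index mentioning several list properties needs one entry per combination of their values. The property names and values are made up.

```python
from itertools import product


def single_property_entries(values):
    """One index entry per value of a multi-valued property."""
    return len(values)


def combined_index_entries(*list_properties):
    """Entries for an index that mentions several list properties at once:
    one per combination of values (assumed model)."""
    return list(product(*list_properties))


grid_squares = ["square%d" % i for i in range(9)]   # 9 squares per point
tags = ["cafe", "wifi"]                             # a second, 2-value list

print(single_property_entries(grid_squares))             # entries for gridList alone
print(len(combined_index_entries(grid_squares, tags)))   # entries when combined
```

Under this model the cost falls on writes and storage: every save has to write each entry, while an equality query still reads a single index range. Going from 9 to 13 squares, or adding a third list property, multiplies the combined count rather than adding to it.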