[google-appengine] Re: Query filter based on the Longitude attribute of a db.GeoPt property

2008-09-10 Thread [EMAIL PROTECTED]

Thanks for the follow-up.

I guess one could still use the geotoken (my own term) idea, but
only save one token (map-graph-square) per saved point.  Then, when
someone searches on a given point, tokenize the search point, search
for anything that shares the same token, and then go back for data
in the surrounding tokens.  In my example, that means 9 DB calls
instead of one, but that's pretty manageable...  I don't know how it
scales if you want to have greater variability in your search range,
but you could either define your tokens to cover more ground, or
increase the number of tokens searched.
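
A minimal sketch of that one-token-per-point scheme, in Python 3 for illustration. The names cell, geotoken, search_tokens, save and search are hypothetical, the dict stands in for the datastore, and the 0.1-degree cell size is taken from the grid example elsewhere in this thread. Cells touching 0 degrees and the wrap at +/-180 are not handled here.

```python
from collections import defaultdict
from decimal import Decimal, ROUND_DOWN

STEP = Decimal("0.1")  # assumed cell size: one decimal place


def cell(value):
    """Truncate a coordinate toward zero to one decimal place."""
    return Decimal(str(value)).quantize(STEP, rounding=ROUND_DOWN)


def geotoken(lng, lat):
    """Token for the single grid square that contains the point."""
    return "%sx%s" % (cell(lng), cell(lat))


def search_tokens(lng, lat):
    """The search point's own token plus its eight neighbours."""
    c_lng, c_lat = cell(lng), cell(lat)
    return ["%sx%s" % (c_lng + dx * STEP, c_lat + dy * STEP)
            for dy in (1, 0, -1) for dx in (-1, 0, 1)]


# In-memory stand-in for the datastore: token -> names of saved points.
index = defaultdict(list)


def save(name, lng, lat):
    """Store exactly one token per saved point."""
    index[geotoken(lng, lat)].append(name)


def search(lng, lat):
    """Nine lookups, one per token around the search point."""
    hits = []
    for token in search_tokens(lng, lat):
        hits.extend(index[token])
    return hits
```

Each saved point costs one list value instead of nine, and the fan-out moves to query time.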

Looking at the current quota limits, with 2.5 million datastore
requests per day, I could handle about 272,000 searches per day
(272,000 * 9 is roughly 2.45 million), which leaves room for 50,000
posts per day (one datastore operation per post), all under quota.
Once this turns into a paid service, if I have to pay for more than
that, I think I'll be happy to do so.
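
The budget can be worked through from the figures quoted above (2.5 million requests per day, nine datastore calls per search, 50,000 posts at one operation each). The constant and function names below are illustrative only.

```python
DAILY_QUOTA = 2500000    # datastore requests per day, as quoted in the post
CALLS_PER_SEARCH = 9     # one lookup per token in the 3x3 neighbourhood
POSTS_PER_DAY = 50000    # one datastore operation per post


def max_searches(quota=DAILY_QUOTA, posts=POSTS_PER_DAY,
                 calls=CALLS_PER_SEARCH):
    """Whole searches that fit once the posts have been paid for."""
    return (quota - posts) // calls


print(max_searches())  # 272222
```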

-B

On Sep 9, 2:52 pm, uprise78 [EMAIL PROTECTED] wrote:
 The exploding index issue can occur with list properties readily.

 Properties with multiple values, such as using a list value or a
 ListProperty model, store each value as a separate entry in an index.

 With a 9 element list you just created 9 index entries.  If you expand
 the box to get some more resolution such as the 13 box example, you
 will then have 13 index entries on that property alone.  You can see
 where this is going.  It gets compounded if you have even a single
 other list property in your model.  If you had another list property
 with just 2 values you would have a total of 20 index entries (18 for
 the 2 lists and then 2 for the basic indexing).
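
A rough sketch of where the 18 comes from, assuming a composite index that spans both list properties and stores one entry per combination of their values. The function name and the sample values are made up for illustration.

```python
from itertools import product


def composite_entries(*list_values):
    """One entry per combination of values across the list properties."""
    return list(product(*list_values))


grid = ["square%d" % i for i in range(9)]  # a 9 element list property
tags = ["a", "b"]                          # a second list with 2 values

print(len(composite_entries(grid, tags)))  # 18
```

The count is the product of the list lengths, so each extra list property multiplies it.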

 This causes a number of issues, including the two you pointed out:
 updating and data storage.  Updating has to hit each and every index
 entry, so updates would be a huge bottleneck if the model is updated
 often.  The other issue (and correct me if I'm wrong on this, anyone
 with more experience) is that because list values are stored as
 separate entries, Bigtable has to do multiple queries to return your
 result set.

 A bit of testing would probably be a good idea to see what the speed
 is.  If you have a small dataset, you definitely won't have an issue.

 Check out this:  
 http://code.google.com/appengine/docs/datastore/queriesandindexes.htm...
--~--~-~--~~~---~--~~
You received this message because you are subscribed to the Google Groups 
Google App Engine group.
To post to this group, send email to google-appengine@googlegroups.com
To unsubscribe from this group, send email to [EMAIL PROTECTED]
For more options, visit this group at 
http://groups.google.com/group/google-appengine?hl=en
-~--~~~~--~~--~--~---



[google-appengine] Re: Query filter based on the Longitude attribute of a db.GeoPt property

2008-09-10 Thread Nevin Freeman

Since starting this thread, I've come up with something slightly
similar to what you are suggesting here. Rather than store metadata
about both lat and lng, I store a ListProperty of lngs to differing
degrees of accuracy, and still use inequalities for latitude. So if
you have a point with Lat, Lng = 12.12345678, 40.12345678, my stored
lng-list property would look like: [40, 40.1, 40.12, 40.123]. This
way, based on zoom level (well, actually just based on the size of the
viewport, calculated when the query is run), I can query for all of
the strips of longitude within my viewport, and then still include a
minLat < lat < maxLat filter. This results in a single query up front,
but multiple queries behind the scenes, since the viewport will have
many longitude strips and thus I will be comparing the lng-list to an
array of values. I decide how to divide up the viewport based on the
difference in longitude between its two edges, and limit the number of
strips to 10 (switching to wider strips to keep the number down). Here
is the code for encoding and generating the array for
the query:

import decimal
from decimal import Decimal as d
import logging

def LngList(lng):
  """returns array of truncated lng strings"""
  lnglist = []
  lnglist.append(__truncate(lng, 0))
  lnglist.append(__truncate(lng, 1))
  lnglist.append(__truncate(lng, 2))
  lnglist.append(__truncate(lng, 3))
  return lnglist

def ViewportLngs(minLng, maxLng):
  """returns array of lngs that exist in the viewport, with the right
  degree of precision"""
  vislngs = []
  dif = abs(maxLng - minLng)
  if minLng > maxLng:
    # the viewport straddles the 180th meridian
    logging.info("minLng: %s maxLng: %s" % (minLng, maxLng))
    dif = (180 - minLng) + abs(-180 - maxLng)
    if dif < .01:
      vislngs = __makelist(3, minLng, 180) + __makelist(3, -180, maxLng)
    elif dif < .1:
      vislngs = __makelist(2, minLng, 180) + __makelist(2, -180, maxLng)
    elif dif < 1:
      vislngs = __makelist(1, minLng, 180) + __makelist(1, -180, maxLng)
    else:
      logging.error("Viewing both sides of the dateline, zoomed out.")
  elif dif < .01:
    vislngs = __makelist(3, minLng, maxLng)
  elif dif < .1:
    vislngs = __makelist(2, minLng, maxLng)
  elif dif < 1:
    vislngs = __makelist(1, minLng, maxLng)
  else:
    # zoomed out case
    logging.error("The LngList method is being used on a viewport with "
                  "a range larger than 1 degree.")
  return vislngs

def __makelist(level, minLng, maxLng):
  """actually populates the list of longs, based on conditions passed
  in, and accounts for extreme near-zero cases."""
  inc = 10**-level
  lng = __truncate(minLng, level)
  maxL = __truncate(maxLng, level)
  vislngs = [lng]
  while lng != maxL:
    # lng is a string here, so this comparison with the integer 0 never
    # matches as written; negative zero is normalized in __truncate.
    if lng == 0:
      return __makelist(level + 1, minLng, maxLng)
    lng = __truncate(float(lng)+inc, level)
    vislngs.append(lng)
  return vislngs

def __truncate(num, prec):
  """truncates the float passed in to _prec_ places after the decimal,
  and returns it as a string"""
  # Python 2 (App Engine runtime): unicode is the text type
  result = unicode(d(str(num)).quantize(d("1e%d" % (-prec)),
                                        decimal.ROUND_DOWN))
  # normalize negative zero ("-0", "-0.0", "-0.00", ...) so that points
  # just west of the prime meridian share a strip with those just east
  if result.startswith(u'-') and float(result) == 0:
    result = result[1:]
  return result
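
To see how the stored list and the viewport strips meet, here is a small self-contained Python 3 sketch. The names truncate, lng_list, strips and matches are illustrative rather than the functions above, the set intersection only stands in for comparing a list property against an array of values, and the sketch assumes a viewport that crosses neither 0 nor the dateline.

```python
from decimal import Decimal, ROUND_DOWN


def truncate(num, prec):
    """Truncate num toward zero to prec decimal places; return a string."""
    quantum = Decimal("1e%d" % -prec)
    return str(Decimal(str(num)).quantize(quantum, rounding=ROUND_DOWN))


def lng_list(lng):
    """Values stored with a point: its longitude at 0 to 3 decimal places."""
    return [truncate(lng, prec) for prec in range(4)]


def strips(min_lng, max_lng, prec):
    """Strip labels covering min_lng..max_lng at the given precision."""
    step = Decimal("1e%d" % -prec)
    cur = Decimal(truncate(min_lng, prec))
    end = Decimal(truncate(max_lng, prec))
    out = []
    while cur <= end:
        out.append(str(cur))
        cur += step  # Decimal stepping avoids float rounding drift
    return out


def matches(stored, wanted):
    """A point matches if any stored value equals any requested strip."""
    return bool(set(stored) & set(wanted))


stored = lng_list(40.12345678)       # ['40', '40.1', '40.12', '40.123']
wanted = strips(40.118, 40.142, 2)   # ['40.11', '40.12', '40.13', '40.14']
print(matches(stored, wanted))       # True
```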

This method seems to perform pretty well, though I don't yet have gobs
of data. I am able to run a query that returns 300-400 markers and load
them all up in less than a second. Unfortunately, this doesn't account
for Polylines and Polygons. We're still working on an elegant solution
for those, probably similar to your method of storing the grid-boxes
each one overlaps. Exploding indexes, here I come.

Nevin


[google-appengine] Re: Query filter based on the Longitude attribute of a db.GeoPt property

2008-09-09 Thread [EMAIL PROTECTED]

I've got a better solution than geohashing.

I break down the grid into sub-degree squares by truncating after the
first decimal point, and then when I save something that I need to
find on the map later, I save metadata with that point indicating the
surrounding grid squares.

So if I have a point with long -122.123123123 and lat 35.56565, that's
in a grid square called -122.1x35.5, and it's surrounded as follows:

[-122.2x35.6][-122.1x35.6][-122.0x35.6]
[-122.2x35.5][-122.1x35.5][-122.0x35.5]
[-122.2x35.4][-122.1x35.4][-122.0x35.4]

Those are all represented in my object as a list
(db.StringListProperty, so you have to do the right permutations to
make them into strings), and because of the way lists work, if a point
that you're searching on is in any of the grid squares associated with
a saved point, that saved point will come up.

To wit, if you have saved that above point, and someone comes in
searching on an address that corresponds to LONG -122.0857 LAT
35.6, that corresponds to grid square '-122.0x35.6', which is in
your upper right hand corner.  Thus, if you search for something like:

from google.appengine.ext import db

square = '-122.0x35.6'
points = db.GqlQuery("SELECT * FROM Locations WHERE gridList = :1", square)

...you'll find that the original point we saved above will return.
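
The nine-square list from the example can be sketched as follows, in Python 3 for illustration. The helper names cell, grid_token and grid_list are hypothetical, the dict stands in for the Locations entities, and squares touching 0 degrees or the +/-180 wrap are not handled.

```python
from decimal import Decimal, ROUND_DOWN

STEP = Decimal("0.1")  # squares are one decimal place wide


def cell(value):
    """Truncate a coordinate toward zero to one decimal place."""
    return Decimal(str(value)).quantize(STEP, rounding=ROUND_DOWN)


def grid_token(lng, lat):
    """Name of the grid square containing the point, e.g. '-122.1x35.5'."""
    return "%sx%s" % (cell(lng), cell(lat))


def grid_list(lng, lat):
    """The point's own square plus the eight around it, north-west first."""
    c_lng, c_lat = cell(lng), cell(lat)
    return ["%sx%s" % (c_lng + dx * STEP, c_lat + dy * STEP)
            for dy in (1, 0, -1) for dx in (-1, 0, 1)]


# In-memory stand-in for the saved entities and their gridList values.
saved = {"original point": grid_list(-122.123123123, 35.56565)}

# An equality filter on a list property matches if any element is equal.
square = grid_token(-122.0857, 35.6)
found = [name for name, squares in saved.items() if square in squares]
print(square, found)  # -122.0x35.6 ['original point']
```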

It's all about metadata.  Don't think in terms of inequalities and
boundary conditions, think in terms of inclusive ranges.

Best,

Ben

On Jul 17, 2:26 pm, Nevin Freeman [EMAIL PROTECTED] wrote:
 I'm trying to figure out how to incorporate a filter based on the
 longitude attribute of an Entity's db.GeoPt property. I want to do
 something like this:

 GeoDataObj_query = GeoDataObj.gql(
     "WHERE latlng.lat < :1 AND latlng.lat > :2 "
     "AND latlng.lon < :3 AND latlng.lon > :4",
     maxLat, minLat, maxLng, minLng)

 latlng is my db.GeoPt property, and you can't just do latlng.lat or
 latlng.lon. How do I access that longitude attribute, and can I do it
 within the bounds of the rule that limits inequality filters to a
 single property?

 Thanks,
 Nevin



[google-appengine] Re: Query filter based on the Longitude attribute of a db.GeoPt property

2008-09-09 Thread José Oliver Segura

Interesting, basically a different approach to geohash (I believe).

However... how does your solution deal with zoom ranges?

Best,
Jose




[google-appengine] Re: Query filter based on the Longitude attribute of a db.GeoPt property

2008-09-09 Thread [EMAIL PROTECTED]

I just came up with this last night, and have only done limited
testing.  I can guess what you mean by the exploding index problem,
but are there any particulars you can give?  Does the exploding index
problem impact mostly total data storage, time to index, or
performance on the GQL queries?  Or some combination of the above?

I definitely mean to do some testing, but my assumption--tell me if
I'm on the wrong track--is that the indexing is primarily a challenge
when adding new data, and maybe in data storage.  But if I'm right
here, then it's not too much of an issue for me, as I imagine that my
app will mostly see incremental additions of data to be indexed; I
won't be doing bulk loads which I can imagine could bog things down
significantly.

Any links to other threads on this issue would be appreciated.

-Ben  (readyassist)

On Sep 9, 8:52 am, uprise78 [EMAIL PROTECTED] wrote:
 Won't having a list property like that get you the exploding index
 problem?  Have you done any testing with a large subset of data?