bziobrowski commented on code in PR #14628:
URL: https://github.com/apache/pinot/pull/14628#discussion_r1879603841


##########
pinot-core/src/main/java/org/apache/pinot/core/operator/filter/H3IndexFilterOperator.java:
##########
@@ -78,7 +78,15 @@ public H3IndexFilterOperator(IndexSegment segment, 
QueryContext queryContext, Pr
     assert _h3IndexReader != null;
     int resolution = 
_h3IndexReader.getH3IndexResolution().getLowestResolution();
     _h3Id = H3Utils.H3_CORE.geoToH3(coordinate.y, coordinate.x, resolution);
-    _edgeLength = H3Utils.H3_CORE.edgeLength(resolution, LengthUnit.m);
+
+    // choose current hexagon's longest edge instead of average for the 
resolution;

Review Comment:
   Calculating precise edge length for each ring would most likely defeat any 
potential performance gains coming from using H3. That's because, considering 
that operator doesn't apply when number of rings exceeds 100, we could end up 
calculating distances for up to ~30k hexagons even if we don't have any points 
belonging to them.  
   For finger-grained precisions (7+) using current edge length should be fine 
because edge length shouldn't change much within the boundary of 100 rings. 
   For coarse-grained precisions, if center is placed near icosahedron edge and 
distance is large it could end up producing more results because hexagons far 
from it would be visibly bigger (max ratio is about 1.4) (and we claim them 
even though we shouldn't ). 
   
   In general - I don't think this application of H3 makes too much sense.
   It'd be better to expose library functions such as geoToH3, h3Distance, etc. 
and let users express geo queries in terms of hexagon distances, without having 
to calculate spherical distances, e.g. 
   
   ```sql
   SELECT COUNT(*) 
   FROM testTable 
   WHERE  geoToH3(indexed_point) = geoToH3(ST_Point(-122, 37.5, 1)) 
   ```
   or 
   ```sql
   SELECT COUNT(*) 
   FROM testTable 
   WHERE  h3Distance( geoToH3(indexed_point),geoToH3(ST_Point(-122, 37.5, 1)) ) 
 < 10
   ```
   
   Note: H3 would also need better documentation to let user easily choose 
proper resolution that would speed up query (given some range of values for 
distance) and not revert to regular distance check due to hitting ring limit.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to