bziobrowski commented on code in PR #14628:
URL: https://github.com/apache/pinot/pull/14628#discussion_r1879603841
##########
pinot-core/src/main/java/org/apache/pinot/core/operator/filter/H3IndexFilterOperator.java:
##########
@@ -78,7 +78,15 @@ public H3IndexFilterOperator(IndexSegment segment,
QueryContext queryContext, Pr
assert _h3IndexReader != null;
int resolution =
_h3IndexReader.getH3IndexResolution().getLowestResolution();
_h3Id = H3Utils.H3_CORE.geoToH3(coordinate.y, coordinate.x, resolution);
- _edgeLength = H3Utils.H3_CORE.edgeLength(resolution, LengthUnit.m);
+
+ // choose current hexagon's longest edge instead of average for the
resolution;
Review Comment:
Calculating precise edge length for each ring would most likely defeat any
potential performance gains coming from using H3. That's because, considering
that operator doesn't apply when number of rings exceeds 100, we could end up
calculating distances for up to ~30k hexagons even if we don't have any points
belonging to them.
For finger-grained precisions (7+) using current edge length should be fine
because edge length shouldn't change much within the boundary of 100 rings.
For coarse-grained precisions, if center is placed near icosahedron edge and
distance is large it could end up producing more results because hexagons far
from it would be visibly bigger (max ratio is about 1.4) (and we claim them
even though we shouldn't ).
In general - I don't think this application of H3 makes too much sense.
It'd be better to expose library functions such as geoToH3, h3Distance, etc.
and let users express geo queries in terms of hexagon distances, without having
to calculate spherical distances, e.g.
```sql
SELECT COUNT(*)
FROM testTable
WHERE geoToH3(indexed_point) = geoToH3(ST_Point(-122, 37.5, 1))
```
or
```sql
SELECT COUNT(*)
FROM testTable
WHERE h3Distance( geoToH3(indexed_point),geoToH3(ST_Point(-122, 37.5, 1)) )
< 10
```
Note: H3 would also need better documentation to let user easily choose
proper resolution that would speed up query (given some range of values for
distance) and not revert to regular distance check due to hitting ring limit.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]