On Thu, Oct 11, 2012 at 07:17:28AM -0400, Noah Misch wrote:
> On Tue, Oct 02, 2012 at 01:58:40PM -0400, Noah Misch wrote:
> > > > On Mon, Aug 27, 2012 at 7:43 PM, Tom Lane <t...@sss.pgh.pa.us> wrote:
> > > >> There's also the big-picture question of whether we should just get rid
> > > >> of fuzzy comparisons in the geometric types instead of trying to hack
> > > >> indexes to work around them.
> 
> > In any event, I think we should entertain a patch to make the GiST operator
> > class methods bug-compatible with corresponding operators.  Even if we 
> > decide
> > to change operator behavior in HEAD, the back branches could use it.
> 
> We have broad agreement that the specific implementation of fuzz in geometric
> comparison operators is shoddy, but nobody has voiced interest in designing a
> concrete improvement.  I propose adding a TODO item "Remove or improve
> rounding in geometric comparison operators", endorsing Alexander's design, and
> reviewing his patch.  Objections?

TODO added, and here's a review:

The patch adds no regression tests; it should add tests illustrating the
problems it fixes.

I audited the other indexable geometric operators for similar problems.  This
passage in gist_point_consistent_internal(), which handles (point,point)
operators, caught my suspicion:

                case RTSameStrategyNumber:
                        if (isLeaf)
                        {
                                result = FPeq(key->low.x, query->x)
                                        && FPeq(key->low.y, query->y);
                        }
                        else
                        {
                                result = (query->x <= key->high.x && query->x 
>= key->low.x &&
                                                  query->y <= key->high.y && 
query->y >= key->low.y);
                        }
                        break;

A leaf entry reachable from an internal entry may fall exactly on the
internal-entry bounding box.  Since we would accept a fuzzy match at the leaf
level, I think we must also accept a fuzzy match at the internal level.

> *** a/src/backend/access/gist/gistproc.c
> --- b/src/backend/access/gist/gistproc.c

> ***************
> *** 1326,1331 **** gist_point_consistent(PG_FUNCTION_ARGS)
> --- 1327,1333 ----
>       bool       *recheck = (bool *) PG_GETARG_POINTER(4);
>       bool            result;
>       StrategyNumber strategyGroup = strategy / GeoStrategyNumberOffset;
> +     BOX                *query, *key;

This function now has "query" variables within subsidiary blocks redundant
with and masking this one.  Avoid doing that.

>   
>       switch (strategyGroup)
>       {
> ***************
> *** 1337,1348 **** gist_point_consistent(PG_FUNCTION_ARGS)
>                       *recheck = false;
>                       break;
>               case BoxStrategyNumberGroup:
> !                     result = DatumGetBool(DirectFunctionCall5(
> !                                                                             
>                           gist_box_consistent,
> !                                                                             
>                           PointerGetDatum(entry),
> !                                                                             
>                           PG_GETARG_DATUM(1),
> !                                                                       
> Int16GetDatum(RTOverlapStrategyNumber),
> !                                                                             
>            0, PointerGetDatum(recheck)));
>                       break;
>               case PolygonStrategyNumberGroup:
>                       {
> --- 1339,1356 ----
>                       *recheck = false;
>                       break;
>               case BoxStrategyNumberGroup:
> !                     /* 
> !                      * This code repeats logic of on_ob which uses simple 
> comparison
> !                      * rather than FP* functions.
> !                      */
> !                     query = PG_GETARG_BOX_P(1);
> !                     key = DatumGetBoxP(entry->key);
> !                     
> !                     *recheck = false;
> !                     result = key->high.x >= query->low.x && 
> !                                      key->low.x <= query->high.x &&
> !                                      key->high.y >= query->low.y && 
> !                                      key->low.y <= query->high.y;

For leaf entries, this correctly degenerates to on_pb().  For internal
entries, it must, but does not, implement box_overlap().  (The fuzzy
box_overlap() would be fine.)  I recommend making gist_point_consistent()'s
treatment of boxes resemble its treatment of circles and polygons; that eases
verifying their correctness.  Call gist_box_consistent.  Then, for leaf
entries, call box_contain_pt().


GiST "consistent" functions often validate the strategy number, but the
circle, polygon and box branches of gist_point_consistent silently assume
strategy % GeoStrategyNumberOffset == RTContainedByStrategyNumber.  Should
they verify that assumption?  I raise this as an incidental question; it is
not new with your patch.

Thanks,
nm


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Reply via email to