[HACKERS] Can get GiST RECHECK clause to work

2004-06-13 Thread Mark Cave-Ayland
Hi everyone,

I'm trying to mark a GiST index as lossy using the RECHECK operator as
part of some work on PostGIS, but what happens is that the original
operator function is never reapplied to the results of the index scan.
The operator class and operator definitions looks like this:

CREATE OPERATOR  (
   LEFTARG = GEOMETRY, RIGHTARG = GEOMETRY, PROCEDURE =
geometry_overlap,
   COMMUTATOR = '',
   RESTRICT = postgis_gist_sel, JOIN = positionjoinsel
);

and:

CREATE OPERATOR CLASS gist_geometry_ops
DEFAULT FOR TYPE geometry USING gist AS
OPERATOR1 RECHECK,
OPERATOR2 RECHECK,
OPERATOR3 RECHECK,
OPERATOR4 RECHECK,
OPERATOR5 RECHECK,
OPERATOR6   ~=  RECHECK,
OPERATOR7   ~   RECHECK,
OPERATOR8   @   RECHECK,
FUNCTION1   ggeometry_consistent (internal,
geometry, int4),
FUNCTION2   gbox_union (bytea, internal),
FUNCTION3   ggeometry_compress (internal),
FUNCTION4   rtree_decompress (internal),
FUNCTION5   gbox_penalty (internal, internal,
internal),
FUNCTION6   gbox_picksplit (internal, internal),
FUNCTION7   gbox_same (box, box, internal);


What I'm expecting is that since RECHECK is specified, PostgreSQL will
identify the index entries using the  operator and then call
geometry_overlap() function with the full tuples from the heap. However,
it seems geometry_overlap() is never called from an index scan made
using the  operator :( Can anyone point out where I'm going wrong?


Many thanks,

Mark.

---

Mark Cave-Ayland
Webbased Ltd.
Tamar Science Park
Derriford
Plymouth
PL6 8BX
England

Tel: +44 (0)1752 764445
Fax: +44 (0)1752 764446


This email and any attachments are confidential to the intended
recipient and may also be privileged. If you are not the intended
recipient please delete it from your system and notify the sender. You
should not copy it or use it for any purpose nor disclose or distribute
its contents to any other person.



---(end of broadcast)---
TIP 6: Have you searched our list archives?

   http://archives.postgresql.org


Re: [HACKERS] Can get GiST RECHECK clause to work

2004-06-13 Thread Tom Lane
Mark Cave-Ayland [EMAIL PROTECTED] writes:
 I'm trying to mark a GiST index as lossy using the RECHECK operator as
 part of some work on PostGIS, but what happens is that the original
 operator function is never reapplied to the results of the index scan.

You sure?  I'm pretty sure that a number of the contrib gist index
opclasses would fail their regression tests if this were broken.

As of 7.5 you cannot see the reapplication in the generated plan's
filter condition; perhaps that got you confused?

2004-01-05 23:31  tgl

* src/: backend/executor/nodeIndexscan.c,
backend/nodes/copyfuncs.c, backend/nodes/outfuncs.c,
backend/optimizer/path/costsize.c,
backend/optimizer/plan/createplan.c,
backend/optimizer/plan/setrefs.c, include/nodes/execnodes.h,
include/nodes/plannodes.h: Instead of rechecking lossy index
operators by putting them into the regular qpqual ('filter
condition'), add special-purpose code to nodeIndexscan.c to recheck
them.  This ends being almost no net addition of code, because the
removal of planner code balances out the extra executor code, but
it is significantly more efficient when a lossy operator is
involved in an OR indexscan.  The old implementation had to recheck
the entire indexqual in such cases.


regards, tom lane

---(end of broadcast)---
TIP 9: the planner will ignore your desire to choose an index scan if your
  joining column's datatypes do not match


Re: [HACKERS] Can get GiST RECHECK clause to work

2004-06-13 Thread Mark Cave-Ayland
Hi Tom,

As far as I can tell this is the case. What I've done to test this is to
put an elog(NOTICE, .) in geometry_overlap() so I can tell when
it's being called and this is the result I get:


shapefile=# select * from tgr1 where the_geom 
GeometryFromText('BOX3D(1000 10
0, 2000 2000)'::box3d, -2);
NOTICE:  postgis_gist_sel called
NOTICE:   search_box does not overlaps histogram, returning 0
NOTICE:   returning computed value: 0.00
NOTICE:  IN GEOMETRY OVERLAP
ERROR:  Operation on two GEOMETRIES with different SRIDs

shapefile=# create index tgr1_idx on tgr1 using gist (the_geom
gist_geometry_ops
);
CREATE INDEX
shapefile=# select * from tgr1 where the_geom 
GeometryFromText('BOX3D(1000 10
0, 2000 2000)'::box3d, -2);
NOTICE:  postgis_gist_sel called
NOTICE:   search_box does not overlaps histogram, returning 0
NOTICE:   returning computed value: 0.00
 gid | tlid | fnode | tnode | length | fedirp | fename | fetype | fedirs
| cfcc
| fraddl | toaddl | fraddr | toaddr | zipl | zipr | census1 | census2 |
cfcc1 |
cfcc2 | source | the_geom
-+--+---+---+++++---
-+--
+++++--+--+-+-+-
--+-
--++--
(0 rows)

shapefile=# explain analyze select * from tgr1 where the_geom 
GeometryFromTex
t('BOX3D(1000 100, 2000 2000)'::box3d, -2);
NOTICE:  postgis_gist_sel called
NOTICE:   search_box does not overlaps histogram, returning 0
NOTICE:   returning computed value: 0.00

QUERY PLAN



---
 Index Scan using tgr1_idx on tgr1  (cost=0.00..6.01 rows=1 width=327)
(actual t
ime=30.000..30.000 rows=0 loops=1)
   Index Cond: (the_geom  'SRID=-2;BOX3D(1000 100 0,2000 2000
0)'::geometry)
 Total runtime: 30.000 ms
(3 rows)

shapefile=# select * from pg_amop where amopclaid=(SELECT oid FROM
pg_opclass W
ERE opcname = 'gist_geometry_ops');
 amopclaid | amopsubtype | amopstrategy | amopreqcheck | amopopr
---+-+--+--+-
 17456 |   0 |1 | t|   17410
 17456 |   0 |2 | t|   17412
 17456 |   0 |3 | t|   17413
 17456 |   0 |4 | t|   17411
 17456 |   0 |5 | t|   17409
 17456 |   0 |6 | t|   17414
 17456 |   0 |7 | t|   17415
 17456 |   0 |8 | t|   17416
(8 rows)


So before the index is created, the geometry_overlap() function is
called, but whenever an index scan is used then it's never called?

I've had a look at contrib/rtree_gist but it doesn't make much sense;
from what I can see the RECHECK clause is specified for a couple of
operators in the operator class but there is no operator defined in the
SQL file - so I'm guessing that in this case the RECHECK won't do
anything anyway? I'm wondering if I'm missing some sort of mapping
between  used for CREATE OPERATOR and the  listed in CREATE OPERATOR
CLASS?

This is happening with current CVS as of earlier today, however it looks
as if it doesn't work in 7.4 either (see
http://postgis.refractions.net/pipermail/postgis-users/2004-June/004973.
html where I was trying to get the person in question to alter the
catalogues manually to enforce the RECHECK which didn't solve the
problem for him either).


Many thanks,

Mark.


---

Mark Cave-Ayland
Webbased Ltd.
Tamar Science Park
Derriford
Plymouth
PL6 8BX
England

Tel: +44 (0)1752 764445
Fax: +44 (0)1752 764446


This email and any attachments are confidential to the intended
recipient and may also be privileged. If you are not the intended
recipient please delete it from your system and notify the sender. You
should not copy it or use it for any purpose nor disclose or distribute
its contents to any other person.

 -Original Message-
 From: Tom Lane [mailto:[EMAIL PROTECTED] 
 Sent: 13 June 2004 23:09
 To: Mark Cave-Ayland
 Cc: [EMAIL PROTECTED]
 Subject: Re: [HACKERS] Can get GiST RECHECK clause to work
 
 
 Mark Cave-Ayland [EMAIL PROTECTED] writes:
  I'm trying to mark a GiST index as lossy using the RECHECK 
 operator as 
  part of some work on PostGIS, but what happens is that the original 
  operator function is never reapplied to the results of the 
 index scan.
 
 You sure?  I'm pretty sure that a number of the contrib gist 
 index opclasses would fail their regression tests if this were broken.
 
 As of 7.5 you cannot see the reapplication in the generated 
 plan's filter condition; perhaps that got you confused?
 
 2004-01-05 23:31  tgl
 
   * src/: backend/executor/nodeIndexscan.c,
   backend/nodes/copyfuncs.c, backend/nodes/outfuncs.c,
   backend/optimizer/path

Re: [HACKERS] Can get GiST RECHECK clause to work

2004-06-13 Thread Tom Lane
Mark Cave-Ayland [EMAIL PROTECTED] writes:
 As far as I can tell this is the case. What I've done to test this is to
 put an elog(NOTICE, .) in geometry_overlap()

Well, I can easily prove that CVS tip does call the operator function
and honor its result.

regression=# create table foo (f1 float8 unique);
NOTICE:  CREATE TABLE / UNIQUE will create implicit index foo_f1_key for table foo
CREATE TABLE
regression=# insert into foo values(1);
INSERT 480998 1
regression=# insert into foo values(2);
INSERT 480999 1
regression=# select * from foo where f1 = 1;
 f1

  1
(1 row)

With gdb, I set a breakpoint at float8eq, and determine that it is
called exactly once (during _bt_checkkeys' scan setup) in this query.
Next, after some fooling about to determine which row in pg_amop
describes float8eq:

regression=# update pg_amop set amopreqcheck = true
regression-#  where amopclaid = 1972 and amopsubtype = 0 and  amopstrategy = 3;
UPDATE 1

Now the select calls float8eq twice, once from _bt_checkkeys and once
from IndexNext.  Moreover I can force a zero result from float8eq in
the second call, and if I do then no rows are returned.

My guess is that your problem occurs because the index is not returning
the row in the first place, and thus there is nothing to recheck.  This
would point to a bug somewhere in your GIST support functions.

regards, tom lane

---(end of broadcast)---
TIP 3: if posting/reading through Usenet, please send an appropriate
  subscribe-nomail command to [EMAIL PROTECTED] so that your
  message can get through to the mailing list cleanly