Re: [HACKERS] happy birthday Tom Lane ...
happy_birthday++; -- Fools ignore complexity. Pragmatists suffer it. Some can avoid it. Geniuses remove it. -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] binary compat
Dimitri Fontaine wrote: > Hi, > > I've been idly thinking about binary COPY and recent performance efforts > spent on it. The main problem I have with binary output is that I never > know when it'll be any useful (except very same hardware and PostgreSQL > setup)... useful meaning I get to read it back into some database. > If you want a binary cross compatible, how about something well known, standardized, simple and cross platform like XDR? http://www.faqs.org/rfcs/rfc1014.html -- Fools ignore complexity. Pragmatists suffer it. Some can avoid it. Geniuses remove it. -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] phypot - Pygmy Hippotause ?
Kevin Grittner wrote: > > The first test seems unnecessary if you have the second. > x >= 0, so x can't be zero unless y is, too. > Returning x on y == 0.0 will return 0.0 whenever x == 0.0. > > -Kevin > Wish granted. :-) -- -- Fools ignore complexity. Pragmatists suffer it. Some can avoid it. Geniuses remove it. Index: src/backend/utils/adt/geo_ops.c === RCS file: /projects/cvsroot/pgsql/src/backend/utils/adt/geo_ops.c,v retrieving revision 1.103 diff -c -r1.103 geo_ops.c *** src/backend/utils/adt/geo_ops.c 28 Jul 2009 09:47:59 - 1.103 --- src/backend/utils/adt/geo_ops.c 29 Aug 2009 02:47:14 - *** *** 32,37 --- 32,38 * Internal routines */ + static double phypot(double x, double y); static intpoint_inside(Point *p, int npts, Point *plist); static intlseg_crossing(double x, double y, double px, double py); static BOX *box_construct(double x1, double x2, double y1, double y2); *** *** 825,831 box_cn(&a, box1); box_cn(&b, box2); ! PG_RETURN_FLOAT8(HYPOT(a.x - b.x, a.y - b.y)); } --- 826,832 box_cn(&a, box1); box_cn(&b, box2); ! PG_RETURN_FLOAT8(phypot(a.x-b.x, a.y-b.y)); } *** *** 1971,1977 Point *pt1 = PG_GETARG_POINT_P(0); Point *pt2 = PG_GETARG_POINT_P(1); ! PG_RETURN_FLOAT8(HYPOT(pt1->x - pt2->x, pt1->y - pt2->y)); } double --- 1972,1978 Point *pt1 = PG_GETARG_POINT_P(0); Point *pt2 = PG_GETARG_POINT_P(1); ! PG_RETURN_FLOAT8(phypot(pt1->x - pt2->x, pt1->y - pt2->y)); } double *** *** 1979,1987 { #ifdef GEODEBUG printf("point_dt- segment (%f,%f),(%f,%f) length is %f\n", ! pt1->x, pt1->y, pt2->x, pt2->y, HYPOT(pt1->x - pt2->x, pt1->y - pt2->y)); #endif ! return HYPOT(pt1->x - pt2->x, pt1->y - pt2->y); } Datum --- 1980,1988 { #ifdef GEODEBUG printf("point_dt- segment (%f,%f),(%f,%f) length is %f\n", ! pt1->x, pt1->y, pt2->x, pt2->y, phypot(pt1->x - pt2->x, pt1->y - pt2->y)); #endif ! return phypot(pt1->x - pt2->x, pt1->y - pt2->y); } Datum *** *** 2444,2450 dist_pl_internal(Point *pt, LINE *line) { return (line->A * pt->x + line->B * pt->y + line->C) / ! HYPOT(line->A, line->B); } Datum --- 2445,2451 dist_pl_internal(Point *pt, LINE *line) { return (line->A * pt->x + line->B * pt->y + line->C) / ! phypot(line->A, line->B); } Datum *** *** 4916,4922 PointPGetDatum(point))); result->center.x = p->x; result->center.y = p->y; ! result->radius *= HYPOT(point->x, point->y); PG_RETURN_CIRCLE_P(result); } --- 4917,4923 PointPGetDatum(point))); result->center.x = p->x; result->center.y = p->y; ! result->radius *= phypot(point->x, point->y); PG_RETURN_CIRCLE_P(result); } *** *** 4936,4942 PointPGetDatum(point))); result->center.x = p->x; result->center.y = p->y; ! result->radius /= HYPOT(point->x, point->y); PG_RETURN_CIRCLE_P(result); } --- 4937,4943 PointPGetDatum(point))); result->center.x = p->x; result->center.y = p->y; ! result->radius /= phypot(point->x, point->y); PG_RETURN_CIRCLE_P(result); } *** *** 5401,5403 --- 5402,5465 return FALSE; } + + + /*- + * Determine the hypotenuse. + * + * If required, x and y are swapped to make x the larger number. The + * traditional formulae of x^2+y^2 is rearranged to factor x outside the + * sqrt. This allows computation of the hypotenuse for significantly + * larger values, and with a higher precision than otherwise normally + * possible. + * + * Only arguments of > 1.27e308 are at risk of causing overflow. Whereas + * the naive approach limits arguments to < 9.5e153. + * + * sqrt( x^2 + y^2 ) = sqrt( x^2( 1 + y^2/x^2) ) + * = x * sqrt( 1 + y^2/x^2 ) + * = x * sqrt( 1 + y/x * y/x ) + * + * It is expected that this routine will eventually be replaced with the + * C99 hypot() function. + * + * This implementation conforms to IEEE Std 1003.1 and GLIBC, in that the + * case of hypot(inf,nan) results in INF, and not NAN. To obtain Single + * UNIX Specification behaviour, swap the order of the
[HACKERS] phypot - Pygmy Hippotause ?
Another attempt at replacing the current HYPOT macro with a much better behaved function. I've added comments addressing those sections of code that tripped people up before. As well as explaining some of the design decisions. Feedback appreciated. Index: src/backend/utils/adt/geo_ops.c === RCS file: /projects/cvsroot/pgsql/src/backend/utils/adt/geo_ops.c,v retrieving revision 1.103 diff -c -r1.103 geo_ops.c *** src/backend/utils/adt/geo_ops.c 28 Jul 2009 09:47:59 - 1.103 --- src/backend/utils/adt/geo_ops.c 28 Aug 2009 08:11:00 - *** *** 32,37 --- 32,38 * Internal routines */ + static double phypot(double x, double y); static intpoint_inside(Point *p, int npts, Point *plist); static intlseg_crossing(double x, double y, double px, double py); static BOX *box_construct(double x1, double x2, double y1, double y2); *** *** 825,831 box_cn(&a, box1); box_cn(&b, box2); ! PG_RETURN_FLOAT8(HYPOT(a.x - b.x, a.y - b.y)); } --- 826,832 box_cn(&a, box1); box_cn(&b, box2); ! PG_RETURN_FLOAT8(phypot(a.x-b.x, a.y-b.y)); } *** *** 1971,1977 Point *pt1 = PG_GETARG_POINT_P(0); Point *pt2 = PG_GETARG_POINT_P(1); ! PG_RETURN_FLOAT8(HYPOT(pt1->x - pt2->x, pt1->y - pt2->y)); } double --- 1972,1978 Point *pt1 = PG_GETARG_POINT_P(0); Point *pt2 = PG_GETARG_POINT_P(1); ! PG_RETURN_FLOAT8(phypot(pt1->x - pt2->x, pt1->y - pt2->y)); } double *** *** 1979,1987 { #ifdef GEODEBUG printf("point_dt- segment (%f,%f),(%f,%f) length is %f\n", ! pt1->x, pt1->y, pt2->x, pt2->y, HYPOT(pt1->x - pt2->x, pt1->y - pt2->y)); #endif ! return HYPOT(pt1->x - pt2->x, pt1->y - pt2->y); } Datum --- 1980,1988 { #ifdef GEODEBUG printf("point_dt- segment (%f,%f),(%f,%f) length is %f\n", ! pt1->x, pt1->y, pt2->x, pt2->y, phypot(pt1->x - pt2->x, pt1->y - pt2->y)); #endif ! return phypot(pt1->x - pt2->x, pt1->y - pt2->y); } Datum *** *** 2444,2450 dist_pl_internal(Point *pt, LINE *line) { return (line->A * pt->x + line->B * pt->y + line->C) / ! HYPOT(line->A, line->B); } Datum --- 2445,2451 dist_pl_internal(Point *pt, LINE *line) { return (line->A * pt->x + line->B * pt->y + line->C) / ! phypot(line->A, line->B); } Datum *** *** 4916,4922 PointPGetDatum(point))); result->center.x = p->x; result->center.y = p->y; ! result->radius *= HYPOT(point->x, point->y); PG_RETURN_CIRCLE_P(result); } --- 4917,4923 PointPGetDatum(point))); result->center.x = p->x; result->center.y = p->y; ! result->radius *= phypot(point->x, point->y); PG_RETURN_CIRCLE_P(result); } *** *** 4936,4942 PointPGetDatum(point))); result->center.x = p->x; result->center.y = p->y; ! result->radius /= HYPOT(point->x, point->y); PG_RETURN_CIRCLE_P(result); } --- 4937,4943 PointPGetDatum(point))); result->center.x = p->x; result->center.y = p->y; ! result->radius /= phypot(point->x, point->y); PG_RETURN_CIRCLE_P(result); } *** *** 5401,5403 --- 5402,5468 return FALSE; } + + + /*- + * Determine the hypotenuse. + * + * If required, x and y are swapped to make x the larger number. The + * traditional formulae of x^2+y^2 is rearranged to factor x outside the + * sqrt. This allows computation of the hypotenuse for significantly + * larger values, and with a higher precision than otherwise normally + * possible. + * + * Only arguments of > 1.27e308 are at risk of causing overflow. Whereas + * the naive approach limits arguments to < 9.5e153. + * + * sqrt( x^2 + y^2 ) = sqrt( x^2( 1 + y^2/x^2) ) + * = x * sqrt( 1 + y^2/x^2 ) + * = x * sqrt( 1 + y/x * y/x ) + * + * It is expected that this routine will eventually be replaced with the + * C99 hypot() function. + * + * This implementation conforms to IEEE Std 1003.1 and GLIBC, in that the + * case of hypot(inf,nan) results in INF, and not NAN. To obtain Single + * UNIX Specification behaviour, swap the order of the isnan() and isinf() + * test sections. + * + * The HYPOT macro this funct
Re: [HACKERS] Patches for static check on geo_ops.c
Tom Lane wrote: > I've applied the first three of these changes, but not the last two > (the 'dist' assignments). "clang" seems to have a tin ear for style :-(. > It's failing to notice that we have several similar code blocks in > sequence in these two places, and making the last one different from the > rest would decrease code readability and modifiability. > > "Ah! The old programming via copy-and-paste trick". Maybe clang's ear for style isn't that bad after all. :-) -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
[HACKERS] Patches for static check on geo_ops.c
Grzegorz Jaskiewicz wonderful static checker coughed up 5 errors in geo_ops.c. None of them of any particular excitement or of earth shattering nature. A patch is attached below that should correct these. (The more little issue we eliminate, the more the large ones will stand out.) At line 3131 value stored into 'dist' variable is never referenced again. At line 3014 value stored into 'dist' variable is never referenced again. At line 2942 value stored into 'd' variable is never referenced again. At line 2953 value stored into 'd' variable is never referenced again. At line 2993 value stored into 'd' variable is never referenced again. ? patchfile_clang Index: src/backend/utils/adt/geo_ops.c === RCS file: /projects/cvsroot/pgsql/src/backend/utils/adt/geo_ops.c,v retrieving revision 1.103 diff -c -r1.103 geo_ops.c *** src/backend/utils/adt/geo_ops.c 28 Jul 2009 09:47:59 - 1.103 --- src/backend/utils/adt/geo_ops.c 27 Aug 2009 09:31:30 - *** *** 2939,2945 memcpy(&point, &l1->p[1], sizeof(Point)); } ! if ((d = dist_ps_internal(&l2->p[0], l1)) < dist) { result = DatumGetPointP(DirectFunctionCall2(close_ps, PointPGetDatum(&l2->p[0]), --- 2939,2945 memcpy(&point, &l1->p[1], sizeof(Point)); } ! if (dist_ps_internal(&l2->p[0], l1) < dist) { result = DatumGetPointP(DirectFunctionCall2(close_ps, PointPGetDatum(&l2->p[0]), *** *** 2950,2956 LsegPGetDatum(l2))); } ! if ((d = dist_ps_internal(&l2->p[1], l1)) < dist) { result = DatumGetPointP(DirectFunctionCall2(close_ps, PointPGetDatum(&l2->p[1]), --- 2950,2956 LsegPGetDatum(l2))); } ! if (dist_ps_internal(&l2->p[1], l1) < dist) { result = DatumGetPointP(DirectFunctionCall2(close_ps, PointPGetDatum(&l2->p[1]), *** *** 2990,2996 point.x = box->low.x; point.y = box->high.y; statlseg_construct(&lseg, &box->low, &point); ! dist = d = dist_ps_internal(pt, &lseg); statlseg_construct(&seg, &box->high, &point); if ((d = dist_ps_internal(pt, &seg)) < dist) --- 2990,2996 point.x = box->low.x; point.y = box->high.y; statlseg_construct(&lseg, &box->low, &point); ! dist = dist_ps_internal(pt, &lseg); statlseg_construct(&seg, &box->high, &point); if ((d = dist_ps_internal(pt, &seg)) < dist) *** *** 3011,3017 statlseg_construct(&seg, &box->high, &point); if ((d = dist_ps_internal(pt, &seg)) < dist) { - dist = d; memcpy(&lseg, &seg, sizeof(lseg)); } --- 3011,3016 *** *** 3128,3134 statlseg_construct(&seg, &box->high, &point); if ((d = lseg_dt(lseg, &seg)) < dist) { - dist = d; memcpy(&bseg, &seg, sizeof(bseg)); } --- 3127,3132 -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Slaying the HYPOTamus
Greg Stark wrote: > We're implementing things like box_distance and point_distance which > as it happens will already be doing earlier arithmetic on the doubles, > so whatever HYPOT() does had better be consistent with that or the > results will be just an inexplicable mishmash. > > Let's look at what the HYPOT macro in PostgreSQL does right now: #define HYPOT(A, B) sqrt((A)*(A)+(B)*(B)) And let's see how it is used in point_distance: Datum point_distance(PG_FUNCTION_ARGS) { Point *pt1 = PG_GETARG_POINT_P(0); Point *pt2 = PG_GETARG_POINT_P(1); PG_RETURN_FLOAT8(HYPOT(pt1->x - pt2->x, pt1->y - pt2->y)); } Oh. Surprise. It does not look out for NaN's, InF's, overflows, underflows or anything else. In addition, for the majority of it's input space, it fails to work correctly at all. If x and y are both 1E200 the hypotenuse should be 1.4142135e200, well within the range of the double. However by naively squaring x yields 1E400, outside the range of a double. Currently HYPOT() returns INFINITY, which is not the correct answer. I am trying to introduce the hypot() function (where required) and associated patch, in order to start correcting some of the more obvious defects with the current geometry handling. Maybe my proposed hypot() function is not perfect, but it's so far in front of current the HYPOT macro is not funny. > Incidentally, what on earth does it mean to multiply or divide a > circle by a point? > > Basically it's complex math. This comment, from my new geometry implementation, might help. /*- * Additional Operators * * The +, -, * and / operators treat IPoint's as complex numbers. * (This would indicate a requirement for a Complex type?) * * (a+bi)+(c+di) = ((a+c)+(b+d)i) * (a+bi)-(c+di) = ((a-c)+(b-d)i) * (a+bi)*(c+di) = ac + adi + bci + bdi^2 * = ac + (ad+bc)i - bd # as i^2 = -1 * = ((ac-bd)+(ad+bc)i) * (a+bi)/(c+di) = (a+bi)(c-di) / (c+di)(c-di) * = ((ac+bd) + (bc-ad)i) / (c^2+d^2) * = ((ac + bd)/(c^2+d^2) + ((bc-ad)/(c^2+d^2))i) * * translation + IPoint_IPoint_add * translation - IPoint_IPoint_sub * multiplication * IPoint_IPoint_mul * division / IPoint_IPoint_div *---*/ -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Slaying the HYPOTamus
Greg Stark wrote: > Also, the question arises what should be returned for hypot(Inf,NaN) > which your code returns Inf for. Empirically, it seems the normal > floating point behaviour is to return NaN so I think the NaN test > should be first. > > See http://www.opengroup.org/onlinepubs/95399/functions/hypot.html If /x/ or /y/ is ±Inf, +Inf shall be returned (even if one of /x/ or /y/ is NaN). If /x/ or /y/ is NaN, and the other is not ±Inf, a NaN shall be returned. Just trying to implement correct C99 behaviour here. -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Slaying the HYPOTamus
This is to go with the hypot() patch I posted the other day. As other code, such as found in adt/float.c and adt/numeric.c, simply assumes that isnan() exists, despite it being a C99 function, I have assumed the same. The below code should be placed into a file called src/port/hypot.c. Unfortunately I do not know how to drive configure and all the other associated build magics. Could some kind soul please implemented that portion. (Or shove me in the right direction) #include #include "c.h" #include "utils/builtins.h" /* * Find the hypotenuse. Firstly x and y are swapped, if required, to make * x the larger number. The traditional formulae of x^2+y^2 is rearranged * to bring x outside the sqrt. This allows computation of the hypotenuse * for much larger magnitudes than otherwise normally possible. * * sqrt( x^2 + y^2 ) = sqrt( x^2( 1 + y^2/x^2) ) * = x * sqrt( 1 + y^2/x^2 ) * = x * sqrt( 1 + y/x * y/x ) */ double hypot( double x, double y ) { double yx; if( isinf(x) || isinf(y) ) return get_float8_infinity(); if( isnan(x) || isnan(y) ) return get_float8_nan(); x = fabs(x); y = fabs(y); if (x < y) { double temp = x; x = y; y = temp; } if (x == 0.0) return 0.0; else { yx = y/x; return x*sqrt(1.0+yx*yx); } } -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Slaying the HYPOTamus
Tom Lane wrote: > Greg Stark writes: > >> If there's a performance advantage then we could add a configure test >> and define the macro to call hypot(). You said it existed before C99 >> though, how widespread was it? If it's in all the platforms we support >> it might be reasonable to just go with it. >> > > For one data point, I see hypot() in HPUX 10.20, released circa 1996. > I suspect we would want a configure test and a substitute function > anyway. Personally I wouldn't have a problem with the substitute being > the naive sqrt(x*x+y*y), particularly if it's replacing existing code > that overflows in the same places. > > regards, tom lane > > A hypot() substitute might look something like this psudo-code, this is how Python does it if the real hypot() is missing. double hypot( double dx, double dy ) { double yx; if( isinf(dx) || ifinf(dy) ) { return INFINITY; } dx = fabs(dx); dy = fabs(dy); if (dx < dy) { double temp = dx; dx = dy; dy = temp; } if (x == 0.) return 0.; else { yx = dy/dx; return dx*sqrt(1.0+yx*yx); } } As the following link shows, a lot of care could be put into getting a substitute hypot() correct. http://gforge.inria.fr/plugins/scmsvn/viewcvs.php/trunk/hypot.c?rev=5677&root=mpfr&view=markup -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
[HACKERS] Slaying the HYPOTamus
Like some ancient precursor to the modern hypot()enuse the dreaded ill-tempered HYPOT()amus lurks in the recesses of geo_ops.c and geo_decls.h. And many a line of code has been swallowed by its mighty maw. This patch replaces the HYPOT() macro with a calls to the hypot() function. The hypot() function has been part of the C standard since C99 (ie 10 years ago), and in most UNIX, IBM and GNU libraries longer than that. The function is designed not to fail where the current naive macro would result in overflow. Where available, we might expect to the hypot() function to take advantage of underlying hardware opcodes. In addition the function evaluates its arguments only once. The current macro evaluates its arguments twice. Currently HYPOT(a.x - b.x, a.y - b.y)) becomes: sqrt((a.x - b.x)*(a.x - b.x) + (a.y - b.y) * (a.y - b.y)) The patch passes all test. Patch attached below. (First attempt at using CVS. Hope it's all good) ? GNUmakefile ? config.log ? config.status ? patchfile ? src/Makefile.global ? src/backend/postgres ? src/backend/catalog/postgres.bki ? src/backend/catalog/postgres.description ? src/backend/catalog/postgres.shdescription ? src/backend/snowball/snowball_create.sql ? src/backend/utils/probes.h ? src/backend/utils/mb/conversion_procs/conversion_create.sql ? src/bin/initdb/initdb ? src/bin/pg_config/pg_config ? src/bin/pg_controldata/pg_controldata ? src/bin/pg_ctl/pg_ctl ? src/bin/pg_dump/pg_dump ? src/bin/pg_dump/pg_dumpall ? src/bin/pg_dump/pg_restore ? src/bin/pg_resetxlog/pg_resetxlog ? src/bin/psql/psql ? src/bin/scripts/clusterdb ? src/bin/scripts/createdb ? src/bin/scripts/createlang ? src/bin/scripts/createuser ? src/bin/scripts/dropdb ? src/bin/scripts/droplang ? src/bin/scripts/dropuser ? src/bin/scripts/reindexdb ? src/bin/scripts/vacuumdb ? src/include/pg_config.h ? src/include/stamp-h ? src/interfaces/ecpg/compatlib/exports.list ? src/interfaces/ecpg/compatlib/libecpg_compat.so.3.1 ? src/interfaces/ecpg/compatlib/libecpg_compat.so.3.2 ? src/interfaces/ecpg/ecpglib/exports.list ? src/interfaces/ecpg/ecpglib/libecpg.so.6.2 ? src/interfaces/ecpg/include/ecpg_config.h ? src/interfaces/ecpg/include/stamp-h ? src/interfaces/ecpg/pgtypeslib/exports.list ? src/interfaces/ecpg/pgtypeslib/libpgtypes.so.3.1 ? src/interfaces/ecpg/preproc/ecpg ? src/interfaces/libpq/exports.list ? src/interfaces/libpq/libpq.so.5.3 ? src/port/pg_config_paths.h ? src/test/regress/log ? src/test/regress/pg_regress ? src/test/regress/results ? src/test/regress/testtablespace ? src/test/regress/tmp_check ? src/test/regress/expected/constraints.out ? src/test/regress/expected/copy.out ? src/test/regress/expected/create_function_1.out ? src/test/regress/expected/create_function_2.out ? src/test/regress/expected/largeobject.out ? src/test/regress/expected/largeobject_1.out ? src/test/regress/expected/misc.out ? src/test/regress/expected/tablespace.out ? src/test/regress/sql/constraints.sql ? src/test/regress/sql/copy.sql ? src/test/regress/sql/create_function_1.sql ? src/test/regress/sql/create_function_2.sql ? src/test/regress/sql/largeobject.sql ? src/test/regress/sql/misc.sql ? src/test/regress/sql/tablespace.sql ? src/timezone/zic Index: src/backend/utils/adt/geo_ops.c === RCS file: /projects/cvsroot/pgsql/src/backend/utils/adt/geo_ops.c,v retrieving revision 1.103 diff -c -r1.103 geo_ops.c *** src/backend/utils/adt/geo_ops.c 28 Jul 2009 09:47:59 - 1.103 --- src/backend/utils/adt/geo_ops.c 23 Aug 2009 03:44:39 - *** *** 825,831 box_cn(&a, box1); box_cn(&b, box2); ! PG_RETURN_FLOAT8(HYPOT(a.x - b.x, a.y - b.y)); } --- 825,831 box_cn(&a, box1); box_cn(&b, box2); ! PG_RETURN_FLOAT8(hypot(a.x - b.x, a.y - b.y)); } *** *** 1971,1977 Point *pt1 = PG_GETARG_POINT_P(0); Point *pt2 = PG_GETARG_POINT_P(1); ! PG_RETURN_FLOAT8(HYPOT(pt1->x - pt2->x, pt1->y - pt2->y)); } double --- 1971,1977 Point *pt1 = PG_GETARG_POINT_P(0); Point *pt2 = PG_GETARG_POINT_P(1); ! PG_RETURN_FLOAT8(hypot(pt1->x - pt2->x, pt1->y - pt2->y)); } double *** *** 1979,1987 { #ifdef GEODEBUG printf("point_dt- segment (%f,%f),(%f,%f) length is %f\n", ! pt1->x, pt1->y, pt2->x, pt2->y, HYPOT(pt1->x - pt2->x, pt1->y - pt2->y)); #endif ! return HYPOT(pt1->x - pt2->x, pt1->y - pt2->y); } Datum --- 1979,1987 { #ifdef GEODEBUG printf("point_dt- segment (%f,%f),(%f,%f) length is %f\n", ! pt1->x, pt1->y, pt2->x, pt2->y, hypot(pt1->x - pt2->x, pt1->y - pt2->y)); #endif ! return hypot(pt1->x - pt2->x, pt1->y - pt2->y); } Datum *** *** 2444,2450 dist_pl_internal(Point *pt, LINE *line) { return (line->A * pt->x + line->B * pt->y + line->C) / ! HYPOT
Re: [HACKERS] Geometric Elimination
Martijn van Oosterhout wrote: > I haven't completely understood what you're trying to do > Putting in place the missing 'box op point' and 'point op box' operators. The problematic queries are at the bottom of the email. > - I don't see any definition of an operator class, just the family, > which doesn't seem to make any sense to me. > I am working on the assumption that the Box and Point class already have operator classes. Otherwise how would all the existing Box and Point operators work? From my limited understanding of the source code it's in postgres.bki round about line 2208 and 2211. (It there a better interface to this information?). > - Does it work if you replace the use of the operator with the > equivalent function call (contains)? > Good idea. Yes. It does work. As such, we can assume that the C function and the CREATE FUNCTION are OK. > - Check for differences in the explain output, that should reveal any > implicit casts that may be getting in your way. > The EXPLAIN does not show any explicit casts occurring. -- This works -- 17 seconds SELECT W.geocode, F.state, F.code, F.name FROM work as W, features as F, boundary as TB WHERE TB.feature_id = F.feature_id AND TB.boundout IS TRUE AND TB.boundbox @> box(W.geocode,W.geocode) AND contains(TB.boundary,W.geocode) AND (TB.feature_id) NOT IN ( SELECT feature_id FROM boundary as FB WHERE FB.feature_id = TB.feature_id AND FB.boundout IS FALSE AND FB.boundbox @> box(W.geocode,W.geocode) AND contains(FB.boundary,W.geocode) ) ORDER BY W.geocode[0], W.geocode[1]; -- This works -- 39 seconds SELECT W.geocode, F.state, F.code, F.name FROM work as W, features as F, boundary as TB WHERE TB.feature_id = F.feature_id AND TB.boundout IS TRUE AND contains(TB.boundbox,W.geocode) AND contains(TB.boundary,W.geocode) AND (TB.feature_id) NOT IN ( SELECT feature_id FROM boundary as FB WHERE FB.feature_id = TB.feature_id AND FB.boundout IS FALSE AND contains(FB.boundbox,W.geocode) AND contains(FB.boundary,W.geocode) ) ORDER BY W.geocode[0], W.geocode[1]; -- This fails. -- Returns empty set SELECT W.geocode, F.state, F.code, F.name FROM work as W, features as F, boundary as TB WHERE TB.feature_id = F.feature_id AND TB.boundout IS TRUE AND TB.boundbox @> W.geocode AND contains(TB.boundary,W.geocode) AND (TB.feature_id) NOT IN ( SELECT feature_id FROM boundary as FB WHERE FB.feature_id = TB.feature_id AND FB.boundout IS FALSE AND FB.boundbox @> W.geocode AND contains(FB.boundary,W.geocode) ) ORDER BY W.geocode[0], W.geocode[1]; -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
[HACKERS] Geometric Elimination
Trying to solve this problem by using a process of elimination. All works fine until the comment below is removed. ALTER OPERATOR FAMILY box_ops USING GiST ADD OPERATOR 1 << (box,point), OPERATOR 2 &< (box,point), OPERATOR 3 && (box,point), OPERATOR 4 &> (box,point), OPERATOR 5 >> (box,point), -- OPERATOR 7 @> (box,point), OPERATOR 9 &<| (box,point), OPERATOR 10 <<| (box,point), OPERATOR 11 |>> (box,point); Ah! So operator @> is wrong. DROP OPERATOR IF EXISTS @>(box,point); CREATE OPERATOR @> ( LEFTARG= box, RIGHTARG = point, PROCEDURE = contains, COMMUTATOR = <@, RESTRICT = contsel, JOIN = contjoinsel ); No, all seems fine here. Maybe the definition of the function is incorrect CREATE OR REPLACE FUNCTION contains(box,point) RETURNS boolean LANGUAGE C IMMUTABLE STRICT AS 'contains.so', 'box_point_contains'; The C function? No it seems OK as well. What am I missing? It must be completely obvious. Someone is laughing out there. Put me out of my misery please! /* * Box contains point. box @> point. */ Datum box_point_contains(PG_FUNCTION_ARGS) { BOX *box = PG_GETARG_BOX_P(0); Point *point = PG_GETARG_POINT_P(1); intisin = point->x >= box->low.x && point->x <= box->high.x && point->y >= box->low.y && point->y <= box->high.y; PG_RETURN_BOOL(isin); } -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
[HACKERS] Geometric bewilderment
Suspect that attaching large amounts of code is a breach of etiquette. So apologies in advance. Needed to write a contains(polygon,point) function that gave a level of accuracy greater than the current 'poly@>point' operator. The new function works just fine. While at it, thought it would be nice to add in all the 'box op point' and 'point op box' routines that are not currently defined. The code in contains.c implements these functions. In contains.sql you can find how these functions are defined, tided to operators, and then placed in an operator class. (Permission granted to copy any code in here you like, it's all trivial) The perl script test1 tests all the operators. It compares the output of "box op point" with "box op box(point,point)". It is called with one parameter, the name of the database to use. It creates a table called points within that database. It also creates two files in the current directory a.out and b.out. tkdiff is called to show the differences. Running this shows that all the operators work fine. The perl script test2 tests 1000 points against 1000 boxes. It is called with one parameter, the name of the database to use. It creates two tables called points and boxes within that database. It also creates two files in the current directory a.out and b.out. tkdiff is called to show the differences. Running this shows that the indexing works fine. (Explain says it uses the index) Problem: The following code works in production with 20,000,000 points against 10,000 polygons. SELECT W.geocode, F.state_code, F.area_code, F.area_name FROM work as W, boundary as B, features as F WHERE B.boundbox @> box(W.geocode,W.geocode) AND contains(B.boundary,W.geocode) AND B.boundout = 'T' AND (B.feature_id) NOT IN ( SELECT feature_id FROM boundary WHERE boundbox @> box(W.geocode,W.geocode) AND contains(boundary,W.geocode) AND boundout = 'F' ) AND B.feature_id = F.feature_id; However the following does not work. It returns an empty set. Note that all that has changed is that "box @> box(point,point)" has been changed to "box @> point". From test1 we know that the operators work, and from test2 we know the indexing works. Confused and bewildered at this point. SELECT W.geocode, F.state_code, F.area_code, F.area_name FROM work as W, boundary as B, features as F WHERE B.boundbox @> W.geocode AND contains(B.boundary,W.geocode) AND B.boundout = 'T' AND (B.feature_id) NOT IN ( SELECT feature_id FROM boundary WHERE boundbox @> W.geocode AND contains(boundary,W.geocode) AND boundout = 'F' ) AND B.feature_id = F.feature_id; /*= * Contains PostgreSQL extention * * PostgreSQL (poly@>point) operator is currently broken beyond a certain * precision. This module provides a contains(poly,point) to use as a * replacement. * * It addition it adds the currently missing box to point operators, and * conversly the point to box operators as well. * * (c) 2009 Paul Matthews. LGPL ?? .. (to be advised) *===*/ #include "postgres.h" #include "utils/geo_decls.h" #include "fmgr.h" #ifdef PG_MODULE_MAGIC PG_MODULE_MAGIC; #endif /*- * Box op Point routines * * strictly left of 1<<box_point_leftof * does not extend right of 2&<box_point_notright * overlaps 3&&box_point_contains * does not extend left of4&>box_point_notleft * strictly right of 5>>box_point_rightof * contains 7@>box_point_contains * does not extend above 9&<| box_point_notabove * strictly below 10 <<| box_point_below * strictly above 11 |>> box_point_above * does not extend below 12 |&> box_point_notbelow *---*/ PG_FUNCTION_INFO_V1(box_point_leftof); PG_FUNCTION_INFO_V1(box_point_notright); PG_FUNCTION_INFO_V1(box_point_notleft); PG_FUNCTION_INFO_V1(box_point_rightof); PG_FUNCTION_INFO_V1(box_point_contains); PG_FUNCTION_INFO_V1(box_point_notabove); PG_FUNCTION_INFO_V1(box_point_below); PG_FUNCTION_INFO_V1(box_point_above); PG_FUNCTION_INFO_V1(box_point_notbelow); /* * Box strictly left of point. box << point */ Datum box_point_leftof(PG_FUNCTION_ARGS) { BOX *box = PG_GETA
[HACKERS] Custom geometry, why slow?
The story so far ... The provide polygon@>point routine does not work correctly when the points are close to the boundary. So we implemented a custom contains(poly,point) function. In order to stop all points being checked against all polygons, a separate bounding box is maintained. So the query has sections looking like : boundbox @> box( thepoint, thepoint ) AND contains(boundary,thepoint) You will notice that each point to be checked has to be promoted to a degenerate box. Working on the assumption that there is a cost associated with this (ie pmalloc), and we will be passing 100's of millions of points though this in a single transaction, streaming this is important. At any rate it looked kludgy. The goal is provide : boundbox @> thepoint AND contains(boundary,thepoint) So the whole family of "point op box" functions where provided (except for point <@ box) which already exists. The operators have been created. And the operators added to the box_ops operator family. Samples below : CREATE OR REPLACE FUNCTION leftof(box,point) RETURNS boolean LANGUAGE C IMMUTABLE STRICT AS 'contains.so', 'box_point_leftof'; ..etc... DROP OPERATOR IF EXISTS <<(box,point); CREATE OPERATOR << ( LEFTARG= box, RIGHTARG = point, PROCEDURE = leftof, RESTRICT = positionsel, JOIN = positionjoinsel ); ...etc... ALTER OPERATOR FAMILY box_ops USING GiST ADD OPERATOR 1 << (box,point), OPERATOR 2 &< (box,point), OPERATOR 3 && (box,point), OPERATOR 4 &> (box,point), OPERATOR 5 >> (box,point), OPERATOR 7 @> (box,point), --OPERATOR 8 <@ (point,box), OPERATOR 9 &<| (box,point), OPERATOR 10 <<| (box,point), OPERATOR 11 |>> (box,point), OPERATOR 12 |&> (box,point); The problem is, according to EXPLAIN, it still wants to do a sequential scan and not use the index. Any pointers as to why? -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
[HACKERS] Geometry RESTRICT and JOIN
I'm trying to add all the "box op point" operators. The C routines are written and working as advertised. The manuals description of the RESTRICT and JOIN clauses of CREATE OPERATOR don't seem too clear. Are these samples correct, or am I totally off base here? CREATE OPERATOR << ( LEFTARG= box, RIGHTARG = point, PROCEDURE = leftof, RESTRICT = scalarltsel, -- ?? UNSURE JOIN = positionjoinsel -- ?? UNCLEAR ); CREATE OPERATOR &> ( LEFTARG= box, RIGHTARG = point, PROCEDURE = notleft, RESTRICT = scalargtsel, -- ?? UNSURE JOIN = positionjoinsel -- ?? UNCLEAR ); CREATE OPERATOR @> ( LEFTARG= box, RIGHTARG = point, PROCEDURE = contains, RESTRICT = eqsel, -- ?? UNSURE JOIN = contjoinsel -- ?? UNCLEAR ); ...etc... -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Quick pointer required re indexing geometry
Dimitri Fontaine wrote: Paul Matthews writes: Witting a box@>point function easy. Having a spot of trouble trying to figure out where and how to graft this into the GiST stuff. Could someone please point me in the general direction? You want index support for it, I suppose? Yes Without index support (but needed anyway), you implement your code in a C module then make it visible from SQL. ...cut... COMMENT ON OPERATOR @>(box, point) IS 'box contains point?'; All done but the comment part :-) Now for adding support for index lookups, you have to see documentation about OPERATOR CLASS and OPERATOR FAMILY. Hope this helps, regards, Thanks overlooked CLASS and FAMILY in my hurry. Hopefully that is where the problem is. What I am hoping to do is provide a "real" patch to make box@>point available for all. But we can get to that later. A lot of code in pgsql, but not much in the way of comments.
[HACKERS] Quick pointer required re indexing geometry
Witting a box@>point function easy. Having a spot of trouble trying to figure out where and how to graft this into the GiST stuff. Could someone please point me in the general direction? Thanks. -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Fixing geometic calculation
marcin mank wrote: > You are correct, I think, though this does not solve the division problem: > As a first goal I'm just attempting to reduce the EPSILON from 1.0E-6 down to 1.0E-015 (give or take). The current regression test suite works fine down to 1.0E-09. At 1.0E-10 errors appear, not in the geometry sections, but in the select_view test of all things. This is most likely due to postgresql now giving the more correct (hence different) answers. A real test suite is needed for this. Setting up PostGIS + MySQL + OtherCommerical for comparison purposes. The other problem is many of the basic geometric operators in postgres, such a left of, above, etc, are so incorrectly implemented, they are not even wrong. -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Fixing geometic calculation
Paul Matthews wrote: > EPSILON in postgres is 1.0E-06 > EPSILON for floats is 1.19209e-07 > EPSILON for doubles is 2.22045E-016 > Bad form to reply to my own post and all. If EPSILON for double represented 1mm, then postgres is rounding to the nearest 10,000,000 km. Since its only about 380,000 km to the moon it's a good thing does not use postgres. -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Fixing geometic calculation
Tom Lane wrote: > No, I'm worried about code that supposes that it can divide by (x - y) > after testing that FPeq(x,y) is not true. point_sl() for instance. > > We could perhaps fix those specific issues by testing the difference > explicitly instead of doing it like that. But there's still the overall > problem of error accumulation ... > > regards, tom lane > IEEE754 does not allow two number X and Y, such that X!=Y and (X-Y)==0. And since IEEE754 has been around since the 70's or 80's I think we can start relying on its existence and behavior by now. I don't see why it should be is postgres job to worry about error accumulation. And then why only worry about it in geometric calculations. Where in normal postgres code, or in perl, python, lisp, or any in any other general purpose tool, is it where you ask it to tell you 1.0+1.0 it responds with 2.0 +or- 0.0001? -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Fixing geometic calculation
Tom Lane wrote: > It's a hack but it's dealing with an extremely real problem, namely > the built-in inaccuracy of floating-point arithmetic. You can't just > close your eyes to that and hope that everything will be okay. > If the above statement was true, then the FP* macros should be extended to all numerical calculations in postgres. And I don't think anyone here would suggest that doing that, as the results, as per my previous email, would be immediately and clearly ludicrous. Yes, floating point arithmetic has built in inaccuracy. However the FP* macros produce results that are even less accurate than single point arithmetic, and less accurate than my 25 year old calculator. And if I could find my slide rule, it could probably do better. At best, the EPSILON value might have be appropriate for single precision arithmetic, but in no way is it appropriate for double precision arithmetic. Visual C++ defines EPSILON as "The difference between 1 and smallest value greater than 1". The postgres EPSILON is 10 orders of magnitude greater than the precision supported by the underlying hardware. This is clearly preposterous. EPSILON in postgres is 1.0E-06 EPSILON for floats is 1.19209e-07 EPSILON for doubles is 2.22045E-016 -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
[HACKERS] Fixing geometic calculation
Let us consider the ordering of real numbers in postgres. As you can see from the results below it has clearly returned the correct results. select( 1. = 1.0002 ); => f select( 1. < 1.0002 ); => t select( 1. > 1.0002 ); => f Imagine the situation however where postgres returned the following values to simple numerical inequalities. In such a case postgresql would be clearly defective and unfit for purpose. select( 1.00 = 1.01 ); => f select( 1.00 < 1.01 ); => f select( 1.00 > 1.01 ); => f If such a situation is unacceptable for the real number line, then in what way can it be acceptable for the real number plain. select( point(1.0,0) <> point(1.1,0) ); => f select( point(1.0,0) << point(1.1,0) ); => f select( point(1.0,0) >> point(1.1,0) ); => f select( point(1.0,0) <-> point(1.1,0) ); => 1.000655e-05 We have two points with a finite separation in the x axis. Postgres thinks they are not the same point, nor one left of the other, nor to the right. This is clearly a both a physical and logical impossibility. The cause of this is the ill conceived FP* macros. They seem represent a solution to a problem that simply does not exist. The first effect of these macros is to reduce the accuracy of all geometric comparisons from double precision, to less than single precision. The following program correctly prints the correct answer. Whereas as we have seen above, postgres falls in a heap. int main() { float f = 1.0; float g = 1.1; if( f==g ) { printf( "f=g\n" ); } if( fg ) { printf( "f>g\n" ); } return 0; } The second effect is to take operations that would of worked correctly even in single precision, and to cause them to produce nonsensical result. For example points that can be both inside and outside a polygon at the same time. Simple analysis of the postgres source code shows that the only places where the FPzero, FPeq, FPne, FPlt, FPle FPgt and FPge macros are defined and used are in the src/backend/utils/adt/geo_ops.c and src/include/utils/geo_decls.h files. What is the justification for these macros? Why do they only affect geometric calculations, and not all numeric calculations? Why should these macro's not be abandoned? Does anyone any any objections to me: 1) removing these macros, or at least disabling EPSILON by default. 2) adding in the obviously missing operators (ie: box @> point) -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers