Re: [HACKERS] GiST penalty functions [PoC]

2016-10-02 Thread Andrew Borodin
> This thread has basically died, so marking as returned with feedback. Well, not exactly died, but it's kind of in blind lead. I'll summarize a bit for future references: 1. cube extension for indexing lowered dimensionality data (2d in 3d) is broken. Queries effectively will degrade to full

Re: [HACKERS] GiST penalty functions [PoC]

2016-10-02 Thread Michael Paquier
On Thu, Sep 22, 2016 at 3:45 AM, Andrew Borodin wrote: > [blah] > > Practically, this algorithm cannot be implemented in current GiST API > (only if we provide non-penalty-based choose subtree function, > optional for GiST extension), but it certainly has scientific value.

Re: [HACKERS] GiST penalty functions [PoC]

2016-09-21 Thread Andrew Borodin
Hi Heikki! > Got a link for a description of the RR*-tree? Would be good to reference it > in the patch comments, too. Well, as for now, the best way to reach the paper is DOI 10.1145/1559845.1559929 http://sci-hub.cc/ Authors in conversations clearly stated that they endorse (not sure in correct

Re: [HACKERS] GiST penalty functions [PoC]

2016-09-21 Thread Heikki Linnakangas
On 09/14/2016 07:57 PM, Andrew Borodin wrote: Here is v6 of cube penalty patch. There was a warning about unused edge function under systems without __STDC_IEC_559__ defined, this patch fixes it. Thanks! Reading through this thread again, I think we got a bit side-tracked with this float

Re: [HACKERS] GiST penalty functions [PoC]

2016-09-14 Thread Andrew Borodin
Here is v6 of cube penalty patch. There was a warning about unused edge function under systems without __STDC_IEC_559__ defined, this patch fixes it. Regards, Andrey Borodin. diff --git a/contrib/cube/cube.c b/contrib/cube/cube.c index 3feddef..ad868ac 100644 --- a/contrib/cube/cube.c +++

Re: [HACKERS] GiST penalty functions [PoC]

2016-09-13 Thread Andrew Borodin
Hi hackers! Here is gist cube penaly patch v5. Changes: 1. union version of pack_float() is used, without warings and with minimum asm commands emitted 2. test for __STDC_IEC_559__ is added, this is a standard way of C99 to determine float compliance with IEEE754. ANSI-only compiler+libs, along

Re: [HACKERS] GiST penalty functions [PoC]

2016-09-10 Thread Andrew Borodin
>Personally I wouldn't be very happy about an IEEE754 assumption. Ok, so let's avoid autoconf man. There is no real explanations of the ground for this assumption, just a reference to paper of David Goldberg (and there is no metion about IEEE754 is absoulte everywhere). BTW, may be we can ask

Re: [HACKERS] GiST penalty functions [PoC]

2016-09-09 Thread Greg Stark
On Thu, Sep 8, 2016 at 9:29 AM, Andrew Borodin wrote: >>autoconf check for IEEE 754 floats > Autoconf man says folowing: >>it is safe to assume IEEE-754 in most portable code these days > https://www.gnu.org/software/autoconf/manual/autoconf.html#Floating-Point-Portability

Re: [HACKERS] GiST penalty functions [PoC]

2016-09-09 Thread Михаил Бахтерев
If you are still interested in. Here are 3 versions of pack-float. The union version of pack-float should run faster. The code is simpler, the dependencies are easier. But it may be less accurate or even wrong, as for signed integers (x>>2) and (x/4) are not the same. Consider x = -1. You may

Re: [HACKERS] GiST penalty functions [PoC]

2016-09-09 Thread Михаил Бахтерев
Yes. You are right, ANSI C allows only load-time initializers. Attached ANSI compatible version leads to the same assembly. And let me suggest a bit-twiddling version as well. It gives 12 instructions, instead of 13. 12 is better, as modern x86 CPU will fetch them at most in 3 cycles, one less

Re: [HACKERS] GiST penalty functions [PoC]

2016-09-08 Thread Andrey Borodin
Thank you for your attention to details, Mikhail. pack_float_good() looks good. But I'm not sure inline strict init is allowed under ansi C. Converting to regular ancient form b.fp = v; won't change compile result, would it? Regards, Andrey Borodin. -- Sent via pgsql-hackers mailing list

Re: [HACKERS] GiST penalty functions [PoC]

2016-09-08 Thread Михаил Бахтерев
Excuse me for intervention. It depends. For instance, i run PostgreSQL on the modern MIPS CPU, which does not have sqrt support. But you are right, it is supported in most cases. And if execution speed of this very fuction is of concern, sqrtf(x) should be used instead of sqrt(x). Despite this,

Re: [HACKERS] GiST penalty functions [PoC]

2016-09-08 Thread Andrew Borodin
>BTW, would someone explain to me why using a float here will not fail >catastrophically for inputs outside the range of float? Indeed, it will. And that's one of the reason I'm considering advancing GiST API instead of advancing extensions. It won't crash, but will produce terrible index, not

Re: [HACKERS] GiST penalty functions [PoC]

2016-09-08 Thread Tom Lane
Heikki Linnakangas writes: > BTW, I would be OK with the bit-twiddling hack, if we had an autoconf > check for IEEE 754 floats, and a graceful fallback for other systems. I would still object to the version submitted, on the grounds of the compiler warnings it causes.

Re: [HACKERS] GiST penalty functions [PoC]

2016-09-08 Thread Andrew Borodin
>autoconf check for IEEE 754 floats Autoconf man says folowing: >it is safe to assume IEEE-754 in most portable code these days https://www.gnu.org/software/autoconf/manual/autoconf.html#Floating-Point-Portability > A union might be more readable Here is union version of the patch. It's slower

Re: [HACKERS] GiST penalty functions [PoC]

2016-09-08 Thread Heikki Linnakangas
On 09/08/2016 09:39 AM, Михаил Бахтерев wrote: Excuse me for intervention. It depends. For instance, i run PostgreSQL on the modern MIPS CPU, which does not have sqrt support. But you are right, it is supported in most cases. And if execution speed of this very fuction is of concern, sqrtf(x)

Re: [HACKERS] GiST penalty functions [PoC]

2016-09-07 Thread Tom Lane
Heikki Linnakangas writes: > Unfortunately, sqrt(x) isn't very cheap. You'd be surprised: sqrt is built-in on most modern hardware. On my three-year-old workstation, sqrt(x) seems to take about 2.6ns. For comparison, the pack_float version posted in

Re: [HACKERS] GiST penalty functions [PoC]

2016-09-07 Thread Heikki Linnakangas
On 09/07/2016 09:20 PM, Andrew Borodin wrote: Well, arithmetics is too fragile. This version of float packing with arithmetical packaging static float pack_float(float actualValue, int realm) { double max,min; max = FLT_MAX / ( 8 >> realm ); min = FLT_MAX / ( 16 >> realm ); if( realm == 0 ) min

Re: [HACKERS] GiST penalty functions [PoC]

2016-09-07 Thread Andrew Borodin
Well, arithmetics is too fragile. This version of float packing with arithmetical packaging static float pack_float(float actualValue, int realm) { double max,min; max = FLT_MAX / ( 8 >> realm ); min = FLT_MAX / ( 16 >> realm ); if( realm == 0 ) min = 0; /* squeeze the actual value between min

Re: [HACKERS] GiST penalty functions [PoC]

2016-09-07 Thread Andrew Borodin
Oh, sorry, made one more attemp and now I see your algorithm differently. So you propose to use oE and oV as a markers of borders for what I call Realm. But there may be very little significant bits in one of this ranges. pg_sphere and PostGiS extensions tried to use 1 as a marker, with alike

Re: [HACKERS] GiST penalty functions [PoC]

2016-09-07 Thread Heikki Linnakangas
On 09/07/2016 09:42 AM, Andrew Borodin wrote: 2. Your algorithm, among loosing some bits of precision (which is absolutely acceptable - we have like 31 of them and that’s a lot) rely on false assumption. We compare tuples on page not by MBR of inserted tuple, but by MBR of tuple on page, which

Re: [HACKERS] GiST penalty functions [PoC]

2016-09-07 Thread Andrew Borodin
Hi Heikki! Thank you for reviewing the code, it's always inspiring when a work is noted (: >Looking at the code, by "margin", you mean the sum of all "edges", i.e. of each dimension, of the cube. Indeed. As far as I remember, this is a terminology of old R*-tree paper. Now they use the word

Re: [HACKERS] GiST penalty functions [PoC]

2016-09-06 Thread Heikki Linnakangas
On 08/29/2016 09:04 AM, Andrew Borodin wrote: In this message I address only first problem. I want to construct a penalty function that will choose: 1.Entry with a zero volume and smallest margin, that can accommodate insertion tuple without extension, if one exists; 2.Entry with

Re: [HACKERS] GiST penalty functions [PoC]

2016-09-01 Thread Andrew Borodin
Here is new version of the patch. Now function pack_float is commented better. All inline keywords are removed. I haven't found any performance loss for that. Remove of static's showed 1%-7% performance loss in SELECT computation (3 test runs), so I left statics where they are. Actually, to avoid

Re: [HACKERS] GiST penalty functions [PoC]

2016-09-01 Thread Andrew Borodin
Thank you for your coments, Tom. > Modern compilers are generally able to make their own decisions about it, and > trying to put your thumb on the scales this heavily is not likely to improve > the code. Well, I have tested that combination of "static inline" affets performance of index build

Re: [HACKERS] GiST penalty functions [PoC]

2016-09-01 Thread Tom Lane
Andrew Borodin writes: > Does every supported Postgres platform conforms to IEEE 754 floating > point specification? Possibly, but it is against project policy to allow code to assume so. That pack_float function is absolutely not acceptable IMV, and would still be if it

Re: [HACKERS] GiST penalty functions [PoC]

2016-09-01 Thread Andrew Borodin
Hi hackers! Here is the new patch version. With the help of Mikhail Bakhterev I've optimized subroutines of the penalty function. Index build time now is roughly equivalent to build time before patch (test attached to thread start). Time of SELECT statement execution is reduced by 40%. Changes in

[HACKERS] GiST penalty functions [PoC]

2016-08-29 Thread Andrew Borodin
Hi, hackers! Some time ago I put a patch to commitfest that optimizes slightly GiST inserts and updates. It's effect in general is quite modest (15% perf improved), but for sorted data inserts are 3 times quicker. This effect I attribute to mitigation of deficiencies of old split algorithm.