Re: GiST for range types (was Re: [HACKERS] Range Types - typo + NULL string constructor)

Tom Lane Sun, 27 Nov 2011 10:44:13 -0800

Alexander Korotkov <[email protected]> writes:
> On Sat, Nov 26, 2011 at 11:11 AM, Jeff Davis <[email protected]> wrote:
>> There's been some significant change in rangetypes_gist.c, can you
>> please rebase this patch?

> OK, rebased with head.

I looked at this patch a bit.  I agree with the aspect of it that says
"let's add a flag bit so we can tell whether an upper GiST item includes
any empty ranges"; I think we really need that in order to make
contained_by searches usable.  However, I'm not so happy with the
proposed rewrite of the penalty/picksplit functions.  I see two problems
there:

1. penalty is using both hard-wired penalty values (1.0, 2.0, etc) and
values obtained from subtype_diff.  This is not good, because you have
no idea what scale the subtype differences will be expressed on.  The
hard-wired values could be greatly larger than range widths, or greatly
smaller, resulting in randomly different index behavior.

2. It's too large/complicated.  You're proposing to add nearly a
thousand lines to rangetypes_gist.c, and I do not see any reason to
think that this is so much better than what's there now as to justify
that kind of increment in the code size.  I saw your performance
results, but one set of results on an arbitrary (not-real-world) test
case doesn't prove a lot to me; and in particular it doesn't prove that
we couldn't do as well with a much smaller and simpler patch.

There are a lot of garden-variety coding problems, too, for instance here:

+                 *penalty = Max(DatumGetFloat8(FunctionCall2(
+                     subtype_diff, orig_lower.val, new_lower.val)), 0.0);

which is going to uselessly call the subtype_diff function twice most of
the time (Max() is only a macro), plus you left off the collation
argument.  But I don't think it's worth worrying about those until the
big picture is correct, which I feel it isn't yet.

Earlier in the thread you wrote:

> Questions:
> 1) I'm not sure about whether we need support of <> in GiST, because it
> always produces full index scan (except search for non-empty ranges).

I was thinking the same thing; that opclass entry seems pretty darn
useless.

I propose to pull out and apply the changes related to the
RANGE_CONTAIN_EMPTY flag, and also remove the <> opclass entry,
because I think these are uncontroversial and in the nature of
"must fix quickly".  The redesign of the penalty and picksplit
functions should be discussed separately.

                        regards, tom lane

-- 
Sent via pgsql-hackers mailing list ([email protected])
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: GiST for range types (was Re: [HACKERS] Range Types - typo + NULL string constructor)

Reply via email to