Here comes part 2 of 2 of the review.

inet-gist
---------

In general the code looks good and I think your approach makes sense. Not an expert on GiST though so I would like a second opinion on this. Maybe there is some clever trick which would make the index more efficient, but the design I see is simple and logical. Like this much more than the incorrect GiST index for inet in btree_gist.

The only thing is that I think your index would do poorly on the case where all values share a prefix before the netmask but have different values after the netmask, since gist_union and gist_penalty does not care about the bits after the netmask. Am I correct? Have you thought anything about this?

To create such data:

CREATE TABLE a (ip inet);
INSERT INTO a SELECT '127.0.0.0/16'::inet + s FROM generate_series(1, pow(2, 16)::int) s;
CREATE INDEX aidx ON a USING gist (ip);

Saw also some minor things too.

Typo in comment, weird sentence:

 * The main question to calculate the union is that they have how many
 * bits in common. [...]

I do not like how you called the result key i inet_union_gist() "tmp". Something like "result" or "result_ip" would be better.

Typo in comment, "reverse" should be "inverse":

 * Penalty is reverse of the common IP bits of the two addresses.

Typo in comment, missing "be":

 * Values bigger than 1 are used when the common IP bits cannot
 * calculated.

Language can be improved in this comment. Both ways to split are by the union of the keys, it is more of a split by address prefix.

 * The second and the important way is to split by the union of the keys.
 * Union of the keys calculated same way with the inet_gist_union function.
 * The first and the last biggest subnets created from the calculated
 * union. In this case addresses contained by the first subnet will be put
 * to the left bucket, address contained by the last subnet will be put to
 * the right bucket.

Could you not just use memcmp in inet_gist_same() instead of bitncmp() since it only cares about equality?

There is an extra empty line at the end of network_gist.c

inet-selfuncs
-------------

Here I have to honestly admit I am not sure if I can tell if your selectivity function is correct. Your explanation sounds convincing and the general idea of looking at the histogram is correct. I am just have no idea if the details are right.

DEFAULT_NETWORK_OVERLAP_SELECTIVITY should be renamed DEFAULT_NETWORK_OVERLAP_SEL and moved to the .c file.

Typo in comment, "constrant" -> "constant".

There is an extra empty line at the end of network_selfuncs.c

--
Andreas Karlsson


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Reply via email to