On 10/07/2014 01:33 AM, Alvaro Herrera wrote:
Heikki Linnakangas wrote:
On 09/23/2014 10:04 PM, Alvaro Herrera wrote:
+ Open questions
+ --------------
+
+ * Same-size page ranges?
+ Current related literature seems to consider that each "index entry" in a
+ BRIN index must cover the same number of pages. There doesn't seem to be a
What is the related literature? Is there an academic paper or
something that should be cited as a reference for BRIN?
In the original "minmax-proposal" file, I had these four URLs:
: Other database systems already have similar features. Some examples:
:
: * Oracle Exadata calls this "storage indexes"
: http://richardfoote.wordpress.com/category/storage-indexes/
:
: * Netezza has "zone maps"
: http://nztips.com/2010/11/netezza-integer-join-keys/
:
: * Infobright has this automatically within their "data packs" according to a
: May 3rd, 2009 blog post
: http://www.infobright.org/index.php/organizing_data_and_more_about_rough_data_contest/
:
: * MonetDB also uses this technique, according to a published paper
: http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.108.2662
: "Cooperative Scans: Dynamic Bandwidth Sharing in a DBMS"
I gave them all a quick look and none of them touches the approach in
detail; in fact other than the Oracle Exadata one, they are all talking
about something else and mention the "minmax" stuff only in passing. I
don't think any of them is worth citing.
I think the "current related literature" phrase should be removed, if
there isn't in fact any literature on this. If there's any literature
worth referencing, we should add a proper citation.
I added a USE_ASSERT_CHECKING-only block in brininsert that runs the union
support proc and compares the output with the one from regular addValue.
I haven't tested this too much yet.
Ok, that's better than nothing. I wonder if it's too strict, though. It
uses brin_tuple_equal(), which does a memcmp() on the tuples. That will
trip for any non-meaningful differences, like the scale in a numeric.
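To illustrate the concern about memcmp(), here is a minimal standalone sketch. ToyDecimal and the toy_* functions are hypothetical and are not PostgreSQL's numeric or brin_tuple_equal(); they only show how two representations of the same value (1.0 and 1.00) can differ byte for byte while comparing equal semantically.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

/*
 * Hypothetical toy decimal: value = digits / 10^scale.
 * Two int32_t fields, so the struct has no padding bytes for memcmp() to see.
 */
typedef struct ToyDecimal
{
    int32_t digits;
    int32_t scale;
} ToyDecimal;

/* Byte-wise comparison, analogous in spirit to a memcmp() on whole tuples. */
bool
toy_bytewise_equal(const ToyDecimal *a, const ToyDecimal *b)
{
    return memcmp(a, b, sizeof(ToyDecimal)) == 0;
}

/* 10^n for small non-negative n. */
int64_t
toy_pow10(int32_t n)
{
    int64_t result = 1;

    while (n-- > 0)
        result *= 10;
    return result;
}

/* Semantic comparison: bring both values to a common scale first. */
bool
toy_semantic_equal(const ToyDecimal *a, const ToyDecimal *b)
{
    int32_t common;
    int64_t da;
    int64_t db;

    /* Keep the rescaled values comfortably inside int64_t. */
    assert(a->scale >= 0 && a->scale <= 9);
    assert(b->scale >= 0 && b->scale <= 9);

    common = (a->scale > b->scale) ? a->scale : b->scale;
    da = (int64_t) a->digits * toy_pow10(common - a->scale);
    db = (int64_t) b->digits * toy_pow10(common - b->scale);
    return da == db;
}
```

With a = {10, 1} (1.0) and b = {100, 2} (1.00), the byte-wise check reports a difference while the semantic check reports equality. A cross-check that wants to ignore such differences would presumably have to compare through the datatype's own equality operator rather than raw bytes.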
* clarify the memory context stuff of support functions that we also
discussed earlier
I re-checked this stuff. Turns out that the support functions don't
palloc/pfree memory too much, except to update the stuff stored in
BrinValues, by using datumCopy(). This memory is only freed when we
need to update a previous Datum. There's no way for the brin.c code to
know when the Datum is going to be released by the support proc, and
thus no way for a temp context to be used.
The memory context experiments I alluded to earlier are related to
pallocs done in brininsert / bringetbitmap themselves, not in the
opclass-provided support procs.
At the very least, it needs to be documented.
All in all, I don't think there's much
room for improvement, other than perhaps doing so in brininsert/
bringetbitmap. Don't really care too much about this either way.
Doing it in brininsert/bringetbitmap seems like the right approach.
GiST, GIN, and SP-GiST all use a temporary memory context like that.
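As a rough sketch of the pattern, the following standalone code uses a toy bump allocator as a stand-in for a memory context. In the backend this would be built from AllocSetContextCreate(), MemoryContextSwitchTo() and MemoryContextReset(); ToyContext and every toy_* name below are hypothetical and exist only to show the shape: allocate scratch memory per call, then release all of it at once with a reset.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for a memory context: one fixed-size arena. */
typedef struct ToyContext
{
    char   *base;   /* start of the arena */
    size_t  size;   /* total bytes available */
    size_t  used;   /* bytes handed out since the last reset */
} ToyContext;

ToyContext *
toy_context_create(size_t size)
{
    ToyContext *cxt = malloc(sizeof(ToyContext));

    if (cxt == NULL)
        return NULL;
    cxt->base = malloc(size);
    if (cxt->base == NULL)
    {
        free(cxt);
        return NULL;
    }
    cxt->size = size;
    cxt->used = 0;
    return cxt;
}

/* Bump allocation; individual chunks are never freed one by one. */
void *
toy_alloc(ToyContext *cxt, size_t n)
{
    size_t  aligned = (n + 15) & ~(size_t) 15;
    void   *p;

    if (aligned < n || aligned > cxt->size - cxt->used)
        return NULL;            /* overflow or arena exhausted */
    p = cxt->base + cxt->used;
    cxt->used += aligned;
    return p;
}

/* Releases everything allocated in the context in one step. */
void
toy_context_reset(ToyContext *cxt)
{
    cxt->used = 0;
}

void
toy_context_delete(ToyContext *cxt)
{
    free(cxt->base);
    free(cxt);
}

/*
 * Stand-in for one insert call: allocate scratch space in the temporary
 * context, do the work, then reset so nothing accumulates across calls.
 * Returns the peak number of bytes used during the call.
 */
size_t
toy_insert_one(ToyContext *tmp, int value)
{
    int    *scratch = toy_alloc(tmp, 64 * sizeof(int));
    size_t  peak;

    assert(scratch != NULL);
    scratch[0] = value;         /* pretend to do some work */
    peak = tmp->used;
    toy_context_reset(tmp);
    return peak;
}
```

Calling toy_insert_one() in a loop keeps usage flat, because each call leaves the context empty. This does not address Datums kept alive by the support procs in BrinValues, which per the discussion above have to outlive a single call.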
It would be wise to reserve some more support procedure numbers, for
future expansion. Currently, support procs 1-4 are used by BRIN itself,
and higher numbers can be used by the opclass. The minmax opclasses use 5-8
for the <, <=, >= and > operators. If we ever want to add a new,
optional, support function to BRIN, we're out of luck. Let's document
that e.g. support procs < 10 are reserved for BRIN.
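A minimal sketch of what such a reservation could look like, assuming the example threshold of 10 from above. All TOY_* names and both functions are hypothetical and are not taken from the patch.

```c
#include <assert.h>
#include <stdbool.h>

/* Support procs BRIN itself uses today, per the description above. */
#define TOY_BRIN_FIRST_CORE_PROCNUM     1
#define TOY_BRIN_LAST_CORE_PROCNUM      4

/* Hypothetical: everything below this number is reserved for BRIN. */
#define TOY_BRIN_FIRST_OPCLASS_PROCNUM  10

/* The reserved range must leave room beyond the procs already in use. */
static_assert(TOY_BRIN_LAST_CORE_PROCNUM < TOY_BRIN_FIRST_OPCLASS_PROCNUM - 1,
              "reserved range leaves no room for future BRIN support procs");

/* True if procnum belongs to BRIN itself, in use or held for the future. */
bool
toy_procnum_is_reserved(int procnum)
{
    return procnum >= TOY_BRIN_FIRST_CORE_PROCNUM &&
           procnum < TOY_BRIN_FIRST_OPCLASS_PROCNUM;
}

/* True if an opclass may assign its own meaning to procnum. */
bool
toy_procnum_ok_for_opclass(int procnum)
{
    return procnum >= TOY_BRIN_FIRST_OPCLASS_PROCNUM;
}
```

Under this threshold, numbers 5-8 would fall inside the reserved range, so the minmax opclasses would need to move their operator procs to 10 or above.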
The redo routines should be updated to follow the new
XLogReadBufferForRedo idiom (commit
f8f4227976a2cdb8ac7c611e49da03aa9e65e0d2).
- Heikki