On 10/07/2014 01:33 AM, Alvaro Herrera wrote:
Heikki Linnakangas wrote:
On 09/23/2014 10:04 PM, Alvaro Herrera wrote:
+ Open questions
+ --------------
+
+ * Same-size page ranges?
+   Current related literature seems to consider that each "index entry" in a
+   BRIN index must cover the same number of pages.  There doesn't seem to be a

What is the related literature? Is there an academic paper or
something that should be cited as a reference for BRIN?

In the original "minmax-proposal" file, I had these four URLs:

: Other database systems already have similar features. Some examples:
:
: * Oracle Exadata calls this "storage indexes"
:   http://richardfoote.wordpress.com/category/storage-indexes/
:
: * Netezza has "zone maps"
:   http://nztips.com/2010/11/netezza-integer-join-keys/
:
: * Infobright has this automatically within their "data packs" according to a
:   May 3rd, 2009 blog post
:   http://www.infobright.org/index.php/organizing_data_and_more_about_rough_data_contest/
:
: * MonetDB also uses this technique, according to a published paper
:   http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.108.2662
:   "Cooperative Scans: Dynamic Bandwidth Sharing in a DBMS"

I gave them all a quick look and none of them touches the approach in
detail; in fact other than the Oracle Exadata one, they are all talking
about something else and mention the "minmax" stuff only in passing.  I
don't think any of them is worth citing.

I think the "current related literature" phrase should be removed if there isn't in fact any literature on this. If there is any literature worth referencing, we should add a proper citation.

I added a USE_ASSERT_CHECKING-only block in brininsert that runs the union
support proc and compares the output with the one from regular addValue.
I haven't tested this too much yet.

Ok, that's better than nothing. I wonder if it's too strict, though. It uses brin_tuple_equal(), which does a memcmp() on the tuples. That will trip for any non-meaningful differences, like the scale in a numeric.
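
To make the concern concrete, here is a minimal sketch using a toy decimal type. It is not PostgreSQL's actual numeric representation; ToyDecimal and both comparison functions are made up for illustration. Two values that are numerically equal but carry a different scale differ under memcmp(), while a comparison that normalizes the scale first treats them as equal.

```c
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

/*
 * Toy fixed-point value representing unscaled / 10^scale.
 * Hypothetical type, only loosely inspired by a numeric with a display scale.
 */
typedef struct ToyDecimal
{
    int64_t unscaled;
    int32_t scale;
    int32_t pad;        /* explicit padding, so memcmp sees only defined bytes */
} ToyDecimal;

/* Bytewise comparison, similar in spirit to a memcmp()-based tuple check. */
bool
toy_bytewise_equal(const ToyDecimal *a, const ToyDecimal *b)
{
    return memcmp(a, b, sizeof(ToyDecimal)) == 0;
}

/*
 * Semantic comparison: bring both values to the same scale, then compare.
 * Overflow is ignored; this is only meant for small illustrative values.
 */
bool
toy_semantic_equal(ToyDecimal a, ToyDecimal b)
{
    while (a.scale < b.scale)
    {
        a.unscaled *= 10;
        a.scale++;
    }
    while (b.scale < a.scale)
    {
        b.unscaled *= 10;
        b.scale++;
    }
    return a.unscaled == b.unscaled;
}
```

With a = {10, 1, 0} (1.0) and b = {100, 2, 0} (1.00), the bytewise check reports a difference and the semantic check does not. An assertion built on the bytewise check would fire in that situation even though nothing is actually wrong.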

* clarify the memory context stuff of support functions that we also
discussed earlier

I re-checked this stuff.  Turns out that the support functions don't
palloc/pfree memory too much, except to update the stuff stored in
BrinValues, by using datumCopy().  This memory is only freed when we
need to update a previous Datum.  There's no way for the brin.c code to
know when the Datum is going to be released by the support proc, and
thus no way for a temp context to be used.

The memory context experiments I alluded to earlier are related to
pallocs done in brininsert / bringetbitmap themselves, not in the
opclass-provided support procs.

At the very least, it needs to be documented.

All in all, I don't think there's much
room for improvement, other than perhaps doing so in brininsert/
bringetbitmap.  Don't really care too much about this either way.

Doing it in brininsert/bringetbitmap seems like the right approach. GiST, GIN, and SP-GiST all use a temporary memory context like that.
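
For illustration, the pattern amounts to roughly the following. This is a self-contained toy, not PostgreSQL's MemoryContext API: ToyContext, toy_alloc, toy_reset and toy_insert are hypothetical names. The point is only that everything allocated during one call goes into a scratch area that is released in a single step at the end, so the per-call allocations need no individual frees.

```c
#include <stddef.h>
#include <stdlib.h>

/*
 * Header placed in front of every allocation.  The union keeps the payload
 * that follows it suitably aligned for any type.
 */
typedef union ToyChunk
{
    union ToyChunk *next;
    max_align_t align;
} ToyChunk;

/* A toy "memory context": just a linked list of allocations. */
typedef struct ToyContext
{
    ToyChunk   *chunks;
    size_t      nchunks;
} ToyContext;

/* Allocate from the context; returns NULL if malloc fails. */
void *
toy_alloc(ToyContext *cxt, size_t size)
{
    ToyChunk   *chunk = malloc(sizeof(ToyChunk) + size);

    if (chunk == NULL)
        return NULL;
    chunk->next = cxt->chunks;
    cxt->chunks = chunk;
    cxt->nchunks++;
    return chunk + 1;
}

/* Free everything allocated in the context in one go. */
void
toy_reset(ToyContext *cxt)
{
    while (cxt->chunks != NULL)
    {
        ToyChunk   *next = cxt->chunks->next;

        free(cxt->chunks);
        cxt->chunks = next;
    }
    cxt->nchunks = 0;
}

/*
 * Stand-in for an insert routine: returns the minimum of values[0..n-1]
 * (n must be at least 1), or 0 if the scratch allocation fails.
 */
int
toy_insert(ToyContext *tmpcxt, const int *values, size_t n)
{
    int        *scratch = toy_alloc(tmpcxt, n * sizeof(int));
    int         min = 0;

    if (scratch != NULL)
    {
        for (size_t i = 0; i < n; i++)
            scratch[i] = values[i];     /* per-call working copy */
        min = scratch[0];
        for (size_t i = 1; i < n; i++)
            if (scratch[i] < min)
                min = scratch[i];
    }
    toy_reset(tmpcxt);                  /* no individual frees needed */
    return min;
}
```

As noted above for the support procs, this only helps for allocations whose lifetime ends with the call; anything that has to outlive the call still has to be allocated in a longer-lived context.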


It would be wise to reserve some more support procedure numbers, for future expansion. Currently, support procs 1-4 are used by BRIN itself, and higher numbers can be used by the opclass. The minmax opclasses use 5-8 for the <, <=, >= and > operators. If we ever want to add a new, optional, support function to BRIN, we're out of luck. Let's document that e.g. support procs < 10 are reserved for BRIN.
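
A sketch of what such a reservation could look like in code, assuming the "< 10" boundary suggested above. The macro and function names are hypothetical, not existing BRIN symbols.

```c
#include <stdbool.h>

/* Hypothetical constants; the boundary of 10 is the one proposed above. */
#define TOY_BRIN_NPROCS_USED        4   /* procs 1-4 are used by BRIN today */
#define TOY_BRIN_FIRST_OPCLASS_PROC 10  /* procs below this are reserved */

/* True if the number lies in the range reserved for BRIN itself. */
bool
toy_procnum_is_reserved(int procnum)
{
    return procnum >= 1 && procnum < TOY_BRIN_FIRST_OPCLASS_PROC;
}

/* True if an opclass may use the number for its own purposes. */
bool
toy_procnum_ok_for_opclass(int procnum)
{
    return procnum >= TOY_BRIN_FIRST_OPCLASS_PROC;
}
```

Under this particular boundary the numbers 5-8 that minmax uses now would fall inside the reserved range, so they would presumably have to move up.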

The redo routines should be updated to follow the new XLogReadBufferForRedo idiom (commit f8f4227976a2cdb8ac7c611e49da03aa9e65e0d2).

- Heikki

