Re: [HACKERS] WIP: BRIN bloom indexes

2017-10-27 Thread Tomas Vondra
Hi, On 10/28/2017 02:41 AM, Nico Williams wrote: > On Fri, Oct 27, 2017 at 10:06:58PM +0200, Tomas Vondra wrote: >>> + * We use an optimisation that initially we store the uint32 values >>> directly, >>> + * without the extra hashing step. And only later filling the bitmap space, >>> + * we switc

Re: [HACKERS] WIP: BRIN bloom indexes

2017-10-27 Thread Nico Williams
On Fri, Oct 27, 2017 at 10:06:58PM +0200, Tomas Vondra wrote: > > + * We use an optimisation that initially we store the uint32 values > > directly, > > + * without the extra hashing step. And only later filling the bitmap space, > > + * we switch to the regular bloom filter mode. > > > > I don't
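
The patch comment quoted in this message describes a two-phase scheme: raw uint32 values are stored directly at first, and the structure switches to regular bloom filter mode once that space fills up. A minimal sketch of that idea follows. It is not the code from the patch; the names (HybridFilter, hybrid_add, hybrid_contains), the 64-byte bitmap, the three hash probes, and the mixing function are all assumptions made for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

/* All sizes and names below are hypothetical, chosen only for this sketch. */
#define FILTER_BYTES 64                    /* space shared by both phases */
#define FILTER_BITS  (FILTER_BYTES * 8)
#define MAX_DIRECT   (FILTER_BYTES / 4)    /* raw uint32 values that fit */
#define NUM_HASHES   3

typedef struct HybridFilter
{
    bool     hashed;    /* false: raw values stored; true: bloom bitmap */
    uint32_t nvalues;   /* number of raw values (direct phase only) */
    union
    {
        uint32_t values[MAX_DIRECT];
        uint8_t  bits[FILTER_BYTES];
    } data;
} HybridFilter;

/* Simple 32-bit mixer standing in for a real hash function (assumption). */
static uint32_t mix32(uint32_t x, uint32_t seed)
{
    x ^= seed;
    x ^= x >> 16;
    x *= 0x85ebca6bu;
    x ^= x >> 13;
    x *= 0xc2b2ae35u;
    x ^= x >> 16;
    return x;
}

/* Set the NUM_HASHES bits that correspond to one value. */
static void bloom_set(uint8_t *bits, uint32_t value)
{
    for (uint32_t i = 0; i < NUM_HASHES; i++)
    {
        uint32_t bit = mix32(value, i * 0x9e3779b9u) % FILTER_BITS;
        bits[bit / 8] |= (uint8_t) (1u << (bit % 8));
    }
}

/* True if every bit for the value is set (may be a false positive). */
static bool bloom_test(const uint8_t *bits, uint32_t value)
{
    for (uint32_t i = 0; i < NUM_HASHES; i++)
    {
        uint32_t bit = mix32(value, i * 0x9e3779b9u) % FILTER_BITS;
        if ((bits[bit / 8] & (1u << (bit % 8))) == 0)
            return false;
    }
    return true;
}

static void hybrid_init(HybridFilter *f)
{
    memset(f, 0, sizeof(*f));
}

static void hybrid_add(HybridFilter *f, uint32_t value)
{
    if (!f->hashed)
    {
        /* Direct phase: keep distinct raw values, no hashing needed. */
        for (uint32_t i = 0; i < f->nvalues; i++)
            if (f->data.values[i] == value)
                return;
        if (f->nvalues < MAX_DIRECT)
        {
            f->data.values[f->nvalues++] = value;
            return;
        }

        /* Space is full: copy the raw values out, then rebuild as a bitmap. */
        uint32_t saved[MAX_DIRECT];
        memcpy(saved, f->data.values, sizeof(saved));
        memset(f->data.bits, 0, FILTER_BYTES);
        f->hashed = true;
        for (uint32_t i = 0; i < MAX_DIRECT; i++)
            bloom_set(f->data.bits, saved[i]);
        f->nvalues = 0;
    }
    bloom_set(f->data.bits, value);
}

static bool hybrid_contains(const HybridFilter *f, uint32_t value)
{
    assert(f != NULL);
    if (f->hashed)
        return bloom_test(f->data.bits, value);

    /* Direct phase answers are exact. */
    for (uint32_t i = 0; i < f->nvalues; i++)
        if (f->data.values[i] == value)
            return true;
    return false;
}
```

In this sketch lookups are exact while the raw values still fit, and false positives become possible only after the switch to the bitmap. Whether the values fed in are already hashes of the indexed datum, as the "uint32 values" wording suggests, is left out of the sketch.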

Re: [HACKERS] WIP: BRIN bloom indexes

2017-10-27 Thread Tomas Vondra
Hi, On 10/27/2017 05:22 PM, Sokolov Yura wrote: > > Hi, Tomas > > BRIN bloom index is a really cool feature, that definitely should be in > core distribution (either in contrib or builtin)!!! > > Small suggestion for algorithm: > > It is well known practice not to calculate whole hash function

Re: [HACKERS] WIP: BRIN bloom indexes

2017-10-27 Thread Tomas Vondra
Hi, On 10/27/2017 07:17 PM, Nico Williams wrote: > On Thu, Oct 19, 2017 at 10:15:32PM +0200, Tomas Vondra wrote: > > A bloom filter index would, indeed, be wonderful. > > Comments: > > + * We use an optimisation that initially we store the uint32 values directly, > + * without the extra hashing

Re: [HACKERS] WIP: BRIN bloom indexes

2017-10-27 Thread Sokolov Yura
On 2017-10-27 20:17, Nico Williams wrote: On Thu, Oct 19, 2017 at 10:15:32PM +0200, Tomas Vondra wrote: A bloom filter index would, indeed, be wonderful. Comments: + * We use an optimisation that initially we store the uint32 values directly, + * without the extra hashing step. And only later

Re: [HACKERS] WIP: BRIN bloom indexes

2017-10-27 Thread Nico Williams
On Thu, Oct 19, 2017 at 10:15:32PM +0200, Tomas Vondra wrote: A bloom filter index would, indeed, be wonderful. Comments: + * We use an optimisation that initially we store the uint32 values directly, + * without the extra hashing step. And only later filling the bitmap space, + * we switch to t

Re: [HACKERS] WIP: BRIN bloom indexes

2017-10-27 Thread Sokolov Yura
On 2017-10-19 23:15, Tomas Vondra wrote: Hi, The BRIN minmax opclasses work well only for data where the column is somewhat correlated to physical location in a table. So it works great for timestamps in append-only log tables, for example. When that is not the case (non-correlated columns) the

Re: [HACKERS] WIP: BRIN bloom indexes

2017-10-27 Thread Robert Haas
On Fri, Oct 27, 2017 at 2:55 PM, Alvaro Herrera wrote: > I was rather thinking that if we can make this very robust against the > index growing out of proportion, we should consider ditching the > original minmax and replace it with multirange minmax, which seems like > it'd have much better behav

Re: [HACKERS] WIP: BRIN bloom indexes

2017-10-27 Thread Alvaro Herrera
Tomas Vondra wrote: > Not sure "a number of in-core opclasses" is a good reason to (not) add > new ones. Also, we already have two built-in BRIN opclasses (minmax and > inclusion). > > In general, "BRIN bloom" can be packed as a contrib module (at least I > believe so). That's not the case for the

Re: [HACKERS] WIP: BRIN bloom indexes

2017-10-27 Thread Tomas Vondra
Hi, On 10/27/2017 09:34 AM, Simon Riggs wrote: > On 27 October 2017 at 07:20, Robert Haas wrote: >> On Thu, Oct 19, 2017 at 10:15 PM, Tomas Vondra >> wrote: >>> Let's see a query like this: >>> >>> select * from bloom_test >>> where id = '8db1d4a6-31a6-e9a2-4e2c-0e842e1f1772'; >>> >>> T

Re: [HACKERS] WIP: BRIN bloom indexes

2017-10-27 Thread Simon Riggs
On 27 October 2017 at 07:20, Robert Haas wrote: > On Thu, Oct 19, 2017 at 10:15 PM, Tomas Vondra > wrote: >> Let's see a query like this: >> >> select * from bloom_test >> where id = '8db1d4a6-31a6-e9a2-4e2c-0e842e1f1772'; >> >> The minmax index produces this plan >> >>Heap Blocks: l

Re: [HACKERS] WIP: BRIN bloom indexes

2017-10-26 Thread Robert Haas
On Thu, Oct 19, 2017 at 10:15 PM, Tomas Vondra wrote: > Let's see a query like this: > > select * from bloom_test > where id = '8db1d4a6-31a6-e9a2-4e2c-0e842e1f1772'; > > The minmax index produces this plan > >Heap Blocks: lossy=2061856 > Execution time: 22707.891 ms > > Now, the blo

[HACKERS] WIP: BRIN bloom indexes

2017-10-19 Thread Tomas Vondra
Hi, The BRIN minmax opclasses work well only for data where the column is somewhat correlated to physical location in a table. So it works great for timestamps in append-only log tables, for example. When that is not the case (non-correlated columns) the minmax ranges get very "wide" and we end up