Re: [GENERAL] new index type with clustering in mind.

2014-12-11 Thread Jack Douglas
 Currently, one issue you're going to face is that brin doesn't rescan a
range to
 find the tighest possible summary tuple.

That's going to be an issue I think, thanks for mentioning it. We'd need
some sort of mechanism for achieving this without a complete REINDEX, even
if it only reset the min/max when all the blocks in the range are entirely
cleared out. Ah well :)

 Another issue is how to find the best possible ordering.  For minmax
 opclasses it's easy, but for other opclass designs it's less clear what to
do.
 Even for minmax you need to find some way to communicate to the system
 what's the order to follow ...

Do you mean the ordering for the clustered table tuples or the ordering of
index tuples in the BRIN index? I'm the former because I'm also assuming you
always scan an entire BRIN index as there isn't a trivial way of optimizing
the index scan for ranges (unless you 'granulate' the ranges along the lines
of this: http://dba.stackexchange.com/a/22295/1396)?

If you mean the clustering order, for the use cases I'm concerned with it
isn't important - as long as tuples with the same cluster key gravitate
towards the same blocks, it often doesn't matter what order those blocks are
in because the main mission is to reduce the number of blocks scanned.




-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] new index type with clustering in mind.

2014-12-10 Thread Jack Douglas
 in 9.4, GIN indexes are pretty close to this already

Do I understand correctly that BRIN indexes will be even closer to this?

Kindest regards
Jack

-Original Message-
From: Tom Lane [mailto:t...@sss.pgh.pa.us] 
Sent: 24 May 2014 22:46
To: Martijn van Oosterhout
Cc: Jack Douglas; pgsql-general@postgresql.org
Subject: Re: [GENERAL] new index type with clustering in mind.

Martijn van Oosterhout klep...@svana.org writes:
 On Sat, May 24, 2014 at 05:58:37PM +0100, Jack Douglas wrote:
 Would the following be practical to implement:
 A btree-like index type that points to *pages* rather than individual
rows.

 It's an interesting idea, but, how can you *ever* delete index entries?
 I.e. is there a way to maintain the index without rebuilding it 
 regularly?

The discussions at PGCon pointed out that with the posting-list compression
logic added in 9.4, GIN indexes are pretty close to this already.  Multiple
items on the same heap page will typically only take one byte of index space
per item; but there is an identifiable entry, so you don't get into these
questions of when VACUUM should remove entries, and it's not lossy so you're
not forced to pay the overhead of rechecking every entry on the linked-to
page.

Not to say that 9.4 GIN is necessarily the last word on the subject, but it
would be worth testing it out before deciding that we need something better.
(beta1 is out.  It needs testing.  Hint hint.)

regards, tom lane



-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] new index type with clustering in mind.

2014-12-10 Thread Alvaro Herrera
Jack Douglas wrote:
  in 9.4, GIN indexes are pretty close to this already
 
 Do I understand correctly that BRIN indexes will be even closer to this?
 

Yeah, in a way.  You could say they are closer from the opposite end.
There is one index tuple in a BRIN index for each page range (contiguous
set of pages); each index tuple contains a summary of what in that
page range.  There are no exact entries.  If the values are randomly
scattered, the index is useless; all page ranges will have to be scanned
for possibly matching tuples.  If the values are perfectly clustered,
the index is optimal because you scan the minimal set of pages.

-- 
Álvaro Herrerahttp://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training  Services


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] new index type with clustering in mind.

2014-12-10 Thread Jack Douglas
 If the values are perfectly clustered, the index is optimal because you
scan the minimal set of pages.

That's the bit I'm particularly interested in, as my plan would be to keep
the pages well clustered: http://dba.stackexchange.com/a/66293/1396

Do you see any blocker preventing BRIN being used for a continuous
background re-clustering job (in parallel with or as part of vacuum),
similar to the mechanism I experimented with before? If not is this
something there might be support for adding to the TODO list?

Kindest regards
Jack




-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] new index type with clustering in mind.

2014-12-10 Thread Alvaro Herrera
Jack Douglas wrote:
  If the values are perfectly clustered, the index is optimal because you
 scan the minimal set of pages.
 
 That's the bit I'm particularly interested in, as my plan would be to keep
 the pages well clustered: http://dba.stackexchange.com/a/66293/1396
 
 Do you see any blocker preventing BRIN being used for a continuous
 background re-clustering job (in parallel with or as part of vacuum),
 similar to the mechanism I experimented with before? If not is this
 something there might be support for adding to the TODO list?

In principle, CLUSTER sucks, and having data clustered is clearly good,
so improvements in this area are certainly welcome.

If you were to propose some general mechanism that works for any index,
I don't see that we would reject having it work specifically for BRIN;
having it for BRIN only would be strange.  (I guess it's good enough if
it works for btree and BRIN.  Not sure about GiST and SP-GiST; GIN
clearly is of no use here.)

Currently, one issue you're going to face is that brin doesn't rescan a
range to find the tighest possible summary tuple.  Thinking in min/max
terms, once a tuple with a very high or very low is inserted, the range
doesn't get any smaller once that tuple is deleted from the range.  You
would need to find a way to fix that.  (The easiest way is to REINDEX
the whole thing, of course, but that processes the whole table and not
just some portion of it.)

Another issue is how to find the best possible ordering.  For minmax
opclasses it's easy, but for other opclass designs it's less clear what
to do.  Even for minmax you need to find some way to communicate to the
system what's the order to follow ...

-- 
Álvaro Herrerahttp://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training  Services


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] new index type with clustering in mind.

2014-06-03 Thread Jack Douglas
   To reduce complexity (eg MVCC/snapshot related issues), index entries 
   would be added when a row is inserted, but they would not be removed 
   when the row is updated/deleted (or when an insert is rolled back): 

  It's an interesting idea, but, how can you *ever* delete index entries?
  I.e. is there a way to maintain the index without rebuilding it
  regularly?
 
 The discussions at PGCon pointed out that with the posting-list
compression
 logic added in 9.4, GIN indexes are pretty close to this already.
Multiple
 items on the same heap page will typically only take one byte of index
space
 per item; but there is an identifiable entry, so you don't get into these
 questions of when VACUUM should remove entries, and it's not lossy so
 you're not forced to pay the overhead of rechecking every entry on the
 linked-to page.

The README file in the source code for GIN indexes says: Note: There is no
delete operation in the key (entry) tree. The reason for this is that in our
experience, the set of distinct words in a large corpus changes very slowly.
This greatly simplifies the code and concurrency algorithms.. Does that
mean that for the case where GIN is used as a simple replacement for btree,
my initial suggestion above (...would be added when a row is inserted, but
they would not be removed...) is effectively what happens already with GIN
indexes?

---

I've written up a test to demonstrate the principle of this 'clustering
lite': http://dba.stackexchange.com/q/66292/1396

In brief, it shows the benefit of this sort of clustering (much lower io
versus unclustered, and no exclusive lock or heavy WAL generation versus
current clustering implementations) with a workable way of achieving it in
the current release. The basic idea is:

1. turn off autovacuum for the table
2. check each block to determine the degree of clustering
3. delete and re-insert all the rows from blocks below a clustering
threshold
4. manually vacuum to free those (complete) blocks
5. repeat steps 2-4 as regularly as necessary

A weakness is that is requires a full table scan is required. All that is
needed to avoid the full-scan would be a way of getting `blkno` from an
index-only scan, which as far as I can tell is not currently possible, is
it?



-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] new index type with clustering in mind.

2014-05-26 Thread Jack Douglas
 The discussions at PGCon pointed out that with the posting-list
compression logic added in 9.4, GIN indexes are pretty close to this
already.  Multiple items on the same heap page will typically only take one
byte of index space per item; but there is an identifiable entry, so you
don't get into these questions of when VACUUM should remove entries, and
it's not lossy so you're not forced to pay the overhead of rechecking every
entry on the linked-to page.

 Not to say that 9.4 GIN is necessarily the last word on the subject, but
it would be worth testing it out before deciding that we need something
better.  (beta1 is out.  It needs testing.  Hint hint.)

Hint taken, and first impressions are positive: the compression is very
efficient for the kind of scenario I'm imagining where the key is
deliberately chosen so that the average page has one distinct key. I have a
25Mb gin 'cluster' index on a table where an equivalent regular btree index
is 10 times as large.

So the questions are, a) is this kind of clustering broadly useful (ie not
just to me), b) how much effort will it be to implement a 'vacuum-like'
operation that scans a designated index and performs the relevant
delete/inserts to achieve this kind of clustering? And c) if it is broadly
useful and not a major implementation mountain to climb, is it something
that might be added to the todo list?

If someone can tell me how to decode a `ctid` into a page number (discarding
the row number portion - is there a better way than `
(replace(replace(ctid::text,'(','{'),')','}')::integer[])[1]`), I should be
able to show some analysis demonstrating this working, albeit inefficiently
as I'll have to scan the table itself for the page/key statistics. Would
that sort of analysis be helpful?

Kindest regards
Jack

PS It occurs to me that the btree_gin documentation page for 9.4,
http://www.postgresql.org/docs/9.4/static/btree-gin.html, might benefit from
including some mention of index compression when discussing the relative
performance of regular and gin btree indexes.



-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] new index type with clustering in mind.

2014-05-24 Thread Martijn van Oosterhout
On Sat, May 24, 2014 at 05:58:37PM +0100, Jack Douglas wrote:
 Would the following be practical to implement:

 A btree-like index type that points to *pages* rather than individual rows.
 Ie if there are many rows in a page with the same data (in the indexed
 columns), only one index entry will exist. In its normal use case, this
 index would be much smaller than a regular index on the same columns which
 would contain one entry for each individual row.

 To reduce complexity (eg MVCC/snapshot related issues), index entries would
 be added when a row is inserted, but they would not be removed when the row
 is updated/deleted (or when an insert is rolled back): this would cause
 index bloat over time in volatile tables but this would be acceptable for
 the use case I have in mind. So in essence, an entry in the index would
 indicate that there *may* be matching rows in the page, not that there
 actually are.

It's an interesting idea, but, how can you *ever* delete index entries?
I.e. is there a way to maintain the index without rebuilding it
regularly?

Maybe there's something you could do with tracking all the entries that
point to one page or something, or a counter.  Because really, the fact
that the item pointer in a btree index includes the item number is only
really needed for deletion.  Postgres always has to read in the whole
page anyway, so if you can find a way around that it might be an
interesting improvement.

Mind you, hash indexes could get this almost free, except they're not
crash safe.

Have a nice day,
-- 
Martijn van Oosterhout   klep...@svana.org   http://svana.org/kleptog/
 He who writes carelessly confesses thereby at the very outset that he does
 not attach much importance to his own thoughts.
   -- Arthur Schopenhauer


signature.asc
Description: Digital signature


Re: [GENERAL] new index type with clustering in mind.

2014-05-24 Thread Tom Lane
Martijn van Oosterhout klep...@svana.org writes:
 On Sat, May 24, 2014 at 05:58:37PM +0100, Jack Douglas wrote:
 Would the following be practical to implement:
 A btree-like index type that points to *pages* rather than individual rows.

 It's an interesting idea, but, how can you *ever* delete index entries?
 I.e. is there a way to maintain the index without rebuilding it
 regularly?

The discussions at PGCon pointed out that with the posting-list
compression logic added in 9.4, GIN indexes are pretty close to this
already.  Multiple items on the same heap page will typically only take
one byte of index space per item; but there is an identifiable entry, so
you don't get into these questions of when VACUUM should remove entries,
and it's not lossy so you're not forced to pay the overhead of rechecking
every entry on the linked-to page.

Not to say that 9.4 GIN is necessarily the last word on the subject, but
it would be worth testing it out before deciding that we need something
better.  (beta1 is out.  It needs testing.  Hint hint.)

regards, tom lane


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general