Re: [GENERAL] new index type with clustering in mind.

2014-12-11 Thread Jack Douglas
 Currently, one issue you're going to face is that brin doesn't rescan a
range to
 find the tighest possible summary tuple.

That's going to be an issue I think, thanks for mentioning it. We'd need
some sort of mechanism for achieving this without a complete REINDEX, even
if it only reset the min/max when all the blocks in the range are entirely
cleared out. Ah well :)

 Another issue is how to find the best possible ordering.  For minmax
 opclasses it's easy, but for other opclass designs it's less clear what to
do.
 Even for minmax you need to find some way to communicate to the system
 what's the order to follow ...

Do you mean the ordering for the clustered table tuples or the ordering of
index tuples in the BRIN index? I'm the former because I'm also assuming you
always scan an entire BRIN index as there isn't a trivial way of optimizing
the index scan for ranges (unless you 'granulate' the ranges along the lines
of this: http://dba.stackexchange.com/a/22295/1396)?

If you mean the clustering order, for the use cases I'm concerned with it
isn't important - as long as tuples with the same cluster key gravitate
towards the same blocks, it often doesn't matter what order those blocks are
in because the main mission is to reduce the number of blocks scanned.




-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] new index type with clustering in mind.

2014-12-10 Thread Jack Douglas
 in 9.4, GIN indexes are pretty close to this already

Do I understand correctly that BRIN indexes will be even closer to this?

Kindest regards
Jack

-Original Message-
From: Tom Lane [mailto:t...@sss.pgh.pa.us] 
Sent: 24 May 2014 22:46
To: Martijn van Oosterhout
Cc: Jack Douglas; pgsql-general@postgresql.org
Subject: Re: [GENERAL] new index type with clustering in mind.

Martijn van Oosterhout klep...@svana.org writes:
 On Sat, May 24, 2014 at 05:58:37PM +0100, Jack Douglas wrote:
 Would the following be practical to implement:
 A btree-like index type that points to *pages* rather than individual
rows.

 It's an interesting idea, but, how can you *ever* delete index entries?
 I.e. is there a way to maintain the index without rebuilding it 
 regularly?

The discussions at PGCon pointed out that with the posting-list compression
logic added in 9.4, GIN indexes are pretty close to this already.  Multiple
items on the same heap page will typically only take one byte of index space
per item; but there is an identifiable entry, so you don't get into these
questions of when VACUUM should remove entries, and it's not lossy so you're
not forced to pay the overhead of rechecking every entry on the linked-to
page.

Not to say that 9.4 GIN is necessarily the last word on the subject, but it
would be worth testing it out before deciding that we need something better.
(beta1 is out.  It needs testing.  Hint hint.)

regards, tom lane



-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] new index type with clustering in mind.

2014-12-10 Thread Jack Douglas
 If the values are perfectly clustered, the index is optimal because you
scan the minimal set of pages.

That's the bit I'm particularly interested in, as my plan would be to keep
the pages well clustered: http://dba.stackexchange.com/a/66293/1396

Do you see any blocker preventing BRIN being used for a continuous
background re-clustering job (in parallel with or as part of vacuum),
similar to the mechanism I experimented with before? If not is this
something there might be support for adding to the TODO list?

Kindest regards
Jack




-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] new index type with clustering in mind.

2014-06-03 Thread Jack Douglas
   To reduce complexity (eg MVCC/snapshot related issues), index entries 
   would be added when a row is inserted, but they would not be removed 
   when the row is updated/deleted (or when an insert is rolled back): 

  It's an interesting idea, but, how can you *ever* delete index entries?
  I.e. is there a way to maintain the index without rebuilding it
  regularly?
 
 The discussions at PGCon pointed out that with the posting-list
compression
 logic added in 9.4, GIN indexes are pretty close to this already.
Multiple
 items on the same heap page will typically only take one byte of index
space
 per item; but there is an identifiable entry, so you don't get into these
 questions of when VACUUM should remove entries, and it's not lossy so
 you're not forced to pay the overhead of rechecking every entry on the
 linked-to page.

The README file in the source code for GIN indexes says: Note: There is no
delete operation in the key (entry) tree. The reason for this is that in our
experience, the set of distinct words in a large corpus changes very slowly.
This greatly simplifies the code and concurrency algorithms.. Does that
mean that for the case where GIN is used as a simple replacement for btree,
my initial suggestion above (...would be added when a row is inserted, but
they would not be removed...) is effectively what happens already with GIN
indexes?

---

I've written up a test to demonstrate the principle of this 'clustering
lite': http://dba.stackexchange.com/q/66292/1396

In brief, it shows the benefit of this sort of clustering (much lower io
versus unclustered, and no exclusive lock or heavy WAL generation versus
current clustering implementations) with a workable way of achieving it in
the current release. The basic idea is:

1. turn off autovacuum for the table
2. check each block to determine the degree of clustering
3. delete and re-insert all the rows from blocks below a clustering
threshold
4. manually vacuum to free those (complete) blocks
5. repeat steps 2-4 as regularly as necessary

A weakness is that is requires a full table scan is required. All that is
needed to avoid the full-scan would be a way of getting `blkno` from an
index-only scan, which as far as I can tell is not currently possible, is
it?



-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] new index type with clustering in mind.

2014-05-26 Thread Jack Douglas
 The discussions at PGCon pointed out that with the posting-list
compression logic added in 9.4, GIN indexes are pretty close to this
already.  Multiple items on the same heap page will typically only take one
byte of index space per item; but there is an identifiable entry, so you
don't get into these questions of when VACUUM should remove entries, and
it's not lossy so you're not forced to pay the overhead of rechecking every
entry on the linked-to page.

 Not to say that 9.4 GIN is necessarily the last word on the subject, but
it would be worth testing it out before deciding that we need something
better.  (beta1 is out.  It needs testing.  Hint hint.)

Hint taken, and first impressions are positive: the compression is very
efficient for the kind of scenario I'm imagining where the key is
deliberately chosen so that the average page has one distinct key. I have a
25Mb gin 'cluster' index on a table where an equivalent regular btree index
is 10 times as large.

So the questions are, a) is this kind of clustering broadly useful (ie not
just to me), b) how much effort will it be to implement a 'vacuum-like'
operation that scans a designated index and performs the relevant
delete/inserts to achieve this kind of clustering? And c) if it is broadly
useful and not a major implementation mountain to climb, is it something
that might be added to the todo list?

If someone can tell me how to decode a `ctid` into a page number (discarding
the row number portion - is there a better way than `
(replace(replace(ctid::text,'(','{'),')','}')::integer[])[1]`), I should be
able to show some analysis demonstrating this working, albeit inefficiently
as I'll have to scan the table itself for the page/key statistics. Would
that sort of analysis be helpful?

Kindest regards
Jack

PS It occurs to me that the btree_gin documentation page for 9.4,
http://www.postgresql.org/docs/9.4/static/btree-gin.html, might benefit from
including some mention of index compression when discussing the relative
performance of regular and gin btree indexes.



-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


[GENERAL] Data Checksum feature and streaming replication

2014-05-24 Thread Jack Douglas
I posted this question on dba.se (http://dba.stackexchange.com/q/65821/1396)
and was advised to try asking here.

 

If I'm using the Data Checksum feature (new to 9.3:
https://wiki.postgresql.org/wiki/What%27s_new_in_PostgreSQL_9.3#Data_Checksu
ms), and in the event of a checksum failure on a replication master, will
that corrupt data replicate to slaves or will replication halt. Does it
depend on the setting of ignore_checksum_failure?

 

Thanks

Jack



[GENERAL] new index type with clustering in mind.

2014-05-24 Thread Jack Douglas
Hi

 

Please forgive this question if it is naïve.

 

Would the following be practical to implement:

 

A btree-like index type that points to *pages* rather than individual rows.
Ie if there are many rows in a page with the same data (in the indexed
columns), only one index entry will exist. In its normal use case, this
index would be much smaller than a regular index on the same columns which
would contain one entry for each individual row.

 

To reduce complexity (eg MVCC/snapshot related issues), index entries would
be added when a row is inserted, but they would not be removed when the row
is updated/deleted (or when an insert is rolled back): this would cause
index bloat over time in volatile tables but this would be acceptable for
the use case I have in mind. So in essence, an entry in the index would
indicate that there *may* be matching rows in the page, not that there
actually are.

 

For data where there is a high degree of clustering on the columns of the
index, this index would be quite efficient at locating matching rows: the
index itself would be small and most pages would contain largely rows that
match. (In practice it would behave similar to a ‘cluster’ index in Oracle,
without the guarantee of clustering).

 

This brings me on to my use case: a method to efficiently keep a table
largely clustered (with ‘clustered’ more loosely defined than the usual
Postgres definition) without locking more than a page at a time. The above
index could be scanned periodically for entries where multiple keys point to
the same page. Of that set of keys/pages, the subset where the same key
appears in multiple pages can be clustered by simply deleting and
reinserting the rows, key by key. In this case, the pages affected by the
deletes could be locked briefly in order to safely delete the corresponding
index entries without conflict with other transactions. If a lock can’t be
obtained the process could simply skip and move on to the next candidate
key/pages.

 

Although this kind of ‘clustered’ data is less ordered that a normal
Postgres cluster, it retains some of the useful properties: particularly
that data for a particular key is likely to be found in a small number of
pages rather than scattered evenly through the entire table, requiring
potentially far more io to select.

 

Many thanks in advance and my apologies once again if this kind of thing has
been rejected before or is obviously impractical.

Jack

 

 

 

 

 



[GENERAL] returning clause and source columns

2014-05-12 Thread Jack Douglas
Hi

 

Related to this post: http://dba.stackexchange.com/q/50693/1396

 

Would it be a major change to allow the `returning` clause of `insert` to
return values that are *not* inserted? It seems you can already do so for
literals (and perhaps that should be made explicit in the docs here:
http://www.postgresql.org/docs/9.3/static/sql-insert.html#VARIABLELIST).

 

For example:

 

insert into t1(id) select id from t2 returning *, t2.foo;

 

Kindest regards

Jack



[GENERAL] interval hour to minute or interval day to minute

2011-04-17 Thread Jack Douglas

Hi

I discovered the 'fields' option of 'interval', but i can't figure out 
from the docs how it is supposed to work. Are hour to minute and day 
to minute really the same thing? And if not, in what circumstances are 
they treated differently?


psql (8.4.7)
Type help for help.

postgres= select '3 years 2 months - 1 day + 10 hours 5 
minutes'::interval day to minute;

 interval
--
 3 years 2 mons -1 days +10:05:00
(1 row)

postgres= select '3 years 2 months - 1 day + 10 hours 5 
minutes'::interval hour to minute;

 interval
--
 3 years 2 mons -1 days +10:05:00
(1 row)

Warm regards
Jack Douglas

--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general