Re: [HACKERS] Pluggable Indexes

2009-01-24 Thread Simon Riggs

On Fri, 2009-01-23 at 16:49 -0500, Tom Lane wrote:
 Simon Riggs si...@2ndquadrant.com writes:
  On Fri, 2009-01-23 at 10:33 -0500, Tom Lane wrote:
  Right, the WAL-record-processing API is not really at issue, since it's
  been proven internally to the core code.  My concern is with the other
  part, namely exactly how are we going to identify and install additional
  rmgrs.
 
  The patch is just
  * a hook in StartupXLOG to allow loading arbitrary code into Startup
  * some slight redefinition of RmgrTable to allow arbitrary code to add
  or modify the contents of that table of functions. (Being able to modify
  the table is not necessary for index extensions, but is for other
  uses).
  * some safeguards people requested
 
 Well, that really seems to just prove my point.  You've defined a hook
 and not thought carefully about how people will use it.  

This was originally proposed on 19 August and a patch submitted to the
September commit fest.
http://archives.postgresql.org/pgsql-hackers/2008-08/msg00794.php

After about 30 emails of technical rebuttal we have a list of possible
uses that can't be done sensibly any other way.
* WAL filtering
* Recovery when we have buggy index AMs, yet without losing data
* Pluggable indexes
* Extracting user data from WAL records (very challenging though)

Those uses require the ability to both add to *and* modify all of the
RmgrTable entries. If this were just for pluggable indexes then the API
would probably look a little different, but it's not. The simplicity of
the hook proposal says nothing about the careful thought behind it; it
simply reflects the wide variety of beneficial uses.

At any point in that discussion we might have hit serious problems with
the patch, but we didn't. I've done my best to cover the objections raised
with code or suggested control mechanisms, so I'm not expecting anyone to
agree with my first musings.

 The main thing
 that I can see right now that we'd need is some way to determine who
 gets which rmgr index.  (Maybe community assignment of numbers ---
 similar to what we've defined for pg_statistic kind codes --- is fine,

http://archives.postgresql.org/pgsql-hackers/2008-08/msg00916.php


 or maybe it isn't; in any case we need an answer for that before this
 hook can be considered usable.)  Furthermore, maybe that's not the only
 problem.  I'd feel a lot better about this if the hook patch were done
 in parallel with development of actual WAL support in an actual external
...

I agree we need an external module and I learned that lesson from the
earlier API proposal you mentioned. The supplied WAL filter plugin was/is
a valid use for this and, as discussed, is the only practical way of
doing WAL filtering. As I said, I am happy to make a few mods to make that
more acceptable.

I've sometimes deferred work on this patch because of my other work, but
also because I sensed that some people might see it as a threat to the
project from some commercial usurpation (e.g. like InnoDB). I asked to be
contacted off-list if that was the case but nobody has, so I have assumed
this to be a decision based on technical merit alone. After considering
all that has been said I feel this idea has merit. Yes, we need more and
better plugins and this patch is the seed for those.

-- 
 Simon Riggs   www.2ndQuadrant.com
 PostgreSQL Training, Services and Support


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Pluggable Indexes

2009-01-24 Thread Simon Riggs

On Sat, 2009-01-24 at 09:57 +0000, Simon Riggs wrote:

 I agree we need an external module and I learned that lesson from the
 earlier API proposal you mentioned. The supplied WAL filter plugin was/is
 a valid use for this and, as discussed, is the only practical way of
 doing WAL filtering. As I said, I am happy to make a few mods to make that
 more acceptable.

I can change the contrib plugin to show how to exclude DROP DATABASE and
DROP TABLESPACE records, which is a common recovery scenario.

I'll produce the table filter plugin and release it to pgfoundry. We
currently have everything we need to make that work, AFAICS.

-- 
 Simon Riggs   www.2ndQuadrant.com
 PostgreSQL Training, Services and Support




Re: [HACKERS] Pluggable Indexes

2009-01-24 Thread Simon Riggs

On Sat, 2009-01-24 at 13:51 +0000, Simon Riggs wrote:
 On Sat, 2009-01-24 at 09:57 +0000, Simon Riggs wrote:
 
  I agree we need an external module and I learned that lesson from the
  earlier API proposal you mentioned. The supplied WAL filter plugin was/is
  a valid use for this and, as discussed, is the only practical way of
  doing WAL filtering. As I said, I am happy to make a few mods to make that
  more acceptable.
 
 I can change the contrib plugin to show how to exclude DROP DATABASE and
 DROP TABLESPACE records, which is a common recovery scenario.
 
 I'll produce the table filter plugin and release it to pgfoundry. We
 currently have everything we need to make that work, AFAICS.

On reflection, I'm not going to do those things. 

-- 
 Simon Riggs   www.2ndQuadrant.com
 PostgreSQL Training, Services and Support




Re: [HACKERS] Pluggable Indexes

2009-01-23 Thread Simon Riggs

On Thu, 2009-01-22 at 18:45 -0500, Tom Lane wrote:

 There are other recent examples of proposed hooks that in fact
 failed to be useful because of some oversight or other, and it was
 not until we insisted on seeing a live use of the hooks that this
 became apparent.  (IIRC, one or both of the planner-related hooks
 that are new in 8.4 had such issues.)

Thank you for your support of the plugin concept.

You make good points and are completely correct about the earlier
plugin. The additional plugin capability was filling a gap that had been
left when the planner plugin was added in 8.3. A similar thing happened
with executor plugins IIRC. So I agree, new and complex plugin APIs need
a working example otherwise they'll be wrong.

In the current case, index APIs are already well known, so that API is
unlikely to be a problem. The actual rmgr plugin API is very simple,
since its intention is only to add or edit entries in the internal
RmgrTable (in memory), after which everything is well defined already.
This is probably the simplest API that has been added in recent times.

I'm happy to make the WAL filter plugin work correctly in all cases. It
was intended as a demonstration only, but if that is a problem it is
easily fixed. One of my clients has requested filtering capability
alongside hot standby, so I will deliver it, even if that is rejected
for reasons outside of my hands (such as timing).

-- 
 Simon Riggs   www.2ndQuadrant.com
 PostgreSQL Training, Services and Support




Re: [HACKERS] Pluggable Indexes

2009-01-23 Thread Teodor Sigaev

 Hmm, IIRC it is based on a monotonically increasing number.  It could
 have been anything.  LSN was just a monotonically increasing number that
 would be available if WAL was implemented first (or in parallel).

You are right, but without WAL-logging we would need to implement some kind
of sequence :)



--
Teodor Sigaev   E-mail: teo...@sigaev.ru
   WWW: http://www.sigaev.ru/



Re: [HACKERS] Pluggable Indexes

2009-01-23 Thread Tom Lane
Simon Riggs si...@2ndquadrant.com writes:
 On Thu, 2009-01-22 at 18:45 -0500, Tom Lane wrote:
 There are other recent examples of proposed hooks that in fact
 failed to be useful because of some oversight or other, and it was
 not until we insisted on seeing a live use of the hooks that this
 became apparent.

 In the current case, index APIs are already well known, so that API is
 unlikely to be a problem. The actual rmgr plugin API is very simple,
 since its intention is only to add or edit entries onto the internal
 RmgrTable (in memory) after which everything is well defined already.

Right, the WAL-record-processing API is not really at issue, since it's
been proven internally to the core code.  My concern is with the other
part, namely exactly how are we going to identify and install additional
rmgrs.  There was substantial debate about that when it first came up,
so you're not likely to convince me that it's such an open-and-shut case
as to not need supporting evidence.

regards, tom lane



Re: [HACKERS] Pluggable Indexes

2009-01-23 Thread Simon Riggs

On Fri, 2009-01-23 at 10:33 -0500, Tom Lane wrote:
 Simon Riggs si...@2ndquadrant.com writes:
  On Thu, 2009-01-22 at 18:45 -0500, Tom Lane wrote:
  There are other recent examples of proposed hooks that in fact
  failed to be useful because of some oversight or other, and it was
  not until we insisted on seeing a live use of the hooks that this
  became apparent.
 
  In the current case, index APIs are already well known, so that API is
  unlikely to be a problem. The actual rmgr plugin API is very simple,
  since its intention is only to add or edit entries onto the internal
  RmgrTable (in memory) after which everything is well defined already.
 
 Right, the WAL-record-processing API is not really at issue, since it's
 been proven internally to the core code.  My concern is with the other
 part, namely exactly how are we going to identify and install additional
 rmgrs.  There was substantial debate about that when it first came up,
 so you're not likely to convince me that it's such an open-and-shut case
 as to not need supporting evidence.

I hear your objection and will answer it, for the record at least.

We can load arbitrary code into any normal backend. I just want to be
able to do the same with the startup process. It can't be much of a
discussion since the API is essentially just the same as _PG_init(), or
shmem_startup_hook.

We took the risk with planner hook, and missed something. We took the
risk with RequestAddinShmemSpace() and missed something. There wasn't
any backlash or problem as a result though and we haven't even
backpatched the additional hooks. They were inspired additions. Why is
such a simple hook in Startup such a big deal? What would be wrong in
fixing any problem in the next release, just as we've done in the other
examples?

If we didn't already have chapters in the manual on index extensibility
I would have to agree. We could regard this patch as fixing an oversight
in index extensibility, presumably one made when WAL was created.

The patch is just
* a hook in StartupXLOG to allow loading arbitrary code into Startup
* some slight redefinition of RmgrTable to allow arbitrary code to add
or modify the contents of that table of functions. (Being able to modify
the table is not necessary for index extensions, but is for other
uses).
* some safeguards people requested

Buggy code in shmem_startup_hook could do just as much damage at startup
or in a crash situation, but we have no safeguards there and nobody has
said a single word against that.

-- 
 Simon Riggs   www.2ndQuadrant.com
 PostgreSQL Training, Services and Support




Re: [HACKERS] Pluggable Indexes

2009-01-23 Thread Tom Lane
Simon Riggs si...@2ndquadrant.com writes:
 On Fri, 2009-01-23 at 10:33 -0500, Tom Lane wrote:
 Right, the WAL-record-processing API is not really at issue, since it's
 been proven internally to the core code.  My concern is with the other
 part, namely exactly how are we going to identify and install additional
 rmgrs.

 The patch is just
 * a hook in StartupXLOG to allow loading arbitrary code into Startup
 * some slight redefinition of RmgrTable to allow arbitrary code to add
 or modify the contents of that table of functions. (Being able to modify
 the table is not necessary for index extensions, but is for other
 uses).
 * some safeguards people requested

Well, that really seems to just prove my point.  You've defined a hook
and not thought carefully about how people will use it.  The main thing
that I can see right now that we'd need is some way to determine who
gets which rmgr index.  (Maybe community assignment of numbers ---
similar to what we've defined for pg_statistic kind codes --- is fine,
or maybe it isn't; in any case we need an answer for that before this
hook can be considered usable.)  Furthermore, maybe that's not the only
problem.  I'd feel a lot better about this if the hook patch were done
in parallel with development of actual WAL support in an actual external
indexam.  As was suggested earlier, we could do something like building
hash as an external module for the sake of this development, so it's not
like I'm demanding someone write a whole AM from scratch for this.  But
putting in the hook and leaving people to invent their own ways of using
it is a recipe for conflicts.

regards, tom lane



Re: [HACKERS] Pluggable Indexes

2009-01-22 Thread Heikki Linnakangas

Oleg Bartunov wrote:

 bitmap indexes could be implemented using GiST.

Huh, how would that work? Bitmap indexes have a very different
structure, AFAICS.


--
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com



Re: [HACKERS] Pluggable Indexes

2009-01-22 Thread tomas

On Wed, Jan 21, 2009 at 10:48:21PM +0000, Simon Riggs wrote:
 
 On Thu, 2009-01-22 at 00:29 +0300, Oleg Bartunov wrote:

[...]

  Another question: why not improve GiST to allow support of more indexes?
  bitmap indexes could be implemented using GiST.

[...]

 I'll avoid discussing index design with you :-)

Oooh. What a pity -- this would allow us lurkers to learn a lot!

(Oh, wait, Heikki has taken up that :-)

Just wanted to say -- thanks folks

-- tomás



Re: [HACKERS] Pluggable Indexes

2009-01-22 Thread Teodor Sigaev

 What other constraints are there on such non-in-core indexes?  Early (2005)
 GIST indexes were very painful in production environments because vacuuming
 them held locks for a *long* time (IIRC, an hour or so on my database) on
 the indexes locking out queries.  Was that just a shortcoming of the
 implementation, or was it a side-effect of them not supporting recoverability?

GiST's concurrency algorithm is based on the WAL Log Sequence Number (LSN),
and that was the reason to implement WAL (and recoverability) first in GiST.


--
Teodor Sigaev   E-mail: teo...@sigaev.ru
   WWW: http://www.sigaev.ru/



Re: [HACKERS] Pluggable Indexes

2009-01-22 Thread Alvaro Herrera
Teodor Sigaev wrote:
 What other constraints are there on such non-in-core indexes?  Early (2005)
 GIST indexes were very painful in production environments because vacuuming
 them held locks for a *long* time (IIRC, an hour or so on my database) on
 the indexes locking out queries.  Was that just a shortcoming of the
 implementation, or was it a side-effect of them not supporting
 recoverability?

 GiST's concurrency algorithm is based on the WAL Log Sequence Number (LSN),
 and that was the reason to implement WAL (and recoverability) first in GiST.

Hmm, IIRC it is based on a monotonically increasing number.  It could
have been anything.  LSN was just a monotonically increasing number that
would be available if WAL was implemented first (or in parallel).

Of course, there's not much point in an index that's easily corrupted, so
I understand the desire to implement WAL too -- I'm just pointing out
that concurrency could have been developed independently.
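
To illustrate why any monotonically increasing number would serve, here is a
toy sketch in Python of split detection via rightlinks. It is a
simplification of the general technique, not GiST's actual code: `counter`,
`Page`, `split` and `read_child` are hypothetical names, and `counter` merely
stands in for the LSN.

```python
import itertools
from dataclasses import dataclass, field
from typing import List, Optional

# Toy model only. `counter` stands in for the LSN; any monotonically
# increasing source would do for this purpose. All names are hypothetical.
counter = itertools.count(1)


@dataclass
class Page:
    keys: List[int] = field(default_factory=list)
    nsn: int = 0                        # counter value at this page's last split
    rightlink: Optional["Page"] = None  # sibling created by that split


def split(page: Page) -> None:
    """Move the upper half of the keys to a new right sibling and stamp the split."""
    half = len(page.keys) // 2
    sibling = Page(keys=page.keys[half:], nsn=page.nsn, rightlink=page.rightlink)
    page.keys = page.keys[:half]
    page.rightlink = sibling
    page.nsn = next(counter)


def read_child(child: Page, seen_at: int) -> List[int]:
    """Read a child whose downlink was seen when the counter was `seen_at`.

    If the child split after that moment (nsn > seen_at), the keys that moved
    right may not be reachable through the parent yet, so chase rightlinks.
    """
    found = list(child.keys)
    page = child
    while page.nsn > seen_at and page.rightlink is not None:
        page = page.rightlink
        found.extend(page.keys)
    return found


leaf = Page(keys=[1, 2, 3, 4])
seen_at = next(counter)   # the scanner reads the parent's downlink "now"
split(leaf)               # a concurrent insert splits the leaf afterwards
print(read_child(leaf, seen_at))   # [1, 2, 3, 4]
```

The scan still finds all four keys although the leaf split after the downlink
was read. Nothing in the sketch depends on the counter being a WAL position.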

-- 
Alvaro Herrera                                http://www.CommandPrompt.com/
The PostgreSQL Company - Command Prompt, Inc.



Re: [HACKERS] Pluggable Indexes

2009-01-22 Thread Robert Haas
 Of course, there's not much point in an index that's easily corrupted, so
 I understand the desire to implement WAL too -- I'm just pointing out
 that concurrency could have been developed independently.

Anything's possible with enough work, but having good support in -core
makes it easier and -core has usually been receptive to requests for
such things - for example, I think Tom put in quite a bit of work to
getting the right hooks in to enable libpqtypes.

...Robert



Re: [HACKERS] Pluggable Indexes

2009-01-22 Thread Tom Lane
Robert Haas robertmh...@gmail.com writes:
 Of course, there's not much point in an index that's easily corrupted, so
 I understand the desire to implement WAL too -- I'm just pointing out
 that concurrency could have been developed independently.

 Anything's possible with enough work, but having good support in -core
 makes it easier and -core has usually been receptive to requests for
 such things - for example, I think Tom put in quite a bit of work to
 getting the right hooks in to enable libpqtypes.

Well, in fact, that's an exceedingly apt and instructive comparison.
The hooks that went into libpq resulted from several iterations of
design against a real, live, working application for those hooks.
The proposed rmgr patch is apparently suffering from no such handicap
as having been proven to satisfy the needs of real code :-(

There are other recent examples of proposed hooks that in fact
failed to be useful because of some oversight or other, and it was
not until we insisted on seeing a live use of the hooks that this
became apparent.  (IIRC, one or both of the planner-related hooks
that are new in 8.4 had such issues.)

I generally agree that pluggable rmgr support would be a good idea,
but I would much rather put off making the hooks until we have a live
application for them to prove that they are useful and usable.  If
we make a hook now sans test case, then what happens if we discover
later that it's not quite right?  We'd have to figure out whether there's
a need for backwards-compatible behavior, and we will have a hard time
knowing whether there are any live uses of the hook in the field.

So my take on this is to wait.  If it were actually needed by the hot
standby code then of course the above argument would be wrong, but
what I gather from the discussion is that it's not.

regards, tom lane



Re: [HACKERS] Pluggable Indexes

2009-01-21 Thread Gregory Stark
Simon Riggs si...@2ndquadrant.com writes:

 The original design of Postgres allowed pluggable index access methods,
 but that capability has not been brought forward to allow for WAL. This
 patch would bridge that gap.

Well I think what people do is what GIST did early on -- they just don't
support recoverability until they get merged into core.

Nonetheless this *would* be a worthwhile problem to put effort into solving. I
agree that there are lots of exotic index methods out there that it would be
good to be able to develop externally.

But to do that we need an abstract interface that doesn't depend on internal
data structures, not a generic plugin facility that allows the plugin to
hijack the whole system.

We need something more like indexams which provides a set of call points which
do specific functions, only get called when they're needed, and are expected
to only do the one thing they've been asked to do.

This could be a bit tricky since the catalog isn't available to the WAL replay
system. We can't just store the info needed in the pg_am catalog. And it
has to span all the databases in the cluster in any case.

Perhaps this should be solved along with the plugins thread. Binary modules
could have some way to register their rmgr id so you could guarantee that
there aren't two plugins with conflicting rmgr ids or version mismatches.


-- 
  Gregory Stark
  EnterpriseDB  http://www.enterprisedb.com
  Ask me about EnterpriseDB's Slony Replication support!



Re: [HACKERS] Pluggable Indexes

2009-01-21 Thread Simon Riggs

On Wed, 2009-01-21 at 14:57 +0000, Gregory Stark wrote:
 But to do that we need an abstract interface that doesn't depend on
 internal data structures, not a generic plugin facility that allows
 the plugin to hijack the whole system.
 
 We need something more like indexams which provides a set of call
 points which do specific functions, only get called when they're
 needed, and are expected to only do the one thing they've been asked
 to do.

Really this is just ridiculous scare-mongering. Hijack the whole system?

The patch takes special care to allow calls to the rmgr functions only
from the startup process. The APIs are exactly like the indexams and
*are* called only in specific ways, at specific times. At your earlier
request I put in filters to prevent WAL inserts for plugins that didn't
exist, ensuring that all WAL writes were crash recoverable.

You can already do all the weird stuff you like with index AMs, like
send emails to the Pope on every row insert. I can already create an
in-memory index for example. How exactly does the rmgr interface give more
power? The structure of the function pointers is identical to the
indexAM code...

-- 
 Simon Riggs   www.2ndQuadrant.com
 PostgreSQL Training, Services and Support




Re: [HACKERS] Pluggable Indexes

2009-01-21 Thread Heikki Linnakangas

Gregory Stark wrote:

 But to do that we need an abstract interface that doesn't depend on internal
 data structures, not a generic plugin facility that allows the plugin to
 hijack the whole system.

 We need something more like indexams which provides a set of call points which
 do specific functions, only get called when they're needed, and are expected
 to only do the one thing they've been asked to do.


That's called GiST. ;-)

--
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com



Re: [HACKERS] Pluggable Indexes

2009-01-21 Thread Andrew Sullivan
None of this is Any of My Business any more, but

On Wed, Jan 21, 2009 at 03:44:15PM +0000, Simon Riggs wrote:

 The patch takes special care to allow calls to the rmgr functions only
 from the startup process. The APIs are exactly like the indexams and
 *are* called only in specific ways, at specific times. At your earlier
 request I put in filters to prevent WAL inserts for plugins that didn't
 exist, ensuring that all WAL writes were crash recoverable.

I haven't even started to think about looking at the code, but I buy
Simon's argument here.  The Pg project is at great pains to point out
how the extensible PL support and custom datatypes are such big
deals.  So why is pluggable index support not also a good thing?

I take no position on the merits of the proposed patch, which I do not
pretend to understand.  But it'd be nice to see opponents distinguish
between "bad idea in principle" and "bad idea in this case".  If
you're arguing the former, clarifying why the analogies aren't
relevant would be helpful.

A

-- 
Andrew Sullivan
a...@crankycanuck.ca



Re: [HACKERS] Pluggable Indexes

2009-01-21 Thread Ron Mayer
Gregory Stark wrote:
 Simon Riggs si...@2ndquadrant.com writes:
 
 The original design of Postgres allowed pluggable index access methods,
 but that capability has not been brought forward to allow for WAL. This
 patch would bridge that gap.
 
 Well I think what people do is what GIST did early on -- they just don't
 support recoverability until they get merged into core.

What other constraints are there on such non-in-core indexes?  Early (2005)
GIST indexes were very painful in production environments because vacuuming
them held locks for a *long* time (IIRC, an hour or so on my database) on
the indexes, locking out queries.  Was that just a shortcoming of the
implementation, or was it a side-effect of them not supporting recoverability?
If the latter, I think that's a good reason to try to avoid developing new
index types the same way the GIST guys did.





Re: [HACKERS] Pluggable Indexes

2009-01-21 Thread Heikki Linnakangas

Ron Mayer wrote:

 Early (2005)
 GIST indexes were very painful in production environments because vacuuming
 them held locks for a *long* time (IIRC, an hour or so on my database) on
 the indexes locking out queries.  Was that just a shortcoming of the
 implementation, or was it a side-effect of them not supporting recoverability?


The former.

--
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com



Re: [HACKERS] Pluggable Indexes

2009-01-21 Thread Oleg Bartunov

On Wed, 21 Jan 2009, Simon Riggs wrote:

 On Wed, 2009-01-21 at 21:45 +0200, Heikki Linnakangas wrote:
  Ron Mayer wrote:
   Early (2005)
   GIST indexes were very painful in production environments because vacuuming
   them held locks for a *long* time (IIRC, an hour or so on my database) on
   the indexes locking out queries.  Was that just a shortcoming of the
   implementation, or was it a side-effect of them not supporting recoverability?

  The former.

 In the current way of thinking early-GIST would never have been
 committed and as a result we would not have PostGIS. Yes, early index
 implementations can be bad and they scare the hell out of me. That's
 exactly why I want to keep them out of core, so they don't need to be
 perfect, they can come with all sorts of health warnings.


I'm rather keen on Pg extensibility, which has allowed Teodor and me to
work on many extensions. Yes, the first GiST, which we inherited from early
academic research, was more like a toy. We still have several TODO
items about the GiST interface (incorporate SP-GiST).
I'm not sure about the specific patch Simon advocates, but as long as it
does not introduce any threat to the health of the whole database cluster
(for example, WAL spamming) I think we can apply it.
Another question: why not improve GiST to allow support of more indexes?

Bitmap indexes could be implemented using GiST.






Regards,
Oleg
Oleg Bartunov, Research Scientist, Head of AstroNet (www.astronet.ru),
Sternberg Astronomical Institute, Moscow University, Russia
Internet: o...@sai.msu.su, http://www.sai.msu.su/~megera/
phone: +007(495)939-16-83, +007(495)939-23-83



Re: [HACKERS] Pluggable Indexes

2009-01-21 Thread Simon Riggs

On Thu, 2009-01-22 at 00:29 +0300, Oleg Bartunov wrote:

 I'm rather keen on Pg extensibility, which has allowed Teodor and me to
 work on many extensions. Yes, the first GiST, which we inherited from early
 academic research, was more like a toy. We still have several TODO
 items about the GiST interface (incorporate SP-GiST).

Sounds good.

 I'm not sure about the specific patch Simon advocates, but as long as it
 does not introduce any threat to the health of the whole database cluster
 (for example, WAL spamming) I think we can apply it.

Currently you can write any crap you want to WAL from any plugin, as
long as it looks a lot like an existing WAL message type. If you crash
then we'll read that crap and (probably) crash again. That is already a
risk.

The rmgr plugin provides a way to handle user-defined WAL messages. The
patch is recovery-side only and is designed to complement the indexAM
APIs, which are normal-running-side only. The best way to think of it is
as another 5 functions on the index access method interface that allow you
to implement recoverable index plugins. (Remember that dynamic index
plugins are already allowed by Postgres.)

So the patch does not provide any additional way of *writing* WAL, it
just provides a way of reading it and then taking action.

Rmgr plugins would allow you to simply ignore certain kinds of WAL,
apply data in a user-defined manner, filter it, etc. So if you come
across a buggy index, you can turn off the WAL for that index type and
then recover the database without those indexes. Or dynamically patch
the code for that index type and recover. You'll get Postgres back up
faster with this patch than without it, in many cases.
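
The "turn off the WAL for that index type" case can be pictured with a toy
model in Python. This is only an illustration of swapping one table entry,
not the patch itself: `recover`, `redo_table`, the ids and the record layout
are hypothetical.

```python
from typing import Callable, Dict, List, Tuple

# Toy model only; names, ids and the record layout are hypothetical.
Record = Tuple[int, str]   # (rmgr id, payload)
Redo = Callable[[str], None]


def recover(records: List[Record], redo_table: Dict[int, Redo]) -> None:
    """Replay every record through whatever redo function its rmgr id maps to."""
    for rmid, payload in records:
        redo_table[rmid](payload)


replayed: List[str] = []
skipped: List[str] = []

HEAP, BUGGY_INDEX = 1, 2   # arbitrary ids for the example

redo_table: Dict[int, Redo] = {
    HEAP: lambda p: replayed.append("heap " + p),
    BUGGY_INDEX: lambda p: replayed.append("index " + p),
}

# "Turn off" the buggy index type: swap its redo for one that only notes the skip.
redo_table[BUGGY_INDEX] = lambda p: skipped.append(p)

wal = [
    (HEAP, "insert row 1"),
    (BUGGY_INDEX, "insert entry 1"),
    (HEAP, "insert row 2"),
]
recover(wal, redo_table)

print(replayed)   # ['heap insert row 1', 'heap insert row 2']
print(skipped)    # ['insert entry 1']
```

The heap changes are all applied, while records for the disabled index type
are passed over, so indexes of that type would presumably need rebuilding
once recovery completes.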

 Another question: why not improve GiST to allow support of more indexes?
 bitmap indexes could be implemented using GiST.

I'm not advocating any particular type of index here, just the ability
to make index plugins robust. There is no other way of doing this; it
can't be done by an external module, for example.

I'll avoid discussing index design with you :-)

-- 
 Simon Riggs   www.2ndQuadrant.com
 PostgreSQL Training, Services and Support




Re: [HACKERS] Pluggable Indexes

2009-01-21 Thread Gregory Stark
Josh Berkus j...@agliodbs.com writes:

 Right.  And I'm saying that pluggability is PostgreSQL's main reason for
 existence, if you look at our place in the future of databases.  So it's worth
 paying *some* cost, provided that the cost/benefit ratio works for the
 particular patch.

I agree that pluggability is a huge deal for Postgres. But note that the
interface is critical. If we provided a plugin architecture for functions and
operators which was simply a hook where you replaced part of the
infrastructure of the parser and executor it would be pointless. 

Instead we provide an interface where your function has to know as little as
possible about the rest of the system. And the parser and executor get enough
information about your function that they can do most of the work. That you
can create a new operator in Postgres *without* knowing how operators actually
are implemented and without worrying about what other operators exist is what
makes the feature so useful.

This is made a lot harder with WAL because a) it spans the entire cluster, not
just a database so any meta-information has to be stored somewhere global and
b) the consequences for getting something wrong are so much more dire. The
entire cluster is dead and can't even be restored from backup.

 To rephrase: I can't judge the rmgr patch one way or the other.  I'm only
 objecting to the idea expressed by Heikki and others that pluggable indexes
 are stupid and unnecessary.

Well we support pluggable indexes -- they just can't be recoverable right now.
Presumably if they're merged into the core database they would have
recoverability added like how GIST progressed.

-- 
  Gregory Stark
  EnterpriseDB  http://www.enterprisedb.com
  Ask me about EnterpriseDB's On-Demand Production Tuning



Re: [HACKERS] Pluggable Indexes

2009-01-21 Thread Simon Riggs

On Thu, 2009-01-22 at 00:00 +0000, Gregory Stark wrote:

 But note that the interface is critical.

Yes, it is.

The existing rmgr code provides for 5 separate calls that a module needs
to implement to make an access method recoverable. btree, hash, gist and
gin already implement that API.

I haven't invented a new interface at all. All the patch does is expose
the existing API for plugins, allowing them to act in exactly the same
ways that the existing index types do.
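
A toy model of that idea, in Python rather than the C of the real code, might
look like the following. The five callback names are assumptions based on a
recollection of the RmgrData struct of that era (redo, desc, startup,
cleanup, safe_restartpoint); `ToyRmgr`, `rmgr_table`, the ids and the record
format are hypothetical and exist only for illustration.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

# Toy model only: the real table is a C array of function pointers.
# The callback names are assumed, and everything else here is hypothetical.


@dataclass
class ToyRmgr:
    name: str
    redo: Callable[[dict], None]            # apply one WAL record
    desc: Callable[[dict], str]             # describe a record for debugging
    startup: Callable[[], None]             # called before replay begins
    cleanup: Callable[[], None]             # called after replay ends
    safe_restartpoint: Callable[[], bool]   # is a restartpoint safe right now?


applied: List[str] = []   # stands in for "changes made to index pages"


def make_rmgr(name: str) -> ToyRmgr:
    """Build an entry whose redo just records what it was asked to apply."""
    return ToyRmgr(
        name=name,
        redo=lambda rec: applied.append(f"{name}:{rec['payload']}"),
        desc=lambda rec: f"{name} record: {rec['payload']}",
        startup=lambda: None,
        cleanup=lambda: None,
        safe_restartpoint=lambda: True,
    )


# Built-in entries, keyed by a small integer rmgr id (ids are arbitrary here).
rmgr_table: Dict[int, ToyRmgr] = {1: make_rmgr("btree"), 2: make_rmgr("gin")}

# A plugin loaded at startup adds its own entry under an unused id.
rmgr_table[200] = make_rmgr("myindex")


def replay(records: List[dict]) -> None:
    """Dispatch each record to the redo callback of its resource manager."""
    for rmgr in rmgr_table.values():
        rmgr.startup()
    for rec in records:
        rmgr_table[rec["rmid"]].redo(rec)
    for rmgr in rmgr_table.values():
        rmgr.cleanup()


replay([
    {"rmid": 1, "payload": "insert k1"},
    {"rmid": 200, "payload": "insert k2"},
])
print(applied)   # ['btree:insert k1', 'myindex:insert k2']
```

The plugin's entry goes through exactly the same dispatch path as the
built-in ones, which is the point being made above.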

If you have patch review comments about additional requirements for that
API, that is fine. But saying the API is wrong is not a reason to reject
the patch. It's a reason to change the patch.

 the consequences for getting something wrong are so much more dire.
 The entire cluster is dead and can't even be restored from backup.

Not true. If you decide to use a pluggable index and the plugin breaks,
you can turn off that index type and continue recovering the database.
If GIN breaks for example, you can simply bypass it and continue. So the
rmgr patch provides you a mechanism for recovering an existing system in
a way that is not currently possible - no data loss, just loss of
damaged indexes. And it provides an escape hatch if you use a pluggable
index and it breaks.

-- 
 Simon Riggs   www.2ndQuadrant.com
 PostgreSQL Training, Services and Support

