Re: RFC: Additional Directory for Extensions

2024-08-29 Thread David E. Wheeler
On Aug 27, 2024, at 22:24, Craig Ringer  wrote:

> `pg_config` only cares about compile-time settings, so I would not
> expect its output to change.

Right, of course that’s its original purpose, but extensions depend on it to 
determine where to install extensions. Not just PGXS, but also pgrx and various 
Makefile customizations I’ve seen in the wild.

> I suspect we'd have to add PGXS extension-path awareness if going for
> per-extension subdir support. I'm not sure it makes sense to teach
> `pg_config` about this, since it'd need to have a different mode like
> 
>pg_config --extension myextname --extension-sharedir
> 
> since the extension's "sharedir" is
> $basedir/extensions/myextension/share or whatever.

Right. PGXS would just need to know where to put the directory for an 
extension. There should be a default for the project, and then it can be 
overridden with something like DESTDIR (but without full paths under that 
prefix).

> Supporting this looks to be a bit intrusive in the makefiles,
> requiring them to differentiate between "share dir for extensions" and
> "share dir for !extensions", etc. I'm not immediately sure if it can
> be done in a way that transparently converts unmodified extension PGXS
> makefiles to target the new paths; it might require an additional
> define, or use of new variables and an ifdef block to add
> backwards-compat to the extension makefile for building on older
> postgres.

Yeah, might just have to be an entirely new thing, though it sure would be nice 
for existing PGXS-using Makefiles to do the right thing. Maybe for the new 
version of the server with the proposed new pattern it would dispatch to the 
new thing somehow without modifying all the rest of its logic.

> Right. The proposed structure is rather a bigger change than I was
> thinking when I suggested supporting an extension search path not just
> a single extra path. But it's also a cleaner proposal; the
> per-extension directory would make it easier to ensure that the
> extension control file, sql scripts, and dynamic library all match the
> same extension and version if multiple ones are on the path. Which is
> desirable when doing things like testing a new version of an in-core
> extension.

💯

> Right. And I've been on the receiving end of having a small, focused
> patch derailed by others jumping in and scope-exploding it into
> something completely different to solve a much wider but related
> problem.

I’m not complaining, I would definitely prefer to see consensus on a cleaner 
proposal along the lines we’ve discussed and a commitment to time from parties 
able to get it done in time for v18. I’m willing to help where I can with my 
baby C! Failing that, we can fall back on the destdir patch.

> I'm definitely not trying to stand in the way of progress with this; I
> just want to make sure that it doesn't paint us into a corner that
> prevents a more general solution from being adopted later. That's why
> I'm suggesting making the config a multi-value string (list of paths)
> and raising a runtime "ERROR: searching multiple paths for extensions
> not yet supported" or something if >1 path configured.
> 
> If that doesn't work, no problem.

I think the logic would have to be different, so they’d be different GUCs with 
their own semantics. But if the core team and committers are on board with the 
general idea of search paths and per-extension directory organization, it would 
be best to avoid getting stuck with maintaining the current patch’s GUC.

OTOH, we could get it committed now and revert it later if we get the better 
thing done and committed.

>> I think we should get some clarity on the proposal, and then consensus, as 
>> you say. I say “get some clarity” because my proposal doesn’t require 
>> recursing, and I’m not sure why it’d be needed.
> 
> From what you and Gabriele are discussing (which I agree with), it wouldn’t.

Ah, great.

I’ll try to put some thought into a more formal proposal in a new thread next 
week. Unless your Gabriele beats me to it 😂.

Best,

David





Re: RFC: Additional Directory for Extensions

2024-08-27 Thread Craig Ringer
On Wed, 28 Aug 2024 at 03:26, David E. Wheeler  wrote:
> Right. ISTM it could complicate PGXS quite a bit. If we set, say,
>
> SET extension_search_path = $extsdir, /mnt/extensions/pg16, 
> /mnt/extensions/pg16/gosuperfast/extensions;
>
> What should be the output of `pg_config --sharedir`?

`pg_config` only cares about compile-time settings, so I would not
expect its output to change.

I suspect we'd have to add PGXS extension-path awareness if going for
per-extension subdir support. I'm not sure it makes sense to teach
`pg_config` about this, since it'd need to have a different mode like

pg_config --extension myextname --extension-sharedir

since the extension's "sharedir" is
$basedir/extensions/myextension/share or whatever.

Supporting this looks to be a bit intrusive in the makefiles,
requiring them to differentiate between "share dir for extensions" and
"share dir for !extensions", etc. I'm not immediately sure if it can
be done in a way that transparently converts unmodified extension PGXS
makefiles to target the new paths; it might require an additional
define, or use of new variables and an ifdef block to add
backwards-compat to the extension makefile for building on older
postgres.

> But that leaves the issue of directory organization. The current patch is 
> just a prefix for various PGXS/pg_config directories; the longer-term 
> proposal I’ve made here is not a prefix for sharedir, mandir, etc., but a 
> directory that contains directories named for extensions. So even if we were 
> to take this approach, the directory structure would vary.

Right. The proposed structure is rather a bigger change than I was
thinking when I suggested supporting an extension search path not just
a single extra path. But it's also a cleaner proposal; the
per-extension directory would make it easier to ensure that the
extension control file, sql scripts, and dynamic library all match the
same extension and version if multiple ones are on the path. Which is
desirable when doing things like testing a new version of an in-core
extension.

> OTOH, we have this patch now, and this other stuff is just a proposal. Actual 
> code trumps ideas in my mind.

Right. And I've been on the receiving end of having a small, focused
patch derailed by others jumping in and scope-exploding it into
something completely different to solve a much wider but related
problem.

I'm definitely not trying to stand in the way of progress with this; I
just want to make sure that it doesn't paint us into a corner that
prevents a more general solution from being adopted later. That's why
I'm suggesting making the config a multi-value string (list of paths)
and raising a runtime "ERROR: searching multiple paths for extensions
not yet supported" or something if >1 path configured.

If that doesn't work, no problem.

> I think we should get some clarity on the proposal, and then consensus, as 
> you say. I say “get some clarity” because my proposal doesn’t require 
> recursing, and I’m not sure why it’d be needed.

>From what you and Gabriele are discussing (which I agree with), it wouldn't.




Re: RFC: Additional Directory for Extensions

2024-08-27 Thread David E. Wheeler
On Aug 26, 2024, at 17:35, Craig Ringer  wrote:

> This looks like a good suggestion to me, it would make the packaging,
> distribution and integration of 3rd party extensions significantly
> easier without any obvious large or long term cost.

Yes!

> Also PGXS, the windows extension build support, and 3rd party cmake
> builds etc. But not by the looks a drastic change.

Right. ISTM it could complicate PGXS quite a bit. If we set, say, 

SET extension_search_path = $extsdir, /mnt/extensions/pg16, 
/mnt/extensions/pg16/gosuperfast/extensions;

What should be the output of `pg_config --sharedir`?

> My only real concern with the current patch is that it limits
> searching for extensions to one additional configurable location,
> which is inconsistent with how things like the dynamic_library_path
> works. Once in, it'll be difficult to change or extend for BC, and if
> someone wants to add a search path capability it'll break existing
> configurations.

Agreed.

> Would it be feasible to define its configuration syntax as accepting a
> list of paths, but only implement the semantics for single-entry lists
> and ERROR on multiple paths? That way it could be extended w/o
> breaking existing configurations later.

I imagine it’s a simple matter of programming :-) But that leaves the issue of 
directory organization. The current patch is just a prefix for various 
PGXS/pg_config directories; the longer-term proposal I’ve made here is not a 
prefix for sharedir, mandir, etc., but a directory that contains directories 
named for extensions. So even if we were to take this approach, the directory 
structure would vary.

I suspect we’d have to name it differently and support both long-term. That, 
too me, is the main issue with this patch.

OTOH, we have this patch now, and this other stuff is just a proposal. Actual 
code trumps ideas in my mind.

> With that said, I'm not the one doing the work at the moment, and the
> functionality would definitely be helpful. If there's agreement on
> supporting a search-path or recursing into subdirectories I'd be
> willing to have a go at it, but I'm a bit stale on Pg's codebase now
> so I'd want to be fairly confident the work wouldn't just be thrown
> out.

I think we should get some clarity on the proposal, and then consensus, as you 
say. I say “get some clarity” because my proposal doesn’t require recursing, 
and I’m not sure why it’d be needed.

Best,

David






Re: RFC: Additional Directory for Extensions

2024-08-27 Thread David E. Wheeler
On Aug 27, 2024, at 04:56, Gabriele Bartolini 
 wrote:

> The extension image could follow a naming convention like this (order can be 
> adjusted): `-- version>-(-)`. For example, `pgvector-16-0.7.4-bookworm-1` would 
> represent the first image built in a repository for pgvector 0.7.4 for 
> PostgreSQL 16 on Debian Bookworm. If multi-arch images aren't desired, we 
> could incorporate the architecture somewhere in the naming convention.

Well now you’re just describing the binary distribution format RFC[1] (POC[2]) 
and multi-platform OCI distribution POC[3] :-)

> If we wanted to install multiple versions of an extension, we could mount 
> them in different directories, with the version included in the folder 
> name—for example, `pgvector-0.7.4` instead of just `pgvector`. However, I'm a 
> bit rusty with the extensions framework, so I'll need to check if this 
> approach is feasible and makes sense.

Right, if we decided to adopt this proposal, it might make sense to include the 
“default version” as part of the directory name. But there’s quite a lot of 
work between here and there.

Best,

David

[1]: https://github.com/pgxn/rfcs/pull/2
[2]: https://justatheory.com/2024/06/trunk-poc/
[3]: https://justatheory.com/2024/06/trunk-oci-poc/






Re: RFC: Additional Directory for Extensions

2024-08-27 Thread Gabriele Bartolini
Hi David,

Thanks for your email.

On Mon, 26 Aug 2024 at 16:07, David E. Wheeler 
wrote:

> I would assume BINDIR, DOCDIR, HTMLDIR, PKGLIBDIR, MANDIR, SHAREDIR, and
> perhaps LOCALEDIR.
>
> But even if it’s just one or two, the only proper way an extension
> directory would work, IME, is to define a directory-based structure for
> extensions, where every file for an extension is in a directory named for
> the extension, and subdirectories are defined for each of the above
> requisite file types. Something like:
>
> extension_name
> ├── control.ini
> ├── bin
> ├── doc
> ├── html
> ├── lib
> ├── local
> ├── man
> └── share
>

I'm really glad you proposed this publicly. I reached the same conclusion
the other day when digging deeper into the problem with a few folks from
CloudNativePG. Setting aside multi-arch images for now, if we could
reorganize the content of a single image (identified by OS distro,
PostgreSQL major version, and extension version) with a top-level directory
structure as you described, we could easily mount each image as a separate
volume.

The extension image could follow a naming convention like this (order can
be adjusted): `---(-)`. For example, `pgvector-16-0.7.4-bookworm-1`
would represent the first image built in a repository for pgvector 0.7.4
for PostgreSQL 16 on Debian Bookworm. If multi-arch images aren't desired,
we could incorporate the architecture somewhere in the naming convention.

This would allow multiple paths to work and keep all the files for an
> extension bundled together. It could also potentially allow for multiple
> versions of an extension to be installed at once, if we required the
> version to be part of the directory name.
>

If we wanted to install multiple versions of an extension, we could mount
them in different directories, with the version included in the folder
name—for example, `pgvector-0.7.4` instead of just `pgvector`. However, I'm
a bit rusty with the extensions framework, so I'll need to check if this
approach is feasible and makes sense.

Thanks,
Gabriele
-- 
Gabriele Bartolini
Vice President, Cloud Native at EDB
enterprisedb.com


Re: RFC: Additional Directory for Extensions

2024-08-26 Thread Craig Ringer
On Tue, 27 Aug 2024 at 02:07, David E. Wheeler  wrote:
> On Aug 21, 2024, at 19:07, Craig Ringer  wrote:

> But even if it’s just one or two, the only proper way an extension directory 
> would work, IME, is to define a directory-based structure for extensions, 
> where every file for an extension is in a directory named for the extension, 
> and subdirectories are defined for each of the above requisite file types.
> [...]
> I think this would be a much nicer layout for packaging, installing, and 
> managing extensions versus the current method of strewing files around to a 
> slew of different directories.

This looks like a good suggestion to me, it would make the packaging,
distribution and integration of 3rd party extensions significantly
easier without any obvious large or long term cost.

> But it would come at some cost, in terms of backward with the existing layout 
> (or migration to it), significant modification of the server to use the new 
> layout (and extension_search_path), and other annoyances like PATH and 
> MANPATH management.

Also PGXS, the windows extension build support, and 3rd party cmake
builds etc. But not by the looks a drastic change.

> Long term I think it would be worthwhile, but the current patch feels like a 
> decent interim step we could live with, solving most of the integration 
> problems (immutable servers, packaging testing, etc.) at the cost of a 
> slightly unexpected directory layout. What I mean by that is that the current 
> patch is pretty much just using extension_destdir as a prefix to all of those 
> directories from pg_config, so they never have to change, but it does mean 
> that you end up installing extensions in something like:
>
> /mnt/extensions/pg16/usr/share/postgresql/16
> /mnt/extensions/pg16/usr/include/postgresql

My only real concern with the current patch is that it limits
searching for extensions to one additional configurable location,
which is inconsistent with how things like the dynamic_library_path
works. Once in, it'll be difficult to change or extend for BC, and if
someone wants to add a search path capability it'll break existing
configurations.

Would it be feasible to define its configuration syntax as accepting a
list of paths, but only implement the semantics for single-entry lists
and ERROR on multiple paths? That way it could be extended w/o
breaking existing configurations later.

With that said, I'm not the one doing the work at the moment, and the
functionality would definitely be helpful. If there's agreement on
supporting a search-path or recursing into subdirectories I'd be
willing to have a go at it, but I'm a bit stale on Pg's codebase now
so I'd want to be fairly confident the work wouldn't just be thrown
out.

--
Craig Ringer




Re: RFC: Additional Directory for Extensions

2024-08-26 Thread David E. Wheeler
Hi Hackers,

Apologies for the delay in reply; I’ve been at the XOXO Festival and almost 
completely unplugged for the first time in ages. Happy to see this thread 
coming alive, though. Thank you Gabriele, Craig, and Jelte!

On Aug 21, 2024, at 19:07, Craig Ringer  wrote:

> So IMO this should be a _path_ to search for extension control files
> and SQL scripts.
> 
> If the current built-in default extension dir was exposed as a var
> $extdir like we do for $libdir, this might look something like this
> for local development and testing while working with a packaged
> postgres build:
> 
>SET extension_search_path = $extsdir, /opt/myapp/extensions,
> /usr/local/postgres/my-custom-extension/extensions;
>SET dynamic_library_path = $libdir, /opt/myapp/lib,
> /usr/local/postgres/my-custom-extension/lib

I would very much like something like this, but I’m not sure how feasible it is 
for a few reasons. The first, and most important, is that extensions are not 
limited to just a control file and SQL file. They also very often include:

* one or more shared library files
* documentation files
* binary files

And maybe more? How many of these directories might an extension install files 
into:

✦ ❯ pg_config | grep DIR | awk '{print $1}'
BINDIR
DOCDIR
HTMLDIR
INCLUDEDIR
PKGINCLUDEDIR
INCLUDEDIR-SERVER
LIBDIR
PKGLIBDIR
LOCALEDIR
MANDIR
SHAREDIR
SYSCONFDIR

I would assume BINDIR, DOCDIR, HTMLDIR, PKGLIBDIR, MANDIR, SHAREDIR, and 
perhaps LOCALEDIR.

But even if it’s just one or two, the only proper way an extension directory 
would work, IME, is to define a directory-based structure for extensions, where 
every file for an extension is in a directory named for the extension, and 
subdirectories are defined for each of the above requisite file types. 
Something like:

extension_name
├── control.ini
├── bin
├── doc
├── html
├── lib
├── local
├── man
└── share

This would allow multiple paths to work and keep all the files for an extension 
bundled together. It could also potentially allow for multiple versions of an 
extension to be installed at once, if we required the version to be part of the 
directory name.

I think this would be a much nicer layout for packaging, installing, and 
managing extensions versus the current method of strewing files around to a 
slew of different directories. But it would come at some cost, in terms of 
backward with the existing layout (or migration to it), significant 
modification of the server to use the new layout (and extension_search_path), 
and other annoyances like PATH and MANPATH management.

Long term I think it would be worthwhile, but the current patch feels like a 
decent interim step we could live with, solving most of the integration 
problems (immutable servers, packaging testing, etc.) at the cost of a slightly 
unexpected directory layout. What I mean by that is that the current patch is 
pretty much just using extension_destdir as a prefix to all of those 
directories from pg_config, so they never have to change, but it does mean that 
you end up installing extensions in something like

/mnt/extensions/pg16/usr/share/postgresql/16
/mnt/extensions/pg16/usr/include/postgresql

etc.

Best,

David





Re: RFC: Additional Directory for Extensions

2024-08-22 Thread Craig Ringer
On Fri, 23 Aug 2024 at 10:14, Craig Ringer
 wrote:
> On Thu, 22 Aug 2024 at 21:00, Gabriele Bartolini
>  wrote:
> > On Thu, 22 Aug 2024 at 09:32, Jelte Fennema-Nio  wrote:
> >> SET extension_search_path = /mnt/extensions/pg16/*
> >
> > That'd be great. +1.
>
> Agreed, that'd be handy, but not worth blocking the underlying capability for.
>
> Except possibly to the degree that the feature should reserve wildcard
> characters and require them to be escaped if they appear on a path, so
> there's no BC break if it's added later.

... though on second thoughts, it might make more sense to just
recursively search directories found under each path entry. Rules like
'search if a redundant trailing / is present' can be an option.

That way there's no icky path escaping needed for normal configuration.




Re: RFC: Additional Directory for Extensions

2024-08-22 Thread Craig Ringer
On Thu, 22 Aug 2024 at 21:00, Gabriele Bartolini
 wrote:
> On Thu, 22 Aug 2024 at 09:32, Jelte Fennema-Nio  wrote:
>> SET extension_search_path = /mnt/extensions/pg16/*
>
> That'd be great. +1.

Agreed, that'd be handy, but not worth blocking the underlying capability for.

Except possibly to the degree that the feature should reserve wildcard
characters and require them to be escaped if they appear on a path, so
there's no BC break if it's added later.

On Thu, 22 Aug 2024 at 21:00, Gabriele Bartolini
 wrote:
>
> Hi Jelte,
>
> On Thu, 22 Aug 2024 at 09:32, Jelte Fennema-Nio  wrote:
>>
>> It looks like you want one directory per extension, so that list would
>> get pretty long if you have multiple extensions. Maybe (as a follow up
>> change), we should start to support a * as a wildcard in both of these
>> GUCs. So you could say:
>>
>> SET extension_search_path = /mnt/extensions/pg16/*
>>
>> To mean effectively the same as you're describing above.
>
>
> That'd be great. +1.
>
> --
> Gabriele Bartolini
> Vice President, Cloud Native at EDB
> enterprisedb.com




Re: RFC: Additional Directory for Extensions

2024-08-22 Thread Gabriele Bartolini
Hi Jelte,

On Thu, 22 Aug 2024 at 09:32, Jelte Fennema-Nio  wrote:

> It looks like you want one directory per extension, so that list would
> get pretty long if you have multiple extensions. Maybe (as a follow up
> change), we should start to support a * as a wildcard in both of these
> GUCs. So you could say:
>
> SET extension_search_path = /mnt/extensions/pg16/*
>
> To mean effectively the same as you're describing above.
>

That'd be great. +1.

-- 
Gabriele Bartolini
Vice President, Cloud Native at EDB
enterprisedb.com


Re: RFC: Additional Directory for Extensions

2024-08-22 Thread Gabriele Bartolini
Hi Craig,

On Thu, 22 Aug 2024 at 01:07, Craig Ringer 
wrote:

> It's also very relevant for local development and testing.
>

Yep, which is the original goal of Christoph IIRC.


> It may be possible to weaken this restriction somewhat thanks to the
> upcoming
> https://kubernetes.io/blog/2024/08/16/kubernetes-1-31-image-volume-source/
> feature that permits additional OCI images to be mounted as read-only
> volumes on a workload. This would still only permit mounting at
> Pod-creation time, not runtime mounting and unmonuting, but means the
> base postgres image could be supplemented by mounting additional
> images for extensions.
>

I'm really excited about that feature, but it's still in the alpha stage.
However, I don't anticipate any issues for the future general availability
(GA) release. Regardless, we may need to consider a temporary solution that
is compatible with existing Kubernetes and possibly Postgres versions (but
that's beyond the purpose of this thread).

For example, one might mount image "postgis-vX.Y.Z" image onto base
> image "postgresql-16" if support for PostGIS is desired, without then
> having to bake every possible extension anyone might ever want into
> the base image. This solves all sorts of messy issues with upgrades
> and new version releases.
>

Yep.


> But for it to work, it must be possible to tell postgres to look in
> _multiple places_ for extension .sql scripts and control files. This
> is presently possible for modules (dynamic libraries, .so / .dylib /
> .dll) but without a way to also configure the path for extensions it's
> of very limited utility.
>

Agree.


> So IMO this should be a _path_ to search for extension control files
> and SQL scripts.
>

I like this. I also prefer the name `extension_search_path`.

For safety, it might make sense to impose the restriction that if an
> extension control file is found in a given directory, SQL scripts will
> also only be looked for in that same directory. That way there's no
> chance of accidentally mixing and matching SQL scripts from different
> versions of an extension if it appears twice on the extension search
> path in different places. The rule for loading SQL scripts would be:
>
> * locate first directory on path contianing matching extension control file
> * use this directory as the extension directory for all subsequent SQL
> script loading and running actions
>

It could work, but it requires some prototyping and exploration. I'm
willing to participate and use CloudNativePG as a test bed.

Cheers,
Gabriele
-- 
Gabriele Bartolini
Vice President, Cloud Native at EDB
enterprisedb.com


Re: RFC: Additional Directory for Extensions

2024-08-22 Thread Jelte Fennema-Nio
On Thu, 22 Aug 2024 at 01:08, Craig Ringer
 wrote:
> SET extension_search_path = $extsdir,
> /mnt/extensions/pg16/postgis-vX.Y/extensions,
> /mnt/extensions/pg16/gosuperfast/extensions;

It looks like you want one directory per extension, so that list would
get pretty long if you have multiple extensions. Maybe (as a follow up
change), we should start to support a * as a wildcard in both of these
GUCs. So you could say:

SET extension_search_path = /mnt/extensions/pg16/*

To mean effectively the same as you're describing above.




Re: RFC: Additional Directory for Extensions

2024-08-21 Thread Craig Ringer
On Thu, 22 Aug 2024 at 08:00, Gabriele Bartolini
 wrote:
>
> Hi everyone,
>
> Apologies for only starting to look into this now. Thanks, David, for pushing 
> this forward.


 100%. I've wanted this for some time but never had time to cook up a patch.

> I want to emphasize the importance of this patch for the broader adoption of 
> extensions in immutable container environments, such as those used by the 
> CloudNativePG operator in Kubernetes.


It's also very relevant for local development and testing.

Right now postgres makes it impossible to locally compile and install
an extension for a distro-packaged postgres (whether from upstream
repos or PGDG repos) without dirtying the distro-managed filesystem
subtrees with local changes under /usr etc, because it cannot be
configured to look for locally installed extensions on non-default
paths.

> To provide some context, one of the key principles of CloudNativePG is that 
> containers, once started, cannot be modified—this includes the installation 
> of Postgres extensions and their libraries. This restriction prevents us from 
> adding extensions on the fly, requiring them to be included in the main 
> PostgreSQL operand image. As a result, users who need specific extensions 
> must build custom images through automated pipelines (see: 
> https://cloudnative-pg.io/blog/creating-container-images/).


It may be possible to weaken this restriction somewhat thanks to the
upcoming 
https://kubernetes.io/blog/2024/08/16/kubernetes-1-31-image-volume-source/
feature that permits additional OCI images to be mounted as read-only
volumes on a workload. This would still only permit mounting at
Pod-creation time, not runtime mounting and unmonuting, but means the
base postgres image could be supplemented by mounting additional
images for extensions.

For example, one might mount image "postgis-vX.Y.Z" image onto base
image "postgresql-16" if support for PostGIS is desired, without then
having to bake every possible extension anyone might ever want into
the base image. This solves all sorts of messy issues with upgrades
and new version releases.

But for it to work, it must be possible to tell postgres to look in
_multiple places_ for extension .sql scripts and control files. This
is presently possible for modules (dynamic libraries, .so / .dylib /
.dll) but without a way to also configure the path for extensions it's
of very limited utility.

> We’ve been considering ways to improve this process for some time. The 
> direction we're exploring involves mounting an ephemeral volume that contains 
> the necessary extensions (namely $sharedir and $pkglibdir from pg_config). 
> These volumes would be created and populated with the required extensions 
> when the container starts and destroyed when it shuts down. To make this 
> work, each extension must be independently packaged as a container image 
> containing the appropriate files for a specific extension version, tailored 
> to the architecture, distribution, OS version, and Postgres version.


Right. And there might be more than one of them.

So IMO this should be a _path_ to search for extension control files
and SQL scripts.

If the current built-in default extension dir was exposed as a var
$extdir like we do for $libdir, this might look something like this
for local development and testing while working with a packaged
postgres build:

SET extension_search_path = $extsdir, /opt/myapp/extensions,
/usr/local/postgres/my-custom-extension/extensions;
SET dynamic_library_path = $libdir, /opt/myapp/lib,
/usr/local/postgres/my-custom-extension/lib

or in the container extensions case, something like:

SET extension_search_path = $extsdir,
/mnt/extensions/pg16/postgis-vX.Y/extensions,
/mnt/extensions/pg16/gosuperfast/extensions;
SET dynamic_library_path = $libdir,
/mnt/extensions/pg16/postgis-vX.Y/lib,
/mnt/extensions/pg16/gosuperfast/lib;

For safety, it might make sense to impose the restriction that if an
extension control file is found in a given directory, SQL scripts will
also only be looked for in that same directory. That way there's no
chance of accidentally mixing and matching SQL scripts from different
versions of an extension if it appears twice on the extension search
path in different places. The rule for loading SQL scripts would be:

* locate first directory on path contianing matching extension control file
* use this directory as the extension directory for all subsequent SQL
script loading and running actions

--
Craig Ringer
EnterpriseDB




Re: RFC: Additional Directory for Extensions

2024-08-21 Thread Nathan Bossart
On Tue, Jun 25, 2024 at 12:12:33PM +0200, Alvaro Herrera wrote:
> archive_command and so on: we could disable these too.  Nathan did some
> work to implement those using dynamic libraries, so it shouldn't be too
> much of a loss; anything that is done with a shell script can also be
> done with a small library.  Those libraries can be made safe.
> If there are other ways to invoke shell commands from GUCs, let's add
> the ability to use libraries for those too.

Sorry, I just noticed this message.  I recently withdrew my patch set [0]
for using a library instead of shell commands for restore_command,
archive_cleanup_command, and recovery_end_command, as it had sat idle for a
very long time.  If/when there's interest, I'd be happy to pick it up
again.

[0] https://postgr.es/m/ZkwLqichtySV5kF3%40nathan-air.lan

-- 
nathan




Re: RFC: Additional Directory for Extensions

2024-08-21 Thread Gabriele Bartolini
Hi everyone,

Apologies for only starting to look into this now. Thanks, David, for
pushing this forward.

I want to emphasize the importance of this patch for the broader adoption
of extensions in immutable container environments, such as those used by
the CloudNativePG operator in Kubernetes.

To provide some context, one of the key principles of CloudNativePG is that
containers, once started, cannot be modified—this includes the installation
of Postgres extensions and their libraries. This restriction prevents us
from adding extensions on the fly, requiring them to be included in the
main PostgreSQL operand image. As a result, users who need specific
extensions must build custom images through automated pipelines (see:
https://cloudnative-pg.io/blog/creating-container-images/).

We’ve been considering ways to improve this process for some time. The
direction we're exploring involves mounting an ephemeral volume that
contains the necessary extensions (namely $sharedir and $pkglibdir from
pg_config). These volumes would be created and populated with the required
extensions when the container starts and destroyed when it shuts down. To
make this work, each extension must be independently packaged as a
container image containing the appropriate files for a specific extension
version, tailored to the architecture, distribution, OS version, and
Postgres version.

I’m committed to thoroughly reviewing this patch, testing it with
CloudNativePG and a few extensions, and providing feedback as soon as
possible.

Best,
Gabriele

On Mon, 8 Jul 2024 at 18:02, David E. Wheeler  wrote:

> On Jun 25, 2024, at 18:31, David E. Wheeler  wrote:
>
> > For those who prefer a GitHub patch review experience, see this PR:
> >
> >  https://github.com/theory/postgres/pull/3/files
>
> Rebased and restored PGC_SUSET in the attached v5 patch, plus noted the
> required privileges in the docs.
>
> Best,
>
> David
>
>
>
>

-- 
Gabriele Bartolini
Vice President, Cloud Native at EDB
enterprisedb.com


Re: RFC: Additional Directory for Extensions

2024-07-08 Thread David E. Wheeler
On Jun 25, 2024, at 18:31, David E. Wheeler  wrote:

> For those who prefer a GitHub patch review experience, see this PR:
> 
>  https://github.com/theory/postgres/pull/3/files

Rebased and restored PGC_SUSET in the attached v5 patch, plus noted the 
required privileges in the docs.

Best,

David



v5-0001-Add-extension_destdir-GUC.patch
Description: Binary data




Re: RFC: Additional Directory for Extensions

2024-06-26 Thread Jelte Fennema-Nio
On Tue, 25 Jun 2024 at 19:33, David E. Wheeler  wrote:
>
> On Jun 24, 2024, at 5:32 PM, Jelte Fennema-Nio  wrote:
>
> > Still, for the sake of completeness it might make sense to support
> > this whole list in extension_destdir. (assuming it's easy to do)
>
> It should be with the current patch, which just uses a prefix to paths in 
> `pg_config`.

Ah alright, I think it confused me because I never saw bindir being
used. But as it turns out the current backend code never uses bindir.
So that makes sense. I guess to actually use the binaries from the
extension_destdir/$BINDIR the operator needs to set PATH accordingly,
or the extension needs to be changed to support extension_destdir.

It might be nice to add a helper function to find binaries in BINDIR,
now that the resolution logic is more complex. Even if postgres itself
doesn't use it. That would make it easier for extensions to be
modified to support extension_distdir. Something like
find_bindir_executable(char *name)




Re: RFC: Additional Directory for Extensions

2024-06-26 Thread Jelte Fennema-Nio
On Wed, 26 Jun 2024 at 00:32, David E. Wheeler  wrote:
> In other news, here’s an updated patch that expands the documentation to 
> record that the destination directory is a prefix, and full paths should be 
> used under it. Also take the opportunity to document the PGXS DESTDIR 
> variable as the thing to use to install files under the destination directory.

Docs are much clearer now thanks.

full = substitute_libpath_macro(name);
+   /*
+* If extension_destdir is set, try to find the file there first
+*/
+   if (*extension_destdir != '\0')
+   {
+   full2 = psprintf("%s%s", extension_destdir, full);
+   if (pg_file_exists(full2))
+   {
+   pfree(full);
+   return full2;
+   }
+   pfree(full2);
+   }

I think this should be done differently. For two reasons:
1. I don't think extension_destdir should be searched when $libdir is
not part of the name.
2. find_in_dynamic_libpath currently doesn't use extension_destdir at
all, so if there is no slash in the filename we do not search
extension_destdir.

I feel like changing the substitute_libpath_macro function a bit (or
adding a new similar function) is probably the best way to achieve
that.

We should also check somewhere (probably GUC check hook) that
extension_destdir is an absolute path.

> It still requires a server restart;

When reading the code I see no reason why this cannot be PGC_SIGHUP.
Even though it's probably not needed to change on a running server, I
think it's better to allow that. Even just so people can disable it if
necessary for some reason without restarting the process.

> I can change it back to superuser-only if that’s the consensus.

It still is GUC_SUPERUSER_ONLY, right?

> For those who prefer a GitHub patch review experience, see this PR:
>
>   https://github.com/theory/postgres/pull/3/files

Sidenote: The "D" link for each patch on cfbot[1] now gives a similar
link for all commitfest entries[2].

[1]: http://cfbot.cputube.org/
[2]: https://github.com/postgresql-cfbot/postgresql/compare/cf/4913~1...cf/4913




Re: RFC: Additional Directory for Extensions

2024-06-25 Thread David E . Wheeler
On Jun 25, 2024, at 10:43 AM, Robert Haas  wrote:

> If we want to work on making the sorts of changes that
> you're proposing, let's do it on a separate thread. It's not going to
> be meaningfully harder to move in that direction after some patch like
> this than it is today.

I appreciate this separation of concerns, Robert.

In other news, here’s an updated patch that expands the documentation to record 
that the destination directory is a prefix, and full paths should be used under 
it. Also take the opportunity to document the PGXS DESTDIR variable as the 
thing to use to install files under the destination directory.

It still requires a server restart; I can change it back to superuser-only if 
that’s the consensus.

For those who prefer a GitHub patch review experience, see this PR:

  https://github.com/theory/postgres/pull/3/files

Best,

David



v4-0001-Add-extension_destdir-GUC.patch
Description: Binary data




Re: RFC: Additional Directory for Extensions

2024-06-25 Thread David E. Wheeler
On Jun 24, 2024, at 5:32 PM, Jelte Fennema-Nio  wrote:

> Still, for the sake of completeness it might make sense to support
> this whole list in extension_destdir. (assuming it's easy to do)

It should be with the current patch, which just uses a prefix to paths in 
`pg_config`. So if SHAREDIR is set to /usr/share/postgresql/16 and 
extension_destdir is set to /mount/ext, then Postgres will look for files in 
/mount/ext/usr/share/postgresql/16. The same rule applies (or should apply) for 
all other pg_config directory configs and where the postmaster looks for 
specific files. And PGXS already supports installing files in these locations, 
thanks to its DESTDIR param.

(I don’t know how it works on Windows, though.)

That said, this is very much a pattern designed for RPM and Debian package 
management patterns, and not for actually installing and managing extensions. 
And maybe that’s fine for now, as it can still be used to address the 
immutability problems descried in the original post in this thread.

Ultimately, I’d like to figure out a way to more tidily organize installed 
extension files, but I think that, too, might be a separate thread.

Best,

David





Re: RFC: Additional Directory for Extensions

2024-06-25 Thread Robert Haas
On Tue, Jun 25, 2024 at 6:12 AM Alvaro Herrera  wrote:
> Now, I'm not saying that this is an easy journey.  But if we don't
> start, we're not going to get there.

I actually kind of agree with you. I think I've said something similar
in a previous email to the list somewhere. But I don't agree that this
patch should be burdened with taking the first step. We seem to often
find reasons why patches that packagers for prominent distributions
are carrying shouldn't be put into core, and I think that's a bad
habit. They're not going to stop applying those packages because we
refuse to put suitable functionality in core; we're just creating a
situation where lots of people are running slightly patched versions
of PostgreSQL instead of straight-up PostgreSQL. That's not improving
anything. If we want to work on making the sorts of changes that
you're proposing, let's do it on a separate thread. It's not going to
be meaningfully harder to move in that direction after some patch like
this than it is today.

-- 
Robert Haas
EDB: http://www.enterprisedb.com




Re: RFC: Additional Directory for Extensions

2024-06-25 Thread Jelte Fennema-Nio
On Tue, 25 Jun 2024 at 12:12, Alvaro Herrera  wrote:
> They can mutilate the system catalogs: yes, they can TRUNCATE pg_type.
> So what?  They've just destroyed their own ability to do anything else.
> The real issue here is that they can edit pg_proc to cause SQL function
> calls to call arbitrary code.  But what if we limit functions so that
> the C code that they can call is located in specific places that are
> known to only contain secure code?  This is easy: make sure the
> OS-installation only contains safe code in $libdir.

I wouldn't call it "easy" but I totally agree that changing pg_proc is
the main security issue that we have no easy way to tackle.

> I hear you say: ah, but they can modify dynamic_library_path, which is a
> GUC, to load code from anywhere -- especially /tmp, where the newest
> bitcoin-mining library was just written.  This is true.  I suggest, to
> solve this problem, that we should make dynamic_library_path no longer a
> GUC.  It should be a setting that comes from a different origin, one
> that even superuser cannot write to.  Only the OS-installation can
> modify that file; that way, superuser cannot load arbitrary code that
> way.

I don't think that needs a whole new file. Making this GUC be
PGC_SIGHUP/PGC_POSTMASTER + GUC_DISALLOW_IN_AUTO_FILE should be
enough. Just like was done for the new allow_alter_system GUC in PG17.

> This is where the new GUC setting being proposed in this thread rubs me
> the wrong way: it's adding yet another avenue for this to be exploited.
> I would like this new directory not to be a GUC either, just like
> dynamic_library_path.

We can make it PGC_SIGHUP/PGC_POSTMASTER + GUC_DISALLOW_IN_AUTO_FILE
too, either now or in the future.

> Now, I'm not saying that this is an easy journey.  But if we don't
> start, we're not going to get there.

Sure, but it sounds like you're suggesting that you want to "start" by
not adding new features that have equivalent security holes as the
ones that we already have. I don't think that is a very helpful way to
get to a better place. It seems much more useful to tackle the current
problems that we have first, and almost certainly the same solutions
to those problems can be applied to any new features with security
issues.

It at least definitely seems the case for the proposal in this thread:
i.e. we already have a GUC that allows loading libraries from an
arbitrary location. This proposal adds another such GUC. If we solve
the security problem in that first GUC, either by
GUC_DISALLOW_IN_AUTO_FILE, or by creating a whole new mechanism for
the setting, then I see no reason why we cannot use that exact same
solution for the newly proposed GUC. So the required work to secure
postgres will not be meaningfully harder by adding this GUC.




Re: RFC: Additional Directory for Extensions

2024-06-25 Thread Alvaro Herrera
On 2024-Jun-24, Robert Haas wrote:

> Is "tighten up what the superuser can do" on our list of objectives?
> Personally, I think we should be focusing mostly, and maybe entirely,
> on letting non-superusers do more things, with appropriate security
> controls. The superuser can ultimately do anything, since they can
> cause shell commands to be run and load arbitrary code into the
> backend and write code in untrusted procedural languages and mutilate
> the system catalogs and lots of other terrible things.

I don't agree that we should focus _solely_ on allowing non-superusers
to do more things.  Sure, it's a good thing to do -- but we shouldn't
completely close the option of securing superuser itself.  I think it's
not completely impossible to have a future where superuser is just so
within the database, i.e. that it can't escape to the operating system.
I'm sure that would be useful in many environments.  On this list, many
people frequently make the argument that it is impossible to secure, but
I'm not convinced.

They can mutilate the system catalogs: yes, they can TRUNCATE pg_type.
So what?  They've just destroyed their own ability to do anything else.
The real issue here is that they can edit pg_proc to cause SQL function
calls to call arbitrary code.  But what if we limit functions so that
the C code that they can call is located in specific places that are
known to only contain secure code?  This is easy: make sure the
OS-installation only contains safe code in $libdir.

I hear you say: ah, but they can modify dynamic_library_path, which is a
GUC, to load code from anywhere -- especially /tmp, where the newest
bitcoin-mining library was just written.  This is true.  I suggest, to
solve this problem, that we should make dynamic_library_path no longer a
GUC.  It should be a setting that comes from a different origin, one
that even superuser cannot write to.  Only the OS-installation can
modify that file; that way, superuser cannot load arbitrary code that
way.

This is where the new GUC setting being proposed in this thread rubs me
the wrong way: it's adding yet another avenue for this to be exploited.
I would like this new directory not to be a GUC either, just like
dynamic_library_path.

I hear you argue: ah, but they can use COPY to write a new file to
$libdir.  Yes, they can, and I think that's foolish.  We could have
another non-GUC setting which takes a list of directories where COPY can
write files into.  Problem solved.  Do people really need the ability to
write files on arbitrary locations?

Untrusted extensions: well, just don't have those in the OS-installation
and you'll be fine.  I'm okay with saying that a superuser-restricted
system is incompatible with plpython.

archive_command and so on: we could disable these too.  Nathan did some
work to implement those using dynamic libraries, so it shouldn't be too
much of a loss; anything that is done with a shell script can also be
done with a small library.  Those libraries can be made safe.
If there are other ways to invoke shell commands from GUCs, let's add
the ability to use libraries for those too.

What other exploits do we know about?  How can we close them?


Now, I'm not saying that this is an easy journey.  But if we don't
start, we're not going to get there.

-- 
Álvaro HerreraBreisgau, Deutschland  —  https://www.EnterpriseDB.com/
"Doing what he did amounts to sticking his fingers under the hood of the
implementation; if he gets his fingers burnt, it's his problem."  (Tom Lane)




Re: RFC: Additional Directory for Extensions

2024-06-25 Thread Christoph Berg
Re: Nathan Bossart
> At first glance, the general idea seems reasonable to me.  I'm wondering
> whether there is a requirement for this directory to be prepended or if it
> could be appended to the end.  That way, the existing ones would take
> priority, which might be desirable from a security standpoint.

My use case for this is to test things at compile time (where I can't
write to /usr/share/postgresql/). If installed things would take
priority over the things that I'm trying to test, I'd be disappointed.

Christoph




Re: RFC: Additional Directory for Extensions

2024-06-24 Thread Jelte Fennema-Nio
On Mon, 24 Jun 2024 at 22:42, David E. Wheeler  wrote:
> >> BINDIR
> >> DOCDIR
> >> HTMLDIR
> >> PKGINCLUDEDIR
> >> LOCALEDIR
> >> MANDIR
> >>
> >> I can imagine an extension wanting or needing to use any and all of these.
> >
> > Are these really all relevant to backend code?
>
> Oh I think so. Especially BINDIR; lots of extensions ship with binary 
> applications. And most ship with docs, too (PGXS puts items listed in DOCS 
> into DOCDIR). Some might also produce man pages (for their binaries), HTML 
> docs, and other stuff. Maybe an FTE extension would include locale files?
>
> I find it pretty easy to imagine use cases for all of them. So much so that I 
> wrote an extension binary distribution RFC[1] and its POC[2] around them.

Definitely agreed on BINDIR needing to be supported.

And while lots of extensions ship with docs, I expect this feature to
mostly be used in production environments to make deploying extensions
easier. And I'm not sure that many people care about deploying docs to
production (honestly lots of people would probably want to strip
them).

Still, for the sake of completeness it might make sense to support
this whole list in extension_destdir. (assuming it's easy to do)




Re: RFC: Additional Directory for Extensions

2024-06-24 Thread David E. Wheeler
On Jun 24, 2024, at 17:17, Jelte Fennema-Nio  wrote:

> If you want to only change $libdir during CREATE EXTENSION (or ALTER
> EXTENSION UPDATE), then why not just change it there. And really you'd
> only want to change it when creating an extension from which the
> control file is coming from extension_destdir.

IIUC, the postmaster needs to load an extension on first use in every session 
unless it’s in shared_preload_libraries.

> However, I can also see a case for really always changing $libdir.
> Because some extensions in shared_preload_libraries, might want to
> trigger loading other libraries that they ship with dynamically. And
> these libraries are probably also in extension_destdir.

Right, it can be more than just the DSOs for the extension itself.

Best,

David





Re: RFC: Additional Directory for Extensions

2024-06-24 Thread Jelte Fennema-Nio
On Mon, 24 Jun 2024 at 18:11, Nathan Bossart  wrote:
> At first glance, the general idea seems reasonable to me.  I'm wondering
> whether there is a requirement for this directory to be prepended or if it
> could be appended to the end.  That way, the existing ones would take
> priority, which might be desirable from a security standpoint.

Citus does ship with some override library for pgoutput to make
logical replication/CDC work correctly with sharded tables. Right now
using this override library requires changing dynamic_library_path. It
would be nice if that wasn't necessary. But this is obviously a small
thing. And I definitely agree that there's a security angle to this as
well, but honestly that seems rather small too. If an attacker can put
shared libraries into the extension_destdir, I'm pretty sure you've
lost already, no matter if extension_destdir is prepended or appended
to the existing $libdir.




Re: RFC: Additional Directory for Extensions

2024-06-24 Thread Jelte Fennema-Nio
On Thu, 11 Apr 2024 at 19:52, David E. Wheeler  wrote:
> I realize this probably isn’t going to happen for 17, given the freeze, but I 
> would very much welcome feedback and pointers to address concerns about 
> providing a second directory for extensions and DSOs. Quite a few people have 
> talked about the need for this in the Extension Mini Summits[1], so I’m sure 
> I could get some collaborators to make improvements or look at a different 
> approach.

Overall +1 for the idea. We're running into this same limitation (only
a single place to put extension files) at Microsoft at the moment.

+and to the '$libdir' directive when loading modules
+that back functions.

I feel like this is a bit strange. Either its impact is too wide, or
it's not wide enough depending on your intent.

If you want to only change $libdir during CREATE EXTENSION (or ALTER
EXTENSION UPDATE), then why not just change it there. And really you'd
only want to change it when creating an extension from which the
control file is coming from extension_destdir.

However, I can also see a case for really always changing $libdir.
Because some extensions in shared_preload_libraries, might want to
trigger loading other libraries that they ship with dynamically. And
these libraries are probably also in extension_destdir.




Re: RFC: Additional Directory for Extensions

2024-06-24 Thread David E. Wheeler
On Jun 24, 2024, at 4:28 PM, Robert Haas  wrote:

> As long as the GUC is superuser-only, I'm not sure what else there is
> to do here. The only question is whether there's some reason to
> disallow this even from the superuser, but I'm not quite seeing such a
> reason.

I can switch it back from requiring a restart to allowing a superuser to set it.

>> I sketched them quickly, so agree they can be better. Reading the code, I 
>> now see that it appears to be the former case. I’d like to advocate for the 
>> latter.
> 
> Sounds good.

Yeah, though then I have a harder time deciding how it should work. pg_config’s 
paths are absolute. With your first example, we just use them exactly as they 
are, but prefix them with the destination directory. So if it’s set to 
`/my/temp/root/`, then files go into

/my/temp/root/$(pg_conifg --sharedir)
/my/temp/root/$(pg_conifg --pkglibdir)
/my/temp/root/$(pg_conifg --bindir)
# etc.

Which is exactly how RPM and Apt packages are built, but seems like an odd 
configuration for general use.

>> BINDIR
>> DOCDIR
>> HTMLDIR
>> PKGINCLUDEDIR
>> LOCALEDIR
>> MANDIR
>> 
>> I can imagine an extension wanting or needing to use any and all of these.
> 
> Are these really all relevant to backend code?

Oh I think so. Especially BINDIR; lots of extensions ship with binary 
applications. And most ship with docs, too (PGXS puts items listed in DOCS into 
DOCDIR). Some might also produce man pages (for their binaries), HTML docs, and 
other stuff. Maybe an FTE extension would include locale files?

I find it pretty easy to imagine use cases for all of them. So much so that I 
wrote an extension binary distribution RFC[1] and its POC[2] around them.

Best,

David

[1]: https://github.com/orgs/pgxn/discussions/2
[1]: https://justatheory.com/2024/06/trunk-poc/





Re: RFC: Additional Directory for Extensions

2024-06-24 Thread Robert Haas
On Mon, Jun 24, 2024 at 3:37 PM David E. Wheeler  wrote:
> I guess the question then is what security controls are appropriate for this 
> feature, which after all tells the postmaster what directories to read files 
> from. It feels a little outside the scope of a regular user to even be aware 
> of the file system undergirding the service. But perhaps there’s a 
> non-superuser role for whom it is appropriate?

As long as the GUC is superuser-only, I'm not sure what else there is
to do here. The only question is whether there's some reason to
disallow this even from the superuser, but I'm not quite seeing such a
reason.

> > On the patch itself, I find the documentation for this to be fairly
> > hard to understand. I think it could benefit from an example. I'm
> > confused about whether this is intended to let me search for
> > extensions in /my/temp/root/usr/lib/postgresql/... by setting
> > extension_directory=/my/temp/dir, or whether it's intended me to
> > search both /usr/lib/postgresql as I normally would and also
> > /some/other/place.
>
> I sketched them quickly, so agree they can be better. Reading the code, I now 
> see that it appears to be the former case. I’d like to advocate for the 
> latter.

Sounds good.

> > If the latter, I wonder why we don't handle shared
> > libraries by setting dynamic_library_path and then just have an
> > analogue of that for control files.
>
> The challenge is that it applies not just to shared object libraries and 
> control files, but also extension SQL files and any other SHAREDIR files an 
> extension might include. But also, I think it should support all the 
> pg_config installation targets that extensions might use, including:
>
> BINDIR
> DOCDIR
> HTMLDIR
> PKGINCLUDEDIR
> LOCALEDIR
> MANDIR
>
> I can imagine an extension wanting or needing to use any and all of these.

Are these really all relevant to backend code?

-- 
Robert Haas
EDB: http://www.enterprisedb.com




Re: RFC: Additional Directory for Extensions

2024-06-24 Thread David E. Wheeler
On Jun 24, 2024, at 1:53 PM, Robert Haas  wrote:

> Is "tighten up what the superuser can do" on our list of objectives?
> Personally, I think we should be focusing mostly, and maybe entirely,
> on letting non-superusers do more things, with appropriate security
> controls. The superuser can ultimately do anything, since they can
> cause shell commands to be run and load arbitrary code into the
> backend and write code in untrusted procedural languages and mutilate
> the system catalogs and lots of other terrible things.

I guess the question then is what security controls are appropriate for this 
feature, which after all tells the postmaster what directories to read files 
from. It feels a little outside the scope of a regular user to even be aware of 
the file system undergirding the service. But perhaps there’s a non-superuser 
role for whom it is appropriate?

> Now, I think there are environments where people have used things like
> containers to try to lock down the superuser, and while I'm not sure
> that can ever be particularly water-tight, if it were the case that
> this patch would make it a whole lot easier for a superuser to bypass
> the kinds of controls that people are imposing today, that might be an
> argument against this patch. But ... off-hand, I'm not seeing such an
> exposure.

Yeah I’m not even sure I follow. Containers are immutable, other than mutable 
mounted volumes --- which is one use case this patch is attempting to enable.

> On the patch itself, I find the documentation for this to be fairly
> hard to understand. I think it could benefit from an example. I'm
> confused about whether this is intended to let me search for
> extensions in /my/temp/root/usr/lib/postgresql/... by setting
> extension_directory=/my/temp/dir, or whether it's intended me to
> search both /usr/lib/postgresql as I normally would and also
> /some/other/place.

I sketched them quickly, so agree they can be better. Reading the code, I now 
see that it appears to be the former case. I’d like to advocate for the latter. 

> If the latter, I wonder why we don't handle shared
> libraries by setting dynamic_library_path and then just have an
> analogue of that for control files.

The challenge is that it applies not just to shared object libraries and 
control files, but also extension SQL files and any other SHAREDIR files an 
extension might include. But also, I think it should support all the pg_config 
installation targets that extensions might use, including:

BINDIR
DOCDIR
HTMLDIR
PKGINCLUDEDIR
LOCALEDIR
MANDIR

I can imagine an extension wanting or needing to use any and all of these.

Best,

David







Re: RFC: Additional Directory for Extensions

2024-06-24 Thread Robert Haas
On Wed, Apr 3, 2024 at 3:13 AM Alvaro Herrera  wrote:
> I support the idea of there being a second location from where to load
> shared libraries ... but I don't like the idea of making it
> runtime-configurable.  If we want to continue to tighten up what
> superuser can do, then one of the things that has to go away is the
> ability to load shared libraries from arbitrary locations
> (dynamic_library_path).  I think we should instead look at making those
> locations hardcoded at compile time.  The packager can then decide where
> those things go, and the superuser no longer has the ability to load
> arbitrary code from arbitrary locations.

Is "tighten up what the superuser can do" on our list of objectives?
Personally, I think we should be focusing mostly, and maybe entirely,
on letting non-superusers do more things, with appropriate security
controls. The superuser can ultimately do anything, since they can
cause shell commands to be run and load arbitrary code into the
backend and write code in untrusted procedural languages and mutilate
the system catalogs and lots of other terrible things.

Now, I think there are environments where people have used things like
containers to try to lock down the superuser, and while I'm not sure
that can ever be particularly water-tight, if it were the case that
this patch would make it a whole lot easier for a superuser to bypass
the kinds of controls that people are imposing today, that might be an
argument against this patch. But ... off-hand, I'm not seeing such an
exposure.

On the patch itself, I find the documentation for this to be fairly
hard to understand. I think it could benefit from an example. I'm
confused about whether this is intended to let me search for
extensions in /my/temp/root/usr/lib/postgresql/... by setting
extension_directory=/my/temp/dir, or whether it's intended me to
search both /usr/lib/postgresql as I normally would and also
/some/other/place. If the latter, I wonder why we don't handle shared
libraries by setting dynamic_library_path and then just have an
analogue of that for control files.

-- 
Robert Haas
EDB: http://www.enterprisedb.com




Re: RFC: Additional Directory for Extensions

2024-06-24 Thread Nathan Bossart
On Thu, Apr 11, 2024 at 01:52:26PM -0400, David E. Wheeler wrote:
> I realize this probably isn´t going to happen for 17, given the freeze,
> but I would very much welcome feedback and pointers to address concerns
> about providing a second directory for extensions and DSOs. Quite a few
> people have talked about the need for this in the Extension Mini
> Summits[1], so I´m sure I could get some collaborators to make
> improvements or look at a different approach.

At first glance, the general idea seems reasonable to me.  I'm wondering
whether there is a requirement for this directory to be prepended or if it
could be appended to the end.  That way, the existing ones would take
priority, which might be desirable from a security standpoint.

-- 
nathan




Re: RFC: Additional Directory for Extensions

2024-04-11 Thread David E. Wheeler
On Apr 4, 2024, at 1:20 PM, David E. Wheeler  wrote:

> I’ve added some docs based on your GUC description; updated patch attached.

Here’s a rebase.

I realize this probably isn’t going to happen for 17, given the freeze, but I 
would very much welcome feedback and pointers to address concerns about 
providing a second directory for extensions and DSOs. Quite a few people have 
talked about the need for this in the Extension Mini Summits[1], so I’m sure I 
could get some collaborators to make improvements or look at a different 
approach.

Best,

David

[1] 
https://justatheory.com/2024/02/extension-ecosystem-summit/#extension-ecosystem-mini-summit




v3-0001-Add-extension_directory-GUC.patch
Description: Binary data




Re: RFC: Additional Directory for Extensions

2024-04-04 Thread David E. Wheeler
On Apr 3, 2024, at 12:46 PM, Christoph Berg  wrote:

> Thanks for bringing this up, I should have submitted this years ago.
> (The patch is originally from September 2020.)

That’s okay, it’s still 2020 in some ways. 😂

> I designed the patch to require a superuser to set it, so it doesn't
> matter very much by which mechanism it gets updated. There should be
> little reason to vary it at run-time, so I'd be fine with requiring a
> restart, but otoh, why restrict the superuser from reloading it if
> they know what they are doing?

I think that’s fair. I’ll keep it requiring a restart now, on the theory it 
would be easier to loosen it later than have to tighten it later.

> I'm not sure the concept of "core libraries" exists. PG happens to
> dlopen things at run time, and it doesn't know/care if they were
> installed by users or by the original PG server. Also, an exploited
> libpgtypes library is not worse than any other untrusted "user"
> library, so you really don't want to allow users to provide their own
> .so files, no matter by what mechanism.

Yes, I guess my concern is whether it could be used to “shadow” core libraries. 
Maybe it’s no different, really.

>> This would also allow the lookup of extension libraries prefixed by the 
>> directory field from the control file, which would enable much tidier 
>> extension installation: The control file, SQL scripts, and DSOs could all be 
>> in a single directory for an extension.
> 
> Nice idea, but that would mean installing .so files into PGSHAREDIR.
> Perhaps the whole extension stuff would have to move to PKGLIBDIR
> instead.

Yes, I was just poking around the code, and realized that, when extension 
functions are created they may or may not not use `MODULE_PATHNAME`, but in any 
event, there is nothing different about loading an extension DSO than any other 
DSO. I was hoping to find a path where it knows it’s opening a DSO for the 
purpose of an extension, so we could limit the lookup there. But that does not 
(currently) exist.

Maybe we could add an `$extensiondir` variable to complement `$libdir`?

Or is PGKLIBDIR is the way to go? I’m not familiar with it. It looks like 
extension JIT files are put there already.

> Fwiw, I wrote this patch to solve the problem of testing extensions at
> build-time where the build process does not have write access to
> PGSHAREDIR. It solves that problem quite well, almost all PG
> extensions have build-time test coverage now (where there was
> basically 0 before).

Yeah, good additional use case.

On Apr 3, 2024, at 1:03 PM, Christoph Berg  wrote:

> Also, it's called extension "destdir" because it behaves like DESTDIR
> in Makefiles: It prepends the given path to the path that PG is trying
> to open when set. So it doesn't allow arbitrary new locations as of
> now, just /home/build/foo-1/debian/foo/usr/share/postgresql/17/extension
> in addition to /usr/share/postgresql/17/extension. (That is what the
> Debian package build process needs, so that restriction/design choice
> made sense.

Right, this makes perfect sense, in that you don’t have to copy all the 
extension files from the destdir to the SHAREDIR to test them, which I imagine 
could be a PITA.

> That's also included in the current GUC description:
> 
>   This directory is prepended to paths when loading extensions
>   (control and SQL files), and to the '$libdir' directive when
>   loading modules that back functions. The location is made
>   configurable to allow build-time testing of extensions that do not
>   have been installed to their proper location yet.
> 
> Perhaps I should have included a more verbose "NOT FOR PRODUCTION"
> there.

The use cases I described upthread are very much production use cases. Do you 
think it’s not for production just because we need to really think it through?

I’ve added some docs based on your GUC description; updated patch attached.

Best,

David



v2-0001-Add-extension_directory-GUC.patch
Description: Binary data


Re: RFC: Additional Directory for Extensions

2024-04-03 Thread Christoph Berg
> Fwiw, I wrote this patch to solve the problem of testing extensions at
> build-time where the build process does not have write access to
> PGSHAREDIR. It solves that problem quite well, almost all PG
> extensions have build-time test coverage now (where there was
> basically 0 before).

Also, it's called extension "destdir" because it behaves like DESTDIR
in Makefiles: It prepends the given path to the path that PG is trying
to open when set. So it doesn't allow arbitrary new locations as of
now, just /home/build/foo-1/debian/foo/usr/share/postgresql/17/extension
in addition to /usr/share/postgresql/17/extension. (That is what the
Debian package build process needs, so that restriction/design choice
made sense.)

> Security is not a concern at this point as everything is running as
> the same user, and the test cluster will be wiped right after the
> test. I figured marking the setting as "super user" only was enough
> security at that point, but I would recommend another audit before
> using it together with "trusted" extensions and other things in
> production.

That's also included in the current GUC description:

   This directory is prepended to paths when loading extensions
   (control and SQL files), and to the '$libdir' directive when
   loading modules that back functions. The location is made
   configurable to allow build-time testing of extensions that do not
   have been installed to their proper location yet.

Perhaps I should have included a more verbose "NOT FOR PRODUCTION"
there.

As for compatibility, the patch has been part of the PG 9.5..17 now
for several years, and I'm very happy with extra test coverage it
provides, especially on the Debian architectures that don't have
"autopkgtest" runners yet.

Christoph




Re: RFC: Additional Directory for Extensions

2024-04-03 Thread Christoph Berg
Re: David E. Wheeler
> > Yes, I like the suggestion to make it require a restart, which lets the 
> > sysadmin control it and not limited to whatever the person who compiled it 
> > thought would make sense.
> 
> Here’s a revision of the Debian patch that requires a server start.

Thanks for bringing this up, I should have submitted this years ago.
(The patch is originally from September 2020.)

I designed the patch to require a superuser to set it, so it doesn't
matter very much by which mechanism it gets updated. There should be
little reason to vary it at run-time, so I'd be fine with requiring a
restart, but otoh, why restrict the superuser from reloading it if
they know what they are doing?

> However, in studying the patch, it appears that the `extension_directory` is 
> searched for *all* shared libraries, not just those being loaded for an 
> extension. Am I reading the `expand_dynamic_library_name()` function right?
> 
> If so, this seems like a good way for a bad actor to muck with things, by 
> putting an exploited libpgtypes library into the extension directory, where 
> it would be loaded in preference to the core libpgtypes library, if they 
> couldn’t exploit the original.
> 
> I’m thinking it would be better to have the dynamic library lookup for 
> extension libraries (and LOAD libraries?) separate, so that the 
> `extension_directory` would not be used for core libraries.

I'm not sure the concept of "core libraries" exists. PG happens to
dlopen things at run time, and it doesn't know/care if they were
installed by users or by the original PG server. Also, an exploited
libpgtypes library is not worse than any other untrusted "user"
library, so you really don't want to allow users to provide their own
.so files, no matter by what mechanism.

> This would also allow the lookup of extension libraries prefixed by the 
> directory field from the control file, which would enable much tidier 
> extension installation: The control file, SQL scripts, and DSOs could all be 
> in a single directory for an extension.

Nice idea, but that would mean installing .so files into PGSHAREDIR.
Perhaps the whole extension stuff would have to move to PKGLIBDIR
instead.


Fwiw, I wrote this patch to solve the problem of testing extensions at
build-time where the build process does not have write access to
PGSHAREDIR. It solves that problem quite well, almost all PG
extensions have build-time test coverage now (where there was
basically 0 before).

Security is not a concern at this point as everything is running as
the same user, and the test cluster will be wiped right after the
test. I figured marking the setting as "super user" only was enough
security at that point, but I would recommend another audit before
using it together with "trusted" extensions and other things in
production.

Christoph




Re: RFC: Additional Directory for Extensions

2024-04-03 Thread David E. Wheeler
On Apr 3, 2024, at 8:54 AM, David E. Wheeler  wrote:

> Yes, I like the suggestion to make it require a restart, which lets the 
> sysadmin control it and not limited to whatever the person who compiled it 
> thought would make sense.

Here’s a revision of the Debian patch that requires a server start.

However, in studying the patch, it appears that the `extension_directory` is 
searched for *all* shared libraries, not just those being loaded for an 
extension. Am I reading the `expand_dynamic_library_name()` function right?

If so, this seems like a good way for a bad actor to muck with things, by 
putting an exploited libpgtypes library into the extension directory, where it 
would be loaded in preference to the core libpgtypes library, if they couldn’t 
exploit the original.

I’m thinking it would be better to have the dynamic library lookup for 
extension libraries (and LOAD libraries?) separate, so that the 
`extension_directory` would not be used for core libraries.

This would also allow the lookup of extension libraries prefixed by the 
directory field from the control file, which would enable much tidier extension 
installation: The control file, SQL scripts, and DSOs could all be in a single 
directory for an extension.

Thoughts?

Best,

David




v1-0001-Add-extension_directory-GUC.patch
Description: Binary data


Re: RFC: Additional Directory for Extensions

2024-04-03 Thread David E. Wheeler
On Apr 3, 2024, at 8:54 AM, David E. Wheeler  wrote:

> Yes, I like the suggestion to make it require a restart, which lets the 
> sysadmin control it and not limited to whatever the person who compiled it 
> thought would make sense.

Or SIGHUP?

D



Re: RFC: Additional Directory for Extensions

2024-04-03 Thread David E. Wheeler
On Apr 3, 2024, at 3:57 AM, walt...@technowledgy.de wrote:

> I can also imagine that it would be very helpful in a container setup to be 
> able to set an environment variable with this path instead of having to 
> recompile all of postgres to change it.

Yes, I like the suggestion to make it require a restart, which lets the 
sysadmin control it and not limited to whatever the person who compiled it 
thought would make sense.

Best,

David





Re: RFC: Additional Directory for Extensions

2024-04-03 Thread Daniel Gustafsson
> On 3 Apr 2024, at 09:13, Alvaro Herrera  wrote:
> 
> On 2024-Apr-02, David E. Wheeler wrote:
> 
>> That quotation comes from this Debian patch[2] maintained by Christoph
>> Berg. I’d like to formally propose integrating this patch into the
>> core. And not only because it’s overhead for package maintainers like
>> Christoph, but because a number of use cases have emerged since we
>> originally discussed something like this back in 2013[3]:
> 
> I support the idea of there being a second location from where to load
> shared libraries

Agreed, the case made upthread that installing an extension breaks the app
signing seems like a compelling reason to do this.

The implementation of this need to make sure the directory is properly set up
however to avoid similar problems that CVE 2019-10211 showed.

--
Daniel Gustafsson





Re: RFC: Additional Directory for Extensions

2024-04-03 Thread walther

Alvaro Herrera:

I support the idea of there being a second location from where to load
shared libraries ... but I don't like the idea of making it
runtime-configurable.  If we want to continue to tighten up what
superuser can do, then one of the things that has to go away is the
ability to load shared libraries from arbitrary locations
(dynamic_library_path).  I think we should instead look at making those
locations hardcoded at compile time.  The packager can then decide where
those things go, and the superuser no longer has the ability to load
arbitrary code from arbitrary locations.


The use-case for runtime configuration of this seems to be build-time 
testing of extensions against an already installed server. For this 
purpose it should be enough to be able to set this directory at startup 
- it doesn't need to be changed while the server is actually running. 
Then you could spin up a temporary postgres instance with the extension 
directory pointing a the build directory and test.


Would startup-configurable be better than runtime-configurable regarding 
your concerns?


I can also imagine that it would be very helpful in a container setup to 
be able to set an environment variable with this path instead of having 
to recompile all of postgres to change it.


Best,

Wolfgang




Re: RFC: Additional Directory for Extensions

2024-04-03 Thread Alvaro Herrera
On 2024-Apr-02, David E. Wheeler wrote:

> That quotation comes from this Debian patch[2] maintained by Christoph
> Berg. I’d like to formally propose integrating this patch into the
> core. And not only because it’s overhead for package maintainers like
> Christoph, but because a number of use cases have emerged since we
> originally discussed something like this back in 2013[3]:

I support the idea of there being a second location from where to load
shared libraries ... but I don't like the idea of making it
runtime-configurable.  If we want to continue to tighten up what
superuser can do, then one of the things that has to go away is the
ability to load shared libraries from arbitrary locations
(dynamic_library_path).  I think we should instead look at making those
locations hardcoded at compile time.  The packager can then decide where
those things go, and the superuser no longer has the ability to load
arbitrary code from arbitrary locations.

-- 
Álvaro Herrera   48°01'N 7°57'E  —  https://www.EnterpriseDB.com/
Al principio era UNIX, y UNIX habló y dijo: "Hello world\n".
No dijo "Hello New Jersey\n", ni "Hello USA\n".