Re: RFC: Additional Directory for Extensions
On Aug 27, 2024, at 22:24, Craig Ringer wrote: > `pg_config` only cares about compile-time settings, so I would not > expect its output to change. Right, of course that’s its original purpose, but extensions depend on it to determine where to install extensions. Not just PGXS, but also pgrx and various Makefile customizations I’ve seen in the wild. > I suspect we'd have to add PGXS extension-path awareness if going for > per-extension subdir support. I'm not sure it makes sense to teach > `pg_config` about this, since it'd need to have a different mode like > >pg_config --extension myextname --extension-sharedir > > since the extension's "sharedir" is > $basedir/extensions/myextension/share or whatever. Right. PGXS would just need to know where to put the directory for an extension. There should be a default for the project, and then it can be overridden with something like DESTDIR (but without full paths under that prefix). > Supporting this looks to be a bit intrusive in the makefiles, > requiring them to differentiate between "share dir for extensions" and > "share dir for !extensions", etc. I'm not immediately sure if it can > be done in a way that transparently converts unmodified extension PGXS > makefiles to target the new paths; it might require an additional > define, or use of new variables and an ifdef block to add > backwards-compat to the extension makefile for building on older > postgres. Yeah, might just have to be an entirely new thing, though it sure would be nice for existing PGXS-using Makefiles to do the right thing. Maybe for the new version of the server with the proposed new pattern it would dispatch to the new thing somehow without modifying all the rest of its logic. > Right. The proposed structure is rather a bigger change than I was > thinking when I suggested supporting an extension search path not just > a single extra path. But it's also a cleaner proposal; the > per-extension directory would make it easier to ensure that the > extension control file, sql scripts, and dynamic library all match the > same extension and version if multiple ones are on the path. Which is > desirable when doing things like testing a new version of an in-core > extension. 💯 > Right. And I've been on the receiving end of having a small, focused > patch derailed by others jumping in and scope-exploding it into > something completely different to solve a much wider but related > problem. I’m not complaining, I would definitely prefer to see consensus on a cleaner proposal along the lines we’ve discussed and a commitment to time from parties able to get it done in time for v18. I’m willing to help where I can with my baby C! Failing that, we can fall back on the destdir patch. > I'm definitely not trying to stand in the way of progress with this; I > just want to make sure that it doesn't paint us into a corner that > prevents a more general solution from being adopted later. That's why > I'm suggesting making the config a multi-value string (list of paths) > and raising a runtime "ERROR: searching multiple paths for extensions > not yet supported" or something if >1 path configured. > > If that doesn't work, no problem. I think the logic would have to be different, so they’d be different GUCs with their own semantics. But if the core team and committers are on board with the general idea of search paths and per-extension directory organization, it would be best to avoid getting stuck with maintaining the current patch’s GUC. OTOH, we could get it committed now and revert it later if we get the better thing done and committed. >> I think we should get some clarity on the proposal, and then consensus, as >> you say. I say “get some clarity” because my proposal doesn’t require >> recursing, and I’m not sure why it’d be needed. > > From what you and Gabriele are discussing (which I agree with), it wouldn’t. Ah, great. I’ll try to put some thought into a more formal proposal in a new thread next week. Unless your Gabriele beats me to it 😂. Best, David
Re: RFC: Additional Directory for Extensions
On Wed, 28 Aug 2024 at 03:26, David E. Wheeler wrote: > Right. ISTM it could complicate PGXS quite a bit. If we set, say, > > SET extension_search_path = $extsdir, /mnt/extensions/pg16, > /mnt/extensions/pg16/gosuperfast/extensions; > > What should be the output of `pg_config --sharedir`? `pg_config` only cares about compile-time settings, so I would not expect its output to change. I suspect we'd have to add PGXS extension-path awareness if going for per-extension subdir support. I'm not sure it makes sense to teach `pg_config` about this, since it'd need to have a different mode like pg_config --extension myextname --extension-sharedir since the extension's "sharedir" is $basedir/extensions/myextension/share or whatever. Supporting this looks to be a bit intrusive in the makefiles, requiring them to differentiate between "share dir for extensions" and "share dir for !extensions", etc. I'm not immediately sure if it can be done in a way that transparently converts unmodified extension PGXS makefiles to target the new paths; it might require an additional define, or use of new variables and an ifdef block to add backwards-compat to the extension makefile for building on older postgres. > But that leaves the issue of directory organization. The current patch is > just a prefix for various PGXS/pg_config directories; the longer-term > proposal I’ve made here is not a prefix for sharedir, mandir, etc., but a > directory that contains directories named for extensions. So even if we were > to take this approach, the directory structure would vary. Right. The proposed structure is rather a bigger change than I was thinking when I suggested supporting an extension search path not just a single extra path. But it's also a cleaner proposal; the per-extension directory would make it easier to ensure that the extension control file, sql scripts, and dynamic library all match the same extension and version if multiple ones are on the path. Which is desirable when doing things like testing a new version of an in-core extension. > OTOH, we have this patch now, and this other stuff is just a proposal. Actual > code trumps ideas in my mind. Right. And I've been on the receiving end of having a small, focused patch derailed by others jumping in and scope-exploding it into something completely different to solve a much wider but related problem. I'm definitely not trying to stand in the way of progress with this; I just want to make sure that it doesn't paint us into a corner that prevents a more general solution from being adopted later. That's why I'm suggesting making the config a multi-value string (list of paths) and raising a runtime "ERROR: searching multiple paths for extensions not yet supported" or something if >1 path configured. If that doesn't work, no problem. > I think we should get some clarity on the proposal, and then consensus, as > you say. I say “get some clarity” because my proposal doesn’t require > recursing, and I’m not sure why it’d be needed. >From what you and Gabriele are discussing (which I agree with), it wouldn't.
Re: RFC: Additional Directory for Extensions
On Aug 26, 2024, at 17:35, Craig Ringer wrote: > This looks like a good suggestion to me, it would make the packaging, > distribution and integration of 3rd party extensions significantly > easier without any obvious large or long term cost. Yes! > Also PGXS, the windows extension build support, and 3rd party cmake > builds etc. But not by the looks a drastic change. Right. ISTM it could complicate PGXS quite a bit. If we set, say, SET extension_search_path = $extsdir, /mnt/extensions/pg16, /mnt/extensions/pg16/gosuperfast/extensions; What should be the output of `pg_config --sharedir`? > My only real concern with the current patch is that it limits > searching for extensions to one additional configurable location, > which is inconsistent with how things like the dynamic_library_path > works. Once in, it'll be difficult to change or extend for BC, and if > someone wants to add a search path capability it'll break existing > configurations. Agreed. > Would it be feasible to define its configuration syntax as accepting a > list of paths, but only implement the semantics for single-entry lists > and ERROR on multiple paths? That way it could be extended w/o > breaking existing configurations later. I imagine it’s a simple matter of programming :-) But that leaves the issue of directory organization. The current patch is just a prefix for various PGXS/pg_config directories; the longer-term proposal I’ve made here is not a prefix for sharedir, mandir, etc., but a directory that contains directories named for extensions. So even if we were to take this approach, the directory structure would vary. I suspect we’d have to name it differently and support both long-term. That, too me, is the main issue with this patch. OTOH, we have this patch now, and this other stuff is just a proposal. Actual code trumps ideas in my mind. > With that said, I'm not the one doing the work at the moment, and the > functionality would definitely be helpful. If there's agreement on > supporting a search-path or recursing into subdirectories I'd be > willing to have a go at it, but I'm a bit stale on Pg's codebase now > so I'd want to be fairly confident the work wouldn't just be thrown > out. I think we should get some clarity on the proposal, and then consensus, as you say. I say “get some clarity” because my proposal doesn’t require recursing, and I’m not sure why it’d be needed. Best, David
Re: RFC: Additional Directory for Extensions
On Aug 27, 2024, at 04:56, Gabriele Bartolini wrote: > The extension image could follow a naming convention like this (order can be > adjusted): `-- version>-(-)`. For example, `pgvector-16-0.7.4-bookworm-1` would > represent the first image built in a repository for pgvector 0.7.4 for > PostgreSQL 16 on Debian Bookworm. If multi-arch images aren't desired, we > could incorporate the architecture somewhere in the naming convention. Well now you’re just describing the binary distribution format RFC[1] (POC[2]) and multi-platform OCI distribution POC[3] :-) > If we wanted to install multiple versions of an extension, we could mount > them in different directories, with the version included in the folder > name—for example, `pgvector-0.7.4` instead of just `pgvector`. However, I'm a > bit rusty with the extensions framework, so I'll need to check if this > approach is feasible and makes sense. Right, if we decided to adopt this proposal, it might make sense to include the “default version” as part of the directory name. But there’s quite a lot of work between here and there. Best, David [1]: https://github.com/pgxn/rfcs/pull/2 [2]: https://justatheory.com/2024/06/trunk-poc/ [3]: https://justatheory.com/2024/06/trunk-oci-poc/
Re: RFC: Additional Directory for Extensions
Hi David, Thanks for your email. On Mon, 26 Aug 2024 at 16:07, David E. Wheeler wrote: > I would assume BINDIR, DOCDIR, HTMLDIR, PKGLIBDIR, MANDIR, SHAREDIR, and > perhaps LOCALEDIR. > > But even if it’s just one or two, the only proper way an extension > directory would work, IME, is to define a directory-based structure for > extensions, where every file for an extension is in a directory named for > the extension, and subdirectories are defined for each of the above > requisite file types. Something like: > > extension_name > ├── control.ini > ├── bin > ├── doc > ├── html > ├── lib > ├── local > ├── man > └── share > I'm really glad you proposed this publicly. I reached the same conclusion the other day when digging deeper into the problem with a few folks from CloudNativePG. Setting aside multi-arch images for now, if we could reorganize the content of a single image (identified by OS distro, PostgreSQL major version, and extension version) with a top-level directory structure as you described, we could easily mount each image as a separate volume. The extension image could follow a naming convention like this (order can be adjusted): `---(-)`. For example, `pgvector-16-0.7.4-bookworm-1` would represent the first image built in a repository for pgvector 0.7.4 for PostgreSQL 16 on Debian Bookworm. If multi-arch images aren't desired, we could incorporate the architecture somewhere in the naming convention. This would allow multiple paths to work and keep all the files for an > extension bundled together. It could also potentially allow for multiple > versions of an extension to be installed at once, if we required the > version to be part of the directory name. > If we wanted to install multiple versions of an extension, we could mount them in different directories, with the version included in the folder name—for example, `pgvector-0.7.4` instead of just `pgvector`. However, I'm a bit rusty with the extensions framework, so I'll need to check if this approach is feasible and makes sense. Thanks, Gabriele -- Gabriele Bartolini Vice President, Cloud Native at EDB enterprisedb.com
Re: RFC: Additional Directory for Extensions
On Tue, 27 Aug 2024 at 02:07, David E. Wheeler wrote: > On Aug 21, 2024, at 19:07, Craig Ringer wrote: > But even if it’s just one or two, the only proper way an extension directory > would work, IME, is to define a directory-based structure for extensions, > where every file for an extension is in a directory named for the extension, > and subdirectories are defined for each of the above requisite file types. > [...] > I think this would be a much nicer layout for packaging, installing, and > managing extensions versus the current method of strewing files around to a > slew of different directories. This looks like a good suggestion to me, it would make the packaging, distribution and integration of 3rd party extensions significantly easier without any obvious large or long term cost. > But it would come at some cost, in terms of backward with the existing layout > (or migration to it), significant modification of the server to use the new > layout (and extension_search_path), and other annoyances like PATH and > MANPATH management. Also PGXS, the windows extension build support, and 3rd party cmake builds etc. But not by the looks a drastic change. > Long term I think it would be worthwhile, but the current patch feels like a > decent interim step we could live with, solving most of the integration > problems (immutable servers, packaging testing, etc.) at the cost of a > slightly unexpected directory layout. What I mean by that is that the current > patch is pretty much just using extension_destdir as a prefix to all of those > directories from pg_config, so they never have to change, but it does mean > that you end up installing extensions in something like: > > /mnt/extensions/pg16/usr/share/postgresql/16 > /mnt/extensions/pg16/usr/include/postgresql My only real concern with the current patch is that it limits searching for extensions to one additional configurable location, which is inconsistent with how things like the dynamic_library_path works. Once in, it'll be difficult to change or extend for BC, and if someone wants to add a search path capability it'll break existing configurations. Would it be feasible to define its configuration syntax as accepting a list of paths, but only implement the semantics for single-entry lists and ERROR on multiple paths? That way it could be extended w/o breaking existing configurations later. With that said, I'm not the one doing the work at the moment, and the functionality would definitely be helpful. If there's agreement on supporting a search-path or recursing into subdirectories I'd be willing to have a go at it, but I'm a bit stale on Pg's codebase now so I'd want to be fairly confident the work wouldn't just be thrown out. -- Craig Ringer
Re: RFC: Additional Directory for Extensions
Hi Hackers, Apologies for the delay in reply; I’ve been at the XOXO Festival and almost completely unplugged for the first time in ages. Happy to see this thread coming alive, though. Thank you Gabriele, Craig, and Jelte! On Aug 21, 2024, at 19:07, Craig Ringer wrote: > So IMO this should be a _path_ to search for extension control files > and SQL scripts. > > If the current built-in default extension dir was exposed as a var > $extdir like we do for $libdir, this might look something like this > for local development and testing while working with a packaged > postgres build: > >SET extension_search_path = $extsdir, /opt/myapp/extensions, > /usr/local/postgres/my-custom-extension/extensions; >SET dynamic_library_path = $libdir, /opt/myapp/lib, > /usr/local/postgres/my-custom-extension/lib I would very much like something like this, but I’m not sure how feasible it is for a few reasons. The first, and most important, is that extensions are not limited to just a control file and SQL file. They also very often include: * one or more shared library files * documentation files * binary files And maybe more? How many of these directories might an extension install files into: ✦ ❯ pg_config | grep DIR | awk '{print $1}' BINDIR DOCDIR HTMLDIR INCLUDEDIR PKGINCLUDEDIR INCLUDEDIR-SERVER LIBDIR PKGLIBDIR LOCALEDIR MANDIR SHAREDIR SYSCONFDIR I would assume BINDIR, DOCDIR, HTMLDIR, PKGLIBDIR, MANDIR, SHAREDIR, and perhaps LOCALEDIR. But even if it’s just one or two, the only proper way an extension directory would work, IME, is to define a directory-based structure for extensions, where every file for an extension is in a directory named for the extension, and subdirectories are defined for each of the above requisite file types. Something like: extension_name ├── control.ini ├── bin ├── doc ├── html ├── lib ├── local ├── man └── share This would allow multiple paths to work and keep all the files for an extension bundled together. It could also potentially allow for multiple versions of an extension to be installed at once, if we required the version to be part of the directory name. I think this would be a much nicer layout for packaging, installing, and managing extensions versus the current method of strewing files around to a slew of different directories. But it would come at some cost, in terms of backward with the existing layout (or migration to it), significant modification of the server to use the new layout (and extension_search_path), and other annoyances like PATH and MANPATH management. Long term I think it would be worthwhile, but the current patch feels like a decent interim step we could live with, solving most of the integration problems (immutable servers, packaging testing, etc.) at the cost of a slightly unexpected directory layout. What I mean by that is that the current patch is pretty much just using extension_destdir as a prefix to all of those directories from pg_config, so they never have to change, but it does mean that you end up installing extensions in something like /mnt/extensions/pg16/usr/share/postgresql/16 /mnt/extensions/pg16/usr/include/postgresql etc. Best, David
Re: RFC: Additional Directory for Extensions
On Fri, 23 Aug 2024 at 10:14, Craig Ringer wrote: > On Thu, 22 Aug 2024 at 21:00, Gabriele Bartolini > wrote: > > On Thu, 22 Aug 2024 at 09:32, Jelte Fennema-Nio wrote: > >> SET extension_search_path = /mnt/extensions/pg16/* > > > > That'd be great. +1. > > Agreed, that'd be handy, but not worth blocking the underlying capability for. > > Except possibly to the degree that the feature should reserve wildcard > characters and require them to be escaped if they appear on a path, so > there's no BC break if it's added later. ... though on second thoughts, it might make more sense to just recursively search directories found under each path entry. Rules like 'search if a redundant trailing / is present' can be an option. That way there's no icky path escaping needed for normal configuration.
Re: RFC: Additional Directory for Extensions
On Thu, 22 Aug 2024 at 21:00, Gabriele Bartolini wrote: > On Thu, 22 Aug 2024 at 09:32, Jelte Fennema-Nio wrote: >> SET extension_search_path = /mnt/extensions/pg16/* > > That'd be great. +1. Agreed, that'd be handy, but not worth blocking the underlying capability for. Except possibly to the degree that the feature should reserve wildcard characters and require them to be escaped if they appear on a path, so there's no BC break if it's added later. On Thu, 22 Aug 2024 at 21:00, Gabriele Bartolini wrote: > > Hi Jelte, > > On Thu, 22 Aug 2024 at 09:32, Jelte Fennema-Nio wrote: >> >> It looks like you want one directory per extension, so that list would >> get pretty long if you have multiple extensions. Maybe (as a follow up >> change), we should start to support a * as a wildcard in both of these >> GUCs. So you could say: >> >> SET extension_search_path = /mnt/extensions/pg16/* >> >> To mean effectively the same as you're describing above. > > > That'd be great. +1. > > -- > Gabriele Bartolini > Vice President, Cloud Native at EDB > enterprisedb.com
Re: RFC: Additional Directory for Extensions
Hi Jelte, On Thu, 22 Aug 2024 at 09:32, Jelte Fennema-Nio wrote: > It looks like you want one directory per extension, so that list would > get pretty long if you have multiple extensions. Maybe (as a follow up > change), we should start to support a * as a wildcard in both of these > GUCs. So you could say: > > SET extension_search_path = /mnt/extensions/pg16/* > > To mean effectively the same as you're describing above. > That'd be great. +1. -- Gabriele Bartolini Vice President, Cloud Native at EDB enterprisedb.com
Re: RFC: Additional Directory for Extensions
Hi Craig, On Thu, 22 Aug 2024 at 01:07, Craig Ringer wrote: > It's also very relevant for local development and testing. > Yep, which is the original goal of Christoph IIRC. > It may be possible to weaken this restriction somewhat thanks to the > upcoming > https://kubernetes.io/blog/2024/08/16/kubernetes-1-31-image-volume-source/ > feature that permits additional OCI images to be mounted as read-only > volumes on a workload. This would still only permit mounting at > Pod-creation time, not runtime mounting and unmonuting, but means the > base postgres image could be supplemented by mounting additional > images for extensions. > I'm really excited about that feature, but it's still in the alpha stage. However, I don't anticipate any issues for the future general availability (GA) release. Regardless, we may need to consider a temporary solution that is compatible with existing Kubernetes and possibly Postgres versions (but that's beyond the purpose of this thread). For example, one might mount image "postgis-vX.Y.Z" image onto base > image "postgresql-16" if support for PostGIS is desired, without then > having to bake every possible extension anyone might ever want into > the base image. This solves all sorts of messy issues with upgrades > and new version releases. > Yep. > But for it to work, it must be possible to tell postgres to look in > _multiple places_ for extension .sql scripts and control files. This > is presently possible for modules (dynamic libraries, .so / .dylib / > .dll) but without a way to also configure the path for extensions it's > of very limited utility. > Agree. > So IMO this should be a _path_ to search for extension control files > and SQL scripts. > I like this. I also prefer the name `extension_search_path`. For safety, it might make sense to impose the restriction that if an > extension control file is found in a given directory, SQL scripts will > also only be looked for in that same directory. That way there's no > chance of accidentally mixing and matching SQL scripts from different > versions of an extension if it appears twice on the extension search > path in different places. The rule for loading SQL scripts would be: > > * locate first directory on path contianing matching extension control file > * use this directory as the extension directory for all subsequent SQL > script loading and running actions > It could work, but it requires some prototyping and exploration. I'm willing to participate and use CloudNativePG as a test bed. Cheers, Gabriele -- Gabriele Bartolini Vice President, Cloud Native at EDB enterprisedb.com
Re: RFC: Additional Directory for Extensions
On Thu, 22 Aug 2024 at 01:08, Craig Ringer wrote: > SET extension_search_path = $extsdir, > /mnt/extensions/pg16/postgis-vX.Y/extensions, > /mnt/extensions/pg16/gosuperfast/extensions; It looks like you want one directory per extension, so that list would get pretty long if you have multiple extensions. Maybe (as a follow up change), we should start to support a * as a wildcard in both of these GUCs. So you could say: SET extension_search_path = /mnt/extensions/pg16/* To mean effectively the same as you're describing above.
Re: RFC: Additional Directory for Extensions
On Thu, 22 Aug 2024 at 08:00, Gabriele Bartolini wrote: > > Hi everyone, > > Apologies for only starting to look into this now. Thanks, David, for pushing > this forward. 100%. I've wanted this for some time but never had time to cook up a patch. > I want to emphasize the importance of this patch for the broader adoption of > extensions in immutable container environments, such as those used by the > CloudNativePG operator in Kubernetes. It's also very relevant for local development and testing. Right now postgres makes it impossible to locally compile and install an extension for a distro-packaged postgres (whether from upstream repos or PGDG repos) without dirtying the distro-managed filesystem subtrees with local changes under /usr etc, because it cannot be configured to look for locally installed extensions on non-default paths. > To provide some context, one of the key principles of CloudNativePG is that > containers, once started, cannot be modified—this includes the installation > of Postgres extensions and their libraries. This restriction prevents us from > adding extensions on the fly, requiring them to be included in the main > PostgreSQL operand image. As a result, users who need specific extensions > must build custom images through automated pipelines (see: > https://cloudnative-pg.io/blog/creating-container-images/). It may be possible to weaken this restriction somewhat thanks to the upcoming https://kubernetes.io/blog/2024/08/16/kubernetes-1-31-image-volume-source/ feature that permits additional OCI images to be mounted as read-only volumes on a workload. This would still only permit mounting at Pod-creation time, not runtime mounting and unmonuting, but means the base postgres image could be supplemented by mounting additional images for extensions. For example, one might mount image "postgis-vX.Y.Z" image onto base image "postgresql-16" if support for PostGIS is desired, without then having to bake every possible extension anyone might ever want into the base image. This solves all sorts of messy issues with upgrades and new version releases. But for it to work, it must be possible to tell postgres to look in _multiple places_ for extension .sql scripts and control files. This is presently possible for modules (dynamic libraries, .so / .dylib / .dll) but without a way to also configure the path for extensions it's of very limited utility. > We’ve been considering ways to improve this process for some time. The > direction we're exploring involves mounting an ephemeral volume that contains > the necessary extensions (namely $sharedir and $pkglibdir from pg_config). > These volumes would be created and populated with the required extensions > when the container starts and destroyed when it shuts down. To make this > work, each extension must be independently packaged as a container image > containing the appropriate files for a specific extension version, tailored > to the architecture, distribution, OS version, and Postgres version. Right. And there might be more than one of them. So IMO this should be a _path_ to search for extension control files and SQL scripts. If the current built-in default extension dir was exposed as a var $extdir like we do for $libdir, this might look something like this for local development and testing while working with a packaged postgres build: SET extension_search_path = $extsdir, /opt/myapp/extensions, /usr/local/postgres/my-custom-extension/extensions; SET dynamic_library_path = $libdir, /opt/myapp/lib, /usr/local/postgres/my-custom-extension/lib or in the container extensions case, something like: SET extension_search_path = $extsdir, /mnt/extensions/pg16/postgis-vX.Y/extensions, /mnt/extensions/pg16/gosuperfast/extensions; SET dynamic_library_path = $libdir, /mnt/extensions/pg16/postgis-vX.Y/lib, /mnt/extensions/pg16/gosuperfast/lib; For safety, it might make sense to impose the restriction that if an extension control file is found in a given directory, SQL scripts will also only be looked for in that same directory. That way there's no chance of accidentally mixing and matching SQL scripts from different versions of an extension if it appears twice on the extension search path in different places. The rule for loading SQL scripts would be: * locate first directory on path contianing matching extension control file * use this directory as the extension directory for all subsequent SQL script loading and running actions -- Craig Ringer EnterpriseDB
Re: RFC: Additional Directory for Extensions
On Tue, Jun 25, 2024 at 12:12:33PM +0200, Alvaro Herrera wrote: > archive_command and so on: we could disable these too. Nathan did some > work to implement those using dynamic libraries, so it shouldn't be too > much of a loss; anything that is done with a shell script can also be > done with a small library. Those libraries can be made safe. > If there are other ways to invoke shell commands from GUCs, let's add > the ability to use libraries for those too. Sorry, I just noticed this message. I recently withdrew my patch set [0] for using a library instead of shell commands for restore_command, archive_cleanup_command, and recovery_end_command, as it had sat idle for a very long time. If/when there's interest, I'd be happy to pick it up again. [0] https://postgr.es/m/ZkwLqichtySV5kF3%40nathan-air.lan -- nathan
Re: RFC: Additional Directory for Extensions
Hi everyone, Apologies for only starting to look into this now. Thanks, David, for pushing this forward. I want to emphasize the importance of this patch for the broader adoption of extensions in immutable container environments, such as those used by the CloudNativePG operator in Kubernetes. To provide some context, one of the key principles of CloudNativePG is that containers, once started, cannot be modified—this includes the installation of Postgres extensions and their libraries. This restriction prevents us from adding extensions on the fly, requiring them to be included in the main PostgreSQL operand image. As a result, users who need specific extensions must build custom images through automated pipelines (see: https://cloudnative-pg.io/blog/creating-container-images/). We’ve been considering ways to improve this process for some time. The direction we're exploring involves mounting an ephemeral volume that contains the necessary extensions (namely $sharedir and $pkglibdir from pg_config). These volumes would be created and populated with the required extensions when the container starts and destroyed when it shuts down. To make this work, each extension must be independently packaged as a container image containing the appropriate files for a specific extension version, tailored to the architecture, distribution, OS version, and Postgres version. I’m committed to thoroughly reviewing this patch, testing it with CloudNativePG and a few extensions, and providing feedback as soon as possible. Best, Gabriele On Mon, 8 Jul 2024 at 18:02, David E. Wheeler wrote: > On Jun 25, 2024, at 18:31, David E. Wheeler wrote: > > > For those who prefer a GitHub patch review experience, see this PR: > > > > https://github.com/theory/postgres/pull/3/files > > Rebased and restored PGC_SUSET in the attached v5 patch, plus noted the > required privileges in the docs. > > Best, > > David > > > > -- Gabriele Bartolini Vice President, Cloud Native at EDB enterprisedb.com
Re: RFC: Additional Directory for Extensions
On Jun 25, 2024, at 18:31, David E. Wheeler wrote: > For those who prefer a GitHub patch review experience, see this PR: > > https://github.com/theory/postgres/pull/3/files Rebased and restored PGC_SUSET in the attached v5 patch, plus noted the required privileges in the docs. Best, David v5-0001-Add-extension_destdir-GUC.patch Description: Binary data
Re: RFC: Additional Directory for Extensions
On Tue, 25 Jun 2024 at 19:33, David E. Wheeler wrote: > > On Jun 24, 2024, at 5:32 PM, Jelte Fennema-Nio wrote: > > > Still, for the sake of completeness it might make sense to support > > this whole list in extension_destdir. (assuming it's easy to do) > > It should be with the current patch, which just uses a prefix to paths in > `pg_config`. Ah alright, I think it confused me because I never saw bindir being used. But as it turns out the current backend code never uses bindir. So that makes sense. I guess to actually use the binaries from the extension_destdir/$BINDIR the operator needs to set PATH accordingly, or the extension needs to be changed to support extension_destdir. It might be nice to add a helper function to find binaries in BINDIR, now that the resolution logic is more complex. Even if postgres itself doesn't use it. That would make it easier for extensions to be modified to support extension_distdir. Something like find_bindir_executable(char *name)
Re: RFC: Additional Directory for Extensions
On Wed, 26 Jun 2024 at 00:32, David E. Wheeler wrote: > In other news, here’s an updated patch that expands the documentation to > record that the destination directory is a prefix, and full paths should be > used under it. Also take the opportunity to document the PGXS DESTDIR > variable as the thing to use to install files under the destination directory. Docs are much clearer now thanks. full = substitute_libpath_macro(name); + /* +* If extension_destdir is set, try to find the file there first +*/ + if (*extension_destdir != '\0') + { + full2 = psprintf("%s%s", extension_destdir, full); + if (pg_file_exists(full2)) + { + pfree(full); + return full2; + } + pfree(full2); + } I think this should be done differently. For two reasons: 1. I don't think extension_destdir should be searched when $libdir is not part of the name. 2. find_in_dynamic_libpath currently doesn't use extension_destdir at all, so if there is no slash in the filename we do not search extension_destdir. I feel like changing the substitute_libpath_macro function a bit (or adding a new similar function) is probably the best way to achieve that. We should also check somewhere (probably GUC check hook) that extension_destdir is an absolute path. > It still requires a server restart; When reading the code I see no reason why this cannot be PGC_SIGHUP. Even though it's probably not needed to change on a running server, I think it's better to allow that. Even just so people can disable it if necessary for some reason without restarting the process. > I can change it back to superuser-only if that’s the consensus. It still is GUC_SUPERUSER_ONLY, right? > For those who prefer a GitHub patch review experience, see this PR: > > https://github.com/theory/postgres/pull/3/files Sidenote: The "D" link for each patch on cfbot[1] now gives a similar link for all commitfest entries[2]. [1]: http://cfbot.cputube.org/ [2]: https://github.com/postgresql-cfbot/postgresql/compare/cf/4913~1...cf/4913
Re: RFC: Additional Directory for Extensions
On Jun 25, 2024, at 10:43 AM, Robert Haas wrote: > If we want to work on making the sorts of changes that > you're proposing, let's do it on a separate thread. It's not going to > be meaningfully harder to move in that direction after some patch like > this than it is today. I appreciate this separation of concerns, Robert. In other news, here’s an updated patch that expands the documentation to record that the destination directory is a prefix, and full paths should be used under it. Also take the opportunity to document the PGXS DESTDIR variable as the thing to use to install files under the destination directory. It still requires a server restart; I can change it back to superuser-only if that’s the consensus. For those who prefer a GitHub patch review experience, see this PR: https://github.com/theory/postgres/pull/3/files Best, David v4-0001-Add-extension_destdir-GUC.patch Description: Binary data
Re: RFC: Additional Directory for Extensions
On Jun 24, 2024, at 5:32 PM, Jelte Fennema-Nio wrote: > Still, for the sake of completeness it might make sense to support > this whole list in extension_destdir. (assuming it's easy to do) It should be with the current patch, which just uses a prefix to paths in `pg_config`. So if SHAREDIR is set to /usr/share/postgresql/16 and extension_destdir is set to /mount/ext, then Postgres will look for files in /mount/ext/usr/share/postgresql/16. The same rule applies (or should apply) for all other pg_config directory configs and where the postmaster looks for specific files. And PGXS already supports installing files in these locations, thanks to its DESTDIR param. (I don’t know how it works on Windows, though.) That said, this is very much a pattern designed for RPM and Debian package management patterns, and not for actually installing and managing extensions. And maybe that’s fine for now, as it can still be used to address the immutability problems descried in the original post in this thread. Ultimately, I’d like to figure out a way to more tidily organize installed extension files, but I think that, too, might be a separate thread. Best, David
Re: RFC: Additional Directory for Extensions
On Tue, Jun 25, 2024 at 6:12 AM Alvaro Herrera wrote: > Now, I'm not saying that this is an easy journey. But if we don't > start, we're not going to get there. I actually kind of agree with you. I think I've said something similar in a previous email to the list somewhere. But I don't agree that this patch should be burdened with taking the first step. We seem to often find reasons why patches that packagers for prominent distributions are carrying shouldn't be put into core, and I think that's a bad habit. They're not going to stop applying those packages because we refuse to put suitable functionality in core; we're just creating a situation where lots of people are running slightly patched versions of PostgreSQL instead of straight-up PostgreSQL. That's not improving anything. If we want to work on making the sorts of changes that you're proposing, let's do it on a separate thread. It's not going to be meaningfully harder to move in that direction after some patch like this than it is today. -- Robert Haas EDB: http://www.enterprisedb.com
Re: RFC: Additional Directory for Extensions
On Tue, 25 Jun 2024 at 12:12, Alvaro Herrera wrote: > They can mutilate the system catalogs: yes, they can TRUNCATE pg_type. > So what? They've just destroyed their own ability to do anything else. > The real issue here is that they can edit pg_proc to cause SQL function > calls to call arbitrary code. But what if we limit functions so that > the C code that they can call is located in specific places that are > known to only contain secure code? This is easy: make sure the > OS-installation only contains safe code in $libdir. I wouldn't call it "easy" but I totally agree that changing pg_proc is the main security issue that we have no easy way to tackle. > I hear you say: ah, but they can modify dynamic_library_path, which is a > GUC, to load code from anywhere -- especially /tmp, where the newest > bitcoin-mining library was just written. This is true. I suggest, to > solve this problem, that we should make dynamic_library_path no longer a > GUC. It should be a setting that comes from a different origin, one > that even superuser cannot write to. Only the OS-installation can > modify that file; that way, superuser cannot load arbitrary code that > way. I don't think that needs a whole new file. Making this GUC be PGC_SIGHUP/PGC_POSTMASTER + GUC_DISALLOW_IN_AUTO_FILE should be enough. Just like was done for the new allow_alter_system GUC in PG17. > This is where the new GUC setting being proposed in this thread rubs me > the wrong way: it's adding yet another avenue for this to be exploited. > I would like this new directory not to be a GUC either, just like > dynamic_library_path. We can make it PGC_SIGHUP/PGC_POSTMASTER + GUC_DISALLOW_IN_AUTO_FILE too, either now or in the future. > Now, I'm not saying that this is an easy journey. But if we don't > start, we're not going to get there. Sure, but it sounds like you're suggesting that you want to "start" by not adding new features that have equivalent security holes as the ones that we already have. I don't think that is a very helpful way to get to a better place. It seems much more useful to tackle the current problems that we have first, and almost certainly the same solutions to those problems can be applied to any new features with security issues. It at least definitely seems the case for the proposal in this thread: i.e. we already have a GUC that allows loading libraries from an arbitrary location. This proposal adds another such GUC. If we solve the security problem in that first GUC, either by GUC_DISALLOW_IN_AUTO_FILE, or by creating a whole new mechanism for the setting, then I see no reason why we cannot use that exact same solution for the newly proposed GUC. So the required work to secure postgres will not be meaningfully harder by adding this GUC.
Re: RFC: Additional Directory for Extensions
On 2024-Jun-24, Robert Haas wrote: > Is "tighten up what the superuser can do" on our list of objectives? > Personally, I think we should be focusing mostly, and maybe entirely, > on letting non-superusers do more things, with appropriate security > controls. The superuser can ultimately do anything, since they can > cause shell commands to be run and load arbitrary code into the > backend and write code in untrusted procedural languages and mutilate > the system catalogs and lots of other terrible things. I don't agree that we should focus _solely_ on allowing non-superusers to do more things. Sure, it's a good thing to do -- but we shouldn't completely close the option of securing superuser itself. I think it's not completely impossible to have a future where superuser is just so within the database, i.e. that it can't escape to the operating system. I'm sure that would be useful in many environments. On this list, many people frequently make the argument that it is impossible to secure, but I'm not convinced. They can mutilate the system catalogs: yes, they can TRUNCATE pg_type. So what? They've just destroyed their own ability to do anything else. The real issue here is that they can edit pg_proc to cause SQL function calls to call arbitrary code. But what if we limit functions so that the C code that they can call is located in specific places that are known to only contain secure code? This is easy: make sure the OS-installation only contains safe code in $libdir. I hear you say: ah, but they can modify dynamic_library_path, which is a GUC, to load code from anywhere -- especially /tmp, where the newest bitcoin-mining library was just written. This is true. I suggest, to solve this problem, that we should make dynamic_library_path no longer a GUC. It should be a setting that comes from a different origin, one that even superuser cannot write to. Only the OS-installation can modify that file; that way, superuser cannot load arbitrary code that way. This is where the new GUC setting being proposed in this thread rubs me the wrong way: it's adding yet another avenue for this to be exploited. I would like this new directory not to be a GUC either, just like dynamic_library_path. I hear you argue: ah, but they can use COPY to write a new file to $libdir. Yes, they can, and I think that's foolish. We could have another non-GUC setting which takes a list of directories where COPY can write files into. Problem solved. Do people really need the ability to write files on arbitrary locations? Untrusted extensions: well, just don't have those in the OS-installation and you'll be fine. I'm okay with saying that a superuser-restricted system is incompatible with plpython. archive_command and so on: we could disable these too. Nathan did some work to implement those using dynamic libraries, so it shouldn't be too much of a loss; anything that is done with a shell script can also be done with a small library. Those libraries can be made safe. If there are other ways to invoke shell commands from GUCs, let's add the ability to use libraries for those too. What other exploits do we know about? How can we close them? Now, I'm not saying that this is an easy journey. But if we don't start, we're not going to get there. -- Álvaro HerreraBreisgau, Deutschland — https://www.EnterpriseDB.com/ "Doing what he did amounts to sticking his fingers under the hood of the implementation; if he gets his fingers burnt, it's his problem." (Tom Lane)
Re: RFC: Additional Directory for Extensions
Re: Nathan Bossart > At first glance, the general idea seems reasonable to me. I'm wondering > whether there is a requirement for this directory to be prepended or if it > could be appended to the end. That way, the existing ones would take > priority, which might be desirable from a security standpoint. My use case for this is to test things at compile time (where I can't write to /usr/share/postgresql/). If installed things would take priority over the things that I'm trying to test, I'd be disappointed. Christoph
Re: RFC: Additional Directory for Extensions
On Mon, 24 Jun 2024 at 22:42, David E. Wheeler wrote: > >> BINDIR > >> DOCDIR > >> HTMLDIR > >> PKGINCLUDEDIR > >> LOCALEDIR > >> MANDIR > >> > >> I can imagine an extension wanting or needing to use any and all of these. > > > > Are these really all relevant to backend code? > > Oh I think so. Especially BINDIR; lots of extensions ship with binary > applications. And most ship with docs, too (PGXS puts items listed in DOCS > into DOCDIR). Some might also produce man pages (for their binaries), HTML > docs, and other stuff. Maybe an FTE extension would include locale files? > > I find it pretty easy to imagine use cases for all of them. So much so that I > wrote an extension binary distribution RFC[1] and its POC[2] around them. Definitely agreed on BINDIR needing to be supported. And while lots of extensions ship with docs, I expect this feature to mostly be used in production environments to make deploying extensions easier. And I'm not sure that many people care about deploying docs to production (honestly lots of people would probably want to strip them). Still, for the sake of completeness it might make sense to support this whole list in extension_destdir. (assuming it's easy to do)
Re: RFC: Additional Directory for Extensions
On Jun 24, 2024, at 17:17, Jelte Fennema-Nio wrote: > If you want to only change $libdir during CREATE EXTENSION (or ALTER > EXTENSION UPDATE), then why not just change it there. And really you'd > only want to change it when creating an extension from which the > control file is coming from extension_destdir. IIUC, the postmaster needs to load an extension on first use in every session unless it’s in shared_preload_libraries. > However, I can also see a case for really always changing $libdir. > Because some extensions in shared_preload_libraries, might want to > trigger loading other libraries that they ship with dynamically. And > these libraries are probably also in extension_destdir. Right, it can be more than just the DSOs for the extension itself. Best, David
Re: RFC: Additional Directory for Extensions
On Mon, 24 Jun 2024 at 18:11, Nathan Bossart wrote: > At first glance, the general idea seems reasonable to me. I'm wondering > whether there is a requirement for this directory to be prepended or if it > could be appended to the end. That way, the existing ones would take > priority, which might be desirable from a security standpoint. Citus does ship with some override library for pgoutput to make logical replication/CDC work correctly with sharded tables. Right now using this override library requires changing dynamic_library_path. It would be nice if that wasn't necessary. But this is obviously a small thing. And I definitely agree that there's a security angle to this as well, but honestly that seems rather small too. If an attacker can put shared libraries into the extension_destdir, I'm pretty sure you've lost already, no matter if extension_destdir is prepended or appended to the existing $libdir.
Re: RFC: Additional Directory for Extensions
On Thu, 11 Apr 2024 at 19:52, David E. Wheeler wrote: > I realize this probably isn’t going to happen for 17, given the freeze, but I > would very much welcome feedback and pointers to address concerns about > providing a second directory for extensions and DSOs. Quite a few people have > talked about the need for this in the Extension Mini Summits[1], so I’m sure > I could get some collaborators to make improvements or look at a different > approach. Overall +1 for the idea. We're running into this same limitation (only a single place to put extension files) at Microsoft at the moment. +and to the '$libdir' directive when loading modules +that back functions. I feel like this is a bit strange. Either its impact is too wide, or it's not wide enough depending on your intent. If you want to only change $libdir during CREATE EXTENSION (or ALTER EXTENSION UPDATE), then why not just change it there. And really you'd only want to change it when creating an extension from which the control file is coming from extension_destdir. However, I can also see a case for really always changing $libdir. Because some extensions in shared_preload_libraries, might want to trigger loading other libraries that they ship with dynamically. And these libraries are probably also in extension_destdir.
Re: RFC: Additional Directory for Extensions
On Jun 24, 2024, at 4:28 PM, Robert Haas wrote: > As long as the GUC is superuser-only, I'm not sure what else there is > to do here. The only question is whether there's some reason to > disallow this even from the superuser, but I'm not quite seeing such a > reason. I can switch it back from requiring a restart to allowing a superuser to set it. >> I sketched them quickly, so agree they can be better. Reading the code, I >> now see that it appears to be the former case. I’d like to advocate for the >> latter. > > Sounds good. Yeah, though then I have a harder time deciding how it should work. pg_config’s paths are absolute. With your first example, we just use them exactly as they are, but prefix them with the destination directory. So if it’s set to `/my/temp/root/`, then files go into /my/temp/root/$(pg_conifg --sharedir) /my/temp/root/$(pg_conifg --pkglibdir) /my/temp/root/$(pg_conifg --bindir) # etc. Which is exactly how RPM and Apt packages are built, but seems like an odd configuration for general use. >> BINDIR >> DOCDIR >> HTMLDIR >> PKGINCLUDEDIR >> LOCALEDIR >> MANDIR >> >> I can imagine an extension wanting or needing to use any and all of these. > > Are these really all relevant to backend code? Oh I think so. Especially BINDIR; lots of extensions ship with binary applications. And most ship with docs, too (PGXS puts items listed in DOCS into DOCDIR). Some might also produce man pages (for their binaries), HTML docs, and other stuff. Maybe an FTE extension would include locale files? I find it pretty easy to imagine use cases for all of them. So much so that I wrote an extension binary distribution RFC[1] and its POC[2] around them. Best, David [1]: https://github.com/orgs/pgxn/discussions/2 [1]: https://justatheory.com/2024/06/trunk-poc/
Re: RFC: Additional Directory for Extensions
On Mon, Jun 24, 2024 at 3:37 PM David E. Wheeler wrote: > I guess the question then is what security controls are appropriate for this > feature, which after all tells the postmaster what directories to read files > from. It feels a little outside the scope of a regular user to even be aware > of the file system undergirding the service. But perhaps there’s a > non-superuser role for whom it is appropriate? As long as the GUC is superuser-only, I'm not sure what else there is to do here. The only question is whether there's some reason to disallow this even from the superuser, but I'm not quite seeing such a reason. > > On the patch itself, I find the documentation for this to be fairly > > hard to understand. I think it could benefit from an example. I'm > > confused about whether this is intended to let me search for > > extensions in /my/temp/root/usr/lib/postgresql/... by setting > > extension_directory=/my/temp/dir, or whether it's intended me to > > search both /usr/lib/postgresql as I normally would and also > > /some/other/place. > > I sketched them quickly, so agree they can be better. Reading the code, I now > see that it appears to be the former case. I’d like to advocate for the > latter. Sounds good. > > If the latter, I wonder why we don't handle shared > > libraries by setting dynamic_library_path and then just have an > > analogue of that for control files. > > The challenge is that it applies not just to shared object libraries and > control files, but also extension SQL files and any other SHAREDIR files an > extension might include. But also, I think it should support all the > pg_config installation targets that extensions might use, including: > > BINDIR > DOCDIR > HTMLDIR > PKGINCLUDEDIR > LOCALEDIR > MANDIR > > I can imagine an extension wanting or needing to use any and all of these. Are these really all relevant to backend code? -- Robert Haas EDB: http://www.enterprisedb.com
Re: RFC: Additional Directory for Extensions
On Jun 24, 2024, at 1:53 PM, Robert Haas wrote: > Is "tighten up what the superuser can do" on our list of objectives? > Personally, I think we should be focusing mostly, and maybe entirely, > on letting non-superusers do more things, with appropriate security > controls. The superuser can ultimately do anything, since they can > cause shell commands to be run and load arbitrary code into the > backend and write code in untrusted procedural languages and mutilate > the system catalogs and lots of other terrible things. I guess the question then is what security controls are appropriate for this feature, which after all tells the postmaster what directories to read files from. It feels a little outside the scope of a regular user to even be aware of the file system undergirding the service. But perhaps there’s a non-superuser role for whom it is appropriate? > Now, I think there are environments where people have used things like > containers to try to lock down the superuser, and while I'm not sure > that can ever be particularly water-tight, if it were the case that > this patch would make it a whole lot easier for a superuser to bypass > the kinds of controls that people are imposing today, that might be an > argument against this patch. But ... off-hand, I'm not seeing such an > exposure. Yeah I’m not even sure I follow. Containers are immutable, other than mutable mounted volumes --- which is one use case this patch is attempting to enable. > On the patch itself, I find the documentation for this to be fairly > hard to understand. I think it could benefit from an example. I'm > confused about whether this is intended to let me search for > extensions in /my/temp/root/usr/lib/postgresql/... by setting > extension_directory=/my/temp/dir, or whether it's intended me to > search both /usr/lib/postgresql as I normally would and also > /some/other/place. I sketched them quickly, so agree they can be better. Reading the code, I now see that it appears to be the former case. I’d like to advocate for the latter. > If the latter, I wonder why we don't handle shared > libraries by setting dynamic_library_path and then just have an > analogue of that for control files. The challenge is that it applies not just to shared object libraries and control files, but also extension SQL files and any other SHAREDIR files an extension might include. But also, I think it should support all the pg_config installation targets that extensions might use, including: BINDIR DOCDIR HTMLDIR PKGINCLUDEDIR LOCALEDIR MANDIR I can imagine an extension wanting or needing to use any and all of these. Best, David
Re: RFC: Additional Directory for Extensions
On Wed, Apr 3, 2024 at 3:13 AM Alvaro Herrera wrote: > I support the idea of there being a second location from where to load > shared libraries ... but I don't like the idea of making it > runtime-configurable. If we want to continue to tighten up what > superuser can do, then one of the things that has to go away is the > ability to load shared libraries from arbitrary locations > (dynamic_library_path). I think we should instead look at making those > locations hardcoded at compile time. The packager can then decide where > those things go, and the superuser no longer has the ability to load > arbitrary code from arbitrary locations. Is "tighten up what the superuser can do" on our list of objectives? Personally, I think we should be focusing mostly, and maybe entirely, on letting non-superusers do more things, with appropriate security controls. The superuser can ultimately do anything, since they can cause shell commands to be run and load arbitrary code into the backend and write code in untrusted procedural languages and mutilate the system catalogs and lots of other terrible things. Now, I think there are environments where people have used things like containers to try to lock down the superuser, and while I'm not sure that can ever be particularly water-tight, if it were the case that this patch would make it a whole lot easier for a superuser to bypass the kinds of controls that people are imposing today, that might be an argument against this patch. But ... off-hand, I'm not seeing such an exposure. On the patch itself, I find the documentation for this to be fairly hard to understand. I think it could benefit from an example. I'm confused about whether this is intended to let me search for extensions in /my/temp/root/usr/lib/postgresql/... by setting extension_directory=/my/temp/dir, or whether it's intended me to search both /usr/lib/postgresql as I normally would and also /some/other/place. If the latter, I wonder why we don't handle shared libraries by setting dynamic_library_path and then just have an analogue of that for control files. -- Robert Haas EDB: http://www.enterprisedb.com
Re: RFC: Additional Directory for Extensions
On Thu, Apr 11, 2024 at 01:52:26PM -0400, David E. Wheeler wrote: > I realize this probably isn´t going to happen for 17, given the freeze, > but I would very much welcome feedback and pointers to address concerns > about providing a second directory for extensions and DSOs. Quite a few > people have talked about the need for this in the Extension Mini > Summits[1], so I´m sure I could get some collaborators to make > improvements or look at a different approach. At first glance, the general idea seems reasonable to me. I'm wondering whether there is a requirement for this directory to be prepended or if it could be appended to the end. That way, the existing ones would take priority, which might be desirable from a security standpoint. -- nathan
Re: RFC: Additional Directory for Extensions
On Apr 4, 2024, at 1:20 PM, David E. Wheeler wrote: > I’ve added some docs based on your GUC description; updated patch attached. Here’s a rebase. I realize this probably isn’t going to happen for 17, given the freeze, but I would very much welcome feedback and pointers to address concerns about providing a second directory for extensions and DSOs. Quite a few people have talked about the need for this in the Extension Mini Summits[1], so I’m sure I could get some collaborators to make improvements or look at a different approach. Best, David [1] https://justatheory.com/2024/02/extension-ecosystem-summit/#extension-ecosystem-mini-summit v3-0001-Add-extension_directory-GUC.patch Description: Binary data
Re: RFC: Additional Directory for Extensions
On Apr 3, 2024, at 12:46 PM, Christoph Berg wrote: > Thanks for bringing this up, I should have submitted this years ago. > (The patch is originally from September 2020.) That’s okay, it’s still 2020 in some ways. 😂 > I designed the patch to require a superuser to set it, so it doesn't > matter very much by which mechanism it gets updated. There should be > little reason to vary it at run-time, so I'd be fine with requiring a > restart, but otoh, why restrict the superuser from reloading it if > they know what they are doing? I think that’s fair. I’ll keep it requiring a restart now, on the theory it would be easier to loosen it later than have to tighten it later. > I'm not sure the concept of "core libraries" exists. PG happens to > dlopen things at run time, and it doesn't know/care if they were > installed by users or by the original PG server. Also, an exploited > libpgtypes library is not worse than any other untrusted "user" > library, so you really don't want to allow users to provide their own > .so files, no matter by what mechanism. Yes, I guess my concern is whether it could be used to “shadow” core libraries. Maybe it’s no different, really. >> This would also allow the lookup of extension libraries prefixed by the >> directory field from the control file, which would enable much tidier >> extension installation: The control file, SQL scripts, and DSOs could all be >> in a single directory for an extension. > > Nice idea, but that would mean installing .so files into PGSHAREDIR. > Perhaps the whole extension stuff would have to move to PKGLIBDIR > instead. Yes, I was just poking around the code, and realized that, when extension functions are created they may or may not not use `MODULE_PATHNAME`, but in any event, there is nothing different about loading an extension DSO than any other DSO. I was hoping to find a path where it knows it’s opening a DSO for the purpose of an extension, so we could limit the lookup there. But that does not (currently) exist. Maybe we could add an `$extensiondir` variable to complement `$libdir`? Or is PGKLIBDIR is the way to go? I’m not familiar with it. It looks like extension JIT files are put there already. > Fwiw, I wrote this patch to solve the problem of testing extensions at > build-time where the build process does not have write access to > PGSHAREDIR. It solves that problem quite well, almost all PG > extensions have build-time test coverage now (where there was > basically 0 before). Yeah, good additional use case. On Apr 3, 2024, at 1:03 PM, Christoph Berg wrote: > Also, it's called extension "destdir" because it behaves like DESTDIR > in Makefiles: It prepends the given path to the path that PG is trying > to open when set. So it doesn't allow arbitrary new locations as of > now, just /home/build/foo-1/debian/foo/usr/share/postgresql/17/extension > in addition to /usr/share/postgresql/17/extension. (That is what the > Debian package build process needs, so that restriction/design choice > made sense. Right, this makes perfect sense, in that you don’t have to copy all the extension files from the destdir to the SHAREDIR to test them, which I imagine could be a PITA. > That's also included in the current GUC description: > > This directory is prepended to paths when loading extensions > (control and SQL files), and to the '$libdir' directive when > loading modules that back functions. The location is made > configurable to allow build-time testing of extensions that do not > have been installed to their proper location yet. > > Perhaps I should have included a more verbose "NOT FOR PRODUCTION" > there. The use cases I described upthread are very much production use cases. Do you think it’s not for production just because we need to really think it through? I’ve added some docs based on your GUC description; updated patch attached. Best, David v2-0001-Add-extension_directory-GUC.patch Description: Binary data
Re: RFC: Additional Directory for Extensions
> Fwiw, I wrote this patch to solve the problem of testing extensions at > build-time where the build process does not have write access to > PGSHAREDIR. It solves that problem quite well, almost all PG > extensions have build-time test coverage now (where there was > basically 0 before). Also, it's called extension "destdir" because it behaves like DESTDIR in Makefiles: It prepends the given path to the path that PG is trying to open when set. So it doesn't allow arbitrary new locations as of now, just /home/build/foo-1/debian/foo/usr/share/postgresql/17/extension in addition to /usr/share/postgresql/17/extension. (That is what the Debian package build process needs, so that restriction/design choice made sense.) > Security is not a concern at this point as everything is running as > the same user, and the test cluster will be wiped right after the > test. I figured marking the setting as "super user" only was enough > security at that point, but I would recommend another audit before > using it together with "trusted" extensions and other things in > production. That's also included in the current GUC description: This directory is prepended to paths when loading extensions (control and SQL files), and to the '$libdir' directive when loading modules that back functions. The location is made configurable to allow build-time testing of extensions that do not have been installed to their proper location yet. Perhaps I should have included a more verbose "NOT FOR PRODUCTION" there. As for compatibility, the patch has been part of the PG 9.5..17 now for several years, and I'm very happy with extra test coverage it provides, especially on the Debian architectures that don't have "autopkgtest" runners yet. Christoph
Re: RFC: Additional Directory for Extensions
Re: David E. Wheeler > > Yes, I like the suggestion to make it require a restart, which lets the > > sysadmin control it and not limited to whatever the person who compiled it > > thought would make sense. > > Here’s a revision of the Debian patch that requires a server start. Thanks for bringing this up, I should have submitted this years ago. (The patch is originally from September 2020.) I designed the patch to require a superuser to set it, so it doesn't matter very much by which mechanism it gets updated. There should be little reason to vary it at run-time, so I'd be fine with requiring a restart, but otoh, why restrict the superuser from reloading it if they know what they are doing? > However, in studying the patch, it appears that the `extension_directory` is > searched for *all* shared libraries, not just those being loaded for an > extension. Am I reading the `expand_dynamic_library_name()` function right? > > If so, this seems like a good way for a bad actor to muck with things, by > putting an exploited libpgtypes library into the extension directory, where > it would be loaded in preference to the core libpgtypes library, if they > couldn’t exploit the original. > > I’m thinking it would be better to have the dynamic library lookup for > extension libraries (and LOAD libraries?) separate, so that the > `extension_directory` would not be used for core libraries. I'm not sure the concept of "core libraries" exists. PG happens to dlopen things at run time, and it doesn't know/care if they were installed by users or by the original PG server. Also, an exploited libpgtypes library is not worse than any other untrusted "user" library, so you really don't want to allow users to provide their own .so files, no matter by what mechanism. > This would also allow the lookup of extension libraries prefixed by the > directory field from the control file, which would enable much tidier > extension installation: The control file, SQL scripts, and DSOs could all be > in a single directory for an extension. Nice idea, but that would mean installing .so files into PGSHAREDIR. Perhaps the whole extension stuff would have to move to PKGLIBDIR instead. Fwiw, I wrote this patch to solve the problem of testing extensions at build-time where the build process does not have write access to PGSHAREDIR. It solves that problem quite well, almost all PG extensions have build-time test coverage now (where there was basically 0 before). Security is not a concern at this point as everything is running as the same user, and the test cluster will be wiped right after the test. I figured marking the setting as "super user" only was enough security at that point, but I would recommend another audit before using it together with "trusted" extensions and other things in production. Christoph
Re: RFC: Additional Directory for Extensions
On Apr 3, 2024, at 8:54 AM, David E. Wheeler wrote: > Yes, I like the suggestion to make it require a restart, which lets the > sysadmin control it and not limited to whatever the person who compiled it > thought would make sense. Here’s a revision of the Debian patch that requires a server start. However, in studying the patch, it appears that the `extension_directory` is searched for *all* shared libraries, not just those being loaded for an extension. Am I reading the `expand_dynamic_library_name()` function right? If so, this seems like a good way for a bad actor to muck with things, by putting an exploited libpgtypes library into the extension directory, where it would be loaded in preference to the core libpgtypes library, if they couldn’t exploit the original. I’m thinking it would be better to have the dynamic library lookup for extension libraries (and LOAD libraries?) separate, so that the `extension_directory` would not be used for core libraries. This would also allow the lookup of extension libraries prefixed by the directory field from the control file, which would enable much tidier extension installation: The control file, SQL scripts, and DSOs could all be in a single directory for an extension. Thoughts? Best, David v1-0001-Add-extension_directory-GUC.patch Description: Binary data
Re: RFC: Additional Directory for Extensions
On Apr 3, 2024, at 8:54 AM, David E. Wheeler wrote: > Yes, I like the suggestion to make it require a restart, which lets the > sysadmin control it and not limited to whatever the person who compiled it > thought would make sense. Or SIGHUP? D
Re: RFC: Additional Directory for Extensions
On Apr 3, 2024, at 3:57 AM, walt...@technowledgy.de wrote: > I can also imagine that it would be very helpful in a container setup to be > able to set an environment variable with this path instead of having to > recompile all of postgres to change it. Yes, I like the suggestion to make it require a restart, which lets the sysadmin control it and not limited to whatever the person who compiled it thought would make sense. Best, David
Re: RFC: Additional Directory for Extensions
> On 3 Apr 2024, at 09:13, Alvaro Herrera wrote: > > On 2024-Apr-02, David E. Wheeler wrote: > >> That quotation comes from this Debian patch[2] maintained by Christoph >> Berg. I’d like to formally propose integrating this patch into the >> core. And not only because it’s overhead for package maintainers like >> Christoph, but because a number of use cases have emerged since we >> originally discussed something like this back in 2013[3]: > > I support the idea of there being a second location from where to load > shared libraries Agreed, the case made upthread that installing an extension breaks the app signing seems like a compelling reason to do this. The implementation of this need to make sure the directory is properly set up however to avoid similar problems that CVE 2019-10211 showed. -- Daniel Gustafsson
Re: RFC: Additional Directory for Extensions
Alvaro Herrera: I support the idea of there being a second location from where to load shared libraries ... but I don't like the idea of making it runtime-configurable. If we want to continue to tighten up what superuser can do, then one of the things that has to go away is the ability to load shared libraries from arbitrary locations (dynamic_library_path). I think we should instead look at making those locations hardcoded at compile time. The packager can then decide where those things go, and the superuser no longer has the ability to load arbitrary code from arbitrary locations. The use-case for runtime configuration of this seems to be build-time testing of extensions against an already installed server. For this purpose it should be enough to be able to set this directory at startup - it doesn't need to be changed while the server is actually running. Then you could spin up a temporary postgres instance with the extension directory pointing a the build directory and test. Would startup-configurable be better than runtime-configurable regarding your concerns? I can also imagine that it would be very helpful in a container setup to be able to set an environment variable with this path instead of having to recompile all of postgres to change it. Best, Wolfgang
Re: RFC: Additional Directory for Extensions
On 2024-Apr-02, David E. Wheeler wrote: > That quotation comes from this Debian patch[2] maintained by Christoph > Berg. I’d like to formally propose integrating this patch into the > core. And not only because it’s overhead for package maintainers like > Christoph, but because a number of use cases have emerged since we > originally discussed something like this back in 2013[3]: I support the idea of there being a second location from where to load shared libraries ... but I don't like the idea of making it runtime-configurable. If we want to continue to tighten up what superuser can do, then one of the things that has to go away is the ability to load shared libraries from arbitrary locations (dynamic_library_path). I think we should instead look at making those locations hardcoded at compile time. The packager can then decide where those things go, and the superuser no longer has the ability to load arbitrary code from arbitrary locations. -- Álvaro Herrera 48°01'N 7°57'E — https://www.EnterpriseDB.com/ Al principio era UNIX, y UNIX habló y dijo: "Hello world\n". No dijo "Hello New Jersey\n", ni "Hello USA\n".