Hi, Thanks for the review, that's quite one! :)
I'm not sure to follow you all along, it seems like the reading is "try it first then understand and comment again", so sometimes I'm not sure if you're saying that docs are missing the point or that the behaviour ain't right. "David E. Wheeler" <da...@kineticode.com> writes: > `make installcheck` fails (regression.diffs attached). You forgot the attachment. I'd add mine if I didn't get "All 124 tests passed." result for the command :) (Note that I've been having problems too, but can't reproduce them easily, and they might have been related to forgotten maintainer-clean steps). > I see tests for pg_execute_sql_string() and replace(), but absolutely > nothing else. Such an important core feature really should have a > pretty comprehensive suite of tests exercising all the > functionality. Relying on the contrib extension tests won't do it, as > they exercise only a subset of the functionality (e.g., no > custom_variable_classes), and then only as a side-effect of their > actual purpose. Well, the exposed functionality aren't much (2 SQL commands and some support functions) and you need to have a contrib at hand to exercise them. So I'm not sure about how to add in regression tests for contrib infrastructure work and have them depend only on -core. I'll take ideas. The other thing is that the very important feature here is dump and restore of a database containing extensions. You're not reporting about it, though. And I'm not sure about how to check for pg_dump content from pg_regress oriented tests. > Yes. Much of the execution is superuser-only, so we need to make sure > that such is actually the case (unit tests would help with that). How do you mean? Trying to subvert the features by being able to run them as a non-superuser? How do you install tests around that? > Notes on the documentation > -------------------------- > > The pg_extension catalog docs are a little thin, but probably > sufficient. They appear to be duplicated, though, with the entries for > catalog-pg-extension and view-pg-extensions looking identical. One is > apparently a list of installed extensions, and the other an SRF that > lists installed and available extensions. But I find the duplication a > bit confusing. (Might be easer if I wasn't reading SGML though.) Well yes the pg_extension catalog is filled at CREATE EXTENSION time so that we have an OID to register dependencies on. Then I added a SRF to also list available extensions so that it's easy to integrate the feature into pgadmin and the like. Now I'm open to suggestions as far as the names are involved, or maybe we should keep the SRF but remove the view. psql could easily enough use the function directly I think. > I don't understand this paragraph at all: > This paragraph isn't very clear, either: > Examples might help. Well, I think native English skills are required here. And I'm not too sure about the example. > Control file keys: > > * comment -- Should mention that it's used for CREATE COMMENT ON EXTENSION, > no? > * version -- If it's free-form, how will you do dependency-checking? Or will > you? > * script > * encoding -- Seems unnecessary; one can explicitly set it in the .sql file, > no? > * custom_variable_classes Doc says: http://pgsql.tapoueh.org/extensions/doc/html/extend-extension.html This control file allows for registering the extension and should provide values for the keys described below. Then there's the detail about fields. I think the context is enough to understand, but I'd be happy to integrate some revisions of the current content. For the version, it has been agreed before (Developer Meeting) that v1 would not care about inter-extension dependencies nor be smart about it, or I would have proposed a patch to have the following into core: http://packages.debian.org/sid/postgresql-9.0-debversion > I don't understand this paragraph: > > <para> > The admin > function <link > linkend="functions-extension">pg_extension_flag_dump</link> > can be used to revert the default <literal>pg_dump</literal> policy > about objects that belong to an extension and force the flagged objects > to be part of the backups. > </para> > > What's a pg_dump policy? Here again not being a native speaker shows up it seems. What I mean is that the only goal of all this patch is for pg_dump to stop including extensions install scripts, but just mark the dependency. So obviously, if your extension provides for some user editable content, you want to reverse the default behavior and have those content be parts of the dump, right? Now, to phrase that in a manual fashion... > * There appears to be an irrelevant change to the docs for replace(). As detailed in another thread, the patch you're reviewing depends on another one that has been reviewed separately, marked as ready, but is not yet in. So the extension patch is including the pg_execute_sql_file patch, to ease your reviewer work (you don't have to deal with patch dependencies). So you're seeing unrelated-at-first-glance changes. > Also, why are the placholders different than everywhere else? I see > > + SELECT pg_execute_sql_string('CREATE SCHEMA @schema@;', '@schema@', > 'utils'); > > and > > + SELECT pg_execute_sql_string('CREATE SCHEMA s;', 's', 'bar'); > > Why not $1 like is used in PREPARE and user-defined functions? I think > the second example is particularly weird, as there's nothing to > indicate that "s" is a placeholder. I don't mind "@schema@", but would > like to see uses of such things be consistent throughout PostgreSQL > (and I expect that one cannot add such as placeholders to PREPARE). The @extschema@ syntax has been proposed by Heikki, and I just adopted it. If another is to emerge, I'll adapt to it. While preparing this, I didn't want to mix mechanism and policy, so there's nothing imposed on the extension authors or the function users wrt to what a placeholder name should look like. > In the CREATE EXTENSION docs, I don't understand this section: > <varlistentry> > <term>NO USER DATA</term> > Does it mean that no user data will be dumped when the extension is > dumped? Consider you have an extension with user data. At create extension time, you're creating a table to hold the data, and maybe you insert default values into it. At pg_restore time, CREATE EXTENSION will be called again. Do you want the extension's provided default user data to be restored, or the one from the backup? So pg_dump will issue CREATE EXTENSION foo WITH NO USER DATA variant, and will also issue dump commands for objects marked as being user data, so that at pg_restore time you get the values from the backup when it makes sense. Updates on the manual style paragraph to explain that are welcome. It could be that what's needed is a better overview of the whole thing somewhere, but I though what's written would be enough. It seems to be only enough when you already know what it's all about --- and that's typical of Unix style man pages. Do we want to include an extension author tutorial? > In the DROP EXTENSION docs, I think there's a pasto: > s/index/extension/. Same in the RESTRICT section. Thanks, fixed here. > Why would multiple control files be necessary? And how would it map to > EXTVERSION? Would all the extensions get the one version in > EXTVERSION? That part has been extensively re-hashed by Alvaro (and Heikki I think) already. The basis for multiple control files per extension is to be able to cope with contrib/spi, that contains 5 extensions: MODULES = autoinc insert_username moddatetime refint timetravel EXTENSION = autoinc auto_username moddatetime refint timetravel EXTVERSION = $(VERSION) Now, would you make the same layout choice and maintain different version numbering schemes, you still can do that, but you won't benefit from the EXTVERSION facility. > <literal><function>pg_extension_flag_dump(<parameter>objid</> > <type>oid</>)</function></literal> > Why is this necessary? Aren't all database objects "worthy" of being > dumped? And this: Remember that the only goal of all that patch is *NOT* to dump the extension's install script? > What constitutes "user data" when it comes to extensions? In contrast, > the explanation below of "pg_extension_flag_dump" is very clear, > making it obvious what it's for and why it might be useful. I don't > see how it relates to the summary above, however. OTOH, it also says: User data is anything that the extension's install script flagged so. > <para> > This function can only be called when loading a <acronym>SQL</> script > using the command <command>CREATE > EXTENSION</command>. See <xref linkend="sql-createextension"> > and <xref linkend="extend-extension">. > </para> > > And I have no idea how I'd call that function within another function > call. Does this mean that only the extension author can call it? Seems > like a weird scoping without precedent. Indeed. Those two functions only make sense when used while running the CREATE EXTENSION sql command, so are only meaningful in the extension's install script. And yes that looks like a precedent. > + (errmsg("This function can only be used from > CREATE EXTENSION."))); > > I think that's a confusing error message, frankly. Perhaps better > would be something like "This function can only be called from SQL > files executed by CREATE EXTENSION". The docs should make that > clearer, too. It was only when I saw this error message that I > realized how it's scoped. Any proposal here? > Now, some of the information in the [wiki > page](http://wiki.postgresql.org/wiki/Extensions) actually seems more > informative. For example, the description and examples of \dx, \dx+, > and pg_extension_objects() are *much* better. Any way to work those > into the docs? Maybe they don't go into the function reference pages, > but perhaps into the "Putting it all together" doc? In fact, I think > the wiki doc should be copied into the "putting it all together" doc > almost verbatim (modulo the patch excerpts). That will be a big help > to prospective extension authors. Well maybe it's just me but I feel like wiki content is "free style" while the manual tone has to be sharp and concise, and the examples in there must be carefully chosen so as not to deprecate early. But if the consensus is to add the wiki content into the documentation chapter, that's what I'll do for next patch version. http://pgsql.tapoueh.org/extensions/doc/html/extend-extension.html > Reviewing the wiki page some more, you have this example (also in the docs, > IIRC): > > + DO $$ > + BEGIN > + IF pg_extension_with_user_data() THEN > + create schema foo; > + create table foo.bar(id serial primary key); > + perform pg_extension_flag_dump('foo.bar_id_seq'::regclass); > + perform pg_extension_flag_dump('foo.bar::regclass); > + END IF; > + END; > + $$; > > That's good, but what if someone has restored a database that has > those objects already? Will the extension be re-created with this code > before the tables are loaded? Or vice versa? Either way, you're going > to have a conflict when you restore from a dump. The example is in the docs, yes. The whole goal of the two functions pg_extension_flag_dump() and pg_extension_with_user_data() is for pg_restore to instruct the extension's install script if it should create the user data objects or bypass the step because they will be in the dump already. I think we have to find a way to state that clearly in the manual. > Code Review > ----------- > > I find the use of variables in the control files confusing. It looks like > this: > > version = 'EXTVERSION' > > Why not use make syntax? > > version = $(EXTVERSION) > > That makes it much clearer that it's a variable and provides the same > syntax a the extension developer will already be using in the > Makefile. For starters, you should be comparing this facility with the .sql.in one where you can use the following notation: CREATE OR REPLACE FUNCTION citext_cmp(citext, citext) RETURNS int4 AS 'MODULE_PATHNAME' LANGUAGE C STRICT IMMUTABLE; Then, if you insist on some strange cleverer input syntax, I'll welcome the C-coded patch to make that happen. You might want to know that the current extension patch is re-using the same parsing code that is used for postgresql.conf and recovery.conf, though. > I've already mentioned that the @extschema@ placeholder is different from any > other SQL placeholder: > > SET search_path = @extschema@; > > I'm not sure of a better option, but I do think it should be made > consistent, if at all possible, since this is executed in the > database, not at `make` time. Again, there's no technical restriction in there. =# select replace('Are placeholders $2 and $1 better than @n...@?', '$1', 'foo', '$2', 'bar', '@name@', '@'); replace --------------------------------------------- Are placeholders bar and foo better than @? (1 row) It looks like bikeshedding time. We managed to reduce it up until now, but well, it has to kick-in after a while :) > Speaking of which, what happens if an extension has no search_path? Or > an old explicit one? No. Please do not try to open this can of worms. It took one entire year for us to agree that extension should not concern themselves with schema and search_path issues, and 2 developer meetings. So what's in the patch is a very simple way for users to instruct the extension about a schema preference choice, which as been asked by Heikki and Robert this time. I'll try to come up with the archives link later, now ain't a good time for me to crawl the lists. The only other acceptable option would be to do nothing with schema at install time and to ask users to use the next patch in the series. CREATE EXTENSION foo; ALTER EXTENSION foo SET SCHEMA bar; Now, this would have to fail if more than exactly one schema is depending on the extension's foo, and that's it. Considering that most (if not all) extensions are using only one schema, it was though nice to include the WITH SCHEMA bar; facility at CREATE EXTENSION time, and the simple implementation is what you're commenting on. > All except the last one could be written into the Makefile, and then > `make` would generate the control file for installation. Only if the > extension author wants to support custom_variable_classes would you > need to create an explicit control file. And even that could probably > be finessed with some sort of Make variable. We've been trying to get there, and Tom and Alvaro both said that it's not worth the Makefile effort and dance, then Itagaki proposed the current mechanism, which is based on extension.control.in files and a simple implicit Make rule in pgxs.mk. > BTW, you might want to put the changes to the existing contrib modules > in a separate patch, just to simplify things a bit. Not a big deal at > this point, but it might help the committer to have further > separation. But reviewing that bit, I was wondering, how are the > uninstall scripts handled? Are they necessary anymore? Separating the only use case of the patch away does not sound like simplifying anything for me, but maybe that's only because I'm already hard at work maintaining 7 git branches (+ master) to be able to issue reviewable patches. As far as the uninstall scripts goes, yes, I think they are not necessary anymore and I now realize I didn't remove them. Will wait for some more input before to add their removal in the v16 patch. > ### Questions ### > > * I assume that if an extension is installed that includes a DSO that > the server will need to be HUPed before you can CREATE EXTENSION, yes? If by DSO you mean dynamic shared object (.so, .dylib or .dll) then no. > * I trust that CREATE EXTENSION and DROP EXTENSION and ALTER EXTENSION > are transactional, yes? Sure. But the patch you're reviewing is not featuring ALTER EXTENSION, that's for the next one :) > * Is there a regtype for extensions? No. Do we need one? > * Why can only super users get a list of extensions? It might be > useful for me to be able to check from within a udf to see if a given > extension is installed, for example. I can see why you wouldn't want > to allow listing of extensions that are installed on the system but > not in the current database. Anyone can SELECT FROM pg_extension, the catalog, and so list the installed extensions. Same for listing an extension's object. The superuser only function is the one that has to parse the control files. I'm not good about security concerns, but it well seems to me that parsing files in $SHAREDIR should be superuser only. So I think what you want to happen is what's in the code. Right? > * Is there no way to keep a list of available extensions somewhere in > the cluster? Seems kind of silly to parse all the control files from > within pg_extensions(). That can make a call to pg_extensions() pretty > expensive. Where would you put that cache, and when do you invalidate its entries or add new ones? I don't think that listing available extensions is such a frequent operation that it warrant for its own cache, even if you're able to come up with a good enough management rule set. Oh, and I just did select * from pg_extensions(); 3 times here, to be sure we're on the same page. Time: 166,868 ms then 7,823 then 3,584. > * What's with `replace_text_variadic(PG_FUNCTION_ARGS)`? Related? Again, it's all to do with giving you a single all-included patch. This function is a byproduct of the pg_execute_from_file() patch review. > * \dx+ is not documented in \? in psql. I think you just need to add > "[+]". Well spotted! Fixed here. > * `doc/src/sgml/ref/psql-ref.sgml` needs updating too. Done, thanks. > * I like that I can see a list of installed and available extensions, > but I'm not sure that \dx+ is the right command for the > latter. Usually + means "more columns to provide more information for > the same objects as you saw without the +". Is there any precedent for > showing different objects with + than without? Well, I'm not sure about what the best user interface for it is in psql, but the feature is there already: =# select name, comment, installed from pg_extensions limit 2; name | comment | installed ---------------+--------------------------------------------+----------- adminpack | Administrative functions for PostgreSQL | t auto_username | functions for tracking who changed a table | f (2 rows) > Exercising the Feature > ---------------------- > > * I built my cluster with all the contrib extensions. \dx+ does a nice > job of listing what's available. So I ran `CREATE EXTENSION > hstore`. It seemed to work great, but was *very* verbose. Every SQL > statement was emitted to the terminal. I thought \echo was turned off > by CREATE EXTENSION. I was able to use hstore after that; worked > great. Yay! The chatter has been discussed extensively with Alvaro and Tom in the pg_execute_from_file() branch already --- sorry about that, but really, I though that handing you a single patch would make it easier. So this chatter is a result of 2 things: - we're using SPI in CREATE EXTENSION - hstore issues a WARNING on the => operator creation, and SPI reaction to WARNING is to dump the query string, which is the install script here. > * Next I tried dropping it. `DROP EXTENSION ex.citext;` did not > work. What if I have citext installed in two schemas and just want to > drop one of them? Ah, I see, one cannot have an extension with the > same name in different schemas. Currently the schema-qualification is > a syntax error, but maybe it should be a little more informative? Well this point is not decided upon yet. I'm arguing that an extension is a database global object, and there's a point to be made for them to live in a schema. Currently, they depend on the schema you've been using at CREATE EXTENSION time, and the schema created by the script are depending on the extension, and as a result all is working fine and following the usual conventions (try drop schema cascade when its owner ain't a superuser but an extension has been installed there, e.g.). It could be that the easiest way to fix the current mess is to remove WITH SCHEMA support at CREATE EXTENSION time and have the facility back in core with the ALTER EXTENSION SET SCHEMA patch. > * The `pg_extension` catalog and `pg_extensions` views are both > useful, but too close in name. I found it confusing how similar they > are. It seems to me like the `pg_extension` class is well-named and > like the other catalog table names, but `pg_extenions` should perhaps > be `pg_available_extensions` or something. Give me the name and I'll update the patch. Well, or this could be considered a commiter decision. > Creating a Third Party Extension > -------------------------------- > > ! CREATE EXTENSION pair; > ! NOTICE: Installing extension 'pair' from > '/usr/local/pgsql-devel/share/contrib/pair.sql > ! ERROR: could not execute sql file: > '/usr/local/pgsql-devel/share/contrib/pair.sql' > > It would be nice if it gave me more information. What failed, exactly? > It also fails when I use `WITH SCHEMA public`. Yeah, that's a byproduct of using SPI_execute() in pg_execute_sql_file. Your script contains BEGIN; and COMMIT; commands. It's not supposed to. You won't find a single contrib module containing those either, with or without my patch. > * One could specify stuff *only* in the `Makefile` and let `make` > create a control file. Been trying, failed. Sorry. > * I could omit the @extschema@ business altogether and just let CREATE > EXTENSION set the path for the duration of its execution. That won't fly. The only workable solution is a 2-steps recipe: CREATE EXTENSION pair; ALTER EXTENSION pair SET SCHEMA utils; And the second command will raise an error if 'pair' has registered dependencies for more than exactly one schema. Some work needs to be done to see that happen, though. > Conclusion > ---------- > > Overall I'm very happy with this patch. There are some nits I've > brought up here that need to be addressed, and I do think there are > some design changes that would be worth considering. So I'm marking it > as returned. But I also think that someone more familiar with the core > code should do a proper code review during this commitfest (I'm mostly > about functionality and interface). I'm pleased to see that you mean 'Waiting on Author', because I intend to answer timely enough to still hope for the patch to make it in this commitfest, baring showstoppers. If you follow the git repository you will notice that the doc changes asked for in this review are pushed. I'll wait some more for us to decide on remaining points before to produce yet another patch version, though. http://git.postgresql.org/gitweb?p=postgresql-extension.git;a=shortlog;h=refs/heads/extension Regards, -- Dimitri Fontaine http://2ndQuadrant.fr PostgreSQL : Expertise, Formation et Support -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers