On Dec 4, 2010, at 6:14 AM, Dimitri Fontaine wrote: > Hi, > > Thanks for the review, that's quite one! :) > > I'm not sure to follow you all along, it seems like the reading is "try > it first then understand and comment again", so sometimes I'm not sure > if you're saying that docs are missing the point or that the behaviour > ain't right.
Overall I think the docs could use a lot more fleshing out. Some of the stuff in the wiki would help a lot. At some point, though, I'll work over the docs myself and either send a patch to you or to the list (if it has been committed to core). > "David E. Wheeler" <da...@kineticode.com> writes: >> `make installcheck` fails (regression.diffs attached). > > You forgot the attachment. I'd add mine if I didn't get "All 124 tests > passed." result for the command :) Sorry, attached now. > (Note that I've been having problems too, but can't reproduce them > easily, and they might have been related to forgotten maintainer-clean > steps). Yeah, I did that a bunch of times. This was the best run. > Well, the exposed functionality aren't much (2 SQL commands and some > support functions) and you need to have a contrib at hand to exercise > them. So I'm not sure about how to add in regression tests for contrib > infrastructure work and have them depend only on -core. I'll take ideas. You write a very simple contrib module exclusively for testing. It doesn't have to actually do anything other than create a couple of objects. A domain perhaps. > The other thing is that the very important feature here is dump and > restore of a database containing extensions. You're not reporting about > it, though. And I'm not sure about how to check for pg_dump content from > pg_regress oriented tests. Yeah, I haven't tried the dump and restore stuff. Are there no dump/restore tests in core? If there are, I expect all you'd have to do is add an extension or two to be dumped and restored. >> Yes. Much of the execution is superuser-only, so we need to make sure >> that such is actually the case (unit tests would help with that). > > How do you mean? Trying to subvert the features by being able to run > them as a non-superuser? How do you install tests around that? If the tests run as a super user (I think they usually do), then all you have to do is create a new unprivileged user, switch to that user via SET ROLE, and call the function in question. >> The pg_extension catalog docs are a little thin, but probably >> sufficient. They appear to be duplicated, though, with the entries for >> catalog-pg-extension and view-pg-extensions looking identical. One is >> apparently a list of installed extensions, and the other an SRF that >> lists installed and available extensions. But I find the duplication a >> bit confusing. (Might be easer if I wasn't reading SGML though.) > > Well yes the pg_extension catalog is filled at CREATE EXTENSION time so > that we have an OID to register dependencies on. Then I added a SRF to > also list available extensions so that it's easy to integrate the > feature into pgadmin and the like. Right. > Now I'm open to suggestions as far as the names are involved, or maybe > we should keep the SRF but remove the view. psql could easily enough use > the function directly I think. I think the catalog should be pg_created_extensions (or pg_installed_extensions) and the view should be pg_available_extensions. Is there no way to have a catalog table visible to *all* databases that contains the list of available extensions? >> I don't understand this paragraph at all: >> This paragraph isn't very clear, either: >> Examples might help. > > Well, I think native English skills are required here. And I'm not too > sure about the example. I recognize that some documentation pages have no (or very few examples). But oftentimes nothing makes the utility of a feature more apparent than an example. I tend to use them liberally in my documentation. As I said, I'll probably contribute a refactoring of the docs at some point. >> Control file keys: >> >> * comment -- Should mention that it's used for CREATE COMMENT ON EXTENSION, >> no? >> * version -- If it's free-form, how will you do dependency-checking? Or will >> you? >> * script >> * encoding -- Seems unnecessary; one can explicitly set it in the .sql file, >> no? >> * custom_variable_classes > > Doc says: > http://pgsql.tapoueh.org/extensions/doc/html/extend-extension.html > > This control file allows for registering the extension and should > provide values for the keys described below. > > Then there's the detail about fields. I think the context is enough to > understand, but I'd be happy to integrate some revisions of the current > content. Yes, I think if you just mentioned that the comment is used for an implicit CREATE COMMENT that it would be useful. That's true, yes? The others are okay as-is. > For the version, it has been agreed before (Developer Meeting) that v1 > would not care about inter-extension dependencies nor be smart about it, > or I would have proposed a patch to have the following into core: > > http://packages.debian.org/sid/postgresql-9.0-debversion OKay. > >> I don't understand this paragraph: >> >> <para> >> The admin >> function <link >> linkend="functions-extension">pg_extension_flag_dump</link> >> can be used to revert the default <literal>pg_dump</literal> policy >> about objects that belong to an extension and force the flagged objects >> to be part of the backups. >> </para> >> >> What's a pg_dump policy? > > Here again not being a native speaker shows up it seems. What I mean is > that the only goal of all this patch is for pg_dump to stop including > extensions install scripts, but just mark the dependency. Yeah. Say that then. Makes much more sense. > So obviously, if your extension provides for some user editable content, > you want to reverse the default behavior and have those content be parts > of the dump, right? Right. > Now, to phrase that in a manual fashion... > >> * There appears to be an irrelevant change to the docs for replace(). > > As detailed in another thread, the patch you're reviewing depends on > another one that has been reviewed separately, marked as ready, but is > not yet in. So the extension patch is including the pg_execute_sql_file > patch, to ease your reviewer work (you don't have to deal with patch > dependencies). So you're seeing unrelated-at-first-glance changes. Got it. >> Also, why are the placholders different than everywhere else? I see >> >> + SELECT pg_execute_sql_string('CREATE SCHEMA @schema@;', '@schema@', >> 'utils'); >> >> and >> >> + SELECT pg_execute_sql_string('CREATE SCHEMA s;', 's', 'bar'); >> >> Why not $1 like is used in PREPARE and user-defined functions? I think >> the second example is particularly weird, as there's nothing to >> indicate that "s" is a placeholder. I don't mind "@schema@", but would >> like to see uses of such things be consistent throughout PostgreSQL >> (and I expect that one cannot add such as placeholders to PREPARE). > > The @extschema@ syntax has been proposed by Heikki, and I just adopted > it. If another is to emerge, I'll adapt to it. While preparing this, I > didn't want to mix mechanism and policy, so there's nothing imposed on > the extension authors or the function users wrt to what a placeholder > name should look like. Okay. I think I see the need to differentiate between a variable in .control.in replaced by `make` and one replaced by `CREATE EXTENSION`. Something explaining that might be good. >> In the CREATE EXTENSION docs, I don't understand this section: >> <varlistentry> >> <term>NO USER DATA</term> >> Does it mean that no user data will be dumped when the extension is >> dumped? > > Consider you have an extension with user data. At create extension time, > you're creating a table to hold the data, and maybe you insert default > values into it. At pg_restore time, CREATE EXTENSION will be called > again. Do you want the extension's provided default user data to be > restored, or the one from the backup? Backup. > So pg_dump will issue CREATE EXTENSION foo WITH NO USER DATA variant, > and will also issue dump commands for objects marked as being user data, > so that at pg_restore time you get the values from the backup when it > makes sense. Ah, I see. > Updates on the manual style paragraph to explain that are welcome. It > could be that what's needed is a better overview of the whole thing > somewhere, but I though what's written would be enough. It seems to be > only enough when you already know what it's all about --- and that's > typical of Unix style man pages. Do we want to include an extension > author tutorial? Yes! >> Why would multiple control files be necessary? And how would it map to >> EXTVERSION? Would all the extensions get the one version in >> EXTVERSION? > > That part has been extensively re-hashed by Alvaro (and Heikki I think) > already. The basis for multiple control files per extension is to be > able to cope with contrib/spi, that contains 5 extensions: > > MODULES = autoinc insert_username moddatetime refint timetravel > EXTENSION = autoinc auto_username moddatetime refint timetravel > EXTVERSION = $(VERSION) > > Now, would you make the same layout choice and maintain different > version numbering schemes, you still can do that, but you won't benefit > from the EXTVERSION facility. Okay. So a control file corresponds to a single extension with a single version, but that extension might have multiple files, right? No, wait, you have five *extensions* above (but only one version number). Why not just make it *on* extension in five scripts? That would be less confusing to me, given that you can have only one version number, it appears. >> <literal><function>pg_extension_flag_dump(<parameter>objid</> >> <type>oid</>)</function></literal> >> Why is this necessary? Aren't all database objects "worthy" of being >> dumped? And this: > > Remember that the only goal of all that patch is *NOT* to dump the > extension's install script? I do now, yes. >> What constitutes "user data" when it comes to extensions? In contrast, >> the explanation below of "pg_extension_flag_dump" is very clear, >> making it obvious what it's for and why it might be useful. I don't >> see how it relates to the summary above, however. OTOH, it also says: > > User data is anything that the extension's install script flagged so. Okay. The scoping feels a little odd, given the example: DO $$ BEGIN IF pg_extension_with_user_data() THEN create schema foo; create table foo.bar(id serial primary key); perform pg_extension_flag_dump('foo.bar_id_seq'::regclass); perform pg_extension_flag_dump('foo.bar::regclass); END IF; END; $$; But not too bad I guess. >> <para> >> This function can only be called when loading a <acronym>SQL</> script >> using the command <command>CREATE >> EXTENSION</command>. See <xref linkend="sql-createextension"> >> and <xref linkend="extend-extension">. >> </para> >> >> And I have no idea how I'd call that function within another function >> call. Does this mean that only the extension author can call it? Seems >> like a weird scoping without precedent. > > Indeed. Those two functions only make sense when used while running the > CREATE EXTENSION sql command, so are only meaningful in the extension's > install script. And yes that looks like a precedent. Yeah. >> + (errmsg("This function can only be used from >> CREATE EXTENSION."))); >> >> I think that's a confusing error message, frankly. Perhaps better >> would be something like "This function can only be called from SQL >> files executed by CREATE EXTENSION". The docs should make that >> clearer, too. It was only when I saw this error message that I >> realized how it's scoped. > > Any proposal here? Yes. Change the error message to "This function can only be called from SQL files executed by CREATE EXTENSION". > Well maybe it's just me but I feel like wiki content is "free style" > while the manual tone has to be sharp and concise, and the examples in > there must be carefully chosen so as not to deprecate early. But if the > consensus is to add the wiki content into the documentation chapter, > that's what I'll do for next patch version. > > http://pgsql.tapoueh.org/extensions/doc/html/extend-extension.html Yes, I definitely think that there should be a lot more information in that page, including an example or six. Some of the stuff from the wiki page would go a long way towards that. And that seems like the right place for it, too, as long as the CREATE EXTENSION page points to it. > The example is in the docs, yes. The whole goal of the two functions > pg_extension_flag_dump() and pg_extension_with_user_data() is for > pg_restore to instruct the extension's install script if it should > create the user data objects or bypass the step because they will be in > the dump already. > > I think we have to find a way to state that clearly in the manual. Yes. I see now how it works, but it was a bit confusing for me to get there. I think I had some trouble with the term "user data," though I don't have a better term to offer, alas. > >> Code Review >> ----------- >> >> I find the use of variables in the control files confusing. It looks like >> this: >> >> version = 'EXTVERSION' >> >> Why not use make syntax? >> >> version = $(EXTVERSION) >> >> That makes it much clearer that it's a variable and provides the same >> syntax a the extension developer will already be using in the >> Makefile. > > For starters, you should be comparing this facility with the .sql.in one > where you can use the following notation: > > CREATE OR REPLACE FUNCTION citext_cmp(citext, citext) > RETURNS int4 > AS 'MODULE_PATHNAME' > LANGUAGE C STRICT IMMUTABLE; Right, I forgot about MODULE_PATHNAME in the .sql.in file. > Then, if you insist on some strange cleverer input syntax, I'll welcome > the C-coded patch to make that happen. You might want to know that the > current extension patch is re-using the same parsing code that is used > for postgresql.conf and recovery.conf, though. It's probably fine, given the precedent of MODULE_PATHNAME. I'd rather see the .control file killed from the distribution altogether (that is, generated by `make`). >> I've already mentioned that the @extschema@ placeholder is different from >> any other SQL placeholder: >> >> SET search_path = @extschema@; >> >> I'm not sure of a better option, but I do think it should be made >> consistent, if at all possible, since this is executed in the >> database, not at `make` time. > > Again, there's no technical restriction in there. > > =# select replace('Are placeholders $2 and $1 better than @n...@?', '$1', > 'foo', '$2', 'bar', '@name@', '@'); > replace > --------------------------------------------- > Are placeholders bar and foo better than @? > (1 row) > > It looks like bikeshedding time. We managed to reduce it up until now, > but well, it has to kick-in after a while :) Oh, so they're not defined placeholders. I can do anything I want supported by variadic replace(). Seems kind of weird, though. Erm, except that @extschema@ is explicitly supported by CREATE EXTENSION, right? >> Speaking of which, what happens if an extension has no search_path? Or >> an old explicit one? > > No. Please do not try to open this can of worms. It took one entire year > for us to agree that extension should not concern themselves with schema > and search_path issues, and 2 developer meetings. Yeah, except that you *are* making us be concerned with them. I am required to put this line into my extension's .sql file: SET search_path = @extschema@; I'd rather not have to. > So what's in the patch is a very simple way for users to instruct the > extension about a schema preference choice, which as been asked by > Heikki and Robert this time. I'll try to come up with the archives link > later, now ain't a good time for me to crawl the lists. No, I agree that I should be able to do `CREATE EXTENSION foo WITH SCHEMA bar` and have it work. But it appears to *only* work if the extension author included the above magical line. It feels superfluous to me. IOW, if I install extension "foo" and it does *not* have the above magic line, then this command will *not* do what I expect: CREATE EXTENSION foo WITH SCHEMA bar; Extension "foo" will be in the public schema (usually) rather than "bar". > The only other acceptable option would be to do nothing with schema at > install time and to ask users to use the next patch in the series. > > CREATE EXTENSION foo; > ALTER EXTENSION foo SET SCHEMA bar; Why is that the only other option? > Now, this would have to fail if more than exactly one schema is > depending on the extension's foo, and that's it. Considering that most > (if not all) extensions are using only one schema, it was though nice to > include the WITH SCHEMA bar; facility at CREATE EXTENSION time, and the > simple implementation is what you're commenting on. What I'm suggesting is that extension authors should not be concerned with schemas. >> All except the last one could be written into the Makefile, and then >> `make` would generate the control file for installation. Only if the >> extension author wants to support custom_variable_classes would you >> need to create an explicit control file. And even that could probably >> be finessed with some sort of Make variable. > > We've been trying to get there, and Tom and Alvaro both said that it's > not worth the Makefile effort and dance, then Itagaki proposed the > current mechanism, which is based on extension.control.in files and a > simple implicit Make rule in pgxs.mk. You already have key/value replacement of EXTVERSION. What's so different about EXTNAME, EXTSCRIPT, EXTENCODING, and EXTCOMMENT? If I don't need a custom variable class, why bother me with the added complexity of another file with another syntax? >> BTW, you might want to put the changes to the existing contrib modules >> in a separate patch, just to simplify things a bit. Not a big deal at >> this point, but it might help the committer to have further >> separation. But reviewing that bit, I was wondering, how are the >> uninstall scripts handled? Are they necessary anymore? > > Separating the only use case of the patch away does not sound like > simplifying anything for me, but maybe that's only because I'm already > hard at work maintaining 7 git branches (+ master) to be able to issue > reviewable patches. Yeah, was just a thought. I expect that everything in contrib can work even if it's not an extension, and could be converted in a later commit. But it's not a big deal to do it up-front. > As far as the uninstall scripts goes, yes, I think they are not > necessary anymore and I now realize I didn't remove them. Will wait for > some more input before to add their removal in the v16 patch. Cool. >> ### Questions ### >> >> * I assume that if an extension is installed that includes a DSO that >> the server will need to be HUPed before you can CREATE EXTENSION, yes? > > If by DSO you mean dynamic shared object (.so, .dylib or .dll) then no. Yes, shared object. Don't you need to restart the server to have it pick up a new dso now? >> * I trust that CREATE EXTENSION and DROP EXTENSION and ALTER EXTENSION >> are transactional, yes? > > Sure. But the patch you're reviewing is not featuring ALTER EXTENSION, > that's for the next one :) Right. I kept thinking it should be there, and then remembering it was separate. :-) >> * Is there a regtype for extensions? > > No. Do we need one? Probably not. >> * Why can only super users get a list of extensions? It might be >> useful for me to be able to check from within a udf to see if a given >> extension is installed, for example. I can see why you wouldn't want >> to allow listing of extensions that are installed on the system but >> not in the current database. > > Anyone can SELECT FROM pg_extension, the catalog, and so list the > installed extensions. Same for listing an extension's object. Well then the limit on the function is arbitrary. > The superuser only function is the one that has to parse the control > files. I'm not good about security concerns, but it well seems to me > that parsing files in $SHAREDIR should be superuser only. Oh, agreed. Better still would be some way to have a catalog (heh) of available extensions readable from all databases in a cluster. Then, amusingly enough, one wouldn't need control files, either. But maybe that's not possible in PostgreSQL? I'm not sure. > So I think what you want to happen is what's in the code. Right? I think so. That function is used by the view, right? >> * Is there no way to keep a list of available extensions somewhere in >> the cluster? Seems kind of silly to parse all the control files from >> within pg_extensions(). That can make a call to pg_extensions() pretty >> expensive. > > Where would you put that cache, and when do you invalidate its entries > or add new ones? I don't think that listing available extensions is such > a frequent operation that it warrant for its own cache, even if you're > able to come up with a good enough management rule set. No idea. I know zilch about that stuff. > Oh, and I just did select * from pg_extensions(); 3 times here, to be > sure we're on the same page. Time: 166,868 ms then 7,823 then 3,584. That's pretty damned slow. >> * I like that I can see a list of installed and available extensions, >> but I'm not sure that \dx+ is the right command for the >> latter. Usually + means "more columns to provide more information for >> the same objects as you saw without the +". Is there any precedent for >> showing different objects with + than without? > > Well, I'm not sure about what the best user interface for it is in psql, > but the feature is there already: > > =# select name, comment, installed from pg_extensions limit 2; > name | comment | installed > ---------------+--------------------------------------------+----------- > adminpack | Administrative functions for PostgreSQL | t > auto_username | functions for tracking who changed a table | f Yeah, I'm not sure, either. I like the *functionality of \dx+, I'm just not sure that's the right name for it. >> * I built my cluster with all the contrib extensions. \dx+ does a nice >> job of listing what's available. So I ran `CREATE EXTENSION >> hstore`. It seemed to work great, but was *very* verbose. Every SQL >> statement was emitted to the terminal. I thought \echo was turned off >> by CREATE EXTENSION. I was able to use hstore after that; worked >> great. Yay! > > The chatter has been discussed extensively with Alvaro and Tom in the > pg_execute_from_file() branch already --- sorry about that, but really, > I though that handing you a single patch would make it easier. > > So this chatter is a result of 2 things: > > - we're using SPI in CREATE EXTENSION > > - hstore issues a WARNING on the => operator creation, and SPI reaction > to WARNING is to dump the query string, which is the install script > here. Ah. SPI isn't able to be more precise about it, eh? Pity. Maybe for hstore we could silence the warning in the install script, eh? >> * Next I tried dropping it. `DROP EXTENSION ex.citext;` did not >> work. What if I have citext installed in two schemas and just want to >> drop one of them? Ah, I see, one cannot have an extension with the >> same name in different schemas. Currently the schema-qualification is >> a syntax error, but maybe it should be a little more informative? > > Well this point is not decided upon yet. I'm arguing that an extension > is a database global object, and there's a point to be made for them to > live in a schema. Those sound like mutually-contradicting statements to me. But maybe it's because I tend to think of schemas as namespaces, and perhaps they're not? > Currently, they depend on the schema you've been using at CREATE > EXTENSION time, and the schema created by the script are depending on > the extension, and as a result all is working fine and following the > usual conventions (try drop schema cascade when its owner ain't a > superuser but an extension has been installed there, e.g.). > > It could be that the easiest way to fix the current mess is to remove > WITH SCHEMA support at CREATE EXTENSION time and have the facility back > in core with the ALTER EXTENSION SET SCHEMA patch. It seems to me that if you can put an extension in a schema, then you can put another extension with the same name in another schema. That's how it works for all other schema-residing objects, AFAIK. >> * The `pg_extension` catalog and `pg_extensions` views are both >> useful, but too close in name. I found it confusing how similar they >> are. It seems to me like the `pg_extension` class is well-named and >> like the other catalog table names, but `pg_extenions` should perhaps >> be `pg_available_extensions` or something. > > Give me the name and I'll update the patch. Well, or this could be > considered a commiter decision. Covered above: pg_created_extensions and pg_available_extensions. >> Creating a Third Party Extension >> -------------------------------- >> >> ! CREATE EXTENSION pair; >> ! NOTICE: Installing extension 'pair' from >> '/usr/local/pgsql-devel/share/contrib/pair.sql >> ! ERROR: could not execute sql file: >> '/usr/local/pgsql-devel/share/contrib/pair.sql' >> >> It would be nice if it gave me more information. What failed, exactly? >> It also fails when I use `WITH SCHEMA public`. > > Yeah, that's a byproduct of using SPI_execute() in pg_execute_sql_file. > Your script contains BEGIN; and COMMIT; commands. It's not supposed to. > You won't find a single contrib module containing those either, with or > without my patch. Ah! Yep, that was the problem. Still, it would be great if I could get better feedback as to what the error actually is. I'm not sure I ever would have figured it out myself, given that it works just fine if I install it the old-fashioned way (psql -f). >> * One could specify stuff *only* in the `Makefile` and let `make` >> create a control file. > > Been trying, failed. Sorry. Keep trying man! I got your back! >> * I could omit the @extschema@ business altogether and just let CREATE >> EXTENSION set the path for the duration of its execution. > > That won't fly. Why not? > The only workable solution is a 2-steps recipe: > CREATE EXTENSION pair; > ALTER EXTENSION pair SET SCHEMA utils; > > And the second command will raise an error if 'pair' has registered > dependencies for more than exactly one schema. Some work needs to be > done to see that happen, though. This seems orthogonal to me. > I'm pleased to see that you mean 'Waiting on Author', because I intend > to answer timely enough to still hope for the patch to make it in this > commitfest, baring showstoppers. D'oh! Sorry, yes, that's what I meant. I think I made the same mistake the last time I did a commitfest review. Duh. Thanks for changing it in the cf app. > If you follow the git repository you will notice that the doc changes > asked for in this review are pushed. I'll wait some more for us to > decide on remaining points before to produce yet another patch version, > though. Sounds like a plan. Best, David
regression.diffs
Description: Binary data
-- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers