Re: cutting down the TODO list thread
On Tue, Oct 27, 2020 at 8:25 PM John Naylor wrote:
> As I mentioned in [1], I've volunteered to clear out the TODO list of
> items that appear to be too difficult, controversial, or otherwise not
> worth doing to warrant being listed there. I'll be working a few sections
> at a time, and every so often I'll have a list of proposed items for
> removal. If I don't hear objections, I'll remove the items after a few days
> while going through the next set.

I'm totally on board with cleaning the list up, but how about marking items as "won't fix" (or similar) instead of actually removing them? That should help to prevent the same exact items from appearing on the list again, which they eventually would, I believe.

--
Alex
Re: Concurrency issue in pg_rewind
On Fri, Sep 18, 2020 at 8:10 AM Michael Paquier wrote:
> On Thu, Sep 17, 2020 at 10:20:16AM +0200, Oleksandr Shulgin wrote:
> > Ouch. I think pg_rewind shouldn't try to remove any random files in pg_wal
> > that it doesn't know about. What if the administrator made a backup of
> > some WAL segments there?
>
> IMO, this would be a rather bad strategy anyway, so just don't do
> that, because that could also mean that this is on the same partition
> as pg_wal/ which would crash the server if the partition has the idea
> to get full even if max_wal_size is set correctly.

To clarify my point, I don't mean backing up WAL segments in the background while the server is running, but precisely when the server is down and you need to intervene, such as when running pg_rewind. You might want to "stash" some of the latest segments in case you need to start over (name it pg_wal/00840A76001E.backup, or pg_wal/backup/00840A76001E). It is surprising that pg_rewind might decide to remove those.

--
Alex
Re: Concurrency issue in pg_rewind
On Wed, Sep 16, 2020 at 2:55 PM Alexander Kukushkin wrote:
> The second time pg_rewind also failed, but the error looked differently:
>
> servers diverged at WAL location A76/39E55338 on timeline 132
> rewinding from last common checkpoint at A76/1EF254B8 on timeline 132
> could not remove file
> "/home/postgres/pgdata/pgroot/data/pg_wal/.wal-g/prefetch/running/00840A760024":

Ouch. I think pg_rewind shouldn't try to remove any random files in pg_wal that it doesn't know about. What if the administrator made a backup of some WAL segments there?

--
Alex
Re: builtin functions, parameter names and psql's \df
On Wed, Sep 2, 2020 at 7:35 AM Andres Freund wrote:
> Hi,
>
> On a regular basis I remember a builtin function's name, or can figure it out
> using \df etc, but can't remember the argument order. A typical example is
> regexp_*, where I never remember whether the pattern or the input string comes
> first.
>
> Unfortunately \df does not really help with that:
>
> =# \df regexp_split_to_table
> ┌────────────┬───────────────────────┬──────────────────┬─────────────────────┬──────┐
> │   Schema   │         Name          │ Result data type │ Argument data types │ Type │
> ├────────────┼───────────────────────┼──────────────────┼─────────────────────┼──────┤
> │ pg_catalog │ regexp_split_to_table │ SETOF text       │ text, text          │ func │
> │ pg_catalog │ regexp_split_to_table │ SETOF text       │ text, text, text    │ func │
> └────────────┴───────────────────────┴──────────────────┴─────────────────────┴──────┘
>
> If the parameters were named, however, it'd be clear:
>
> =# CREATE OR REPLACE FUNCTION pg_catalog.regexp_split_to_table(string text, pattern text)
>    RETURNS SETOF text
>    LANGUAGE internal
>    IMMUTABLE PARALLEL SAFE STRICT
>    AS $function$regexp_split_to_table_no_flags$function$
>
> =# \df regexp_split_to_table
> ┌────────────┬───────────────────────┬──────────────────┬───────────────────────────┬──────┐
> │   Schema   │         Name          │ Result data type │    Argument data types    │ Type │
> ├────────────┼───────────────────────┼──────────────────┼───────────────────────────┼──────┤
> │ pg_catalog │ regexp_split_to_table │ SETOF text       │ string text, pattern text │ func │
> │ pg_catalog │ regexp_split_to_table │ SETOF text       │ text, text, text          │ func │
> └────────────┴───────────────────────┴──────────────────┴───────────────────────────┴──────┘
>
> (I intentionally left the three-parameter version unchanged, to show the
> difference)
>
> In the docs we already name the parameters using SQL-like syntax, see [1].
> How about we actually do so for at least the more common / complicated
> functions?

+many

I find myself in the same situation a lot. I never realized that's an implementation detail and not something fundamental preventing the parameters from being named in the built-in functions.

--
Alex
Re: libpq copy error handling busted
On Thu, Jun 4, 2020 at 5:37 AM Thomas Munro wrote:
> On Thu, Jun 4, 2020 at 1:53 PM Thomas Munro wrote:
> > On Thu, Jun 4, 2020 at 1:35 PM Tom Lane wrote:
> > > Ah, it's better if I put the pqReadData call into *both* the paths
> > > where 1f39a1c06 made pqSendSome give up. The attached patch seems
> > > to fix the issue for the "pgbench -i" scenario, with either fast-
> > > or immediate-mode server stop. I tried it with and without SSL too,
> > > just to see. Still, it's not clear to me whether this might worsen
> > > any of the situations we discussed in the lead-up to 1f39a1c06 [1].
> > > Thomas, are you in a position to redo any of that testing?
> >
> > It seems to behave correctly in that scenario.
>
> Here's what I tested. First, I put this into pgdata/postgresql.conf:
>
> ssl=on
> ssl_ca_file='root+client_ca.crt'
> ssl_cert_file='server-cn-only.crt'
> ssl_key_file='server-cn-only.key'
> ssl_crl_file='root+client.crl'
> ssl_min_protocol_version='TLSv1.2'
> ssl_max_protocol_version='TLSv1.1'
> ssl_min_protocol_version='TLSv1.2'
> ssl_max_protocol_version=''
>
> I copied the named files from src/test/ssl/ssl/ into pgdata, and I ran
> chmod 600 on the .key file.
>
> I put this into pgdata/pg_hba.conf at the top:
>
> hostssl all all 127.0.0.1/32 cert clientcert=verify-full
>
> I made a copy of src/test/ssl/ssl/client-revoked.key and ran chmod 600 on it.

Would it be feasible to capture this in a sort of regression (TAP?) test?

--
Alex
Re: feature idea: use index when checking for NULLs before SET NOT NULL
On Fri, May 29, 2020 at 8:56 AM Sergei Kornilov wrote:
> Hello
>
> Correct index lookup is a difficult task. I tried to implement this
> previously...
>
> But the answer in SO is a bit incomplete for recent postgresql releases.
> Seqscan is not the only possible way to set not null in pg12+. My patch was
> committed ( https://commitfest.postgresql.org/22/1389/ ) and now it's
> possible to do it this way:
>
> alter table foos
>   add constraint foos_not_null
>   check (bar1 is not null) not valid;  -- short-time exclusive lock
>
> alter table foos validate constraint foos_not_null;  -- still seqscans the
> entire table, but without an exclusive lock
>
> And then another short lock:
>
> alter table foos alter column bar1 set not null;
> alter table foos drop constraint foos_not_null;

That's really good to know, Sergei!

John, I think it's worth pointing out that Postgres most likely does a full table scan to validate a constraint by design, and not as an optimization oversight. Think of what's going to happen if the index used for the check is corrupted.

Cheers,
--
Alex
Re: [PATCH] Add support to psql for edit-and-execute-command
On Mon, May 18, 2020 at 1:30 AM Joe Wildish wrote:
> Attached is a small patch for adding "edit-and-execute-command" readline
> support to psql. Bash has this concept and I miss it when using psql. It
> allows you to amend the current line in an editor by pressing "v" (when
> in vi mode) or "C-x C-e" (when in emacs mode). Those are the default
> bindings from bash, although of course they can be amended in inputrc.

The only difference from \e is that you don't need to jump to the end of the input first, I guess?

--
Alex
Re: Poll: are people okay with function/operator table redesign?
On Mon, May 4, 2020 at 11:22 PM Tom Lane wrote:
> I've now completed updating chapter 9 for the new layout,
> and the results are visible at
> https://www.postgresql.org/docs/devel/functions.html
>
> There is more to do --- for instance, various contrib modules
> have function/operator tables that should be synced with this
> design. But this seemed like a good place to pause and reflect.

Would it be premature to complain about the not-that-great look of Table 9.1 now? Compare the two attached images: the screenshot from https://www.postgresql.org/docs/devel/functions-comparison.html vs. my GIMP-assisted pipe dream of aligning it to the right edge of the table cell. I don't have the faintest idea how to achieve that using SGML at the moment, but it just looks so much nicer to me. ;-)

Regards,
--
Alex
Re: do {} while (0) nitpick
On Fri, May 1, 2020 at 3:52 AM Bruce Momjian wrote:
> On Thu, Apr 30, 2020 at 09:51:10PM -0400, Tom Lane wrote:
> > John Naylor writes:
> > > As I understand it, the point of having "do {} while (0)" in a
> > > multi-statement macro is to turn it into a simple statement.
> >
> > Right.
> >
> > > As such, ending with a semicolon in both the macro definition and the
> > > invocation will turn it back into multiple statements, creating
> > > confusion if someone were to invoke the macro in an "if" statement.
> >
> > Yeah. I'd call these actual bugs, and perhaps even back-patch worthy.
>
> Agreed. Those semicolons could easily create bugs.

It was a while ago that I last checked our Developer guide over at the PostgreSQL wiki, but I wonder if this is the sort of issue that modern linters would be able to recognize?

The only hit for a "linting" search on the wiki is this page, referring to the developer meeting in Ottawa about a year ago: https://wiki.postgresql.org/wiki/PgCon_2019_Developer_Meeting

> Other major projects include:
> ...
> Code linting

Is anybody aware of the current status of that effort?

Cheers,
--
Alex
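For what it's worth, a check for the exact pattern discussed above does not need a full C parser. The following is only an illustrative sketch (the function and sample names are made up, and this is not a proposal for actual PostgreSQL tooling): it flags macro definitions whose do {} while (0) body keeps a trailing semicolon, which is what breaks `if (x) MACRO(); else ...` at the call site.

```python
import re

# A macro body ending in "} while (0);" already contains a statement
# terminator; the invocation's own semicolon then makes the expansion
# *two* statements, so "if (x) MACRO(); else ..." fails to compile.
PATTERN = re.compile(r'while\s*\(\s*0\s*\)\s*;')

def find_suspect_macros(source: str):
    """Return (line_number, text) pairs for #define lines (including
    backslash continuations) whose do-while(0) ends with a semicolon."""
    hits = []
    in_define = False
    start = 0
    buf = []
    for i, line in enumerate(source.splitlines(), start=1):
        if not in_define and line.lstrip().startswith('#define'):
            in_define = True
            start = i
            buf = []
        if in_define:
            buf.append(line)
            if not line.rstrip().endswith('\\'):
                in_define = False
                joined = ' '.join(l.rstrip('\\') for l in buf)
                if PATTERN.search(joined):
                    hits.append((start, joined.strip()))
    return hits

sample = r'''
#define GOOD(x) do { foo(x); bar(x); } while (0)
#define BAD(x) do { foo(x); bar(x); } while (0);
'''
for lineno, text in find_suspect_macros(sample):
    print(lineno, text)
# prints: 3 #define BAD(x) do { foo(x); bar(x); } while (0);
```

A real linter would of course have to cope with comments and string literals, but even this crude scan catches the semicolons Bruce and Tom were calling bugs.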
Re: Proposing WITH ITERATIVE
On Tue, Apr 28, 2020 at 5:49 AM Jonah H. Harris wrote:
> On Mon, Apr 27, 2020 at 11:33 PM David Fetter wrote:
> >
> > Have the authors agreed to make it available to the project under a
> > compatible license?
>
> If there's interest, obviously. Otherwise I wouldn't be asking.
>
> I said from the start why I wasn't attaching a patch and that I was seeking
> feedback. Honestly, David, stop wasting my, and list time, asking
> pointless off-topic questions.

Jonah,

I see it the other way round: it could end up as a waste of everyone's time discussing the details, if the authors don't agree to publish their code in the first place. Of course, it could also be written from scratch, in which case I guess someone else from the community (who hasn't seen that private code) would have to take a stab at it, but I believe it helps to know this in advance.

I also don't see how it "obviously" follows from your two claims, "I've found this functionality" and "I'll clean up their patch and submit", that you have even asked (or, for that matter, found) the authors of that code.

Finally, I'd like to suggest you adopt a more constructive tone and become familiar with the project's Code of Conduct[1], if you haven't yet. I am certain that constructive, respectful communication from your side will help the community focus on the actual details of your proposal.

Kind regards,
--
Alex

[1] https://www.postgresql.org/about/policies/coc/
Re: Make java client lib accept same connection strings as psql
On Sat, Feb 22, 2020 at 4:05 AM David G. Johnston <david.g.johns...@gmail.com> wrote:
> On Fri, Feb 21, 2020 at 6:21 PM Michael Leonhard wrote:
> > How about making the Java client library accept the same connection
> > strings as psql and other command-line tools? [...]
>
> That falls outside the scope of this list/project. The separate pgJDBC
> project team would need to decide to take that up.

Michael,

While your proposed end goal sounds like a desirable thing to me, there are certain obstacles, unfortunately.

First, consider that URL support appeared in libpq after, not before, the support in the JDBC driver. Second, the common subset of allowed connection parameters between the two is only as big as "host", "port", "database", "user" and "password".

Additionally, libpq understands the "ssl=true" parameter for JDBC compatibility, though I don't think that was a good idea in the end. For one, in the JDBC world "ssl=false" is treated exactly the same as "ssl=true" or any other value, which is questionable design in the first place. And even if you could use exactly the same URL in both libpq-based and JDBC clients without running into syntax errors, the semantics of "ssl=true" are subtly different between the two: in the former case, the client does not attempt to validate the certificate, nor the hostname, as opposed to the latter.

As to your actual example, the part of the syntax that is treated differently in libpq is the "userinfo": https://tools.ietf.org/html/rfc3986#section-3.2.1

The JDBC driver could be extended to support this as well, as such a change is backwards-compatible. As David has pointed out, this question should be directed to the pgJDBC project.

Lastly, the RFC provides some good arguments as to why providing the username and, especially, the password in the connection URL is undesirable. While the "user:password@host" or "?user=fred&password=secret" syntax can be handy for local testing, it is definitely not something that should be used in production. Both libpq and the JDBC driver provide ways to pass login credentials in a more secure manner.

Kind regards,
--
Alex
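To make the "userinfo" part tangible, here is a sketch of the generic RFC 3986 components using Python's standard urllib.parse (this illustrates the URI grammar only, not the actual parsing code of libpq or the JDBC driver; the URL below is made up):

```python
from urllib.parse import urlsplit

# Per RFC 3986 section 3.2.1, the authority component may carry a
# "userinfo" subcomponent before the '@' sign:
#   scheme://user:password@host:port/path?query
url = "postgresql://fred:secret@db.example.com:5432/appdb?connect_timeout=10"
parts = urlsplit(url)

print(parts.username)  # fred
print(parts.password)  # secret
print(parts.hostname)  # db.example.com
print(parts.port)      # 5432
print(parts.path)      # /appdb
```

This also illustrates the RFC's point about credentials: anything before the '@' ends up in process listings, shell history, and any log line that prints the URL.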
Re: POC: converting Lists into arrays
On Tue, Jul 2, 2019 at 5:12 PM Tom Lane wrote:
> Oleksandr Shulgin writes:
> > Not related to the diff v6..v7, but shouldn't we throw additionally a
> > memset() with '\0' before calling pfree():
>
> I don't see the point of that. In debug builds CLOBBER_FREED_MEMORY will
> take care of it, and in non-debug builds I don't see why we'd expend
> the cycles.

This is what I was wondering about, thanks for providing a reference.

--
Alex
Re: POC: converting Lists into arrays
On Tue, Jul 2, 2019 at 1:27 AM Tom Lane wrote:
> So I think this is a win, and attached is v7.

Not related to the diff v6..v7, but shouldn't we additionally throw in a memset() with '\0' before calling pfree():

+        ListCell   *newelements;
+
+        newelements = (ListCell *)
+            MemoryContextAlloc(GetMemoryChunkContext(list),
+                               new_max_len * sizeof(ListCell));
+        memcpy(newelements, list->elements,
+               list->length * sizeof(ListCell));
+        pfree(list->elements);
+        list->elements = newelements;

Or is this somehow ensured by the debug pfree() implementation, or does it work differently together with Valgrind?

Otherwise it seems that the calling code could still be hanging onto a list element from a freed chunk and (rather) happily accessing it, as opposed to an almost-guaranteed crash if it were zeroed before returning from enlarge_list().

Cheers,
--
Alex
Re: Analyze all plans
On Wed, Jan 23, 2019 at 9:44 AM Donald Dong wrote:
> 1. Enumerate all the plans

Not sure this is going to work, because the total number of possible plans is somewhere around O(n!), if I'm not mistaken: the number of orderings of the joined relations, times the available access methods, times the possible join strategies. So enumerating all possible plans stops being practical for even slightly complicated queries.

Regards,
--
Alex
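To put a rough number on that (a back-of-envelope sketch only; the real search space also multiplies in access methods and join strategies, and bushy join trees make it larger still):

```python
from math import factorial

# Lower bound on the plan search space: just the orderings of n joined
# relations, ignoring access methods and join strategies entirely.
for n in (2, 4, 8, 12):
    print(f"{n} relations -> {factorial(n):,} join orders")
# 12 relations already gives 479,001,600 orderings
```

This is why real planners prune aggressively (dynamic programming over join subsets) or switch to heuristic search, rather than enumerating everything.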
Re: How can we submit code patches that implement our (pending) patents?
On Wed, Jul 25, 2018 at 8:35 PM Nico Williams wrote:
> What are you proposing anyways? That every commit come with a patent
> search?

I would propose that everyone wasting their time and effort discussing this issue here would rather spend that time working towards putting an end to software patents instead. That would be a much better use of the time, IMHO.

Just my 2c,
--
Alex
Re: psql's \d versus included-index-column feature
On Thu, Jul 19, 2018 at 1:11 AM Alexander Korotkov <a.korot...@postgrespro.ru> wrote:
> On Wed, Jul 18, 2018 at 11:14 PM David G. Johnston <david.g.johns...@gmail.com> wrote:
> > On Wed, Jul 18, 2018 at 12:55 PM, Tom Lane wrote:
> > > regression=# \d tbl_include_reg_idx
> > >  Index "public.tbl_include_reg_idx"
> > >  Column |  Type   | Key | Definition
> > > --------+---------+-----+------------
> > >  c1     | integer | t   | c1
> > >  c2     | integer | t   | c2
> > >  c3     | integer | f   | c3
> > >  c4     | box     | f   | c4
> > > btree, for table "public.tbl_include_reg"
> >
> > +1 for the additional column indicating whether the column is being
> > treated as key data or supplemental included data.
>
> +1
> And especially I don't think we should place the word "INCLUDE" in the
> definition column.
>
> > -1 for printing a boolean t/f; would rather spell it out:
>
> IMHO, t/f have the advantage of brevity. From my point of view, covering
> indexes are not so evident a feature. So, users need to spend some time
> reading documentation before realizing what they are and how to use them.
> So, I don't expect that a short designation of INCLUDE columns as "non-key"
> (Key == false) columns could be discouraging here.

I don't think there is an established practice on how to display this sort of info, but I see that both styles already exist, namely:

=# \dL
           List of languages
  Name   |  Owner   | Trusted |         Description
---------+----------+---------+------------------------------
 plpgsql | postgres | t       | PL/pgSQL procedural language
 plproxy | postgres | f       |
 ...

and

=# \dC
                          List of casts
 Source type |         Target type         |      Function      |   Implicit?
-------------+-----------------------------+--------------------+---------------
 abstime     | date                        | date               | in assignment
 abstime     | integer                     | (binary coercible) | no
 abstime     | timestamp without time zone | timestamp          | yes
 ...

I like the second option more, for readability reasons and because it is easier to extend if ever needed. Given that the documentation refers to included columns as "non-key columns", it seems natural to me to name the \d output column "Key?" and use "yes/no" as the values.

Regards,
--
Alex
Re: Setting libpq TCP keepalive parameters from environment
On Wed, May 9, 2018 at 1:58 AM, Craig Ringer wrote:
> > It would be much more convenient to just set the environment variable when
> > running the script and let it affect the whole process and its children.
> >
> > Would a patch be welcome?
>
> I can't really comment on that part, but:
>
> PGOPTIONS="-c tcp_keepalives_count=5 -c tcp_keepalives_interval=60" psql 'host=localhost'
>
> should enable server-side keepalives. Unsure how much that helps you
> if you need client-side keepalives too.

Good point; in our specific case it appears to work just as well if it's the server that sends the keepalives.

Thank you,
--
Alex
Re: [HACKERS] [PATCH] Generic type subscripting
On Tue, Mar 6, 2018 at 6:21 PM, Dmitry Dolgov <9erthali...@gmail.com> wrote:
> One more small update after fd1a421fe6 in attachments.

Before looking at the code I have a few comments about the documentation, in json.sgml:

+-- Extract value by key
+SELECT ('{"a": 1}'::jsonb)['a'];

What is the result of running this query? What is the resulting data type?

+-- Extract nested value by key path
+SELECT ('{"a": {"b": {"c": 1}}}'::jsonb)['a']['b']['c'];
+
+-- Extract element by index
+SELECT ('[1, "2", null]'::jsonb)['1'];

What is the result here? Why is the subscript a string and not a number? Are subscript indexes 0- or 1-based?

+-- Update value by key
+UPDATE table_name set jsonb_field['key'] = 1;
+
+-- Select records using where clause with subscripting
+SELECT * from table_name where jsonb_field['key'] = '"value"';

Please capitalize: SET, FROM, WHERE. The use of double quotes around "value" requires some explanation, I think. Should the user expect that a suitable index is used by the query planner for this query?

In other words, I would like to see this part of the documentation extended beyond just showcasing the syntax.

Regards,
--
Oleksandr "Alex" Shulgin | Database Engineer | Zalando SE | Tel: +49 176 127-59-707
Re: Estimate maintenance_work_mem for CREATE INDEX
On Tue, Dec 19, 2017 at 10:47 AM, Oleksandr Shulgin <oleksandr.shul...@zalando.de> wrote:
> (cross-posting admin and hackers)
>
> Hello,
>
> I wonder if I'm alone in my wish to have a way for estimating how much
> maintenance work memory would suffice to allocate for a session when
> creating an index and avoid spilling to disk?
>
> Recently I had to re-create some indexes on a 9.6 server and I had some
> input on the on-disk index size: one was around 30 GB, the other -- a bit
> over 60 GB according to \di+ output. The total number of live tuples in
> the table itself was close to 1.3e+9, the table had an estimated 25% bloat.
>
> I had some spare memory on the machine so I've given it 60 GB for
> maintenance_work_mem and expected that at least the smaller of the two will
> fit in memory completely. To my surprise that didn't suffice and both
> indexes were building with some disk spill.
>
> Is anyone aware of a query to estimate the memory requirements for CREATE
> INDEX [CONCURRENTLY]?
>
> I've looked in the postgres wiki, but didn't find anything to that end.
> Nor searching the archives of pgsql-admin did help.
>
> I understand that there were some changes in recent releases related to
> memory allocation (e.g. allowing huge allocation in 9.4), but at least
> targeting 9.6 or 10 would make sense. There are also a lot of ways how one
> CREATE INDEX can be different from the other, but in the most simple case
> where you have fixed-width columns and building the full index (i.e. no
> WHERE clause), it should be possible.

Now I see I failed to mention this is the default btree index with all default options. Obviously other index types can be very different in their memory requirements.

> Not hasting to look in the source to calculate all the sizeof()s yet:
> waiting on your reply and suggestions. ;-)

If there were an option in the database itself to provide such an estimate, we wouldn't even need to figure out estimation queries. "EXPLAIN CREATE INDEX" anyone?

Regards,
--
Oleksandr "Alex" Shulgin | Database Engineer | Zalando SE | Tel: +49 176 127-59-707
Estimate maintenance_work_mem for CREATE INDEX
(cross-posting admin and hackers)

Hello,

I wonder if I'm alone in my wish to have a way of estimating how much maintenance work memory would suffice to allocate for a session when creating an index, to avoid spilling to disk?

Recently I had to re-create some indexes on a 9.6 server, and I had some input on the on-disk index size: one was around 30 GB, the other -- a bit over 60 GB according to \di+ output. The total number of live tuples in the table itself was close to 1.3e+9, and the table had an estimated 25% bloat.

I had some spare memory on the machine, so I gave it 60 GB for maintenance_work_mem and expected that at least the smaller of the two indexes would fit in memory completely. To my surprise, that didn't suffice and both indexes were built with some disk spill.

Is anyone aware of a query to estimate the memory requirements for CREATE INDEX [CONCURRENTLY]?

I've looked in the postgres wiki but didn't find anything to that end. Nor did searching the archives of pgsql-admin help.

I understand that there were some changes in recent releases related to memory allocation (e.g. allowing huge allocations in 9.4), but targeting at least 9.6 or 10 would make sense. There are also a lot of ways in which one CREATE INDEX can differ from another, but in the simplest case, where you have fixed-width columns and are building the full index (i.e. no WHERE clause), it should be possible.

Not hasting to look in the source to calculate all the sizeof()s yet: waiting on your reply and suggestions. ;-)

Cheers!
--
Oleksandr "Alex" Shulgin | Database Engineer | Zalando SE | Tel: +49 176 127-59-707
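For reference, here is the kind of back-of-envelope arithmetic I had in mind. Every per-tuple overhead below is an assumption chosen for illustration, not a value taken from the PostgreSQL source; the point is only that in-memory sort bookkeeping per index tuple can easily double the packed on-disk size.

```python
# Rough estimate of sort memory for a btree build over 1.3e+9 tuples.
# All per-tuple overheads here are assumptions for illustration.
n_tuples = 1_300_000_000          # live tuples, from the numbers above
key_width = 8                     # e.g. a single bigint key column
index_tuple_overhead = 8          # assumed per-IndexTuple header/alignment
sort_tuple_overhead = 24          # assumed per-tuple sort bookkeeping
allocator_overhead = 16           # assumed per-chunk allocator overhead

bytes_per_tuple = (key_width + index_tuple_overhead
                   + sort_tuple_overhead + allocator_overhead)
total_gb = n_tuples * bytes_per_tuple / 1024**3
print(f"~{bytes_per_tuple} bytes/tuple -> ~{total_gb:.0f} GB to sort in memory")
```

Under these assumed overheads the in-memory footprint comes out around 68 GB, roughly double a ~30 GB packed on-disk index, which would at least be consistent with the build spilling despite 60 GB of maintenance_work_mem.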