Re: [HACKERS] Proposal: GetOldestXminExtend for ignoring arbitrary vacuum flags
Jim Nasby wrote:
> On 2/14/17 3:13 AM, Seki, Eiji wrote:
> > +extern TransactionId GetOldestXmin(Relation rel, uint8
> > ignoreFlags);
>
> My impression is that most other places that do this sort of thing just call
> the argument 'flags', so as not to "lock in" a single idea of what the flags
> are for. I can't readily think of another use for flags in GetOldestXmin, but
> ISTM it's better to just go with "flags" instead of "ignoreFlags".

Thanks. I also think "flags" is better; I will rename it.

But I wonder whether I should also rename the defined flag names, IGNORE_A_FLAG_XXX and IGNORE_FLAGS_XXX, because they include "IGNORE" in their names. My concern is that if they have more generic names, it will be harder for GetOldestXmin users to understand their meaning, and generic names could conflict with other definitions. Please let me know if you have any ideas.

--
Regards,
Eiji Seki
Fujitsu

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Sum aggregate calculation for single precision real
On 14.02.2017 16:59, Jim Nasby wrote:
> On 2/13/17 10:45 AM, Konstantin Knizhnik wrote:
>> It is not true - please notice the query execution time of these two
>> queries:
>
> I bet you'd get even less difference if you simply cast to float8 instead
> of adding 0.0. Same result, no floating point addition.
>
> The expectation for SUM(float4) is that you want speed and are prepared to
> cope with the consequences. It's easy enough to cast your input to float8
> if you want a wider accumulator, or to numeric if you'd like more stable
> (not necessarily more accurate :-() results. I do not think it's the
> database's job to make those choices for you.

From my point of view this is a strange and wrong expectation. I chose the "float4" type for a column just because it is enough to represent the range of data I have and I need to minimize the record size.

> In other words, you've decided to trade accuracy for performance...

I cannot agree with that...

1. If I choose the float4 type to store a bid price (which usually has 5-6 significant digits), I do not lose precision and accuracy does not suffer. Accuracy is important when I am calculating the sum of prices. But the assumption that the accuracy of the sum calculation should depend on the type of the summed field is not obvious. It may be more or less clear to C programmers, but not to SQL users. In every database I have tested, SUM of single precision floats is calculated using at least double precision numbers (or the numeric type).

2. There is no huge gap in performance between accumulating in float4 and float8. There are no "orders of magnitude":

postgres=# select sum(l_quantity) from lineitem_projection;
     sum
-------------
 1.07374e+09
(1 row)

Time: 4659.509 ms (00:04.660)

postgres=# select sum(l_quantity::float8) from lineitem_projection;
    sum
------------
 1529738036
(1 row)

Time: 5465.320 ms (00:05.465)

So I do not think that there is actually a compromise here between performance and accuracy.
But the current implementation leads to many confusions and contradictions with users' expectations:

1. sum(l_quantity) and sum(l_quantity::float8) are completely different (by 1.5 times!!! - we lose half a billion dollars :)
2. avg(l_quantity)*count(l_quantity) is not equal to sum(l_quantity), while in case of casting to float8 the results are the same.
3. The sum of aggregates over groups is not equal to the total sum (once again, no problem for the float8 type).

But when I am calculating a sum, I expect to receive a more or less precise result. Certainly I realize that even in case of using double it is ...

> but now you want to trade performance for accuracy? Why would you expect the
> database to magically come to that conclusion?

See above. There is no trading here. Please notice that the current Postgres implementation of the AVG aggregate calculates both the sum and the sum of squares, even though the latter is not needed for AVG. The comment in the code says:

 * It might seem attractive to optimize this by having multiple accumulator
 * functions that only calculate the sums actually needed. But on most
 * modern machines, a couple of extra floating-point multiplies will be
 * insignificant compared to the other per-tuple overhead, so I've chosen
 * to minimize code space instead.

And it is true! In addition to the results above, I can add the timing for AVG calculation:

postgres=# select avg(l_quantity) from lineitem_projection;
       avg
------------------
 25.5015621964919
(1 row)

postgres=# select avg(l_quantity::float8) from lineitem_projection;
       avg
------------------
 25.5015621964919
(1 row)

Time: 6103.807 ms (00:06.104)

Please notice that avg for float is calculated using float4_accum, which uses a float8 accumulator and also calculates sumX2!

So I do not see reasonable arguments here for using float4pl for sum(float4)! And I do not know of any database which has such strange behavior.
I know that "be like others", or especially "be like Oracle", is never a good argument in the Postgres community, but doing something differently (and IMHO wrongly) without any significant reason seems very strange.

--
Konstantin Knizhnik
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company
Re: [HACKERS] GUC for cleanup indexes threshold.
Hi,

I ran the regression tests and found some failures concerning brin and gin, though I haven't looked into them yet. Here's a log:

*** /home/ideriha/postgres-master/src/test/regress/expected/brin.out	2017-02-13 11:33:43.270942937 +0900
--- /home/ideriha/postgres-master/src/test/regress/results/brin.out	2017-02-15 14:58:24.725984474 +0900
***************
*** 403,408 ****
  SELECT brin_summarize_new_values('brinidx'); -- ok, no change expected
   brin_summarize_new_values
  ---------------------------
!                          0
  (1 row)
--- 403,408 ----
  SELECT brin_summarize_new_values('brinidx'); -- ok, no change expected
   brin_summarize_new_values
  ---------------------------
!                          5
  (1 row)

==========

*** /home/ideriha/postgres-master/src/test/regress/expected/gin.out	2016-12-20 16:49:09.513050050 +0900
--- /home/ideriha/postgres-master/src/test/regress/results/gin.out	2017-02-15 14:58:25.536984461 +0900
***************
*** 20,26 ****
  select gin_clean_pending_list('gin_test_idx'); -- nothing to flush
   gin_clean_pending_list
  ------------------------
!                       0
  (1 row)

  -- Test vacuuming
--- 20,26 ----
  select gin_clean_pending_list('gin_test_idx'); -- nothing to flush
   gin_clean_pending_list
  ------------------------
!                       8
  (1 row)

  -- Test vacuuming

==========

Regards,
Ideriha Takeshi
Re: [HACKERS] UPDATE of partition key
On 14 February 2017 at 22:24, David Fetter wrote:
> On Mon, Feb 13, 2017 at 05:31:56PM +0530, Amit Khandekar wrote:
>> Currently, an update of a partition key of a partition is not
>> allowed, since it requires to move the row(s) into the applicable
>> partition.
>>
>> Attached is a WIP patch (update-partition-key.patch) that removes
>> this restriction. When an UPDATE causes the row of a partition to
>> violate its partition constraint, then a partition is searched in
>> that subtree that can accommodate this row, and if found, the row is
>> deleted from the old partition and inserted in the new partition. If
>> not found, an error is reported.
>
> This is great!
>
> Would it be really invasive to HINT something when the subtree is a
> proper subtree?

I am not quite sure I understood this question. Can you please explain it a bit more ...
Re: [HACKERS] Parallel Append implementation
On 14 February 2017 at 22:35, Robert Haaswrote: > On Mon, Feb 6, 2017 at 12:36 AM, Amit Khandekar > wrote: >>> Now that I think of that, I think for implementing above, we need to >>> keep track of per-subplan max_workers in the Append path; and with >>> that, the bitmap will be redundant. Instead, it can be replaced with >>> max_workers. Let me check if it is easy to do that. We don't want to >>> have the bitmap if we are sure it would be replaced by some other data >>> structure. >> >> Attached is v2 patch, which implements above. Now Append plan node >> stores a list of per-subplan max worker count, rather than the >> Bitmapset. But still Bitmapset turned out to be necessary for >> AppendPath. More details are in the subsequent comments. > > Keep in mind that, for a non-partial path, the cap of 1 worker for > that subplan is a hard limit. Anything more will break the world. > But for a partial plan, the limit -- whether 1 or otherwise -- is a > soft limit. It may not help much to route more workers to that node, > and conceivably it could even hurt, but it shouldn't yield any > incorrect result. I'm not sure it's a good idea to conflate those two > things. Yes, the logic that I used in the patch assumes that "Path->parallel_workers field not only suggests how many workers to allocate, but also prevents allocation of too many workers for that path". For seqscan path, this field is calculated based on the relation pages count. I believe the theory is that, too many workers might even slow down the parallel scan. And the same theory would be applied for calculating other types of low-level paths like index scan. The only reason I combined the soft limit and the hard limit is because it is not necessary to have two different fields. But of course this is again under the assumption that allocating more than parallel_workers would never improve the speed, in fact it can even slow it down. 
Do we have such a case currently where the actual number of workers launched turns out to be *more* than Path->parallel_workers?

> For example, suppose that I have a scan of two children, one
> of which has parallel_workers of 4, and the other of which has
> parallel_workers of 3. If I pick parallel_workers of 7 for the
> Parallel Append, that's probably too high. Had those two tables been
> a single unpartitioned table, I would have picked 4 or 5 workers, not
> 7. On the other hand, if I pick parallel_workers of 4 or 5 for the
> Parallel Append, and I finish with the larger table first, I think I
> might as well throw all 4 of those workers at the smaller table even
> though it would normally have only used 3 workers.
> Having the extra 1-2 workers exit does not seem better.

This is where I didn't quite understand why we would want to assign these extra workers to a subplan which tells us that it is already being run by 'parallel_workers' number of workers.

> --
> Robert Haas
> EnterpriseDB: http://www.enterprisedb.com
> The Enterprise PostgreSQL Company

--
Thanks,
-Amit Khandekar
EnterpriseDB Corporation
The Postgres Database Company
Re: [HACKERS] Partitioned tables and relfilenode
On Fri, Feb 10, 2017 at 3:19 PM, Amit Langote wrote:
> The new partitioned tables do not contain any data by themselves. Any
> data inserted into a partitioned table is routed to and stored in one of
> its partitions. In fact, it is impossible to insert *any* data before a
> partition (to be precise, a leaf partition) is created. It seems wasteful
> then to allocate physical storage (files) for partitioned tables. If we
> do not allocate the storage, then we must make sure that the right thing
> happens when a command that is intended to manipulate a table's storage
> encounters a partitioned table, the "right thing" here being that the
> command's code either throws an error or warning (in some cases) if the
> specified table is a partitioned table or ignores any partitioned tables
> when it reads the list of relations to process from pg_class. Commands
> that need to be taught about this are vacuum, analyze, truncate, and alter
> table. Specifically:

Thanks. I have been looking a bit at this set of patches.

> - In case of vacuum, specifying a partitioned table causes a warning
> - In case of analyze, we do not throw an error or warning but simply
>   avoid calling do_analyze_rel() *non-recursively*. Further in
>   acquire_inherited_sample_rows(), any partitioned tables in the list
>   returned by find_all_inheritors() are skipped.
> - In case of truncate, only the part which manipulates table's physical
>   storage is skipped for partitioned tables.

I am wondering if instead those should not just be errors saying that such operations are just not supported. This could be relaxed in the future to mean that a vacuum, truncate or analyze on the partitioned table triggers the operation in cascade on its partitions.

> - ATRewriteTables() skips on the AlteredTableInfo entries for partitioned
>   tables, because there is nothing to be done.

Perhaps that makes sense; foreign tables do that.
> - Since we cannot create indexes on partitioned tables anyway, there is
>   no need to handle cluster and reindex (they throw a meaningful error
>   already due to the lack of indexes.)

Yep.

> Patches 0001 and 0002 of the attached implement the above part. 0001
> teaches the above mentioned commands to do the "right thing" as described
> above and 0002 teaches heap_create() and heap_create_with_catalog() to not
> create any physical storage (none of the forks) for partitioned tables.

-   else
+   /*
+    * Although we cannot analyze partitioned tables themselves, we are
+    * still be able to do the recursive ANALYZE.
+    */
+   else if (onerel->rd_rel->relkind != RELKIND_PARTITIONED_TABLE)
    {
        /* No need for a WARNING if we already complained during VACUUM */
        if (!(options & VACOPT_VACUUM))

It seems to me that it is better style to use an empty else if with only the comment. Comments related to a condition that are outside this condition should be conditional in their formulations. Comments within a condition can be affirmations when they refer to a decided state of things.

From patch 2:

@@ -1351,7 +1352,12 @@ heap_create_with_catalog(const char *relname,
            relkind == RELKIND_TOASTVALUE ||
            relkind == RELKIND_PARTITIONED_TABLE);

-       heap_create_init_fork(new_rel_desc);
+       /*
+        * We do not want to create any storage objects for a partitioned
+        * table.
+        */
+       if (relkind != RELKIND_PARTITIONED_TABLE)
+           heap_create_init_fork(new_rel_desc);
    }

This does not make much sense with an assertion telling exactly the contrary a couple of lines before. I think that it would make more sense to move the assertion on relkind directly in heap_create_init_fork().

> Then comes 0003, which concerns inheritance planning. In a regular
> inheritance set (ie, the inheritance set corresponding to an inheritance
> hierarchy whose root is a regular table), the inheritance parents are
> included in their role as child members, because they might contribute
> rows to the final result.
So AppendRelInfo's are created for each such > table by the planner prep phase, which the later planning steps use to > create a scan plan for those tables as the Append's child plans. > Currently, the partitioned tables are also processed by the optimizer as > inheritance sets. Partitioned table inheritance parents do not own any > storage, so we *must* not create scan plans for them. So we do not need > to process them as child members of the inheritance set. 0003 teaches > expand_inherited_rtentry() to not add partitioned tables as child members. > Also, since the root partitioned table RTE is no longer added to the > Append list as the 1st child member, inheritance_planner() cannot assume > that it can install the 1st child RTE as the nominalRelation of a given > ModifyTable node, instead the original root parent table RTE is installed > as the nominalRelation. This is a collection of checks on relkind == RELKIND_PARTITIONED_TABLE
Re: [HACKERS] Proposal: GetOldestXminExtend for ignoring arbitrary vacuum flags
Amit Kapila wrote:
> How will you decide just based on oldest xmin whether the tuple is visible or
> not? How will you take decisions about tuples which have xmax set?

In our use case, GetOldestXmin is used by dedicated maintainer processes on dedicated control tables. A control table can be concurrently read or inserted into by any transaction. However, rows in the table can be deleted (have xmax set) only by the maintainer process, and one control table can be processed by only one maintainer process at a time. So MVCC is done as follows.

- In the maintainer's transaction:
  - If xmax is set, simply ignore the tuple.
  - For other tuples, read the tuple if GetOldestXmin() > xmin.
- In other transactions: do ordinary MVCC using their XIDs.

Note: a control table relates to a normal table relation, so the oldest xmin is obtained with respect to the normal table.

--
Regards,
Eiji Seki
Fujitsu
Re: [HACKERS] [Bug fix] PQsendQuery occurs error when target_session_attrs is set to read-write
On Wed, Feb 15, 2017 at 11:08 AM, Robert Haas wrote:
> I think the patch as presented probably isn't quite what we want,
> because it waits synchronously for the second result to be ready.
> Note that the wait for the first result is asynchronous: we check
> PQisBusy and return without progressing the state machine until the
> latter returns false; only then do we call PQgetResult().

OK, I have been poking at this problem. I think that we need to introduce a new state in ConnStatusType, telling "please consume all results until PQgetResult returns NULL", which is CONNECTION_CONSUME in the patch attached. As long as all the results are not consumed, the loop just keeps going. If more messages are being waited for, with PQisBusy returning true, then the loop requests more data to read by switching back to PGRES_POLLING_READING.

By the way, I am noticing as well that libpq.sgml is missing a reference to CONNECTION_CHECK_WRITABLE. This should be present, as applications calling PQconnectPoll() could face such a status. I have fixed this issue as well in the patch attached.

--
Michael

pqsendquery-fix-v3.patch
Re: [HACKERS] parallelize queries containing subplans
On Wed, Feb 15, 2017 at 4:38 AM, Robert Haas wrote:
> On Tue, Feb 14, 2017 at 4:24 AM, Amit Kapila wrote:
>> On further evaluation, it seems this patch has one big problem which
>> is that it will allow forming parallel plans which can't be supported
>> with current infrastructure. For ex. marking immediate level params
>> as parallel safe can generate below type of plan:
>>
>> Seq Scan on t1
>>   Filter: (SubPlan 1)
>>   SubPlan 1
>>     ->  Gather
>>           Workers Planned: 1
>>           ->  Result
>>                 One-Time Filter: (t1.k = 0)
>>                 ->  Parallel Seq Scan on t2
>>
>> In this plan, we can't evaluate the one-time filter (which contains a
>> correlated param) unless we have the capability to pass all kinds of
>> PARAM_EXEC params to workers. I don't want to invest too much time in
>> this patch unless somebody can see some way using the current parallel
>> infrastructure to implement correlated subplans.
>
> I don't think this approach has much chance of working; it just seems
> too simplistic. I'm not entirely sure what the right approach is.
> Unfortunately, the current query planner code seems to compute the
> sets of parameters that are set and used quite late, and really only
> on a per-subquery level.

Now, just for the sake of discussion, consider that we had a list of allParams at the path level; even then I think it might not be easy to make it work.

> Here we need to know whether there is
> anything that's set below the Gather node and used above it, or the
> other way around, and we need to know it much earlier, while we're
> still doing path generation.

Yes, that is exactly the challenge. I am not sure if there is currently a way by which we can identify whether a Param on a particular node refers to a node below it or above it.

> (There's also possibly a couple of other cases, like an initPlan that
> needs to get executed only once, and also maybe a case where a
> parameter is set below the Gather and later used above the Gather.
> Not sure if that latter one can happen, or how to deal with it.)
I think the case for initPlan is slightly different, because we can always evaluate it at the Gather node (considering it is an uncorrelated initplan) and then pass it to workers. We generally have a list of all the params at each plan node, so we can identify which of these are initPlan params and then evaluate them. The value can then be used irrespective of whether it is used above or below the Gather node. For the cases where it is used above the Gather node, it will work because we always store the computed value of such params in estate/econtext, and for the cases where it has to be used below the Gather, we need to pass the computed value to workers. Now, there are some exceptions, like for a few cases not all the params are available at a particular node, but I feel those can be handled easily by either traversing the planstate tree or by actually storing them at the Gather node. Actually, in short, this is what is done in the patch proposed for parallelizing initplans [1].

[1] - https://commitfest.postgresql.org/13/997/

--
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com
Re: [HACKERS] CREATE TABLE with parallel workers, 10.0?
On Tue, Feb 14, 2017 at 09:24:47PM -0800, Joshua Chamberlain wrote:
> Hello,
>
> (I'm posting to hackers since I got no response on the general
> list.)
>
> I use Postgres + PostGIS quite heavily, and recently have been
> taking full advantage of the new parallelism in 9.6. I'm now running
> queries in a few hours that would otherwise take more than a day.
>
> However, parallelism is disabled for all queries that perform writes
> (as documented). I would normally run "CREATE TABLE AS [some
> super-expensive query]", but since that can't use parallelism I'm
> using the \o option in psql, creating the table separately, and then
> \copy-ing in the results. That works, but "CREATE TABLE AS" would
> be more convenient.

How about creating a temp view?

CREATE TEMPORARY VIEW foo_tv AS [your gigantic query goes here];
CREATE TABLE foo (LIKE foo_tv);
INSERT INTO foo SELECT * FROM foo_tv;

> Are there plans in 10.0 to allow parallelism in queries that write,
> or at least in "CREATE TABLE AS" queries? (Support in materialized
> views would be great, too!)

Patches are always welcome, and there's one more commitfest to go before 10.

Best,
David.
--
David Fetter http://fetter.org/
Phone: +1 415 235 3778  AIM: dfetter666  Yahoo!: dfetter
Skype: davidfetter      XMPP: david(dot)fetter(at)gmail(dot)com

Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate
Re: [HACKERS] DROP SUBSCRIPTION and ROLLBACK
On Tue, Feb 14, 2017 at 1:13 AM, Fujii Masao wrote: > On Mon, Feb 13, 2017 at 4:05 PM, Masahiko Sawada > wrote: >> On Sat, Feb 11, 2017 at 8:18 PM, Petr Jelinek >> wrote: >>> On 10/02/17 19:55, Masahiko Sawada wrote: On Thu, Feb 9, 2017 at 12:44 AM, Petr Jelinek wrote: > On 08/02/17 07:40, Masahiko Sawada wrote: >> On Wed, Feb 8, 2017 at 9:01 AM, Michael Paquier >> wrote: >>> On Wed, Feb 8, 2017 at 1:30 AM, Fujii Masao >>> wrote: On Wed, Feb 8, 2017 at 12:26 AM, Petr Jelinek wrote: > For example what happens if apply crashes during the DROP > SUBSCRIPTION/COMMIT and is not started because the delete from catalog > is now visible so the subscription is no longer there? Another idea is to treat DROP SUBSCRIPTION in the same way as VACUUM, i.e., make it emit an error if it's executed within user's transaction block. >>> It seems to me that this is exactly Petr's point: using >>> PreventTransactionChain() to prevent things to happen. >> Agreed. It's better to prevent it from being executed inside a user transaction >> block. And I understood there are too many failure scenarios we need to >> handle. >> Also DROP SUBSCRIPTION should call CommitTransactionCommand() just after removing the entry from pg_subscription, then connect to the publisher and remove the replication slot. >>> For consistency that may be important. >> Agreed. >> Attached patch, please give me feedback. > This looks good (and similar to what initial patch had btw). Works fine > for me as well. > Remaining issue is, what to do about CREATE SUBSCRIPTION then, there are > similar failure scenarios there, should we prevent it from running > inside transaction as well? Hmm, on second thought I have doubts about the approach currently being discussed. For example, please imagine the case where CREATE SUBSCRIPTION creates the subscription successfully but fails to create the replication slot for whatever reason, and then DROP SUBSCRIPTION drops the subscription but fails to drop the replication slot.
In such a case, CREATE SUBSCRIPTION and DROP SUBSCRIPTION return ERROR but the subscription is created and dropped successfully. I think that this behaviour confuses the user. I think we should just prevent calling DROP SUBSCRIPTION in a user's transaction block. Or I guess that it could be better to separate starting/stopping logical replication from subscription management.

>>> We need to stop the replication worker(s) in order to be able to drop
>>> the slot. There is no such issue with startup of the worker as that one
>>> is launched by launcher after the transaction has committed.
>>>
>>> IMO best option is to just don't allow DROP/CREATE SUBSCRIPTION inside a
>>> transaction block and don't do any commits inside of those (so that
>>> there are no rollbacks, which solves your initial issue I believe). That
>>> way failure to create/drop slot will result in subscription not being
>>> created/dropped which is what we want.

> On second thought, +1.

>> I basically agree with this option, but why do we need to change CREATE
>> SUBSCRIPTION as well?

> Because the window between the creation of the replication slot and the
> transaction commit of CREATE SUBSCRIPTION should be short. Otherwise, if any
> error happens during that window, the replication slot unexpectedly remains
> while there is no corresponding subscription. Of course, even if we prevent
> CREATE SUBSCRIPTION from being executed within a user's transaction block,
> there is still such a window. But we can reduce the possibility of that
> problem.

Thank you for the explanation. I understood and agree. I think we should disallow calling ALTER SUBSCRIPTION inside a user's transaction block as well. The attached patch changes these three DDLs so that they cannot be called inside a user's transaction block.
Regards,

--
Masahiko Sawada
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center

disallow_sub_ddls_in_transaction_block.patch
Re: [HACKERS] bytea_output vs make installcheck
Andres Freund writes:
> I don't quite see the point of this - there's a lot of settings that cause
> spurious test failures. I don't see any point fixing random cases of that.
> And I don't think the continual cost of doing so overall is worth the minimal
> gain.
> What's your reason to get this fixed?

FWIW, I'm inclined to do something about Jeff's nearby complaint about operator_precedence_warning, because the cause of that failure is pretty obscure. I'm less excited about this one, because it should be obvious what happened to anyone who looks at the regression diffs.

In general I think there's value in "make installcheck" passing when it reasonably can, but you're quite right that there are a lot of setting changes that would break it, and not all are going to be practical to fix.

regards, tom lane
Re: [HACKERS] CREATE TABLE with parallel workers, 10.0?
On Wed, Feb 15, 2017 at 2:24 PM, Joshua Chamberlain wrote:
> Are there plans in 10.0 to allow parallelism in queries that write, or at
> least in "CREATE TABLE AS" queries? (Support in materialized views would be
> great, too!)

Nope. There are no patches for now.

--
Michael
[HACKERS] CREATE TABLE with parallel workers, 10.0?
Hello, (I'm posting to hackers since I got no response on the general list.) I use Postgres + PostGIS quite heavily, and recently have been taking full advantage of the new parallelism in 9.6. I'm now running queries in a few hours that would otherwise take more than a day. However, parallelism is disabled for all queries that perform writes (as documented). I would normally run "CREATE TABLE AS [some super-expensive query]", but since that can't use parallelism I'm using the \o option in psql, creating the table separately, and then \copy-ing in the results. That works, but "CREATE TABLE AS" would be more convenient. Are there plans in 10.0 to allow parallelism in queries that write, or at least in "CREATE TABLE AS" queries? (Support in materialized views would be great, too!) Thanks, Joshua Chamberlain
Re: [HACKERS] bytea_output vs make installcheck
On February 14, 2017 9:02:14 PM PST, neha khatri wrote:
>On Wed, Feb 15, 2017 at 10:04 AM, neha khatri wrote:
>>
>>> Attached are two options for doing that. One overrides bytea_output
>>> locally where-ever needed, and the other overrides it for the entire
>>> 'regression' database.
>>
>> The solution that overrides bytea_output locally looks appropriate. It may
>> not be required to change the format for entire database.
>> Had there been a way to convert bytea_output from 'hex' to 'escape'
>> internally, that could have simplified this customization even more.
>
>Well, the conversion from 'hex' to 'escape' is available using the function
>encode(). So the queries that are failing due to the setting bytea_output =
>escape can be wrapped under encode(), to obtain the result in 'escape'
>format. Here is another way to resolve the same problem. The patch is
>attached.

I don't quite see the point of this - there's a lot of settings that cause spurious test failures. I don't see any point fixing random cases of that. And I don't think the continual cost of doing so overall is worth the minimal gain.

What's your reason to get this fixed?

Andres
--
Sent from my Android device with K-9 Mail. Please excuse my brevity.
Re: [HACKERS] bytea_output vs make installcheck
On Wed, Feb 15, 2017 at 10:04 AM, neha khatri wrote:
>
>> Attached are two options for doing that. One overrides bytea_output
>> locally where-ever needed, and the other overrides it for the entire
>> 'regression' database.
>
> The solution that overrides bytea_output locally looks appropriate. It may
> not be required to change the format for entire database.
> Had there been a way to convert bytea_output from 'hex' to 'escape'
> internally, that could have simplified this customization even more.

Well, the conversion from 'hex' to 'escape' is available using the function encode(). So the queries that are failing due to the setting bytea_output = escape can be wrapped under encode(), to obtain the result in 'escape' format. Here is another way to resolve the same problem. The patch is attached.

Regards,
Neha

intallcheck_bytea_output_escape.patch
Re: [HACKERS] Provide list of subscriptions and publications in psql's completion
On Wed, Feb 15, 2017 at 3:18 AM, Jim Nasby wrote: > Why not do what we do for pg_stat_activity.current_query and leave it NULL > for non-SU? If subscriptions are designed to be superuser-only, it seems fair to me to do so. Some other system SRFs do that already. > Even better would be if we could simply strip out password info. Presumably > we already know how to parse the contents, so I'd think that shouldn't be > that difficult. I thought that this was correctly clobbered... But... No, that's not the case by looking at the code. And honestly I think that it is unacceptable to show potentially security-sensitive information in system catalogs via a connection string. We are really careful about not showing anything bad in pg_stat_wal_receiver, which sets fields to NULL for non-superusers and even clobbers values in the printed connection string for superusers, but pg_subscription fails on those points. I am adding an open item on the wiki regarding that. FWIW, a patch needs to refactor libpqrcv_check_conninfo and libpqrcv_get_conninfo so that the connection string built from PQconninfoOption data goes through the same process. If everybody agrees on those lines, I have no problem producing a patch. -- Michael
Re: [HACKERS] operator_precedence_warning vs make installcheck
Jeff Janes writes: > make installcheck fails against a server running with > operator_precedence_warning = on. > The difference is in update.out, and consists of an error-locating caret > getting moved over by one position. I've attached the regression diff. > I don't know why the setting of this GUC causes the caret to move around, > that seems odd. The reason is that with operator_precedence_warning = on, there's an explicit raw-parse-tree node for the parenthesis pair, which otherwise there is not, so that exprLocation reports a different result for the location of the subexpression. We could possibly prevent the difference by having exprLocation look through such nodes. I'm not sure offhand if there are cases where that would be worse than before. We've definitely made some other hacks to hide the difference between operator_precedence_warning on and off. regards, tom lane
Re: [HACKERS] Sum aggregate calculation for single precsion real
Robert Haas writes: > Well put. Although it's worth noting that we aren't 100% consistent > about this stuff: sum(smallint), sum(integer), and sum(bigint) all use > an output data type different from the input data type, but other > versions of sum() don't. In those cases I believe the main reason for the different output type is that there's a significant risk of overflow if we don't. See commits bec98a31c and 5f7c2bdb5 for some history. You could perhaps make an argument that sum(float4) would have less risk of overflow if it accumulated in and returned float8, but frankly that seems a bit thin. regards, tom lane
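The behavior at issue in this thread, a sum(float4) whose accumulator is itself float4, is easy to reproduce outside the database. A minimal sketch (Python with the stdlib struct module to emulate IEEE 754 single precision; an assumption, since the thread's own examples are SQL) shows why Konstantin's sum plateaued near 2^30 when the addends are small:

```python
import struct

def f32(x: float) -> float:
    """Round a Python float to the nearest IEEE 754 single (float4)."""
    return struct.unpack('f', struct.pack('f', x))[0]

# At 2**30 the spacing between adjacent float32 values is 128, so adding
# a typical l_quantity value (50 or less, i.e. under half the spacing)
# rounds away to nothing: the float4 accumulator stops moving.
acc = f32(2 ** 30)
print(f32(acc + 25.0) == acc)             # True: the addition is lost

# A float8 accumulator absorbs the same addition exactly.
print((2.0 ** 30) + 25.0 == 2.0 ** 30)    # False
```

This is exactly why casting the input to float8 (or numeric) changes the result: it widens the accumulator, not just the individual values.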
Re: [HACKERS] parallelize queries containing subplans
On Wed, Feb 15, 2017 at 4:46 AM, Robert Haas wrote: > On Mon, Feb 13, 2017 at 11:07 PM, Amit Kapila wrote: >> I have removed the check of AlternativeSubPlan so that it can be >> handled by expression_tree_walker. > ... >> Attached patch fixes all the comments raised till now. > > Committed after removing the reference to AlternativeSubPlan from the > comment. I didn't see any need to mention that. > Okay, I was also of two minds while adding that line in the comment. I had kept it because there are a few places in the code where both SubPlan and AlternativeSubPlan are handled together, so I thought the person looking at this code should not wonder why we have not handled AlternativeSubPlan. However, I think it is okay to remove it from the comment, as there exist other places where only SubPlan is handled and we rely on the expression tree walker for AlternativeSubPlans. -- With Regards, Amit Kapila. EnterpriseDB: http://www.enterprisedb.com
Re: [HACKERS] pg_basebackup -R
On Wed, Feb 15, 2017 at 2:38 AM, Robert Haas wrote: > On Thu, Feb 9, 2017 at 9:40 PM, Michael Paquier > wrote: >> In short, I'd like to think that we should just filter out those two >> parameters by name and call it a day. Or introduce an idea of value >> set for the environment by adding some kind of tracking flag in >> PQconninfoOption? Though I am not sure that it is worth complicating >> libpq to just generate recovery.conf in pg_basebackup. > > Yeah, I'm not sure what the best solution is. I just thought it was strange. Thinking more about this, perhaps the correct answer is to do nothing? target_session_attrs being set is rather similar to sslmode or sslcompression, for example. They are there but don't hurt. The same thing applies to passfile: if the file is not there the client would still ask for input. If it is there, things are helped a bit. -- Michael
Re: [HACKERS] Documentation improvements for partitioning
On Mon, Feb 13, 2017 at 5:57 AM, Amit Langote wrote: > On 2017/02/13 14:21, Amit Langote wrote: >> On 2017/02/10 19:19, Simon Riggs wrote: >>> * The OID inheritance needs work - you shouldn't need to specify a >>> partition needs OIDS if the parent has it already. >> >> That sounds right. It's better to keep the behavior same as for regular >> inheritance. Will post a patch to get rid of the unnecessary check. >> >> FWIW, the check was added to prevent the command from succeeding in the >> case where WITHOUT OIDS has been specified for a partition and the parent >> partitioned table has the OID column. Regular inheritance simply >> *overrides* the WITHOUT OIDS specification, which might be seen as >> surprising. > > 0001 of the attached patches takes care of this. I think 0001 needs to remove this hunk of documentation: If the partitioned table specified WITH OIDS then each partition must also specify WITH OIDS. Oids are not automatically inherited by partitions. I think 0001 is better than the status quo, but I'm wondering whether we should try to do something slightly different. Maybe it should always work for the child table to specify neither WITH OIDS nor WITHOUT OIDS, but if you do specify one of them then it has to be the one that matches the parent partitioned table? With this patch, IIUC, WITH OIDS is allowed only if the parent has the same, but WITHOUT OIDS is allowed (but ignored) regardless of the parent setting. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Re: [HACKERS] pg_stat_wal_write statistics view
On Wed, Feb 8, 2017 at 9:36 PM, Amit Kapila wrote:
> On Tue, Feb 7, 2017 at 11:47 AM, Haribabu Kommi wrote:
> > Hi Hackers,
> >
> > I just want to discuss adding of a new statistics view that provides
> > the information of wal writing details as follows
> >
> +1. I think it will be useful to observe WAL activity.

Thanks for your opinion.

> > postgres=# \d pg_stat_wal_writer
> >             View "pg_catalog.pg_stat_wal_writer"
> >        Column       |           Type           | Collation | Nullable | Default
> > --------------------+--------------------------+-----------+----------+---------
> >  num_backend_writes | bigint                   |           |          |
> >  num_total_writes   | bigint                   |           |          |
> >  num_blocks         | bigint                   |           |          |
> >  total_write_time   | bigint                   |           |          |
> >  stats_reset        | timestamp with time zone |           |          |
> >
> > The columns of the view are
> > 1. Total number of xlog writes that are called from the backend.
> > 2. Total number of xlog writes that are called from both backend
> > and background workers. (This column can be changed to just
> > display on the background writes).
> > 3. The number of the blocks that are written.
> > 4. Total write_time of the IO operation it took, this variable data is
> > filled only when the track_io_timing GUC is enabled.
>
> So, here is *write_time* the total time system has spent in WAL
> writing before the last reset?

The total write_time spent in WAL writing "after" the last reset, in milliseconds. I think there should be a separate column for write and sync time.

Added.

> > Or it is possible to integrate the new columns into the existing
> > pg_stat_bgwriter view also.
> >
> I feel separate view is better.

Ok. Following is the sample output of the view after a regress run.

postgres=# select * from pg_stat_walwrites;
-[ RECORD 1 ]--+------------------------------
backend_writes | 19092
writes         | 663
write_blocks   | 56116
write_time     | 0
sync_time      | 3064
stats_reset    | 2017-02-15 13:37:09.454314+11

Currently, the writer, walwriter and checkpointer processes are considered as the background processes that can mainly do the WAL write. Here I attached a patch that implements the view.
I will add this patch to the next commitfest. Regards, Hari Babu Fujitsu Australia Attachment: pg_stat_walwrites_view_1.patch
Re: [HACKERS] Documentation improvements for partitioning
On Fri, Feb 10, 2017 at 3:00 AM, Simon Riggs wrote: > Given that we already have partitioning feature committed, we really > need to have the docs committed as well. Just for the record, it's not like there were no documentation changes in the originally committed patch. In fact there were over 400 lines of documentation:

 doc/src/sgml/catalogs.sgml                 | 129 +++-
 doc/src/sgml/ref/alter_table.sgml          | 117 +-
 doc/src/sgml/ref/create_foreign_table.sgml |  26 +
 doc/src/sgml/ref/create_table.sgml         | 154 +

The patch you committed approximately doubles the amount of documentation for this feature, which is fine as far as it goes, but the key points were all explained in the original commit. I have been known to leave out documentation from commits from time to time and fill it in after-the-fact, but that's not really what happened here. > Without claiming I'm happy about this, I think the best way to improve > the number of eyeballs on this is to commit these docs as is. > > For me, the most important thing is understanding the feature, not > (yet) discussing what the docs should look like. This is especially > true if other patches reference the way partitioning works and nobody > can comment on those patches because they don't understand > > Any issues with that? There are a number of things that I think are awkward about the patch as committed:

+
+    See last section for some general information:
+

I think we generally try to write the documentation in such a way as to minimize backward references, and a backward reference to the previous section seems particularly odd. We've now got section "5.10 Partitioned Tables" followed immediately by section "5.11 Partitioning", where the latter seems to think that you haven't read the former. I think that section 5.11 needs a much heavier rewrite than what it got as part of this patch.
It's a hodgepodge of the old content (which explained how to fake partitioning when we didn't have an explicit concept of partitioning) and new content that talks about how the new way is different from the old way. But now that we have the new way, I'm guessing that most people are going to use that and not care about the old way any more. I'm not sure that it's even appropriate to keep the lengthy explanation of how to fake table partitioning using table inheritance and non-overlapping CHECK constraints, but if we still want that stuff it should be de-emphasized more than it is here. Probably the section should be retitled: in previous releases we called this "Partitioning" because we had no explicit notion of partitioning, but now that we do, it's confusing to have a section called "Partitioning" that explains how to avoid using the partitioning feature, which is more or less what this does. Or maybe the section title should stay the same (or this should be merged into the previous section?) but rewritten to change the emphasis. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Re: [HACKERS] Sum aggregate calculation for single precsion real
On Tue, Feb 14, 2017 at 8:59 AM, Jim Nasby wrote: >> From my point of your it is strange and wrong expectation. >> I am choosing "float4" type for a column just because it is enough to >> represent range of data I have and I need to minimize size of record. > > In other words, you've decided to trade accuracy for performance... > >> But when I am calculating sum, I expect to receive more or less precise >> result. Certainly I realize that even in case of using double it is > > ... but now you want to trade performance for accuracy? Why would you expect > the database to magically come to that conclusion? Well put. Although it's worth noting that we aren't 100% consistent about this stuff: sum(smallint), sum(integer), and sum(bigint) all use an output data type different from the input data type, but other versions of sum() don't. To some extent all of these decisions are just guesses about what users will find useful, and as this thread shows, not everybody's going to agree. But I don't think our guesses are flagrantly unreasonable or anything. There's also nothing to prevent Konstantin or anybody else who doesn't like the default behavior to create their own version of sum(float4) and put it in a schema that's listed before pg_catalog in search_path. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
[HACKERS] operator_precedence_warning vs make installcheck
make installcheck fails against a server running with operator_precedence_warning = on. The difference is in update.out, and consists of an error-locating caret getting moved over by one position. I've attached the regression diff. I don't know why the setting of this GUC causes the caret to move around, that seems odd. If it can't be fixed at the source, it should be easy enough to just override the setting in update.sql. Cheers, Jeff Attachment: regression.diffs
Re: [HACKERS] Missing CHECK_FOR_INTERRUPTS in hash joins
On Wed, Feb 15, 2017 at 2:22 PM, Tom Lane wrote: > Adding a C.F.I. inside this loop is the most straightforward fix, but > I am leaning towards adding one in ExecHashJoinGetSavedTuple instead, > because that would also ensure that all successful paths through > ExecHashJoinOuterGetTuple will do a C.F.I. somewhere, and it seems good > for that to be consistent. The other possibility is to put one inside > ExecHashTableInsert, but the only other caller of that doesn't really need > it, since it's relying on ExecProcNode to do one. Would it also make sense to put one in the loop in ExecHashIncreaseNumBatches (or perhaps ExecHashJoinSaveTuple for symmetry with the above)? Otherwise you might have to wait for a few hundred MB of tuples to be written out, which could be slow if IO is somehow overloaded. -- Thomas Munro http://www.enterprisedb.com
Re: [HACKERS] WAL consistency check facility
On Wed, Feb 15, 2017 at 11:08 AM, Robert Haas wrote: > On Tue, Feb 14, 2017 at 7:12 PM, Tom Lane wrote: >> FWIW, my own habit when creating new PG files is generally to write >> >> * Portions Copyright (c) 1996-2017, PostgreSQL Global Development Group >> * Portions Copyright (c) 1994, Regents of the University of California >> >> even if it's "all new" code. The main reason being that it's hardly ever >> the case that you didn't copy-and-paste some amount of stuff out of a >> pre-existing file, and trying to sort out how much of what originated >> exactly when is an unrewarding exercise. Even if it is basically all >> new code, this feels like giving an appropriate amount of credit to >> Those Who Went Before Us. > > Right. I tend to do the same, and wonder if we shouldn't make that a > general practice. This looks sensible to me. No-brainer rules that make sense mean fewer things to worry about. -- Michael
Re: [HACKERS] WAL consistency check facility
On Tue, Feb 14, 2017 at 7:12 PM, Tom Lane wrote: > Robert Haas writes: >> On Tue, Feb 14, 2017 at 5:16 PM, Michael Paquier >> wrote: >>> Just for curiosity: does the moment when the code has been written or >>> committed counts? It's no big deal seeing how liberal the Postgres >>> license is, but this makes me wonder... > >> IANAL, but I think if you ask one, he or she will tell you that what >> matters is the date the work was created. In the case of code, that >> means when the code was written. > > FWIW, my own habit when creating new PG files is generally to write > > * Portions Copyright (c) 1996-2017, PostgreSQL Global Development Group > * Portions Copyright (c) 1994, Regents of the University of California > > even if it's "all new" code. The main reason being that it's hardly ever > the case that you didn't copy-and-paste some amount of stuff out of a > pre-existing file, and trying to sort out how much of what originated > exactly when is an unrewarding exercise. Even if it is basically all > new code, this feels like giving an appropriate amount of credit to > Those Who Went Before Us. Right. I tend to do the same, and wonder if we shouldn't make that a general practice. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Re: [HACKERS] [Bug fix] PQsendQuery occurs error when target_session_attrs is set to read-write
On Tue, Feb 14, 2017 at 12:11 AM, Michael Paquier wrote: > On Thu, Feb 2, 2017 at 3:01 PM, Higuchi, Daisuke > wrote: >> From: Michael Paquier [mailto:michael.paqu...@gmail.com] >>>This has not been added yet to the next CF. As Ashutosh mentioned >>>things tend to be easily forgotten. So I have added it here: >>>https://commitfest.postgresql.org/13/982/ >> Thank you for adding this problem to CF. > > I have added this thread to the list of open items for PG10: > https://wiki.postgresql.org/wiki/PostgreSQL_10_Open_Items Good catch, Michael. I think the patch as presented probably isn't quite what we want, because it waits synchronously for the second result to be ready. Note that the wait for the first result is asynchronous: we check PQisBusy and return without progressing the state machine until the latter returns false; only then do we call PQgetResult(). But the typo fix is of course correct, and independent. Committed that. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
[HACKERS] Missing CHECK_FOR_INTERRUPTS in hash joins
I ran into a case where a hash join took a really long time to respond to a cancel request --- long enough that I gave up and kill -9'd it, because its memory usage was also growing to the point where the kernel would likely soon choose to do that for me. The culprit seems to be that there's no CHECK_FOR_INTERRUPTS anywhere in this loop in ExecHashJoinNewBatch():

while ((slot = ExecHashJoinGetSavedTuple(hjstate,
                                         innerFile,
                                         &hashvalue,
                                         hjstate->hj_HashTupleSlot)))
{
    /*
     * NOTE: some tuples may be sent to future batches.  Also, it is
     * possible for hashtable->nbatch to be increased here!
     */
    ExecHashTableInsert(hashtable, slot, hashvalue);
}

so that if you try to cancel while it's sucking a really large batch file into memory, you lose. (In the pathological case I was checking, the batch file was many gigabytes in size, and had certainly never all been resident earlier.) Adding a C.F.I. inside this loop is the most straightforward fix, but I am leaning towards adding one in ExecHashJoinGetSavedTuple instead, because that would also ensure that all successful paths through ExecHashJoinOuterGetTuple will do a C.F.I. somewhere, and it seems good for that to be consistent. The other possibility is to put one inside ExecHashTableInsert, but the only other caller of that doesn't really need it, since it's relying on ExecProcNode to do one. regards, tom lane
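The fix itself is C: a CHECK_FOR_INTERRUPTS() call reachable from the batch-loading loop. For readers less familiar with the mechanism, the same cooperative-cancellation pattern can be sketched in Python (names here are illustrative, not PostgreSQL's): a signal handler only records the cancel request, so any long-running loop must poll the flag itself, which is exactly what the loop above fails to do.

```python
import signal

cancel_pending = False

def handle_cancel(signum, frame):
    # Like the backend's SIGINT handler: just record the request;
    # the actual abort happens at the next explicit check.
    global cancel_pending
    cancel_pending = True

signal.signal(signal.SIGINT, handle_cancel)

def check_for_interrupts():
    # The CHECK_FOR_INTERRUPTS() analogue: honor any pending cancel.
    if cancel_pending:
        raise KeyboardInterrupt("canceling statement due to user request")

def load_batch(saved_tuples):
    loaded = []
    for tup in saved_tuples:
        check_for_interrupts()   # without this, a huge batch is uncancelable
        loaded.append(tup)
    return loaded
```

Putting the check in the per-tuple fetch path (Tom's preferred option) bounds the cancel latency by one tuple rather than one multi-gigabyte batch file.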
Re: [HACKERS] WAL consistency check facility
Robert Haas writes: > On Tue, Feb 14, 2017 at 5:16 PM, Michael Paquier > wrote: >> Just for curiosity: does the moment when the code has been written or >> committed counts? It's no big deal seeing how liberal the Postgres >> license is, but this makes me wonder... > IANAL, but I think if you ask one, he or she will tell you that what > matters is the date the work was created. In the case of code, that > means when the code was written. FWIW, my own habit when creating new PG files is generally to write

 * Portions Copyright (c) 1996-2017, PostgreSQL Global Development Group
 * Portions Copyright (c) 1994, Regents of the University of California

even if it's "all new" code. The main reason being that it's hardly ever the case that you didn't copy-and-paste some amount of stuff out of a pre-existing file, and trying to sort out how much of what originated exactly when is an unrewarding exercise. Even if it is basically all new code, this feels like giving an appropriate amount of credit to Those Who Went Before Us. regards, tom lane
Re: [HACKERS] Re: [COMMITTERS] pgsql: Remove all references to "xlog" from SQL-callable functions in p
On Wed, Feb 15, 2017 at 2:08 AM, Robert Haas wrote: > On Mon, Feb 13, 2017 at 5:00 PM, Michael Paquier > wrote: >> It seems like the previous review I provided for the set of renaming >> patches has been ignored: >> https://www.postgresql.org/message-id/CAB7nPqSm=PNSe3EfvnEResdFzpQMcXXgapFBcF=EFdxVW4=f...@mail.gmail.com >> Good that somebody else checked... > > Sorry about that, Michael. That was careless of me. No problem. Thanks. -- Michael
Re: [HACKERS] Possible TODO: allow arbitrary expressions in event trigger WHEN
On Mon, Feb 13, 2017 at 10:34 PM, Tom Lane wrote: > Jim Nasby writes: >> Is there a reason not to allow $SUBJECT? Specifically, it'd be nice to >> be able to do something like WHEN tag LIKE 'ALTER%'. > > Seems like it would be a seriously bad idea for such an expression to be > able to invoke arbitrary SQL code. What if it calls a user-defined > function that tries to do DDL? Yeah. I remember thinking about this and deciding that allowing real expressions there was totally intractable. I don't remember what all the reasons were, but what Tom's talking about may have been at least part of it. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
[HACKERS] new high availability feature for the system with both asynchronous and synchronous replication
Hi all, I propose a new feature for high availability. This feature is effective for the following configuration:

1. The primary and a synchronous standby are in the same center, called the main center.
2. An asynchronous standby is in another center, called the backup center. (The backup center is located far away from the main center. If this replication were synchronous, performance would deteriorate, so it must be asynchronous.)
3. Asynchronous replication is performed within the backup center too.
4. When the primary in the main center stops abnormally, the standby in the main center is promoted, and the standby in the backup center connects to the new primary.

This configuration is also shown in the figure below.

[Main center]
+----------+   synchronous   +----------+
|          |   replication   |          |
| primary  | <-------------> | standby1 |
|          |                 |          |
+----------+                 +----------+
     |
     | asynchronous
     | replication
     |
     v
[Backup center]
+----------+   asynchronous  +----------+
|          |   replication   |          |
| standby2 | <-------------> | standby3 |
|          |                 |          |
+----------+                 +----------+

When the load in the main center becomes high, WAL may reach the standby in the backup center while, for various reasons, it has not yet reached the synchronous standby in the main center. In other words, the standby in the backup center may advance beyond the synchronous standby in the main center. When the primary stops abnormally and the standby in the main center is promoted, the two standbys in the backup center must then be recovered with pg_rewind. However, pg_rewind requires stopping the new primary, and if pg_basebackup is used instead, recovery of the backup center takes a long time. This is not high availability.

[Proposal Concept] With this feature, you just switch the connection destination and restart, so it is not necessary to stop the new primary. There is no need to recover with pg_rewind or pg_basebackup, because the standby in the backup center will never advance beyond the standby in the main center. In my idea, this feature is enabled by a new GUC parameter.
In the case where a synchronous standby and an asynchronous standby are both connected to the primary, the walsender checks whether WAL has been sent to the synchronous standby before sending WAL to the asynchronous standby. Only after the walsender confirms that WAL has been sent to the synchronous standby does it also send that WAL to the asynchronous standby. I would appreciate any comments on this feature. Regards, Daisuke Higuchi
Re: [HACKERS] WAL consistency check facility
On Tue, Feb 14, 2017 at 5:16 PM, Michael Paquier wrote: > On Wed, Feb 15, 2017 at 2:43 AM, Robert Haas wrote: >> On Thu, Feb 9, 2017 at 8:17 PM, Michael Paquier >> wrote: >>> Please find attached a patch with those fixes. >> >> Committed, but I changed the copyright dates to 2016-2017 rather than >> just 2017 since surely some of the code was originally written before >> 2017. Even that might not really be going back far enough, but it >> doesn't matter too much. > > Just for curiosity: does the moment when the code has been written or > committed counts? It's no big deal seeing how liberal the Postgres > license is, but this makes me wonder... IANAL, but I think if you ask one, he or she will tell you that what matters is the date the work was created. In the case of code, that means when the code was written. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Re: [HACKERS] gitlab post-mortem: pg_basebackup waiting for checkpoint
On Tue, Feb 14, 2017 at 4:06 PM, Alvaro Herrera wrote: > I'd rather have a --quiet mode instead. If you're running it by hand, > you're likely to omit the switch, whereas when writing the cron job > you're going to notice lack of switch even before you let the job run > once. Well, that might've been a better way to design it, but changing it now would break backward compatibility and I'm not really sure that's a good idea. Even if it is, it's a separate concern from whether or not in the less-quiet mode we should point out that we're waiting for a checkpoint on the server side. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Re: [HACKERS] parallelize queries containing subplans
On Mon, Feb 13, 2017 at 11:07 PM, Amit Kapila wrote: > I have removed the check of AlternativeSubPlan so that it can be > handled by expression_tree_walker. ... > Attached patch fixes all the comments raised till now. Committed after removing the reference to AlternativeSubPlan from the comment. I didn't see any need to mention that. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Re: [HACKERS] Skipping PgStat_FunctionCallUsage for many expressions
On 2017-02-14 17:58:23 -0500, Tom Lane wrote: > Andres Freund writes: > > On 2016-11-26 08:41:28 -0800, Andres Freund wrote: > >> On November 26, 2016 8:06:26 AM PST, Tom Lane wrote: > >>> Those don't call functions, they call operators. Yes, I know that an > >>> operator has a function underlying it, but the user-level expectation > >>> for track_functions is that what it counts are things that look > >>> syntactically like function calls. I'm not eager to add tracking > >>> overhead for cases that there's been exactly zero field demand for. > > >> But we do track for OpExprs? Otherwise I'd agree. > > > Bump? > > If you're going to insist on foolish consistency, I'd rather take out > tracking in OpExpr than add it in dozens of other places. I'm ok with being inconsistent, but I'd like to make that a conscious choice rather than it being the consequence of an oversight - and that's what it looks like to me. We're doing it for OpExpr, but not for a bunch of other function / operator invocations within execQual.c (namely ExecEvalDistinct, ExecEvalScalarArrayOp, ExecEvalRowCompare, ExecEvalMinMax, ExecEvalNullIf), neither do we do it for *function* invocations directly in the executor (prominently node[Window]Agg.c), but we do it for trigger invocations. That's, uh, a bit weird and hard to explain. - Andres
Re: [HACKERS] parallelize queries containing subplans
On Tue, Feb 14, 2017 at 4:24 AM, Amit Kapila wrote: > On further evaluation, it seems this patch has one big problem which > is that it will allow forming parallel plans which can't be supported > with current infrastructure. For ex. marking immediate level params > as parallel safe can generate below type of plan:
>
> Seq Scan on t1
>   Filter: (SubPlan 1)
>   SubPlan 1
>     -> Gather
>          Workers Planned: 1
>          -> Result
>               One-Time Filter: (t1.k = 0)
>               -> Parallel Seq Scan on t2
>
> In this plan, we can't evaluate one-time filter (that contains > correlated param) unless we have the capability to pass all kind of > PARAM_EXEC param to workers. I don't want to invest too much time in > this patch unless somebody can see some way using current parallel > infrastructure to implement correlated subplans. I don't think this approach has much chance of working; it just seems too simplistic. I'm not entirely sure what the right approach is. Unfortunately, the current query planner code seems to compute the sets of parameters that are set and used quite late, and really only on a per-subquery level. Here we need to know whether there is anything that's set below the Gather node and used above it, or the other way around, and we need to know it much earlier, while we're still doing path generation. There doesn't seem to be any simple way of getting that information, but I think you need it. What's more, I think you would still need it even if you had the ability to pass parameter values between processes. For example, consider:

Gather
  -> Parallel Seq Scan
       Filter: (Correlated Subplan Reference Goes Here)

Of course, the Param in the filter condition *can't* be a shared Param across all processes. It needs to be private to each process participating in the parallel sequential scan -- and the params passing data down from the Parallel Seq Scan to the correlated subplan also need to be private.
On the other hand, in your example quoted above, you do need to share across processes: the value for t1.k needs to get passed down. So it seems to me that we somehow need to identify, for each parameter that gets used, whether it's provided by something beneath the Gather node (in which case it should be private to the worker) or whether it's provided from higher up (in which case it should be passed down to the worker, or if we can't do that, then don't use parallelism there). (There are also possibly a couple of other cases, like an initPlan that needs to get executed only once, and also maybe a case where a parameter is set below the Gather and later used above the Gather. Not sure if that latter one can happen, or how to deal with it.) -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Re: [HACKERS] bytea_output vs make installcheck
On Wed, Feb 15, 2017 at 8:50 AM, Jeff Janes wrote: > make installcheck currently fails against a server running > with bytea_output = escape. > > Making it succeed is fairly easy, and I think it is worth doing. > > Attached are two options for doing that. One overrides bytea_output > locally where-ever needed, and the other overrides it for the entire > 'regression' database. > The solution that overrides bytea_output locally looks appropriate. It may not be required to change the format for the entire database. Had there been a way to convert bytea_output from 'hex' to 'escape' internally, that could have simplified this customisation even more. Regards, Neha
Re: [HACKERS] \if, \elseif, \else, \endif (was Re: PSQL commands: \quit_if, \quit_unless)
Hello Tom, So moving the conditional stack back into PsqlScanState has some side effects: conditional.[ch] have to move to the fe_utils/ dirs, and now pgbench, which does not use conditionals, would have to link to them. Is that a small price to pay for modularity and easier-to-find code? Or should I just tuck it back into psqlscan_int.[ch]? Pardon me for coming in late, but what in the world has this to do with the lexer's state at all? IOW, I don't think I like either of what you're suggesting ... The "lexer" state holds the stuff useful to psql to know where commands start and stop, to process backslash commands, including counting parentheses and nested comments and so on... It seems logical to put the "if" stack there as well, but if you think that it should be somewhere else, please advise Corey about where to put it. -- Fabien.
Re: [HACKERS] \if, \elseif, \else, \endif (was Re: PSQL commands: \quit_if, \quit_unless)
On Tue, Feb 14, 2017 at 4:44 PM, Tom Lane wrote: > Corey Huinker writes: > > So moving the conditional stack back into PsqlScanState has some side > > effects: conditional.[ch] have to move to the fe_utils/ dirs, and now > > pgbench, which does not use conditionals, would have to link to them. Is > > that a small price to pay for modularity and easier-to-find code? Or > should > > I just tuck it back into psqlscan_int.[ch]? > > Pardon me for coming in late, but what in the world has this to do with > the lexer's state at all? IOW, I don't think I like either of what you're > suggesting ... > > regards, tom lane > Patch v12 has them separated, if that was more to your liking. The stack state lived in MainLoop() and was passed into HandleSlashCommands with an extra state variable.
Re: [HACKERS] Skipping PgStat_FunctionCallUsage for many expressions
Andres Freund writes: > On 2016-11-26 08:41:28 -0800, Andres Freund wrote: >> On November 26, 2016 8:06:26 AM PST, Tom Lane wrote: >>> Those don't call functions, they call operators. Yes, I know that an >>> operator has a function underlying it, but the user-level expectation >>> for track_functions is that what it counts are things that look >>> syntactically like function calls. I'm not eager to add tracking >>> overhead for cases that there's been exactly zero field demand for. >> But we do track for OpExprs? Otherwise I'd agree. > Bump? If you're going to insist on foolish consistency, I'd rather take out tracking in OpExpr than add it in dozens of other places. regards, tom lane
Re: [HACKERS] Skipping PgStat_FunctionCallUsage for many expressions
On 2016-11-26 08:41:28 -0800, Andres Freund wrote: > On November 26, 2016 8:06:26 AM PST, Tom Lane wrote: > >Robert Haas writes: > >> On Fri, Nov 25, 2016 at 11:12 PM, Andres Freund > > wrote: > >>> while working on my faster expression evaluation stuff I noticed > >that a > >>> lot of expression types that call functions don't call the necessary > >>> functions to make track_functions work. > >>> > >>> ExecEvalFunc/ExecEvalOper (via ExecMakeFunctionResultNoSets) call > >>> pgstat_init_function_usage/pgstat_end_function_usage, but others > >like > >>> ExecEvalRowCompare, ExecEvalMinMax, ExecEvalNullIf, > >ExecEvalDistinct, > >>> ExecEvalScalarArrayOp (and indirectly ExecEvalArrayCoerceExpr) > >don't. > >>> > >>> Similarly InvokeFunctionExecuteHook isn't used very thoroughly. > >>> > >>> Are these worth fixing? I suspect yes. If so, do we want to > >backpatch? > > > >Those don't call functions, they call operators. Yes, I know that an > >operator has a function underlying it, but the user-level expectation > >for > >track_functions is that what it counts are things that look > >syntactically > >like function calls. I'm not eager to add tracking overhead for cases > >that there's been exactly zero field demand for. > > But we do track for OpExprs? Otherwise I'd agree. Bump? - Andres
Re: [HACKERS] WAL consistency check facility
On Wed, Feb 15, 2017 at 2:43 AM, Robert Haas wrote: > On Thu, Feb 9, 2017 at 8:17 PM, Michael Paquier > wrote: >> Please find attached a patch with those fixes. > > Committed, but I changed the copyright dates to 2016-2017 rather than > just 2017 since surely some of the code was originally written before > 2017. Even that might not really be going back far enough, but it > doesn't matter too much. Just out of curiosity: which counts, the moment the code was written or the moment it was committed? It's no big deal seeing how liberal the Postgres license is, but this makes me wonder... -- Michael
Re: [HACKERS] renaming pg_resetxlog to pg_resetwal has broken pg_upgrade.
On Wed, Feb 15, 2017 at 3:55 AM, Jeff Janes wrote: > I thought people would object to checking the version number in two > different places to make the same fundamental decision, and would want that > refactored somehow. But if you are OK with it, then I am. The binary versions are checked only once, which does not change with HEAD. With this patch it happens just earlier, which makes the most sense now that we have a condition depending on the version of what is installed. -- Michael
Re: [HACKERS] Set of fixes for WAL consistency check facility
On Wed, Feb 15, 2017 at 2:44 AM, Robert Haas wrote: > I committed the patch posted to the other thread. Hopefully that > closes this issue. Thanks. -- Michael
[HACKERS] bytea_output vs make installcheck
make installcheck currently fails against a server running with bytea_output = escape. Making it succeed is fairly easy, and I think it is worth doing. Attached are two options for doing that. One overrides bytea_output locally wherever needed, and the other overrides it for the entire 'regression' database. Cheers, Jeff [Attachments: installcheck_bytea_fix_2.patch, installcheck_bytea_fix_1.patch]
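For readers without the attachments, the two approaches could look roughly like the following (an illustrative sketch only; the actual patch contents may differ):

```sql
-- Option 1: override locally in each regression test file whose
-- expected output depends on the bytea output format.
SET bytea_output = 'hex';
SELECT '\xdeadbeef'::bytea;
RESET bytea_output;

-- Option 2: override once for the whole regression database, so the
-- individual test files need no changes.
ALTER DATABASE regression SET bytea_output = 'hex';
```

Either way, tests become insensitive to whatever bytea_output setting the server under test happens to run with.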
Re: [HACKERS] \if, \elseif, \else, \endif (was Re: PSQL commands: \quit_if, \quit_unless)
Corey Huinker writes: > So moving the conditional stack back into PsqlScanState has some side > effects: conditional.[ch] have to move to the fe_utils/ dirs, and now > pgbench, which does not use conditionals, would have to link to them. Is > that a small price to pay for modularity and easier-to-find code? Or should > I just tuck it back into psqlscan_int.[ch]? Pardon me for coming in late, but what in the world has this to do with the lexer's state at all? IOW, I don't think I like either of what you're suggesting ... regards, tom lane
Re: [HACKERS] gitlab post-mortem: pg_basebackup waiting for checkpoint
Robert Haas wrote: > On Tue, Feb 14, 2017 at 12:06 PM, Magnus Hagander wrote: > > However, outputing this info by default will make it show up in things like > > everybodys cronjobs by default. Right now a successful pg_basebackup run > > will come out with no output at all, which is how most Unix commands work, > > and brings it's own advantages. If we change that people will have to send > > all the output to /dev/null, resulting in missing the things that are > > actually important in any regard. > > I agree with that. I think having this show up in verbose mode is a > really good idea - when something just hangs, users don't know what's > going on, and that's bad. But showing it all the time seems like a > bridge too far. As the postmortem linked above shows, people will > think of things like "hey, let's try --verbose mode" when the obvious > thing doesn't work. What is really irritating to them is when > --verbose mode fails to be, uh, verbose. I'd rather have a --quiet mode instead. If you're running it by hand, you're likely to omit the switch, whereas when writing the cron job you're going to notice the lack of the switch even before you let the job run once. I think progress reporting ought to go to stderr anyway. -- Álvaro Herrera https://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Re: [HACKERS] Improve OR conditions on joined columns (common star schema problem)
Jim Nasby writes: > On 2/14/17 1:18 PM, Tom Lane wrote: >> One point that could use further review is whether the de-duplication >> algorithm is actually correct. I'm only about 95% convinced by the >> argument I wrote in planunionor.c's header comment. > I'll put some thought into it and see if I can find any holes. Are you > only worried about the removal of "useless" rels or is there more? Well, the key point is whether it's really OK to de-dup on the basis of only the CTIDs that are not eliminated in any UNION arm. I was feeling fairly good about that until I thought of the full-join-to-left-join-to-no-join conversion issue mentioned in the comment. Now I'm wondering if there are other holes; or maybe I'm wrong about that one and it's not necessary to be afraid of full joins. regards, tom lane
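For readers following along, the transformation under discussion looks roughly like this (hypothetical star-schema tables; the patch would perform the rewrite internally, de-duplicating on the surviving relations' CTIDs rather than on output columns as a plain UNION does):

```sql
-- Original form: an OR spanning columns of two different joined
-- dimension tables typically forces a plan that can't use an index
-- for either condition.
SELECT f.*
FROM fact f
JOIN dim1 d1 ON f.d1_id = d1.id
JOIN dim2 d2 ON f.d2_id = d2.id
WHERE d1.name = 'x' OR d2.name = 'y';

-- Union form: each arm carries one simple, indexable condition.
-- This manual rewrite only approximates the patch: the planner's
-- version de-dups on CTIDs of the non-eliminated rels, which is
-- exactly where the correctness question above arises.
SELECT f.*
FROM fact f
JOIN dim1 d1 ON f.d1_id = d1.id
JOIN dim2 d2 ON f.d2_id = d2.id
WHERE d1.name = 'x'
UNION
SELECT f.*
FROM fact f
JOIN dim1 d1 ON f.d1_id = d1.id
JOIN dim2 d2 ON f.d2_id = d2.id
WHERE d2.name = 'y';
```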
Re: [HACKERS] pg_waldump's inclusion of backend headers is a mess
Robert Haas wrote: > On Tue, Feb 14, 2017 at 2:54 PM, Andres Freund wrote: > >> Thoughts, comments, objections, better ideas? > > > > No better ideas. I'm a bit concerned about declarations needed both by > > normal and xlog related routines, but I guess that can be solved by a > > third header as you did. > > Yeah, that doesn't seem bad to me. I think it's actually fairly > unfortunate that we've just shoved declarations from 3 or 4 or 5 > different source files into a single header in many of these cases. I > think it leads to not thinking clearly about what the dependencies > between the different source files in the index AM stuff are, and > certainly there seems to be some room for improvement there, > especially with regard to gist and gin. Agreed -- on a very quick read, I like the way you split the GIN headers. I don't think the concern is really the number of source files involved, because I think in some places we're bad at splitting source files sensibly. > Sorting that out is a bigger > project than I'm prepared to undertake right now, but I think this is > probably a step in the right direction. Yes, I think so too. -- Álvaro Herrera https://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Re: [HACKERS] possibility to specify template database for pg_regress
On 14. 2. 2017 at 21:35, "Andres Freund" wrote: On 2017-02-14 14:29:56 -0600, Jim Nasby wrote: > On 2/14/17 1:59 PM, Andres Freund wrote: > > > AFAIK if you're doing make check (as opposed to installcheck) it's > > > significantly more complicated than that since you'd have to create a temp > > > cluster/install yourself. > > > > But in that case you can't have useful templates in the regression test > > either, so the whole discussion is moot? > > At that point it depends on what you're trying to do. Presumably separating > cluster control would make it much easier to script createdb/dropdb as you > suggested. That's not what I responded to... > Tom's use case might be more easily served by specifying a > template database. I don't think Pavel ever posted his use case. Wait, that's precisely what Pavel asked? I would like to use the regress test environment in my current case. 99% of the code is in plpgsql, but there is a pretty complex schema: about 300 tables, 1k views, 2k functions. Importing the schema is slow; database cloning is much faster. On 2017-02-07 16:43:47 +0100, Pavel Stehule wrote: > Is possible to specify template database for pg_regress? > > I have to run tests on database with thousands database objects. Using > template is much faster than import these objects. Obviously that only makes sense with installcheck. > Speaking for myself, my normal pattern is to have a number of separate > pg_regress suites, each of which ends up loading the extension under test. > Loading a large extension can end up being very time consuming; enough so > that I'd expect it to be much faster to create the temp cluster, load all > the prereq's once in some template database, and then use that template for > most/all of the tests. I seriously doubt that. CREATE DATABASE is ridiculously expensive, copies everything on the file-level and requires checkpoints. If your extension is more expensive than that, I'd say you're likely doing something wrong. - Andres
Re: [HACKERS] pg_waldump's inclusion of backend headers is a mess
On Tue, Feb 14, 2017 at 2:54 PM, Andres Freund wrote: >> Thoughts, comments, objections, better ideas? > > No better ideas. I'm a bit concerned about declarations needed both by > normal and xlog related routines, but I guess that can be solved by a > third header as you did. Yeah, that doesn't seem bad to me. I think it's actually fairly unfortunate that we've just shoved declarations from 3 or 4 or 5 different source files into a single header in many of these cases. I think it leads to not thinking clearly about what the dependencies between the different source files in the index AM stuff are, and certainly there seems to be some room for improvement there, especially with regard to gist and gin. Sorting that out is a bigger project than I'm prepared to undertake right now, but I think this is probably a step in the right direction. >> +++ b/src/include/access/nbtxlog.h >> @@ -0,0 +1,255 @@ >> +/*- >> + * >> + * nbtree.h >> + * header file for postgres btree xlog routines > Wrong file name. Thanks to you and Michael for the reviews. Committed. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Re: [HACKERS] Improve OR conditions on joined columns (common star schema problem)
On 2/14/17 1:18 PM, Tom Lane wrote: I think this might be code-complete, modulo the question of whether we want an enabling GUC for it. I'm still concerned about the question of whether it adds more planning time than it's worth for most users. (Obviously it needs some regression test cases too, and a lot more real-world testing than I've given it.) Don't we normally provide a GUC for this level of query manipulation? We can always remove it later if evidence shows there's not sufficient downside (perhaps after gaining the ability for the planner to do extra work on queries that appear to be large enough to warrant it). One point that could use further review is whether the de-duplication algorithm is actually correct. I'm only about 95% convinced by the argument I wrote in planunionor.c's header comment. Another reason for a GUC... I'll put some thought into it and see if I can find any holes. Are you only worried about the removal of "useless" rels or is there more? -- Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX Experts in Analytics, Data Architecture and PostgreSQL Data in Trouble? Get it in Treble! http://BlueTreble.com 855-TREBLE2 (855-873-2532)
Re: [HACKERS] Small improvement to parallel query docs
On 15 February 2017 at 03:41, Robert Haas wrote: > On Tue, Feb 14, 2017 at 5:17 AM, David Rowley > wrote: >> Updated patch attached. > > Committed and back-patched to 9.6. Great. Thanks Robert. -- David Rowley http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services
Re: [HACKERS] Official adoption of PGXN
On 2017-02-14 12:19:56 -0800, Josh Berkus wrote: > On 02/14/2017 12:05 PM, Tom Lane wrote: > > Jim Nasby writes: > >> First, just to clarify: my reasons for proposing "core adoption" of PGXN > >> are not technical in nature. > > > > What do you think "core adoption" means? Surely not that anything > > associated with PGXN would be in the core distro. > > One part of this would need to be having a designated committee of the > Postgres community pick a set of "blessed" extensions for packagers to > package. Right now, contrib serves that purpose (badly). One of the > reasons we haven't dealt with the extension distribution problem is that > nobody wanted to take on the issue of picking a list of blessed extensions. I don't see the trust problem being solved by them being blessed unless they're part of the regularly scheduled postgres back-branch releases. Which essentially requires them to be in core, or increases the release maintenance/management cost further. We sure could have levels between "random github repo" and "in core extension", but I don't see "bless external extensions" supplanting all contrib stuff. There are a few extensions in contrib/ where that level would make sense, and surely more outside, but I think moving all of contrib to something externally managed would be a horrible idea. > You have to admit that it seems really strange in the eyes of a new user > that ISN is packaged with PostgreSQL, whereas better-written and more > popular extensions (like plv8, pg_partman or pgq) are not. These actually seem like easy cases. plv8 has a massive external dependency, pg_partman is a) relatively new, b) there's in-core development of extensions, and pgq isn't exactly trivial, partially written in python, ...
Re: [HACKERS] possibility to specify template database for pg_regress
On 2017-02-14 14:29:56 -0600, Jim Nasby wrote: > On 2/14/17 1:59 PM, Andres Freund wrote: > > > AFAIK if you're doing make check (as opposed to installcheck) it's > > > significantly more complicated than that since you'd have to create a temp > > > cluster/install yourself. > > > > But in that case you can't have useful templates in the regression test > > either, so the whole discussion is moot? > > At that point it depends on what you're trying to do. Presumably separating > cluster control would make it much easier to script createdb/dropdb as you > suggested. That's not what I responded to... > Tom's use case might be more easily served by specifying a > template database. I don't think Pavel ever posted his use case. Wait, that's precisely what Pavel asked? On 2017-02-07 16:43:47 +0100, Pavel Stehule wrote: > Is possible to specify template database for pg_regress? > > I have to run tests on database with thousands database objects. Using > template is much faster than import these objects. Obviously that only makes sense with installcheck. > Speaking for myself, my normal pattern is to have a number of separate > pg_regress suites, each of which ends up loading the extension under test. > Loading a large extension can end up being very time consuming; enough so > that I'd expect it to be much faster to create the temp cluster, load all > the prereq's once in some template database, and then use that template for > most/all of the tests. I seriously doubt that. CREATE DATABASE is ridiculously expensive, copies everything on the file-level and requires checkpoints. If your extension is more expensive than that, I'd say you're likely doing something wrong. - Andres
Re: [HACKERS] Official adoption of PGXN
On 2/14/17 2:19 PM, Josh Berkus wrote: One part of this would need to be having a designated committee of the Postgres community pick a set of "blessed" extensions for packagers to package. Right now, contrib serves that purpose (badly). One of the reasons we haven't dealt with the extension distribution problem is that nobody wanted to take on the issue of picking a list of blessed extensions. That was my idea behind "all other official extensions live at <git URL here> / ". Adding some kind of reputation system to PGXN would probably be even more useful, but I certainly don't think that's mandatory for having officially blessed "core extensions". -- Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX Experts in Analytics, Data Architecture and PostgreSQL Data in Trouble? Get it in Treble! http://BlueTreble.com 855-TREBLE2 (855-873-2532)
Re: [HACKERS] Parallel Index Scans
On Tue, Feb 14, 2017 at 12:48 PM, Robert Haas wrote: > That sounds way better. Here's an updated patch. Please review my changes, which include:
* Various comment updates.
* _bt_parallel_seize now unconditionally sets *pageno to P_NONE at the beginning, instead of doing it conditionally at the end. This seems cleaner to me.
* I removed various BTScanPosInvalidate calls from _bt_first in places where they followed calls to _bt_parallel_done, because I can't see how the scan position could be valid at that point; note that _bt_first asserts that it is invalid on entry.
* I added a _bt_parallel_done() call to _bt_first where it apparently returned without releasing the scan; search for SK_ROW_MEMBER. Maybe there's something I'm missing that makes this unnecessary, but if so there should probably be a comment here.
* I wasn't happy with the strange calling convention where _bt_readnextpage usually gets a valid block number but not for non-parallel backward scans. I had a stab at fixing that so that the block number is always valid, but I'm not entirely sure I've got the logic right. Please see what you think.
* I repositioned the function prototypes you added to nbtree.h to separate the public and non-public interfaces.
I can't easily test this because your second patch doesn't apply, so I'd appreciate it if you could have a stab at checking whether I've broken anything in this revision. Also, it would be good if you could rebase the second patch. I think this is pretty close to committable at this point. Whether or not I broke anything in this revision, I don't think there's too much left to be done here. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company [Attachment: parallel_index_scan_v9.patch]
Re: [HACKERS] possibility to specify template database for pg_regress
On 2/14/17 1:59 PM, Andres Freund wrote: AFAIK if you're doing make check (as opposed to installcheck) it's significantly more complicated than that since you'd have to create a temp cluster/install yourself. > But in that case you can't have useful templates in the regression test either, so the whole discussion is moot? At that point it depends on what you're trying to do. Presumably separating cluster control would make it much easier to script createdb/dropdb as you suggested. Tom's use case might be more easily served by specifying a template database. I don't think Pavel ever posted his use case. Speaking for myself, my normal pattern is to have a number of separate pg_regress suites, each of which ends up loading the extension under test. Loading a large extension can end up being very time consuming; enough so that I'd expect it to be much faster to create the temp cluster, load all the prereq's once in some template database, and then use that template for most/all of the tests. In that scenario separating cluster create/drop would certainly be useful, but the template option would probably be helpful as well (though since pg_regress' diff-based methodology just gets in my way I'd likely use some other harness to actually run the tests). -- Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX Experts in Analytics, Data Architecture and PostgreSQL Data in Trouble? Get it in Treble! http://BlueTreble.com 855-TREBLE2 (855-873-2532)
Re: [HACKERS] Official adoption of PGXN
On 02/14/2017 12:05 PM, Tom Lane wrote: > Jim Nasby writes: >> First, just to clarify: my reasons for proposing "core adoption" of PGXN >> are not technical in nature. > > What do you think "core adoption" means? Surely not that anything > associated with PGXN would be in the core distro. One part of this would need to be having a designated committee of the Postgres community pick a set of "blessed" extensions for packagers to package. Right now, contrib serves that purpose (badly). One of the reasons we haven't dealt with the extension distribution problem is that nobody wanted to take on the issue of picking a list of blessed extensions. > >> Right now contrib is serving two completely separate purposes: > >> 1) location for code that (for technical reasons) should be tied to >> specific PG versions >> 2) indication of "official endorsement" of a module by the community > > This argument ignores what I think is the real technical reason for > keeping contrib, which is to have a set of close-at-hand test cases > for extension and hook mechanisms. Certainly, not every one of the > existing contrib modules is especially useful for that purpose, but > quite a few of them are. Yes. But there's a bunch that aren't, and those are the ones which we previously discussed, the ones with indifferent maintenance like ISN and Intarray. You have to admit that it seems really strange in the eyes of a new user that ISN is packaged with PostgreSQL, whereas better-written and more popular extensions (like plv8, pg_partman or pgq) are not. -- Josh Berkus Containers & Databases Oh My!
Re: [HACKERS] Official adoption of PGXN
On 2/14/17 2:05 PM, Tom Lane wrote: Jim Nasby writes: First, just to clarify: my reasons for proposing "core adoption" of PGXN are not technical in nature. What do you think "core adoption" means? Surely not that anything associated with PGXN would be in the core distro. No, certainly not. If anything, PGXN being a first class citizen would allow for potentially removing code from core, since there would then be a first-class way for users to add it. Right now contrib is serving two completely separate purposes: 1) location for code that (for technical reasons) should be tied to specific PG versions 2) indication of "official endorsement" of a module by the community This argument ignores what I think is the real technical reason for keeping contrib, which is to have a set of close-at-hand test cases for extension and hook mechanisms. Certainly, not every one of the existing contrib modules is especially useful for that purpose, but quite a few of them are. I was under the impression that a lot of that had moved to test, but yes, that's a consideration. That said, if local caching was added to pgxnclient I don't think it'd require much change to just pull those cases from PGXN instead of the core repo. Alternatively, they could be pulled from a git repo that houses the source code for the official PGXN modules (or what PGXN calls a distribution). Those kinds of changes would actually help any extension author that wants to depend on another extension (namely, automatically pulling dependent extensions from PGXN). I have make targets that currently accomplish this. They'd be nicer with some additions to both extensions and PGXN, but IMHO they're workable right now. -- Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX Experts in Analytics, Data Architecture and PostgreSQL Data in Trouble? Get it in Treble!
http://BlueTreble.com 855-TREBLE2 (855-873-2532)
Re: [HACKERS] Official adoption of PGXN (was: removing tsearch2)
Jim Nasby writes:
> First, just to clarify: my reasons for proposing "core adoption" of PGXN are not technical in nature.

What do you think "core adoption" means? Surely not that anything associated with PGXN would be in the core distro.

> Right now contrib is serving two completely separate purposes: 1) location for code that (for technical reasons) should be tied to specific PG versions 2) indication of "official endorsement" of a module by the community

This argument ignores what I think is the real technical reason for keeping contrib, which is to have a set of close-at-hand test cases for extension and hook mechanisms. Certainly, not every one of the existing contrib modules is especially useful for that purpose, but quite a few of them are.

regards, tom lane
Re: [HACKERS] possibility to specify template database for pg_regress
On 2017-02-14 12:33:35 -0600, Jim Nasby wrote:
> On 2/13/17 8:50 PM, Andres Freund wrote:
>> On 2017-02-14 11:46:52 +0900, Michael Paquier wrote:
>>>> I still fail to see why --use-existing as suggested in https://www.postgresql.org/message-id/20170208002900.vkldujzfkwbvq...@alap3.anarazel.de isn't sufficient.
>>> Some tests create objects without removing them, meaning that continuous runs would fail with only --use-existing. This patch brings value in such cases.
>> You can trivially script the CREATE/DROP DB outside with --use-existing. Which seems a lot more flexible than adding more and more options to pg_regress.
> AFAIK if you're doing make check (as opposed to installcheck) it's significantly more complicated than that since you'd have to create a temp cluster/install yourself.

But in that case you can't have useful templates in the regression test either, so the whole discussion is moot?
Re: [HACKERS] pg_waldump's inclusion of backend headers is a mess
Hi,

On 2017-02-13 22:42:00 -0500, Robert Haas wrote:
> I dug into the problem and discovered that pg_waldump is slurping up a tremendous crapload of backend headers. It includes gin.h, gist_private.h, hash_xlog.h, nbtree.h, and spgist.h, which all end up including amapi.h, which includes genam.h, which includes tidbitmap.h.

Right. Hard to avoid with the current organization - IIRC Alvaro, I (and others?) had talked about doing more aggressive reorganization first, but the patch was already big enough...

> When you make tidbitmap.h include utils/dsa.h, it in turn includes port/atomics.h, which has an #error preventing frontend includes

Has to, because otherwise there are undefined variable/symbol references on some, but not all, platforms.

> There are a number of ways to fix this problem; probably the cheapest available hack is to stick #ifndef FRONTEND around the additional stuff getting added to tidbitmap.h. But that seems to be attacking the problem from the wrong end.

Agreed.

> Therefore, I proposed the attached patch, which splits spgxlog.h out of spgist_private.h, nbtxlog.h out of nbtree.h, gistxlog.h out of gist_private.h, and ginxlog.h and ginblock.h out of gin_private.h. These new header files are included by pg_waldump in lieu of the "private" versions. This solves the immediate problem and I suspect it will head off future problems as well.
>
> Thoughts, comments, objections, better ideas?

No better ideas. I'm a bit concerned about declarations needed both by normal and xlog related routines, but I guess that can be solved by a third header as you did.

> +++ b/src/include/access/nbtxlog.h
> @@ -0,0 +1,255 @@
> +/*-
> + *
> + * nbtree.h
> + *   header file for postgres btree xlog routines

Wrong file name.

Greetings,

Andres Freund
[HACKERS] Official adoption of PGXN (was: removing tsearch2)
First, just to clarify: my reasons for proposing "core adoption" of PGXN are not technical in nature. My desire is to have an extension/add-on system that's officially endorsed and embraced by the official community, similar to CPAN, PyPI, npm, etc. There's no technical reason we need PGXN to be a first class citizen, but IMHO making it a first class citizen would greatly enhance the usefulness of Postgres and strengthen & expand the community. That community expansion should eventually result in an increase in new contributors to the database code itself.

On 2/14/17 8:24 AM, Robert Haas wrote:
> On Tue, Feb 14, 2017 at 8:37 AM, Jim Nasby wrote:
>> Right; I think we need at least some amount of pgxn buildfarm coverage. There probably also needs to be a way to officially bless certain distributions. Unless there's a pretty significant need for an official extension to be in contrib, it should go into PGXN.
> I'm not sure a need-based test is going to be entirely the right thing. The advantage of having something in contrib is that the core developers will maintain it for you.

With fairly minimal effort, that could be true of something in another repository though.

> When things change from release to release, everything in contrib necessarily gets updated; things on PGXN or elsewhere only get updated if their owners update them.

Right, and the intricate tie between contrib and the rest of the database is why I'm not proposing ditching contrib completely. There are clearly some cases when that close tie makes the contrib code significantly simpler. (Though, it'd certainly be great if we found a way to reduce the multi-version support overhead for all extension creators!)

> There are several good things about that. First, it means that people can rely on the stuff in contrib mostly sticking around for future releases, except occasionally when we decide to drop a module. Second, it gives maintainers of external modules examples of what they need to do to update their code when we change the core APIs. Third, it's probably actually more efficient for one person to go sweep through a large number of modules and fix them all at once than if those things were all on PGXN with separate owners. Now, you can overdo that: I don't want to be responsible for every module on PGXN or anything close to it. But on balance I think there's a good case for shipping a substantial set of modules in contrib.

I agree with your points, but AFAIK there's no reason that needs to happen in contrib. There could certainly be a dedicated git.PG.org repo for official extensions. AFAICT that would meet all your points (understanding that code that really needed to be tied to specific versions could remain in contrib).

Another option would be to start with just publishing most/all of what's in contrib on PGXN. Most OS packaging solutions for contrib seem rather clunky/arbitrary to me, so having pgxnclient as an install option would probably be welcome to some users. This would mean an additional step in the release process, but AFAIK that could be hidden behind a single make target (whoever's doing the release would also need access to the relevant account on pgxn.org).

[... points about current contrib modules that I agree with ...]

> There's a lot of good stuff in contrib, though, and I don't think we should be averse to adding more in the future. It shouldn't be just that any random contrib module somebody writes can get dropped into the core distribution, but I think we ought to be open to reasonable proposals to add things there when it makes sense.
Right now contrib is serving two completely separate purposes:
1) location for code that (for technical reasons) should be tied to specific PG versions
2) indication of "official endorsement" of a module by the community

My view is that we've turned #2 into a nail simply because of the hammer we built for #1 (certainly understandable, since contrib way pre-dates extensions, let alone PGXN). The community has discussions (about once a year) about what should or shouldn't stay in contrib, and a lot of the decision making process ends up driven by #2. If we had another official distribution channel for extensions, we could completely separate #1 and #2: contrib is strictly for official community extensions that are greatly simplified by being in the main repo; all other official extensions live at <insert location here>. With some effort (hopefully from newly attracted extension authors), #1 could probably be eliminated as well, to the benefit of other external projects.

-- Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX Experts in Analytics, Data Architecture and PostgreSQL Data in Trouble? Get it in Treble! http://BlueTreble.com 855-TREBLE2 (855-873-2532)
Re: [HACKERS] Improve OR conditions on joined columns (common star schema problem)
I wrote:
> The main remaining piece of work here is that, as you can see from the above, it fails to eliminate joins to tables that we don't actually need in a particular UNION arm. This is because the references to those tables' ctid columns prevent analyzejoins.c from removing the joins. I've thought about ways to deal with that but haven't come up with anything that wasn't pretty ugly and/or invasive.

I thought of a way that wasn't too awful, which was to inject the requests for CTID columns only after we've done join removal. The attached v2 patch produces this for your test case:

                                       QUERY PLAN
---
 Aggregate  (cost=36.84..36.85 rows=1 width=8) (actual time=0.075..0.075 rows=1 loops=1)
   ->  Result  (cost=36.77..36.83 rows=4 width=0) (actual time=0.069..0.073 rows=3 loops=1)
         ->  Unique  (cost=36.77..36.79 rows=4 width=6) (actual time=0.069..0.072 rows=3 loops=1)
               ->  Sort  (cost=36.77..36.78 rows=4 width=6) (actual time=0.068..0.069 rows=4 loops=1)
                     Sort Key: f.ctid
                     Sort Method: quicksort  Memory: 25kB
                     ->  Append  (cost=4.57..36.73 rows=4 width=6) (actual time=0.018..0.033 rows=4 loops=1)
                           ->  Nested Loop  (cost=4.57..18.37 rows=2 width=6) (actual time=0.018..0.020 rows=2 loops=1)
                                 ->  Index Scan using dim_t_key on dim d1  (cost=0.28..8.29 rows=1 width=10) (actual time=0.005..0.005 rows=1 loops=1)
                                       Index Cond: ('1'::text = t)
                                 ->  Bitmap Heap Scan on fact f  (cost=4.30..10.05 rows=2 width=14) (actual time=0.010..0.012 rows=2 loops=1)
                                       Recheck Cond: (f1 = d1.s)
                                       Heap Blocks: exact=2
                                       ->  Bitmap Index Scan on f_f1  (cost=0.00..4.29 rows=2 width=0) (actual time=0.006..0.006 rows=2 loops=1)
                                             Index Cond: (f1 = d1.s)
                           ->  Nested Loop  (cost=4.57..18.37 rows=2 width=6) (actual time=0.009..0.011 rows=2 loops=1)
                                 ->  Index Scan using dim_t_key on dim d2  (cost=0.28..8.29 rows=1 width=10) (actual time=0.003..0.003 rows=1 loops=1)
                                       Index Cond: ('1'::text = t)
                                 ->  Bitmap Heap Scan on fact f  (cost=4.30..10.05 rows=2 width=14) (actual time=0.005..0.006 rows=2 loops=1)
                                       Recheck Cond: (f2 = d2.s)
                                       Heap Blocks: exact=2
                                       ->  Bitmap Index Scan on f_f2  (cost=0.00..4.29 rows=2 width=0) (actual time=0.003..0.003 rows=2 loops=1)
                                             Index Cond: (f2 = d2.s)
 Planning time: 0.732 ms
 Execution time: 0.156 ms
(25 rows)

I think this might be code-complete, modulo the question of whether we want an enabling GUC for it. I'm still concerned about the question of whether it adds more planning time than it's worth for most users. (Obviously it needs some regression test cases too, and a lot more real-world testing than I've given it.)

One point that could use further review is whether the de-duplication algorithm is actually correct. I'm only about 95% convinced by the argument I wrote in planunionor.c's header comment.

Potential future work includes finding join ORs in upper-level INNER JOIN ON clauses, and improving matters so we can do the unique-ification with hashing as well as sorting. I don't think either of those things has to be in the first commit, though.

regards, tom lane

diff --git a/src/backend/nodes/outfuncs.c b/src/backend/nodes/outfuncs.c
index 1560ac3..fa34efb 100644
*** a/src/backend/nodes/outfuncs.c
--- b/src/backend/nodes/outfuncs.c
*** _outPlannerInfo(StringInfo str, const Pl
*** 2078,2083 ****
--- 2078,2084 ----
  	WRITE_FLOAT_FIELD(tuple_fraction, "%.4f");
  	WRITE_FLOAT_FIELD(limit_tuples, "%.0f");
  	WRITE_UINT_FIELD(qual_security_level);
+ 	WRITE_BOOL_FIELD(needBaseTids);
  	WRITE_BOOL_FIELD(hasInheritedTarget);
  	WRITE_BOOL_FIELD(hasJoinRTEs);
  	WRITE_BOOL_FIELD(hasLateralRTEs);
diff --git a/src/backend/optimizer/plan/Makefile b/src/backend/optimizer/plan/Makefile
index 88a9f7f..1db6bd5 100644
*** a/src/backend/optimizer/plan/Makefile
--- b/src/backend/optimizer/plan/Makefile
*** top_builddir = ../../../..
*** 13,18 ****
  include $(top_builddir)/src/Makefile.global
  
  OBJS = analyzejoins.o createplan.o initsplan.o planagg.o planmain.o planner.o \
! 	setrefs.o subselect.o
  include
Re: [HACKERS] PATCH: two slab-like memory allocators
On 02/14/2017 03:22 AM, Andres Freund wrote: Hi, On 2017-02-11 14:40:18 +0100, Tomas Vondra wrote: On 02/11/2017 02:33 AM, Andres Freund wrote:

I have a hard time believing the cache efficiency of linked lists (which may or may not be real in this case) outweighs this, but if you want to try, be my guest.

I'm not following - why would there be overhead in anything for allocations bigger than 4 (or maybe 8) bytes? You can store the list (via chunk ids, not pointers) inside the chunks themselves, where data otherwise would be. And I don't see why you'd need a doubly linked list, as the only operations that are needed are to push to the front of the list, and to pop from the front of the list - and both operations are simple to do with a singly linked list?

Oh! I have not considered storing the chunk indexes (for linked lists) in the chunk itself, which obviously eliminates the overhead concerns, and you're right, a singly-linked list should be enough. There will be some minimum-chunk-size requirement, depending on the block size/chunk size. I wonder whether it makes sense to try to be smart and make it dynamic, so that we only require 1B or 2B for cases when only that many chunks fit into a block, or just say that it's 4B and be done with it.

I doubt it's worth it - it seems likely that the added branch is more noticeable overall than the possible savings of 3 bytes. Also, won't the space be lost due to alignment *anyway*?

+ /* chunk, including SLAB header (both addresses nicely aligned) */
+ fullChunkSize = MAXALIGN(sizeof(SlabChunkData) + MAXALIGN(chunkSize));

In that case I'd just Assert(MAXIMUM_ALIGNOF >= sizeof(slist_head)) and use a plain slist - no point in being more careful than that.

Hmm, I think you're right. I mean 2^32 chunks ought to be enough for anyone, right?
Yea, that seems enough; but given the alignment thing pointed out above, I think we can just use plain pointers - and that definitely should be enough :P

People in year 2078: Why the hell did they only use 32 bits? Wasn't it obvious we'll have tiny computers with 32EB of RAM? ;-)

I'm still not buying the cache efficiency argument, though. One of the reasons is that the implementation prefers blocks with fewer free chunks when handling palloc(), so pfree() is making the block less likely to be chosen by the next palloc(). That'll possibly de-optimize L1, but for L2 usage the higher density seems like it'll be a win. All this memory is only accessed by a single backend, so packing as densely as possible is good.

If so, if you have e.g. 8 byte allocations and 64kb sized blocks, you end up with an array of 1024 doubly linked lists, which'll take up 64kb on its own. And there are certainly scenarios where even bigger block sizes could make sense. That's both memory overhead, and runtime overhead, because at reset-time we'll have to check the whole array (which'll presumably largely be empty lists). Resetting is a pretty common path...

True, but it's not entirely clear if resetting is common for the paths where we use those new allocators.

That's fair enough. There's still the memory overhead, but I guess we can also live with that.

Right. My ambition was not to develop another general-purpose memory context that would work perfectly for everything, but something that works (better than the current code) for places like reorderbuffer. Also, if we accept that it might be a problem, what other solution do you propose? I don't think just merging everything into a single list is a good idea, for the reasons I explained before (it might make the resets somewhat less expensive, but it'll make pfree() more expensive).
>> Now that I think about it, a binary heap, as suggested elsewhere, isn't entirely trivial to use for this - it's more or less trivial to "fix" the heap after changing an element's value, but it's harder to find that element first.

But a two-level list approach seems like it could work quite well - basically a simplified skip-list. A top-level list that has the property that all the elements have a distinct #free, and then hanging off it sub-lists for all the other blocks with the same number of chunks. I think that'd be a better implementation, but I can understand if you don't immediately want to go there.

I don't want to go there. I'm not all that interested in reorderbuffer, to be honest, and this started more as a "Hold my beer!" hack, after a midnight discussion with Petr, than a seriously meant patch. I've already spent like 100x more time on it than I expected. What might work is replacing the array with a list, though. So we'd have a list of lists, which would eliminate the array overhead.

That seems likely to be significantly worse, because a) iteration is more expensive b) accessing the relevant list to move between two different "freecount" lists would be O(n).

Oh, right, I hadn't realized we won't know the current head of the list,
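The freelist-in-chunk idea discussed above can be sketched in a few lines of C. This is an illustrative toy, not the actual slab.c patch: the names (SlabBlockSketch and friends) are made up, there is no real block header layout here, and production code would additionally have to worry about MAXALIGN as debated in the thread. The free list is threaded through the free chunks themselves as int32 chunk indexes, so an allocated chunk carries no per-chunk list overhead, and both operations work at the front of a singly linked list:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define CHUNK_SIZE 8            /* must be large enough to hold an int32 link */
#define NCHUNKS    64

typedef struct SlabBlockSketch
{
    int32_t freehead;           /* index of first free chunk, -1 if block full */
    int32_t nfree;
    char    data[NCHUNKS][CHUNK_SIZE];
} SlabBlockSketch;

static void
slab_sketch_init(SlabBlockSketch *b)
{
    /* thread every chunk onto the free list: chunk i links to chunk i+1 */
    for (int32_t i = 0; i < NCHUNKS; i++)
    {
        int32_t next = (i == NCHUNKS - 1) ? -1 : i + 1;

        memcpy(b->data[i], &next, sizeof(next));
    }
    b->freehead = 0;
    b->nfree = NCHUNKS;
}

static void *
slab_sketch_alloc(SlabBlockSketch *b)
{
    char   *chunk;

    if (b->freehead < 0)
        return NULL;            /* block is full */
    chunk = b->data[b->freehead];
    /* pop from the front: the link stored in the chunk becomes the new head */
    memcpy(&b->freehead, chunk, sizeof(b->freehead));
    b->nfree--;
    return chunk;
}

static void
slab_sketch_free(SlabBlockSketch *b, void *p)
{
    int32_t idx = (int32_t) (((char *) p - &b->data[0][0]) / CHUNK_SIZE);

    /* push to the front: store the old head inside the freed chunk */
    memcpy(p, &b->freehead, sizeof(b->freehead));
    b->freehead = idx;
    b->nfree++;
}
```

Because the list is LIFO, a just-freed chunk is handed back out first, which is the cache-locality behavior the thread is weighing against per-block packing density.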
Re: [HACKERS] renaming pg_resetxlog to pg_resetwal has broken pg_upgrade.
On Mon, Feb 13, 2017 at 6:19 PM, Michael Paquier wrote:
> On Tue, Feb 14, 2017 at 9:09 AM, Jeff Janes wrote:
>> check for "/usr/local/pgsql9_6/bin/pg_resetwal" failed: No such file or directory
>>
>> This looks somewhat complicated to fix. Should check_bin_dir test the old cluster version, and make a deterministic check based on that? Or just check for either spelling, and stash the successful result somewhere?
>
> The fix does not seem that complicated to me. get_bin_version() just needs pg_ctl to be present, so we could move that in check_bin_dir() after looking if pg_ctl is in a valid state, and reuse the version of bin_version to see if the binary version is post-10 or not. Then the decision making just depends on this value. Please see the patch attached, this is passing 9.6->10 and check-world.

That fixes it for me. I thought people would object to checking the version number in two different places to make the same fundamental decision, and would want that refactored somehow. But if you are OK with it, then I am.

Cheers, Jeff
Re: [HACKERS] gitlab post-mortem: pg_basebackup waiting for checkpoint
On Tue, Feb 14, 2017 at 12:06 PM, Magnus Hagander wrote:
> However, outputting this info by default will make it show up in things like everybody's cronjobs by default. Right now a successful pg_basebackup run will come out with no output at all, which is how most Unix commands work, and brings its own advantages. If we change that people will have to send all the output to /dev/null, resulting in missing the things that are actually important in any regard.

I agree with that. I think having this show up in verbose mode is a really good idea - when something just hangs, users don't know what's going on, and that's bad. But showing it all the time seems like a bridge too far. As the postmortem linked above shows, people will think of things like "hey, let's try --verbose mode" when the obvious thing doesn't work. What is really irritating to them is when --verbose mode fails to be, uh, verbose.

-- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Re: [HACKERS] Proposal: GetOldestXminExtend for ignoring arbitrary vacuum flags
On 2/14/17 3:13 AM, Seki, Eiji wrote:
> +extern TransactionId GetOldestXmin(Relation rel, uint8 ignoreFlags);

My impression is that most other places that do this sort of thing just call the argument 'flags', so as not to "lock in" a single idea of what the flags are for. I can't readily think of another use for flags in GetOldestXmin, but ISTM it's better to just go with "flags" instead of "ignoreFlags".

-- Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX Experts in Analytics, Data Architecture and PostgreSQL Data in Trouble? Get it in Treble! http://BlueTreble.com 855-TREBLE2 (855-873-2532)
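To illustrate the rename being suggested: the parameter itself can carry the generic name "flags" while each flag value keeps a descriptive, namespaced name, so callers still see what a flag means and the definitions don't collide with unrelated ones. The sketch below is hypothetical - these are not the committed PostgreSQL definitions, and ProcSketch stands in for the real shared proc state:

```c
#include <assert.h>
#include <stdint.h>

/* Illustrative flag names only; the committed names may differ. */
#define PROC_SKETCH_IN_VACUUM   0x01    /* ignore backends running lazy vacuum */
#define PROC_SKETCH_IN_ANALYZE  0x02    /* ignore backends running analyze */

/* convenience combination, analogous to the IGNORE_FLAGS_XXX idea */
#define PROC_SKETCH_VACUUM_ANALYZE \
    (PROC_SKETCH_IN_VACUUM | PROC_SKETCH_IN_ANALYZE)

typedef struct ProcSketch
{
    uint8_t     statusFlags;    /* what this backend is doing */
    uint32_t    xmin;           /* oldest xid this backend can see */
} ProcSketch;

/*
 * Return the oldest xmin among procs, skipping any proc whose status
 * matches a bit in "flags" -- the shape of GetOldestXmin(rel, flags).
 */
static uint32_t
oldest_xmin_sketch(const ProcSketch *procs, int nprocs, uint8_t flags)
{
    uint32_t    result = UINT32_MAX;

    for (int i = 0; i < nprocs; i++)
    {
        if (procs[i].statusFlags & flags)
            continue;           /* caller asked to ignore this backend */
        if (procs[i].xmin < result)
            result = procs[i].xmin;
    }
    return result;
}
```

The descriptive flag names answer the naming concern in the reply quoted in the first thread above: the argument stays generic, the constants stay self-explanatory.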
Re: [HACKERS] possibility to specify template database for pg_regress
On 2/13/17 8:50 PM, Andres Freund wrote:
> On 2017-02-14 11:46:52 +0900, Michael Paquier wrote:
>>> I still fail to see why --use-existing as suggested in https://www.postgresql.org/message-id/20170208002900.vkldujzfkwbvq...@alap3.anarazel.de isn't sufficient.
>> Some tests create objects without removing them, meaning that continuous runs would fail with only --use-existing. This patch brings value in such cases.
> You can trivially script the CREATE/DROP DB outside with --use-existing. Which seems a lot more flexible than adding more and more options to pg_regress.

AFAIK if you're doing make check (as opposed to installcheck) it's significantly more complicated than that since you'd have to create a temp cluster/install yourself.

As an extension author, I'd *love* to have the cluster management stuff in pg_regress broken out: it's the only reason I use pg_regress, and pg_regress's idea of what a test failure is just gets in my way. But breaking that out is far more invasive than allowing a template database.

-- Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX Experts in Analytics, Data Architecture and PostgreSQL Data in Trouble? Get it in Treble! http://BlueTreble.com 855-TREBLE2 (855-873-2532)
Re: \if, \elseif, \else, \endif (was Re: [HACKERS] PSQL commands: \quit_if, \quit_unless)
On Mon, Feb 13, 2017 at 3:40 PM, Fabien COELHO wrote:
> What would be the mnemonic for "," an "@"?

Oh, I just picked it because control-@ is the nul character, and your commands would be nullified. I realize that's pretty weak, but we're talking about finding a punctuation mark to represent the concept of commands-are-currently-being-skipped, and it doesn't seem particularly worse than ^ to represent single-line mode. If somebody's got a better idea, fine, but there aren't that many unused punctuation marks to choose from, and I think it's better to use a punctuation mark rather than, say, a letter, like 's' for skip. Otherwise you might have the prompt change from:

banana=>

to

bananas>

Which I think is less obvious than

banana@>

-- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Re: [HACKERS] log_autovacuum_min_duration doesn't log VACUUMs
On 2/13/17 11:12 AM, Robert Haas wrote:
>> FWIW, this is a significant problem outside of DDL. Once you're past 1-2 levels of nesting, SET client_min_messages = DEBUG becomes completely useless. I think the ability to filter logging based on context would be very valuable. AFAIK you could actually do that for manual logging with existing plpgsql support, but obviously that won't help for anything else.
> Well, that's moving the goalposts a lot further and in an unrelated direction.

Short of someone getting a flash of brilliance, I agree. I tried indicating as much with my "FWIW" but obviously that wasn't explicit enough.

-- Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX Experts in Analytics, Data Architecture and PostgreSQL Data in Trouble? Get it in Treble! http://BlueTreble.com 855-TREBLE2 (855-873-2532)
Re: [HACKERS] Provide list of subscriptions and publications in psql's completion
On 2/13/17 9:36 PM, Michael Paquier wrote:
> Probably I failed to get Peter's point... Anyway IMO that we can expose all the columns except the sensitive information (i.e., subconninfo field) in pg_subscription to even non-superusers. Then we can use pg_subscription for the tab-completion for ALTER/DROP SUBSCRIPTION.
>
> To be honest, I find subconninfo quite useful to check where a subscription is getting its changes from, so I'd rather not touch it. It looks as well a bit overkill to just create a new view on an object type non-superusers cannot even use... There are already 1 view and 1 system catalog related to it, so I'd be of the opinion to let it fail silently with a permission error and keep it as an empty list for them.

Why not do what we do for pg_stat_activity.current_query and leave it NULL for non-SU? Even better would be if we could simply strip out password info. Presumably we already know how to parse the contents, so I'd think that shouldn't be that difficult.

-- Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX Experts in Analytics, Data Architecture and PostgreSQL Data in Trouble? Get it in Treble! http://BlueTreble.com 855-TREBLE2 (855-873-2532)
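Stripping the password from a conninfo string is indeed mostly mechanical. A real backend implementation would parse the string with libpq-style machinery (e.g. PQconninfoParse) and blank out the password option; the hand-rolled sketch below, with made-up names, only handles simple unquoted keyword=value strings, but shows the idea:

```c
#include <assert.h>
#include <ctype.h>
#include <string.h>

/*
 * Overwrite the value of every "password=" keyword in a conninfo string
 * with 'x' characters, in place.  Illustrative sketch only: it does not
 * handle quoted values ("password='a b'") or whitespace around '='.
 */
static void
redact_conninfo_password(char *conninfo)
{
    char   *p = strstr(conninfo, "password=");

    while (p != NULL)
    {
        /* make sure "password" starts a keyword, not e.g. "xpassword=" */
        if (p == conninfo || isspace((unsigned char) p[-1]))
        {
            char   *v = p + strlen("password=");

            while (*v && !isspace((unsigned char) *v))
                *v++ = 'x';     /* mask the value, preserving its length */
        }
        p = strstr(p + 1, "password=");
    }
}
```

For example, "host=db1 user=rep password=s3cret dbname=postgres" becomes "host=db1 user=rep password=xxxxxx dbname=postgres"; strings without a password keyword pass through untouched.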
[HACKERS] Add checklist item for psql completion to commitfest review
After seeing Yet Another Missing Psql Tab Completion it occurred to me... why not add a checklist item to the commitfest review page? I realize most regular contributors don't use the form, but a fair number of people do. I don't see how it could hurt.

Another possible idea is a git hook that checks to see if the psql completion code has been touched if any of the grammar has been. That could certainly trigger false positives, so it'd need to be easy to override, but AFAIK that could be done via a special phrase in the commit message.

-- Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX Experts in Analytics, Data Architecture and PostgreSQL Data in Trouble? Get it in Treble! http://BlueTreble.com 855-TREBLE2 (855-873-2532)
Re: [HACKERS] Possible TODO: allow arbitrary expressions in event trigger WHEN
On 2/13/17 9:34 PM, Tom Lane wrote:
> Jim Nasby writes:
>> Is there a reason not to allow $SUBJECT? Specifically, it'd be nice to be able to do something like WHEN tag LIKE 'ALTER%'.
> Seems like it would be a seriously bad idea for such an expression to be able to invoke arbitrary SQL code. What if it calls a user-defined function that tries to do DDL?

Hmm... could we temporarily mark the transaction as being read-only? Though, can't users already run arbitrary code inside the triggers themselves?

If we don't want arbitrary DDL there might be other stuff we'd presumably want to prevent. FDW access comes to mind. So maybe just restrict what nodes can appear in the expression. You'd want to allow operators in that list, which still leaves a bit of a hole, but if you're going to take up chainsaw juggling you better know what you're doing...

-- Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX Experts in Analytics, Data Architecture and PostgreSQL Data in Trouble? Get it in Treble! http://BlueTreble.com 855-TREBLE2 (855-873-2532)
Re: [HACKERS] removing tsearch2
On Feb 14, 2017, at 9:37 AM, Magnus Hagander wrote:
> It's a failing in one of the two at least. It either needs to be easier to build the things on windows, or pgxn would need to learn to do binary distributions.

PGXN makes no effort to support installation on any platform at all. Happy to work with anyone who wants to add binary distribution, but supporting multiple platforms might be a PITA. Maybe there’d be a way to integrate with the RPM and .deb and Windows repos (is there something like that for Windows?).

> Even if we get the building easier on windows, it'll likely remain a second class citizen (though better than today's third class), given the amount of windows machines that actually have a compiler on them for start. Pgxs in Windows would be a big improvement, but it won't solve the problem.

Yep.

David
Re: [HACKERS] Parallel Index Scans
On Mon, Feb 13, 2017 at 9:04 PM, Amit Kapila wrote:
> I think the comment at that place is not as clear as it should be. So how about changing it as below:
>
> Existing comment:
> --
> /*
>  * For parallel scans, get the last page scanned as it is quite
>  * possible that by the time we try to fetch previous page, other
>  * worker has also decided to scan that previous page. So we
>  * can't rely on _bt_walk_left call.
>  */
>
> Modified comment:
> --
> /*
>  * For parallel scans, it is quite possible that by the time we try to fetch
>  * the previous page, another worker has also decided to scan that
>  * previous page. So to avoid that we need to get the last page scanned
>  * from shared scan descriptor before calling _bt_walk_left.
>  */

That sounds way better.

-- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Re: [HACKERS] Set of fixes for WAL consistency check facility
On Mon, Feb 13, 2017 at 8:00 PM, Michael Paquier wrote:
> Beginning a new thread to raise awareness... As already reported here, I had a look at what has been committed in a507b869: https://www.postgresql.org/message-id/CAB7nPqRzCQb=vdfhvmtp0hmlbhu6z1agdo4gjsup-hp8jx+...@mail.gmail.com
>
> Here are a couple of things I have noticed while looking at the code:
>
> + * Portions Copyright (c) 2016, PostgreSQL Global Development Group
> s/2016/2017/ in bufmask.c and bufmask.h.
>
> + if (ItemIdIsNormal(iid))
> + {
> +
> + HeapTupleHeader page_htup = (HeapTupleHeader) page_item;
> Unnecessary newline here.
>
> +* Read the contents from the backup copy, stored in WAL record and
> +* store it in a temporary page. There is not need to allocate a new
> +* page here, a local buffer is fine to hold its contents and a mask
> +* can be directly applied on it.
> s/not need/no need/.
>
> In checkXLogConsistency(), FPWs that have the flag BKPIMAGE_APPLY set will still be checked, resulting in a FPW being compared to itself. I think that those had better be bypassed.
>
> Please find attached a patch with those fixes. I am attaching as well this patch to next CF.

I committed the patch posted to the other thread. Hopefully that closes this issue.

-- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Re: [HACKERS] WAL consistency check facility
On Thu, Feb 9, 2017 at 8:17 PM, Michael Paquier wrote: > Please find attached a patch with those fixes. Committed, but I changed the copyright dates to 2016-2017 rather than just 2017 since surely some of the code was originally written before 2017. Even that might not really be going back far enough, but it doesn't matter too much. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Re: [HACKERS] Declarative partitioning - another take
On Fri, Feb 10, 2017 at 12:54 AM, Amit Langote wrote: > On 2017/02/09 15:22, amul sul wrote: >> About 0001-Check-partition-strategy-in-ATExecDropNotNull.patch, >> following test is already covered in alter_table.sql @ Line # 1918, >> instead of this kindly add test for list_partition: >> >> 77 +-- cannot drop NOT NULL constraint of a range partition key column >> 78 +ALTER TABLE range_parted ALTER a DROP NOT NULL; >> 79 + > > Thanks for the review! You're right. Updated patch attached. Committed, thanks. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Re: [HACKERS] pg_basebackup -R
On Thu, Feb 9, 2017 at 9:40 PM, Michael Paquier wrote: > On Thu, Feb 9, 2017 at 8:17 PM, Amit Kapila wrote: >> +1. Sounds sensible thing to do. > > Hm. It seems to me that PGPASSFILE still needs to be treated as an > exception because it is set to $HOME/.pgpass without any value set in > PQconninfoOption->compiled and it depends on the environment. Similar > rules apply to fallback_application_name, dbname and replication as > well, so they would need to be kept as checked on an individual basis. > > Now it is true that pg_basebackup -R enforces the value set for a > parameter in the created string if its environment variable is set. > Bypassing those values would potentially break applications that rely > on the existing behavior. Hmm, I didn't think about environment variables. > In short, I'd like to think that we should just filter out those two > parameters by name and call it a day. Or introduce an idea of value > set for the environment by adding some kind of tracking flag in > PQconninfoOption? Though I am not sure that it is worth complicating > libpq to just generate recovery.conf in pg_basebackup. Yeah, I'm not sure what the best solution is. I just thought it was strange. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Re: [HACKERS] removing tsearch2
On Feb 14, 2017 18:26, "David E. Wheeler" wrote: On Feb 14, 2017, at 5:37 AM, Jim Nasby wrote: >> Until pgxn has a way of helping users on for example Windows (or other >> platforms where they don't have a pgxs system and a compiler around), >> it's always going to be a "second class citizen". > > I view that as more of a failing of pgxs than pgxn. Granted, the most common (only?) pgxn client right now is written in python, but it's certainly possible to run that on windows with some effort (BigSQL does it), and I'm fairly certain it's not that hard to package a python script as a windows .exe. Yeah, that’s outside of PGXN’s mandate. It doesn’t do any installing at all, just distribution (release, search, download). Even the Python client just looks to see what build support is in a distribution it downloads to decide how to build it (make, configure, etc.), IIRC. It's a failing in one of the two at least. It either needs to be easier to build the things on windows, or pgxn would need to learn to do binary distributions. Even if we get the building easier on windows, it'll likely remain a second class citizen (though better than today's third class), given the number of Windows machines that actually have a compiler on them, for a start. PGXS on Windows would be a big improvement, but it won't solve the problem. /Magnus
Re: [HACKERS] AT detach partition is broken
On Mon, Feb 13, 2017 at 2:30 AM, Amit Langote wrote: > I noticed that running ALTER TABLE table_name DETACH PARTITION crashes, if > table_name is not a partitioned table. That's because of an Assert in > ATExecDetachPartition(). We really should error out much sooner in this > case, IOW during transformAlterTableStmt(), as is done in the case of > ATTACH PARTITION. > > Attached patch fixes that. -/* assign transformed values */ -partcmd->bound = cxt.partbound; +/* + * Assign transformed value of the partition bound, if + * any. + */ +if (cxt.partbound != NULL) +partcmd->bound = cxt.partbound; This hunk isn't really needed, is it? I mean, if cxt.partbound comes out NULL, then partcmd->bound will be NULL with or without adding an "if" here, won't it? -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Re: [HACKERS] [PROPOSAL] timestamp informations to pg_stat_statements
On Sun, Jul 17, 2016 at 7:15 PM, Tom Lane wrote: > Peter Geoghegan writes: >> On Sun, Jul 17, 2016 at 12:22 AM, Jun Cheol Gim wrote: >>> If we have timestamp of first and last executed, we can easily gather these >>> informations and there are tons more use cases. > >> -1 from me. > >> I think that this is the job of a tool that aggregates things from >> pg_stat_statements. It's unfortunate that there isn't a good >> third-party tool that does that, but there is nothing that prevents >> it. > > The concern I've got about this proposal is that the results get very > questionable as soon as we start dropping statement entries for lack > of space. last_executed would be okay, perhaps, but first_executed > not so much. ISTM that last_executed is useful - you can then see for yourself which of the statements that you see in the pg_stat_statements output have been issued recently, and which are older. I mean, you could also do that, as Peter says, with an additional tool that takes periodic snapshots of the data and then figures out an approximate last_executed time, but if you had this, you wouldn't need an additional tool, at least not for simple cases. Better yet, the data would be more exact. I dunno what's not to like about that, unless we're worried that tracking it will incur too much overhead. first_executed doesn't seem as useful as last_executed, but it isn't useless either. It can't properly be read as the first time that statement was ever executed, but it can be properly read as the amount of time that has passed during which that statement has been executed frequently enough to stay in the hash table, which is something that someone might want to know. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Re: [HACKERS] removing tsearch2
On Feb 14, 2017, at 5:37 AM, Jim Nasby wrote: >> Until pgxn has a way of helping users on for example Windows (or other >> platforms where they don't have a pgxs system and a compiler around), >> it's always going to be a "second class citizen". > > I view that as more of a failing of pgxs than pgxn. Granted, the most common > (only?) pgxn client right now is written in python, but it's certainly > possible to run that on windows with some effort (BigSQL does it), and I'm > fairly certain it's not that hard to package a python script as a windows > .exe. Yeah, that’s outside of PGXN’s mandate. It doesn’t do any installing at all, just distribution (release, search, download). Even the Python client just looks to see what build support is in a distribution it downloads to decide how to build it (make, configure, etc.), IIRC. David
Re: [HACKERS] Removal of deprecated views pg_user, pg_group, pg_shadow
On Mon, Feb 13, 2017 at 11:38 PM, Stephen Frost wrote: > > * Robert Haas (robertmh...@gmail.com) wrote: > > On Mon, Feb 13, 2017 at 2:42 PM, Alvaro Herrera > > wrote: > > > Well, we can remove them from PG10 and pgAdmin3 (and others) be > adjusted > > > to use the new ones, conditionally on server version. Surely pgAdmin3 > > > is going to receive further updates, given that it's still widely used? > > > > According to the pgAdmin web site, no. (Yeah, that does seem odd.) > > I really do not think the PG core project should be held hostage by an > external and apparently not-really-maintained project. What if we > introduce some other difference in PG10 that breaks pgAdmin3? Are we > going to roll that change back? Are we sure that none exists already? > > And, as I understand it, pgAdmin3 hasn't got support for features > introduced as far back as 9.5 either, surely it's not going to have > support added to it for the publication/subscription things or > declarative partitioning, should we rip those out to accommodate > pgAdmin3? > FWIW, I think pgadmin3 is already pretty solidly broken on 10 because of the renaming of xlog related functions to WAL. I certainly can't get it started... It fails on pg_xlog_receive_location(). Which causes all sorts of further fallout. You can get past it after clicking through like 10-15 asserts. > >> IMHO, these views aren't costing us much. It'd be nice to get rid of > > >> them eventually but a view definition doesn't really need much > > >> maintenance. > > > > > > Maybe not, but the fact that they convey misleading information is bad. > > > > Has anyone actually been confused by them? > > This isn't something we can prove. Nor can we prove that no one has > ever been confused. What we can show is that they're clearly misleading > and inconsistent. Even if no one is ever confused by them, having them > is bad. 
> > > > On the other hand, I suppose that the last version of pgAdmin 3 isn't > > likely to work with future major versions of PostgreSQL anyway unless > > somebody updates it, and if somebody decides to update it for the > > other changes in v10 then updating it for the removal of these views > > won't be much extra work. So maybe it doesn't matter. > I don't think pgadmin3 can really be said to be an argument for it, no. It's already unsupported with that version, and the pgadmin team has been pretty clear in saying it won't be supported. pgadmin4 will support it of course. But things like the xlog->wal changes are much more likely to break parts of pgadmin than these views are, and I would guess the same for most other admin tools as well. -- Magnus Hagander Me: http://www.hagander.net/ Work: http://www.redpill-linpro.com/
Re: [HACKERS] Re: [COMMITTERS] pgsql: Remove all references to "xlog" from SQL-callable functions in p
On Mon, Feb 13, 2017 at 5:00 PM, Michael Paquier wrote: > It seems like the previous review I provided for the set of renaming > patches has been ignored: > https://www.postgresql.org/message-id/CAB7nPqSm=PNSe3EfvnEResdFzpQMcXXgapFBcF=EFdxVW4=f...@mail.gmail.com > Good that somebody else checked... Sorry about that, Michael. That was careless of me. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Re: [HACKERS] gitlab post-mortem: pg_basebackup waiting for checkpoint
On Mon, Feb 13, 2017 at 10:33 AM, Michael Banck wrote: > Hi, > > Am Montag, den 13.02.2017, 09:31 +0100 schrieb Magnus Hagander: > > On Mon, Feb 13, 2017 at 3:29 AM, Jim Nasby > > wrote: > > On 2/11/17 4:36 AM, Michael Banck wrote: > > I guess you're right, I've moved it further down. > > There is in fact a > > message about the xlog location (unless you switch off > > wal entirely), > > but having another one right before that mentioning > > the completed > > checkpoint sounds ok to me. > > > > 1) I don't think this should be verbose output. Having a > > program sit there "doing nothing" for no apparent reason is > > just horrible UI design. > > > > > > That would include much of Unix then.. For example if I run "cp" on a > > large file it sits around "doing nothing". Same if I do "tar". No? > > The expectation for all three commands is that, even if there is no > output on stdout, they will write data to the local machine. So you can > easily monitor the progress of cp and tar by running du or something in > a different terminal. > > With pg_basebackup, nothing is happening on the local machine until the > checkpoint on the remote is finished; while this is obvious to somebody > familiar with how basebackups work internally, it appears to be not > clear at all to some users. > True. However, outputting this info by default will make it show up in things like everybody's cronjobs by default. Right now a successful pg_basebackup run will come out with no output at all, which is how most Unix commands work, and brings its own advantages. If we change that, people will have to send all the output to /dev/null, resulting in them missing the things that are actually important. 
> > So I think notifying the user that something is happening remotely while > the local process waits would be useful, but on the other hand, > pg_basebackup does not print anything unless (i) --verbose is set or > (ii) there is an error, so I think having it mention the checkpoint in > --verbose mode only is acceptable. > Yeah, that's my view as well. I'm all for including it in verbose mode. *Iff* we can get a progress indicator through the checkpoint we could include that in --progress mode. That's a different patch, of course, and it shouldn't be included in the default output even if we find it. -- Magnus Hagander Me: http://www.hagander.net/ Work: http://www.redpill-linpro.com/
Re: [HACKERS] Parallel Append implementation
On Tue, Feb 14, 2017 at 12:05 PM, Robert Haas wrote: > Having the extra > 1-2 workers exist does not seem better. Err, exit, not exist. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Re: [HACKERS] Parallel Append implementation
On Mon, Feb 6, 2017 at 12:36 AM, Amit Khandekar wrote: >> Now that I think of that, I think for implementing above, we need to >> keep track of per-subplan max_workers in the Append path; and with >> that, the bitmap will be redundant. Instead, it can be replaced with >> max_workers. Let me check if it is easy to do that. We don't want to >> have the bitmap if we are sure it would be replaced by some other data >> structure. > > Attached is v2 patch, which implements above. Now Append plan node > stores a list of per-subplan max worker count, rather than the > Bitmapset. But still Bitmapset turned out to be necessary for > AppendPath. More details are in the subsequent comments. Keep in mind that, for a non-partial path, the cap of 1 worker for that subplan is a hard limit. Anything more will break the world. But for a partial plan, the limit -- whether 1 or otherwise -- is a soft limit. It may not help much to route more workers to that node, and conceivably it could even hurt, but it shouldn't yield any incorrect result. I'm not sure it's a good idea to conflate those two things. For example, suppose that I have a scan of two children, one of which has parallel_workers of 4, and the other of which has parallel_workers of 3. If I pick parallel_workers of 7 for the Parallel Append, that's probably too high. Had those two tables been a single unpartitioned table, I would have picked 4 or 5 workers, not 7. On the other hand, if I pick parallel_workers of 4 or 5 for the Parallel Append, and I finish with the larger table first, I think I might as well throw all 4 of those workers at the smaller table even though it would normally have only used 3 workers. Having the extra 1-2 workers exist does not seem better. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Re: [HACKERS] UPDATE of partition key
On Mon, Feb 13, 2017 at 05:31:56PM +0530, Amit Khandekar wrote: > Currently, an update of a partition key of a partition is not > allowed, since it requires to move the row(s) into the applicable > partition. > > Attached is a WIP patch (update-partition-key.patch) that removes > this restriction. When an UPDATE causes the row of a partition to > violate its partition constraint, then a partition is searched in > that subtree that can accommodate this row, and if found, the row is > deleted from the old partition and inserted in the new partition. If > not found, an error is reported. This is great! Would it be really invasive to HINT something when the subtree is a proper subtree? Best, David. -- David Fetter http://fetter.org/ Phone: +1 415 235 3778 AIM: dfetter666 Yahoo!: dfetter Skype: davidfetter XMPP: david(dot)fetter(at)gmail(dot)com Remember to vote! Consider donating to Postgres: http://www.postgresql.org/about/donate
Re: [HACKERS] Add doc advice about systemd RemoveIPC
On Fri, Feb 10, 2017 at 10:36 PM, Peter Eisentraut <peter.eisentr...@2ndquadrant.com> wrote: > On 12/31/16 11:43 AM, Tom Lane wrote: > > Magnus Hagander writes: > >> I still think that some wording in the direction of the fact that the > >> majority of all users won't actually have this problem is the right > thing > >> to do (regardless of our previous history in the area as pointed out by > >> Craig) > > > > +1. The use-a-system-user solution is the one that's in place on the > > ground for most current PG users on affected platforms. We should > explain > > that one first and make clear that platform-specific packages attempt to > > set it up that way for you. The RemoveIPC technique should be documented > > as a fallback to be used if you can't/won't use a system userid. > > How about this version, which shifts the emphasis a bit, as suggested? > > Looks much better. + +If systemd is in use, some care must be taken +that IPC resources (shared memory and semaphores) are not prematurely +removed by the operating system. This is especially of concern when +installing PostgreSQL from source. Users of distribution packages of +PostgreSQL are less likely to be affected. + I would add "are less likely to be affected as the postgres user is normally created as a system user" or something like that -- to indicate *why* they are less likely to be affected (and it also tells people that if they change the user, then they might become affected again). -- Magnus Hagander Me: http://www.hagander.net/ Work: http://www.redpill-linpro.com/
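For reference, the RemoveIPC fallback being documented in this thread amounts to a one-line logind setting. A sketch of what a source-install user would add (RemoveIPC is systemd's own option name; the exact file location varies by distribution, so check your system's logind.conf):

```ini
# /etc/systemd/logind.conf (or a drop-in file under
# /etc/systemd/logind.conf.d/): stop systemd-logind from removing
# SysV/POSIX IPC objects (shared memory, semaphores) when the last
# session of a non-system user such as "postgres" ends.
[Login]
RemoveIPC=no
```

The change takes effect after restarting systemd-logind (e.g. `systemctl restart systemd-logind`) or rebooting.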
Re: [HACKERS] Parallel bitmap heap scan
On Tue, Feb 14, 2017 at 12:18 AM, Dilip Kumar wrote: > Fixed. Thanks, the external interface to this looks much cleaner now. Scrutinizing the internals: What is the point of having a TBMSharedIterator contain a TIDBitmap pointer? All the values in that TIDBitmap are populated from the TBMSharedInfo, but it seems to me that the fields that are copied over unchanged (like status and nchunks) could just be used directly from the TBMSharedInfo, and the fields that are computed (like relpages and relchunks) could be stored directly in the TBMSharedIterator. tbm_shared_iterate() is already separate code from tbm_iterate(), so it can be changed to refer to whichever data structure contains the data it needs. Why is TBMSharedInfo separate from TBMSharedIteratorState? It seems like you could merge those two into a single structure. I think we can simplify things here a bit by having shared iterators not support single-page mode. In the backend-private case, tbm_begin_iterate() really doesn't need to do anything with the pagetable in the TBM_ONE_PAGE case, but for a shared iterator, we've got to anyway copy the pagetable into shared memory. So it seems to me that it would be simpler just to transform it into a standard iteration array while we're at it, instead of copying it into entry1. In other words, I suggest removing both "entry1" and "status" from TBMSharedInfo and making tbm_prepare_shared_iterate smarter to compensate. I think "dsa_data" is poorly named; it should be something like "dsapagetable" in TIDBitmap and just "pagetable" in TBMSharedInfo. I think you should move the "base", "relpages", and "relchunks" into TBMSharedIterator and give them better names, like "ptbase", "ptpages" and "ptchunks". 
"base" isn't clear that we're talking about the pagetable's base as opposed to anything else, and "relpages" and "relchunks" are named based on the fact that the pointers are relative rather than named based on what data they point at, which doesn't seem like the right choice. I suggest putting the parallel functions next to each other in the file: tbm_begin_iterate(), tbm_prepare_shared_iterate(), tbm_iterate(), tbm_shared_iterate(), tbm_end_iterate(), tbm_end_shared_iterate(). +if (tbm->dsa == NULL) +return pfree(pointer); Don't try to return a value of type void. The correct spelling of this is { pfree(pointer); return; }. Formatted appropriately across 4 lines, of course. +/* + * If TBM is in iterating phase that means pagetable is already created + * and we have come here during tbm_free. By this time we are already + * detached from the DSA because the GatherNode would have been shutdown. + */ +if (tbm->iterating) +return; This seems like something of a kludge, although not a real easy one to fix. The problem here is that tidbitmap.c ideally shouldn't have to know that it's being used by the executor or that there's a Gather node involved, and certainly not the order of operations around shutdown. It should really be the problem of 0002 to handle this kind of problem, without 0001 having to know anything about it. It strikes me that it would probably be a good idea to have Gather clean up its children before it gets cleaned up itself. 
So in ExecShutdownNode, we could do something like this: diff --git a/src/backend/executor/execProcnode.c b/src/backend/executor/execProcnode.c index 0dd95c6..5ccc2e8 100644 --- a/src/backend/executor/execProcnode.c +++ b/src/backend/executor/execProcnode.c @@ -815,6 +815,8 @@ ExecShutdownNode(PlanState *node) if (node == NULL) return false; +planstate_tree_walker(node, ExecShutdownNode, NULL); + switch (nodeTag(node)) { case T_GatherState: @@ -824,5 +826,5 @@ ExecShutdownNode(PlanState *node) break; } -return planstate_tree_walker(node, ExecShutdownNode, NULL); +return false; } Also, in ExecEndGather, something like this: diff --git a/src/backend/executor/nodeGather.c b/src/backend/executor/nodeGather.c index a1a3561..32c97d3 100644 --- a/src/backend/executor/nodeGather.c +++ b/src/backend/executor/nodeGather.c @@ -229,10 +229,10 @@ ExecGather(GatherState *node) void ExecEndGather(GatherState *node) { +ExecEndNode(outerPlanState(node)); /* let children clean up first */ ExecShutdownGather(node); ExecFreeExprContext(&node->ps); ExecClearTuple(node->ps.ps_ResultTupleSlot); -ExecEndNode(outerPlanState(node)); } /* With those changes and an ExecShutdownBitmapHeapScan() called from ExecShutdownNode(), it should be possible (I think) for us to always have the bitmap heap scan node shut down before the Gather node shuts down, which I think would let you avoid having a special case for this inside the TBM code. +char *ptr; +dsaptr = dsa_allocate(tbm->dsa, size + sizeof(dsa_pointer)); +tbm->dsa_data = dsaptr; +ptr = dsa_get_address(tbm->dsa, dsaptr); +memset(ptr,
Re: [HACKERS] UPDATE of partition key
On Mon, Feb 13, 2017 at 7:01 AM, Amit Khandekar wrote:
> parts  partitioned    inheritance     no. of rows  subpartitions
> =====  ===========    =============   ===========  =============
>  500   10 sec         3 min 02 sec      1,000,000   0
> 1000   10 sec         3 min 05 sec      1,000,000   0
> 1000   1 min 38 sec   30 min 50 sec    10,000,000   0
> 4000   28 sec         5 min 41 sec      1,000,000  10
That's a big speedup. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Re: [HACKERS] Small improvement to parallel query docs
On Tue, Feb 14, 2017 at 5:17 AM, David Rowley wrote: > On 14 February 2017 at 21:25, Amit Kapila wrote: >> +Aggregate stage. For such cases there is clearly going to be no >> +performance benefit to using parallel aggregation. >> >> A comma is required after "For such cases" > > Added > >>> The query planner takes >>> +this into account during the planning process and will choose how to >>> +perform the aggregation accordingly. >> >> This part of the sentence sounds unclear. It doesn't clearly >> indicate that planner won't choose a parallel plan in such cases. > > I thought that was obvious enough given that I'd just mentioned that > there's clearly no benefit, however I've changed things to make that a > bit more explicit. > > Thanks for reviewing this. > > Updated patch attached. Committed and back-patched to 9.6. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Re: [HACKERS] removing tsearch2
On Tue, Feb 14, 2017 at 8:37 AM, Jim Nasby wrote: > Right; I think we need at least some amount of pgxn buildfarm coverage. > There probably also needs to be a way to officially bless certain > distributions. Unless there's a pretty significant need for an official > extension to be in contrib, it should go into PGXN. I'm not sure a need-based test is going to be entirely the right thing. The advantage of having something in contrib is that the core developers will maintain it for you. When things change from release to release, everything in contrib necessarily gets updated; things on PGXN or elsewhere only get updated if their owners update them. There are several good things about that. First, it means that people can rely on the stuff in contrib mostly sticking around for future releases, except occasionally when we decide to drop a module. Second, it gives maintainers of external modules examples of what they need to do to update their code when we change the core APIs. Third, it's probably actually more efficient for one person to go sweep through a large number of modules and fix them all at once than if those things were all on PGXN with separate owners. Now, you can overdo that: I don't want to be responsible for every module on PGXN or anything close to it. But on balance I think there's a good case for shipping a substantial set of modules in contrib. I think, in general, that we've done a good job increasing the quality of contrib over time. Pretty much all of the new stuff we've added is fairly solid, and we cleaned things up significantly by moving test code to src/test/modules. But we've still got some older modules in contrib whose quality and general usefulness is pretty questionable IMV: earthdistance, intagg, intarray, isn, and of course the eternally-deprecated but never-actually-removed xml2. 
I'm not in a hurry to expend a lot of energy removing any of that stuff, and I think we might be reaching our quota of backward-incompatible changes for this release, but it's questionable in my mind whether having those things there works out to a net plus. There's a lot of good stuff in contrib, though, and I don't think we should be averse to adding more in the future. It shouldn't be just that any random contrib module somebody writes can get dropped into the core distribution, but I think we ought to be open to reasonable proposals to add things there when it makes sense. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company