Re: [HACKERS] better atomics - v0.5
On Thu, Jul 3, 2014 at 10:56 AM, Amit Kapila amit.kapil...@gmail.com wrote:
On Tue, Jul 1, 2014 at 4:10 PM, Andres Freund and...@2ndquadrant.com wrote:

Further review of patch:

1.
/*
 * pg_atomic_test_and_set_flag - TAS()
 *
 * Acquire/read barrier semantics.
 */
STATIC_IF_INLINE_DECLARE bool
pg_atomic_test_set_flag(volatile pg_atomic_flag *ptr);

a. I think Acquire and read barrier semantics aren't equal. With acquire
semantics, the results of the operation are available before the results of
any operation that appears after it in code, which means it applies to both
loads and stores. The definition of a read barrier just ensures loads. So is
it right to declare it as above in the comments?
b. I think it adheres to Acquire semantics on platforms where those
semantics are supported, and otherwise defaults to full memory ordering.
Example:
#define TAS(lock) (InterlockedCompareExchange(lock, 1, 0))
Is it right to declare that the generic version of the function adheres to
Acquire semantics?

2.
bool
pg_atomic_test_set_flag_impl(volatile pg_atomic_flag *ptr)
{
    return TAS((slock_t *) &ptr->sema);
}
#define TAS(lock) (InterlockedCompareExchange(lock, 1, 0))

a. In the above usage (&ptr->sema), sema itself is not declared as volatile,
so is it right to use it this way with an API (InterlockedCompareExchange)
expecting volatile?
b. Also, shouldn't this implementation be moved to atomics-generic-msvc.h?

3.
/*
 * pg_atomic_unlocked_test_flag - TAS()
 *
 * No barrier semantics.
 */
STATIC_IF_INLINE_DECLARE bool
pg_atomic_unlocked_test_flag(volatile pg_atomic_flag *ptr);

a. There is no setting in this, so using TAS in the function comment seems
to be wrong.
b. BTW, I am not able to see this function in the C11 specs.

4.
#if !defined(PG_HAS_ATOMIC_TEST_SET_FLAG) && defined(PG_HAVE_ATOMIC_EXCHANGE_U32)
..
#define PG_HAS_ATOMIC_UNLOCKED_TEST_FLAG
static inline bool
pg_atomic_unlocked_test_flag_impl(volatile pg_atomic_flag *ptr)
{
    return pg_atomic_read_u32_impl(ptr) == 1;
}
#elif !defined(PG_HAS_ATOMIC_TEST_SET_FLAG) && defined(PG_HAVE_ATOMIC_COMPARE_EXCHANGE_U32)
..
#define PG_HAS_ATOMIC_UNLOCKED_TEST_FLAG
static inline bool
pg_atomic_unlocked_test_flag_impl(volatile pg_atomic_flag *ptr)
{
    return (bool) pg_atomic_read_u32_impl(ptr);
}

Is there any reason to keep these two implementations different?

5.
atomics-generic-gcc.h
#ifndef PG_HAS_ATOMIC_UNLOCKED_TEST_FLAG
#define PG_HAS_ATOMIC_UNLOCKED_TEST_FLAG
static inline bool
pg_atomic_unlocked_test_flag_impl(volatile pg_atomic_flag *ptr)
{
    return ptr->value == 0;
}
#endif

Don't we need to compare it with 1 instead of 0?

6.
atomics-fallback.h
pg_atomic_unlocked_test_flag_impl(volatile pg_atomic_flag *ptr)
{
    /*
     * Can't do this in the semaphore based implementation - always return
     * true. Do this in the header so compilers can optimize the test away.
     */
    return true;
}

Why can't we implement this in the semaphore based implementation?

7.
/*
 * pg_atomic_clear_flag - release lock set by TAS()
 *
 * Release/write barrier semantics.
 */
STATIC_IF_INLINE_DECLARE void
pg_atomic_clear_flag(volatile pg_atomic_flag *ptr);

a. Are Release and write barriers equivalent? With release semantics, the
results of the operation are available after the results of any operation
that appears before it in code. A write barrier acts as a compiler barrier,
and orders stores. I think the two definitions are not the same.
b. IIUC the current code doesn't have release semantics for unlock/clear.
We are planning to introduce it in this patch; this also doesn't follow the
specs, which say that clear should have memory_order_seq_cst ordering
semantics. As per its current usage in the patch (S_UNLOCK), it seems
reasonable to have *release* semantics for this API; however, I think if we
use it for generic purposes then it might not be right to define it with
Release semantics.
8.
#define PG_HAS_ATOMIC_CLEAR_FLAG
static inline void
pg_atomic_clear_flag_impl(volatile pg_atomic_flag *ptr)
{
    /* XXX: release semantics suffice? */
    pg_memory_barrier_impl();
    pg_atomic_write_u32_impl(ptr, 0);
}

Are we sure that having a memory barrier before clearing the flag will
achieve *Release* semantics? As per my understanding, the definition of
Release semantics is as below, and the above doesn't seem to follow it:
with release semantics, the results of the operation are available after
the results of any operation that appears before it in code.

9.
static inline uint32
pg_atomic_exchange_u32_impl(volatile pg_atomic_uint32 *ptr, uint32 xchg_)
{
    uint32 old;
    while (true)
    {
        old = pg_atomic_read_u32_impl(ptr);
        if (pg_atomic_compare_exchange_u32_impl(ptr, &old, xchg_))
            break;
    }
    return old;
}

What is the reason to implement exchange as compare-and-exchange? At least
for Windows I know a corresponding function (InterlockedExchange) exists. I
could not see any other implementation of the same. I think this at least
deserves a comment explaining why the implementation was done this way.

10.
STATIC_IF_INLINE_DECLARE uint32
pg_atomic_fetch_add_u32(volatile pg_atomic_uint32 *ptr,
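The fallback questioned in point 9 - building exchange out of compare-and-exchange - can be sketched in isolation with C11 atomics. This is an illustrative stand-in, not the patch's actual pg_atomic_* code; note that C11's atomic_compare_exchange_weak reloads the expected value on failure, which the pg_atomic_* loop above does explicitly with pg_atomic_read_u32_impl:

```c
#include <stdatomic.h>
#include <stdint.h>
#include <stdbool.h>

/* Exchange built from compare-and-exchange: retry until the CAS
 * succeeds, i.e. until nobody changed the value between our read
 * and our swap.  Returns the value that was replaced. */
static uint32_t
exchange_via_cas(_Atomic uint32_t *ptr, uint32_t newval)
{
    uint32_t old = atomic_load(ptr);

    /* On failure, atomic_compare_exchange_weak reloads *ptr into old,
     * so each retry uses the freshly observed value. */
    while (!atomic_compare_exchange_weak(ptr, &old, newval))
        ;
    return old;
}
```

The loop terminates only when the CAS succeeds, so the returned value is exactly the one that was atomically replaced - which is why a comment explaining this choice versus a native primitive such as InterlockedExchange, as suggested above, would be worthwhile.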
Re: [HACKERS] Extending constraint exclusion for implied constraints/conditions
On Mon, Jul 7, 2014 at 11:46 PM, Greg Stark st...@mit.edu wrote:
On Mon, Jul 7, 2014 at 3:07 PM, Tom Lane t...@sss.pgh.pa.us wrote:
I doubt it. The extra code isn't the problem so much, it's the extra
planning cycles spent trying to make proofs that will usually fail. What
you propose will create a combinatorial explosion in the number of proof
paths to be considered.

Well, not necessarily. You only need to consider constraints on matching
columns and only when there's a join column on those columns. So you could
imagine, for example, sorting all the constraints by the eclass for the
non-const side of their expression, then going through them linearly to see
where you have multiple constraints on the same eclass.

For what it's worth, I think there is a case where this is a common
optimization: when you have multiple tables partitioned the same way. Then
you ideally want to be able to turn that from a join of multiple appends
into an append of multiple joins. This is even more important when it comes
to a parallelizing executor, since it lets you do the joins in parallel.

Ah, right. Also, if the foreign tables come under the inheritance
hierarchy, and we want to push joins down to foreign servers.

However, to get from here to there I guess you would need to turn the join
of the appends into NxM joins of every pair of subscans and then figure out
which ones to exclude. That would be pretty nuts. To do it reasonably we
probably need the partitioning infrastructure we've been talking about,
where Postgres would know what the partitioning key is and the order and
range of the partitions, so it can directly generate the matching subjoins
in less than n^2 time.

-- greg

--
Best Wishes,
Ashutosh Bapat
EnterpriseDB Corporation
The Postgres Database Company
Re: [HACKERS] Modifying update_attstats of analyze.c for C Strings
As a follow-up question, I found some of the varchar column types in which
the histogram_bounds are not being surrounded in double quotes ("") even in
the default implementation. Ex: c_name column of the Customer table. I also
found histogram_bounds in which only some strings are surrounded in double
quotes and some are not. Ex: c_address column of the Customer table. Why are
there such inconsistencies? How is this determined? Thank you.

On Tue, Jul 8, 2014 at 10:52 AM, Ashoke s.ash...@gmail.com wrote:

Hi,

I am trying to implement a functionality that is similar to ANALYZE, but
needs to have different values (the values will be valid and are stored in
inp->str[][]) for MCV/Histogram Bounds in case the column under
consideration is varchar (C Strings). I have written a function
dummy_update_attstats with the following changes. Other things remain the
same as in update_attstats of ~/src/backend/commands/analyze.c:

---
{
    ArrayType  *arry;

    if (strcmp(col_type, "varchar") == 0)
        arry = construct_array(stats->stavalues[k],
                               stats->numvalues[k],
                               CSTRINGOID,
                               -2,
                               false,
                               'c');
    else
        arry = construct_array(stats->stavalues[k],
                               stats->numvalues[k],
                               stats->statypid[k],
                               stats->statyplen[k],
                               stats->statypbyval[k],
                               stats->statypalign[k]);
    values[i++] = PointerGetDatum(arry);    /* stavaluesN */
}
---

and I update the hist_values in the appropriate function as:

---
if (strcmp(col_type, "varchar") == 0)
    hist_values[i] = datumCopy(CStringGetDatum(inp->str[i][j]),
                               false,
                               -2);
---

I tried this based on the following reference:
http://www.postgresql.org/message-id/attachment/20352/vacattrstats-extend.diff

My issue is: when I use my way for strings, the MCV/histogram_bounds in
pg_stats don't have double quotes ("") surrounding the strings.
That is, if the normal update_attstats is used, the histogram_bounds for
TPCH nation(n_name) are: "ALGERIA ","ARGENTINA",... If I use
dummy_update_attstats as above, the histogram_bounds for TPCH nation(n_name)
are: ALGERIA,ARGENTINA,... This becomes an issue if the string has ','
(commas), like for example in the n_comment column of the nation table.
Could someone point out the problem and suggest a solution? Thank you.

--
Regards,
Ashoke
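For what it's worth, the quoting inconsistency observed above comes from array output deciding per element whether quotes are needed. A rough sketch of that decision follows; this is a simplified approximation for illustration, not PostgreSQL's actual array_out logic:

```c
#include <stdbool.h>
#include <string.h>
#include <strings.h>

/* Simplified predicate: array output quotes an element if it is empty,
 * spells NULL, or contains characters that are special inside an array
 * literal (the delimiter, braces, quote, backslash, or whitespace).
 * Rough sketch of the rule only, not PostgreSQL's array_out code. */
static bool
element_needs_quotes(const char *val, char delim)
{
    const char *p;

    if (val[0] == '\0')
        return true;
    if (strcasecmp(val, "NULL") == 0)
        return true;
    for (p = val; *p; p++)
    {
        char c = *p;

        if (c == delim || c == '{' || c == '}' ||
            c == '"' || c == '\\' ||
            c == ' ' || c == '\t' || c == '\n')
            return true;
    }
    return false;
}
```

Under such a rule, ARGENTINA prints bare while an element with trailing padding spaces (as from a char(n) column) or one containing a comma gets quoted - consistent with the mixed output seen in pg_stats.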
Re: [HACKERS] RLS Design
Hi all

I was jotting notes about this last sleepless night, and was really glad to
see the suggestion of enabling RLS on a table being a requirement for
OR-style quals suggested in the thread when I woke.

The only sane way to do OR-ing of multiple rules is to require that tables
be switched to deny-by-default before RLS quals can be added to selectively
enable access.

The next step is DENY rules that override ALLOW rules, and are also ORed,
so any DENY rule overrides any ALLOW rule. Like in ACLs. But that can come
later - I just think room for it should be left in any catalog definition.

My concern with the talk of policies, etc, is with making it possible to
implement this for 9.5. I'd really like to see a robust declarative
row-security framework with access policies - but I'm not so sure it's a
good idea to try to assemble policies directly out of low-level row
security predicates. Tying things into a policy model that isn't tried or
tested might create more problems than it solves unless we implement
multiple real-world test cases on top of the model to show it works.

For how I think we should be pursuing this in the long run, take a look at
how Teradata does it, with hierarchical and non-hierarchical rules -
basically bitmaps or thresholds - that get grouped into access policies.
It's a very good way to abstract the low-level stuff. If we have low-level
table predicate filters, we can build this sort of thing on top.

For 9.5, unless the basics turn out to be way easier than they look and
it's all done soon in the release process, surely we should be sticking to
just getting the basics of row security in place? Leaving room for
enhancement, sure, but sticking to the core feature, which to my mind is:

- A row security on/off flag for a table;

- Room in the catalogs for multiple row security rules per table and a type
  flag for them. The initial type flag, for ALLOW rules, specifies that all
  ALLOW rules be ORed together.
- Syntax for creating and dropping row security predicates. If there can be
  multiple ones per table they'll need names, like we have with triggers,
  indexes, etc.

- psql support for listing row security predicates on a table if running as
  superuser or if you've been explicitly GRANTed access to the catalog
  table listing row security quals.

- The hooks for contribs to inject their own row security rules. The API
  will need a tweak - right now it assumes these rules are ANDed with any
  row security predicates in the catalogs, but we'd want the option of
  treating them as ALLOW or DENY rules to get ORed with the rest of the set
  *or* as a pre-filter predicate like currently.

- A row-security-exempt right, at the user level, to assuage the concerns
  about malicious predicates. I maintain that in the first rev this should
  be simple: superuser is row security exempt. I don't think I'm going to
  win that one though, so a user/role attribute that makes the role ignore
  row security seems like the next simplest option.

- A way to test whether the current user is row-security exempt, so pg_dump
  can complain unless explicitly told it's allowed to do a selective dump
  via a cmdline option.

Plus a number of fixes:

- Fixing the security barrier view issue with row level lock pushdown
  that's breaking the row security regression tests;

- Enhancing plan cache invalidation so that row-security exempt-ness of a
  user is part of the plancache key;

- Adding session state like current_user to portals, so security_barrier
  functions returning refcursor, and cursors created before SET SESSION
  AUTHORIZATION or SET ROLE, get the correct values when they use session
  information like current_user.

Note that this doesn't even consider the with-check-option style
write-filtering side of row security and the corresponding challenges
around the semantics of RETURNING. It's already a decent-sized amount of
work on top of the existing row security patch.
If we start adding policy groups, etc, this will never get done. -- Craig Ringer http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
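The combination rule sketched in the message above - ALLOW rules ORed together, any DENY overriding any ALLOW, on a deny-by-default table - can be illustrated with a toy evaluator. All names here are hypothetical and rows are reduced to a plain int for brevity; nothing below is from the actual row security patch:

```c
#include <stdbool.h>
#include <stddef.h>

typedef bool (*rls_qual)(int row);

/* Example quals (purely illustrative) */
static bool allow_positive(int row) { return row > 0; }
static bool deny_even(int row)      { return row % 2 == 0; }

/* A row is visible iff at least one ALLOW qual passes (ALLOWs are ORed)
 * and no DENY qual matches (any DENY overrides all ALLOWs).  With no
 * ALLOW quals at all, nothing is visible: deny-by-default. */
static bool
row_visible(int row,
            rls_qual *allow, size_t nallow,
            rls_qual *deny, size_t ndeny)
{
    bool    allowed = false;
    size_t  i;

    for (i = 0; i < nallow; i++)
        if (allow[i](row))
        {
            allowed = true;
            break;
        }
    if (!allowed)
        return false;           /* deny-by-default */
    for (i = 0; i < ndeny; i++)
        if (deny[i](row))
            return false;       /* any DENY overrides all ALLOWs */
    return true;
}
```

The ACL-like layering is the point: adding a DENY rule later never widens access, and dropping all ALLOW rules closes the table completely.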
[HACKERS] Optimization for updating foreign tables in Postgres FDW
Attached is a WIP patch for the following:

/*
 * postgresPlanForeignModify
 *      Plan an insert/update/delete operation on a foreign table
 *
 * Note: currently, the plan tree generated for UPDATE/DELETE will always
 * include a ForeignScan that retrieves ctids (using SELECT FOR UPDATE)
 * and then the ModifyTable node will have to execute individual remote
 * UPDATE/DELETE commands.  If there are no local conditions or joins
 * needed, it'd be better to let the scan node do UPDATE/DELETE RETURNING
 * and then do nothing at ModifyTable.  Room for future optimization ...
 */

In the patch postgresPlanForeignModify has been modified so that if, in
addition to the above condition, the following are satisfied, then the
ForeignScan and ModifyTable node will work that way.

- There are no local BEFORE/AFTER triggers.
- In UPDATE it's safe to evaluate expressions to assign to the target
  columns on the remote server.

Here is a simple performance test.

On remote side:
postgres=# create table t (id serial primary key, inserted timestamp default clock_timestamp(), data text);
CREATE TABLE
postgres=# insert into t(data) select random() from generate_series(0, 9);
INSERT 0 10
postgres=# vacuum t;
VACUUM

On local side:
postgres=# create foreign table ft (id integer, inserted timestamp, data text) server myserver options (table_name 't');
CREATE FOREIGN TABLE

Unpatched:
postgres=# explain analyze verbose delete from ft where id > 1;
                                QUERY PLAN
---
 Delete on public.ft  (cost=100.00..162.32 rows=910 width=6) (actual time=1275.255..1275.255 rows=0 loops=1)
   Remote SQL: DELETE FROM public.t WHERE ctid = $1
   ->  Foreign Scan on public.ft  (cost=100.00..162.32 rows=910 width=6) (actual time=1.180..52.095 rows= loops=1)
         Output: ctid
         Remote SQL: SELECT ctid FROM public.t WHERE ((id > 1)) FOR UPDATE
 Planning time: 0.112 ms
 Execution time: 1275.733 ms
(7 rows)

Patched (note that the DELETE command has been pushed down):
postgres=# explain analyze verbose delete from ft where id > 1;
                                QUERY PLAN
---
 Delete on
public.ft  (cost=100.00..162.32 rows=910 width=6) (actual time=0.006..0.006 rows=0 loops=1)
   ->  Foreign Scan on public.ft  (cost=100.00..162.32 rows=910 width=6) (actual time=0.001..0.001 rows=0 loops=1)
         Output: ctid
         Remote SQL: DELETE FROM public.t WHERE ((id > 1))
 Planning time: 0.101 ms
 Execution time: 8.808 ms
(6 rows)

I'll add this to the next CF. Comments are welcome.

Thanks,

Best regards,
Etsuro Fujita

*** a/contrib/postgres_fdw/deparse.c
--- b/contrib/postgres_fdw/deparse.c
***************
*** 189,198 **** is_foreign_expr(PlannerInfo *root,
  	if (!foreign_expr_walker((Node *) expr, &glob_cxt, &loc_cxt))
  		return false;

- 	/* Expressions examined here should be boolean, ie noncollatable */
- 	Assert(loc_cxt.collation == InvalidOid);
- 	Assert(loc_cxt.state == FDW_COLLATE_NONE);
-
  	/*
  	 * An expression which includes any mutable functions can't be sent over
  	 * because its result is not stable.  For example, sending now() remote
--- 189,194 ----
*** 928,933 **** deparseUpdateSql(StringInfo buf, PlannerInfo *root,
--- 924,982 ----
  }

  /*
+  * deparse remote UPDATE statement
+  *
+  * The statement text is appended to buf, and we also create an integer List
+  * of the columns being retrieved by RETURNING (if any), which is returned
+  * to *retrieved_attrs.
+  */
+ void
+ deparseDirectUpdateSql(StringInfo buf, PlannerInfo *root,
+ 					   Index rtindex, Relation rel,
+ 					   List *remote_conds,
+ 					   List *targetlist,
+ 					   List *targetAttrs, List *returningList,
+ 					   List **retrieved_attrs)
+ {
+ 	RelOptInfo *baserel = root->simple_rel_array[rtindex];
+ 	List	   *params_list = NIL;
+ 	deparse_expr_cxt context;
+ 	bool		first;
+ 	ListCell   *lc;
+
+ 	/* Set up context struct for recursion */
+ 	context.root = root;
+ 	context.foreignrel = baserel;
+ 	context.buf = buf;
+ 	context.params_list = NULL;
+
+ 	appendStringInfoString(buf, "UPDATE ");
+ 	deparseRelation(buf, rel);
+ 	appendStringInfoString(buf, " SET ");
+
+ 	first = true;
+ 	foreach(lc, targetAttrs)
+ 	{
+ 		int			attnum = lfirst_int(lc);
+ 		TargetEntry *tle = get_tle_by_resno(targetlist, attnum);
+
+ 		if (!first)
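The SET-clause deparsing loop shown (truncated) in the diff above appends "col = expr" pairs with comma separators. Stripped of the planner machinery, the shape of that loop is roughly as follows - a standalone sketch using plain C strings in place of StringInfo, with illustrative names only:

```c
#include <stdio.h>
#include <string.h>

/* Build "UPDATE rel SET col1 = expr1, col2 = expr2, ..." into buf.
 * Arrays of column names and already-deparsed expressions stand in
 * for the targetAttrs/targetlist walk in deparseDirectUpdateSql. */
static void
deparse_update_set(char *buf, size_t buflen,
                   const char *rel,
                   const char *const cols[],
                   const char *const exprs[],
                   int ncols)
{
    int n = snprintf(buf, buflen, "UPDATE %s SET ", rel);
    int i;

    for (i = 0; i < ncols; i++)
        n += snprintf(buf + n, buflen - n, "%s%s = %s",
                      i > 0 ? ", " : "",   /* comma before all but the first */
                      cols[i], exprs[i]);
}
```

The "first" flag in the patch plays the same role as the `i > 0` test here: emit the separator before every assignment except the first.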
Re: [HACKERS] Modifying update_attstats of analyze.c for C Strings
Ok, I was able to figure out that when strings contain 'spaces', PostgreSQL
surrounds them with double quotes.

On Tue, Jul 8, 2014 at 12:04 PM, Ashoke s.ash...@gmail.com wrote:
[earlier message quoted in full; see above]

--
Regards,
Ashoke
[HACKERS] query_is_distinct_for does not take into account set returning functions
Over here - http://www.postgresql.org/message-id/6351.1404663...@sss.pgh.pa.us
Tom noted that create_unique_path did not check for set returning functions.

Tom wrote:
I notice that create_unique_path is not paying attention to the question of
whether the subselect's tlist contains SRFs or volatile functions. It's
possible that that's a pre-existing bug.

I looked at this a bit and I can confirm that it does not behave as it
should. Take the following as an example:

create table x (id int primary key);
create table y (n int not null);
insert into x values(1);
insert into y values(1);

select * from x where (id,id) in (select n, generate_series(1,2) / 10 + 1 g from y);
 id
----
  1
(1 row)

select * from x where (id,id) in (select n, generate_series(1,2) / 10 + 1 g from y group by n);
 id
----
  1
  1
(2 rows)

The 2nd query does group by n, so query_is_distinct_for returns true;
therefore the outer query thinks it's OK to perform an INNER JOIN rather
than a SEMI join, which in this case produces an extra record. I think we
should probably include the logic to test for set returning functions in
query_is_distinct_for. The attached fixes the problem.

Regards

David Rowley

query_is_distinct_for_fix.patch
Description: Binary data

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Allowing join removals for more join types
On Tue, Jul 8, 2014 at 4:28 AM, Tom Lane t...@sss.pgh.pa.us wrote: David Rowley dgrowle...@gmail.com writes: On Mon, Jul 7, 2014 at 4:15 AM, Tom Lane t...@sss.pgh.pa.us wrote: I poked around to see if we didn't have some code already for that, and soon found that not only do we have such code (equality_ops_are_compatible) but actually almost this entire patch duplicates logic that already exists in optimizer/util/pathnode.c, to wit create_unique_path's subroutines query_is_distinct_for et al. So I'm thinking what this needs to turn into is an exercise in refactoring to allow that logic to be used for both purposes. Well, it seems that might just reduce the patch size a little! I currently have this half hacked up to use query_is_distinct_for, but I see there's no code that allows Const's to exist in the join condition. I had allowed for this in groupinglist_is_unique_on_restrictinfo() and I tested it worked in a regression test (which now fails). I think to fix this, all it would take would be to modify query_is_distinct_for to take a list of Node's rather than a list of column numbers then just add some logic that skips if it's a Const and checks it as it does now if it's a Var Would you see a change of this kind a valid refactor for this patch? I'm a bit skeptical as to whether testing for that case is actually worth any extra complexity. Do you have a compelling use-case? But anyway, if we do want to allow it, why does it take any more than adding a check for Consts to the loops in query_is_distinct_for? It's the targetlist entries where we'd want to allow Consts, not the join conditions. I don't really have a compelling use-case, but you're right, it's just a Const check in query_is_distinct_for(), it seems simple enough so I've included that in my refactor of the patch to use query_is_distinct_for(). This allows the regression tests all to pass again. I've included an updated patch and a delta patch. Now a couple of things to note: 1. 
The fast path code that existed in join_is_removable() for subqueries, when
the subquery had no group or distinct clause, is now gone. I wasn't too sure
that I wanted to assume too much about what query_is_distinct_for may do in
the future, and I thought that if I included some fast-path logic in
join_is_removable(), one day it might fast path wrongly. Perhaps we could
protect against this with a small note in query_is_distinct_for(). 2. The
patch I submitted here
http://www.postgresql.org/message-id/caaphdvrfvkh0p3faoogcckby7fecj9qfanklkx7mwsbcxy2...@mail.gmail.com
- if that gets accepted, then it makes the check for set returning functions
in join_is_removable unnecessary.

Regards

David Rowley

subquery_leftjoin_removal_v1.5.delta.patch
Description: Binary data

subquery_leftjoin_removal_v1.5.patch
Description: Binary data

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [REVIEW] Re: Fix xpath() to return namespace definitions
I don't know enough about XML/XPATH to know if this is a good idea or not,

Actually, currently, because of the namespace problem, xpath() returns a
wrong result (even worse: invalid XML!). So this patch corrects the
behavior of xpath() to the correct one.

but I did go read the documentation of xmlCopyNode(), and I notice it says
the second argument is:
    extended: if 1 do a recursive copy (properties, namespaces and children
    when applicable) if 2 copy properties and namespaces (when applicable)
Do we really want it to copy children? Perhaps the value should be 2? (I
have no idea, I'm just raising the question.)

If we put 1 as the parameter, the resulting node will link to the existing
children. In this case, the namespace problem for the children might still
exist. If any of the children uses different namespace(s) than the parent,
the namespace definition will not be copied correctly. Just to be safe, 1
is passed as the second argument in the patch. Ideally, we could traverse
the node object and generate the XML accordingly, but that is a lot of
work, and a lot of duplicating libxml's code. Maybe we should push this
upstream to libxml?

I also notice that it says
    Returns: a new #xmlNodePtr, or NULL in case of error.
and the patch is failing to cover the error case. Most likely, that would
represent out-of-memory, so it definitely seems to be something we should
account for.

Ah, overlooked that. Skimming through libxml2's source, it looks like there
are some other cases besides out-of-memory. I will update the patch
according to these cases.

Thanks for the review.

--
Ali Akbar
Re: [HACKERS] [RFC: bug fix?] Connection attempt block forever when the synchronous standby is not running
From: Amit Kapila amit.kapil...@gmail.com
On Fri, Jul 4, 2014 at 7:29 PM, MauMau maumau...@gmail.com wrote:
[Hypothesis] Why does the connection processing emit WAL? Probably, it did
page-at-a-time vacuum during access to pg_database and pg_authid for client
authentication. src/backend/access/heap/README.HOT describes:

I agree with your analysis that it can happen during a connection attempt.

Thank you. I'm relieved the cause seems correct.

But the customer could not reproduce the problem when he performed the same
archive recovery from the same base backup again. Why? I guess the
autovacuum daemon vacuumed the system catalogs before he attempted to
connect to the database.

One way to confirm could be to perform the archive recovery with autovacuum
disabled.

Yeah, I thought of that too. Unfortunately, the customer deleted the base
backup used for testing.

Another thing which I am wondering about is: can't the same happen even for
a read-only transaction (in case someone does a SELECT which prunes a
page)?

I'm afraid about that, too.

Regards
MauMau

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [RFC: bug fix?] Connection attempt block forever when the synchronous standby is not running
From: Rajeev rastogi rajeev.rast...@huawei.com As of now there is no solution for this in PostgreSQL but I had submitted a patch Standalone synchronous master in 9.4 2014-01 CommitFest, which was rejected because of some issues. This patch was meant to degrade the synchronous level of master, if all synchronous standbys are down. I plan to resubmit this with better design sometime in 9.5. Although I only read some mails of that thread, I'm sure your proposal is what many people would appreciate. Your new operation mode is equivalent to the maximum availability mode of Oracle Data Guard, isn't it? I'm looking forward to it. Good luck. == Maximum availability This protection mode provides the highest level of data protection that is possible without compromising the availability of a primary database. Transactions do not commit until all redo data needed to recover those transactions has been written to the online redo log and to at least one standby database. If the primary database cannot write its redo stream to at least one standby database, it effectively switches to maximum performance mode to preserve primary database availability and operates in that mode until it is again able to write its redo stream to a standby database. This protection mode ensures zero data loss except in the case of certain double faults, such as failure of a primary database after failure of the standby database. Maximum performance This is the default protection mode. It provides the highest level of data protection that is possible without affecting the performance of a primary database. This is accomplished by allowing transactions to commit as soon as all redo data generated by those transactions has been written to the online log. Redo data is also written to one or more standby databases, but this is done asynchronously with respect to transaction commitment, so primary database performance is unaffected by delays in writing redo data to the standby database(s). 
This protection mode offers slightly less data protection than maximum availability mode and has minimal impact on primary database performance. == Regards MauMau -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [RFC: bug fix?] Connection attempt block forever when the synchronous standby is not running
From: Tom Lane t...@sss.pgh.pa.us
problem, the user might not realize he's got one until he starts to wonder
why autovac/autoanalyze aren't working.

In autovacuum.c, autovacuum workers avoid waiting for the standby by doing:

/*
 * Force synchronous replication off to allow regular maintenance even if
 * we are waiting for standbys to connect. This is important to ensure we
 * aren't blocked from performing anti-wraparound tasks.
 */
if (synchronous_commit > SYNCHRONOUS_COMMIT_LOCAL_FLUSH)
    SetConfigOption("synchronous_commit", "local",
                    PGC_SUSET, PGC_S_OVERRIDE);

Regards
MauMau

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [RFC: bug fix?] Connection attempt block forever when the synchronous standby is not running
From: Tom Lane t...@sss.pgh.pa.us Andres Freund and...@2ndquadrant.com writes: On 2014-07-07 09:57:20 -0400, Tom Lane wrote: Well, see the comment that explains why the logic is like this now: I think we should 'simply' make sequences assign a toplevel xid - then we can get rid of that special case in RecordTransactionCommit(). And I think the performance benefit of not having to wait on XLogFlush() for readonly xacts due to hot prunes far outweighs the decrease due to the xid assignment/commit record. I don't think that nextval()s are called overly much without a later xid assigning statement. Yeah, that could well be true. I'm not sure if there are any other cases where we have non-xid-assigning operations that are considered part of what has to be flushed before reporting commit; if there are not, I'd be okay with changing nextval() this way. Thank you all for letting me know your thoughts. I understood and agree that read-only transactions, including the connection establishment one, should not wait for the standby nor the XLOG flush at commit, and the current documentation/specification should not be changed. I'll consider how to fix this problem, learning the code, then I'll ask for review. I'd like to submit the patch for next CF if possible. From: Fujii Masao masao.fu...@gmail.com Sounds good direction. One question is: Can RecordTransactionCommit() avoid waiting for not only replication but also local WAL flush safely in such read-only transaction case? I'd appreciate any opinion on this, too. Regards MauMau -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] tweaking NTUP_PER_BUCKET
On Wed, Jul 2, 2014 at 8:13 PM, Tomas Vondra t...@fuzzy.cz wrote: I propose dynamic increase of the nbuckets (up to NTUP_PER_BUCKET=1) once the table is built and there's free space in work_mem. The patch mentioned above makes implementing this possible / rather simple. Another idea would be to start with NTUP_PER_BUCKET=1 and then, if we run out of memory, increase NTUP_PER_BUCKET. I'd like to think that the common case is that work_mem will be large enough that everything fits; and if you do it that way, then you save yourself the trouble of rehashing later, which as you point out might lose if there are only a few probes. If it turns out that you run short of memory, you can merge pairs of buckets up to three times, effectively doubling NTUP_PER_BUCKET each time. Yet another idea is to stick with your scheme, but do the dynamic bucket splits lazily. Set a flag on each bucket indicating whether or not it needs a lazy split. When someone actually probes the hash table, if the flag is set for a particular bucket, move any tuples that don't belong as we scan the bucket. If we reach the end of the bucket chain, clear the flag. Not sure either of these are better (though I kind of like the first one) but I thought I'd throw them out there... -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
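As a sketch of the bucket-merging idea above — merging pairs of buckets halves nbuckets and doubles the effective NTUP_PER_BUCKET — something like the following could work, assuming a simple chained hash table indexed by hash % nbuckets (the names Tuple, HashTable and merge_bucket_pairs are hypothetical illustrations, not PostgreSQL's actual hash join code):

```c
#include <assert.h>
#include <stddef.h>

typedef struct Tuple
{
    unsigned      hash;
    struct Tuple *next;
} Tuple;

typedef struct
{
    Tuple  **buckets;
    unsigned nbuckets;
} HashTable;

/* Halve the number of buckets, doubling the effective NTUP_PER_BUCKET.
 * Since bucket index = hash % nbuckets, every tuple in bucket b + nbuckets/2
 * belongs in bucket b once nbuckets is halved. */
static void
merge_bucket_pairs(HashTable *ht)
{
    unsigned half = ht->nbuckets / 2;

    for (unsigned b = 0; b < half; b++)
    {
        Tuple **tail = &ht->buckets[b];

        while (*tail)
            tail = &(*tail)->next;          /* walk to end of chain */
        *tail = ht->buckets[b + half];      /* append the merged chain */
        ht->buckets[b + half] = NULL;
    }
    ht->nbuckets = half;                    /* callers now use hash % half */
}

/* Linear probe of one bucket chain. */
static int
table_contains(const HashTable *ht, unsigned hash)
{
    for (const Tuple *t = ht->buckets[hash % ht->nbuckets]; t; t = t->next)
        if (t->hash == hash)
            return 1;
    return 0;
}
```

Each merge can be repeated (up to the three times Robert mentions) without moving any tuple more than once per merge, since the append only splices two existing chains together.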
Re: [HACKERS] [bug fix or improvement?] Correctly place DLLs for ECPG apps in bin folder
From: Asif Naeem anaeem...@gmail.com Other than my previous comments, patch looks good to me. Thanks. Thanks for reviewing. I understood that your previous comment was to suggest copying the desired DLLs directly from the Release/Debug folder to bin. But I'm afraid it cannot be done cleanly... CopySolutionOutput() copies all DLLs to the lib folder with no exception as follows: if ($1 == 1) { $dir = "bin"; $ext = "exe"; } elsif ($1 == 2) { $dir = "lib"; $ext = "dll"; } It seems like I have to do something like this, listing the relevant DLL names anyway. I don't think this is cleaner. if ($1 == 1) { $dir = "bin"; $ext = "exe"; } elsif ($1 == 2 /* the project is libpq, libecpg, etc. */) { $dir = "bin"; $ext = "dll"; } elsif ($1 == 2) { $dir = "lib"; $ext = "dll"; } What do you think? Am I misunderstanding your suggestion? Regards MauMau -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] tweaking NTUP_PER_BUCKET
On 8 Červenec 2014, 14:49, Robert Haas wrote: On Wed, Jul 2, 2014 at 8:13 PM, Tomas Vondra t...@fuzzy.cz wrote: I propose dynamic increase of the nbuckets (up to NTUP_PER_BUCKET=1) once the table is built and there's free space in work_mem. The patch mentioned above makes implementing this possible / rather simple. Another idea would be to start with NTUP_PER_BUCKET=1 and then, if we run out of memory, increase NTUP_PER_BUCKET. I'd like to think that the common case is that work_mem will be large enough that everything fits; and if you do it that way, then you save yourself the trouble of rehashing later, which as you point out might lose if there are only a few probes. If it turns out that you run short of memory, you can merge pairs of buckets up to three times, effectively doubling NTUP_PER_BUCKET each time. Maybe. I'm not against setting NTUP_PER_BUCKET=1, but with large outer relations it may be way cheaper to use higher NTUP_PER_BUCKET values instead of increasing the number of batches (resulting in repeated scans of the outer table). I think it's important (but difficult) to keep these things somehow balanced. With large work_mem values the amount of memory for buckets may be quite significant. E.g. 800MB work_mem may easily give ~120MB of memory taken by buckets with NTUP_PER_BUCKET=1. With NTUP_PER_BUCKET=4 it's just ~30MB. Yet another idea is to stick with your scheme, but do the dynamic bucket splits lazily. Set a flag on each bucket indicating whether or not it needs a lazy split. When someone actually probes the hash table, if the flag is set for a particular bucket, move any tuples that don't belong as we scan the bucket. If we reach the end of the bucket chain, clear the flag. I don't think a lazy scheme makes much sense here. There are two clearly separated phases - loading the table and search. Also, it seems to me it can't work as described. Say we build a table and then find out we need a table 4x the size. 
If I understand your proposal correctly, you'd just resize the buckets array, copy the existing buckets (the first 1/4 of the buckets) and set 'needs_split' flags. Correct? Well, the problem is that while scanning a bucket you already need to know which tuples belong there. But with the lazy approach, the tuples might still be in the old (not yet split) buckets. So you'd have to search the current bucket and all buckets that were not split yet. Or did I miss something? Tomas -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Doing better at HINTing an appropriate column within errorMissingColumn()
On Sat, Jul 5, 2014 at 12:46 AM, Abhijit Menon-Sen a...@2ndquadrant.com wrote: At 2014-07-02 15:51:08 -0700, p...@heroku.com wrote: Attached revision factors in everyone's concerns here, I think. Is anyone planning to review Peter's revised patch? I have been doing some functional tests, and looked quickly at the code to understand what it does: 1) Compiles without warnings, passes regression tests 2) The checking process goes through all the existing columns of a relation even if a difference of 1 with some other column(s) has already been found. As we try to limit the number of hints returned, this seems like a waste of resources. 3) distanceName could be improved, for example by having some checks on the string lengths of the target and source columns, and immediately rejecting the match if, for example, the length of the source string is double/half the length of the target. 4) This is not nice, could it be possible to remove the stuff from varlena.c? +/* Expand each Levenshtein distance variant */ +#include "levenshtein.c" +#define LEVENSHTEIN_LESS_EQUAL +#include "levenshtein.c" +#undef LEVENSHTEIN_LESS_EQUAL Part of the same comment: only varstr_leven_less_equal is used to calculate the distance, so should we really move varstr_leven to core? This clearly needs to be reworked as not just a copy-paste of the things in fuzzystrmatch. The flag LEVENSHTEIN_LESS_EQUAL should be left within fuzzystrmatch, I think. 5) Do we want hints on system columns as well? For example here we could get tableoid as a column hint: =# select tablepid from foo; ERROR: 42703: column tablepid does not exist LINE 1: select tablepid from foo; ^ LOCATION: errorMissingColumn, parse_relation.c:3123 Time: 0.425 ms 6) Sometimes no hints are returned... 
Even in simple cases like this one: =# create table foo (aa int, bb int); CREATE TABLE =# select ab from foo; ERROR: 42703: column ab does not exist LINE 1: select ab from foo; ^ LOCATION: errorMissingColumn, parse_relation.c:3123 7) Performance penalty with a table with 1600 columns: =# CREATE FUNCTION create_long_table(tabname text, columns int) RETURNS void LANGUAGE plpgsql AS $$ declare first_col bool = true; count int; query text; begin query := 'CREATE TABLE ' || tabname || ' ('; for count in 0..columns loop query := query || 'col' || count || ' int'; if count < columns then query := query || ', '; end if; end loop; query := query || ')'; execute query; end; $$; =# SELECT create_long_table('aa', 1599); create_long_table --- (1 row) Then tested queries like that: SELECT col888a FROM aa; Patched version: 2.100ms~2.200ms master branch (6048896): 0.956 ms~0.990 ms So the performance impact seems limited. Regards, -- Michael -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] RLS Design
2014-07-06 14:19 GMT+09:00 Stephen Frost sfr...@snowman.net: Kaigai, * Kouhei Kaigai (kai...@ak.jp.nec.com) wrote: Can you clarify where this is coming from..? It sounds like you're referring to an existing implementation and, if so, it'd be good to get more information on how that works exactly. Oracle VPD - Multiple Policies for Each Table, View, or Synonym http://docs.oracle.com/cd/B19306_01/network.102/b14266/apdvpoli.htm#i1008351 It says - Note that all policies applied to a table are enforced with AND syntax. While I'm not against using this as an example to consider, it's much more complex than what we're talking about here- and it supports application contexts which allow groups of RLS rights to be applied or not applied; essentially it allows both AND and OR for sets of RLS policies, along with default policies which are applied no matter what. It's not only Oracle VPD; it also fits the attitude of defense in depth. Consider a system with a network firewall, unix permissions and selinux all installed. If somebody wants to reference an information asset within a file, he has to connect to the server from a network address allowed by the firewall configuration, AND both DAC and MAC have to allow his access. These are not independent systems and your argument would apply to our GRANT system also, which I hope it's agreed would make it far less useful. Note also that SELinux brings in another complexity- it needs to make system calls out to check the access. Usually, we have to pass all of the access controls to reference the target information, not just one of the installed access control mechanisms. This is true in some cases, and not in others. Only one role you are a member of needs to have access to a relation, not all of them. There are other examples of 'OR'-style security policies, this is merely one. I'm simply not convinced that it applies in the specific case we're talking about. 
In the end, I expect that either way people will be upset because they won't be able to specify fully which should be AND vs. which should be OR with the kind of flexibility other systems provide. What I'm trying to get to is an initial implementation which is generally useful and is able to add such support later. This is similar to how roles work- your overall access includes all access granted to any roles you are a member of. You don't need SELECT rights granted to every role you are a member of to select from the table. Additionally, if an admin wants to AND the quals together then they can simply create a policy which does that rather than have 2 policies. It seems to me a pain for database administration if we have to take care that RLS policies do not conflict with each other. This notion of 'conflict' doesn't make much sense to me. What is 'conflicting' here? Each policy would simply need to stand on its own for the role which it's being applied to. That's very simple and straight-forward. I expect 90% of RLS policies will be configured for the PUBLIC user, to apply the same access rules to everybody. In this case, the DBA has to ensure that the target table has no policy, or that the existing policy does not conflict with the new policy to be set. I don't think it is a good idea to force these checks on the DBA. If the DBA only uses PUBLIC then they have to ensure that each policy they set up for PUBLIC can stand on its own- though, really, I expect if they go that route they'd end up with just one policy that calls a stored procedure... You are suggesting instead that if application 2 sets up policies on the table and then application 1 adds another policy that it should reduce what application 2's users can see? That doesn't make any sense to me. I'd actually expect these applications to at least use different roles anyway, which means they could each have a single role specific policy which only returns what that application is allowed to see. I don't think this assumption is reasonable. 
Consider two applications: app-X, a database security product that applies access control based on the remote ip-address of the client for any table access by any database role, and app-Y, a usual enterprise package for daily business data, with its own RLS policy. What is the expected behavior in this case? That the DBA manage the rights on the tables. I expect that will be required for quite a while with PG. It's nice to think of these application products that will manage all access for users by setting up their own policies, but we have yet to even discuss how they would have appropriate rights on the table to be able to do so (and to not interfere with each other..). Let's at least get something which is generally useful in. I'm all for trying to plan out how to get there and would welcome suggestions you have which are specific to PG on what we could do here (I'm not keen on just trying to mimic another product completely...), but at the level
Re: [HACKERS] tweaking NTUP_PER_BUCKET
On Tue, Jul 8, 2014 at 9:35 AM, Tomas Vondra t...@fuzzy.cz wrote: On 8 Červenec 2014, 14:49, Robert Haas wrote: On Wed, Jul 2, 2014 at 8:13 PM, Tomas Vondra t...@fuzzy.cz wrote: I propose dynamic increase of the nbuckets (up to NTUP_PER_BUCKET=1) once the table is built and there's free space in work_mem. The patch mentioned above makes implementing this possible / rather simple. Another idea would be to start with NTUP_PER_BUCKET=1 and then, if we run out of memory, increase NTUP_PER_BUCKET. I'd like to think that the common case is that work_mem will be large enough that everything fits; and if you do it that way, then you save yourself the trouble of rehashing later, which as you point out might lose if there are only a few probes. If it turns out that you run short of memory, you can merge pairs of buckets up to three times, effectively doubling NTUP_PER_BUCKET each time. Maybe. I'm not against setting NTUP_PER_BUCKET=1, but with large outer relations it may be way cheaper to use higher NTUP_PER_BUCKET values instead of increasing the number of batches (resulting in repeated scans of the outer table). I think it's important (but difficult) to keep these things somehow balanced. Right, I think that's clear. I'm just pointing out that you get to decide: you can either start with a larger NTUP_PER_BUCKET and then reduce it if you have enough memory left, or you can start with a smaller NTUP_PER_BUCKET and then increase it if you run short of memory. Yet another idea is to stick with your scheme, but do the dynamic bucket splits lazily. Set a flag on each bucket indicating whether or not it needs a lazy split. When someone actually probes the hash table, if the flag is set for a particular bucket, move any tuples that don't belong as we scan the bucket. If we reach the end of the bucket chain, clear the flag. I don't think a lazy scheme makes much sense here. There are two clearly separated phases - loading the table and search. 
Also, it seems to me it can't work as described. Say we build a table and then find out we need a table 4x the size. If I understand your proposal correctly, you'd just resize buckets array, copy the existing buckets (first 1/4 buckets) and set 'needs_split' flags. Correct? Well, the problem is that while scanning a bucket you already need to know which tuples belong there. But with the lazy approach, the tuples might still be in the old (not yet split) buckets. So you'd have to search the current bucket and all buckets that were not split yet. Or did I miss something? If you have, say, bucket 0..63 and you decide to change it to buckets 0..255, then any tuple that isn't in bucket N has to be in bucket N % 64. More generally, any time you expand or contract by a power of two it's pretty easy to figure out where tuples have to go. But even if you do something that's not a power of two, like say a 10x compaction of the buckets, there's still only two buckets that can contain any particular hash value. If the hash value is H, the old number of buckets is O, and the new number of buckets is N, then the tuple has got to be in bucket H % O or bucket H % N. If bucket H % O doesn't have the needs-split flag set, then it must be in H % N (if it exists at all). -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
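The probe-time relocation Robert describes might look roughly like this, under the simplifying assumptions of a chained table indexed by hash % nbuckets and a power-of-two expansion; all names here (LazyTable, lazy_probe, needs_split) are hypothetical, not the real nodeHash.c structures:

```c
#include <assert.h>
#include <stddef.h>

typedef struct Tuple
{
    unsigned      hash;
    struct Tuple *next;
} Tuple;

typedef struct
{
    Tuple  **buckets;       /* nnew slots; only the first nold were used
                             * before the resize */
    char    *needs_split;   /* one flag per pre-resize bucket */
    unsigned nold;          /* bucket count before the resize */
    unsigned nnew;          /* bucket count after (power-of-two multiple) */
} LazyTable;

/* Probe for `hash`, splitting its pre-resize bucket lazily on first visit.
 * A tuple whose post-resize home is hash % nnew either already sits there
 * or still sits in its pre-resize bucket hash % nold. */
static Tuple *
lazy_probe(LazyTable *ht, unsigned hash)
{
    unsigned oldb = hash % ht->nold;

    if (ht->needs_split[oldb])
    {
        /* redistribute the old chain into post-resize buckets */
        Tuple *t = ht->buckets[oldb];

        ht->buckets[oldb] = NULL;
        while (t)
        {
            Tuple   *next = t->next;
            unsigned b = t->hash % ht->nnew;

            t->next = ht->buckets[b];
            ht->buckets[b] = t;
            t = next;
        }
        ht->needs_split[oldb] = 0;
    }

    for (Tuple *t = ht->buckets[hash % ht->nnew]; t; t = t->next)
        if (t->hash == hash)
            return t;
    return NULL;
}
```

The invariant matches the argument in the message above: each probe touches at most two bucket chains, and once a bucket's flag is cleared every tuple that hashes there is in its final position.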
Re: [HACKERS] Extending constraint exclusion for implied constraints/conditions
Ashutosh Bapat ashutosh.ba...@enterprisedb.com writes: On Mon, Jul 7, 2014 at 7:37 PM, Tom Lane t...@sss.pgh.pa.us wrote: I doubt it. The extra code isn't the problem so much, it's the extra planning cycles spent trying to make proofs that will usually fail. What you propose will create a combinatorial explosion in the number of proof paths to be considered. I can understand that it will create a combinatorial explosion in the number of predicates that need to be examined by the constraint exclusion. I do not understand where the paths get involved here. The constraint exclusion kicks in before paths are created. Perhaps I should not have used the term path here, because I wasn't referring to what the planner calls Paths. I just meant that there will be many more ways to perhaps prove a constraint-exclusion result, and the planner will have to go through them all. (Usually to no avail, because how often do people write queries that are guaranteed to produce no rows?) An example of what I meant by combinatorial explosion is that if a clause mentions K variables, and each of those variables has been equated to M other variables, there are (M+1)^K possible derived clauses (e.g., K = 3 variables each equated to M = 4 others already gives 125 candidates), and it looks to me like we'd have to consider them all to catch the sort of situation you're talking about. I think this is going to require a *lot* of additional planner cycles, and TBH you've not provided any very compelling example of why it's worthwhile. Yeah, if you can exclude a large partition it's a win, but how often is that going to happen in real-world queries? The one example you gave seemed pretty artificial to me, ie unrelated to typical partitioning constraints. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] query_is_distinct_for does not take into account set returning functions
David Rowley dgrowle...@gmail.com writes: I think we should probably include the logic to test for set returning functions into query_is_distinct_for. It strikes me that there's only a problem if the SRF is in a tlist entry that is not one of the DISTINCT or GROUP BY columns, respectively. It may not be worth the extra complexity to figure that out, though. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
[HACKERS] How to use Makefile - pgxs without gcc -O2 ?
hi, is there any hint to tell pgxs to compile with gcc -O0 (CFLAGS) ? regards geohas -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [BUGS] LEFT JOINs not optimized away when not needed
On Tue, Jul 08, 2014 at 11:19:31AM -0400, Tom Lane wrote: Moshe Jacobson mo...@neadwerx.com writes: Seeing that there is only one output column, and that the results are grouped by this output column, it seems to me that the optimizer should not even look at the rest of the tables. The GROUP BY has nothing to do with it, but if all the other tables' join keys are primary keys (or at least unique), I'd expect the planner to get rid of the joins. However, I'm not sure whether it works completely when there are more than join_collapse_limit relations to worry about. Eliminating JOINs seems orthogonal, at least in theory, to join_collapse_limit. What have I missed here, and how might they have dependencies? Cheers, David. -- David Fetter da...@fetter.org http://fetter.org/ Phone: +1 415 235 3778 AIM: dfetter666 Yahoo!: dfetter Skype: davidfetter XMPP: david.fet...@gmail.com iCal: webcal://www.tripit.com/feed/ical/people/david74/tripit.ics Remember to vote! Consider donating to Postgres: http://www.postgresql.org/about/donate -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] How to use Makefile - pgxs without gcc -O2 ?
geohas li...@hasibether.at writes: is there any hint to tell pgxs to compile with gcc -O0 (CFLAGS) ? I tend to use make PROFILE=-O0 which relies on knowing that PG's make rules append $(PROFILE) to CFLAGS. Alternatively you could just override CFLAGS: make CFLAGS=whatever but this requires knowing exactly what configure put into CFLAGS so that you don't remove any flags that you still want. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] How to use Makefile - pgxs without gcc -O2 ?
Thank you Tom! make CFLAGS=-O0 was it. now gdb doesn't print Value optimized out. regards On 08/07/14 17:45, Tom Lane wrote: geohas li...@hasibether.at writes: is there any hint to tell pgxs to compile with gcc -O0 (CFLAGS) ? I tend to use make PROFILE=-O0 which relies on knowing that PG's make rules append $(PROFILE) to CFLAGS. Alternatively you could just override CFLAGS: make CFLAGS=whatever but this requires knowing exactly what configure put into CFLAGS so that you don't remove any flags that you still want. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] tweaking NTUP_PER_BUCKET
On 8 Červenec 2014, 16:16, Robert Haas wrote: On Tue, Jul 8, 2014 at 9:35 AM, Tomas Vondra t...@fuzzy.cz wrote: Maybe. I'm not against setting NTUP_PER_BUCKET=1, but with large outer relations it may be way cheaper to use higher NTUP_PER_BUCKET values instead of increasing the number of batches (resulting in repeated scans of the outer table). I think it's important (but difficult) to keep these things somehow balanced. Right, I think that's clear. I'm just pointing out that you get to decide: you can either start with a larger NTUP_PER_BUCKET and then reduce it if you have enough memory left, or you can start with a smaller NTUP_PER_BUCKET and then increase it if you run short of memory. I don't think those two approaches are equal. With the approach I took, I can use a compromise value (NTUP=4) initially, and only resize the hash table once at the end (while keeping the amount of memory under work_mem all the time). With NTUP=1 and an increase in case of memory pressure, you have to shrink the table immediately (to fit into work_mem), and if the hash table gets split into multiple batches you're suddenly in a situation with lower memory pressure and you may need to increase it again. Yet another idea is to stick with your scheme, but do the dynamic bucket splits lazily. Set a flag on each bucket indicating whether or not it needs a lazy split. When someone actually probes the hash table, if the flag is set for a particular bucket, move any tuples that don't belong as we scan the bucket. If we reach the end of the bucket chain, clear the flag. I don't think a lazy scheme makes much sense here. There are two clearly separated phases - loading the table and search. Also, it seems to me it can't work as described. Say we build a table and then find out we need a table 4x the size. If I understand your proposal correctly, you'd just resize the buckets array, copy the existing buckets (the first 1/4 of the buckets) and set 'needs_split' flags. Correct? 
Well, the problem is that while scanning a bucket you already need to know which tuples belong there. But with the lazy approach, the tuples might still be in the old (not yet split) buckets. So you'd have to search the current bucket and all buckets that were not split yet. Or did I miss something? If you have, say, buckets 0..63 and you decide to change it to buckets 0..255, then any tuple that isn't in bucket N has to be in bucket N % 64. More generally, any time you expand or contract by a power of two it's pretty easy to figure out where tuples have to go. OK. But even if you do something that's not a power of two, like say a 10x compaction of the buckets, there's still only two buckets that can contain any particular hash value. If the hash value is H, the old number of buckets is O, and the new number of buckets is N, then the tuple has got to be in bucket H % O or bucket H % N. If bucket H % O doesn't have the needs-split flag set, then it must be in H % N (if it exists at all). I wonder if this is really worth the effort - my guess is it's efficient only if a large portion of the buckets is not visited (and thus does not need to be split) at all. Not sure how common that is (our workloads certainly are not like that). regards Tomas -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] How to use Makefile - pgxs without gcc -O2 ?
On 07/08/2014 17:53, geohas wrote: make CFLAGS=-O0 was it. now gdb doesn't print Value optimized out. If you're using GCC 4.8 or later, consider using it with -Og for that kind of build. -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Doing better at HINTing an appropriate column within errorMissingColumn()
Hi, On Tue, Jul 8, 2014 at 6:58 AM, Michael Paquier michael.paqu...@gmail.com wrote: 2) Checking process goes through all the existing columns of a relation even a difference of 1 with some other column(s) has already been found. As we try to limit the number of hints returned, this seems like a waste of resources. In general it's possible that an exact match will later be found within the RTE, and exact matches don't have to pay the wrong alias penalty, and are immediately returned. It is therefore not a waste of resources, but even if it was that would be pretty inconsequential as your benchmark shows. 3) distanceName could be improved, by for example having some checks on the string lengths of target and source columns, and immediately reject the match if for example the length of the source string is the double/half of the length of target. I don't think it's a good idea to tie distanceName() to the ultimate behavior of errorMissingColumn() hinting, since there may be other callers in the future. Besides, that isn't going to help much. 4) This is not nice, could it be possible to remove the stuff from varlena.c? +/* Expand each Levenshtein distance variant */ +#include levenshtein.c +#define LEVENSHTEIN_LESS_EQUAL +#include levenshtein.c +#undef LEVENSHTEIN_LESS_EQUAL Part of the same comment: only varstr_leven_less_equal is used to calculate the distance, should we really move varstr_leven to core? This clearly needs to be reworked as not just a copy-paste of the things in fuzzystrmatch. The flag LEVENSHTEIN_LESS_EQUAL should be let within fuzzystrmatch I think. So there'd be one variant within core and one within contrib/fuzzystrmatch? I don't think that's an improvement. 5) Do we want hints on system columns as well? I think it's obvious that the answer must be no. That's going to frequently result in suggestions of columns that users will complain aren't even there. If you know about the system columns, you can just get it right. 
They're supposed to be hidden for most purposes. 6) Sometimes no hints are returned... Even in simple cases like this one: =# create table foo (aa int, bb int); CREATE TABLE =# select ab from foo; ERROR: 42703: column ab does not exist LINE 1: select ab from foo; ^ LOCATION: errorMissingColumn, parse_relation.c:3123 That's because those two candidates come from a single RTE and have an equal distance -- you'd see both suggestions if you joined two tables with each candidate, assuming that each table being joined didn't individually have the same issue. I think that that's probably considered the correct behavior by most. -- Peter Geoghegan -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
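For reference, the Levenshtein distance the thread keeps referring to can be computed with the classic two-row dynamic program. The sketch below is the textbook algorithm with unit costs, not the varstr_leven()/varstr_leven_less_equal() code under discussion (which adds configurable costs and an early-exit "less-equal" variant):

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Plain O(m*n) Levenshtein distance with unit insert/delete/substitute
 * costs, using two rolling rows of the DP matrix. */
static int
levenshtein(const char *s, const char *t)
{
    int  m = (int) strlen(s);
    int  n = (int) strlen(t);
    int *prev = malloc((n + 1) * sizeof(int));
    int *curr = malloc((n + 1) * sizeof(int));
    int  d;

    for (int j = 0; j <= n; j++)
        prev[j] = j;                    /* distance from empty prefix */

    for (int i = 1; i <= m; i++)
    {
        curr[0] = i;
        for (int j = 1; j <= n; j++)
        {
            int del = prev[j] + 1;
            int ins = curr[j - 1] + 1;
            int sub = prev[j - 1] + (s[i - 1] != t[j - 1]);
            int best = del < ins ? del : ins;

            curr[j] = best < sub ? best : sub;
        }
        /* roll the rows */
        int *tmp = prev;
        prev = curr;
        curr = tmp;
    }
    d = prev[n];
    free(prev);
    free(curr);
    return d;
}
```

The "less-equal" variant discussed above bounds the answer and bails out once every cell in the current row exceeds the bound, which is why it is the cheaper choice for hinting where only small distances matter.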
Re: [HACKERS] tweaking NTUP_PER_BUCKET
On Tue, Jul 8, 2014 at 12:06 PM, Tomas Vondra t...@fuzzy.cz wrote: On 8 Červenec 2014, 16:16, Robert Haas wrote: On Tue, Jul 8, 2014 at 9:35 AM, Tomas Vondra t...@fuzzy.cz wrote: Maybe. I'm not against setting NTUP_PER_BUCKET=1, but with large outer relations it may be way cheaper to use higher NTUP_PER_BUCKET values instead of increasing the number of batches (resulting in repeated scans of the outer table). I think it's important (but difficult) to keep these things somehow balanced. Right, I think that's clear. I'm just pointing out that you get to decide: you can either start with a larger NTUP_PER_BUCKET and then reduce it if you have enough memory left, or you can start with a smaller NTUP_PER_BUCKET and then increase it if you run short of memory. I don't think those two approaches are equal. With the approach I took, I can use a compromise value (NTUP=4) initially, and only resize the hash table once at the end (while keeping the amount of memory under work_mem all the time). With NTUP=1 and an increase in case of memory pressure, you have to shrink the table immediately (to fit into work_mem), and if the hash table gets split into multiple batches you're suddenly in a situation with lower memory pressure and you may need to increase it again. True. On the other hand, this really only comes into play when the estimates are wrong. If you know at the start how many tuples you're going to need to store and how big they are, you determine whether NTUP_PER_BUCKET=1 is going to work before you even start building the hash table. If it isn't, then you use fewer buckets right from the start. If we start by estimating a small value for NTUP_PER_BUCKET and then let it float upward if we turn out to have more tuples than expected, we're optimizing for the case where our statistics are right. 
If we start by estimating a larger value for NTUP_PER_BUCKET than what we think we need to fit within work_mem, we're basically guessing that our statistics are more likely to be wrong than to be right. I think. I wonder if this is really worth the effort - my guess is it's efficient only if a large portion of the buckets is not visited (and thus does not need to be split) at all. Not sure how common that is (our workloads certainly are not like that). Yeah. It may be a bad idea. I threw it out there as a possible way of trying to mitigate the worst case, which is when you go to the trouble of building the hash table and then make very few probes. But that may not be worth the complexity that this would introduce. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Securing make check (CVE-2014-0067)
Re: Noah Misch 2014-06-08 20140608135713.ga525...@tornado.leadboat.com Here's an update that places the socket in a temporary subdirectory of /tmp. The first attached patch adds NetBSD mkdtemp() to libpgport. The second, principal, patch uses mkdtemp() to implement this design in pg_regress. The corresponding change to contrib/pg_upgrade/test.sh is based on the configure script's arrangements for its temporary directory. Hi, I believe pg_upgrade itself still needs a fix. While it's not a security problem to put the socket in $CWD while upgrading (it is using -c unix_socket_permissions=0700), this behavior is pretty unexpected, and does fail if your $CWD is > 107 bytes. In f545d233ebce6971b6f9847680e48b679e707d22 Peter fixed the pg_ctl perl tests to avoid that problem, so imho it would make even more sense to fix pg_upgrade which could also fail in production. This has been discussed here and elsewhere [1] before, but was rejected as not being in line with what the other utilities do, but now pg_upgrade is the lone outlier. Noah's changes let Debian drop 4 out of 5 pg_regress-sockdir patches; having this fixed would also let us get rid of the last one [2]. [1] http://lists.debian.org/debian-wb-team/2013/05/msg00015.html [2] https://alioth.debian.org/scm/loggerhead/pkg-postgresql/postgresql-9.5/trunk/view/head:/debian/patches/64-pg_upgrade-sockdir Christoph -- c...@df7cb.de | http://www.df7cb.de/
Re: [HACKERS] postgresql.auto.conf and reload
On Sat, Jul 5, 2014 at 10:52 AM, Tom Lane t...@sss.pgh.pa.us wrote: Amit Kapila amit.kapil...@gmail.com writes: Please find the patch attached to address the above concern. I have updated docs, so that users can be aware of such behaviour. I'm in the camp that says that we need to do something about this, not just claim that it's operating as designed. AFAICS, the *entire* argument for having ALTER SYSTEM at all is ease of use; but this behavior is not what I would call facilitating ease of use. In particular, if you are conveniently able to edit postgresql.conf, what the heck do you need ALTER SYSTEM for? One possible answer is to modify guc-file.l so that only the last value supplied for any one GUC gets processed. The naive way of doing that would be O(N^2) in the number of uncommented settings, which might or might not be a problem; if it is, we could no doubt devise a smarter data structure. I've been really annoyed by the current behavior even on older releases because I have a script, which I use frequently, that does this: initdb cat >> $PGDATA/postgresql.conf <<EOM shared_buffers=8GB another setting another setting another setting EOM Now, sometimes while experimenting, I will change a setting in postgresql.conf and reload the config. At which point it complains that it can't change shared_buffers at runtime. It does this because the line from initdb is in the file, and so is the one I added. But I'm *not* trying to change shared_buffers. Sure, the first value in the config file doesn't match the current value, but the *last* line in the config file is the one that's supposed to control, so why is it complaining about the earlier line? I haven't looked at the code in this area too carefully, but it seems to me like the flow ought to be: 1. Read all of the config files and determine what the final value present in each config file is. 2. *Then*, in a second pass, enforce requirements like can't be changed except at server start. 
-- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
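The two-pass flow proposed above can be illustrated with a toy parser (Python for brevity; names such as RESTART_ONLY are hypothetical stand-ins, not the real GUC machinery):

```python
# Toy sketch of the proposed two-pass reload: pass 1 collects only the
# *final* value of each setting across all config files, pass 2 applies
# "can't change at runtime" checks to those final values alone.

RESTART_ONLY = {"shared_buffers"}        # hypothetical restart-only set

def parse_lines(lines):
    """Yield (name, value) pairs from simple name=value lines."""
    for line in lines:
        line = line.split("#", 1)[0].strip()   # drop comments/blank lines
        if not line:
            continue
        name, _, value = line.partition("=")
        yield name.strip(), value.strip()

def reload_config(config_files, current_values):
    # Pass 1: later lines override earlier ones, so a dict naturally keeps
    # only the last value supplied for each GUC.
    final = {}
    for lines in config_files:
        for name, value in parse_lines(lines):
            final[name] = value
    # Pass 2: enforce restart-only rules against the final values only.
    errors = [name for name, value in final.items()
              if name in RESTART_ONLY and current_values.get(name) != value]
    return final, errors

# The initdb-written line no longer triggers a spurious complaint, because
# a later line restores the running value:
files = [["shared_buffers=128MB", "shared_buffers=8GB", "work_mem=64MB"]]
final, errors = reload_config(files, {"shared_buffers": "8GB"})
```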
Re: [HACKERS] postgresql.auto.conf and reload
On 07/08/2014 10:07 AM, Robert Haas wrote: I haven't looked at the code in this area too carefully, but it seems to me like the flow ought to be: 1. Read all of the config files and determine what the final value present in each config file is. 2. *Then*, in a second pass, enforce requirements like can't be changed except at server start. +1 This would also make conf.d much more useful; I wouldn't have to worry as much about overlapping config settings. Sounds like a 9.5 feature, though. -- Josh Berkus PostgreSQL Experts Inc. http://pgexperts.com
Re: [HACKERS] Supporting Windows SChannel as OpenSSL replacement
On Thu, Jun 26, 2014 at 4:26 PM, Andreas Karlsson andr...@proxel.se wrote: On 06/24/2014 03:20 AM, Jeff Janes wrote: I've tried your 0001 patch, reflecting this refactoring, on Linux and it caused 'make check' to hang at 'starting postmaster'. I found the bug in the code, and I have attached a patch which you can apply on top of the patch. The regression tests pass now on my Debian machine. Your fix works for me as well. Thanks. Is there some recipe for testing the 0002 patch? Can it be tested in a MinGW environment, or does it need to use the Microsoft-supplied compilers? Thanks, Jeff
Re: [HACKERS] postgresql.auto.conf and reload
On 07/06/2014 01:27 AM, Christoph Berg wrote: Another could be that during initdb all the uncommented settings be written to the postgresql.auto.conf file rather than to postgresql.conf. I think we can do this by changing code in initdb.c->setup_config(). This will ensure that unless the user is changing settings in a mixed way (some by ALTER SYSTEM and some manually by editing postgresql.conf), there will be no such problem. There is no reasonable way for us to prevent issues for users who are manually changing both pg.conf and pg.auto.conf. Users should stick to one or the other method of management (or, thirdly, using conf.d); if they mix methods, we can't prevent confusion at restart time and we shouldn't try. -- Josh Berkus PostgreSQL Experts Inc. http://pgexperts.com
Re: [HACKERS] Securing make check (CVE-2014-0067)
On Tue, Jul 08, 2014 at 07:02:04PM +0200, Christoph Berg wrote: I believe pg_upgrade itself still needs a fix. While it's not a security problem to put the socket in $CWD while upgrading (it is using -c unix_socket_permissions=0700), this behavior is pretty unexpected, and does fail if your $CWD is > 107 bytes. In f545d233ebce6971b6f9847680e48b679e707d22 Peter fixed the pg_ctl perl tests to avoid that problem, so imho it would make even more sense to fix pg_upgrade which could also fail in production. +1. Does writing that patch interest you? -- Noah Misch EnterpriseDB http://www.enterprisedb.com
Re: [HACKERS] Securing make check (CVE-2014-0067)
Re: Noah Misch 2014-07-08 20140708174125.ga1884...@tornado.leadboat.com +1. Does writing that patch interest you? I'll give it a try once I've finished this CF review. Christoph -- c...@df7cb.de | http://www.df7cb.de/
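The 107-byte figure discussed in this subthread comes from the size of sockaddr_un.sun_path (commonly 108 bytes including the terminating NUL), which is why a short mkdtemp() directory under /tmp avoids the failures a long $CWD causes. A rough illustration (Python's tempfile.mkdtemp() wraps the same libc facility the patch adds to libpgport; the socket filename is the conventional PostgreSQL pattern, used here for illustration):

```python
# Create a private (mode 0700) temp directory under /tmp and build a
# socket path inside it, rejecting anything that wouldn't fit in
# sockaddr_un.sun_path on common platforms.
import os
import tempfile

SUN_PATH_MAX = 107  # usable bytes in sun_path, leaving room for the NUL

def make_socket_dir(prefix="pg_regress."):
    d = tempfile.mkdtemp(prefix=prefix, dir="/tmp")  # private, 0700
    sock = os.path.join(d, ".s.PGSQL.5432")
    if len(sock.encode()) > SUN_PATH_MAX:
        raise ValueError("socket path too long for sun_path")
    return d, sock

d, sock = make_socket_dir()
```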
Re: [HACKERS] tweaking NTUP_PER_BUCKET
On 8.7.2014 19:00, Robert Haas wrote: On Tue, Jul 8, 2014 at 12:06 PM, Tomas Vondra t...@fuzzy.cz wrote: On 8 Červenec 2014, 16:16, Robert Haas wrote: Right, I think that's clear. I'm just pointing out that you get to decide: you can either start with a larger NTUP_PER_BUCKET and then reduce it if you have enough memory left, or you can start with a smaller NTUP_PER_BUCKET and then increase it if you run short of memory. I don't think those two approaches are equal. With the approach I took, I can use a compromise value (NTUP=4) initially, and only resize the hash table once at the end (while keeping the amount of memory under work_mem all the time). With NTUP=1 and an increase in case of memory pressure, you have to shrink the table immediately (to fit into work_mem), and if the hash table gets split into multiple batches you're suddenly in a situation with lower memory pressure and you may need to increase it again. True. On the other hand, this really only comes into play when the estimates are wrong. If you know at the start how many tuples you're going to need to store and how big they are, you determine whether NTUP_PER_BUCKET=1 is going to work before you even start building the hash table. If it isn't, then you use fewer buckets right from the start. If we start by estimating a small value for NTUP_PER_BUCKET and then let it float upward if we turn out to have more tuples than expected, we're optimizing for the case where our statistics are right. If we start by estimating a larger value for NTUP_PER_BUCKET than what we think we need to fit within work_mem, we're basically guessing that our statistics are more likely to be wrong than to be right. I think. Good point. The first patch was targeted exactly at the wrongly estimated queries. This patch attempts to apply the rehash to all plans, and maybe there's a better way. 
If the estimates are correct / not too off, we can use this information to do the sizing 'right' at the beginning (without facing rehashes later). Over-estimates are not a problem, because it won't make the hash table slower (it'll be sized for more tuples) and we can't change the number of batches anyway. With under-estimates we have to decide whether to resize the hash or increase the number of batches. In both cases that matter (correct estimates and under-estimates) we have to decide whether to increase the number of buckets or batches. I'm not sure how to do that. I wonder if this is really worth the effort - my guess is it's efficient only if a large portion of the buckets is not visited (and thus does not need to be split) at all. Not sure how common that is (our workloads certainly are not like that). Yeah. It may be a bad idea. I threw it out there as a possible way of trying to mitigate the worst case, which is when you go to the trouble of building the hash table and then make very few probes. But that may not be worth the complexity that this would introduce. Let's keep it simple for now. I think the sizing question (explained above) is more important and needs to be solved first. regards Tomas
Re: [HACKERS] [GSoC2014] Patch ALTER TABLE ... SET LOGGED
Hi, here's my review for this patch. Generally, the patch addresses a real need, tables often only turn out to be desired as unlogged if there are performance problems in practice, and the other way round changing an unlogged table to logged is way more involved manually than it could be with this patch. So yes, we want the feature. I've tried the patch, and it works as desired, including tab completion in psql. There are a few issues to be resolved, though. Re: Fabrízio de Royes Mello 2014-06-26 cafcns+pyv0pbjlo97osyqv1yqoc7s+jgvwxc8uny5jsrdwy...@mail.gmail.com As part of GSoC2014 and with agreement of my mentor and reviewer (Stephen Frost) I've send a complement of the first patch to add the capability to change a regular table to unlogged. (Fwiw, I believe this direction is the more interesting one.) With this patch we finish the main goals of the GSoC2014 and now we'll work in the additional goals. ... that being the non-WAL-logging with wal_level=minimal, or more? diff --git a/doc/src/sgml/ref/alter_table.sgml b/doc/src/sgml/ref/alter_table.sgml index 69a1e14..424f2e9 100644 --- a/doc/src/sgml/ref/alter_table.sgml +++ b/doc/src/sgml/ref/alter_table.sgml @@ -58,6 +58,7 @@ ALTER TABLE [ IF EXISTS ] replaceable class=PARAMETERname/replaceable ENABLE REPLICA RULE replaceable class=PARAMETERrewrite_rule_name/replaceable ENABLE ALWAYS RULE replaceable class=PARAMETERrewrite_rule_name/replaceable CLUSTER ON replaceable class=PARAMETERindex_name/replaceable +SET {LOGGED | UNLOGGED} SET WITHOUT CLUSTER SET WITH OIDS SET WITHOUT OIDS This must not be between the two CLUSTER lines. I think the best spot would just be one line down, before SET WITH OIDS. (While we are at it, SET TABLESPACE should probably be moved up to the other SET options...) 
@@ -432,6 +433,20 @@ ALTER TABLE [ IF EXISTS ] replaceable class=PARAMETERname/replaceable /varlistentry varlistentry +termliteralSET {LOGGED | UNLOGGED}/literal/term +listitem + para + This form change the table persistence type from unlogged to permanent or This grammar bug pops up consistently: This form *changes*... There's trailing whitespace. + from unlogged to permanent by rewriting the entire contents of the table + and associated indexes into new disk files. + /para + para + Changing the table persistence type acquires an literalACCESS EXCLUSIVE/literal lock. + /para I'd rewrite that and add a reference: ... from unlogged to permanent (see xref linkend=SQL-CREATETABLE-UNLOGGED). /para para Changing the table persistence type acquires an literalACCESS EXCLUSIVE/literal lock and rewrites the table contents and associated indexes into new disk files. /para diff --git a/src/backend/commands/tablecmds.c b/src/backend/commands/tablecmds.c index 60d387a..9dfdca2 100644 --- a/src/backend/commands/tablecmds.c +++ b/src/backend/commands/tablecmds.c @@ -125,18 +125,19 @@ static List *on_commits = NIL; * a pass determined by subcommand type. 
*/ -#define AT_PASS_UNSET-1 /* UNSET will cause ERROR */ -#define AT_PASS_DROP 0 /* DROP (all flavors) */ -#define AT_PASS_ALTER_TYPE 1 /* ALTER COLUMN TYPE */ -#define AT_PASS_OLD_INDEX2 /* re-add existing indexes */ -#define AT_PASS_OLD_CONSTR 3 /* re-add existing constraints */ -#define AT_PASS_COL_ATTRS4 /* set other column attributes */ +#define AT_PASS_UNSET-1 /* UNSET will cause ERROR */ +#define AT_PASS_DROP 0 /* DROP (all flavors) */ +#define AT_PASS_ALTER_TYPE 1 /* ALTER COLUMN TYPE */ +#define AT_PASS_OLD_INDEX2 /* re-add existing indexes */ +#define AT_PASS_OLD_CONSTR 3 /* re-add existing constraints */ +#define AT_PASS_COL_ATTRS4 /* set other column attributes */ /* We could support a RENAME COLUMN pass here, but not currently used */ -#define AT_PASS_ADD_COL 5 /* ADD COLUMN */ -#define AT_PASS_ADD_INDEX6 /* ADD indexes */ -#define AT_PASS_ADD_CONSTR 7 /* ADD constraints, defaults */ -#define AT_PASS_MISC 8 /* other stuff */ -#define AT_NUM_PASSES9 +#define AT_PASS_ADD_COL 5 /* ADD COLUMN */ +#define AT_PASS_ADD_INDEX6 /* ADD indexes */ +#define AT_PASS_ADD_CONSTR 7 /* ADD constraints, defaults */ +#define AT_PASS_MISC 8 /* other stuff */
Re: [HACKERS] tweaking NTUP_PER_BUCKET
On Tue, Jul 8, 2014 at 6:35 AM, Tomas Vondra t...@fuzzy.cz wrote: On 8 Červenec 2014, 14:49, Robert Haas wrote: On Wed, Jul 2, 2014 at 8:13 PM, Tomas Vondra t...@fuzzy.cz wrote: I propose dynamic increase of the nbuckets (up to NTUP_PER_BUCKET=1) once the table is built and there's free space in work_mem. The patch mentioned above makes implementing this possible / rather simple. Another idea would be to start with NTUP_PER_BUCKET=1 and then, if we run out of memory, increase NTUP_PER_BUCKET. I'd like to think that the common case is that work_mem will be large enough that everything fits; and if you do it that way, then you save yourself the trouble of rehashing later, which as you point out might lose if there are only a few probes. If it turns out that you run short of memory, you can merge pairs of buckets up to three times, effectively doubling NTUP_PER_BUCKET each time. Maybe. I'm not against setting NTUP_PER_BUCKET=1, but with large outer relations it may be way cheaper to use higher NTUP_PER_BUCKET values instead of increasing the number of batches (resulting in repeated scans of the outer table). I think it's important (but difficult) to keep these things somehow balanced. With large work_mem values the amount of memory for buckets may be quite significant. E.g. 800MB work_mem may easily give ~120MB of memory taken by buckets with NTUP_PER_BUCKET=1. With NTUP_PER_BUCKET=4 it's just ~30MB. That extra 90MB is memory well spent, in my book. Versus having to walk a 4-deep linked list which is scattered all over main memory just to find one entry we care about in it. It might cause some things that were very close to the edge to tip over into multi-pass hash joins, but even that is not necessarily a bad thing. 
(When I test with work_mem in the 20 to 50 MB range, I often find batches=2 to be ~30% faster than batches=1, I think because partitioning into main memory using sequential writes is not much of a burden, and building and walking two hash tables that both fit in L3 cache is much faster than building 1 hash table in main memory, and more than makes up for the work of partitioning. Presumably that situation doesn't apply to work_mem 900MB, but isn't NUMA about the same thing as L4 cache, in effect?). And if someone had a whole bunch of hash joins which were right in that anti-sweet spot, all they have to do is increase work_mem by (at most) 15% to get out of it. work_mem is basically impossible to tune, so I doubt anyone exists who has a reasonable setting for it which can't be increased by 15% and still be reasonable. And if someone does have it tuned so tightly, they probably won't be upgrading to new major versions without expecting to do some re-tuning. Cheers, Jeff
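Robert's fallback upthread - merging pairs of buckets to double NTUP_PER_BUCKET when memory runs short - works because with a power-of-two bucket count the bucket number is just the hash masked by nbuckets - 1, so halving the count only concatenates pairs of existing buckets and rehashes nothing. A toy illustration (Python for brevity, nothing like the executor's real data structures):

```python
# With bucket = hash & (nbuckets - 1) and nbuckets a power of two, the
# entries of old buckets i and i + nbuckets/2 land in the same new bucket
# when nbuckets is halved, so each halving doubles the effective
# NTUP_PER_BUCKET without touching tuple hashes.

def build(hashes, nbuckets):
    buckets = [[] for _ in range(nbuckets)]
    for h in hashes:
        buckets[h & (nbuckets - 1)].append(h)
    return buckets

def halve(buckets):
    n = len(buckets) // 2
    # bucket i under mask n-1 receives entries from old buckets i and i+n
    return [buckets[i] + buckets[i + n] for i in range(n)]

hashes = list(range(100))
b8 = build(hashes, 8)   # NTUP_PER_BUCKET ~ 12.5 here
b4 = halve(b8)          # one merge step: double the tuples per bucket
```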
Re: [HACKERS] Securing make check (CVE-2014-0067)
On Tue, Jul 8, 2014 at 08:21:48PM +0200, Christoph Berg wrote: +1. Does writing that patch interest you? I'll give it a try once I've finished this CF review. OK. Let me know if you need help. -- Bruce Momjian br...@momjian.us http://momjian.us EnterpriseDB http://enterprisedb.com + Everyone has their own god. +
Re: [HACKERS] Doing better at HINTing an appropriate column within errorMissingColumn()
Peter Geoghegan wrote: 6) Sometimes no hints are returned... Even in simple cases like this one: =# create table foo (aa int, bb int); CREATE TABLE =# select ab from foo; ERROR: 42703: column ab does not exist LINE 1: select ab from foo; ^ LOCATION: errorMissingColumn, parse_relation.c:3123 That's because those two candidates come from a single RTE and have an equal distance -- you'd see both suggestions if you joined two tables with each candidate, assuming that each table being joined didn't individually have the same issue. I think that that's probably considered the correct behavior by most. It seems pretty silly to me actually. Was this designed by a committee? I agree with the general principle that showing a large number of candidates (a la bash) is a bad idea, but failing to show two of them ... Words fail me. -- Álvaro Herrera http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services
Re: [HACKERS] Doing better at HINTing an appropriate column within errorMissingColumn()
On Tue, Jul 8, 2014 at 1:42 PM, Alvaro Herrera alvhe...@2ndquadrant.com wrote: That's because those two candidates come from a single RTE and have an equal distance -- you'd see both suggestions if you joined two tables with each candidate, assuming that each table being joined didn't individually have the same issue. I think that that's probably considered the correct behavior by most. It seems pretty silly to me actually. Was this designed by a committee? I agree with the general principle that showing a large number of candidates (a la bash) is a bad idea, but failing to show two of them ... I guess it was designed by a committee. But we don't fail to show both because they're equally distant. Rather, it's because they're equally distant and from the same RTE. This is a contrived example, but typically showing equally distant columns is useful when they're in a foreign-key relationship - I was worried about the common case where a column name is misspelled that would otherwise be ambiguous, which is why that shows a HINT while the single RTE case doesn't. I think that in most realistic cases it wouldn't be all that useful to show two columns from the same table when they're equally distant. It's easy to imagine that reflecting that no match is good in absolute terms, and we're somewhat conservative about showing any match. While I think this general behavior is defensible, I must admit that it did suit me to write it that way because to do otherwise would have necessitated more invasive code in the existing general purpose scanRTEForColumn() function. -- Peter Geoghegan
Re: [HACKERS] Doing better at HINTing an appropriate column within errorMissingColumn()
On Tue, Jul 8, 2014 at 1:55 PM, Peter Geoghegan p...@heroku.com wrote: I was worried about the common case where a column name is misspelled that would otherwise be ambiguous, which is why that shows a HINT while the single RTE case doesn't To be clear - I mean a HINT with two suggestions rather than just one. If there are 3 or more equally distant suggestions (even if they're all from different RTEs) we also give no HINT in the proposed patch. -- Peter Geoghegan
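The tie-breaking rule described here is easy to prototype. A toy version follows (Python; difflib's similarity ratio stands in for the real Levenshtein machinery, and the per-RTE distinction from the patch is ignored): suggest the closest candidates, but give no hint when three or more tie for best.

```python
# Toy column-hint selection: rank candidates by string similarity and
# suppress the hint entirely when 3+ candidates tie for the best score,
# mirroring the "no HINT on a 3-way tie" behavior described above.
import difflib

def hint_candidates(typo, columns, max_ties=2):
    scored = [(difflib.SequenceMatcher(None, typo, c).ratio(), c)
              for c in columns]
    best = max(s for s, _ in scored)
    ties = [c for s, c in scored if s == best]
    return ties if len(ties) <= max_ties else []
```

For the "select ab from foo (aa int, bb int)" example, both columns tie, so up to two suggestions come back; add a third equally distant candidate and the hint disappears.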
Re: [HACKERS] tweaking NTUP_PER_BUCKET
On 8.7.2014 21:53, Jeff Janes wrote: On Tue, Jul 8, 2014 at 6:35 AM, Tomas Vondra t...@fuzzy.cz wrote: Maybe. I'm not against setting NTUP_PER_BUCKET=1, but with large outer relations it may be way cheaper to use higher NTUP_PER_BUCKET values instead of increasing the number of batches (resulting in repeated scans of the outer table). I think it's important (but difficult) to keep these things somehow balanced. With large work_mem values the amount of memory for buckets may be quite significant. E.g. 800MB work_mem may easily give ~120MB of memory taken by buckets with NTUP_PER_BUCKET=1. With NTUP_PER_BUCKET=4 it's just ~30MB. That extra 90MB is memory well spent, in my book. Versus having to walk a 4-deep linked list which is scattered all over main memory just to find one entry we care about in it. Yes, I share this view, although it really depends on how expensive it is to rescan the outer relation (due to more batches) vs. lookup in a deeper hash table. The other thing is that this memory is currently unaccounted for, i.e. the space for buckets is not counted within the work_mem limit (unless I'm missing something in the code). I'm not sure why, and I believe it should be, so I changed this in the patch. It might cause some things that were very close to the edge to tip over into multi-pass hash joins, but even that is not necessarily a bad thing. (When I test with work_mem in the 20 to 50 MB range, I often find batches=2 to be ~30% faster than batches=1, I think because partitioning into main memory using sequential writes is not much of a burden, and building and walking two hash tables that both fit in L3 cache is much faster than building 1 hash table in main memory, and more than makes up for the work of partitioning. Presumably that situation doesn't apply to work_mem 900MB, but isn't NUMA about the same thing as L4 cache, in effect?). Yes, I've seen cases where plans with (nbatch > 1) were faster than plans with (nbatch=1). 
I'm not entirely sure why :-( but I have two hypotheses so far: (a) Caching within CPU (current CPUs have multiple MBs of L2 cache), which may make a difference, especially in the size range you've mentioned. (b) Lower tuple/bucket ratio - this may easily happen for example if the estimates are slightly lower than reality (either row count or row width) and narrowly exceed work_mem, thus forcing batching. The resulting hash table has ~50% tuple/bucket on average, and thus is faster. And if someone had a whole bunch of hash joins which were right in that anti-sweet spot, all they have to do is increase work_mem by (at most) 15% to get out of it. work_mem is basically impossible to tune, so I doubt anyone exists who has a reasonable setting for it which can't be increased by 15% and still be reasonable. And if someone does have it tuned so tightly, they probably won't be upgrading to new major versions without expecting to do some re-tuning. Right. Still, it's not really nice to get slower hash joins after upgrading to a new version - I somehow expect to get the same or better performance, if possible. So I'd like to make it as efficient as possible, within the given memory limit. Sadly, the increase may be needed anyway because of counting the bucket memory into work_mem, as mentioned above. If committed, this should probably be mentioned in release notes or something. regards Tomas
Re: [HACKERS] tweaking NTUP_PER_BUCKET
Hi, Thinking about this a bit more, do we really need to build the hash table on the first pass? Why not do this: (1) batching - read the tuples, stuff them into a simple list - don't build the hash table yet (2) building the hash table - we have all the tuples in a simple list, batching is done - we know the exact row count, can size the table properly - build the table Also, maybe we could use a regular linear hash table [1], instead of using the current implementation with NTUP_PER_BUCKET=1. (Although, that'd be absolutely awful with duplicates.) regards Tomas [1] http://en.wikipedia.org/wiki/Linear_probing
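For reference, the linear-probing scheme mentioned in [1] looks roughly like the toy below (Python for brevity; as the message notes, long runs of duplicate keys make the probe sequences degenerate, and open addressing also requires knowing the row count up front to size the slot array - which is exactly what the two-pass build above would provide):

```python
# Minimal open-addressing hash table with linear probing: on collision,
# walk forward one slot at a time. The table must be kept below capacity
# or insert() would loop forever.

class LinearProbeTable:
    def __init__(self, nslots):
        self.slots = [None] * nslots        # each slot: one (key, value)

    def _probe(self, key):
        i = hash(key) % len(self.slots)
        while True:
            yield i
            i = (i + 1) % len(self.slots)   # linear step, wrap around

    def insert(self, key, value):
        for i in self._probe(key):
            if self.slots[i] is None or self.slots[i][0] == key:
                self.slots[i] = (key, value)
                return

    def lookup(self, key):
        for i in self._probe(key):
            if self.slots[i] is None:       # hit an empty slot: not present
                return None
            if self.slots[i][0] == key:
                return self.slots[i][1]

t = LinearProbeTable(16)
for k in range(10):
    t.insert(k, k * k)
```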
Re: [HACKERS] Allowing join removals for more join types
David Rowley dgrowle...@gmail.com writes: On Tue, Jul 8, 2014 at 4:28 AM, Tom Lane t...@sss.pgh.pa.us wrote: I'm a bit skeptical as to whether testing for that case is actually worth any extra complexity. Do you have a compelling use-case? But anyway, if we do want to allow it, why does it take any more than adding a check for Consts to the loops in query_is_distinct_for? It's the targetlist entries where we'd want to allow Consts, not the join conditions. I don't really have a compelling use-case, but you're right, it's just a Const check in query_is_distinct_for(), it seems simple enough so I've included that in my refactor of the patch to use query_is_distinct_for(). This allows the regression tests all to pass again. Meh. "I wrote a regression test that expects it" is a pretty poor rationale for adding logic. If you can't point to a real-world case where this is important, I'm inclined to take it out. If we were actually serious about exploiting such cases, looking for bare Consts would be a poor implementation anyhow, not least because const-folding has not yet been applied to the sub-select. I think we'd want to do it for any pseudoconstant expression (no Vars, no volatile functions); which is a substantially more expensive test. 1. The fast path code that existed in join_is_removable() for subqueries when the subquery had no group or distinct clause is now gone. I wasn't too sure that I wanted to assume too much about what query_is_distinct_for may do in the future and I thought if I included some logic in join_is_removable() to fast path, that one day it may fast path wrongly. Or put a quick-check subroutine next to query_is_distinct_for(). The code we're skipping here is not so cheap that I want to blow off skipping it. On review it looks like analyzejoins.c would possibly benefit from an earlier fast-path check as well. 2. 
The patch I submitted here http://www.postgresql.org/message-id/caaphdvrfvkh0p3faoogcckby7fecj9qfanklkx7mwsbcxy2...@mail.gmail.com if that gets accepted then it makes the check for set returning functions in join_is_removable void. Right (and done, if you didn't notice already). TBH I find the checks for FOR UPDATE and volatile functions to be questionable as well. We have never considered those things to prevent pushdown of quals into a subquery (cf subquery_is_pushdown_safe). I think what we're talking about here is pretty much equivalent to pushing an always-false qual into the subquery; if people haven't complained about that, why should they complain about this? Or to put it in slightly more principled terms, we've attempted to prevent subquery optimization from causing volatile expressions to be evaluated *more* times than the naive reading of the query would suggest, but we have generally not felt that we needed to prevent them from happening *fewer* times. regards, tom lane
Re: [HACKERS] [bug fix or improvement?] Correctly place DLLs for ECPG apps in bin folder
Yes. Can you please take a look at Win_lib_bin.patch I shared earlier? I think it or a similar approach will be appropriate. Thanks. Regards, Muhammad Asif Naeem On Tue, Jul 8, 2014 at 5:53 PM, MauMau maumau...@gmail.com wrote: From: Asif Naeem anaeem...@gmail.com Other than my previous comments, patch looks good to me. Thanks. Thanks for reviewing. I understood that your previous comment was to suggest copying the desired DLLs directly from the Release/Debug folder to bin. But I'm afraid it cannot be done cleanly... CopySolutionOutput() copies all DLLs to the lib folder with no exception, as follows: if ($1 == 1) { $dir = "bin"; $ext = "exe"; } elsif ($1 == 2) { $dir = "lib"; $ext = "dll"; } It seems like I have to do something like this, listing the relevant DLL names anyway. I don't think this is cleaner. if ($1 == 1) { $dir = "bin"; $ext = "exe"; } elsif ($1 == 2 /* the project is libpq, libecpg, etc. */) { $dir = "bin"; $ext = "dll"; } elsif ($1 == 2) { $dir = "lib"; $ext = "dll"; } What do you think? Am I misunderstanding your suggestion? Regards MauMau
Re: [HACKERS] [bug fix or improvement?] Correctly place DLLs for ECPG apps in bin folder
Asif Naeem wrote: Yes. Can you please take a look at Win_lib_bin.patch I shared earlier? I think it, or a similar approach, will be appropriate. Thanks. I think the suggestion by Peter Eisentraut upthread was pretty reasonable -- the Makefiles are already aware that they are building a shared library, so scrape the info off them. The less hardcoded stuff in the msvc framework, the better. -- Álvaro Herrera http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services
Re: [HACKERS] Allowing join removals for more join types
On 9 July 2014 09:27, Tom Lane t...@sss.pgh.pa.us wrote: David Rowley dgrowle...@gmail.com writes: On Tue, Jul 8, 2014 at 4:28 AM, Tom Lane t...@sss.pgh.pa.us wrote: I'm a bit skeptical as to whether testing for that case is actually worth any extra complexity. Do you have a compelling use-case? But anyway, if we do want to allow it, why does it take any more than adding a check for Consts to the loops in query_is_distinct_for? It's the targetlist entries where we'd want to allow Consts, not the join conditions. I don't really have a compelling use-case, but you're right, it's just a Const check in query_is_distinct_for(); it seems simple enough, so I've included that in my refactor of the patch to use query_is_distinct_for(). This allows the regression tests all to pass again. Meh. "I wrote a regression test that expects it" is a pretty poor rationale for adding logic. If you can't point to a real-world case where this is important, I'm inclined to take it out. Ok, I'll pull that logic back out when I get home tonight. If we were actually serious about exploiting such cases, looking for bare Consts would be a poor implementation anyhow, not least because const-folding has not yet been applied to the sub-select. I think we'd want to do it for any pseudoconstant expression (no Vars, no volatile functions); which is a substantially more expensive test. 1. The fast-path code that existed in join_is_removable() for subqueries, when the subquery had no group or distinct clause, is now gone. I wasn't too sure that I wanted to assume too much about what query_is_distinct_for may do in the future, and I thought if I included some logic in join_is_removable() to fast path, that one day it may fast path wrongly. Or put a quick-check subroutine next to query_is_distinct_for(). The code we're skipping here is not so cheap that I want to blow off skipping it. Ok, good idea. I'll craft something up tonight along those lines. 
On review it looks like analyzejoins.c would possibly benefit from an earlier fast-path check as well. Do you mean for non-subqueries? There already is a check to see if the relation has no indexes. 2. The patch I submitted here http://www.postgresql.org/message-id/caaphdvrfvkh0p3faoogcckby7fecj9qfanklkx7mwsbcxy2...@mail.gmail.com if that gets accepted then it makes the check for set-returning functions in join_is_removable void. Right (and done, if you didn't notice already). Thanks, I noticed that this morning. I'll remove the (now) duplicate check from the patch. TBH I find the checks for FOR UPDATE and volatile functions to be questionable as well. We have never considered those things to prevent pushdown of quals into a subquery (cf subquery_is_pushdown_safe). I think what we're talking about here is pretty much equivalent to pushing an always-false qual into the subquery; if people haven't complained about that, why should they complain about this? Or to put it in slightly more principled terms, we've attempted to prevent subquery optimization from causing volatile expressions to be evaluated *more* times than the naive reading of the query would suggest, but we have generally not felt that we needed to prevent them from happening *fewer* times. Well, that's a real tough one for me as I only added that based on what you told me here: On 20 May 2014 23:22, Tom Lane t...@sss.pgh.pa.us wrote: I doubt you should drop a subquery containing FOR UPDATE, either. That's a side effect, just as much as a volatile function would be. regards, tom lane As far as I know the FOR UPDATE check is pretty much void as of now anyway, since the current state of query_is_distinct_for() demands that there's either a DISTINCT, GROUP BY or just a plain old aggregate without any grouping, which will just return a single row; neither of these will allow FOR UPDATE anyway. 
I really just added the check to protect the code from possible future additions to query_is_distinct_for() which may add logic to determine uniqueness by some other means. So the effort here should probably be more focused on whether we should allow the join removal when the subquery contains volatile functions. We should probably think fairly hard on this now, as I'm still planning on working on INNER JOIN removals at some point, and I'm thinking we should likely be consistent between the two types of join when it comes to FOR UPDATE and volatile functions. I'm thinking right now that for INNER JOINs, since they're INNER, we could remove either side of the join. In that case maybe it would be harder for the user to understand why their volatile function didn't get executed. Saying that... off the top of my head I can't remember what we'd do in a case like: create view v_a as select a, volatilefunc(a) AS funcresult from a; select a from v_a; Since we didn't select funcresult, do we execute the function?
Re: [HACKERS] Allowing join removals for more join types
David Rowley dgrow...@gmail.com writes: On 9 July 2014 09:27, Tom Lane t...@sss.pgh.pa.us wrote: On review it looks like analyzejoins.c would possibly benefit from an earlier fast-path check as well. Do you mean for non-subqueries? There already is a check to see if the relation has no indexes. Oh, sorry, that was a typo: I meant to write pathnode.c. Specifically, we could skip the translate_sub_tlist step. Admittedly that's not hugely expensive, but as long as we have the infrastructure for a quick check it might be worth doing. TBH I find the checks for FOR UPDATE and volatile functions to be questionable as well. Well, that's a real tough one for me as I only added that based on what you told me here: I doubt you should drop a subquery containing FOR UPDATE, either. That's a side effect, just as much as a volatile function would be. Hah ;-). But the analogy to qual pushdown hadn't occurred to me at the time. As far as I know the FOR UPDATE check is pretty much void as of now anyway, since the current state of query_is_distinct_for() demands that there's either a DISTINCT, GROUP BY or just a plain old aggregate without any grouping, which will just return a single row, neither of these will allow FOR UPDATE anyway. True. So the effort here should be probably be more focused on if we should allow the join removal when the subquery contains volatile functions. We should probably think fairly hard on this now as I'm still planning on working on INNER JOIN removals at some point and I'm thinking we should likely be consistent between the 2 types of join for when it comes to FOR UPDATE and volatile functions, and I'm thinking right now that for INNER JOINs that, since they're INNER that we could remove either side of the join. In that case maybe it would be harder for the user to understand why their volatile function didn't get executed. Color me dubious. In exactly what circumstances would it be valid to suppress an inner join involving a sub-select? 
regards, tom lane
Re: [HACKERS] postgresql.auto.conf and reload
On 09/07/14 05:13, Josh Berkus wrote: On 07/06/2014 01:27 AM, Christoph Berg wrote: Another could be that during initdb all the uncommented settings be written to the postgresql.auto.conf file rather than to postgresql.conf. I think we can do this by changing code in setup_config() in initdb.c. This will ensure that unless the user is changing settings in a mixed way (some by ALTER SYSTEM and some manually by editing postgresql.conf), there will be no such problem. There is no reasonable way for us to prevent issues for users who are manually changing both pg.conf and pg.auto.conf. Users should stick to one or the other method of management (or, thirdly, using conf.d); if they mix methods, we can't prevent confusion at restart time and we shouldn't try. Yes, but even well behaved users will see this type of error, because initdb uncomments certain values (ones that are dead certs for being changed via ALTER SYSTEM subsequently, like shared_buffers), and then - bang! your next reload gets that "your postgresql.conf contains errors" message. Regards Mark
[HACKERS] 9.4 pg_control corruption
I've encountered a corrupt pg_control file on my 9.4 development cluster. I've mostly been using the cluster for changeset extraction / slony testing. This is a 9.4 (currently commit 6ad903d70a440e + a walsender change discussed in another thread) but would have had the initdb done with an earlier 9.4 snapshot.

/usr/local/pgsql94wal/bin$ ./pg_controldata ../data
WARNING: Calculated CRC checksum does not match value stored in file. Either the file is corrupt, or it has a different layout than this program is expecting. The results below are untrustworthy.

pg_control version number:            937
Catalog version number:               201405111
Database system identifier:           6014096177254975326
Database cluster state:               in production
pg_control last modified:             Tue 08 Jul 2014 06:15:58 PM EDT
Latest checkpoint location:           5/44DC5FC8
Prior checkpoint location:            5/44C58B88
Latest checkpoint's REDO location:    5/44DC5FC8
Latest checkpoint's REDO WAL file:    000100050044
Latest checkpoint's TimeLineID:       1
Latest checkpoint's PrevTimeLineID:   1
Latest checkpoint's full_page_writes: on
Latest checkpoint's NextXID:          0/1558590
Latest checkpoint's NextOID:          505898
Latest checkpoint's NextMultiXactId:  3285
Latest checkpoint's NextMultiOffset:  6569
Latest checkpoint's oldestXID:        1281
Latest checkpoint's oldestXID's DB:   1
Latest checkpoint's oldestActiveXID:  0
Latest checkpoint's oldestMultiXid:   1
Latest checkpoint's oldestMulti's DB: 1
Time of latest checkpoint:            Tue 08 Jul 2014 06:15:23 PM EDT
Fake LSN counter for unlogged rels:   0/1
Minimum recovery ending location:     0/0
Min recovery ending loc's timeline:   0
Backup start location:                0/0
Backup end location:                  0/0
End-of-backup record required:        no
Current wal_level setting:            logical
Current wal_log_hints setting:        off
Current max_connections setting:      200
Current max_worker_processes setting: 8
Current max_prepared_xacts setting:   0
Current max_locks_per_xact setting:   64
Maximum data alignment:               8
Database block size:                  8192
Blocks per segment of large relation: 131072
WAL block size:                       8192
Bytes per WAL segment:                16777216
Maximum length of identifiers:        64
Maximum columns in an index:          32
Maximum size of a TOAST chunk:        1996
Size of a large-object chunk:         65793
Date/time type storage:               floating-point numbers
Float4 argument passing:              by reference
Float8 argument passing:              by reference
Data page checksum version:           2602751502
ssinger@ssinger-laptop:/usr/local/pgsql94wal/bin$

Before this postgres crashed, and seemed to have problems recovering. I might have hit CTRL-C but I didn't do anything drastic like issue a kill -9.

test1 [unknown] 2014-07-08 18:15:18.986 EDT FATAL: the database system is in recovery mode
test1 [unknown] 2014-07-08 18:15:20.482 EDT WARNING: terminating connection because of crash of another server process
test1 [unknown] 2014-07-08 18:15:20.482 EDT DETAIL: The postmaster has commanded this server process to roll back the current transaction and exit, because another server process exited abnormally and possibly corrupted shared memory.
test1 [unknown] 2014-07-08 18:15:20.482 EDT HINT: In a moment you should be able to reconnect to the database and repeat your command.
2014-07-08 18:15:20.483 EDT LOG: all server processes terminated; reinitializing
2014-07-08 18:15:20.720 EDT LOG: database system was interrupted; last known up at 2014-07-08 18:15:15 EDT
2014-07-08 18:15:20.865 EDT LOG: database system was not properly shut down; automatic recovery in progress
2014-07-08 18:15:20.954 EDT LOG: redo starts at 5/41023848
2014-07-08 18:15:23.153 EDT LOG: unexpected pageaddr 4/D8DC6000 in log segment 000100050044, offset 14442496
2014-07-08 18:15:23.153 EDT LOG: redo done at 5/44DC5F60
2014-07-08 18:15:23.153 EDT LOG: last completed transaction was at log time 2014-07-08 18:15:17.874937-04
test2 [unknown] 2014-07-08 18:15:24.247 EDT FATAL: the database system is in recovery mode
test2 [unknown] 2014-07-08 18:15:24.772 EDT FATAL: the database system is in recovery mode
test2 [unknown] 2014-07-08 18:15:25.281 EDT FATAL: the database system is in recovery mode
test1 [unknown] 2014-07-08 18:15:25.547 EDT FATAL: the database system is in recovery mode
test2 [unknown] 2014-07-08 18:15:25.548 EDT FATAL: the database system is in recovery mode
test3 [unknown] 2014-07-08 18:15:25.549 EDT FATAL: the database system is in recovery mode
test4 [unknown] 2014-07-08 18:15:25.557 EDT FATAL: the database system is in recovery mode
test5 [unknown] 2014-07-08 18:15:25.582 EDT FATAL: the database system is in recovery mode
test2 [unknown] 2014-07-08 18:15:25.584 EDT FATAL: the database system is in
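The CRC warning above comes from the reader recomputing a checksum over the on-disk struct contents and comparing it with the value stored at the end; if the reader was built for a different pg_control layout, it checksums the wrong bytes and the comparison fails even when the file is internally consistent, which matches Tom's diagnosis in the replies. A toy Python sketch of that validation pattern (zlib's CRC-32 is a stand-in; PostgreSQL uses its own CRC routines, and the real field layout is ControlFileData in pg_control.h):

```python
import struct
import zlib

# Toy model: a "control file" is some fixed fields followed by a CRC of
# those fields.  Not the real pg_control format.
def make_control_file(version, payload):
    body = struct.pack("<I", version) + payload      # fields covered by the CRC
    crc = zlib.crc32(body) & 0xFFFFFFFF
    return body + struct.pack("<I", crc)

def check_control_file(data):
    body, tail = data[:-4], data[-4:]
    (stored,) = struct.unpack("<I", tail)
    return (zlib.crc32(body) & 0xFFFFFFFF) == stored

good = make_control_file(937, b"checkpoint data")
assert check_control_file(good)
# Any disagreement between the bytes checksummed and the bytes written
# (corruption, or a reader expecting a different layout) trips the
# "calculated CRC checksum does not match" path:
assert not check_control_file(b"\x00" + good[1:])
```

The key property is that the check cannot distinguish "file is corrupt" from "reader expects a different layout", which is why the warning text hedges between the two.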
Re: [HACKERS] 9.4 pg_control corruption
Steve Singer st...@ssinger.info writes: I've encountered a corrupt pg_control file on my 9.4 development cluster. I've mostly been using the cluster for changeset extraction / slony testing. This is a 9.4 (currently commit 6ad903d70a440e + a walsender change discussed in another thread) but would have had the initdb done with an earlier 9.4 snapshot. Somehow or other you missed the update to pg_control version number 942. There's no obvious reason to think that this pg_control file is corrupt on its own terms, but the pg_controldata version you're using expects the 942 layout. The fact that the server wasn't complaining about this suggests that you've not recompiled the server, or at least not xlog.c. Possibly the odd failure to restart indicates that you have a partially updated server executable? regards, tom lane
Re: [HACKERS] 9.4 pg_control corruption
On 07/08/2014 10:14 PM, Tom Lane wrote: Steve Singer st...@ssinger.info writes: I've encountered a corrupt pg_control file on my 9.4 development cluster. I've mostly been using the cluster for changeset extraction / slony testing. This is a 9.4 (currently commit 6ad903d70a440e + a walsender change discussed in another thread) but would have had the initdb done with an earlier 9.4 snapshot. Somehow or other you missed the update to pg_control version number 942. There's no obvious reason to think that this pg_control file is corrupt on its own terms, but the pg_controldata version you're using expects the 942 layout. The fact that the server wasn't complaining about this suggests that you've not recompiled the server, or at least not xlog.c. Possibly the odd failure to restart indicates that you have a partially updated server executable? The server is complaining about that, it started to after the crash (which is why I ran pg_controldata):
ssinger@ssinger-laptop:/usr/local/pgsql94wal/bin$ ./postgres -D ../data
2014-07-08 22:28:57.796 EDT FATAL: database files are incompatible with server
2014-07-08 22:28:57.796 EDT DETAIL: The database cluster was initialized with PG_CONTROL_VERSION 937, but the server was compiled with PG_CONTROL_VERSION 942.
2014-07-08 22:28:57.796 EDT HINT: It looks like you need to initdb.
ssinger@ssinger-laptop:/usr/local/pgsql94wal/bin$
The server seemed fine (and it was 9.4 because I was using 9.4 features). The server crashed. The server performed crash recovery. The server wouldn't start, and pg_controldata shows the attached output. I wasn't recompiling or reinstalling around this time either.
Re: [HACKERS] 9.4 pg_control corruption
Steve Singer st...@ssinger.info writes: On 07/08/2014 10:14 PM, Tom Lane wrote: There's no obvious reason to think that this pg_control file is corrupt on its own terms, but the pg_controldata version you're using expects the 942 layout. The fact that the server wasn't complaining about this suggests that you've not recompiled the server, or at least not xlog.c. The server is complaining about that, it started to after the crash Then you updated your sources, recompiled and reinstalled, but failed to restart the server when you did that. Else it would have complained on the spot. If you had any valuable data in the installation, we could talk about how to get it out; but since you didn't I'd suggest just re-initdb and move on. I don't see anything unexpected here. regards, tom lane
Re: [HACKERS] postgresql.auto.conf and reload
On Tue, Jul 8, 2014 at 11:59 PM, Tom Lane t...@sss.pgh.pa.us wrote: No, ALTER SYSTEM is there now and it needs to work right in its first release. I will go fix this if nobody else does. I am planning to provide an initial patch for this issue in a day or so; hope that is not too late. With Regards, Amit Kapila. EnterpriseDB: http://www.enterprisedb.com
Re: [HACKERS] postgresql.auto.conf and reload
On Wed, Jul 9, 2014 at 6:40 AM, Mark Kirkwood mark.kirkw...@catalyst.net.nz wrote: On 09/07/14 05:13, Josh Berkus wrote: Another could be that during initdb all the uncommented settings be written to the postgresql.auto.conf file rather than to postgresql.conf. I think we can do this by changing code in setup_config() in initdb.c. This will ensure that unless the user is changing settings in a mixed way (some by ALTER SYSTEM and some manually by editing postgresql.conf), there will be no such problem. There is no reasonable way for us to prevent issues for users who are manually changing both pg.conf and pg.auto.conf. Users should stick to one or the other method of management (or, thirdly, using conf.d); if they mix methods, we can't prevent confusion at restart time and we shouldn't try. Yes, but even well behaved users will see this type of error, because initdb uncomments certain values (ones that are dead certs for being changed via ALTER SYSTEM subsequently, like shared_buffers), and then - bang! your next reload gets that "your postgresql.conf contains errors" message. That is the reason why I have suggested up-thread that uncommented values should go to postgresql.auto.conf; that will avoid any such observations for a well-behaved user. With Regards, Amit Kapila. EnterpriseDB: http://www.enterprisedb.com
Re: [HACKERS] postgresql.auto.conf and reload
Amit Kapila amit.kapil...@gmail.com writes: On Wed, Jul 9, 2014 at 6:40 AM, Mark Kirkwood mark.kirkw...@catalyst.net.nz wrote: Yes, but even well behaved users will see this type of error, because initdb uncomments certain values (ones that are dead certs for being changed via ALTER SYSTEM subsequently, like shared_buffers), and then - bang! your next reload gets that "your postgresql.conf contains errors" message. That is the reason why I have suggested up-thread that uncommented values should go to postgresql.auto.conf; that will avoid any such observations for a well-behaved user. Uh, what? That sounds like you are proposing that postgresql.conf itself is a dead letter. Which is not going to fly. We had that conversation already. The right way to fix this is just to avoid processing entries that get overridden later in the configuration file scan. That won't cause anyone to get upset about how their old habits no longer work. regards, tom lane
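Tom's suggested fix amounts to "last value wins": collect every name = value entry from postgresql.conf and then postgresql.auto.conf, and only apply or validate the final setting seen for each name, so a stale value that ALTER SYSTEM has superseded can never trigger a spurious reload error. A minimal Python sketch of that idea (not the actual guc-file.c logic; the parsing here is deliberately naive):

```python
# "Last value wins" config processing: an entry that is overridden later
# in the scan is never applied, so a stale postgresql.conf value that
# postgresql.auto.conf supersedes cannot cause a reload-time error.
# Illustrative sketch only, not PostgreSQL's real GUC parser.
def parse_entries(text):
    entries = []
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()     # drop comments
        if "=" in line:
            name, value = line.split("=", 1)
            entries.append((name.strip(), value.strip()))
    return entries

def effective_settings(postgresql_conf, auto_conf):
    settings = {}
    # auto.conf is scanned last, so its entries override postgresql.conf
    for name, value in parse_entries(postgresql_conf) + parse_entries(auto_conf):
        settings[name] = value
    return settings

conf = "shared_buffers = 128MB\nport = 5432\n"
auto = "# Do not edit this file manually!\nshared_buffers = '4GB'\n"
assert effective_settings(conf, auto)["shared_buffers"] == "'4GB'"
```

In this model the value initdb uncommented in postgresql.conf is simply never looked at once ALTER SYSTEM has written a replacement, which is the behavior Tom argues for.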
Re: [HACKERS] psql: show only failed queries
On Mon, Jun 30, 2014 at 8:33 PM, Pavel Stehule pavel.steh...@gmail.com wrote: 2014-06-30 13:01 GMT+02:00 Abhijit Menon-Sen a...@2ndquadrant.com: At 2014-06-30 12:48:30 +0200, pavel.steh...@gmail.com wrote:

+ <para>
+ Print a failed SQL commands to standard error output. This is
+ equivalent to setting the variable <varname>ECHO</varname> to
+ <literal>errors</literal>.

No "a", just "Print failed SQL commands …".

-<option>-e</option>.
+<option>-e</option>. If set to <literal>error</literal> then only
+failed queries are displayed.

Should be "errors" here, not "error".

 printf(_("  -a, --echo-all           echo all input from script\n"));
+printf(_("  -b --echo-errors         echo failed commands sent to server\n"));
 printf(_("  -e, --echo-queries       echo commands sent to server\n"));

Should have a comma after -b to match other options. Also I would remove "sent to server" from the description: "echo failed commands" is fine.

fixed

$ psql -b
bin/psql: invalid option -- 'b'
Try "psql --help" for more information.

I got this error. ISTM you forgot to add 'b' into the third argument of getopt_long in startup.c.

 <application>psql</application> merely prints all queries as they are sent to the server. The switch for this is
-<option>-e</option>.
+<option>-e</option>. If set to <literal>errors</literal> then only
+failed queries are displayed.

I think that where failed queries are output should be documented here. Otherwise users might misunderstand they are output to standard output like ECHO=all and queries do. It's better to add "The switch for this is <option>-b</option>." into the doc.

+else if (strcmp(prev2_wd, "\\set") == 0)
+{
+    if (strcmp(prev_wd, "ECHO") == 0)
+    {
+        static const char *const my_list[] =
+        {"none", "errors", "queries", "all", NULL};
+
+        COMPLETE_WITH_LIST_CS(my_list);
+    }
+    else if (strcmp(prev_wd, "ECHO_HIDDEN") == 0)
+    {
+        static const char *const my_list[] =
+        {"noexec", "off", "on", NULL};
+
+        COMPLETE_WITH_LIST_CS(my_list);
+    }
+}

I think that adding tab-completions of psql variables is good, but adding those of only ECHO and ECHO_HIDDEN seems half-baked. 
Probably this part should be split into a separate patch. Regards, -- Fujii Masao
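Fujii's "invalid option -- 'b'" failure is the classic getopt pitfall: adding the long option and its switch case is not enough, the new letter must also appear in the option string passed to getopt_long() (its third argument in psql's startup.c). psql itself is C, but the same behavior can be illustrated with Python's standard getopt module as a stand-in:

```python
# Illustration of the reported bug: an option letter missing from the
# option string is rejected by the parser even if code elsewhere is
# ready to handle it.  Python's getopt stands in for C getopt_long().
import getopt

def parse(argv, shortopts):
    try:
        opts, _args = getopt.gnu_getopt(argv, shortopts, ["echo-errors"])
        return dict(opts)
    except getopt.GetoptError:
        return None   # the "invalid option" path psql took

# 'b' absent from the option string: -b is rejected (the reported bug)
assert parse(["-b"], "aes") is None
# 'b' added to the option string: -b now parses
assert parse(["-b"], "abes") == {"-b": ""}
# the long form works either way because it is declared separately
assert parse(["--echo-errors"], "aes") == {"--echo-errors": ""}
```

The fix Fujii points at is exactly the middle case: add 'b' to the short-option string so both `-b` and `--echo-errors` reach the handler.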
Re: [HACKERS] Extending constraint exclusion for implied constraints/conditions
On Tue, Jul 8, 2014 at 7:57 PM, Tom Lane t...@sss.pgh.pa.us wrote: Ashutosh Bapat ashutosh.ba...@enterprisedb.com writes: On Mon, Jul 7, 2014 at 7:37 PM, Tom Lane t...@sss.pgh.pa.us wrote: I doubt it. The extra code isn't the problem so much, it's the extra planning cycles spent trying to make proofs that will usually fail. What you propose will create a combinatorial explosion in the number of proof paths to be considered. I can understand that it will create a combinatorial explosion in the number of predicates that need to be examined by constraint exclusion. I do not understand where the paths get involved here. Constraint exclusion kicks in before paths are created. Perhaps I should not have used the term path here, because I wasn't referring to what the planner calls Paths. I just meant that there will be many more ways to perhaps prove a constraint-exclusion result, and the planner will have to go through them all. (Usually to no avail, because how often do people write queries that are guaranteed to produce no rows?) An example of what I meant by combinatorial explosion is that if a clause mentions K variables, and each of those variables has been equated to M other variables, there are (M+1)^K possible derived clauses, and it looks to me like we'd have to consider them all to catch the sort of situation you're talking about. I agree. I think this is going to require a *lot* of additional planner cycles, and TBH you've not provided any very compelling example of why it's worthwhile. Yeah, if you can exclude a large partition it's a win, but how often is that going to happen in real-world queries? The one example you gave seemed pretty artificial to me, ie unrelated to typical partitioning constraints. Yes, the example is a cooked one to show the problem. But the case can be encountered easily in partitioned tables. 
A partitioned table (having constraints on each partition) with a few table-level constraints can undergo queries with some clauses in the WHERE clause which yield empty results for one or more partitions. Saving some table scans would be worth the time spent in bringing up (M+1)^K derived clauses and examining those. Greg's example about parallel joins or joins between partitioned foreign tables would yield much better results if we have this optimization. But, I think, I will wait till parallel joins or partitioned foreign tables are a reality in PostgreSQL. regards, tom lane -- Best Wishes, Ashutosh Bapat EnterpriseDB Corporation The Postgres Database Company
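Tom's (M+1)^K count can be made concrete: if a clause mentions K variables and each of them has been equated to M other variables, every variable can independently stay as-is or be replaced by any of its M equals, so the prover faces every combination of substitutions. A small Python sketch (the clause representation is made up purely for counting; the real planner works on expression trees):

```python
from itertools import product

# Each of the K variables in a clause can be left alone or replaced by
# any of the M variables it has been equated to, giving (M+1)**K
# candidate derived clauses the prover would have to examine.
def derived_clauses(clause_vars, equivalences):
    choices = [[v] + list(equivalences.get(v, [])) for v in clause_vars]
    return list(product(*choices))

eq = {"a": ["b", "c"], "x": ["y", "z"]}      # M = 2 for each variable
candidates = derived_clauses(["a", "x"], eq)  # K = 2
assert len(candidates) == (2 + 1) ** 2        # (M+1)**K = 9
```

Even modest M and K blow up quickly (M = 4, K = 3 already gives 125 candidates), which is the cost Tom weighs against the rarity of provably empty partitions in real-world queries.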