Re: [HACKERS] better atomics - v0.5

2014-07-08 Thread Amit Kapila
On Thu, Jul 3, 2014 at 10:56 AM, Amit Kapila amit.kapil...@gmail.com
wrote:
 On Tue, Jul 1, 2014 at 4:10 PM, Andres Freund and...@2ndquadrant.com
wrote:

Further review of patch:
1.
/*
 * pg_atomic_test_and_set_flag - TAS()
 *
 * Acquire/read barrier semantics.
 */
STATIC_IF_INLINE_DECLARE bool
pg_atomic_test_set_flag(volatile pg_atomic_flag *ptr);

a. I think acquire and read-barrier semantics aren't equal.
With acquire semantics, the results of the operation are visible before the
results of any operation that appears after it in the code, which means it
applies to both loads and stores. The definition of a read barrier only
orders loads.

So is it right to describe it like that in the comment?

b. I think it adheres to acquire semantics on platforms where those
semantics are supported, and otherwise defaults to full memory ordering.
Example:
#define TAS(lock) (InterlockedCompareExchange(lock, 1, 0))

Is it right to declare that the generic version of the function adheres to
acquire semantics?
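
For illustration only (plain C11, not code from the patch; the flag and
function names are made up): an acquire-ordered test-and-set constrains the
loads and the stores that follow it, whereas a pure read barrier would only
order loads, which is the distinction raised above.

#include <stdatomic.h>
#include <stdbool.h>

static atomic_flag my_flag = ATOMIC_FLAG_INIT;

static bool
acquire_tas(void)
{
	/* Later loads and stores cannot be hoisted above this operation. */
	return atomic_flag_test_and_set_explicit(&my_flag, memory_order_acquire);
}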


2.
bool
pg_atomic_test_set_flag_impl(volatile pg_atomic_flag *ptr)
{
return TAS((slock_t *) &ptr->sema);
#define TAS(lock) (InterlockedCompareExchange(lock, 1, 0))

a. In the above usage (ptr->sema), sema itself is not declared as volatile,
so is it right to pass it this way to an API (InterlockedCompareExchange)
that expects a volatile pointer?

b. Also, shouldn't this implementation be moved to atomics-generic-msvc.h?

3.
/*
 * pg_atomic_unlocked_test_flag - TAS()
 *
 * No barrier semantics.
 */
STATIC_IF_INLINE_DECLARE bool
pg_atomic_unlocked_test_flag(volatile pg_atomic_flag *ptr);

a. There is no setting of the flag here, so referring to TAS() in the
function comment seems wrong.
b. BTW, I am not able to see this function in the C11 spec.

4.
#if !defined(PG_HAS_ATOMIC_TEST_SET_FLAG) && defined(PG_HAVE_ATOMIC_EXCHANGE_U32)
..
#define PG_HAS_ATOMIC_UNLOCKED_TEST_FLAG
static inline bool
pg_atomic_unlocked_test_flag_impl(volatile pg_atomic_flag *ptr)
{
return pg_atomic_read_u32_impl(ptr) == 1;
}


#elif !defined(PG_HAS_ATOMIC_TEST_SET_FLAG) && defined(PG_HAVE_ATOMIC_COMPARE_EXCHANGE_U32)
..
#define PG_HAS_ATOMIC_UNLOCKED_TEST_FLAG
static inline bool
pg_atomic_unlocked_test_flag_impl(volatile pg_atomic_flag *ptr)
{
return (bool) pg_atomic_read_u32_impl(ptr);
}

Is there any reason to keep these two implementations different?
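
(For what it's worth, a single form covering both branches might look like
the sketch below; it assumes the flag counts as set when the underlying
value is nonzero, which is my assumption rather than something the patch
states.)

static inline bool
pg_atomic_unlocked_test_flag_impl(volatile pg_atomic_flag *ptr)
{
	/* nonzero means the flag has been set */
	return pg_atomic_read_u32_impl(ptr) != 0;
}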

5.
atomics-generic-gcc.h
#ifndef PG_HAS_ATOMIC_UNLOCKED_TEST_FLAG
#define PG_HAS_ATOMIC_UNLOCKED_TEST_FLAG
static inline bool
pg_atomic_unlocked_test_flag_impl(volatile pg_atomic_flag *ptr)
{
return ptr->value == 0;
}
#endif

Don't we need to compare it with 1 instead of 0?

6.
atomics-fallback.h
pg_atomic_unlocked_test_flag_impl(volatile pg_atomic_flag *ptr)
{
/*
 * Can't do this in the semaphore based implementation - always return
 * true. Do this in the header so compilers can optimize the test away.
 */
return true;
}

Why can't we implement this in the semaphore-based implementation?

7.
/*
 * pg_atomic_clear_flag - release lock set by TAS()
 *
 * Release/write barrier semantics.
 */
STATIC_IF_INLINE_DECLARE void
pg_atomic_clear_flag(volatile pg_atomic_flag *ptr);

a. Are release and write barriers equivalent?

With release semantics, the results of the operation are visible after the
results of any operation that appears before it in the code.
A write barrier acts as a compiler barrier and orders stores.

I don't think the two definitions are the same.

b. IIUC the current code doesn't have release semantics for unlock/clear.
We are planning to introduce it in this patch; also, this doesn't follow the
spec, which says that clear should have memory_order_seq_cst ordering
semantics.

As per its current usage in the patch (S_UNLOCK), it seems reasonable
to have *release* semantics for this API; however, if we use it for a
generic purpose then it might not be right to define it with release
semantics.

8.
#define PG_HAS_ATOMIC_CLEAR_FLAG
static inline void
pg_atomic_clear_flag_impl(volatile pg_atomic_flag *ptr)
{
/* XXX: release semantics suffice? */
pg_memory_barrier_impl();
pg_atomic_write_u32_impl(ptr, 0);
}

Are we sure that having a memory barrier before clearing the flag will
achieve *release* semantics? As per my understanding, the definition of
release semantics is as below, and the above doesn't seem to follow it:
With release semantics, the results of the operation are visible after
the results of any operation that appears before it in code.
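
For comparison, a true release store in plain C11 looks like the sketch
below (illustration only, not patch code; names are made up). The quoted
implementation instead issues a full barrier followed by a plain store,
which should be at least as strong, but it is a different formulation.

#include <stdatomic.h>

static atomic_uint my_flag_value;

static void
release_clear(void)
{
	/*
	 * All stores issued before this line become visible to any thread
	 * that later reads my_flag_value == 0 with an acquire load.
	 */
	atomic_store_explicit(&my_flag_value, 0, memory_order_release);
}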

9.
static inline uint32
pg_atomic_exchange_u32_impl(volatile pg_atomic_uint32 *ptr, uint32 xchg_)
{
uint32 old;
while (true)
{
old = pg_atomic_read_u32_impl(ptr);
if (pg_atomic_compare_exchange_u32_impl(ptr, &old, xchg_))
break;
}
return old;
}

What is the reason to implement exchange as a compare-and-exchange loop? At
least for Windows I know a corresponding function (InterlockedExchange)
exists. I could not see any other implementation of it.
I think this at least deserves a comment explaining why the implementation
was done this way.
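
For reference, a Windows-specific variant could presumably look like the
sketch below (hypothetical; the function name and the uint32 typedef are
stand-ins, not part of the patch):

#include <windows.h>

typedef unsigned int uint32;	/* stand-in for PostgreSQL's uint32 */

static __inline uint32
pg_atomic_exchange_u32_win32(volatile uint32 *ptr, uint32 xchg_)
{
	/* InterlockedExchange is a full-barrier atomic swap. */
	return (uint32) InterlockedExchange((volatile LONG *) ptr, (LONG) xchg_);
}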

10.
STATIC_IF_INLINE_DECLARE uint32
pg_atomic_fetch_add_u32(volatile pg_atomic_uint32 *ptr, 

Re: [HACKERS] Extending constraint exclusion for implied constraints/conditions

2014-07-08 Thread Ashutosh Bapat
On Mon, Jul 7, 2014 at 11:46 PM, Greg Stark st...@mit.edu wrote:

 On Mon, Jul 7, 2014 at 3:07 PM, Tom Lane t...@sss.pgh.pa.us wrote:
  I doubt it.  The extra code isn't the problem so much, it's the extra
  planning cycles spent trying to make proofs that will usually fail.
  What you propose will create a combinatorial explosion in the number
  of proof paths to be considered.

 Well, not necessarily. You only need to consider constraints on
 matching columns and only when there's a join column on those columns.
 So you could imagine, for example, sorting all the constraints by the
 eclass for the non-const side of their expression, then going through
 them linearly to see where you have multiple constraints on the same
 eclass.

 For what it's worth I think there is a case where this is a common
 optimization. When you have multiple tables partitioned the same way.
 Then you ideally want to be able to turn that from a join of multiple
 appends into an append of multiple joins. This is even more important
 when it comes to a parallelizing executor since it lets you do the
 joins in parallel.


Ah, right. This would also help if the foreign tables come under the
inheritance hierarchy and we want to push joins down to the foreign servers.


 However to get from here to there I guess you would need to turn the
 join of the appends into NxM joins of every pair of subscans and then
 figure out which ones to exclude. That would be pretty nuts. To do it
 reasonably we probably need the partitioning infrastructure we've been
 talking about where Postgres would know what the partitioning key is
 and the order and range of the partitions so it can directly generate
 the matching subjoins in less than n^2 time.
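
To make the transformation concrete, here is an illustrative sketch using
9.4-style inheritance partitioning (all table names are invented for this
example); the last query is what an "append of joins" would be equivalent to:

-- two tables partitioned the same way
CREATE TABLE t1 (id int, payload text);
CREATE TABLE t1_p1 (CHECK (id BETWEEN 1 AND 100)) INHERITS (t1);
CREATE TABLE t1_p2 (CHECK (id BETWEEN 101 AND 200)) INHERITS (t1);

CREATE TABLE t2 (id int, payload text);
CREATE TABLE t2_p1 (CHECK (id BETWEEN 1 AND 100)) INHERITS (t2);
CREATE TABLE t2_p2 (CHECK (id BETWEEN 101 AND 200)) INHERITS (t2);

-- today this is planned as a join of two Appends
SELECT * FROM t1 JOIN t2 USING (id);

-- the transformation discussed above would instead produce the
-- equivalent of an Append of per-partition joins, skipping the
-- cross-partition pairs excluded by the CHECK constraints
SELECT * FROM t1_p1 JOIN t2_p1 USING (id)
UNION ALL
SELECT * FROM t1_p2 JOIN t2_p2 USING (id);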

 --
 greg




-- 
Best Wishes,
Ashutosh Bapat
EnterpriseDB Corporation
The Postgres Database Company


Re: [HACKERS] Modifying update_attstats of analyze.c for C Strings

2014-07-08 Thread Ashoke
As a follow-up question,

I found some varchar columns for which the histogram_bounds values are not
surrounded by double quotes ("") even in the default implementation.
Ex: the c_name column of the Customer table.

I also found histogram_bounds in which only some strings are surrounded by
double quotes and some are not.
Ex: the c_address column of the Customer table.

Why are there such inconsistencies? How is this determined?

Thank you.


On Tue, Jul 8, 2014 at 10:52 AM, Ashoke s.ash...@gmail.com wrote:

 Hi,

 I am trying to implement functionality that is similar to ANALYZE, but
 needs to have different values (the values will be valid and are stored in
 inp->str[][]) for the MCV/histogram bounds in case the column under
 consideration is varchar (C strings). I have written a function
 dummy_update_attstats with the following changes. Other things remain
 the same as in update_attstats of ~/src/backend/commands/analyze.c:


 ---
 {
     ArrayType  *arry;

     if (strcmp(col_type, "varchar") == 0)
         arry = construct_array(stats->stavalues[k],
                                stats->numvalues[k],
                                CSTRINGOID,
                                -2,
                                false,
                                'c');
     else
         arry = construct_array(stats->stavalues[k],
                                stats->numvalues[k],
                                stats->statypid[k],
                                stats->statyplen[k],
                                stats->statypbyval[k],
                                stats->statypalign[k]);

     values[i++] = PointerGetDatum(arry);   /* stavaluesN */
 }
 ---

 and I update the hist_values in the appropriate function as:
 ---
 if (strcmp(col_type, "varchar") == 0)
     hist_values[i] = datumCopy(CStringGetDatum(inp->str[i][j]),
                                false,
                                -2);
 ---

 I tried this based on the following reference :
 http://www.postgresql.org/message-id/attachment/20352/vacattrstats-extend.diff

 My issue is: when I use my way for strings, the MCV/histogram_bounds in
 pg_stats don't have double quotes ("") surrounding the strings. That is,

 if the normal update_attstats is used, the histogram_bounds for TPCH
 nation(n_name) are: "ALGERIA   ","ARGENTINA",...
 If I use dummy_update_attstats as above, the histogram_bounds for TPCH
 nation(n_name) are: ALGERIA,ARGENTINA,...

 This becomes an issue if the string has ',' (commas), as for example in
 the n_comment column of the nation table.

 Could someone point out the problem and suggest a solution?

 Thank you.

 --
 Regards,
 Ashoke




-- 
Regards,
Ashoke


Re: [HACKERS] RLS Design

2014-07-08 Thread Craig Ringer
Hi all

I was jotting notes about this last sleepless night, and was really glad
to see the suggestion of enabling RLS on a table being a requirement for
OR-style quals suggested in the thread when I woke.

The only sane way to do OR-ing of multiple rules is to require that
tables be switched to deny-by-default before RLS quals can be added that
then selectively enable access.

The next step is DENY rules that override ALLOW rules, and are also
ORed, so any DENY rule overrides any ALLOW rule. Like in ACLs. But that
can come later - I just think room for it should be left in any
catalog definition.
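
As a concrete illustration of how that composition would behave (the policy
setup is hypothetical and none of this syntax exists; only the resulting
WHERE clause is meant to be instructive):

CREATE TABLE docs (id int, owner text, dept text, embargoed bool);

-- Suppose two ALLOW rules (owner = current_user, dept = 'public') and
-- one DENY rule (embargoed).  The effective row filter would be the
-- equivalent of:
SELECT *
FROM docs
WHERE (owner = current_user OR dept = 'public')  -- ALLOW rules ORed together
  AND NOT embargoed;                             -- any DENY rule overrides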

My concern with the talk of policies, etc, is with making it possible to
implement this for 9.5. I'd really like to see a robust declarative
row-security framework with access policies - but I'm not sure it's
a good idea to try to assemble policies directly out of low-level row
security predicates.

Tying things into a policy model that isn't tried or tested might create
more problems than it solves unless we implement multiple real-world
test cases on top of the model to show it works.

For how I think we should be pursuing this in the long run, take a look
at how Teradata does it, with hierarchical and non-hierarchical rules -
basically bitmaps or thresholds - that get grouped into access policies.
It's a very good way to abstract the low level stuff. If we have low
level table predicate filters, we can build this sort of thing on top.


For 9.5, unless the basics turn out to be way easier than they look and
it's all done soon in the release process, surely we should be sticking
to just getting the basics of row security in place? Leaving room for
enhancement, sure, but sticking to the core feature which to my mind is:

- A row security on/off flag for a table;

- Room in the catalogs for multiple row security rules per table
  and a type flag for them. The initial type flag, for ALLOW rules,
  specifies that all ALLOW rules be ORed together.

- Syntax for creating and dropping row security predicates. If there
  can be multiple ones per table they'll need names, like we have with
  triggers, indexes, etc.

- psql support for listing row security predicates on a table if running
  as superuser or if you've been explicitly GRANTed access to the
  catalog table listing row security quals.

- The hooks for contribs to inject their own row security rules. The
  API will need a tweak - right now it assumes these rules are ANDed
  with any row security predicates in the catalogs, but we'd want the
  option of treating them as ALLOW or DENY rules to get ORed with the
  rest of the set *or* as a pre-filter predicate like currently.

- A row-security-exempt right, at the user-level,
  to assuage the concerns about malicious predicates. I maintain that
  in the first rev this should be simple: superuser is row security
  exempt. I don't think I'm going to win that one though, so a
  user/role attribute that makes the role ignore row security
  seems like the next simplest option.

- A way to test whether the current user is row-security exempt
  so pg_dump can complain unless explicitly told it's allowed
  to do a selective dump via a cmdline option;

Plus a number of fixes:

- Fixing the security barrier view issue with row level lock pushdown
  that's breaking the row security regression tests;

- Enhancing plan cache invalidation so that row-security exempt-ness
  of a user is part of the plancache key;

- Adding session state like current_user to portals, so security_barrier
  functions returning refcursor, and cursors created before SET SESSION
  AUTHORIZATION or SET ROLE, get the correct values when they use
  session information like current_user

Note that this doesn't even consider the with check option style
write-filtering side of row security and the corresponding challenges
with the semantics around RETURNING.

It's already a decent sized amount of work on top of the existing row
security patch.

If we start adding policy groups, etc, this will never get done.


-- 
 Craig Ringer   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services




[HACKERS] Optimization for updating foreign tables in Postgres FDW

2014-07-08 Thread Etsuro Fujita
Attached is a WIP patch for the following:

/*
 * postgresPlanForeignModify
 *  Plan an insert/update/delete operation on a foreign table
 *
 * Note: currently, the plan tree generated for UPDATE/DELETE will always
 * include a ForeignScan that retrieves ctids (using SELECT FOR UPDATE)
 * and then the ModifyTable node will have to execute individual remote
 * UPDATE/DELETE commands.  If there are no local conditions or joins
 * needed, it'd be better to let the scan node do UPDATE/DELETE RETURNING
 * and then do nothing at ModifyTable.  Room for future optimization ...
 */

In the patch, postgresPlanForeignModify has been modified so that if, in
addition to the above condition, the following are satisfied, then the
ForeignScan and ModifyTable nodes will work that way:

 - There are no local BEFORE/AFTER triggers.
 - In UPDATE it's safe to evaluate expressions to assign to the target
columns on the remote server.

Here is a simple performance test.

On remote side:
postgres=# create table t (id serial primary key, inserted timestamp
default clock_timestamp(), data text);
CREATE TABLE
postgres=# insert into t(data) select random() from generate_series(0,
9);
INSERT 0 10
postgres=# vacuum t;
VACUUM

On local side:
postgres=# create foreign table ft (id integer, inserted timestamp, data
text) server myserver options (table_name 't');
CREATE FOREIGN TABLE

Unpatched:
postgres=# explain analyze verbose delete from ft where id  1;
  QUERY PLAN
---
 Delete on public.ft  (cost=100.00..162.32 rows=910 width=6) (actual
time=1275.255..1275.255 rows=0 loops=1)
   Remote SQL: DELETE FROM public.t WHERE ctid = $1
   ->  Foreign Scan on public.ft  (cost=100.00..162.32 rows=910 width=6)
(actual time=1.180..52.095 rows= loops=1)
 Output: ctid
 Remote SQL: SELECT ctid FROM public.t WHERE ((id  1)) FOR
UPDATE
 Planning time: 0.112 ms
 Execution time: 1275.733 ms
(7 rows)

Patched (Note that the DELETE command has been pushed down.):
postgres=# explain analyze verbose delete from ft where id  1;
QUERY PLAN
---
 Delete on public.ft  (cost=100.00..162.32 rows=910 width=6) (actual
time=0.006..0.006 rows=0 loops=1)
   ->  Foreign Scan on public.ft  (cost=100.00..162.32 rows=910 width=6)
(actual time=0.001..0.001 rows=0 loops=1)
 Output: ctid
 Remote SQL: DELETE FROM public.t WHERE ((id  1))
 Planning time: 0.101 ms
 Execution time: 8.808 ms
(6 rows)

I'll add this to the next CF.  Comments are welcome.

Thanks,

Best regards,
Etsuro Fujita
*** a/contrib/postgres_fdw/deparse.c
--- b/contrib/postgres_fdw/deparse.c
***
*** 189,198  is_foreign_expr(PlannerInfo *root,
	if (!foreign_expr_walker((Node *) expr, &glob_cxt, &loc_cxt))
return false;
  
-   /* Expressions examined here should be boolean, ie noncollatable */
-   Assert(loc_cxt.collation == InvalidOid);
-   Assert(loc_cxt.state == FDW_COLLATE_NONE);
- 
/*
 * An expression which includes any mutable functions can't be sent over
 * because its result is not stable.  For example, sending now() remote
--- 189,194 
***
*** 928,933  deparseUpdateSql(StringInfo buf, PlannerInfo *root,
--- 924,982 
  }
  
  /*
+  * deparse remote UPDATE statement
+  *
+  * The statement text is appended to buf, and we also create an integer List
+  * of the columns being retrieved by RETURNING (if any), which is returned
+  * to *retrieved_attrs.
+  */
+ void
+ deparseDirectUpdateSql(StringInfo buf, PlannerInfo *root,
+  Index rtindex, Relation rel,
+  List *remote_conds,
+  List *targetlist,
+  List *targetAttrs, List *returningList,
+  List **retrieved_attrs)
+ {
+   RelOptInfo *baserel = root->simple_rel_array[rtindex];
+   List   *params_list = NIL;
+   deparse_expr_cxt context;
+   boolfirst;
+   ListCell   *lc;
+ 
+   /* Set up context struct for recursion */
+   context.root = root;
+   context.foreignrel = baserel;
+   context.buf = buf;
+   context.params_list = NULL;
+ 
+   appendStringInfoString(buf, "UPDATE ");
+   deparseRelation(buf, rel);
+   appendStringInfoString(buf, " SET ");
+ 
+   first = true;
+   foreach(lc, targetAttrs)
+   {
+   int attnum = lfirst_int(lc);
+   TargetEntry *tle = get_tle_by_resno(targetlist, attnum);
+ 
+   if (!first)
+   

Re: [HACKERS] Modifying update_attstats of analyze.c for C Strings

2014-07-08 Thread Ashoke
OK, I was able to figure out that when strings contain spaces,
PostgreSQL surrounds them with double quotes.
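
A minimal illustration: the quoting comes from the array output function,
which only double-quotes elements containing characters such as spaces,
commas, or braces.

SELECT ARRAY['ARGENTINA', 'ALGERIA   ', 'a,b'];
--  {ARGENTINA,"ALGERIA   ","a,b"}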


On Tue, Jul 8, 2014 at 12:04 PM, Ashoke s.ash...@gmail.com wrote:

 As a follow-up question,

 I found some varchar columns for which the histogram_bounds values are not
 surrounded by double quotes ("") even in the default implementation.
 Ex: the c_name column of the Customer table.

 I also found histogram_bounds in which only some strings are surrounded by
 double quotes and some are not.
 Ex: the c_address column of the Customer table.

 Why are there such inconsistencies? How is this determined?

 Thank you.


 On Tue, Jul 8, 2014 at 10:52 AM, Ashoke s.ash...@gmail.com wrote:

 Hi,

  I am trying to implement functionality that is similar to ANALYZE, but
  needs to have different values (the values will be valid and are stored in
  inp->str[][]) for the MCV/histogram bounds in case the column under
  consideration is varchar (C strings). I have written a function
  dummy_update_attstats with the following changes. Other things remain
  the same as in update_attstats of ~/src/backend/commands/analyze.c:


  ---
  {
      ArrayType  *arry;

      if (strcmp(col_type, "varchar") == 0)
          arry = construct_array(stats->stavalues[k],
                                 stats->numvalues[k],
                                 CSTRINGOID,
                                 -2,
                                 false,
                                 'c');
      else
          arry = construct_array(stats->stavalues[k],
                                 stats->numvalues[k],
                                 stats->statypid[k],
                                 stats->statyplen[k],
                                 stats->statypbyval[k],
                                 stats->statypalign[k]);

      values[i++] = PointerGetDatum(arry);   /* stavaluesN */
  }
  ---

  and I update the hist_values in the appropriate function as:
  ---
  if (strcmp(col_type, "varchar") == 0)
      hist_values[i] = datumCopy(CStringGetDatum(inp->str[i][j]),
                                 false,
                                 -2);
  ---

 I tried this based on the following reference :
 http://www.postgresql.org/message-id/attachment/20352/vacattrstats-extend.diff

  My issue is: when I use my way for strings, the MCV/histogram_bounds in
  pg_stats don't have double quotes ("") surrounding the strings. That is,

  if the normal update_attstats is used, the histogram_bounds for TPCH
  nation(n_name) are: "ALGERIA   ","ARGENTINA",...
  If I use dummy_update_attstats as above, the histogram_bounds for TPCH
  nation(n_name) are: ALGERIA,ARGENTINA,...

  This becomes an issue if the string has ',' (commas), as for example in
  the n_comment column of the nation table.

 Could someone point out the problem and suggest a solution?

 Thank you.

 --
 Regards,
 Ashoke




 --
 Regards,
 Ashoke







-- 
Regards,
Ashoke


[HACKERS] query_is_distinct_for does not take into account set returning functions

2014-07-08 Thread David Rowley
Over here -
http://www.postgresql.org/message-id/6351.1404663...@sss.pgh.pa.us Tom
noted that create_unique_path did not check for set returning functions.

Tom Wrote:
 I notice that create_unique_path is not paying attention to the question
 of whether the subselect's tlist contains SRFs or volatile functions.
 It's possible that that's a pre-existing bug.

I looked at this a bit and I can confirm that it does not behave as it
should do. Take the following as an example:

create table x (id int primary key);
create table y (n int not null);

insert into x values(1);
insert into y values(1);

select * from x where (id,id) in(select n,generate_series(1,2) / 10 + 1 g
from y);
 id
----
  1
(1 row)

select * from x where (id,id) in(select n,generate_series(1,2) / 10 + 1 g
from y group by n);
 id
----
  1
  1
(2 rows)

The 2nd query does group by n, so query_is_distinct_for returns true,
therefore the outer query thinks it's OK to perform an INNER JOIN rather
than a SEMI join, which in this case produces an extra record.

I think we should probably include the logic to test for set returning
functions into query_is_distinct_for.

The attached fixes the problem.

Regards

David Rowley


query_is_distinct_for_fix.patch
Description: Binary data



Re: [HACKERS] Allowing join removals for more join types

2014-07-08 Thread David Rowley
On Tue, Jul 8, 2014 at 4:28 AM, Tom Lane t...@sss.pgh.pa.us wrote:

 David Rowley dgrowle...@gmail.com writes:
  On Mon, Jul 7, 2014 at 4:15 AM, Tom Lane t...@sss.pgh.pa.us wrote:
  I poked around to see if we didn't have some code already for that, and
  soon found that not only do we have such code
 (equality_ops_are_compatible)
  but actually almost this entire patch duplicates logic that already
 exists
  in optimizer/util/pathnode.c, to wit create_unique_path's subroutines
  query_is_distinct_for et al.  So I'm thinking what this needs to turn
 into
  is an exercise in refactoring to allow that logic to be used for both
  purposes.

  Well, it seems that might just reduce the patch size a little!
  I currently have this half hacked up to use query_is_distinct_for, but I
  see there's no code that allows Const's to exist in the join condition. I
  had allowed for this in groupinglist_is_unique_on_restrictinfo() and I
  tested it worked in a regression test (which now fails). I think to fix
  this, all it would take would be to modify query_is_distinct_for to take
 a
  list of Node's rather than a list of column numbers then just add some
  logic that skips if it's a Const and checks it as it does now if it's a
 Var
  Would you see a change of this kind a valid refactor for this patch?

 I'm a bit skeptical as to whether testing for that case is actually worth
 any extra complexity.  Do you have a compelling use-case?  But anyway,
 if we do want to allow it, why does it take any more than adding a check
 for Consts to the loops in query_is_distinct_for?  It's the targetlist
 entries where we'd want to allow Consts, not the join conditions.


I don't really have a compelling use-case, but you're right, it's just a
Const check in query_is_distinct_for(), it seems simple enough so I've
included that in my refactor of the patch to use query_is_distinct_for().
This allows the regression tests all to pass again.

I've included an updated patch and a delta patch.

Now a couple of things to note:

1. The fast path code that existed in join_is_removable() for subqueries
when the subquery had no group or distinct clause is now gone. I wasn't too
sure that I wanted to assume too much about what query_is_distinct_for may
do in the future, and I thought that if I included some fast-path logic in
join_is_removable(), one day it might fast path wrongly. Perhaps we could
protect against this with a small note in query_is_distinct_for().

2. The patch I submitted here
http://www.postgresql.org/message-id/caaphdvrfvkh0p3faoogcckby7fecj9qfanklkx7mwsbcxy2...@mail.gmail.com
if that gets accepted then it makes the check for set returning functions
in join_is_removable void.

Regards

David Rowley


subquery_leftjoin_removal_v1.5.delta.patch
Description: Binary data


subquery_leftjoin_removal_v1.5.patch
Description: Binary data



Re: [HACKERS] [REVIEW] Re: Fix xpath() to return namespace definitions

2014-07-08 Thread Ali Akbar
 I don't know enough about XML/XPATH to know if this is a good idea or not,


Actually, because of the namespace problem, xpath() currently returns the
wrong result (even worse: invalid XML!). So this patch corrects the
behavior of xpath().


 but I did go read the documentation of xmlCopyNode(), and I notice it says
 the second argument is

 extended: if 1 do a recursive copy (properties, namespaces and children
 when applicable) if 2 copy properties and namespaces (when applicable)

 Do we really want it to copy children?  Perhaps the value should be 2?
 (I have no idea, I'm just raising the question.)


If we put 1 as the parameter, the resulting Node will link to the existing
children. In this case, the namespace problem for the children might still
exist. If any of the children uses different namespace(s) than the
parent, the namespace definition will not be copied correctly.

Just to be safe, in the patch 1 is passed as the second argument.

Ideally, we could traverse the Node object and generate the XML accordingly,
but that is a lot of work and a lot of duplication of libxml's code. Maybe
we should push this upstream to libxml?

I also notice that it says

 Returns: a new #xmlNodePtr, or NULL in case of error.

 and the patch is failing to cover the error case.  Most likely, that
 would represent out-of-memory, so it definitely seems to be something
 we should account for.


Ah, overlooked that.

Skimming through libxml2's source, it looks like there are some other cases
besides out-of-memory. I will update the patch to handle these cases.
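
A minimal sketch of the kind of check that seems needed (illustrative only;
cur_node is a placeholder for the node being copied, and the exact error
code and message are my assumptions, not necessarily what the final patch
will use):

xmlNodePtr	copy = xmlCopyNode(cur_node, 1);	/* 1 = recursive copy */

if (copy == NULL)
	ereport(ERROR,
			(errcode(ERRCODE_OUT_OF_MEMORY),
			 errmsg("could not copy xml node")));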

Thanks for the review.

-- 
Ali Akbar


Re: [HACKERS] [RFC: bug fix?] Connection attempt block forever when the synchronous standby is not running

2014-07-08 Thread MauMau

From: Amit Kapila amit.kapil...@gmail.com

On Fri, Jul 4, 2014 at 7:29 PM, MauMau maumau...@gmail.com wrote:

[Hypothesis]
Why does the connection processing emit WAL?

Probably, it did page-at-a-time vacuum during access to pg_database and

pg_authid for client authentication.  src/backend/access/heap/README.HOT
describes:

I agree with your analysis that it can happen during connection
attempt.


Thank you.  I'm relieved the cause seems correct.




But the customer could not reproduce the problem when he performed the

same archive recovery from the same base backup again.  Why?  I guess the
autovacuum daemon vacuumed the system catalogs before he attempted to
connect to the database.



One way to confirm could be to perform the archive recovery by
disabling autovacuum.


Yeah, I thought of that too.  Unfortunately, the customer deleted the
base backup for testing.




Another thing which I am wondering about is whether the same can happen
even for a read-only transaction (in case someone does a SELECT which
prunes the page).


I'm afraid about that, too.

Regards
MauMau





Re: [HACKERS] [RFC: bug fix?] Connection attempt block forever when the synchronous standby is not running

2014-07-08 Thread MauMau

From: Rajeev rastogi rajeev.rast...@huawei.com
As of now there is no solution for this in PostgreSQL, but I had submitted a
patch, "Standalone synchronous master", in the 9.4 2014-01 CommitFest, which
was rejected because of some issues. This patch was meant to degrade the
synchronous level of the master if all synchronous standbys are down.
I plan to resubmit this with a better design sometime in 9.5.


Although I only read some mails of that thread, I'm sure your proposal is 
what many people would appreciate.  Your new operation mode is equivalent to 
the maximum availability mode of Oracle Data Guard, isn't it?  I'm looking 
forward to it.  Good luck.



==
Maximum availability
This protection mode provides the highest level of data protection that is 
possible without compromising the availability of a primary database. 
Transactions do not commit until all redo data needed to recover those 
transactions has been written to the online redo log and to at least one 
standby database. If the primary database cannot write its redo stream to at 
least one standby database, it effectively switches to maximum performance 
mode to preserve primary database availability and operates in that mode 
until it is again able to write its redo stream to a standby database.


This protection mode ensures zero data loss except in the case of certain 
double faults, such as failure of a primary database after failure of the 
standby database.


Maximum performance
This is the default protection mode. It provides the highest level of data 
protection that is possible without affecting the performance of a primary 
database. This is accomplished by allowing transactions to commit as soon as 
all redo data generated by those transactions has been written to the online 
log. Redo data is also written to one or more standby databases, but this is 
done asynchronously with respect to transaction commitment, so primary 
database performance is unaffected by delays in writing redo data to the 
standby database(s).


This protection mode offers slightly less data protection than maximum 
availability mode and has minimal impact on primary database performance.

==


Regards
MauMau





Re: [HACKERS] [RFC: bug fix?] Connection attempt block forever when the synchronous standby is not running

2014-07-08 Thread MauMau

From: Tom Lane t...@sss.pgh.pa.us

problem, the user might not realize he's got one until he starts to wonder
why autovac/autoanalyze aren't working.


In autovacuum.c, autovacuum workers avoid waiting for the standby by doing:

/*
 * Force synchronous replication off to allow regular maintenance even if
 * we are waiting for standbys to connect. This is important to ensure we
 * aren't blocked from performing anti-wraparound tasks.
 */
if (synchronous_commit > SYNCHRONOUS_COMMIT_LOCAL_FLUSH)
	SetConfigOption("synchronous_commit", "local",
					PGC_SUSET, PGC_S_OVERRIDE);

Regards
MauMau





Re: [HACKERS] [RFC: bug fix?] Connection attempt block forever when the synchronous standby is not running

2014-07-08 Thread MauMau

From: Tom Lane t...@sss.pgh.pa.us

Andres Freund and...@2ndquadrant.com writes:

On 2014-07-07 09:57:20 -0400, Tom Lane wrote:

Well, see the comment that explains why the logic is like this now:



I think we should 'simply' make sequences assign a toplevel xid - then
we can get rid of that special case in RecordTransactionCommit(). And I
think the performance benefit of not having to wait on XLogFlush() for
readonly xacts due to hot prunes far outweighs the decrease due to the
xid assignment/commit record.  I don't think that nextval()s are called
overly much without a later xid assigning statement.


Yeah, that could well be true.  I'm not sure if there are any other cases
where we have non-xid-assigning operations that are considered part of
what has to be flushed before reporting commit; if there are not, I'd
be okay with changing nextval() this way.


Thank you all for letting me know your thoughts.  I understood and agree 
that read-only transactions, including the connection establishment one, 
should not wait for the standby nor the XLOG flush at commit, and the 
current documentation/specification should not be changed.


I'll consider how to fix this problem, learning the code, then I'll ask for 
review.  I'd like to submit the patch for next CF if possible.


From: Fujii Masao masao.fu...@gmail.com
Sounds good direction. One question is: Can RecordTransactionCommit() 
avoid

waiting for not only replication but also local WAL flush safely in
such read-only
transaction case?


I'd appreciate any opinion on this, too.

Regards
MauMau






Re: [HACKERS] tweaking NTUP_PER_BUCKET

2014-07-08 Thread Robert Haas
On Wed, Jul 2, 2014 at 8:13 PM, Tomas Vondra t...@fuzzy.cz wrote:
 I propose dynamic increase of the nbuckets (up to NTUP_PER_BUCKET=1)
 once the table is built and there's free space in work_mem. The patch
 mentioned above makes implementing this possible / rather simple.

Another idea would be to start with NTUP_PER_BUCKET=1 and then, if we
run out of memory, increase NTUP_PER_BUCKET.  I'd like to think that
the common case is that work_mem will be large enough that everything
fits; and if you do it that way, then you save yourself the trouble of
rehashing later, which as you point out might lose if there are only a
few probes.  If it turns out that you run short of memory, you can
merge pairs of buckets up to three times, effectively doubling
NTUP_PER_BUCKET each time.

Yet another idea is to stick with your scheme, but do the dynamic
bucket splits lazily.  Set a flag on each bucket indicating whether or
not it needs a lazy split.  When someone actually probes the hash
table, if the flag is set for a particular bucket, move any tuples
that don't belong as we scan the bucket.  If we reach the end of the
bucket chain, clear the flag.

Not sure either of these are better (though I kind of like the first
one) but I thought I'd throw them out there...

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] [bug fix or improvement?] Correctly place DLLs for ECPG apps in bin folder

2014-07-08 Thread MauMau

From: Asif Naeem anaeem...@gmail.com

Other than my pervious comments, patch looks good to me. Thanks.


Thanks for reviewing.  I understood that your previous comment was to
suggest copying the desired DLLs directly from the Release/Debug folder to
bin.  But I'm afraid it cannot be done cleanly...  CopySolutionOutput()
copies all DLLs to the lib folder with no exception, as follows:


  if ($1 == 1)
  {
   $dir = "bin";
   $ext = "exe";
  }
  elsif ($1 == 2)
  {
   $dir = "lib";
   $ext = "dll";
  }

It seems like I would have to do something like the following, listing the
relevant DLL names anyway.  I don't think this is cleaner.


  if ($1 == 1)
  {
   $dir = "bin";
   $ext = "exe";
  }
  elsif ($1 == 2 && /* the project is libpq, libecpg, etc. */)
  {
   $dir = "bin";
   $ext = "dll";
  }
  elsif ($1 == 2)
  {
   $dir = "lib";
   $ext = "dll";
  }

What do you think?  Am I misunderstanding your suggestion?

Regards
MauMau





Re: [HACKERS] tweaking NTUP_PER_BUCKET

2014-07-08 Thread Tomas Vondra
On 8 July 2014, 14:49, Robert Haas wrote:
 On Wed, Jul 2, 2014 at 8:13 PM, Tomas Vondra t...@fuzzy.cz wrote:
 I propose dynamic increase of the nbuckets (up to NTUP_PER_BUCKET=1)
 once the table is built and there's free space in work_mem. The patch
 mentioned above makes implementing this possible / rather simple.

 Another idea would be to start with NTUP_PER_BUCKET=1 and then, if we
 run out of memory, increase NTUP_PER_BUCKET.  I'd like to think that
 the common case is that work_mem will be large enough that everything
 fits; and if you do it that way, then you save yourself the trouble of
 rehashing later, which as you point out might lose if there are only a
 few probes.  If it turns out that you run short of memory, you can
 merge pairs of buckets up to three times, effectively doubling
 NTUP_PER_BUCKET each time.

Maybe. I'm not against setting NTUP_PER_BUCKET=1, but with large outer
relations it may be way cheaper to use higher NTUP_PER_BUCKET values
instead of increasing the number of batches (resulting in repeated scans
of the outer table). I think it's important (but difficult) to keep these
things somehow balanced.

With large work_mem values the amount of memory for buckets may be quite
significant. E.g. 800MB work_mem may easily give ~120MB of memory taken by
buckets with NTUP_PER_BUCKET=1. With NTUP_PER_BUCKET=4 it's just ~30MB.


 Yet another idea is to stick with your scheme, but do the dynamic
 bucket splits lazily.  Set a flag on each bucket indicating whether or
 not it needs a lazy split.  When someone actually probes the hash
 table, if the flag is set for a particular bucket, move any tuples
 that don't belong as we scan the bucket.  If we reach the end of the
 bucket chain, clear the flag.

I don't think a lazy scheme makes much sense here. There are two clearly
separated phases - loading the table and search.

Also, it seems to me it can't work as described. Say we build a table and
then find out we need a table 4x the size. If I understand your proposal
correctly, you'd just resize buckets array, copy the existing buckets
(first 1/4 buckets) and set 'needs_split' flags. Correct?

Well, the problem is that while scanning a bucket you already need to know
which tuples belong there. But with the lazy approach, the tuples might
still be in the old (not yet split) buckets. So you'd have to search the
current bucket and all buckets that were not split yet. Or did I miss
something?

Tomas





Re: [HACKERS] Doing better at HINTing an appropriate column within errorMissingColumn()

2014-07-08 Thread Michael Paquier
On Sat, Jul 5, 2014 at 12:46 AM, Abhijit Menon-Sen a...@2ndquadrant.com wrote:
 At 2014-07-02 15:51:08 -0700, p...@heroku.com wrote:

 Attached revision factors in everyone's concerns here, I think.

 Is anyone planning to review Peter's revised patch?

I have been doing some functional tests, and looked quickly at the
code to understand what it does:
1) Compiles without warnings, passes regression tests
2) The checking process goes through all the existing columns of a
relation even if a difference of 1 with some other column(s) has already
been found. As we try to limit the number of hints returned, this
seems like a waste of resources.
3) distanceName could be improved, for example by checking the string
lengths of the target and source columns and immediately rejecting the
match if, say, the source string is more than double or less than half
the length of the target (see the sketch after this list).
4) This is not nice, could it be possible to remove the stuff from varlena.c?
+/* Expand each Levenshtein distance variant */
+#include levenshtein.c
+#define LEVENSHTEIN_LESS_EQUAL
+#include levenshtein.c
+#undef LEVENSHTEIN_LESS_EQUAL
Part of the same comment: only varstr_leven_less_equal is used to
calculate the distance, so should we really move varstr_leven to core?
This clearly needs to be reworked rather than being just a copy-paste of
the things in fuzzystrmatch.
The flag LEVENSHTEIN_LESS_EQUAL should be left within fuzzystrmatch, I think.
5) Do we want hints on system columns as well? For example here we
could get tableoid as column hint:
=# select tablepid from foo;
ERROR:  42703: column tablepid does not exist
LINE 1: select tablepid from foo;
   ^
LOCATION:  errorMissingColumn, parse_relation.c:3123
Time: 0.425 ms
6) Sometimes no hints are returned... Even in simple cases like this one:
=# create table foo (aa int, bb int);
CREATE TABLE
=# select ab from foo;
ERROR:  42703: column ab does not exist
LINE 1: select ab from foo;
   ^
LOCATION:  errorMissingColumn, parse_relation.c:3123
7) Performance penalty with a table with 1600 columns:
=# CREATE FUNCTION create_long_table(tabname text, columns int)
RETURNS void
LANGUAGE plpgsql
as $$
declare
  first_col bool = true;
  count int;
  query text;
begin
  query := 'CREATE TABLE ' || tabname || ' (';
  for count in 0..columns loop
query := query || 'col' || count || ' int';
if count < columns then
  query := query || ', ';
end if;
  end loop;
  query := query || ')';
  execute query;
end;
$$;
=# SELECT create_long_table('aa', 1599);
 create_long_table
---

(1 row)
Then tested queries like that: SELECT col888a FROM aa;
Patched version: 2.100ms~2.200ms
master branch (6048896): 0.956 ms~0.990 ms
So the performance impact seems limited.
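
Regarding point 3 above, the kind of early exit I have in mind would look
roughly like this (a sketch only; the real distanceName signature and the
2x threshold are assumptions):

#include <stdbool.h>
#include <string.h>

/* Cheap pre-filter before computing the Levenshtein distance. */
static bool
plausible_match(const char *source, const char *target)
{
	size_t		slen = strlen(source);
	size_t		tlen = strlen(target);

	/* Reject when one string is more than twice as long as the other. */
	return slen <= 2 * tlen && tlen <= 2 * slen;
}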

Regards,
-- 
Michael




Re: [HACKERS] RLS Design

2014-07-08 Thread Kohei KaiGai
2014-07-06 14:19 GMT+09:00 Stephen Frost sfr...@snowman.net:
 Kaigai,

 * Kouhei Kaigai (kai...@ak.jp.nec.com) wrote:
  Can you clarify where this is coming from..?  It sounds like you're
  referring to an existing implementation and, if so, it'd be good to get
  more information on how that works exactly.

 Oracle VPD - Multiple Policies for Each Table, View, or Synonym
 http://docs.oracle.com/cd/B19306_01/network.102/b14266/apdvpoli.htm#i1008351

 It says - Note that all policies applied to a table are enforced with AND 
 syntax.

 While I'm not against using this as an example to consider, it's much
 more complex than what we're talking about here- and it supports
 application contexts which allow groups of RLS rights to be applied or
 not applied; essentially it allows both AND and OR for sets of RLS
 policies, along with default policies which are applied no matter
 what.

 Not only Oracle VPD, it fits attitude of defense in depth.
 Please assume a system that installs network firewall, unix permissions
 and selinux. If somebody wants to reference an information asset within
 a file, he has to connect the server from the network address being allowed
 by the firewall configuration AND both of DAC and MAC has to allow his
 access.

 These are not independent systems and your argument would apply to our
 GRANT system also, which I hope it's agreed would make it far less
 useful.  Note also that SELinux brings in another complexity- it needs
 to make system calls out to check the access.

 Usually, we have to pass all the access control to reference the target
 information, not one of the access control stuffs being installed.

 This is true in some cases, and not in others.  Only one role you are a
 member of needs to have access to a relation, not all of them.  There
 are other examples of 'OR'-style security policies, this is merely one.
 I'm simply not convinced that it applies in the specific case we're
 talking about.

 In the end, I expect that either way people will be upset because they
 won't be able to specify fully which should be AND vs. which should be
 OR with the kind of flexibility other systems provide.  What I'm trying
 to get to is an initial implementation which is generally useful and is
 able to add such support later.

  This is similar to how roles work- your overall access includes all access
  granted to any roles you are a member of. You don't need SELECT rights 
  granted
  to every role you are a member of to select from the table. Additionally,
  if an admin wants to AND the quals together then they can simply create
  a policy which does that rather than have 2 policies.
 
 It seems to me a pain on database administration, if we have to pay attention
 not to conflict each RLS-policy.

 This notion of 'conflict' doesn't make much sense to me.  What is
 'conflicting' here?  Each policy would simply need to stand on its own
 for the role which it's being applied to.  That's very simple and
 straight-forward.

 I expect 90% of RLS-policy will be configured
 to PUBLIC user, to apply everybody same rules on access. In this case, DBA
 has to ensure the target table has no policy or existing policy does not
 conflict with the new policy to be set.
 I don't think it is a good idea to enforce DBA these checks.

 If the DBA only uses PUBLIC then they have to ensure that each policy
 they set up for PUBLIC can stand on its own- though, really, I expect if
 they go that route they'd end up with just one policy that calls a
 stored procedure...

   You are suggesting instead that if application 2 sets up policies on the
  table and then application 1 adds another policy that it should reduce what
  application 2's users can see?  That doesn't make any sense to me.  I'd
  actually expect these applications to at least use different roles anyway,
  which means they could each have a single role specific policy which only
  returns what that application is allowed to see.
 
 I don't think this assumption is reasonable.
 Please expect two applications: app-X that is a database security product
 to apply access control based on remote ip-address of the client for any
 table accesses by any database roles. app-Y that is a usual enterprise
 package for daily business data, with RLS-policy.
 What is the expected behavior in this case?

 That the DBA manage the rights on the tables.  I expect that will be
 required for quite a while with PG.  It's nice to think of these
 application products that will manage all access for users by setting up
 their own policies, but we have yet to even discuss how they would have
 appropriate rights on the table to be able to do so (and to not
 interfere with each other..).

 Let's at least get something which is generally useful in.  I'm all for
 trying to plan out how to get there and would welcome suggestions you
 have which are specific to PG on what we could do here (I'm not keen on
 just trying to mimic another product completely...), but at the level
 

Re: [HACKERS] tweaking NTUP_PER_BUCKET

2014-07-08 Thread Robert Haas
On Tue, Jul 8, 2014 at 9:35 AM, Tomas Vondra t...@fuzzy.cz wrote:
 On 8 Červenec 2014, 14:49, Robert Haas wrote:
 On Wed, Jul 2, 2014 at 8:13 PM, Tomas Vondra t...@fuzzy.cz wrote:
 I propose dynamic increase of the nbuckets (up to NTUP_PER_BUCKET=1)
 once the table is built and there's free space in work_mem. The patch
 mentioned above makes implementing this possible / rather simple.

 Another idea would be to start with NTUP_PER_BUCKET=1 and then, if we
 run out of memory, increase NTUP_PER_BUCKET.  I'd like to think that
 the common case is that work_mem will be large enough that everything
 fits; and if you do it that way, then you save yourself the trouble of
 rehashing later, which as you point out might lose if there are only a
 few probes.  If it turns out that you run short of memory, you can
 merge pairs of buckets up to three times, effectively doubling
 NTUP_PER_BUCKET each time.

 Maybe. I'm not against setting NTUP_PER_BUCKET=1, but with large outer
 relations it may be way cheaper to use higher NTUP_PER_BUCKET values
 instead of increasing the number of batches (resulting in repeated scans
 of the outer table). I think it's important (but difficult) to keep these
 things somehow balanced.

Right, I think that's clear.  I'm just pointing out that you get to
decide: you can either start with a larger NTUP_PER_BUCKET and then
reduce it if you have enough memory left, or you can start with a smaller
NTUP_PER_BUCKET and then increase it if you run short of memory.

 Yet another idea is to stick with your scheme, but do the dynamic
 bucket splits lazily.  Set a flag on each bucket indicating whether or
 not it needs a lazy split.  When someone actually probes the hash
 table, if the flag is set for a particular bucket, move any tuples
 that don't belong as we scan the bucket.  If we reach the end of the
 bucket chain, clear the flag.

 I don't think a lazy scheme makes much sense here. There are two clearly
 separated phases - loading the table and search.

 Also, it seems to me it can't work as described. Say we build a table and
 then find out we need a table 4x the size. If I understand your proposal
 correctly, you'd just resize buckets array, copy the existing buckets
 (first 1/4 buckets) and set 'needs_split' flags. Correct?

 Well, the problem is that while scanning a bucket you already need to know
 which tuples belong there. But with the lazy approach, the tuples might
 still be in the old (not yet split) buckets. So you'd have to search the
 current bucket and all buckets that were not split yet. Or did I miss
 something?

If you have, say, bucket 0..63 and you decide to change it to buckets
0..255, then any tuple that isn't in bucket N has to be in bucket N %
64.  More generally, any time you expand or contract by a power of two
it's pretty easy to figure out where tuples have to go.

But even if you do something that's not a power of two, like say a 10x
compaction of the buckets, there's still only two buckets that can
contain any particular hash value.  If the hash value is H, the old
number of buckets is O, and the new number of buckets is N, then the
tuple has got to be in bucket H % O or bucket H % N.  If bucket H % O
doesn't have the needs-split flag set, then it must be in H % N (if it
exists at all).
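
A tiny standalone check of that arithmetic (illustrative only): when the
bucket count grows from a power of two O to a larger power of two N,
(H % N) % O == H % O for every hash value H, so a tuple can only ever move
from bucket H % O to bucket H % N.

#include <assert.h>
#include <stdint.h>

int
main(void)
{
	const uint32_t O = 64;		/* old number of buckets */
	const uint32_t N = 256;		/* new count, power-of-two growth */

	for (uint32_t h = 0; h < 1000000; h++)
		assert((h % N) % O == h % O);
	return 0;
}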

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] Extending constraint exclusion for implied constraints/conditions

2014-07-08 Thread Tom Lane
Ashutosh Bapat ashutosh.ba...@enterprisedb.com writes:
 On Mon, Jul 7, 2014 at 7:37 PM, Tom Lane t...@sss.pgh.pa.us wrote:
 I doubt it.  The extra code isn't the problem so much, it's the extra
 planning cycles spent trying to make proofs that will usually fail.
 What you propose will create a combinatorial explosion in the number
 of proof paths to be considered.

 I can understand that it will create a combinatorial explosion in the number
 of predicates that need to be examined by constraint exclusion. I do
 not understand where the paths get involved here. Constraint
 exclusion kicks in before paths are created.

Perhaps I should not have used the term path here, because I wasn't
referring to what the planner calls Paths.  I just meant that there will
be many more ways to perhaps prove a constraint-exclusion result, and the
planner will have to go through them all.  (Usually to no avail, because
how often do people write queries that are guaranteed to produce no rows?)

An example of what I meant by combinatorial explosion is that if a clause
mentions K variables, and each of those variables has been equated to M
other variables, there are (M+1)^K possible derived clauses, and it looks
to me like we'd have to consider them all to catch the sort of situation
you're talking about.

I think this is going to require a *lot* of additional planner cycles,
and TBH you've not provided any very compelling example of why it's
worthwhile.  Yeah, if you can exclude a large partition it's a win,
but how often is that going to happen in real-world queries?  The one
example you gave seemed pretty artificial to me, ie unrelated to typical
partitioning constraints.

regards, tom lane




Re: [HACKERS] query_is_distinct_for does not take into account set returning functions

2014-07-08 Thread Tom Lane
David Rowley dgrowle...@gmail.com writes:
 I think we should probably include the logic to test for set returning
 functions into query_is_distinct_for.

It strikes me that there's only a problem if the SRF is in a tlist entry
that is not one of the DISTINCT or GROUP BY columns, respectively.  It
may not be worth the extra complexity to figure that out, though.
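
Using the tables from David's example upthread, the distinction would look
like this (if I am reading the semantics correctly):

-- problematic: the SRF is not a GROUP BY column, so n can repeat in the
-- output even though the query groups by n
select n, generate_series(1,2) as g from y group by n;

-- not problematic: the SRF expression is itself one of the DISTINCT
-- columns, so the output stays distinct on (n, g)
select distinct n, generate_series(1,2) as g from y;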

regards, tom lane




[HACKERS] How to use Makefile - pgxs without gcc -O2 ?

2014-07-08 Thread geohas

-BEGIN PGP SIGNED MESSAGE-
Hash: SHA1

hi,

is there any hint to tell pgxs to compile with gcc -O0 (CFLAGS) ?

regards

geohas
-BEGIN PGP SIGNATURE-
Version: GnuPG v1
Comment: Using GnuPG with Thunderbird - http://www.enigmail.net/

iQEcBAEBAgAGBQJTvA4AAAoJEJFGMlQe7wR/6PAIAIy6619E9WpksRxaT+t132Py
+jsiIhuj0qfyRHglILmyQLITc1Y3MoTO0jjBfhspBZE/NdOMd2uqf+0+QMIBdy9L
vnM36quDUs5Zd0Ix8EWG75l9z4aeRRZG5Sh4pH8gGbOYwWPT9uefl7/4bvIeMD3c
0Zsmgt3nE6F5Yxptud+RQNf4ppOfh+46K0Y24WSXW5V99YuLaOfVJXbe0IyU9Zna
/Ib6UqRwKaBoHwqub3jVZTcCKk3CXigqeQ9dJffw0e+3uMOlajRLTQ2CDlZKg6I+
Pakzlwby6szFsMEG13S8e32QuQ/X3lCxPfl2Us+78R4vXpfKSdqvfdIMEtbWTLM=
=aUO6
-END PGP SIGNATURE-





Re: [HACKERS] [BUGS] LEFT JOINs not optimized away when not needed

2014-07-08 Thread David Fetter
On Tue, Jul 08, 2014 at 11:19:31AM -0400, Tom Lane wrote:
 Moshe Jacobson mo...@neadwerx.com writes:
 
  Seeing that there is only one output column, and that the results are
  grouped by this output column, it seems to me that the optimizer should not
  even look at the rest of the tables.
 
 The GROUP BY has nothing to do with it, but if all the other tables' join
 keys are primary keys (or at least unique), I'd expect the planner to get
 rid of the joins.  However, I'm not sure whether it works completely when
 there are more than join_collapse_limit relations to worry about.

Eliminating JOINs seems orthogonal, at least in theory, to
join_collapse_limit.  What have I missed here, and how might they have
dependencies?

Cheers,
David.
-- 
David Fetter da...@fetter.org http://fetter.org/
Phone: +1 415 235 3778  AIM: dfetter666  Yahoo!: dfetter
Skype: davidfetter  XMPP: david.fet...@gmail.com
iCal: webcal://www.tripit.com/feed/ical/people/david74/tripit.ics

Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate




Re: [HACKERS] How to use Makefile - pgxs without gcc -O2 ?

2014-07-08 Thread Tom Lane
geohas li...@hasibether.at writes:
 is there any hint to tell pgxs to compile with gcc -O0 (CFLAGS) ?

I tend to use

make PROFILE=-O0

which relies on knowing that PG's make rules append $(PROFILE) to CFLAGS.
Alternatively you could just override CFLAGS:

make CFLAGS=whatever

but this requires knowing exactly what configure put into CFLAGS so
that you don't remove any flags that you still want.

regards, tom lane




Re: [HACKERS] How to use Makefile - pgxs without gcc -O2 ?

2014-07-08 Thread geohas

-BEGIN PGP SIGNED MESSAGE-
Hash: SHA1

Thank you Tom!

make CFLAGS=-O0 was it.

Now gdb doesn't print <value optimized out>.

regards


On 08/07/14 17:45, Tom Lane wrote:
 geohas li...@hasibether.at writes:
 is there any hint to tell pgxs to compile with gcc -O0 (CFLAGS) ?

 I tend to use

 make PROFILE=-O0

 which relies on knowing that PG's make rules append $(PROFILE) to CFLAGS.
 Alternatively you could just override CFLAGS:

 make CFLAGS=whatever

 but this requires knowing exactly what configure put into CFLAGS so
 that you don't remove any flags that you still want.

 regards, tom lane








Re: [HACKERS] tweaking NTUP_PER_BUCKET

2014-07-08 Thread Tomas Vondra
On 8 Červenec 2014, 16:16, Robert Haas wrote:
 On Tue, Jul 8, 2014 at 9:35 AM, Tomas Vondra t...@fuzzy.cz wrote:
 Maybe. I'm not against setting NTUP_PER_BUCKET=1, but with large outer
 relations it may be way cheaper to use higher NTUP_PER_BUCKET values
 instead of increasing the number of batches (resulting in repeated scans
 of the outer table). I think it's important (but difficult) to keep
 these
 things somehow balanced.

 Right, I think that's clear.  I'm just pointing out that you get to
 decide: you can either start with a larger NTUP_PER_BUCKET and then
 reduce it if you have enough memory left, or you can start with a smaller
 NTUP_PER_BUCKET and then increase it if you run short of memory.

I don't think those two approaches are equal.

With the approach I took, I can use a compromise value (NTUP=4) initially,
and only resize the hash table once at the end (while keeping the amount
of memory under work_mem all the time).

With the NTUP=1 and increase in case of memory pressure you have to
shrink the table immediately (to fit into work_mem), and if the hash table
gets split into multiple batches you're suddenly in a situation with lower
memory pressure and you may need to increase it again.

 Yet another idea is to stick with your scheme, but do the dynamic
 bucket splits lazily.  Set a flag on each bucket indicating whether or
 not it needs a lazy split.  When someone actually probes the hash
 table, if the flag is set for a particular bucket, move any tuples
 that don't belong as we scan the bucket.  If we reach the end of the
 bucket chain, clear the flag.

 I don't think a lazy scheme makes much sense here. There are two clearly
 separated phases - loading the table and search.

 Also, it seems to me it can't work as described. Say we build a table
 and
 then find out we need a table 4x the size. If I understand your proposal
 correctly, you'd just resize buckets array, copy the existing buckets
 (first 1/4 buckets) and set 'needs_split' flags. Correct?

 Well, the problem is that while scanning a bucket you already need to
 know
 which tuples belong there. But with the lazy approach, the tuples might
 still be in the old (not yet split) buckets. So you'd have to search the
 current bucket and all buckets that were not split yet. Or did I miss
 something?

 If you have, say, bucket 0..63 and you decide to change it to buckets
 0..255, then any tuple that isn't in bucket N has to be in bucket N %
 64.  More generally, any time you expand or contract by a power of two
 it's pretty easy to figure out where tuples have to go.

OK.

 But even if you do something that's not a power of two, like say a 10x
 compaction of the buckets, there's still only two buckets that can
 contain any particular hash value.  If the hash value is H, the old
 number of buckets is O, and the new number of buckets is N, then the
 tuple has got to be in bucket H % O or bucket H % N.  If bucket H % O
 doesn't have the needs-split flag set, then it must be in H % N (if it
 exists at all).
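
For illustration, here is a minimal standalone C sketch (not from the patch; names are made up) of that invariant: when the bucket count grows from O to N by a power of two, a tuple's new bucket determines its old one, so a probe only has to look in H % N and, for a not-yet-split region, H % O.

#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/*
 * Illustration only: with O old buckets and N new buckets (N a multiple of
 * O), a tuple with hash H lives either in its new bucket H % N or, if that
 * part of the table has not been split yet, in its old bucket H % O.
 */
static void
check_relocation(uint32_t hash, uint32_t oldbuckets, uint32_t newbuckets)
{
    uint32_t oldpos = hash % oldbuckets;
    uint32_t newpos = hash % newbuckets;

    /* the new position always determines the old one */
    assert(oldpos == newpos % oldbuckets);

    printf("hash %u: old bucket %u, new bucket %u\n",
           (unsigned) hash, (unsigned) oldpos, (unsigned) newpos);
}

int
main(void)
{
    uint32_t h;

    /* the 64 -> 256 bucket case discussed above */
    for (h = 0; h < 1000; h += 97)
        check_relocation(h, 64, 256);
    return 0;
}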

I wonder if this is really worth the effort - my guess is it's efficient
only if large portion of buckets is not visited (and thus does not need to
be split) at all. Not sure how common that is (our workloads certainly are
not like that).

regards
Tomas





Re: [HACKERS] How to use Makefile - pgxs without gcc -O2 ?

2014-07-08 Thread Piotr Stefaniak
On 07/08/2014 17:53, geohas wrote:
 make CFLAGS=-O0  was it.
 
 now gdb doesn't print Value optimized out.
If you're using GCC 4.8 or later, consider using it with -Og for that
kind of build.




Re: [HACKERS] Doing better at HINTing an appropriate column within errorMissingColumn()

2014-07-08 Thread Peter Geoghegan
Hi,

On Tue, Jul 8, 2014 at 6:58 AM, Michael Paquier
michael.paqu...@gmail.com wrote:
 2) Checking process goes through all the existing columns of a
 relation even a difference of 1 with some other column(s) has already
 been found. As we try to limit the number of hints returned, this
 seems like a waste of resources.

In general it's possible that an exact match will later be found
within the RTE, and exact matches don't have to pay the wrong alias
penalty, and are immediately returned. It is therefore not a waste of
resources, but even if it was that would be pretty inconsequential as
your benchmark shows.

 3) distanceName could be improved, for example by having some checks
 on the string lengths of the target and source columns, and immediately
 rejecting the match if, for example, the length of the source string is
 double or half the length of the target.

I don't think it's a good idea to tie distanceName() to the ultimate
behavior of errorMissingColumn() hinting, since there may be other
callers in the future. Besides, that isn't going to help much.

 4) This is not nice, could it be possible to remove the stuff from varlena.c?
 +/* Expand each Levenshtein distance variant */
 +#include levenshtein.c
 +#define LEVENSHTEIN_LESS_EQUAL
 +#include levenshtein.c
 +#undef LEVENSHTEIN_LESS_EQUAL
 Part of the same comment: only varstr_leven_less_equal is used to
 calculate the distance, should we really move varstr_leven to core?
 This clearly needs to be reworked as not just a copy-paste of the
 things in fuzzystrmatch.
 The flag LEVENSHTEIN_LESS_EQUAL should be left within fuzzystrmatch I think.

So there'd be one variant within core and one within
contrib/fuzzystrmatch? I don't think that's an improvement.

 5) Do we want hints on system columns as well?

I think it's obvious that the answer must be no. That's going to
frequently result in suggestions of columns that users will complain
aren't even there. If you know about the system columns, you can just
get it right. They're supposed to be hidden for most purposes.

 6) Sometimes no hints are returned... Even in simple cases like this one:
 =# create table foo (aa int, bb int);
 CREATE TABLE
 =# select ab from foo;
 ERROR:  42703: column ab does not exist
 LINE 1: select ab from foo;
^
 LOCATION:  errorMissingColumn, parse_relation.c:3123

That's because those two candidates come from a single RTE and have an
equal distance -- you'd see both suggestions if you joined two tables
with each candidate, assuming that each table being joined didn't
individually have the same issue. I think that that's probably
considered the correct behavior by most.
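
For what it's worth, here is a simplified standalone C sketch of the kind of tie handling being discussed. The types, helper names and structure are invented for illustration; this is not what the patch does internally.

#include <stdio.h>
#include <string.h>

#define MAXLEN 64               /* the sketch assumes short identifiers */

/* plain dynamic-programming Levenshtein distance with unit costs */
static int
lev_distance(const char *a, const char *b)
{
    int la = (int) strlen(a);
    int lb = (int) strlen(b);
    int prev[MAXLEN + 1];
    int cur[MAXLEN + 1];
    int i, j;

    for (j = 0; j <= lb; j++)
        prev[j] = j;
    for (i = 1; i <= la; i++)
    {
        cur[0] = i;
        for (j = 1; j <= lb; j++)
        {
            int del = prev[j] + 1;
            int ins = cur[j - 1] + 1;
            int sub = prev[j - 1] + (a[i - 1] != b[j - 1]);
            int best = del < ins ? del : ins;

            cur[j] = best < sub ? best : sub;
        }
        memcpy(prev, cur, sizeof(cur));
    }
    return prev[lb];
}

typedef struct
{
    const char *colname;
    int         rteindex;       /* which relation the column belongs to */
} Candidate;

/*
 * Return 0, 1 or 2 suggestions.  An exact match short-circuits the scan;
 * two equally close columns from the same relation give no hint at all,
 * mirroring the behavior discussed above.
 */
static int
pick_hints(const char *target, const Candidate *cols, int ncols,
           const Candidate **out1, const Candidate **out2)
{
    const Candidate *first = NULL;
    const Candidate *second = NULL;
    int bestdist = -1;
    int i;

    for (i = 0; i < ncols; i++)
    {
        int d = lev_distance(target, cols[i].colname);

        if (d == 0)
        {
            *out1 = &cols[i];   /* exact match: return it immediately */
            return 1;
        }
        if (bestdist < 0 || d < bestdist)
        {
            bestdist = d;
            first = &cols[i];
            second = NULL;
        }
        else if (d == bestdist)
            second = &cols[i];
    }
    if (first == NULL)
        return 0;
    if (second != NULL && second->rteindex == first->rteindex)
        return 0;               /* ambiguous within one relation: no hint */
    *out1 = first;
    *out2 = second;
    return second ? 2 : 1;
}

int
main(void)
{
    Candidate   cols[] = {{"aa", 1}, {"bb", 1}};
    const Candidate *a = NULL;
    const Candidate *b = NULL;

    /* prints 0, matching the "select ab from foo" example above */
    printf("%d suggestion(s)\n", pick_hints("ab", cols, 2, &a, &b));
    return 0;
}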

-- 
Peter Geoghegan




Re: [HACKERS] tweaking NTUP_PER_BUCKET

2014-07-08 Thread Robert Haas
On Tue, Jul 8, 2014 at 12:06 PM, Tomas Vondra t...@fuzzy.cz wrote:
 On 8 Červenec 2014, 16:16, Robert Haas wrote:
 On Tue, Jul 8, 2014 at 9:35 AM, Tomas Vondra t...@fuzzy.cz wrote:
 Maybe. I'm not against setting NTUP_PER_BUCKET=1, but with large outer
 relations it may be way cheaper to use higher NTUP_PER_BUCKET values
 instead of increasing the number of batches (resulting in repeated scans
 of the outer table). I think it's important (but difficult) to keep
 these
 things somehow balanced.

 Right, I think that's clear.  I'm just pointing out that you get to
 decide: you can either start with a larger NTUP_PER_BUCKET and then
 reduce it if you have enough memory left, or you can start with a smaller
 NTUP_PER_BUCKET and then increase it if you run short of memory.

 I don't think those two approaches are equal.

 With the approach I took, I can use a compromise value (NTUP=4) initially,
 and only resize the hash table once at the end (while keeping the amount
 of memory under work_mem all the time).

 With the NTUP=1 and increase in case of memory pressure you have to
 shrink the table immediately (to fit into work_mem), and if the hash table
 gets split into multiple batches you're suddenly in a situation with lower
 memory pressure and you may need to increase it again.

True.  On the other hand, this really only comes into play when the
estimates are wrong.  If you know at the start how many tuples you're
going to need to store and how big they are, you determine whether
NTUP_PER_BUCKET=1 is going to work before you even start building the
hash table.  If it isn't, then you use fewer buckets right from the
start.  If we start by estimating a small value for NTUP_PER_BUCKET
and then let it float upward if we turn out to have more tuples than
expected, we're optimizing for the case where our statistics are
right.  If we start by estimating a larger value for NTUP_PER_BUCKET
than what we think we need to fit within work_mem, we're basically
guessing that our statistics are more likely to be wrong than to be
right.  I think.
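
To make that sizing decision concrete, here is a rough standalone C sketch (the struct, names and constants are invented for illustration; this is not the executor's code) of choosing the bucket count up front from the estimates, backing off from NTUP_PER_BUCKET = 1 only if the bucket array itself would not fit in what is left of work_mem.

#include <stdint.h>
#include <stdio.h>

typedef struct
{
    double      ntuples;        /* estimated rows in the inner relation */
    int         tupwidth;       /* estimated tuple size in bytes */
    size_t      work_mem;       /* memory budget in bytes */
} HashSizingInput;

static uint32_t
next_pow2(uint32_t n)
{
    uint32_t p = 1;

    while (p < n)
        p <<= 1;
    return p;
}

static uint32_t
choose_nbuckets(const HashSizingInput *in)
{
    size_t   tuple_space = (size_t) (in->ntuples * in->tupwidth);
    size_t   left = in->work_mem > tuple_space ? in->work_mem - tuple_space : 0;
    size_t   bucket_bytes = sizeof(void *);     /* one chain head per bucket */
    uint32_t nbuckets = next_pow2((uint32_t) in->ntuples);  /* NTUP_PER_BUCKET = 1 */

    /* If the bucket array itself doesn't fit, back off (2, 4, ... tuples per bucket). */
    while (nbuckets > 1 && (size_t) nbuckets * bucket_bytes > left)
        nbuckets >>= 1;
    return nbuckets;
}

int
main(void)
{
    HashSizingInput in = {1000000.0, 64, 100 * 1024 * 1024};

    printf("nbuckets = %u\n", (unsigned) choose_nbuckets(&in));
    return 0;
}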

 I wonder if this is really worth the effort - my guess is it's efficient
 only if large portion of buckets is not visited (and thus does not need to
 be split) at all. Not sure how common that is (our workloads certainly are
 not like that).

Yeah.  It may be a bad idea.  I threw it out there as a possible way
of trying to mitigate the worst case, which is when you go to the trouble to
build the hash table and then make very few probes.  But that may not
be worth the complexity that this would introduce.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] Securing make check (CVE-2014-0067)

2014-07-08 Thread Christoph Berg
Re: Noah Misch 2014-06-08 20140608135713.ga525...@tornado.leadboat.com
 Here's an update that places the socket in a temporary subdirectory of /tmp.
 The first attached patch adds NetBSD mkdtemp() to libpgport.  The second,
 principal, patch uses mkdtemp() to implement this design in pg_regress.  The
 corresponding change to contrib/pg_upgrade/test.sh is based on the configure
 script's arrangements for its temporary directory.

Hi,

I believe pg_upgrade itself still needs a fix. While it's not a
security problem to put the socket in $CWD while upgrading (it is
using -c unix_socket_permissions=0700), this behavior is pretty
unexpected, and does fail if your $CWD is > 107 bytes.

In f545d233ebce6971b6f9847680e48b679e707d22 Peter fixed the pg_ctl
perl tests to avoid that problem, so imho it would make even more
sense to fix pg_upgrade which could also fail in production.
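
For reference, a small standalone C sketch of the /tmp-subdirectory approach and of the sun_path limit behind the 107-byte failure; this is just an illustration, not pg_upgrade's actual code, and the port number and template are placeholders.

#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/un.h>

int
main(void)
{
    char        template[] = "/tmp/pg_upgrade.XXXXXX";
    char        sockpath[1024];
    struct sockaddr_un addr;

    /* create a private directory under /tmp instead of using $CWD */
    if (mkdtemp(template) == NULL)
    {
        perror("mkdtemp");
        return 1;
    }

    /* sun_path is only ~108 bytes on common platforms, hence the failure */
    snprintf(sockpath, sizeof(sockpath), "%s/.s.PGSQL.50432", template);
    if (strlen(sockpath) >= sizeof(addr.sun_path))
    {
        fprintf(stderr, "socket path \"%s\" is too long (limit %zu bytes)\n",
                sockpath, sizeof(addr.sun_path) - 1);
        return 1;
    }

    printf("socket directory: %s\n", template);
    printf("sun_path limit here: %zu bytes\n", sizeof(addr.sun_path) - 1);
    return 0;
}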

This has been discussed here and elsewhere [1] before, but was
rejected as not being in line with what the other utilities do, but now
pg_upgrade is the lone outlier. Noah's changes let Debian drop 4 out
of 5 pg_regress-sockdir patches; having this fixed would also let us
get rid of the last one [2].

[1] http://lists.debian.org/debian-wb-team/2013/05/msg00015.html
[2] 
https://alioth.debian.org/scm/loggerhead/pkg-postgresql/postgresql-9.5/trunk/view/head:/debian/patches/64-pg_upgrade-sockdir

Christoph
-- 
c...@df7cb.de | http://www.df7cb.de/




Re: [HACKERS] postgresql.auto.conf and reload

2014-07-08 Thread Robert Haas
On Sat, Jul 5, 2014 at 10:52 AM, Tom Lane t...@sss.pgh.pa.us wrote:
 Amit Kapila amit.kapil...@gmail.com writes:
 Please find the patch attached to address the above concern.  I have
 updated docs, so that users can be aware of such behaviour.

 I'm in the camp that says that we need to do something about this, not
 just claim that it's operating as designed.  AFAICS, the *entire* argument
 for having ALTER SYSTEM at all is ease of use; but this behavior is not
 what I would call facilitating ease of use.  In particular, if you are
 conveniently able to edit postgresql.conf, what the heck do you need
 ALTER SYSTEM for?

 One possible answer is to modify guc-file.l so that only the last value
 supplied for any one GUC gets processed.  The naive way of doing that
 would be O(N^2) in the number of uncommented settings, which might or
 might not be a problem; if it is, we could no doubt devise a smarter
 data structure.

I've been really annoyed by the current behavior even on older
releases because I have a script, which I use frequently, that does
this:

initdb
cat >> $PGDATA/postgresql.conf <<EOM
shared_buffers=8GB
another setting
another setting
another setting
EOM

Now, sometimes while experimenting, I will change a setting in
postgresql.conf and reload the config.  At which point it complains
that it can't change shared_buffers at runtime.  It does this because
the line from initdb is in the file, and so is the one I added.  But
I'm *not* trying to change shared_buffers.  Sure, the first value in
the config file doesn't match the current value, but the *last* line
in the config file is the one that's supposed to control, so why is it
complaining about the earlier line?

I haven't looked at the code in this area too carefully, but it seems
to me like the flow ought to be:

1. Read all of the config files and determine what the final value
present in each config file is.
2. *Then*, in a second pass, enforce requirements like can't be
changed except at server start.
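
A tiny standalone C sketch of the "last value wins" reduction in step 1 (types and names are made up; the real parsing lives in guc-file.l). It uses the naive O(N^2) scan mentioned upthread, which is fine for illustration.

#include <stdio.h>
#include <string.h>

typedef struct
{
    const char *name;
    const char *value;
} ConfigItem;

/* keep only the last occurrence of each name, in place; return new count */
static int
last_values(ConfigItem *items, int nitems)
{
    int nkept = 0;
    int i, j;

    for (i = 0; i < nitems; i++)
    {
        /* does a later line set the same variable?  then skip this one */
        for (j = i + 1; j < nitems; j++)
            if (strcmp(items[i].name, items[j].name) == 0)
                break;
        if (j == nitems)
            items[nkept++] = items[i];
    }
    return nkept;
}

int
main(void)
{
    ConfigItem items[] = {
        {"shared_buffers", "128MB"},    /* from initdb */
        {"work_mem", "64MB"},
        {"shared_buffers", "8GB"},      /* appended later; this one wins */
    };
    int n = last_values(items, 3);
    int i;

    /* only the surviving settings would then be checked and applied */
    for (i = 0; i < n; i++)
        printf("%s = %s\n", items[i].name, items[i].value);
    return 0;
}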

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] postgresql.auto.conf and reload

2014-07-08 Thread Josh Berkus
On 07/08/2014 10:07 AM, Robert Haas wrote:
 I haven't looked at the code in this area too carefully, but it seems
 to me like the flow ought to be:
 
 1. Read all of the config files and determine what the final value
 present in each config file is.
 2. *Then*, in a second pass, enforce requirements like can't be
 changed except at server start.

+1

This would also make conf.d much more useful; I wouldn't have to worry
as much about overlapping config settings.

Sounds like a 9.5 feature, though.

-- 
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com




Re: [HACKERS] Supporting Windows SChannel as OpenSSL replacement

2014-07-08 Thread Jeff Janes
On Thu, Jun 26, 2014 at 4:26 PM, Andreas Karlsson andr...@proxel.se wrote:

 On 06/24/2014 03:20 AM, Jeff Janes wrote:

 I've tried your 0001 patch, reflecting this refactoring, on Linux and it
 caused 'make check' to hang at 'starting postmaster'.


 I found the bug in the code, and I have attached a patch which you can
 apply on top of the patch. The regression tests pass now on my Debian
 machine.


Your fix works for me as well.  Thanks.

Is there some recipe for testing the 0002 patch?  Can it be tested in a
MinGW environment, or does it need to use the Microsoft-supplied compilers?

Thanks,

Jeff


Re: [HACKERS] postgresql.auto.conf and reload

2014-07-08 Thread Josh Berkus
On 07/06/2014 01:27 AM, Christoph Berg wrote:
 Another could be that during initdb all the uncommented settings be
 written to postgresql.auto.conf file rather than to postgresql.conf.
 I think we can do this by changing code in initdb.c-setup_config().
 This will ensure that unless user is changing settings in a mixed way
 (some by Alter System and some manually by editing postgresql.conf),
 there will no such problem.

There is no reasonable way for us to prevent issues for users who are
manually changing both pg.conf and pg.auto.conf.  Users should stick to
one or the other method of management (or, thirdly, using conf.d); if
they mix methods, we can't prevent confusion at restart time and we
shouldn't try.

-- 
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com




Re: [HACKERS] Securing make check (CVE-2014-0067)

2014-07-08 Thread Noah Misch
On Tue, Jul 08, 2014 at 07:02:04PM +0200, Christoph Berg wrote:
 Re: Noah Misch 2014-06-08 20140608135713.ga525...@tornado.leadboat.com
  Here's an update that places the socket in a temporary subdirectory of /tmp.
  The first attached patch adds NetBSD mkdtemp() to libpgport.  The second,
  principal, patch uses mkdtemp() to implement this design in pg_regress.  The
  corresponding change to contrib/pg_upgrade/test.sh is based on the 
  configure
  script's arrangements for its temporary directory.
 
 Hi,
 
 I believe pg_upgrade itself still needs a fix. While it's not a
 security problem to put the socket in $CWD while upgrading (it is
 using -c unix_socket_permissions=0700), this behavior is pretty
 unexpected, and does fail if your $CWD is > 107 bytes.
 
 In f545d233ebce6971b6f9847680e48b679e707d22 Peter fixed the pg_ctl
 perl tests to avoid that problem, so imho it would make even more
 sense to fix pg_upgrade which could also fail in production.

+1.  Does writing that patch interest you?

-- 
Noah Misch
EnterpriseDB http://www.enterprisedb.com




Re: [HACKERS] Securing make check (CVE-2014-0067)

2014-07-08 Thread Christoph Berg
Re: Noah Misch 2014-07-08 20140708174125.ga1884...@tornado.leadboat.com
 On Tue, Jul 08, 2014 at 07:02:04PM +0200, Christoph Berg wrote:
  Re: Noah Misch 2014-06-08 20140608135713.ga525...@tornado.leadboat.com
   Here's an update that places the socket in a temporary subdirectory of 
   /tmp.
   The first attached patch adds NetBSD mkdtemp() to libpgport.  The second,
   principal, patch uses mkdtemp() to implement this design in pg_regress.  
   The
   corresponding change to contrib/pg_upgrade/test.sh is based on the 
   configure
   script's arrangements for its temporary directory.
  
  Hi,
  
  I believe pg_upgrade itself still needs a fix. While it's not a
  security problem to put the socket in $CWD while upgrading (it is
  using -c unix_socket_permissions=0700), this behavior is pretty
  unexpected, and does fail if your $CWD is > 107 bytes.
  
  In f545d233ebce6971b6f9847680e48b679e707d22 Peter fixed the pg_ctl
  perl tests to avoid that problem, so imho it would make even more
  sense to fix pg_upgrade which could also fail in production.
 
 +1.  Does writing that patch interest you?

I'll give it a try once I've finished this CF review.

Christoph
-- 
c...@df7cb.de | http://www.df7cb.de/




Re: [HACKERS] tweaking NTUP_PER_BUCKET

2014-07-08 Thread Tomas Vondra
On 8.7.2014 19:00, Robert Haas wrote:
 On Tue, Jul 8, 2014 at 12:06 PM, Tomas Vondra t...@fuzzy.cz wrote:
 On 8 Červenec 2014, 16:16, Robert Haas wrote:

 Right, I think that's clear. I'm just pointing out that you get
 to decide: you can either start with a larger NTUP_PER_BUCKET and
  then reduce it if you have enough memory left, or you can start with a
 smaller NTUP_PER_BUCKET and then increase it if you run short of
 memory.

 I don't think those two approaches are equal.

 With the approach I took, I can use a compromise value (NTUP=4) 
 initially, and only resize the hash table once at the end (while
 keeping the amount of memory under work_mem all the time).

 With the NTUP=1 and increase in case of memory pressure you have 
 to shrink the table immediately (to fit into work_mem), and if the 
 hash table gets split into multiple batches you're suddenly in a
 situation with lower memory pressure and you may need to increase
 it again.
 
 True. On the other hand, this really only comes into play when the 
 estimates are wrong. If you know at the start how many tuples you're 
 going to need to store and how big they are, you determine whether 
 NTUP_PER_BUCKET=1 is going to work before you even start building
 the hash table. If it isn't, then you use fewer buckets right from
 the start. If we start by estimating a small value for
 NTUP_PER_BUCKET and then let it float upward if we turn out to have
 more tuples than expected, we're optimizing for the case where our
 statistics are right. If we start by estimating a larger value for
 NTUP_PER_BUCKET than what we think we need to fit within work_mem,
 we're basically guessing that our statistics are more likely to be
 wrong than to be right. I think.

Good point. The first patch was targeted exactly at the wrongly
estimated queries. This patch attempts to apply the rehash to all plans,
and maybe there's a better way.

If the estimates are correct / not too off, we can use this information
to do the sizing 'right' at the beginning (without facing rehashes later).

Over-estimates are not a problem, because it won't make the hash table
slower (it'll be sized for more tuples) and we can't change the number
of batches anyway.

With under-estimates we have to decide whether to resize the hash or
increase the number of batches.

In both cases that matter (correct estimates and under-estimates) we
have to decide whether to increase the number of buckets or batches. I'm
not sure how to do that.
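
Just to illustrate the shape of that decision, here is a toy standalone C heuristic, entirely made up (not from the patch): it weighs the extra probe work of deeper bucket chains against the cost of rescanning the outer side one more time.

#include <stdbool.h>
#include <stdio.h>

/*
 * Toy model only: return true if adding a batch looks cheaper than keeping
 * chains ~ntup_per_bucket deep.  The per-tuple cost weights are invented.
 */
static bool
worth_extra_batch(double outer_tuples, int ntup_per_bucket,
                  double rescan_cost_per_tuple,
                  double probe_cost_per_chain_entry)
{
    /* extra probe work per pass if we keep the current bucket count */
    double chain_cost = outer_tuples *
        (ntup_per_bucket - 1) * probe_cost_per_chain_entry;

    /* cost of writing out and re-reading the outer side one more time */
    double batch_cost = outer_tuples * rescan_cost_per_tuple;

    return batch_cost < chain_cost;
}

int
main(void)
{
    bool batch = worth_extra_batch(1e7, 4, 1.0, 0.1);

    printf("prefer %s\n", batch ? "one more batch" : "deeper buckets");
    return 0;
}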

 I wonder if this is really worth the effort - my guess is it's 
 efficient only if large portion of buckets is not visited (and
 thus does not need to be split) at all. Not sure how common that is
 (our workloads certainly are not like that).
 
 Yeah. It may be a bad idea. I threw it out there as a possible way of
  trying to mitigate the worst case, which is when you go to the trouble to build
 the hash table and then make very few probes. But that may not be
 worth the complexity that this would introduce.

Let's keep it simple for now. I think the sizing question (explained
above) is more important and needs to be solved first.

regards
Tomas






Re: [HACKERS] [GSoC2014] Patch ALTER TABLE ... SET LOGGED

2014-07-08 Thread Christoph Berg
Hi,

here's my review for this patch.

Generally, the patch addresses a real need, tables often only turn
out to be desired as unlogged if there are performance problems in
practice, and the other way round changing an unlogged table to logged
is way more involved manually than it could be with this patch. So
yes, we want the feature.

I've tried the patch, and it works as desired, including tab
completion in psql. There are a few issues to be resolved, though.

Re: Fabrízio de Royes Mello 2014-06-26 
cafcns+pyv0pbjlo97osyqv1yqoc7s+jgvwxc8uny5jsrdwy...@mail.gmail.com
 As part of GSoC2014 and with agreement of my mentor and reviewer (Stephen
 Frost) I've send a complement of the first patch to add the capability to
 change a regular table to unlogged.

(Fwiw, I believe this direction is the more interesting one.)

 With this patch we finish the main goals of the GSoC2014 and now we'll work
 in the additional goals.

... that being the non-WAL-logging with wal_level=minimal, or more?


 diff --git a/doc/src/sgml/ref/alter_table.sgml b/doc/src/sgml/ref/alter_table.sgml
 index 69a1e14..424f2e9 100644
 --- a/doc/src/sgml/ref/alter_table.sgml
 +++ b/doc/src/sgml/ref/alter_table.sgml
 @@ -58,6 +58,7 @@ ALTER TABLE [ IF EXISTS ] <replaceable class="PARAMETER">name</replaceable>
      ENABLE REPLICA RULE <replaceable class="PARAMETER">rewrite_rule_name</replaceable>
      ENABLE ALWAYS RULE <replaceable class="PARAMETER">rewrite_rule_name</replaceable>
      CLUSTER ON <replaceable class="PARAMETER">index_name</replaceable>
 +    SET {LOGGED | UNLOGGED}
      SET WITHOUT CLUSTER
      SET WITH OIDS
      SET WITHOUT OIDS

This must not be between the two CLUSTER lines. I think the best spot
would just be one line down, before SET WITH OIDS.

(While we are at it, SET TABLESPACE should probably be moved up to the
other SET options...)

 @@ -432,6 +433,20 @@ ALTER TABLE [ IF EXISTS ] <replaceable class="PARAMETER">name</replaceable>
     </varlistentry>
  
     <varlistentry>
 +    <term><literal>SET {LOGGED | UNLOGGED}</literal></term>
 +    <listitem>
 +     <para>
 +      This form change the table persistence type from unlogged to permanent or 

This grammar bug pops up consistently: This form *changes*...

There's trailing whitespace.

 +      from unlogged to permanent by rewriting the entire contents of the table 
 +      and associated indexes into new disk files.
 +     </para>
 +     <para>
 +      Changing the table persistence type acquires an <literal>ACCESS EXCLUSIVE</literal> lock.
 +     </para>

I'd rewrite that and add a reference:

... from unlogged to permanent (see <xref linkend="SQL-CREATETABLE-UNLOGGED">).
</para>
<para>
Changing the table persistence type acquires an <literal>ACCESS EXCLUSIVE</literal> lock
and rewrites the table contents and associated indexes into new disk files.
</para>


 diff --git a/src/backend/commands/tablecmds.c b/src/backend/commands/tablecmds.c
 index 60d387a..9dfdca2 100644
 --- a/src/backend/commands/tablecmds.c
 +++ b/src/backend/commands/tablecmds.c
 @@ -125,18 +125,19 @@ static List *on_commits = NIL;
   * a pass determined by subcommand type.
   */
  
 -#define AT_PASS_UNSET        -1    /* UNSET will cause ERROR */
 -#define AT_PASS_DROP         0     /* DROP (all flavors) */
 -#define AT_PASS_ALTER_TYPE   1     /* ALTER COLUMN TYPE */
 -#define AT_PASS_OLD_INDEX    2     /* re-add existing indexes */
 -#define AT_PASS_OLD_CONSTR   3     /* re-add existing constraints */
 -#define AT_PASS_COL_ATTRS    4     /* set other column attributes */
 +#define AT_PASS_UNSET        -1    /* UNSET will cause ERROR */
 +#define AT_PASS_DROP         0     /* DROP (all flavors) */
 +#define AT_PASS_ALTER_TYPE   1     /* ALTER COLUMN TYPE */
 +#define AT_PASS_OLD_INDEX    2     /* re-add existing indexes */
 +#define AT_PASS_OLD_CONSTR   3     /* re-add existing constraints */
 +#define AT_PASS_COL_ATTRS    4     /* set other column attributes */
  /* We could support a RENAME COLUMN pass here, but not currently used */
 -#define AT_PASS_ADD_COL      5     /* ADD COLUMN */
 -#define AT_PASS_ADD_INDEX    6     /* ADD indexes */
 -#define AT_PASS_ADD_CONSTR   7     /* ADD constraints, defaults */
 -#define AT_PASS_MISC         8     /* other stuff */
 -#define AT_NUM_PASSES        9
 +#define AT_PASS_ADD_COL      5     /* ADD COLUMN */
 +#define AT_PASS_ADD_INDEX    6     /* ADD indexes */
 +#define AT_PASS_ADD_CONSTR   7     /* ADD constraints, defaults */
 +#define AT_PASS_MISC         8     /* other stuff */
 

Re: [HACKERS] tweaking NTUP_PER_BUCKET

2014-07-08 Thread Jeff Janes
On Tue, Jul 8, 2014 at 6:35 AM, Tomas Vondra t...@fuzzy.cz wrote:

 On 8 Červenec 2014, 14:49, Robert Haas wrote:
  On Wed, Jul 2, 2014 at 8:13 PM, Tomas Vondra t...@fuzzy.cz wrote:
  I propose dynamic increase of the nbuckets (up to NTUP_PER_BUCKET=1)
  once the table is built and there's free space in work_mem. The patch
  mentioned above makes implementing this possible / rather simple.
 
  Another idea would be to start with NTUP_PER_BUCKET=1 and then, if we
  run out of memory, increase NTUP_PER_BUCKET.  I'd like to think that
  the common case is that work_mem will be large enough that everything
  fits; and if you do it that way, then you save yourself the trouble of
  rehashing later, which as you point out might lose if there are only a
  few probes.  If it turns out that you run short of memory, you can
  merge pairs of buckets up to three times, effectively doubling
  NTUP_PER_BUCKET each time.

 Maybe. I'm not against setting NTUP_PER_BUCKET=1, but with large outer
 relations it may be way cheaper to use higher NTUP_PER_BUCKET values
 instead of increasing the number of batches (resulting in repeated scans
 of the outer table). I think it's important (but difficult) to keep these
 things somehow balanced.

 With large work_mem values the amount of memory for buckets may be quite
 significant. E.g. 800MB work_mem may easily give ~120MB of memory taken by
 buckets with NTUP_PER_BUCKET=1. With NTUP_PER_BUCKET=4 it's just ~30MB.



That extra 90MB is memory well spent, in my book.  Versus having to walk a
4-deep linked list which is scattered all over main memory just to find one
entry we care about in it.

It might cause some things that were very close to the edge to tip over
into multi-pass hash joins, but even that is not necessarily a bad thing.
 (When I test with work_mem in the 20 to 50 MB range, I often find
batches=2 to be ~30% faster than batches=1, I think because partitioning into
main memory using sequential writes is not much of a burden, and building
and walking two hash tables that both fit in L3 cache is much faster than
building 1 hash table in main memory, and more than makes up for the work
of partitioning.  Presumably that situation doesn't apply to work_mem >
900MB, but isn't NUMA about the same thing as L4 cache, in effect?).

And if someone had a whole bunch of hash joins which were right in that
anti-sweet spot, all they have to do is increase work_mem by (at most) 15%
to get out of it.  work_mem is basically impossible to tune, so I doubt
anyone exists who has a reasonable setting for it which can't be increased
by 15% and still be reasonable.  And if someone does have it tuned so
tightly, they probably won't be upgrading to new major versions without
expecting to do some re-tuning.

Cheers,

Jeff


Re: [HACKERS] Securing make check (CVE-2014-0067)

2014-07-08 Thread Bruce Momjian
On Tue, Jul  8, 2014 at 08:21:48PM +0200, Christoph Berg wrote:
 Re: Noah Misch 2014-07-08 20140708174125.ga1884...@tornado.leadboat.com
  On Tue, Jul 08, 2014 at 07:02:04PM +0200, Christoph Berg wrote:
   Re: Noah Misch 2014-06-08 20140608135713.ga525...@tornado.leadboat.com
Here's an update that places the socket in a temporary subdirectory of 
/tmp.
The first attached patch adds NetBSD mkdtemp() to libpgport.  The 
second,
principal, patch uses mkdtemp() to implement this design in pg_regress. 
 The
corresponding change to contrib/pg_upgrade/test.sh is based on the 
configure
script's arrangements for its temporary directory.
   
   Hi,
   
   I believe pg_upgrade itself still needs a fix. While it's not a
   security problem to put the socket in $CWD while upgrading (it is
   using -c unix_socket_permissions=0700), this behavior is pretty
   unexpected, and does fail if your $CWD is > 107 bytes.
   
   In f545d233ebce6971b6f9847680e48b679e707d22 Peter fixed the pg_ctl
   perl tests to avoid that problem, so imho it would make even more
   sense to fix pg_upgrade which could also fail in production.
  
  +1.  Does writing that patch interest you?
 
 I'll give it a try once I've finished this CF review.

OK.  Let me know if you need help.

-- 
  Bruce Momjian  br...@momjian.ushttp://momjian.us
  EnterpriseDB http://enterprisedb.com

  + Everyone has their own god. +




Re: [HACKERS] Doing better at HINTing an appropriate column within errorMissingColumn()

2014-07-08 Thread Alvaro Herrera
Peter Geoghegan wrote:

  6) Sometimes no hints are returned... Even in simple cases like this one:
  =# create table foo (aa int, bb int);
  CREATE TABLE
  =# select ab from foo;
  ERROR:  42703: column ab does not exist
  LINE 1: select ab from foo;
 ^
  LOCATION:  errorMissingColumn, parse_relation.c:3123
 
 That's because those two candidates come from a single RTE and have an
 equal distance -- you'd see both suggestions if you joined two tables
 with each candidate, assuming that each table being joined didn't
 individually have the same issue. I think that that's probably
 considered the correct behavior by most.

It seems pretty silly to me actually.  Was this designed by a committee?
I agree with the general principle that showing a large number of
candidates (a la bash) is a bad idea, but failing to show two of them ...

Words fail me.

-- 
Álvaro Herrerahttp://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training  Services




Re: [HACKERS] Doing better at HINTing an appropriate column within errorMissingColumn()

2014-07-08 Thread Peter Geoghegan
On Tue, Jul 8, 2014 at 1:42 PM, Alvaro Herrera alvhe...@2ndquadrant.com wrote:
 That's because those two candidates come from a single RTE and have an
 equal distance -- you'd see both suggestions if you joined two tables
 with each candidate, assuming that each table being joined didn't
 individually have the same issue. I think that that's probably
 considered the correct behavior by most.

 It seems pretty silly to me actually.  Was this designed by a committee?
 I agree with the general principle that showing a large number of
 candidates (a la bash) is a bad idea, but failing to show two of them ...

I guess it was designed by a committee. But we don't fail to show both
because they're equally distant. Rather, it's because they're equally
distant and from the same RTE. This is a contrived example, but
typically showing equally distant columns is useful when they're in a
foreign-key relationship. I was worried about the common case where a
misspelled column name would otherwise be ambiguous, which is
why that case shows a HINT while the single-RTE case doesn't. I think that
in most realistic cases it wouldn't be all that useful to show two
columns from the same table when they're equally distant. It's easy to
imagine that this reflects that no match is good in absolute terms, and
we're somewhat conservative about showing any match. While I think
this general behavior is defensible, I must admit that it did suit me
to write it that way because to do otherwise would have necessitated
more invasive code in the existing general purpose scanRTEForColumn()
function.

-- 
Peter Geoghegan




Re: [HACKERS] Doing better at HINTing an appropriate column within errorMissingColumn()

2014-07-08 Thread Peter Geoghegan
On Tue, Jul 8, 2014 at 1:55 PM, Peter Geoghegan p...@heroku.com wrote:
 I was worried about the common case where a
 column name is misspelled that would otherwise be ambiguous, which is
 why that shows a HINT while the single RTE case doesn't

To be clear - I mean a HINT with two suggestions rather than just one.
If there are 3 or more equally distant suggestions (even if they're
all from different RTEs) we also give no HINT in the proposed patch.

-- 
Peter Geoghegan




Re: [HACKERS] tweaking NTUP_PER_BUCKET

2014-07-08 Thread Tomas Vondra
On 8.7.2014 21:53, Jeff Janes wrote:
 On Tue, Jul 8, 2014 at 6:35 AM, Tomas Vondra t...@fuzzy.cz wrote:
 
 Maybe. I'm not against setting NTUP_PER_BUCKET=1, but with large
 outer relations it may be way cheaper to use higher NTUP_PER_BUCKET
 values instead of increasing the number of batches (resulting in
 repeated scans of the outer table). I think it's important (but
 difficult) to keep these things somehow balanced.
 
 With large work_mem values the amount of memory for buckets may be
 quite significant. E.g. 800MB work_mem may easily give ~120MB of
 memory taken by buckets with NTUP_PER_BUCKET=1. With
 NTUP_PER_BUCKET=4 it's just ~30MB.
 
 
 That extra 90MB is memory well spent, in my book. Versus having to
 walk a 4-deep linked list which is scattered all over main memory
 just to find one entry we care about in it.

Yes, I share this view, although it really depends on how expensive it is
to rescan the outer relation (due to more batches) vs. looking things up in a
deeper hash table.

The other thing is that this memory is currently unaccounted for, i.e.
the space for buckets is not counted within the work_mem limit (unless
I'm missing something in the code). I'm not sure why, and I believe it
should be, so I changed this in the patch.

 It might cause some things that were very close to the edge to tip
 over into multi-pass hash joins, but even that is not necessarily a
 bad thing. (When I test with work_mem in the 20 to 50 MB range, I
 often find batches=2 be ~30% faster than batches=1, I think because 
 partitioning into main memory using sequential writes is not much of
 a burden, and building and walking two hash tables that both fit in
 L3 cache is much faster than building 1 hash table in main memory,
 and more than makes up for the work of partitioning. Presumably that
  situation doesn't apply to work_mem > 900MB, but isn't NUMA about the
 same thing as L4 cache, in effect?).

Yes, I've seen cases where plans with (nbatch > 1) were faster than a plan
with (nbatch=1). I'm not entirely sure why :-( but I have two hypotheses
so far:

(a) Caching within CPU (current CPUs have multiple MBs of L2 cache),
which may make a difference, especially in the size range you've
mentioned.

(b) Lower tuple/bucket ratio - this may easily happen for example if
    the estimates are slightly lower than reality (either row count or
row width) and narrowly exceed work_mem, thus forcing batching.
The resulting hash table has ~50% tuple/bucket on average, and
thus is faster.

 And if someone had a whole bunch of hash joins which were right in
 that anti-sweet spot, all they have to do is increase work_mem by (at
 most) 15% to get out of it. work_mem is basically impossible to tune,
 so I doubt anyone exists who has a reasonable setting for it which
  can't be increased by 15% and still be reasonable. And if someone does
 have it tuned so tightly, they probably won't be upgrading to new
 major versions without expecting to do some re-tuning.

Right. Still, it's not really nice to get slower hash joins after
upgrading to a new version - I somehow expect to get the same or better
performance, if possible. So I'd like to make it as efficient as
possible, within the given memory limit.

Sadly, the increase may be needed anyway because of counting the bucket
memory into work_mem, as mentioned above. If committed, this should be
probably mentioned in release notes or something.

regards
Tomas




Re: [HACKERS] tweaking NTUP_PER_BUCKET

2014-07-08 Thread Tomas Vondra
Hi,

Thinking about this a bit more, do we really need to build the hash
table on the first pass? Why not do this:

(1) batching
- read the tuples, stuff them into a simple list
- don't build the hash table yet

(2) building the hash table
- we have all the tuples in a simple list, batching is done
- we know exact row count, can size the table properly
- build the table

Also, maybe we could use a regular linear hash table [1], instead of
using the current implementation with NTUP_PER_BUCKET=1. (Although,
that'd be absolutely awful with duplicates.)
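
A rough standalone C sketch of that two-phase idea, with made-up structures: stash the tuples in a plain array first, then size the bucket array from the exact count and build the chains in a second pass.

#include <stdio.h>
#include <stdlib.h>
#include <string.h>

typedef struct
{
    unsigned    hash;
    int         payload;
} Tuple;

typedef struct
{
    int        *buckets;        /* index of first tuple in chain, or -1 */
    int        *next;           /* next tuple in chain, or -1 */
    unsigned    nbuckets;       /* power of two */
} HashTable;

static unsigned
next_pow2(unsigned n)
{
    unsigned p = 1;

    while (p < n)
        p <<= 1;
    return p;
}

static void
build_table(HashTable *ht, const Tuple *tuples, int ntuples)
{
    int i;

    /* exact count known, so aim straight for one tuple per bucket */
    ht->nbuckets = next_pow2((unsigned) (ntuples > 0 ? ntuples : 1));
    ht->buckets = malloc(ht->nbuckets * sizeof(int));
    ht->next = malloc(ntuples * sizeof(int));
    memset(ht->buckets, -1, ht->nbuckets * sizeof(int));

    for (i = 0; i < ntuples; i++)
    {
        unsigned b = tuples[i].hash & (ht->nbuckets - 1);

        ht->next[i] = ht->buckets[b];
        ht->buckets[b] = i;
    }
}

int
main(void)
{
    Tuple       tuples[] = {{12345, 1}, {77, 2}, {12345 + 64, 3}};
    HashTable   ht;

    build_table(&ht, tuples, 3);
    printf("built %u buckets for %d tuples\n", ht.nbuckets, 3);
    free(ht.buckets);
    free(ht.next);
    return 0;
}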

regards
Tomas

[1] http://en.wikipedia.org/wiki/Linear_probing




Re: [HACKERS] Allowing join removals for more join types

2014-07-08 Thread Tom Lane
David Rowley dgrowle...@gmail.com writes:
 On Tue, Jul 8, 2014 at 4:28 AM, Tom Lane t...@sss.pgh.pa.us wrote:
 I'm a bit skeptical as to whether testing for that case is actually worth
 any extra complexity.  Do you have a compelling use-case?  But anyway,
 if we do want to allow it, why does it take any more than adding a check
 for Consts to the loops in query_is_distinct_for?  It's the targetlist
 entries where we'd want to allow Consts, not the join conditions.

 I don't really have a compelling use-case, but you're right, it's just a
 Const check in query_is_distinct_for(), it seems simple enough so I've
 included that in my refactor of the patch to use query_is_distinct_for().
 This allows the regression tests all to pass again.

Meh.  "I wrote a regression test that expects it" is a pretty poor
rationale for adding logic.  If you can't point to a real-world case
where this is important, I'm inclined to take it out.

If we were actually serious about exploiting such cases, looking for
bare Consts would be a poor implementation anyhow, not least because
const-folding has not yet been applied to the sub-select.  I think we'd
want to do it for any pseudoconstant expression (no Vars, no volatile
functions); which is a substantially more expensive test.
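
For concreteness, a rough sketch of such a pseudoconstant test on a targetlist entry, using the existing expression-walker helpers; the wrapper name is invented for illustration and this is not code from the patch.

#include "postgres.h"

#include "nodes/primnodes.h"
#include "optimizer/clauses.h"
#include "optimizer/var.h"

/*
 * Sketch only: treat a targetlist entry as pseudoconstant if it contains
 * no Vars and no volatile functions, rather than insisting on a bare Const.
 */
static bool
tle_is_pseudoconstant(TargetEntry *tle)
{
    Node   *expr = (Node *) tle->expr;

    if (IsA(expr, Const))
        return true;            /* cheap fast path */

    return !contain_var_clause(expr) &&
        !contain_volatile_functions(expr);
}

Whether walking every sub-select targetlist expression this way is worth the cost is, of course, exactly the question raised above.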

 1. The fast path code that existed in join_is_removable() for subqueries
 when the subquery had no group or distinct clause is now gone. I wasn't too
 sure that I wanted to assume too much about what query_is_distinct_for may
 do in the future and I thought if I included some logic in
 join_is_removable() to fast path, that one day it may fast path wrongly.

Or put a quick-check subroutine next to query_is_distinct_for().  The code
we're skipping here is not so cheap that I want to blow off skipping it.
On review it looks like analyzejoins.c would possibly benefit from an
earlier fast-path check as well.

 2. The patch I submitted here
 http://www.postgresql.org/message-id/caaphdvrfvkh0p3faoogcckby7fecj9qfanklkx7mwsbcxy2...@mail.gmail.com
 if that gets accepted then it makes the check for set returning functions
 in join_is_removable void.

Right (and done, if you didn't notice already).

TBH I find the checks for FOR UPDATE and volatile functions to be
questionable as well.  We have never considered those things to prevent
pushdown of quals into a subquery (cf subquery_is_pushdown_safe).  I think
what we're talking about here is pretty much equivalent to pushing an
always-false qual into the subquery; if people haven't complained about
that, why should they complain about this?  Or to put it in slightly more
principled terms, we've attempted to prevent subquery optimization from
causing volatile expressions to be evaluated *more* times than the naive
reading of the query would suggest, but we have generally not felt that
we needed to prevent them from happening *fewer* times.

regards, tom lane




Re: [HACKERS] [bug fix or improvement?] Correctly place DLLs for ECPG apps in bin folder

2014-07-08 Thread Asif Naeem
Yes. Can you please take a look at the Win_lib_bin.patch I shared earlier? I
think it, or a similar approach, will be appropriate. Thanks.

Regards,
Muhammad Asif Naeem


On Tue, Jul 8, 2014 at 5:53 PM, MauMau maumau...@gmail.com wrote:

 From: Asif Naeem anaeem...@gmail.com

  Other than my pervious comments, patch looks good to me. Thanks.


 Thanks for reviewing.  I understood that your previous comment was to
 suggest copying the desired DLLs directly from Release/Debug folder to bin.
 But I'm afraid it cannot be done cleanly...  CopySolutionOutput() copies
 all DLLs to the lib folder with no exceptions, as follows:

   if ($1 == 1)
   {
       $dir = "bin";
       $ext = "exe";
   }
   elsif ($1 == 2)
   {
       $dir = "lib";
       $ext = "dll";
   }

 It seems like I have to do something like this, listing the relevant DLL
 names anyway.  I don't think this is any cleaner.

   if ($1 == 1)
   {
       $dir = "bin";
       $ext = "exe";
   }
   elsif ($1 == 2  /* the project is libpq, libecpg, etc. */)
   {
       $dir = "bin";
       $ext = "dll";
   }
   elsif ($1 == 2)
   {
       $dir = "lib";
       $ext = "dll";
   }

 What do you think?  Am I misunderstanding your suggestion?

 Regards
 MauMau




Re: [HACKERS] [bug fix or improvement?] Correctly place DLLs for ECPG apps in bin folder

2014-07-08 Thread Alvaro Herrera
Asif Naeem wrote:
 Yes. Can you please take a look at Win_lib_bin.patch I shared earlier ?, I
 think it is or similar approach will be appropriate. Thanks.

I think the suggestion by Peter Eisentraut upthread was pretty
reasonable -- the Makefiles are already aware that they are building a
shared library, so scrape the info off them.  The less hardcoded stuff
in the msvc framework, the better.

-- 
Álvaro Herrerahttp://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training  Services




Re: [HACKERS] Allowing join removals for more join types

2014-07-08 Thread David Rowley
On 9 July 2014 09:27, Tom Lane t...@sss.pgh.pa.us wrote:

 David Rowley dgrowle...@gmail.com writes:
  On Tue, Jul 8, 2014 at 4:28 AM, Tom Lane t...@sss.pgh.pa.us wrote:
  I'm a bit skeptical as to whether testing for that case is actually
 worth
  any extra complexity.  Do you have a compelling use-case?  But anyway,
  if we do want to allow it, why does it take any more than adding a check
  for Consts to the loops in query_is_distinct_for?  It's the targetlist
  entries where we'd want to allow Consts, not the join conditions.

  I don't really have a compelling use-case, but you're right, it's just a
  Const check in query_is_distinct_for(), it seems simple enough so I've
  included that in my refactor of the patch to use query_is_distinct_for().
  This allows the regression tests all to pass again.

 Meh.  "I wrote a regression test that expects it" is a pretty poor
 rationale for adding logic.  If you can't point to a real-world case
 where this is important, I'm inclined to take it out.


Ok, I'll pull that logic back out when I get home tonight.


 If we were actually serious about exploiting such cases, looking for
 bare Consts would be a poor implementation anyhow, not least because
 const-folding has not yet been applied to the sub-select.  I think we'd
 want to do it for any pseudoconstant expression (no Vars, no volatile
 functions); which is a substantially more expensive test.

  1. The fast path code that existed in join_is_removable() for subqueries
  when the subquery had no group or distinct clause is now gone. I wasn't
 too
  sure that I wanted to assume too much about what query_is_distinct_for
 may
  do in the future and I thought if I included some logic in
  join_is_removable() to fast path, that one day it may fast path wrongly.

 Or put a quick-check subroutine next to query_is_distinct_for().  The code
 we're skipping here is not so cheap that I want to blow off skipping it.


Ok, good idea. I'll craft something up tonight along those lines.


 On review it looks like analyzejoins.c would possibly benefit from an
 earlier fast-path check as well.


Do you mean for non-subqueries? There already is a check to see if the
relation has no indexes.


  2. The patch I submitted here
 
 http://www.postgresql.org/message-id/caaphdvrfvkh0p3faoogcckby7fecj9qfanklkx7mwsbcxy2...@mail.gmail.com
  if that gets accepted then it makes the check for set returning functions
  in join_is_removable void.

 Right (and done, if you didn't notice already).


Thanks, I noticed that this morning. I'll remove the (now) duplicate check
from the patch


 TBH I find the checks for FOR UPDATE and volatile functions to be
 questionable as well.  We have never considered those things to prevent
 pushdown of quals into a subquery (cf subquery_is_pushdown_safe).  I think
 what we're talking about here is pretty much equivalent to pushing an
 always-false qual into the subquery; if people haven't complained about
 that, why should they complain about this?  Or to put it in slightly more
 principled terms, we've attempted to prevent subquery optimization from
 causing volatile expressions to be evaluated *more* times than the naive
 reading of the query would suggest, but we have generally not felt that
 we needed to prevent them from happening *fewer* times.


Well, that's a real tough one for me as I only added that based on what you
told me here:

On 20 May 2014 23:22, Tom Lane t...@sss.pgh.pa.us wrote:


 I doubt you should drop a subquery containing FOR UPDATE, either.
 That's a side effect, just as much as a volatile function would be.

 regards, tom lane



As far as I know the FOR UPDATE check is pretty much void as of now anyway,
since the current state of query_is_distinct_for() demands that there's
either a DISTINCT, GROUP BY or just a plain old aggregate without any
grouping, which will just return a single row, neither of these will allow
FOR UPDATE anyway. I really just added the check just to protect the code
from possible future additions to query_is_distinct_for() which may add
logic to determine uniqueness by some other means.

So the effort here should probably be more focused on whether we should allow
the join removal when the subquery contains volatile functions. We should
probably think fairly hard on this now, as I'm still planning on working on
INNER JOIN removals at some point, and I think we should be consistent
between the two types of join when it comes to FOR UPDATE and volatile
functions. For INNER JOINs, since they're INNER, we could remove either side
of the join; in that case maybe it would be harder for the user to understand
why their volatile function didn't get executed.

Saying that... off the top of my head I can't remember what we'd do in a
case like:

create view v_a as select a,volatilefunc(a) AS funcresult from a;

select a from v_a;

Since we didn't select funcresult, do we execute the function?


Re: [HACKERS] Allowing join removals for more join types

2014-07-08 Thread Tom Lane
David Rowley dgrow...@gmail.com writes:
 On 9 July 2014 09:27, Tom Lane t...@sss.pgh.pa.us wrote:
 On review it looks like analyzejoins.c would possibly benefit from an
 earlier fast-path check as well.

 Do you mean for non-subqueries? There already is a check to see if the
 relation has no indexes.

Oh, sorry, that was a typo: I meant to write pathnode.c.  Specifically,
we could skip the translate_sub_tlist step.  Admittedly that's not
hugely expensive, but as long as we have the infrastructure for a quick
check it might be worth doing.


 TBH I find the checks for FOR UPDATE and volatile functions to be
 questionable as well.

 Well, that's a real tough one for me as I only added that based on what you
 told me here:
 I doubt you should drop a subquery containing FOR UPDATE, either.
 That's a side effect, just as much as a volatile function would be.

Hah ;-).  But the analogy to qual pushdown hadn't occurred to me at the
time.

 As far as I know the FOR UPDATE check is pretty much void as of now anyway,
 since the current state of query_is_distinct_for() demands that there's
 either a DISTINCT, GROUP BY or just a plain old aggregate without any
 grouping, which will just return a single row, neither of these will allow
 FOR UPDATE anyway.

True.

 So the effort here should be probably be more focused on if we should allow
 the join removal when the subquery contains volatile functions. We should
 probably think fairly hard on this now as I'm still planning on working on
 INNER JOIN removals at some point and I'm thinking we should likely be
 consistent between the 2 types of join for when it comes to FOR UPDATE and
 volatile functions, and I'm thinking right now that for INNER JOINs that,
 since they're INNER that we could remove either side of the join. In that
 case maybe it would be harder for the user to understand why their volatile
 function didn't get executed.

Color me dubious.  In exactly what circumstances would it be valid to
suppress an inner join involving a sub-select?

regards, tom lane




Re: [HACKERS] postgresql.auto.conf and reload

2014-07-08 Thread Mark Kirkwood

On 09/07/14 05:13, Josh Berkus wrote:

On 07/06/2014 01:27 AM, Christoph Berg wrote:

Another could be that during initdb all the uncommented settings be

written to postgresql.auto.conf file rather than to postgresql.conf.
I think we can do this by changing code in initdb.c-setup_config().
This will ensure that unless user is changing settings in a mixed way
(some by Alter System and some manually by editing postgresql.conf),
there will no such problem.


There is no reasonable way for us to prevent issues for users who are
manually changing both pg.conf and pg.auto.conf.  Users should stick to
one or the other method of management (or, thirdly, using conf.d); if
they mix methods, we can't prevent confusion at restart time and we
shouldn't try.




Yes, but even well-behaved users will see this type of error, because 
initdb uncomments certain values (ones that are dead certs for being 
changed via ALTER SYSTEM subsequently, like shared_buffers), and then - 
bang! - your next reload gets the "your postgresql.conf contains errors" 
message.


Regards

Mark




[HACKERS] 9.4 pg_control corruption

2014-07-08 Thread Steve Singer
I've encountered a corrupt pg_control  file on my 9.4 development 
cluster.  I've mostly been using the cluster for changeset extraction / 
slony testing.


This is a 9.4 (currently commit 6ad903d70a440e  + a walsender change 
discussed in another thread) but would have had the initdb done with an 
earlier 9.4 snapshot.




/usr/local/pgsql94wal/bin$ ./pg_controldata ../data
WARNING: Calculated CRC checksum does not match value stored in file.
Either the file is corrupt, or it has a different layout than this program
is expecting.  The results below are untrustworthy.

pg_control version number:    937
Catalog version number:   201405111
Database system identifier:   6014096177254975326
Database cluster state:   in production
pg_control last modified: Tue 08 Jul 2014 06:15:58 PM EDT
Latest checkpoint location:   5/44DC5FC8
Prior checkpoint location:5/44C58B88
Latest checkpoint's REDO location:5/44DC5FC8
Latest checkpoint's REDO WAL file:000100050044
Latest checkpoint's TimeLineID:   1
Latest checkpoint's PrevTimeLineID:   1
Latest checkpoint's full_page_writes: on
Latest checkpoint's NextXID:  0/1558590
Latest checkpoint's NextOID:  505898
Latest checkpoint's NextMultiXactId:  3285
Latest checkpoint's NextMultiOffset:  6569
Latest checkpoint's oldestXID:1281
Latest checkpoint's oldestXID's DB:   1
Latest checkpoint's oldestActiveXID:  0
Latest checkpoint's oldestMultiXid:   1
Latest checkpoint's oldestMulti's DB: 1
Time of latest checkpoint:Tue 08 Jul 2014 06:15:23 PM EDT
Fake LSN counter for unlogged rels:   0/1
Minimum recovery ending location: 0/0
Min recovery ending loc's timeline:   0
Backup start location:0/0
Backup end location:  0/0
End-of-backup record required:no
Current wal_level setting:    logical
Current wal_log_hints setting:    off
Current max_connections setting:  200
Current max_worker_processes setting: 8
Current max_prepared_xacts setting:   0
Current max_locks_per_xact setting:   64
Maximum data alignment:   8
Database block size:  8192
Blocks per segment of large relation: 131072
WAL block size:   8192
Bytes per WAL segment:    16777216
Maximum length of identifiers:    64
Maximum columns in an index:  32
Maximum size of a TOAST chunk:    1996
Size of a large-object chunk: 65793
Date/time type storage:   floating-point numbers
Float4 argument passing:  by reference
Float8 argument passing:  by reference
Data page checksum version:   2602751502
ssinger@ssinger-laptop:/usr/local/pgsql94wal/bin$


Before this, postgres crashed and seemed to have problems recovering. I 
might have hit CTRL-C, but I didn't do anything drastic like issue a kill -9.



test1 [unknown] 2014-07-08 18:15:18.986 EDTFATAL:  the database system 
is in recovery mode
test1 [unknown] 2014-07-08 18:15:20.482 EDTWARNING:  terminating 
connection because of crash of another server process
test1 [unknown] 2014-07-08 18:15:20.482 EDTDETAIL:  The postmaster has 
commanded this server process to roll back the current transaction and 
exit, because another server process exited abnormally and possibly 
corrupted shared memory.
test1 [unknown] 2014-07-08 18:15:20.482 EDTHINT:  In a moment you should 
be able to reconnect to the database and repeat your command.
  2014-07-08 18:15:20.483 EDTLOG:  all server processes terminated; 
reinitializing
  2014-07-08 18:15:20.720 EDTLOG:  database system was interrupted; 
last known up at 2014-07-08 18:15:15 EDT
  2014-07-08 18:15:20.865 EDTLOG:  database system was not properly 
shut down; automatic recovery in progress

  2014-07-08 18:15:20.954 EDTLOG:  redo starts at 5/41023848
  2014-07-08 18:15:23.153 EDTLOG:  unexpected pageaddr 4/D8DC6000 in 
log segment 000100050044, offset 14442496

  2014-07-08 18:15:23.153 EDTLOG:  redo done at 5/44DC5F60
  2014-07-08 18:15:23.153 EDTLOG:  last completed transaction was at 
log time 2014-07-08 18:15:17.874937-04
test2 [unknown] 2014-07-08 18:15:24.247 EDTFATAL:  the database system 
is in recovery mode
test2 [unknown] 2014-07-08 18:15:24.772 EDTFATAL:  the database system 
is in recovery mode
test2 [unknown] 2014-07-08 18:15:25.281 EDTFATAL:  the database system 
is in recovery mode
test1 [unknown] 2014-07-08 18:15:25.547 EDTFATAL:  the database system 
is in recovery mode
test2 [unknown] 2014-07-08 18:15:25.548 EDTFATAL:  the database system 
is in recovery mode
test3 [unknown] 2014-07-08 18:15:25.549 EDTFATAL:  the database system 
is in recovery mode
test4 [unknown] 2014-07-08 18:15:25.557 EDTFATAL:  the database system 
is in recovery mode
test5 [unknown] 2014-07-08 18:15:25.582 EDTFATAL:  the database system 
is in recovery mode
test2 [unknown] 2014-07-08 18:15:25.584 EDTFATAL:  the database system 
is in 

Re: [HACKERS] 9.4 pg_control corruption

2014-07-08 Thread Tom Lane
Steve Singer st...@ssinger.info writes:
 I've encountered a corrupt pg_control  file on my 9.4 development 
 cluster.  I've mostly been using the cluster for changeset extraction / 
 slony testing.

 This is a 9.4 (currently commit 6ad903d70a440e  + a walsender change 
 discussed in another thread) but would have had the initdb done with an 
 earlier 9.4 snapshot.

Somehow or other you missed the update to pg_control version number 942.
There's no obvious reason to think that this pg_control file is corrupt
on its own terms, but the pg_controldata version you're using expects
the 942 layout.  The fact that the server wasn't complaining about this
suggests that you've not recompiled the server, or at least not xlog.c.
Possibly the odd failure to restart indicates that you have a partially
updated server executable?
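
For readers following the thread, here is a simplified, hypothetical sketch of
the kind of check involved (it is not the actual xlog.c or pg_controldata
source; the struct and function names are invented): a binary built for one
pg_control layout cannot trust a file stamped with another version, which is
why the CRC warning quoted above is expected here rather than proof of real
corruption.

    /* Hypothetical sketch only, not PostgreSQL source code. */
    #include <stdint.h>
    #include <stdio.h>

    #define PG_CONTROL_VERSION 942      /* layout this binary was built for */

    typedef struct MiniControlFile
    {
        uint32_t    pg_control_version; /* layout version stamped by initdb */
        /* ... many more fields, then a CRC computed over all of them ... */
    } MiniControlFile;

    static int
    control_file_usable(const MiniControlFile *cf)
    {
        if (cf->pg_control_version != PG_CONTROL_VERSION)
        {
            fprintf(stderr,
                    "cluster initialized with pg_control version %u, "
                    "but this binary expects %u\n",
                    (unsigned int) cf->pg_control_version,
                    (unsigned int) PG_CONTROL_VERSION);
            return 0;
        }
        return 1;
    }

    int
    main(void)
    {
        MiniControlFile cf = {937};     /* what Steve's cluster reports */

        return control_file_usable(&cf) ? 0 : 1;
    }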

regards, tom lane




Re: [HACKERS] 9.4 pg_control corruption

2014-07-08 Thread Steve Singer

On 07/08/2014 10:14 PM, Tom Lane wrote:

Steve Singer st...@ssinger.info writes:

I've encountered a corrupt pg_control  file on my 9.4 development
cluster.  I've mostly been using the cluster for changeset extraction /
slony testing.
This is a 9.4 (currently commit 6ad903d70a440e  + a walsender change
discussed in another thread) but would have had the initdb done with an
earlier 9.4 snapshot.

Somehow or other you missed the update to pg_control version number 942.
There's no obvious reason to think that this pg_control file is corrupt
on its own terms, but the pg_controldata version you're using expects
the 942 layout.  The fact that the server wasn't complaining about this
suggests that you've not recompiled the server, or at least not xlog.c.
Possibly the odd failure to restart indicates that you have a partially
updated server executable?



The server is complaining about that; it started to after the crash
(which is why I ran pg_controldata).


ssinger@ssinger-laptop:/usr/local/pgsql94wal/bin$ ./postgres -D ../data
  2014-07-08 22:28:57.796 EDTFATAL:  database files are incompatible 
with server
  2014-07-08 22:28:57.796 EDTDETAIL:  The database cluster was 
initialized with PG_CONTROL_VERSION 937, but the server was compiled 
with PG_CONTROL_VERSION 942.

  2014-07-08 22:28:57.796 EDTHINT:  It looks like you need to initdb.
ssinger@ssinger-laptop:/usr/local/pgsql94wal/bin$


The server seemed fine (and it was 9.4, because I was using 9.4 features).
The server crashed.
The server performed crash recovery.
The server wouldn't start, and pg_controldata shows the attached output.


I wasn't recompiling or reinstalling around this time either.




regards, tom lane








Re: [HACKERS] 9.4 pg_control corruption

2014-07-08 Thread Tom Lane
Steve Singer st...@ssinger.info writes:
 On 07/08/2014 10:14 PM, Tom Lane wrote:
 There's no obvious reason to think that this pg_control file is corrupt
 on its own terms, but the pg_controldata version you're using expects
 the 942 layout.  The fact that the server wasn't complaining about this
 suggests that you've not recompiled the server, or at least not xlog.c.

 The server  is complaining about that, it started to after the crash 

Then you updated your sources, recompiled and reinstalled, but failed to
restart the server when you did that.  Else it would have complained on
the spot.

If you had any valuable data in the installation, we could talk about how
to get it out; but since you didn't I'd suggest just re-initdb and move
on.  I don't see anything unexpected here.

regards, tom lane




Re: [HACKERS] postgresql.auto.conf and reload

2014-07-08 Thread Amit Kapila
On Tue, Jul 8, 2014 at 11:59 PM, Tom Lane t...@sss.pgh.pa.us wrote:

 No, ALTER SYSTEM is there now and it needs to work right in its first
 release.  I will go fix this if nobody else does.

I am planning to provide an initial patch for this issue in a day or so; hope
that is not too late.

With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com


Re: [HACKERS] postgresql.auto.conf and reload

2014-07-08 Thread Amit Kapila
On Wed, Jul 9, 2014 at 6:40 AM, Mark Kirkwood mark.kirkw...@catalyst.net.nz
wrote:
 On 09/07/14 05:13, Josh Berkus wrote:
 Another could be that during initdb all the uncommented settings are
 written to the postgresql.auto.conf file rather than to postgresql.conf.
 I think we can do this by changing the code in setup_config() in initdb.c.
 This will ensure that unless the user is changing settings in a mixed way
 (some via ALTER SYSTEM and some manually by editing postgresql.conf),
 there will be no such problem.


 There is no reasonable way for us to prevent issues for users who are
 manually changing both pg.conf and pg.auto.conf.  Users should stick to
 one or the other method of management (or, thirdly, using conf.d); if
 they mix methods, we can't prevent confusion at restart time and we
 shouldn't try.

 Yes, but even well-behaved users will see this type of error, because
initdb uncomments certain values (ones that are dead certs for being
changed via ALTER SYSTEM subsequently, like shared_buffers), and then -
bang! - your next reload gets the "your postgresql.conf contains errors"
message.

That is the reason why I suggested up-thread that uncommented
values should go to postgresql.auto.conf; that will avoid any such
problems for a well-behaved user.
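
Purely to make that idea concrete, here is a standalone, hypothetical sketch
(it is not code from initdb.c, and the helper name is invented): the change
would amount to appending the initdb-computed settings to postgresql.auto.conf
instead of uncommenting them in postgresql.conf.

    /* Hypothetical illustration only, not initdb.c. */
    #include <stdio.h>

    /* postgresql.auto.conf entries are simple  name = 'value'  lines */
    static void
    append_auto_conf_setting(FILE *autoconf, const char *name, const char *value)
    {
        fprintf(autoconf, "%s = '%s'\n", name, value);
    }

    int
    main(void)
    {
        FILE       *autoconf = fopen("postgresql.auto.conf", "a");

        if (autoconf == NULL)
            return 1;

        append_auto_conf_setting(autoconf, "max_connections", "100");
        append_auto_conf_setting(autoconf, "shared_buffers", "128MB");
        fclose(autoconf);
        return 0;
    }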

With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com


Re: [HACKERS] postgresql.auto.conf and reload

2014-07-08 Thread Tom Lane
Amit Kapila amit.kapil...@gmail.com writes:
 On Wed, Jul 9, 2014 at 6:40 AM, Mark Kirkwood mark.kirkw...@catalyst.net.nz
 wrote:
 Yes, but even well-behaved users will see this type of error, because
 initdb uncomments certain values (ones that are dead certs for being
 changed via ALTER SYSTEM subsequently, like shared_buffers), and then -
 bang! - your next reload gets the "your postgresql.conf contains errors"
 message.

 That is the reason why I suggested up-thread that uncommented
 values should go to postgresql.auto.conf; that will avoid any such
 problems for a well-behaved user.

Uh, what?  That sounds like you are proposing that postgresql.conf itself
is a dead letter.  Which is not going to fly.  We had that conversation
already.

The right way to fix this is just to avoid processing entries that get
overridden later in the configuration file scan.  That won't cause anyone
to get upset about how their old habits no longer work.
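
As a rough illustration of that approach (plain standalone C, not the actual
GUC machinery; the struct and function names are invented): scan the entries
in the order the files are read, and only validate and apply the last
occurrence of each name, so a postgresql.conf entry that postgresql.auto.conf
later overrides is never applied or reported as a problem at all.

    /* Illustration only, not ProcessConfigFile(). */
    #include <stdio.h>
    #include <string.h>

    typedef struct ConfigEntry
    {
        const char *name;
        const char *value;
    } ConfigEntry;

    /* Apply only the last setting of each name, in file-read order. */
    static void
    apply_last_occurrences(const ConfigEntry *entries, int n)
    {
        for (int i = 0; i < n; i++)
        {
            int         is_last = 1;

            /* skip this entry if the same name appears again later */
            for (int j = i + 1; j < n; j++)
            {
                if (strcmp(entries[i].name, entries[j].name) == 0)
                {
                    is_last = 0;
                    break;
                }
            }
            if (is_last)
                printf("applying %s = %s\n", entries[i].name, entries[i].value);
        }
    }

    int
    main(void)
    {
        ConfigEntry entries[] = {
            {"shared_buffers", "128MB"},    /* from postgresql.conf */
            {"work_mem", "4MB"},            /* from postgresql.conf */
            {"shared_buffers", "1GB"},      /* from postgresql.auto.conf */
        };

        apply_last_occurrences(entries, 3);
        return 0;
    }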

regards, tom lane




Re: [HACKERS] psql: show only failed queries

2014-07-08 Thread Fujii Masao
On Mon, Jun 30, 2014 at 8:33 PM, Pavel Stehule pavel.steh...@gmail.com wrote:



 2014-06-30 13:01 GMT+02:00 Abhijit Menon-Sen a...@2ndquadrant.com:

 At 2014-06-30 12:48:30 +0200, pavel.steh...@gmail.com wrote:
 
  +  <para>
  +  Print a failed SQL commands to standard error output. This is
  +  equivalent to setting the variable <varname>ECHO</varname> to
  +  <literal>errors</literal>.

 No "a", just "Print failed SQL commands …".

  -<option>-e</option>.
  +<option>-e</option>. If set to <literal>error</literal> then
  only
  +failed queries are displayed.

 Should be "errors" here, not "error".

   printf(_("  -a, --echo-all           echo all input from script\n"));
 + printf(_("  -b  --echo-errors        echo failed commands sent to server\n"));
   printf(_("  -e, --echo-queries       echo commands sent to server\n"));

 Should have a comma after "-b" to match other options. Also I would remove
 "sent to server" from the description: "echo failed commands" is fine.


 fixed

$ psql -b
bin/psql: invalid option -- 'b'
Try "psql --help" for more information.

I got this error. ISTM you forgot to add 'b' into the third argument of
getopt_long in startup.c.
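
For anyone following along, here is a minimal standalone getopt_long() example
(not psql's actual startup.c; the option letters and output are illustrative)
showing why the short option has to appear in the optstring as well as in the
long-option table: leaving 'b' out of the optstring is exactly what makes the
bare -b form fail.

    /* Illustration only, not psql's startup.c. */
    #include <getopt.h>
    #include <stdio.h>

    int
    main(int argc, char *argv[])
    {
        static struct option long_options[] = {
            {"echo-all", no_argument, NULL, 'a'},
            {"echo-errors", no_argument, NULL, 'b'},
            {"echo-queries", no_argument, NULL, 'e'},
            {NULL, 0, NULL, 0}
        };
        int         c;

        /* note the 'b' here: without it, "prog -b" is an invalid option */
        while ((c = getopt_long(argc, argv, "abe", long_options, NULL)) != -1)
        {
            switch (c)
            {
                case 'a':
                    printf("ECHO=all\n");
                    break;
                case 'b':
                    printf("ECHO=errors\n");
                    break;
                case 'e':
                    printf("ECHO=queries\n");
                    break;
                default:
                    return 1;
            }
        }
        return 0;
    }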

 <application>psql</application> merely prints all queries as
 they are sent to the server. The switch for this is
-<option>-e</option>.
+<option>-e</option>. If set to <literal>errors</literal> then only
+failed queries are displayed.

I think that where failed queries are output should be documented here.
Otherwise users might misunderstand that they are output to standard output,
as ECHO=all and ECHO=queries are.

It's better to add "The switch for this is <option>-b</option>." to the doc.

+else if (strcmp(prev2_wd, "\\set") == 0)
+{
+    if (strcmp(prev_wd, "ECHO") == 0)
+    {
+        static const char *const my_list[] =
+        {"none", "errors", "queries", "all", NULL};
+
+        COMPLETE_WITH_LIST_CS(my_list);
+    }
+    else if (strcmp(prev_wd, "ECHO_HIDDEN") == 0)
+    {
+        static const char *const my_list[] =
+        {"noexec", "off", "on", NULL};
+
+        COMPLETE_WITH_LIST_CS(my_list);
+    }
+}

I think that adding tab-completion of psql variables is good, but
adding it for only ECHO and ECHO_HIDDEN seems half-baked.
Probably this part should be split into a separate patch.

Regards,

-- 
Fujii Masao




Re: [HACKERS] Extending constraint exclusion for implied constraints/conditions

2014-07-08 Thread Ashutosh Bapat
On Tue, Jul 8, 2014 at 7:57 PM, Tom Lane t...@sss.pgh.pa.us wrote:

 Ashutosh Bapat ashutosh.ba...@enterprisedb.com writes:
  On Mon, Jul 7, 2014 at 7:37 PM, Tom Lane t...@sss.pgh.pa.us wrote:
  I doubt it.  The extra code isn't the problem so much, it's the extra
  planning cycles spent trying to make proofs that will usually fail.
  What you propose will create a combinatorial explosion in the number
  of proof paths to be considered.

  I can understand that it will create a combinatorial explosion in the
  number of predicates that need to be examined by constraint exclusion. I do
  not understand where paths get involved here. Constraint
  exclusion kicks in before paths are created.

 Perhaps I should not have used the term path here, because I wasn't
 referring to what the planner calls Paths.  I just meant that there will
 be many more ways to perhaps prove a constraint-exclusion result, and the
 planner will have to go through them all.

 (Usually to no avail, because how often do people write queries that are
 guaranteed to produce no rows?)

 An example of what I meant by combinatorial explosion is that if a clause
 mentions K variables, and each of those variables has been equated to M
 other variables, there are (M+1)^K possible derived clauses, and it looks
 to me like we'd have to consider them all to catch the sort of situation
 you're talking about.


I agree.



 I think this is going to require a *lot* of additional planner cycles,
 and TBH you've not provided any very compelling example of why it's
 worthwhile.  Yeah, if you can exclude a large partition it's a win,
 but how often is that going to happen in real-world queries?  The one
 example you gave seemed pretty artificial to me, ie unrelated to typical
 partitioning constraints.


Yes, the example is a cooked one to show the problem, but the case can
easily be encountered with partitioned tables. A partitioned table (having
constraints on each partition) with a few table-level constraints can
receive queries whose WHERE clauses yield empty results for one or more
partitions. Saving some table scans would be worth the time spent in
bringing up the (M+1)^K derived clauses and examining them. Greg's example
about parallel joins, or joins between partitioned foreign tables, would
yield much better results if we had this optimization. But I think I will
wait till parallel joins or partitioned foreign tables are a reality in
PostgreSQL.
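
To put a purely illustrative number on the cost being weighed here (the values
of K and M below are hypothetical, not measurements): a WHERE clause touching
K = 3 columns, each equated to M = 4 other columns, already gives

    (M + 1)^K = (4 + 1)^3 = 125

derived clauses to test against the partition constraints, and that count
grows geometrically as either K or M increases.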


 regards, tom lane




-- 
Best Wishes,
Ashutosh Bapat
EnterpriseDB Corporation
The Postgres Database Company