Re: [HACKERS] proposal: plpgsql - Assert statement

2015-03-25 Thread Pavel Stehule
2015-03-25 0:17 GMT+01:00 Tom Lane t...@sss.pgh.pa.us:

 Pavel Stehule pavel.steh...@gmail.com writes:
  updated version with Jim Nasby's doc and rebase against last changes in
  plpgsql.

 I started looking at this patch.  ISTM there are some pretty questionable
 design decisions in it:

 1. Why create a core GUC to control a behavior that's plpgsql-only?
 I think it'd make more sense for this to be a plpgsql custom GUC
 (ie, plpgsql.enable_asserts or some such name).


This type of assertion can be implemented in any PL language, so I prefer a
global setting. But I don't have a strong opinion in this case; it is a
question of granularity, and more than one approach is valid.



 2. I find the use of errdetail to report whether the assertion condition
 evaluated to FALSE or to NULL to be pretty useless.  (BTW, is considering
 NULL to be a failure the right thing?  SQL CHECK conditions consider NULL
 to be allowed ...)


This is a question. I am happy with SQL CHECK for data, but I am not sure the
same behavior is safe for a plpgsql (procedural) assert. Stricter behavior is
safer, and some bugs in procedures are caused by unhandled NULLs in
variables. So on this topic I prefer the implemented behavior. It is
something like:

IF expression THEN
    -- everything is fine, do the work
    ...
ELSE
    RAISE EXCEPTION 'something is wrong';
END IF;



 I also don't care at all for reporting the internal
 text of the assertion expression in the errdetail: that will expose
 implementation details (ie, exactly what does plpgsql convert the user
 expression to, does it remove comments, etc) which we will then be
 constrained from changing for fear of breaking client code that expects a
 particular spelling of the condition.  I think we should just drop that
 whole business.  The user can report the condition in her message, if she
 feels the need to.


+1


 3. If we drop the errdetail as per #2, then reporting the optional
 user-supplied string as a HINT would be just plain bizarre; not that it
 wasn't bizarre already, because there's no good reason to suppose that
 whatever the programmer has to say about the assertion is merely a hint.
 I also find the behavior for string-evaluates-to-NULL bizarre; it'd be
 saner just to leave out the message field, same as if the argument weren't
 there.  I would suggest that we adopt one of these two definitions for the
 optional string:

 3a. If string is present and not null, use it as the primary message text
 (otherwise use "assertion failed").

 3b. If string is present and not null, use it as errdetail, with the
 primary message text always being "assertion failed".

 I mildly prefer #3a, but could be talked into #3b.


I prefer #3b; it provides more information.

Regards

Pavel



 Comments?

 regards, tom lane



Re: [HACKERS] Custom/Foreign-Join-APIs

2015-03-25 Thread Kyotaro HORIGUCHI
Hello, I had a look on this.

At Wed, 25 Mar 2015 03:59:28 +, Kouhei Kaigai kai...@ak.jp.nec.com wrote 
in 9a28c8860f777e439aa12e8aea7694f8010c6...@bpxm15gp.gisp.nec.co.jp
   At this moment, I'm not 100% certain about its logic. Especially, I didn't
   test SEMI- and ANTI- join cases yet.
   However, time is money - I want people to check overall design first,
   rather than detailed debugging. Please tell me if I misunderstood the
   logic to break down join relations.
  
  With your patch applied, the regression tests for “updatable view” failed.
  regression.diff contains some errors like this:
  ! ERROR:  could not find RelOptInfo for given relids
  
  Could you check that?
 
 It is a bug around the logic to find out two RelOptInfo that can construct
 another RelOptInfo of joinrel.

It is caused by a split (or multilevel) joinlist. Setting
join_collapse_limit to 10 makes the query work.

I suppose that get_joinrel_broken_down should give up returning a
result when the given joinrel spans multiple join subproblems,
because they cannot be merged by the FDW anyway, even if they
conform to the basic requirements for merging.

 Even though I'm now working to correct the logic, it is not obvious to
 identify two relids that satisfy joinrel->relids.
 (Yep, law of entropy enhancement...)
 
 On the other hand, we may have a solution that does not need a complicated
 reconstruction process. The original concern was that an FDW driver may add
 paths that replace the entire join subtree with a foreign scan on a remote
 join multiple times, repeatedly, but these paths shall be identical.
 
 If we put a hook for FDW/CSP at the bottom of build_join_rel(), we may be
 able to solve the problem in a more straightforward and simple way.
 Because build_join_rel() finds a cache in root->join_rel_hash and returns
 immediately if the required joinrelids already has its RelOptInfo, the
 bottom of this function is never called twice for a particular set of
 joinrelids. Once FDW/CSP constructs a path that replaces the entire join
 subtree towards the joinrel just after construction, it shall be kept until
 cheaper built-in paths are added (if any exist).
 
 This idea has one other positive side-effect. We expect a remote join to be
 cheaper than a local join of two remote scans in most cases. Once a much
 cheaper path is added prior to local join consideration,
 add_path_precheck() stops path consideration earlier.

+1 as a whole.

regards,

-- 
Kyotaro Horiguchi

NTT Open Source Software Center, Nippon Telegraph and Telephone Corporation
Phone: 03-5860-5115 / Fax: 03-5463-5490



-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] printing table in asciidoc with psql

2015-03-25 Thread Bruce Momjian
On Wed, Mar 25, 2015 at 02:18:58PM +0900, Michael Paquier wrote:
  [options="header",cols="l,l",frame="none"]
  |
  |5 2.2+^.^ |4 2.2+^.^
  |2 2.2+^.^ |3 2.2+^.^
  |
 
 Hm. This is still incorrect. You should remove options=header here,
 or the first tuple is treated as a header in the
 non-expanded/tuple-only case. Your patch correctly removes the header
 for the expanded/tuple-only case, though.
 Regards,

OK, fixed.  Thanks for the testing.  Patch attached.  New output:

---

test=> \pset format asciidoc
Output format is asciidoc.
test=> \t
Tuples only is on.
test=> table "5 2.2+^.^";

[cols="l,l",frame="none"]
|
|5 2.2+^.^ |4 2.2+^.^
|2 2.2+^.^ |3 2.2+^.^
|

-- 
  Bruce Momjian  br...@momjian.us  http://momjian.us
  EnterpriseDB http://enterprisedb.com

  + Everyone has their own god. +
diff --git a/doc/src/sgml/ref/psql-ref.sgml b/doc/src/sgml/ref/psql-ref.sgml
new file mode 100644
index a637001..82a91ec
*** a/doc/src/sgml/ref/psql-ref.sgml
--- b/doc/src/sgml/ref/psql-ref.sgml
*** lo_import 152801
*** 2092,2099 
    <literal>aligned</literal>, <literal>wrapped</literal>,
    <literal>html</literal>,
    <literal>latex</literal> (uses <literal>tabular</literal>),
!   <literal>latex-longtable</literal>, or
!   <literal>troff-ms</literal>.
    Unique abbreviations are allowed.  (That would mean one letter
    is enough.)
    </para>
--- 2092,2099 
    <literal>aligned</literal>, <literal>wrapped</literal>,
    <literal>html</literal>,
    <literal>latex</literal> (uses <literal>tabular</literal>),
!   <literal>latex-longtable</literal>,
!   <literal>troff-ms</literal>, or <literal>asciidoc</literal>.
    Unique abbreviations are allowed.  (That would mean one letter
    is enough.)
    </para>
*** lo_import 152801
*** 2120,2126 
  
    <para>
    The <literal>html</>, <literal>latex</>,
!   <literal>latex-longtable</literal>, and <literal>troff-ms</>
formats put out tables that are intended to
be included in documents using the respective mark-up
language. They are not complete documents! This might not be
--- 2120,2127 
  
    <para>
    The <literal>html</>, <literal>latex</>,
!   <literal>latex-longtable</literal>, <literal>troff-ms</>,
!   and <literal>asciidoc</>
formats put out tables that are intended to
be included in documents using the respective mark-up
language. They are not complete documents! This might not be
diff --git a/src/bin/psql/command.c b/src/bin/psql/command.c
new file mode 100644
index 7c9f28d..a96f0ef
*** a/src/bin/psql/command.c
--- b/src/bin/psql/command.c
*** _align2string(enum printFormat in)
*** 2257,2262 
--- 2257,2265 
  		case PRINT_TROFF_MS:
  			return "troff-ms";
  			break;
+ 		case PRINT_ASCIIDOC:
+ 			return "asciidoc";
+ 			break;
  	}
  	return "unknown";
  }
*** do_pset(const char *param, const char *v
*** 2330,2338 
  			popt->topt.format = PRINT_LATEX_LONGTABLE;
  		else if (pg_strncasecmp("troff-ms", value, vallen) == 0)
  			popt->topt.format = PRINT_TROFF_MS;
  		else
  		{
! 			psql_error("\\pset: allowed formats are unaligned, aligned, wrapped, html, latex, troff-ms\n");
  			return false;
  		}
  
--- 2333,2343 
  			popt->topt.format = PRINT_LATEX_LONGTABLE;
  		else if (pg_strncasecmp("troff-ms", value, vallen) == 0)
  			popt->topt.format = PRINT_TROFF_MS;
+ 		else if (pg_strncasecmp("asciidoc", value, vallen) == 0)
+ 			popt->topt.format = PRINT_ASCIIDOC;
  		else
  		{
! 			psql_error("\\pset: allowed formats are unaligned, aligned, wrapped, html, latex, troff-ms, asciidoc\n");
  			return false;
  		}
  
diff --git a/src/bin/psql/help.c b/src/bin/psql/help.c
new file mode 100644
index ac0dc27..93a517e
*** a/src/bin/psql/help.c
--- b/src/bin/psql/help.c
*** helpVariables(unsigned short int pager)
*** 351,357 
  	fprintf(output, _("  expanded (or x)  toggle expanded output\n"));
  	fprintf(output, _("  fieldsep   field separator for unaligned output (default '|')\n"));
  	fprintf(output, _("  fieldsep_zero  set field separator in unaligned mode to zero\n"));
! 	fprintf(output, _("  format set output format [unaligned, aligned, wrapped, html, latex, ..]\n"));
  	fprintf(output, _("  footer enable or disable display of the table footer [on, off]\n"));
  	fprintf(output, _("  linestyle  set the border line drawing style [ascii, old-ascii, unicode]\n"));
  	fprintf(output, _("  null   set the string to be printed in place of a null value\n"));
--- 351,357 
  	fprintf(output, _("  expanded (or x)  toggle expanded output\n"));
  	fprintf(output, _("  fieldsep   field separator for unaligned output (default '|')\n"));
  	fprintf(output, _("  fieldsep_zero  set field 

Re: [HACKERS] recovery_target_time ignored ?

2015-03-25 Thread Venkata Balaji N
On Wed, Mar 25, 2015 at 1:28 AM, David Steele da...@pgmasters.net wrote:

 On 3/24/15 6:12 AM, Venkata Balaji N wrote:
 
  On Tue, Mar 24, 2015 at 9:54 AM, David Steele da...@pgmasters.net
  mailto:da...@pgmasters.net wrote:
 
  On 3/23/15 12:42 AM, Venkata Balaji N wrote:
   Hi,
  
    Assuming that this might require a patch, I am posting this in
    pgsql-hackers. Apologies if this is not the appropriate mailing list to
    start this discussion.
   
    I performed a PITR, and the below message in the log file is a bit
    confusing.
  
    2015-03-23 13:49:09.816 GMT-10 DB= PID=4707 LOG: *database system was
    interrupted; last known up at 2015-03-23 10:30:26 GMT-10*
    2015-03-23 13:49:09.817 GMT-10 DB= PID=4707 LOG: *starting point-in-time
    recovery to 2015-03-23 10:00:26+10*
    2015-03-23 13:49:09.827 GMT-10 DB= PID=4707 LOG:  restored log file
    0001000B0020 from archive
    2015-03-23 13:49:09.888 GMT-10 DB= PID=4707 LOG:  redo starts at B/2090
    2015-03-23 13:49:09.937 GMT-10 DB= PID=4707 LOG:  consistent recovery
    state reached at B/20B8
    2015-03-23 13:49:09.947 GMT-10 DB= PID=4707 LOG:  restored log file
    0001000B0021 from archive
    2015-03-23 13:49:09.950 GMT-10 DB= PID=4707 LOG:  *recovery stopping
    before commit of transaction 16267, time 2015-03-23 13:22:37.53007+10*
  
  
    By mistake I gave recovery_target_time as 10:00 GMT, which is 25/30
    minutes behind the backup start/end time registered in the backup_label.
   
    The parameter recovery_target_time is ignored and recovery proceeds
    further, applying all the available WAL archive files, finally ending up
    bringing up the database.
   
    I think it would make sense if the recovery did not proceed any further
    and errored out with a message like "recovery_target_time is behind the
    backup time; please consider using a backup taken prior to the
    recovery_target_time".
 
   I just tried it with 9.3.5 and I do get an error:
  
   LOG:  starting point-in-time recovery to 2015-03-23 17:26:02.721307-04
  LOG:  restored log file 00010003 from archive
  LOG:  redo starts at 0/3C8
  LOG:  recovery stopping before commit of transaction 1001, time
  2015-03-23 18:26:01.012593-04
  LOG:  redo done at 0/3000228
  FATAL:  requested recovery stop point is before consistent recovery
  point
 
 
   That makes more sense. This is what I was expecting to happen. Then I
   think it is something to do with the timestamp format.
 
 
  Here's my recovery.conf file:
 
   restore_command = '/usr/bin/pg_backrest.pl --stanza=db archive-get %f %p'
   recovery_target_time = '2015-03-23 17:26:02.721307 EDT'
 
   recovery.conf file is as follows :
  
    restore_command='cp /data/pgdata9400backup/pgwalarch9400backup/%f %p'
   recovery_target_time='2015-03-23 10:00:26 GMT-10'
   recovery_target_inclusive='true'
 
  You have '2015-03-23 10:00:26 GMT-10' in recovery.conf but the log says
  'starting point-in-time recovery to 2015-03-23 10:00:26+10'.  Note the -
  vs +.
 
 
  This is my confusion too. I picked up the time format from the backup
  label.
 
  Could you check your log and recovery.conf and make sure the timezone
  offsets are actually different?
 
 
   I am not sure why the timestamp is taken as 2015-03-23 10:00:26+10
   for '2015-03-23 10:00:26 GMT-10'.
  
   My other system's timestamp format is also AEDT. I did another test
   and I get the same problem.
  
   Below is the text from the log file. I gave a recovery_target_time
   almost a day behind the consistent recovery point. Still, the
   recovery_target_time is taken as *2015-03-23 11:07:10+11* for
   *2015-03-23 11:07:10 AEDT*
 
  2015-03-24 18:42:44.608 AEDT LOG:  database system was interrupted;
  last known up at *2015-03-24 18:20:53 AEDT*
 
  2015-03-24 18:42:44.608 AEDT LOG:  starting point-in-time recovery
  to *2015-03-23 11:07:10+11*
 
  cp: /disk3/pgwalarch9401/0001000300FE: No such file or
  directory
 
  2015-03-24 18:42:44.626 AEDT LOG:  record with zero length at
 3/FE90
 
  Below is my recovery.conf file
 
  restore_command='cp /disk3/pgwalarch9401/%f %p'
 
  recovery_target_time='*2015-03-23 11:07:10 AEDT*'
 
  recovery_target_inclusive=true
 
   I am checking if this has something to do with my system timestamp
   format.
  
   Not sure what I am missing. Do I need to give any special time format?

  This is a weird one.  I can reproduce the problem by setting
  timezone='Australia/Sydney'.  I also tried setting
  log_timezone='Australia/Sydney' and my system clock to
  'Australia/Sydney' but I still saw the same issue.

 I'm testing this using unit tests for some backup software I'm 

Re: Custom/Foreign-Join-APIs (Re: [HACKERS] [v9.5] Custom Plan API)

2015-03-25 Thread Shigeru HANADA

On 2015/03/25 12:59, Kouhei Kaigai kai...@ak.jp.nec.com wrote:

 At this moment, I'm not 100% certain about its logic. Especially, I didn't
 test SEMI- and ANTI- join cases yet.
 However, time is money - I want people to check overall design first,
 rather than detailed debugging. Please tell me if I misunderstood the
 logic to break down join relations.
 
 With your patch applied, the regression tests for “updatable view” failed.
 regression.diff contains some errors like this:
 ! ERROR:  could not find RelOptInfo for given relids
 
 Could you check that?
 
 It is a bug around the logic to find out two RelOptInfo that can construct
 another RelOptInfo of joinrel.
 Even though I'm now working to correct the logic, it is not obvious to
 identify two relids that satisfy joinrel->relids.
 (Yep, law of entropy enhancement…)

IIUC, this problem affects only non-INNER JOINs, because relations joined
only with INNER JOIN can be treated in arbitrary order.  But supporting
OUTER JOINs would be necessary even for the first cut.

 On the other hand, we may have a solution that does not need a complicated
 reconstruction process. The original concern was that an FDW driver may add
 paths that replace the entire join subtree with a foreign scan on a remote
 join multiple times, repeatedly, but these paths shall be identical.
 
 If we put a hook for FDW/CSP at the bottom of build_join_rel(), we may be
 able to solve the problem in a more straightforward and simple way.
 Because build_join_rel() finds a cache in root->join_rel_hash and returns
 immediately if the required joinrelids already has its RelOptInfo, the
 bottom of this function is never called twice for a particular set of
 joinrelids. Once FDW/CSP constructs a path that replaces the entire join
 subtree towards the joinrel just after construction, it shall be kept until
 cheaper built-in paths are added (if any exist).
 
 This idea has one other positive side-effect. We expect a remote join to be
 cheaper than a local join of two remote scans in most cases. Once a much
 cheaper path is added prior to local join consideration,
 add_path_precheck() stops path consideration earlier.
 
 Please comment on.

Or the bottom of make_join_rel().  IMO build_join_rel() is responsible for
just building (or searching from a list) a RelOptInfo for the given relids.
After that, make_join_rel() calls add_paths_to_joinrel() with appropriate
arguments per join type to generate the actual Paths that implement the join.
make_join_rel() is called only once for a particular relid combination, and
there the SpecialJoinInfo and restrictlist (conditions specified in JOIN-ON
and WHERE) are available, so it seems promising for FDW cases.

Though I’m not sure that it also fits custom join provider’s requirements.

—
Shigeru HANADA



Re: Custom/Foreign-Join-APIs (Re: [HACKERS] [v9.5] Custom Plan API)

2015-03-25 Thread Ashutosh Bapat
On Wed, Mar 25, 2015 at 3:14 PM, Shigeru HANADA shigeru.han...@gmail.com
wrote:


 Or bottom of make_join_rel().  IMO build_join_rel() is responsible for
 just building (or searching from a list) a RelOptInfo for given relids.
 After that make_join_rel() calls add_paths_to_joinrel() with appropriate
 arguments per join type to generate actual Paths implements the join.
 make_join_rel() is called only once for particular relid combination, and
 there SpecialJoinInfo and restrictlist (conditions specified in JOIN-ON and
 WHERE), so it seems promising for FDW cases.


I like that idea, but I think we will have a complex hook signature; it won't
remain as simple as hook(root, joinrel).

-- 
Best Wishes,
Ashutosh Bapat
EnterpriseDB Corporation
The Postgres Database Company


Re: Custom/Foreign-Join-APIs (Re: [HACKERS] [v9.5] Custom Plan API)

2015-03-25 Thread Kouhei Kaigai
 On Wed, Mar 25, 2015 at 3:14 PM, Shigeru HANADA shigeru.han...@gmail.com
 wrote:
   Or bottom of make_join_rel().  IMO build_join_rel() is responsible for
 just building (or searching from a list) a RelOptInfo for given relids.
 After that make_join_rel() calls add_paths_to_joinrel() with appropriate
 arguments per join type to generate actual Paths that implement the join.
 make_join_rel() is called only once for particular relid combination, and
 there SpecialJoinInfo and restrictlist (conditions specified in JOIN-ON
 and WHERE) are available, so it seems promising for FDW cases.
 
 
 
 I like that idea, but I think we will have a complex hook signature; it
 won't remain as simple as hook(root, joinrel).
 
 
In this case, GetForeignJoinPaths() will take root, joinrel, rel1, rel2,
sjinfo and restrictlist.
It is not too simple, but not a complicated signature.

Even if we reconstruct rel1 and rel2 using sjinfo, we also need to compute
the restrictlist using build_joinrel_restrictlist() again. It is a static
function in relnode.c. So I don't think either of them has a definitive
advantage from the standpoint of simplicity.

Thanks,
--
NEC OSS Promotion Center / PG-Strom Project
KaiGai Kohei kai...@ak.jp.nec.com




Re: [HACKERS] Parallel Seq Scan

2015-03-25 Thread Rajeev rastogi
On 20 March 2015 17:37, Amit Kapila Wrote:

 So the patches have to be applied in below sequence:
 HEAD Commit-id : 8d1f2390
 parallel-mode-v8.1.patch [2]
 assess-parallel-safety-v4.patch [1]
 parallel-heap-scan.patch [3]
 parallel_seqscan_v11.patch (Attached with this mail)

While I was going through this patch, I observed one invalid Assert in the
function ExecInitFunnel, i.e.
Assert(outerPlan(node) == NULL);

The outer node of a Funnel node is always non-NULL, and currently it will be
a PartialSeqScan node.

Maybe asserts were disabled while building the code, which is why this issue
has not been observed yet.

Thanks and Regards,
Kumar Rajeev Rastogi


Re: [HACKERS] Parallel Seq Scan

2015-03-25 Thread Amit Kapila
On Wed, Mar 25, 2015 at 3:47 PM, Rajeev rastogi rajeev.rast...@huawei.com
wrote:

 On 20 March 2015 17:37, Amit Kapila Wrote:


  So the patches have to be applied in below sequence:

  HEAD Commit-id : 8d1f2390
  parallel-mode-v8.1.patch [2]
  assess-parallel-safety-v4.patch [1]
  parallel-heap-scan.patch [3]
  parallel_seqscan_v11.patch (Attached with this mail)


 While I was going through this patch, I observed one invalid ASSERT in
the function “ExecInitFunnel” i.e.

 Assert(outerPlan(node) == NULL);

 Outer node of Funnel node is always non-NULL and currently it will be
PartialSeqScan Node.


Which version of the patch are you looking at?

I see the below code in ExecInitFunnel() in version 11, to which
you have replied:

+ /* Funnel node doesn't have innerPlan node. */
+ Assert(innerPlan(node) == NULL);


With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com


Re: [HACKERS] possible dsm bug in dsm_attach()

2015-03-25 Thread Amit Kapila
On Tue, May 6, 2014 at 11:15 PM, Robert Haas robertmh...@gmail.com wrote:

 On Tue, May 6, 2014 at 1:14 PM, Andres Freund and...@2ndquadrant.com
wrote:
  On 2014-05-06 08:48:57 -0400, Robert Haas wrote:
  On Tue, May 6, 2014 at 8:43 AM, Andres Freund and...@2ndquadrant.com
wrote:
   The break because of refcnt == 1 doesn't generally seem to be a good
   idea. Why are we bailing if there's *any* segment that's in the process
   of being removed? I think the check should be there *after* the
   dsm_control->item[i].handle == seg->handle check?
 
  You are correct.  Good catch.
 
  Fix attached.

 Committed, thanks.


dsm_create(Size size, int flags)
{
..
/* Lock the control segment so we can register the new segment. */
LWLockAcquire(DynamicSharedMemoryControlLock, LW_EXCLUSIVE);

..
/* Verify that we can support an additional mapping. */
if (nitems >= dsm_control->maxitems)
{
if ((flags & DSM_CREATE_NULL_IF_MAXSEGMENTS) != 0)
{
dsm_impl_op(DSM_OP_DESTROY, seg->handle, 0, seg->impl_private,
seg->mapped_address, seg->mapped_size, WARNING);
if (seg->resowner != NULL)
ResourceOwnerForgetDSM(seg->resowner, seg);
dlist_delete(seg->node);
pfree(seg);
return NULL;
}
..
}

Is there a reason the lock is not released when we return NULL in the above
code?

I am facing an issue when we need to create many segments for a
large inheritance hierarchy.  The attached patch fixes the problem for me.


With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com


release_lock_dsm_v1.patch
Description: Binary data



Re: Custom/Foreign-Join-APIs (Re: [HACKERS] [v9.5] Custom Plan API)

2015-03-25 Thread Shigeru HANADA

On 2015/03/25 19:09, Kouhei Kaigai kai...@ak.jp.nec.com wrote:

 On Wed, Mar 25, 2015 at 3:14 PM, Shigeru HANADA shigeru.han...@gmail.com 
 wrote:
  Or bottom of make_join_rel().  IMO build_join_rel() is responsible for
 just building (or searching from a list) a RelOptInfo for given relids.
 After that make_join_rel() calls add_paths_to_joinrel() with appropriate
 arguments per join type to generate actual Paths that implement the join.
 make_join_rel() is called only once for particular relid combination, and
 there SpecialJoinInfo and restrictlist (conditions specified in JOIN-ON
 and WHERE) are available, so it seems promising for FDW cases.
 
 
 
 I like that idea, but I think we will have a complex hook signature; it
 won't remain as simple as hook(root, joinrel).
 
 In this case, GetForeignJoinPaths() will take root, joinrel, rel1, rel2,
 sjinfo and restrictlist.
 It is not too simple, but not a complicated signature.
 
 Even if we reconstruct rel1 and rel2 using sjinfo, we also need to compute
 the restrictlist using build_joinrel_restrictlist() again. It is a static
 function in relnode.c. So I don't think either of them has a definitive
 advantage from the standpoint of simplicity.

The bottom of make_join_rel() seems good from the viewpoint of available
information, but it is called multiple times for join combinations which are
essentially identical, for an INNER JOIN case like this:

fdw=# explain select * from pgbench_branches b join pgbench_tellers t on t.bid 
= b.bid join pgbench_accounts a on a.bid = b.bid and a.bid = t.bid;
INFO:  postgresGetForeignJoinPaths() 1x2
INFO:  postgresGetForeignJoinPaths() 1x4
INFO:  postgresGetForeignJoinPaths() 2x4
INFO:  standard_join_search() old hook point
INFO:  standard_join_search() old hook point
INFO:  standard_join_search() old hook point
INFO:  postgresGetForeignJoinPaths() 0x4
INFO:  postgresGetForeignJoinPaths() 0x2
INFO:  postgresGetForeignJoinPaths() 0x1
INFO:  standard_join_search() old hook point
   QUERY PLAN
-
 Foreign Scan  (cost=100.00..102.11 rows=211 width=1068)
(1 row)

Here I’ve put probe points at the beginning of the GetForeignJoinPaths
handler and just before the set_cheapest() call in standard_join_search()
(the “old hook point”).  In this example 1, 2, and 4 are base relations, and
at join level 3 the planner calls GetForeignJoinPaths() three times, for the
combinations:

1) (1x2)x4
2) (1x4)x2
3) (2x4)x1

Tom’s suggestion is aiming at providing a chance to consider join push-down
at a more abstract level, IIUC.  So it would be good to call the handler only
once in that case, for the flattened combination (1x2x4).

Hmm, how about skipping the handler (or hook) call if the joinrel was found
by find_join_rel()?  At least it suppresses redundant calls for different
join orders, and the handler can determine whether the combination can be
flattened by checking that every RelOptInfo with RELOPT_JOINREL under the
joinrel has JOIN_INNER as its jointype.

—
Shigeru HANADA



Re: [HACKERS] assessing parallel-safety

2015-03-25 Thread Amit Kapila
On Sun, Mar 22, 2015 at 12:00 AM, Thom Brown t...@linux.com wrote:

 On 21 March 2015 at 14:28, Amit Kapila amit.kapil...@gmail.com wrote:

 On Fri, Mar 20, 2015 at 7:54 PM, Thom Brown t...@linux.com wrote:
  createdb pgbench
  pgbench -i -s 200 pgbench
 
  CREATE TABLE pgbench_accounts_1 (CHECK (bid = 1)) INHERITS
(pgbench_accounts);
  ...
  CREATE TABLE pgbench_accounts_200 (CHECK (bid = 200)) INHERITS
  (pgbench_accounts);
 

 I managed to reproduce the Assertion reported by you as:

 #2  0x007a053a in ExceptionalCondition
(conditionName=conditionName@entry=0x813a4b "!(IsInParallelMode())",
errorType=errorType@entry=0x7da1d6 "FailedAssertion",
fileName=fileName@entry=0x81397d "parallel.c", lineNumber=lineNumber@entry=123)
at assert.c:54
 #3  0x004cd5ba in CreateParallelContext
(entrypoint=entrypoint@entry=0x659d2c ParallelQueryMain,
nworkers=nworkers@entry=8) at parallel.c:123

 The reason is that CreateParallelContext() expects to be called
 in parallel mode, and we enter parallel mode after InitPlan()
 in standard_ExecutorStart().  So the probable fix could be
 to call EnterParallelMode() before initializing the plan.

 I still could not reproduce the crash you have reported as:
  #0  0x00770843 in pfree ()
  #1  0x005a382f in ExecEndFunnel ()
  #2  0x0059fe75 in ExecEndAppend ()
  #3  0x005920bd in standard_ExecutorEnd ()


 Could you let me know which patches you have tried,
 and on top of which commit.

 I am trying on the commit mentioned in mail [1]. Basically,
 have you tried the versions mentioned in that mail:

 HEAD Commit-id : 8d1f2390
 parallel-mode-v8.1.patch [2]
 assess-parallel-safety-v4.patch [1]
 parallel-heap-scan.patch [3]
 parallel_seqscan_v11.patch (Attached with this mail)

 If something else, could you let me know the same so that I can try
 that to reproduce the issue reported by you.


 Looks like one of the patches I applied is newer than the one in your list:


Okay, then that was the reason why you were seeing the crash whereas
I could not reproduce it; however, I have integrated the patch with
v9 of the parallel-mode patch and posted it on the appropriate thread.
One point to note is that I think there is one small issue in one of the
latest commits [1]. After that is fixed, you can verify again with the
latest patch.

[1] -
http://www.postgresql.org/message-id/caa4ek1+nwuj9ik61ygfzbcn85dqunevd38_h1zngcdzrglg...@mail.gmail.com

With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com


Re: Custom/Foreign-Join-APIs (Re: [HACKERS] [v9.5] Custom Plan API)

2015-03-25 Thread Kouhei Kaigai
 2015/03/25 12:59、Kouhei Kaigai kai...@ak.jp.nec.com のメール:
 
  At this moment, I'm not 100% certain about its logic. Especially, I didn't
  test SEMI- and ANTI- join cases yet.
   However, time is money - I want people to check overall design first,
   rather than detailed debugging. Please tell me if I misunderstood the
   logic to break down join relations.
 
   With your patch applied, the regression tests for “updatable view” failed.
  regression.diff contains some errors like this:
  ! ERROR:  could not find RelOptInfo for given relids
 
  Could you check that?
 
  It is a bug around the logic to find out two RelOptInfo that can construct
  another RelOptInfo of joinrel.
  Even though I'm now working to correct the logic, it is not obvious to
   identify two relids that satisfy joinrel->relids.
  (Yep, law of entropy enhancement…)
 
  IIUC, this problem affects only non-INNER JOINs, because relations joined
  only with INNER JOIN can be treated in arbitrary order.  But supporting
  OUTER JOINs would be necessary even for the first cut.
 
Yep. When the joinrel contains only inner-joined relations managed by the
same FDW driver, the job of get_joinrel_broken_down() is quite simple.
However, people want outer joins to be supported too, don't they?

  On the other hands, we may have a solution that does not need a complicated
  reconstruction process. The original concern was, FDW driver may add paths
  that will replace entire join subtree by foreign-scan on remote join 
  multiple
  times, repeatedly, but these paths shall be identical.
 
   If we put a hook for FDW/CSP on bottom of build_join_rel(), we may be able
   to solve the problem in a more straightforward and simple way.
   Because build_join_rel() finds a cache in root->join_rel_hash then returns
   immediately if required joinrelids already has its RelOptInfo, the bottom
   of this function is never called twice on a particular set of joinrelids.
   Once FDW/CSP constructs a path that replaces entire join subtree towards
   the joinrel just after construction, it shall be kept until cheaper
   built-in paths are added (if any exist).
 
  This idea has one other positive side-effect. We expect remote-join is 
  cheaper
  than local join with two remote scan in most cases. Once a much cheaper path
  is added prior to local join consideration, add_path_precheck() breaks path
  consideration earlier.
 
  Please comment on.
 
  Or bottom of make_join_rel().  IMO build_join_rel() is responsible for just
  building (or searching from a list) a RelOptInfo for given relids.  After
  that make_join_rel() calls add_paths_to_joinrel() with appropriate
  arguments per join type to generate actual Paths that implement the join.
  make_join_rel() is called only once for particular relid combination, and
  there SpecialJoinInfo and restrictlist (conditions specified in JOIN-ON
  and WHERE) are available, so it seems promising for FDW cases.
 
As long as the caller can know whether build_join_rel() actually constructed
a new RelOptInfo object or not, I think it makes more sense than putting a
hook within make_join_rel().

 Though I’m not sure that it also fits custom join provider’s requirements.

A join replaced by a CSP has two scenarios. The first implements just an
alternative logic for a built-in join and takes the underlying inner/outer
nodes, so its hook is located in add_paths_to_joinrel(), like the built-in
join logic.
The second tries to replace an entire join sub-tree with, for example, a
materialized view, like the FDW remote-join case. So it has to be hooked near
the location of GetForeignJoinPaths().
In the second scenario, the CSP does not have a private field in RelOptInfo,
so it may not be obvious how to check whether the given joinrel exactly
matches a particular materialized view or other cache.

At the moment, what I'm interested in is the first scenario, so the priority
of the second case is not significant for me, at least.

Thanks.
--
NEC OSS Promotion Center / PG-Strom Project
KaiGai Kohei kai...@ak.jp.nec.com

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] assessing parallel-safety

2015-03-25 Thread Amit Kapila
On Fri, Mar 20, 2015 at 11:37 PM, Robert Haas robertmh...@gmail.com wrote:


 That might be a different crash than the first one you showed.  But it
 looks like the problem here is that the parallel sequential scan patch
 is calling CreateParallelContext even though this is just an EXPLAIN
 and we're not actually running the query.  It shouldn't do that.
 (This might be an argument for postponing CreateParallelContext()
 until run time, as I've suggested before.)


Okay, I have postponed the CreateParallelContext() call until runtime in
the latest patch posted on the Parallel Seq Scan thread.


With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com


Re: [HACKERS] Error with index on unlogged table

2015-03-25 Thread Amit Langote
On Wednesday, March 25, 2015, Michael Paquier michael.paqu...@gmail.com wrote:

 On Tue, Mar 24, 2015 at 8:46 PM, Thom Brown wrote:
  The index is unlogged until reindexing...
 
  [...]
   Which I think also raises the question: why are unlogged indexes made
  persistent by a reindex?

 That's a bug in HEAD; ~9.4 keeps the index as unlogged even after
 REINDEX INDEX. What happens is that ReindexIndex relies on
 relpersistence provided by makeRangeVar at parse time, which is just
 incorrect as it uses RELPERSISTENCE_PERMANENT all the time. The patch
 attached fixes that...


How about VACUUM FULL and CLUSTER as the problem seems to have been
reported to be there too?

Amit




Re: [HACKERS] Abbreviated keys for Numeric

2015-03-25 Thread Peter Geoghegan
On Tue, Mar 24, 2015 at 12:03 AM, Andrew Gierth
and...@tao11.riddles.org.uk wrote:
 So here's the latest (and, hopefully, last) version:

  - adds diagnostic output from numeric_abbrev_abort using the trace_sort
GUC

  - fixed Datum vs. uint32 issues in hash_uint32

  - added a short comment about excess-k representation

  - tweaked the indenting and comments a bit

You still pointlessly check memtupcount here:

+ if (memtupcount < 1 || nss->input_count < 1 || !nss->estimating)
+ return false;

I still don't like this comment:

+ nss = palloc(sizeof(NumericSortSupport));
+
+ /*
+ * palloc a buffer for handling unaligned packed values in addition to
+ * the support struct
+ */
+ nss->buf = palloc(VARATT_SHORT_MAX + VARHDRSZ + 1);

This cast to void is unnecessary:

+numeric_fast_cmp(Datum x, Datum y, SortSupport ssup)
+{
+ Numeric nx = DatumGetNumeric(x);
+ Numeric ny = DatumGetNumeric(y);
+ int result;
+
+ (void) ssup;

Please try and at least consider my feedback. I don't expect you to do
exactly what I ask, but I also don't expect you to blithely ignore it.

 I'm not particularly committed to any specific way of handling the
 DEC_DIGITS issue. (I moved away from the transparently skip
 abbreviations approach of the original because it seemed that reducing
 #ifdefism in the code was a desirable feature.)

Good.

 The INT64_MIN/MAX changes should be committed fairly soon. (I haven't
 posted a patch for TRACE_SORT)

I wouldn't assume that.

-- 
Peter Geoghegan




Re: [HACKERS] Error with index on unlogged table

2015-03-25 Thread Andres Freund
Hi,

On 2015-03-25 11:38:30 +0900, Michael Paquier wrote:
 On Tue, Mar 24, 2015 at 8:46 PM, Thom Brown wrote:
  The index is unlogged until reindexing...
 
  [...]
  Which I think also raises the question: why are unlogged indexes made
  persistent by a reindex?
 
 That's a bug in HEAD; ~9.4 keeps the index as unlogged even after
 REINDEX INDEX. What happens is that ReindexIndex relies on
 relpersistence provided by makeRangeVar at parse time, which is just
 incorrect as it uses RELPERSISTENCE_PERMANENT all the time. The patch
 attached fixes that...

What the hell? That's somewhat nasty. Nice that it got caught before 9.5
was released.

Did you check whether a similar bug was introduced in other places by
85b506bb? Could you additionally add a regression test to this end?
Seems like something worth testing.

Greetings,

Andres Freund




Re: [HACKERS] printing table in asciidoc with psql

2015-03-25 Thread Michael Paquier
On Wed, Mar 25, 2015 at 4:59 PM, Bruce Momjian br...@momjian.us wrote:
 On Wed, Mar 25, 2015 at 02:18:58PM +0900, Michael Paquier wrote:
  [options=header,cols=l,l,frame=none]
  |
  |5 2.2+^.^ |4 2.2+^.^
  |2 2.2+^.^ |3 2.2+^.^
  |

 Hm. This is still incorrect. You should remove options=header here,
 or the first tuple is treated as a header in the
 non-expanded/tuple-only case. Your patch correctly removes the header for
 the expanded/tuple-only case, though.
 Regards,

 OK, fixed.  Thanks for the testing.  Patch attached.  New output:

This time things look good from my side. I have played with this patch for
some time, testing some crazy scenarios, and I have not found any problems.
That's cool stuff, thanks!
-- 
Michael




Re: [HACKERS] Parallel Seq Scan

2015-03-25 Thread Thom Brown
On 25 March 2015 at 11:46, Thom Brown t...@linux.com wrote:


 Still not sure why 8 workers are needed for each partial scan.  I would
 expect 8 workers to be used for 8 separate scans.  Perhaps this is just my
 misunderstanding of how this feature works.


Another issue:

SELECT * FROM pgbtab

*crash*

Logs:

2015-03-25 13:17:49 GMT [22823]: [124-1] user=,db=,client= LOG:
registering background worker parallel worker for PID 24792
2015-03-25 13:17:49 GMT [22823]: [125-1] user=,db=,client= LOG:
registering background worker parallel worker for PID 24792
2015-03-25 13:17:49 GMT [22823]: [126-1] user=,db=,client= LOG:
registering background worker parallel worker for PID 24792
2015-03-25 13:17:49 GMT [22823]: [127-1] user=,db=,client= LOG:
registering background worker parallel worker for PID 24792
2015-03-25 13:17:49 GMT [22823]: [128-1] user=,db=,client= LOG:
registering background worker parallel worker for PID 24792
2015-03-25 13:17:49 GMT [22823]: [129-1] user=,db=,client= LOG:
registering background worker parallel worker for PID 24792
2015-03-25 13:17:49 GMT [22823]: [130-1] user=,db=,client= LOG:
registering background worker parallel worker for PID 24792
2015-03-25 13:17:49 GMT [22823]: [131-1] user=,db=,client= LOG:
registering background worker parallel worker for PID 24792
2015-03-25 13:17:49 GMT [22823]: [132-1] user=,db=,client= LOG:  starting
background worker process parallel worker for PID 24792
2015-03-25 13:17:49 GMT [22823]: [133-1] user=,db=,client= LOG:  starting
background worker process parallel worker for PID 24792
2015-03-25 13:17:49 GMT [22823]: [134-1] user=,db=,client= LOG:  starting
background worker process parallel worker for PID 24792
2015-03-25 13:17:49 GMT [22823]: [135-1] user=,db=,client= LOG:  starting
background worker process parallel worker for PID 24792
2015-03-25 13:17:49 GMT [22823]: [136-1] user=,db=,client= LOG:  starting
background worker process parallel worker for PID 24792
2015-03-25 13:17:49 GMT [22823]: [137-1] user=,db=,client= LOG:  starting
background worker process parallel worker for PID 24792
2015-03-25 13:17:49 GMT [22823]: [138-1] user=,db=,client= LOG:  starting
background worker process parallel worker for PID 24792
2015-03-25 13:17:49 GMT [22823]: [139-1] user=,db=,client= LOG:  starting
background worker process parallel worker for PID 24792
2015-03-25 13:17:49 GMT [22823]: [140-1] user=,db=,client= LOG:  worker
process: parallel worker for PID 24792 (PID 24804) was terminated by signal
11: Segmentation fault
2015-03-25 13:17:49 GMT [22823]: [141-1] user=,db=,client= LOG:
terminating any other active server processes
2015-03-25 13:17:49 GMT [24777]: [2-1] user=,db=,client= WARNING:
terminating connection because of crash of another server process
2015-03-25 13:17:49 GMT [24777]: [3-1] user=,db=,client= DETAIL:  The
postmaster has commanded this server process to roll back the current
transaction and exit, because another server process exited abnormally and
possibly corrupted shared memory.
2015-03-25 13:17:49 GMT [24777]: [4-1] user=,db=,client= HINT:  In a moment
you should be able to reconnect to the database and repeat your command.

Backtrace:

#0  GrantLockLocal (locallock=locallock@entry=0xfbe7f0,
owner=owner@entry=0x1046da0)
at lock.c:1544
#1  0x0066975c in LockAcquireExtended
(locktag=locktag@entry=0x7fffdcb0ea20,
lockmode=1,
lockmode@entry=error reading variable: Cannot access memory at address
0x7fffdcb0e9f0, sessionLock=sessionLock@entry=0 '\000',
dontWait=dontWait@entry=0 '\000',
reportMemoryError=reportMemoryError@entry=1 '\001', ) at lock.c:798
#2  0x0066a1c4 in LockAcquire (locktag=locktag@entry=0x7fffdcb0ea20,
lockmode=error reading variable: Cannot access memory at address
0x7fffdcb0e9f0,
sessionLock=sessionLock@entry=0 '\000', dontWait=dontWait@entry=0
'\000') at lock.c:680
#3  0x00667c48 in LockRelationOid (relid=error reading variable:
Cannot access memory at address 0x7fffdcb0e9e8,
relid@entry=error reading variable: Cannot access memory at address
0x7fffdcb0ea48,
lockmode=error reading variable: Cannot access memory at address
0x7fffdcb0e9f0,
lockmode@entry=error reading variable: Cannot access memory at address
0x7fffdcb0ea48) at lmgr.c:94

But the issue seems to produce a different backtrace each time...

2nd backtrace:

#0  hash_search_with_hash_value (hashp=0x2a2c370,
keyPtr=keyPtr@entry=0x75ad2230,
hashvalue=hashvalue@entry=2114233864, action=action@entry=HASH_FIND,
foundPtr=foundPtr@entry=0x0) at dynahash.c:918
#1  0x00654d1a in BufTableLookup (tagPtr=tagPtr@entry=0x75ad2230,
hashcode=hashcode@entry=2114233864) at buf_table.c:96
#2  0x0065746b in BufferAlloc (foundPtr=0x75ad222f Address
0x75ad222f out of bounds, strategy=0x0,
blockNum=error reading variable: Cannot access memory at address
0x75ad2204,
forkNum=error reading variable: Cannot access memory at address
0x75ad2208,
relpersistence=error reading variable: 

Re: Custom/Foreign-Join-APIs (Re: [HACKERS] [v9.5] Custom Plan API)

2015-03-25 Thread Kouhei Kaigai
 On 2015/03/25 19:47, Kouhei Kaigai kai...@ak.jp.nec.com wrote:
  The reason why the FDW handler was called multiple times in your example is
  that your modified make_join_rel() does not check whether build_join_rel()
  actually built a new RelOptInfo or just returned a cache reference, does it?
 
 
 Yep.  After that change calling count looks like this:
 
 fdw=# explain select * from pgbench_branches b join pgbench_tellers t on t.bid
 = b.bid join pgbench_accounts a on a.bid = b.bid and a.bid = t.bid;
 INFO:  postgresGetForeignJoinPaths() 1x2
 INFO:  postgresGetForeignJoinPaths() 1x4
 INFO:  postgresGetForeignJoinPaths() 2x4
 INFO:  standard_join_search() old hook point
 INFO:  standard_join_search() old hook point
 INFO:  standard_join_search() old hook point
 INFO:  postgresGetForeignJoinPaths() 0x4
 INFO:  standard_join_search() old hook point
QUERY PLAN
 -
  Foreign Scan  (cost=100.00..102.11 rows=211 width=1068)
 (1 row)
 
 fdw=#
 
  If so, I'm inclined toward your proposition.
  A new bool *found argument for build_join_rel() would reduce the number of
  FDW handler calls, while keeping reasonable information to build the
  remote-join query.
 
 Another idea is to pass “found” as a parameter to the FDW handler, and let
 the FDW decide whether to skip or not.  Some FDWs (and some CSPs?) might want
 to be conscious of the join combination.

I think that does not match the concept we stand on.
Unlike a CSP, an FDW intends to replace an entire join sub-tree that is
represented by a particular joinrel, regardless of the sequence used to
construct the joinrel from multiple baserels.
So it is sufficient to call GetForeignJoinPaths() once a joinrel
is constructed, isn't it?

Thanks,
--
NEC OSS Promotion Center / PG-Strom Project
KaiGai Kohei kai...@ak.jp.nec.com



Re: [HACKERS] Error with index on unlogged table

2015-03-25 Thread Thom Brown
On 25 March 2015 at 12:22, Amit Langote amitlangot...@gmail.com wrote:

 On Wednesday, March 25, 2015, Michael Paquier michael.paqu...@gmail.com
 wrote:
 
  On Tue, Mar 24, 2015 at 8:46 PM, Thom Brown wrote:
   The index is unlogged until reindexing...
  
   [...]
   Which I think also raises the question: why are unlogged indexes made
   persistent by a reindex?
 
  That's a bug in HEAD; ~9.4 keeps the index as unlogged even after
  REINDEX INDEX. What happens is that ReindexIndex relies on
  relpersistence provided by makeRangeVar at parse time, which is just
  incorrect as it uses RELPERSISTENCE_PERMANENT all the time. The patch
  attached fixes that...
 

 How about VACUUM FULL and CLUSTER as the problem seems to have been
 reported to be there too?


No, those are okay.  They actually revert the index back to the same
persistence level as the table they're attached to.

-- 
Thom


Re: [HACKERS] Ignoring entries generated by autoconf in code tree

2015-03-25 Thread Tom Lane
Michael Paquier michael.paqu...@gmail.com writes:
 When running autoconf from the root tree, autom4te.cache/ is
 automatically generated.
 Wouldn't it make sense to add an entry in .gitignore for that?

Personally, I don't want such a thing, as then I would tend to forget
to remove that cache file.  And you do want to remove it.  autoconf
goes pretty berserk if the cache file hangs around across significant
changes to configure.in, such as if you were to switch branches.
(Or at least that used to be true --- last time I got burnt by it
was quite some time ago, but possibly that's just because I'm careful
about removing the cache file.)

In short, -1.

regards, tom lane




Re: [HACKERS] Auditing extension for PostgreSQL (Take 2)

2015-03-25 Thread David Steele
On 3/25/15 7:46 AM, Sawada Masahiko wrote:
 On Wed, Mar 25, 2015 at 12:23 PM, David Steele da...@pgmasters.net wrote:
 On Wed, Mar 25, 2015 at 12:38 AM, David Steele da...@pgmasters.net wrote:
  2. OBJECT auditing does not work before ACL info is added to
  pg_class.relacl.
  In the following situation, pg_audit cannot write the OBJECT log.
 $ cat postgresql.conf | grep audit
 shared_preload_libraries = 'pg_audit'
 pg_audit.role = 'hoge_user'
 pg_audit.log = 'read, write'
 $ psql -d postgres -U hoge_user
 =# create table hoge(col int);
 =# select * from hoge;
 LOG:  AUDIT: SESSION,3,1,READ,SELECT,,,select * from hoge;

  The OBJECT audit log is not written here since pg_class.relacl is still
  empty. (Only the SESSION log is written.)
  So after creating another unrelated role and granting some privilege to
  that user, the OBJECT audit log is written successfully.

 Yes, object auditing does not work until some grants have been made to
 the audit role.

 =# create role bar_user;
 =# grant select on hoge to bar_user;
 =# select * from hoge;
 LOG:  AUDIT: SESSION,4,1,READ,SELECT,,,select * from hoge;
 LOG:  AUDIT: OBJECT,4,1,READ,SELECT,TABLE,public.hoge,select * from hoge;

  Both OBJECT and SESSION logs are written.

 Looks right to me.  If you don't want the session logging then disable
 pg_audit.log.

 Session and object logging are completely independent from each other:
 one or the other, or both, or neither can be enabled at any time.

  It means that the OBJECT log is not written just after creating a table,
  even if that table is touched by its owner.
  To get the OBJECT log, we need to grant a privilege to the audit role
  first, right?

 Exactly.  Privileges must be granted to the audit role in order for
 object auditing to work.

 
  The table owner can always touch its own table.
  So does it follow that the table owner can read its table's data while
  avoiding OBJECT logging?

Yes, the table owner would be able to access the table without object
logging if grants to that table were not made to the audit role.  That
would also be true for any other user that had grants on the table.

The purpose of object auditing is to allow more fine-grained control and
is intended to be used in situations where you only want to audit some
things, rather than all things.  Logging everything is better done with
the session logging.

However, object logging does yield more information since it lists every
table that was touched by the statement, so there may be cases where
you'd like to object log everything.  In that case I'd recommend writing
a bit of plpgsql code to create the grants.
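A hypothetical sketch of such plpgsql (assuming the audit role from the
example above, hoge_user, and that read auditing of every table in schema
public is wanted):

```sql
-- Hypothetical: grant SELECT on all tables in schema public to the
-- audit role, so pg_audit object logging covers reads on all of them.
DO $$
DECLARE
    t text;
BEGIN
    FOR t IN
        SELECT quote_ident(schemaname) || '.' || quote_ident(tablename)
        FROM pg_tables
        WHERE schemaname = 'public'
    LOOP
        EXECUTE format('GRANT SELECT ON %s TO hoge_user', t);
    END LOOP;
END
$$;
```

New tables would still need a fresh grant before their object access is
logged, which is exactly the behavior discussed above.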

 Also, I looked into the latest patch again.
 Here are two review comments.
 
 1.
 typedef struct
 {
int64 statementId;
   int64 substatementId;
 Both statementId and substatementId could be negative numbers.
 I think these should be uint64 instead.

True.  I did this because printf formatting for uint64 seems to vary
across platforms.  int64 formatting is more standard and still gives
more than enough IDs.

I could change it back to uint64 if you have a portable way to modify
the sprintf at line 507.

 2.
 I got an ERROR when executing a function that uses a cursor.
 
 1) create empty table (hoge table)
 2) create test function as follows.
 
 create function test() returns int as $$
 declare
 cur1 cursor for select * from hoge;
 tmp int;
 begin
 open cur1;
 fetch cur1 into tmp;
return tmp;
 end$$
 language plpgsql ;
 
 3) execute test function (got ERROR)
 =# select test();
 LOG:  AUDIT: SESSION,6,1,READ,SELECT,,,selecT test();
 LOG:  AUDIT: SESSION,6,2,FUNCTION,EXECUTE,FUNCTION,public.test,selecT test();
 LOG:  AUDIT: SESSION,6,3,READ,SELECT,,,select * from hoge
 CONTEXT:  PL/pgSQL function test() line 6 at OPEN
 ERROR:  pg_audit stack is already empty
 STATEMENT:  selecT test();
 
 It seems that the item in the stack has already been freed by deleting the
 pg_audit memory context (in MemoryContextDelete())
 before stack_pop is called when the top-level Portal is dropped.

Good catch, I'll add this to my test cases and work on a fix.  I think I
see a good way to approach it.

-- 
- David Steele
da...@pgmasters.net





Re: [HACKERS] printing table in asciidoc with psql

2015-03-25 Thread Bruce Momjian
On Wed, Mar 25, 2015 at 09:37:08PM +0900, Michael Paquier wrote:
 On Wed, Mar 25, 2015 at 4:59 PM, Bruce Momjian br...@momjian.us wrote:
  On Wed, Mar 25, 2015 at 02:18:58PM +0900, Michael Paquier wrote:
   [options=header,cols=l,l,frame=none]
   |
   |5 2.2+^.^ |4 2.2+^.^
   |2 2.2+^.^ |3 2.2+^.^
   |
 
  Hm. This is still incorrect. You should remove options=header here,
  or the first tuple is treated as a header in the
  non-expanded/tuple-only case. Your patch correctly removes the header for
  the expanded/tuple-only case, though.
  Regards,
 
  OK, fixed.  Thanks for the testing.  Patch attached.  New output:
 
 This time things look good from my side. I have played with this patch for
 some time, testing some crazy scenarios, and I have not found any problems.
 That's cool stuff, thanks!

Wow, thanks.  I never would have gotten here without your help.

-- 
  Bruce Momjian  br...@momjian.ushttp://momjian.us
  EnterpriseDB http://enterprisedb.com

  + Everyone has their own god. +




Re: [HACKERS] Error with index on unlogged table

2015-03-25 Thread Amit Langote
On Wednesday, March 25, 2015, Thom Brown t...@linux.com wrote:

 On 25 March 2015 at 12:22, Amit Langote amitlangot...@gmail.com wrote:

 On Wednesday, March 25, 2015, Michael Paquier michael.paqu...@gmail.com wrote:
 
  On Tue, Mar 24, 2015 at 8:46 PM, Thom Brown wrote:
   The index is unlogged until reindexing...
  
   [...]
   Which I think also raises the question: why are unlogged indexes made
   persistent by a reindex?
 
  That's a bug in HEAD; ~9.4 keeps the index as unlogged even after
  REINDEX INDEX. What happens is that ReindexIndex relies on
  relpersistence provided by makeRangeVar at parse time, which is just
  incorrect as it uses RELPERSISTENCE_PERMANENT all the time. The patch
  attached fixes that...
 

 How about VACUUM FULL and CLUSTER as the problem seems to have been
 reported to be there too?


 No, those are okay.  They actually revert the index back to the same
 persistence level as the table they're attached to.


Ah, I misread then; sorry about the noise.

Amit


[HACKERS] varlena.c hash_any() and hash_uint32() calls require DatumGetUInt32()

2015-03-25 Thread Peter Geoghegan
Attached patch adds DatumGetUInt32() around the hash_any() and
hash_uint32() calls within varlena.c. These should have been in the
original abbreviated keys commit. Mea culpa.

-- 
Peter Geoghegan
diff --git a/src/backend/utils/adt/varlena.c b/src/backend/utils/adt/varlena.c
index 3edd283..02e9949 100644
--- a/src/backend/utils/adt/varlena.c
+++ b/src/backend/utils/adt/varlena.c
@@ -2084,8 +2084,8 @@ bttext_abbrev_convert(Datum original, SortSupport ssup)
 	 * in order to compensate for cases where differences are past
 	 * PG_CACHE_LINE_SIZE bytes, so as to limit the overhead of hashing.
 	 */
-	hash = hash_any((unsigned char *) authoritative_data,
-	Min(len, PG_CACHE_LINE_SIZE));
+	hash = DatumGetUInt32(hash_any((unsigned char *) authoritative_data,
+   Min(len, PG_CACHE_LINE_SIZE)));
 
 	if (len > PG_CACHE_LINE_SIZE)
 		hash ^= DatumGetUInt32(hash_uint32((uint32) len));
@@ -2100,10 +2100,10 @@ bttext_abbrev_convert(Datum original, SortSupport ssup)
 
 		lohalf = (uint32) res;
 		hihalf = (uint32) (res >> 32);
-		hash = hash_uint32(lohalf ^ hihalf);
+		hash = DatumGetUInt32(hash_uint32(lohalf ^ hihalf));
 	}
 #else			/* SIZEOF_DATUM != 8 */
-	hash = hash_uint32((uint32) res);
+	hash = DatumGetUInt32(hash_uint32((uint32) res));
 #endif
 
 	addHyperLogLog(tss-abbr_card, hash);



Re: [HACKERS] Error with index on unlogged table

2015-03-25 Thread Fabrízio de Royes Mello
On Wed, Mar 25, 2015 at 10:53 AM, Andres Freund and...@2ndquadrant.com
wrote:

 Hi,

 On 2015-03-25 11:38:30 +0900, Michael Paquier wrote:
  On Tue, Mar 24, 2015 at 8:46 PM, Thom Brown wrote:
   The index is unlogged until reindexing...
  
   [...]
    Which I think also raises the question: why are unlogged indexes made
   persistent by a reindex?
 
   That's a bug in HEAD; ~9.4 keeps the index as unlogged even after
   REINDEX INDEX. What happens is that ReindexIndex relies on
  relpersistence provided by makeRangeVar at parse time, which is just
  incorrect as it uses RELPERSISTENCE_PERMANENT all the time. The patch
  attached fixes that...

 What the hell? That's somewhat nasty. Nice that it got caught before 9.5
 was released.


Unfortunately this is very nasty. Sorry!


  Did you check whether a similar bug was introduced in other places by
  85b506bb? Could you additionally add a regression test to this end?
 Seems like something worth testing.


I'm checking it and adding some regression tests.

Regards,

--
Fabrízio de Royes Mello
Consultoria/Coaching PostgreSQL
 Timbira: http://www.timbira.com.br
 Blog: http://fabriziomello.github.io
 Linkedin: http://br.linkedin.com/in/fabriziomello
 Twitter: http://twitter.com/fabriziomello
 Github: http://github.com/fabriziomello


Re: [HACKERS] Parallel Seq Scan

2015-03-25 Thread Amit Kapila
On Wed, Mar 25, 2015 at 5:16 PM, Thom Brown t...@linux.com wrote:

 On 25 March 2015 at 10:27, Amit Kapila amit.kapil...@gmail.com wrote:

 Fixed the issue reported on the assess-parallel-safety thread and another
 bug caught while testing joins, and integrated with the latest version of
 the parallel-mode patch (parallel-mode-v9).

 Apart from that, I have moved the initialization of the dsm segment from
 the InitNode phase to ExecFunnel() (on first execution), as per a suggestion
 from Robert.  The main idea is that since it creates a large shared memory
 segment, the work should be done only when it is really required.


 HEAD Commit-Id: 11226e38
 parallel-mode-v9.patch [2]
 assess-parallel-safety-v4.patch [1]
 parallel-heap-scan.patch [3]
 parallel_seqscan_v12.patch (Attached with this mail)

 [1] -
http://www.postgresql.org/message-id/ca+tgmobjsuefipok6+i9werugeab3ggjv7jxlx+r6s5syyd...@mail.gmail.com
 [2] -
http://www.postgresql.org/message-id/ca+tgmozfsxzhs6qy4z0786d7iu_abhbvpqfwlthpsvgiecz...@mail.gmail.com
 [3] -
http://www.postgresql.org/message-id/ca+tgmoyjetgeaxuszrona7bdtwzptqexpjntv1gkcavmgsd...@mail.gmail.com


 Okay, with my pgbench_accounts partitioned into 300, I ran:

 SELECT DISTINCT bid FROM pgbench_accounts;

 The query never returns,

You seem to be hitting the issue I pointed out in a nearby thread [1],
which I also mentioned while replying on the assess-parallel-safety
thread.  Can you check after applying the patch in mail [1]?

 and I also get this:

 grep -r 'starting background worker process parallel worker for PID
12165' postgresql-2015-03-25_112522.log  | wc -l
 2496

 2,496 workers?  This is with parallel_seqscan_degree set to 8.  If I set
it to 2, this number goes down to 626, and with 16, goes up to 4320.

..

 Still not sure why 8 workers are needed for each partial scan.  I would
expect 8 workers to be used for 8 separate scans.  Perhaps this is just my
misunderstanding of how this feature works.


The reason is that for each table scan, it tries to use a number of workers
equal to parallel_seqscan_degree if they are available, and in this case,
as the scans of the inheritance hierarchy (the tables in the hierarchy)
happen one after another, it uses 8 workers for each scan.  I think as of
now the strategy to decide the number of workers used in a scan is kept
simple, and in future we can try to come up with some better mechanism
to decide the number of workers.

[1] -
http://www.postgresql.org/message-id/caa4ek1+nwuj9ik61ygfzbcn85dqunevd38_h1zngcdzrglg...@mail.gmail.com

With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com


Re: [HACKERS] Error with index on unlogged table

2015-03-25 Thread Fabrízio de Royes Mello
On Wed, Mar 25, 2015 at 12:46 PM, Fabrízio de Royes Mello 
fabriziome...@gmail.com wrote:


 On Wed, Mar 25, 2015 at 10:53 AM, Andres Freund and...@2ndquadrant.com
wrote:
 

   Did you check whether a similar bug was introduced in other places by
   85b506bb? Could you additionally add a regression test to this end?
  Seems like something worth testing.
 

 I'm checking it and adding some regression tests.


I didn't find any other similar bug introduced by 85b506bb.

Attached is the original patch provided by Michael, with some regression tests.

Regards,

--
Fabrízio de Royes Mello
Consultoria/Coaching PostgreSQL
 Timbira: http://www.timbira.com.br
 Blog: http://fabriziomello.github.io
 Linkedin: http://br.linkedin.com/in/fabriziomello
 Twitter: http://twitter.com/fabriziomello
 Github: http://github.com/fabriziomello
diff --git a/src/backend/commands/indexcmds.c b/src/backend/commands/indexcmds.c
index 1c1d0da..1520d32 100644
--- a/src/backend/commands/indexcmds.c
+++ b/src/backend/commands/indexcmds.c
@@ -1685,6 +1685,8 @@ ReindexIndex(RangeVar *indexRelation)
 {
 	Oid			indOid;
 	Oid			heapOid = InvalidOid;
+	Relation	irel;
+	char		relpersistence;
 
 	/* lock level used here should match index lock reindex_index() */
 	indOid = RangeVarGetRelidExtended(indexRelation, AccessExclusiveLock,
@@ -1692,7 +1694,11 @@ ReindexIndex(RangeVar *indexRelation)
 	  RangeVarCallbackForReindexIndex,
 	  (void *) heapOid);
 
-	reindex_index(indOid, false, indexRelation->relpersistence);
+	irel = index_open(indOid, AccessExclusiveLock);
+	relpersistence = irel->rd_rel->relpersistence;
+	index_close(irel, NoLock);
+
+	reindex_index(indOid, false, relpersistence);
 
 	return indOid;
 }
diff --git a/src/test/regress/expected/create_table.out b/src/test/regress/expected/create_table.out
index 34b5fc1..3214c19 100644
--- a/src/test/regress/expected/create_table.out
+++ b/src/test/regress/expected/create_table.out
@@ -208,6 +208,21 @@ CREATE TABLE IF NOT EXISTS test_tsvector(
 );
 NOTICE:  relation test_tsvector already exists, skipping
 CREATE UNLOGGED TABLE unlogged1 (a int primary key);			-- OK
+SELECT relname, relkind, relpersistence FROM pg_class WHERE relname ~ '^unlogged1' ORDER BY relname;
+    relname     | relkind | relpersistence 
+----------------+---------+----------------
+ unlogged1      | r       | u
+ unlogged1_pkey | i       | u
+(2 rows)
+
+REINDEX INDEX unlogged1_pkey;
+SELECT relname, relkind, relpersistence FROM pg_class WHERE relname ~ '^unlogged1' ORDER BY relname;
+    relname     | relkind | relpersistence 
+----------------+---------+----------------
+ unlogged1      | r       | u
+ unlogged1_pkey | i       | u
+(2 rows)
+
 INSERT INTO unlogged1 VALUES (42);
 CREATE UNLOGGED TABLE public.unlogged2 (a int primary key);		-- also OK
 CREATE UNLOGGED TABLE pg_temp.unlogged3 (a int primary key);	-- not OK
diff --git a/src/test/regress/sql/create_table.sql b/src/test/regress/sql/create_table.sql
index 08029a9..acb7eb8 100644
--- a/src/test/regress/sql/create_table.sql
+++ b/src/test/regress/sql/create_table.sql
@@ -246,6 +246,9 @@ CREATE TABLE IF NOT EXISTS test_tsvector(
 );
 
 CREATE UNLOGGED TABLE unlogged1 (a int primary key);			-- OK
+SELECT relname, relkind, relpersistence FROM pg_class WHERE relname ~ '^unlogged1' ORDER BY relname;
+REINDEX INDEX unlogged1_pkey;
+SELECT relname, relkind, relpersistence FROM pg_class WHERE relname ~ '^unlogged1' ORDER BY relname;
 INSERT INTO unlogged1 VALUES (42);
 CREATE UNLOGGED TABLE public.unlogged2 (a int primary key);		-- also OK
 CREATE UNLOGGED TABLE pg_temp.unlogged3 (a int primary key);	-- not OK



Re: [HACKERS] What exactly is our CRC algorithm?

2015-03-25 Thread Heikki Linnakangas

On 02/12/2015 09:26 PM, Heikki Linnakangas wrote:

On 02/11/2015 04:20 PM, Abhijit Menon-Sen wrote:

At 2015-02-11 13:20:29 +0200, hlinnakan...@vmware.com wrote:


I don't follow. I didn't change configure at all, compared to your
patch.


OK, I extrapolated a little too much. Your patch didn't actually include
crc_instructions.h;


Oh, I'm sorry. Here's the complete patch with crc_instructions.h


I was just about to commit the attached, which is the same as the 
previous patch with just cosmetic comment changes, but then I realized 
that this probably doesn't compile with Visual Studio 2005 or older. The 
code does #ifdef _MSC_VER, and then uses the _mm_crc32_u64 intrinsic, 
but that intrinsic was added in Visual Studio 2008. I think we'll need a 
version check there. Or better yet, a direct configure test to check if 
the intrinsic exists - that way we get to also use it on Intel 
compilers, which I believe also have the same intrinsics.


You want to write that or should I? How do you like this latest version 
of the patch otherwise? You had some criticism earlier, but I had 
forgotten to include the crc_instructions.h header file in that earlier 
version.


- Heikki

From f934cb017ad0270ded73feb4d3279e81a58a4149 Mon Sep 17 00:00:00 2001
From: Heikki Linnakangas heikki.linnakangas@iki.fi
Date: Wed, 25 Mar 2015 18:44:07 +0200
Subject: [PATCH v2 1/1] Use Intel SSE4.2 CRC instructions where available.

On x86, perform a runtime check to see if we're running on a CPU that
supports SSE 4.2. If we are, we can use the special crc32b and crc32q
instructions for the CRC-32C calculations. That greatly speeds up CRC
calculation.

Abhijit Menon-Sen, reviewed by Andres Freund and me.
---
 configure   |   2 +-
 configure.in|   2 +-
 src/common/pg_crc.c | 113 +++
 src/include/common/pg_crc.h |  20 --
 src/include/pg_config.h.in  |   3 +
 src/include/port/crc_instructions.h | 128 
 6 files changed, 248 insertions(+), 20 deletions(-)
 create mode 100644 src/include/port/crc_instructions.h

diff --git a/configure b/configure
index 2c9b3a7..87ceb0b 100755
--- a/configure
+++ b/configure
@@ -9204,7 +9204,7 @@ fi
 done
 
 
-for ac_header in atomic.h crypt.h dld.h fp_class.h getopt.h ieeefp.h ifaddrs.h langinfo.h mbarrier.h poll.h pwd.h sys/ioctl.h sys/ipc.h sys/poll.h sys/pstat.h sys/resource.h sys/select.h sys/sem.h sys/shm.h sys/socket.h sys/sockio.h sys/tas.h sys/time.h sys/un.h termios.h ucred.h utime.h wchar.h wctype.h
+for ac_header in atomic.h cpuid.h crypt.h dld.h fp_class.h getopt.h ieeefp.h ifaddrs.h langinfo.h mbarrier.h poll.h pwd.h sys/ioctl.h sys/ipc.h sys/poll.h sys/pstat.h sys/resource.h sys/select.h sys/sem.h sys/shm.h sys/socket.h sys/sockio.h sys/tas.h sys/time.h sys/un.h termios.h ucred.h utime.h wchar.h wctype.h
 do :
   as_ac_Header=`$as_echo ac_cv_header_$ac_header | $as_tr_sh`
 ac_fn_c_check_header_mongrel $LINENO $ac_header $as_ac_Header $ac_includes_default
diff --git a/configure.in b/configure.in
index b2c1ce7..bf604ea 100644
--- a/configure.in
+++ b/configure.in
@@ -1032,7 +1032,7 @@ AC_SUBST(UUID_LIBS)
 ##
 
 dnl sys/socket.h is required by AC_FUNC_ACCEPT_ARGTYPES
-AC_CHECK_HEADERS([atomic.h crypt.h dld.h fp_class.h getopt.h ieeefp.h ifaddrs.h langinfo.h mbarrier.h poll.h pwd.h sys/ioctl.h sys/ipc.h sys/poll.h sys/pstat.h sys/resource.h sys/select.h sys/sem.h sys/shm.h sys/socket.h sys/sockio.h sys/tas.h sys/time.h sys/un.h termios.h ucred.h utime.h wchar.h wctype.h])
+AC_CHECK_HEADERS([atomic.h cpuid.h crypt.h dld.h fp_class.h getopt.h ieeefp.h ifaddrs.h langinfo.h mbarrier.h poll.h pwd.h sys/ioctl.h sys/ipc.h sys/poll.h sys/pstat.h sys/resource.h sys/select.h sys/sem.h sys/shm.h sys/socket.h sys/sockio.h sys/tas.h sys/time.h sys/un.h termios.h ucred.h utime.h wchar.h wctype.h])
 
 # On BSD, test for net/if.h will fail unless sys/socket.h
 # is included first.
diff --git a/src/common/pg_crc.c b/src/common/pg_crc.c
index eba32d3..6675ae7 100644
--- a/src/common/pg_crc.c
+++ b/src/common/pg_crc.c
@@ -21,25 +21,113 @@
 
#include "common/pg_crc.h"
 
-/* Accumulate one input byte */
-#ifdef WORDS_BIGENDIAN
-#define CRC8(x) pg_crc32c_table[0][((crc >> 24) ^ (x)) & 0xFF] ^ (crc << 8)
+#ifdef PG_HAVE_CRC32C_INSTRUCTIONS
+static pg_crc32 pg_comp_crc32c_hw(pg_crc32 crc, const void *data, size_t len);
+#endif
+
+#if !defined(PG_HAVE_CRC32C_INSTRUCTIONS) || defined(PG_CRC32C_INSTRUCTIONS_NEED_RUNTIME_CHECK)
+static pg_crc32 pg_comp_crc32c_sb8(pg_crc32 crc, const void *data, size_t len);
+static const uint32 pg_crc32c_table[8][256];
+#endif
+
+#ifdef PG_CRC32C_INSTRUCTIONS_NEED_RUNTIME_CHECK
+/*
+ * When built with support for CRC instructions, but we need to perform a
+ * run-time check to determine whether we can actually use them,
+ * pg_comp_crc32c is a function pointer. It is initialized to
+ * pg_comp_crc32c_choose, which performs the runtime check, and 

Re: [HACKERS] What exactly is our CRC algorithm?

2015-03-25 Thread Heikki Linnakangas

On 03/25/2015 07:20 PM, Andres Freund wrote:

On 2015-03-25 19:18:51 +0200, Heikki Linnakangas wrote:

Or better yet, a direct configure test to check if the
intrinsic exists - that way we get to also use it on Intel compilers, which
I believe also has the same intrinsics.


Maybe I'm missing something, but configure isn't run for msvc?


Good point. On MSVC, we use the pre-built pg_config.h.win32 file 
instead. There are already a couple of cases like this in it:


/* Define to 1 if you have the `rint' function. */
#if (_MSC_VER >= 1800)
#define HAVE_RINT 1
#endif

I think we should do that for the CRC32 intrinsic too.
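Following the HAVE_RINT precedent, the guard for the CRC intrinsic might look something like this (a sketch only: `HAVE__MM_CRC32_U64` is a hypothetical symbol name, and Visual Studio 2008 corresponds to `_MSC_VER` 1500):

```c
/* Hypothetical pg_config.h.win32 entry: the _mm_crc32_* intrinsics
 * appeared in Visual Studio 2008 (_MSC_VER 1500). */
#if (_MSC_VER >= 1500)
#define HAVE__MM_CRC32_U64 1
#endif
```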

- Heikki





Re: [HACKERS] What exactly is our CRC algorithm?

2015-03-25 Thread Andres Freund
On 2015-03-25 19:18:51 +0200, Heikki Linnakangas wrote:
 I was just about to commit the attached, which is the same as the previous
 patch with just cosmetic comment changes, but then I realized that this
 probably doesn't compile with Visual Studio 2005 or older. The code does
 #ifdef _MSC_VER, and then uses the _mm_crc32_u64 intrinsic, but that
 intrinsic was added in Visual Studio 2008. I think we'll need a version
 check there.

Good catch.

 Or better yet, a direct configure test to check if the
 intrinsic exists - that way we get to also use it on Intel compilers, which
 I believe also have the same intrinsics.

Maybe I'm missing something, but configure isn't run for msvc?


Greetings,

Andres Freund

-- 
 Andres Freund                     http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] proposal: plpgsql - Assert statement

2015-03-25 Thread Jim Nasby

On 3/25/15 1:21 AM, Pavel Stehule wrote:

2015-03-25 0:17 GMT+01:00 Tom Lane t...@sss.pgh.pa.us:

Pavel Stehule pavel.steh...@gmail.com writes:
 updated version with Jim Nasby's doc and rebase against last changes in
 plpgsql.

I started looking at this patch.  ISTM there are some pretty
questionable
design decisions in it:

1. Why create a core GUC to control a behavior that's plpgsql-only?
I think it'd make more sense for this to be a plgsql custom GUC
(ie, plpgsql.enable_asserts or some such name).


This type of assertion can be implemented in any PL language - so I
prefer a global setting. But I have no strong opinion in this case - it is
a question of granularity, and more than one approach is valid.


+1


2. I find the use of errdetail to report whether the assertion condition
evaluated to FALSE or to NULL to be pretty useless.  (BTW, is
considering
NULL to be a failure the right thing?  SQL CHECK conditions consider
NULL
to be allowed ...)


This is a question - I am happy with SQL CHECK for data, but I am not
sure the same behavior is safe for a plpgsql (procedural) assert. Stricter
behavior is safer - and some bugs in procedures come from unhandled
NULLs in variables. So on this topic I prefer the implemented behavior.
It is something like:


+1. I think POLA here is that an assert must be true and only true to be 
valid. If someone was unhappy with that they could always coalesce(..., 
true).

--
Jim Nasby, Data Architect, Blue Treble Consulting
Data in Trouble? Get it in Treble! http://BlueTreble.com




Re: [HACKERS] Exposing PG_VERSION_NUM in pg_config

2015-03-25 Thread Jim Nasby

On 3/24/15 6:26 PM, Tom Lane wrote:

Andrew Gierth and...@tao11.riddles.org.uk writes:

Tom == Tom Lane t...@sss.pgh.pa.us writes:
  Tom I concur with Michael that there's value in exposing the version
  Tom number in the numeric form used by PG_VERSION_NUM.  However, I
  Tom also concur with Andrew that if the use-case for this is
  Tom Makefiles, pg_config is a pretty poor transmission mechanism.  We
  Tom should instead add PG_VERSION_NUM to the version variables set in
  Tom Makefile.global.



I think there's an argument for both. pg_config already has a VERSION=
string in the output, and I think adding a VERSION_NUM= would be good
for consistency there. And people definitely do want to do version
comparisons in makefiles...


Hm.  We're all agreed that there's a use case for exposing PG_VERSION_NUM
to the makefiles, but I did not hear one for adding it to pg_config; and
doing the former takes about two lines whereas adding a pg_config option
entails quite a lot of overhead (documentation, translatable help text,
yadda yadda).  So I'm not in favor of doing the latter without a much
more solid case than has been made.


Why else would you want the version number other than to do some kind of 
comparison? I know I've had to play these games in the past (outside of 
a Makefile), though I don't remember the details right now. I'm sure I'm 
not alone in that.


Michael's original patch seems to hit everything necessary but the 
translations, and it's only ~15 lines. That doesn't seem very 
unreasonable to me...

--
Jim Nasby, Data Architect, Blue Treble Consulting
Data in Trouble? Get it in Treble! http://BlueTreble.com




Re: [HACKERS] Parallel Seq Scan

2015-03-25 Thread Thom Brown
On 25 March 2015 at 15:49, Amit Kapila amit.kapil...@gmail.com wrote:

 On Wed, Mar 25, 2015 at 5:16 PM, Thom Brown t...@linux.com wrote:
 
  On 25 March 2015 at 10:27, Amit Kapila amit.kapil...@gmail.com wrote:
 
  Fixed the reported issue on assess-parallel-safety thread and another
  bug caught while testing joins and integrated with latest version of
  parallel-mode patch (parallel-mode-v9 patch).
 
  Apart from that I have moved the initialization of the dsm segment from
  InitNode phase to ExecFunnel() (on first execution) as per suggestion
  from Robert.  The main idea is that as it creates large shared memory
  segment, so do the work when it is really required.
 
 
  HEAD Commit-Id: 11226e38
  parallel-mode-v9.patch [2]
  assess-parallel-safety-v4.patch [1]
  parallel-heap-scan.patch [3]
  parallel_seqscan_v12.patch (Attached with this mail)
 
  [1] -
 http://www.postgresql.org/message-id/ca+tgmobjsuefipok6+i9werugeab3ggjv7jxlx+r6s5syyd...@mail.gmail.com
  [2] -
 http://www.postgresql.org/message-id/ca+tgmozfsxzhs6qy4z0786d7iu_abhbvpqfwlthpsvgiecz...@mail.gmail.com
  [3] -
 http://www.postgresql.org/message-id/ca+tgmoyjetgeaxuszrona7bdtwzptqexpjntv1gkcavmgsd...@mail.gmail.com
 
 
  Okay, with my pgbench_accounts partitioned into 300, I ran:
 
  SELECT DISTINCT bid FROM pgbench_accounts;
 
  The query never returns,

 You seem to be hitting the issue I have pointed out in a nearby thread [1]
 and I have mentioned the same while replying on the assess-parallel-safety
 thread.   Can you check after applying the patch in mail [1]?


Ah, okay, here's the patches I've now applied:

parallel-mode-v9.patch
assess-parallel-safety-v4.patch
parallel-heap-scan.patch
parallel_seqscan_v12.patch
release_lock_dsm_v1.patch

(with perl patch for pg_proc.h)

The query now returns successfully.

 and I also get this:
 
  grep -r 'starting background worker process parallel worker for PID
 12165' postgresql-2015-03-25_112522.log  | wc -l
  2496
 
  2,496 workers?  This is with parallel_seqscan_degree set to 8.  If I set
 it to 2, this number goes down to 626, and with 16, goes up to 4320.
 
 ..
 
  Still not sure why 8 workers are needed for each partial scan.  I would
 expect 8 workers to be used for 8 separate scans.  Perhaps this is just my
 misunderstanding of how this feature works.
 

 The reason is that for each table scan, it tries to use workers
 equal to parallel_seqscan_degree if they are available, and in this
 case, as the scans for the inheritance hierarchy (tables in the hierarchy)
 happen one after another, it uses 8 workers for each scan.  I think as of
 now the strategy for deciding the number of workers used in a scan is kept
 simple, and in future we can try to come up with some better mechanism
 for deciding the number of workers.


Yes,  I was expecting the parallel aspect to apply across partitions (a
worker per partition up to parallel_seqscan_degree and reallocate to
another scan once finished with current job), not individual ones, so for
the workers to be above the funnel, not below it.  So this is
parallelising, just not in a way that will be a win in this case. :(  For
the query I posted (SELECT DISTINCT bid FROM pgbench_partitions), the
parallelised version takes 8 times longer to complete.  However, I'm
perhaps premature in what I expect from the feature at this stage.

-- 
Thom


Re: [HACKERS] INSERT ... ON CONFLICT IGNORE (and UPDATE) 3.0

2015-03-25 Thread Peter Geoghegan
On Wed, Mar 18, 2015 at 2:41 PM, Heikki Linnakangas hlinn...@iki.fi wrote:
 Here's what I had in mind: the inserter tags the tuple with the speculative
 insertion token, by storing the token in the t_ctid field. If the inserter
 needs to super-delete the tuple, it sets xmax like in a regular deletion,
 but also sets another flag to indicate that it was a super-deletion.

I was able to quickly hack up a prototype of this in my hotel room at
pgConf.US. It works fine at first blush, passing the jjanes_upsert
stress tests and my own regression tests without a problem. Obviously
it needs more testing and clean-up before posting, but I was pleased
with how easy this was.

 When another backend inserts, and notices that it has a potential conflict
 with the first tuple, it tries to acquire a hw-lock on the token. In most
 cases, the inserter has long since completed the insertion, and the
 acquisition succeeds immediately but you have to check because the token is
 not cleared on a completed insertion.

You don't even have to check/take a ShareLock on the token when the
other xact committed/aborted, because you know that if it is there,
then based on that (and based on the fact that it wasn't super
deleted) the tuple is visible/committed, or (in the event of
other-xact-abort) not visible/aborted. In other words, we continue to
only check for a speculative token when the inserting xact is in
flight - we just take the token from the heap now instead. Not much
needs to change, AFAICT.

 Regarding the physical layout: We can use a magic OffsetNumber value above
 MaxOffsetNumber to indicate that the t_ctid field stores a token rather than
 a regular ctid value. And another magic t_ctid value to indicate that a
 tuple has been super-deleted. The token and the super-deletion flag are
 quite ephemeral, they are not needed after the inserting transaction has
 completed, so it's nice to not consume the valuable infomask bits for these
 things. Those states are conveniently not possible on an updated tuple, when
 we would need the t_ctid field for its current purpose.

Haven't done anything about this yet. I'm just using an infomask2 bit
for now. Although that was only because I forgot that you suggested
this before having a go at implementing this new t_ctid scheme!

My next revision will have a more polished version of this scheme. I'm
not going to immediately act on Robert's feedback elsewhere (although
I'd like to), owing to time constraints - no reason to deny you the
opportunity to review the entirely unrelated low-level speculative
locking mechanism due to that.

-- 
Peter Geoghegan




Re: [HACKERS] Exposing PG_VERSION_NUM in pg_config

2015-03-25 Thread Andres Freund
On 2015-03-25 14:50:44 -0400, Tom Lane wrote:
 Jim Nasby jim.na...@bluetreble.com writes:
  On 3/24/15 6:26 PM, Tom Lane wrote:
  Hm.  We're all agreed that there's a use case for exposing PG_VERSION_NUM
  to the makefiles, but I did not hear one for adding it to pg_config; and
  doing the former takes about two lines whereas adding a pg_config option
  entails quite a lot of overhead (documentation, translatable help text,
  yadda yadda).  So I'm not in favor of doing the latter without a much
  more solid case than has been made.
 
  Why else would you want the version number other than to do some kind of 
  comparison?
 
 The question is why, if we supply the version number in a make variable,
 you would not just use that variable instead of having to do
 $(shell $(PG_CONFIG) --something).  The shell version adds new failure
 modes, removes none, and has no redeeming social value that I can see.

I think using the makefile is preferrable if you have the version
dependency in the makefile. But if you don't actually use make
(e.g. stuff not written in C) or you need the detection in configure or
something, it's different.

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training  Services




Re: [HACKERS] deparsing utility commands

2015-03-25 Thread Ryan Pedela
On Wed, Mar 25, 2015 at 11:59 AM, Alvaro Herrera alvhe...@2ndquadrant.com
wrote:


 * Should we prohibit DDL from within event triggers?


Please don't prohibit DDL unless there is a really, really good reason to
do so. I have several use cases in mind for event triggers, but they are
only useful if I can perform DDL.

For example, I want to create an Elasticsearch FDW and use that to index
and search Postgres tables. When a table is created, I am planning to use
an event trigger to capture the CREATE event and automatically create a
foreign table via the Elasticsearch FDW. In addition, I would add normal
triggers to the new table which capture DML and update the foreign table
accordingly. In other words, I want to use FDWs and event triggers to
automatically sync table DDL and DML to Elasticsearch.


Re: [HACKERS] Exposing PG_VERSION_NUM in pg_config

2015-03-25 Thread Tom Lane
Jim Nasby jim.na...@bluetreble.com writes:
 On 3/24/15 6:26 PM, Tom Lane wrote:
 Hm.  We're all agreed that there's a use case for exposing PG_VERSION_NUM
 to the makefiles, but I did not hear one for adding it to pg_config; and
 doing the former takes about two lines whereas adding a pg_config option
 entails quite a lot of overhead (documentation, translatable help text,
 yadda yadda).  So I'm not in favor of doing the latter without a much
 more solid case than has been made.

 Why else would you want the version number other than to do some kind of 
 comparison?

The question is why, if we supply the version number in a make variable,
you would not just use that variable instead of having to do
$(shell $(PG_CONFIG) --something).  The shell version adds new failure
modes, removes none, and has no redeeming social value that I can see.

regards, tom lane




Re: [HACKERS] Remove fsync ON/OFF as a visible option?

2015-03-25 Thread Jim Nasby

On 3/21/15 12:25 PM, Jeff Janes wrote:

On Sat, Mar 21, 2015 at 8:54 AM, Tom Lane t...@sss.pgh.pa.us wrote:

Stephen Frost sfr...@snowman.net writes:
 At the moment, one could look at our default postgresql.conf and the
 turns forced synchronization on or off and think it's something akin
 or somehow related to synchronous_commit (which is completely different,
 but the options are right next to each other..).

 How about a big warning around fsync and make it more indepenent from
 the options around it?

Yeah, the main SGML docs are reasonably clear about the risks of fsync,
but postgresql.conf doesn't give you any hint that it's dangerous.  Now
I'm not entirely sure that people who frob postgresql.conf without
having
read the docs can be saved from themselves, but we could do something
like this:

  # - Settings -

  #wal_level = minimal                  # minimal, archive, hot_standby, or logical
                                        # (change requires restart)
  #fsync = on                           # turns forced synchronization on or off
+                                       # (fsync=off is dangerous, read the
+                                       # manual before using it)
  #synchronous_commit = on              # synchronization level;
                                        # off, local, remote_write, or on
  #wal_sync_method = fsync              # the default is the first option
                                        # supported by the operating system:

Also, I think the short description turns forced synchronization on or
off could stand improvement; it really conveys zero information.  Maybe
something like force data to disk when committing?


I agree the current description is lacking, but that proposed wording
would be a better description of synchronous_commit.  It is
checkpointing and flush-WAL-before-data where fsync=off does its damage.
Force data to disk when needed for integrity?

Or just don't describe what it is at all, and refer to the documentation
only.


I see 3 settings that allow people to accidentally shoot themselves in 
the foot; fsync, wal_sync_method and full_page_writes.


How about just grouping those 3 together with a bulk disclaimer along 
the lines of The following 3 settings are dangerous. Use at your own 
risk, and read the docs first.? That would also allow us to just remove 
the comments about what the settings do; if you don't already know you 
certainly shouldn't be touching them! :)

--
Jim Nasby, Data Architect, Blue Treble Consulting
Data in Trouble? Get it in Treble! http://BlueTreble.com




Re: [HACKERS] GSoC 2015 proposal. Bitmap Index-only Count

2015-03-25 Thread Anastasia Lubennikova
2015-03-24 18:01 GMT+04:00 Tom Lane t...@sss.pgh.pa.us:

 Anastasia Lubennikova lubennikov...@gmail.com writes:
  There is a problem of slow counting in PostgreSQL [1]. The reason why
 this
  is slow is related to the *MVCC* implementation in PostgreSQL. Index-only
  scans (implemented since PostgreSQL-9.2) provide some performance
  improvements where the *visibility map* of the table allows it. That’s
  good. But it works only for access methods which provide amgettuple
 method.
  Unfortunately GIN supports only BitmapIndexScan and has no implementation
  of index_getnext() interface [2].

 Right ...

  As a GSoC student I will create new Node “Bitmap Index-Only Scan”, which
  would catch tuples from Bitmap Index Scan node and pass them to Aggregate
  node. Thus, new query plan will be as follow:

 I'm pretty hesitant about adding a whole new plan node type (which will
 require quite a lot of infrastructure) for such a narrow use-case.
 I think the odds are good that if you proceed down this path, you will
 end up with something that never gets committed to Postgres.


Thanks a lot for the reply. It was just an approximate idea. I thought it
wasn't very good.

I wonder whether it'd be possible to teach GIN to support index_getnext
 instead.  Initially it would probably work only for cases where the
 index didn't have to return any columns ... but if we did it, maybe the
 door would be open to cases where GIN could reconstruct actual values.

Another idea is to write index_getnext() for GIN which would return some
fake tuple, since it makes no difference to the COUNT aggregate what the
tuple contains. COUNT just wants to know whether we have a tuple that
satisfies the qual.
Is this idea better? Is it possible for the planner to use index_getnext()
for GIN only with the COUNT aggregate?


-- 
Best regards,
Lubennikova Anastasia


Re: [HACKERS] deparsing utility commands

2015-03-25 Thread Alvaro Herrera
Alvaro Herrera wrote:
 Here's an updated version of this series.

I just pushed patches 0001 and 0002, with very small tweaks; those had
already been reviewed and it didn't seem like there was much
controversy.

To test the posted series it's probably easiest to
git checkout b3196e65f5bfc997ec7fa3f91645a09289c10dee
and apply it all on top of that.

-- 
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services




Re: [HACKERS] Reduce pinning in btree indexes

2015-03-25 Thread Kevin Grittner
Kevin Grittner kgri...@ymail.com wrote:
 Kyotaro HORIGUCHI horiguchi.kyot...@lab.ntt.co.jp wrote:

 I attached your latest patch to this mail as
 bt-nopin-v4.patch for now. Please check that there's no problem
 in it.

 I checked out master, applied the patch and checked it against my
 latest code (merged with master), and it matched.  So, looks
 good.

Pushed with the addition of one more paragraph of comments,
regarding something to watch out for in any follow-on patch to
eliminate the pin for a scan using a non-MVCC snapshot.

Thanks again to Kyotaro-san and Heikki for the reviews!

--
Kevin Grittner
EDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] GSoC 2015 proposal. Bitmap Index-only Count

2015-03-25 Thread Tom Lane
Anastasia Lubennikova lubennikov...@gmail.com writes:
 2015-03-24 18:01 GMT+04:00 Tom Lane t...@sss.pgh.pa.us:
 I wonder whether it'd be possible to teach GIN to support index_getnext
 instead.  Initially it would probably work only for cases where the
 index didn't have to return any columns ... but if we did it, maybe the
 door would be open to cases where GIN could reconstruct actual values.

 Another idea is to write index_getnext() for GIN which would return some
 fake tuple. Since there is no difference for COUNT aggregate what the tuple
 contains. COUNT just wants to know whether we have tuple that satisfy the
 qual.

Well, yeah, that would be the idea (at least initially).  You don't have
to return any real data unless you claim you can do so via amcanreturn.
The planner is still capable of selecting an index-only scan as long as
the query retrieves no columns.

The trick would be to not return the same heap TID more than once per
scan.  A zero-order implementation would be to construct the same bitmap
we do now and then just provide a gingetnext function that scans through
that.  That would be pretty awful in terms of scan startup time, so doing
better would be nice; but perhaps it would be useful even in that form.

regards, tom lane




Re: [HACKERS] [COMMITTERS] pgsql: Add macros wrapping all usage of gcc's __attribute__.

2015-03-25 Thread Tom Lane
Andres Freund and...@anarazel.de writes:
 Add macros wrapping all usage of gcc's __attribute__.

I noticed that this commit attached pg_attribute_noreturn not only
to the extern declarations, but to some actual function definitions.
I think this is a bad idea, because it's going to look like heck after
pgindent gets through with it.  Do we actually need decoration on the
function definitions?

regards, tom lane




Re: [HACKERS] proposal: plpgsql - Assert statement

2015-03-25 Thread Tom Lane
Jim Nasby jim.na...@bluetreble.com writes:
 On 3/25/15 1:21 AM, Pavel Stehule wrote:
 2015-03-25 0:17 GMT+01:00 Tom Lane t...@sss.pgh.pa.us:
 (BTW, is considering
 NULL to be a failure the right thing?  SQL CHECK conditions consider
 NULL to be allowed ...)

 This is a question - I am happy with SQL CHECK for data, but I am not
 sure the same behavior is safe for a plpgsql (procedural) assert. Stricter
 behavior is safer - and some bugs in procedures come from unhandled
 NULLs in variables. So on this topic I prefer the implemented behavior.
 It is something like:

 +1. I think POLA here is that an assert must be true and only true to be 
 valid. If someone was unhappy with that they could always coalesce(..., 
 true).

Fair enough.  Committed with the other changes.

regards, tom lane




Re: [HACKERS] Abbreviated keys for Numeric

2015-03-25 Thread Peter Geoghegan
On Wed, Mar 25, 2015 at 6:26 PM, Andrew Gierth
and...@tao11.riddles.org.uk wrote:
 Peter == Peter Geoghegan p...@heroku.com writes:

  Peter You still pointlessly check memtupcount here:

  Peter + if (memtupcount  1 || nss-input_count  1 || 
 !nss-estimating)
  Peter + return false;

 It's in a register; the test is free.

Unbelievable! The test is free, and you wrote it that way in the
first place, so we should just leave it that way rather than changing
it to suit you?

As things stand, the test is totally misleading. In fact, I think it
was the source of my initial confusion about the way you track
non-null tuple count separately. I certainly didn't have any clue from
comments. Apart from fixing this, I added some comments explaining
that in my version, but you didn't make any effort to follow that
either.

  Peter This cast to void is unnecessary:

  Peter + (void) ssup;

 It's an explicit statement that the parameter is otherwise unused.
 Maybe that compiler warning isn't usually on by default, but I
 personally regard it as good style to be explicit about it.

Your personal preferences don't enter into it, since every sort
support routine since 2011 has followed that style (most still don't
touch the sortsupport object). That's the way things work around here
- consistency matters. If the existing convention is wrong, we fix the
existing convention.

Why do you get to ignore well established practice like this? Are you special?

  Peter Please try and at least consider my feedback. I don't expect you
  Peter to do exactly what I ask, but I also don't expect you to
  Peter blithely ignore it.

 You should really stop digging this hole deeper.

You have that backwards.

   The INT64_MIN/MAX changes should be committed fairly soon. (I
   haven't posted a patch for TRACE_SORT)

  Peter I wouldn't assume that.

 Oh ye of little faith. I would not have said that had I not already been
 informed of it by a committer, and indeed it is now committed.

You could have said so.

-- 
Peter Geoghegan




Re: [HACKERS] Error with index on unlogged table

2015-03-25 Thread Michael Paquier
On Thu, Mar 26, 2015 at 1:02 AM, Fabrízio de Royes Mello
fabriziome...@gmail.com wrote:
 On Wed, Mar 25, 2015 at 12:46 PM, Fabrízio de Royes Mello
 fabriziome...@gmail.com wrote:


 On Wed, Mar 25, 2015 at 10:53 AM, Andres Freund and...@2ndquadrant.com
 wrote:
 

  Did you check whether a similar bug was made in other places of
  85b506bb? Could you additionally add a regression test to this end?
  Seems like something worth testing.
 

 I'm checking it and adding some regression tests.


 I didn't find any other similar bug introduced by 85b506bb.

 Attached the original patch provided by Michael with some regression tests.

Thanks for adding a test; this looks fine to me (I did some sanity
checks and tutti quanti, for people wondering). On temporary tables
this was failing with an error in md.c...
-- 
Michael




Re: [HACKERS] Remove fsync ON/OFF as a visible option?

2015-03-25 Thread Jeff Janes
On Wed, Mar 25, 2015 at 12:45 PM, Jim Nasby jim.na...@bluetreble.com
wrote:


 I see 3 settings that allow people to accidentally shoot themselves in the
 foot; fsync, wal_sync_method and full_page_writes.

 How about just grouping those 3 together with a bulk disclaimer along the
 lines of "The following 3 settings are dangerous. Use at your own risk, and
 read the docs first."? That would also allow us to just remove the comments
 about what the settings do; if you don't already know, you certainly
 shouldn't be touching them! :)


But one of these things is not like the others.  Any supported (i.e.
non-fatal-erroring) setting of wal_sync_method *should* always be safe
(although it may be inefficient) if the underlying kernel, RAID controller,
hard drives, and filesystem fulfill their pledges.  It is hard to document
every known liar in this regard.  About the best you can do, short of a
pull-the-plug test on a massive scale, is to run pg_test_fsync and assume
that any result inconsistent with the RPM of the spinning rust is obviously
unsafe. Unfortunately that doesn't rule out the possibility that something
is both unsafe and gratuitously slow.

Cheers,

Jeff


[HACKERS] GSoC 2015 proposal: Improve the performance of “ALTER TABLE .. SET LOGGED / UNLOGGED” statement

2015-03-25 Thread Fabrízio de Royes Mello
Hi all,

Below is my proposal idea for this GSoC.


*** Improve the performance of “ALTER TABLE .. SET LOGGED / UNLOGGED”
statement ***

Last year, during GSoC 2014, I implemented the feature that allows an
unlogged table to be changed to logged [1], but the design chosen was to
rewrite the entire heap into a new relfilenode with a new relpersistence,
because of some problems pointed out here [2].

The project was successfully finished and got committed [3] into PostgreSQL,
to be released this year in version 9.5.

However, this design leads to performance problems with large relations,
because we need to write the entire content of the relation twice: once
into a new heap and again into the WAL. This project will change the
current design of the mechanism that changes an unlogged table to logged
so that it no longer rewrites the entire heap, but just removes the init
forks; and if wal_level != minimal, we'll write the contents to the WAL
too.
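
For reference, the user-visible statements whose implementation this
proposal would optimize (syntax as committed for 9.5; the table name is
hypothetical):

```sql
-- An unlogged table skips WAL: fast to load, but truncated on crash
-- and not replicated to standbys.
CREATE UNLOGGED TABLE measurements (id bigint, payload text);

-- As of 9.5 this works, but rewrites the whole heap into a new
-- relfilenode; the proposal above would make it cheap under
-- wal_level = minimal by only dropping the init fork.
ALTER TABLE measurements SET LOGGED;

-- And back again:
ALTER TABLE measurements SET UNLOGGED;
```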


** Benefits to the PostgreSQL Community **

The “unlogged” tables feature was introduced in version 9.1. Such tables
provide better write performance than regular (logged) tables, but are not
crash-safe: their contents are automatically discarded (truncated) if the
server crashes, and their contents do not propagate to standby servers.

We already have a way to change an unlogged table to logged using “ALTER
TABLE name SET LOGGED”, developed last year during GSoC 2014. Now, during
GSoC 2015, we'll redesign the internals to improve the I/O performance of
this feature by removing the need to rewrite the entire heap into a new
relfilenode.

There is already a good design idea here [4], but I'll discuss the design
with my mentor before implementing this improvement.


** Additional Goals **

The main goal of this project is to improve the performance of ALTER TABLE
name SET { LOGGED | UNLOGGED }, but we can expand this proposal with more
related goals.

* Allow unlogged materialized views
  . ALTER MATERIALIZED VIEW name SET { UNLOGGED | LOGGED }
* Allow unlogged indexes on logged tables
  . ALTER INDEX name SET { UNLOGGED | LOGGED }


** Deliverables **

This project has just one deliverable at the end: the improved routines
that transform an “unlogged” table to “logged” and vice versa, without the
need to create a new “relfilenode” with a different “relpersistence”.


** Project Schedule **

until May 25:
* create a website to the project (wiki.postgresql.org)
* create a public repository to the project (github.com/fabriziomello)
* read what has already been discussed by the community about the project
[4]
* learn about some PostgreSQL internals:
  . control data (src/include/catalog/pg_control.h)
  . storage (src/backend/storage/*)
* discuss the additional goals with community

May 26 - June 21
* implementation of the first prototype:
  . implement the change of an unlogged table to logged without rewriting
the entire heap when “wal_level = minimal”
  . at this point, when “wal_level != minimal”, we use the current
implementation
* write documentation and test cases
* submit this first prototype to the 2015-06 commitfest (
https://commitfest.postgresql.org/5/)

June 22 - June 26
* mentor review the work in progress

June 27 - August 17
* make adjustments based on community feedback from the 2015-06 commitfest
* implementation of the second prototype:
  . when “wal_level != minimal”, remove the init fork (first prototype)
and also write the relation pages to the WAL
  . implement “ALTER MATERIALIZED VIEW .. SET LOGGED / UNLOGGED”
* submit to the 2015-09 commitfest for final evaluation; it may be
committed for version 9.6 (webpage not created yet)

August 18 - August 21
* make adjustments based on community feedback from the 2015-09 commitfest
* final mentor review


** About the proponent **

Fabrízio de Royes Mello

e-mail: fabriziome...@gmail.com
twitter: @fabriziomello
github: http://github.com/fabriziomello
linkedin: http://linkedin.com/in/fabriziomello

Currently I help people and teams take full advantage of relational
databases, especially PostgreSQL: designing the database structure
(modeling), building the physical architecture (database schema),
programming (procedural languages), SQL (usage, tuning, best practices),
optimization, and orchestration of production instances. I do volunteer
work for the Brazilian PostgreSQL community (www.postgresql.org.br),
supporting mailing lists, organizing events (pgbr.postgresql.org.br), and
handling some admin tasks. I also help the PostgreSQL Global Development
Group (PGDG) a little with the implementation of some features and review
of patches (git.postgresql.org).


Links

[1]
https://wiki.postgresql.org/wiki/Allow_an_unlogged_table_to_be_changed_to_logged_GSoC_2014
[2]
http://www.postgresql.org/message-id/CA+Tgmob44LNwwU73N1aJsGQyzQ61SdhKJRC_89wCm0+aLg=x...@mail.gmail.com
[3]

Re: [HACKERS] Why SyncRepWakeQueue is not static?

2015-03-25 Thread Tatsuo Ishii
 SyncRepWakeQueue (src/backend/replication/syncrep.c) is not used
 anywhere except in the file. If there's no good reason for it, I think
 it should be declared as a static function.  Included patch does so.

Fix committed/pushed from master to 9.2. 9.1 declares it as a static
function.

Best regards,
--
Tatsuo Ishii
SRA OSS, Inc. Japan
English: http://www.sraoss.co.jp/index_en.php
Japanese:http://www.sraoss.co.jp




Re: [HACKERS] Why SyncRepWakeQueue is not static?

2015-03-25 Thread Michael Paquier
On Thu, Mar 26, 2015 at 10:46 AM, Tatsuo Ishii is...@postgresql.org wrote:
 SyncRepWakeQueue (src/backend/replication/syncrep.c) is not used
 anywhere except in the file. If there's no good reason for it, I think
 it should be declared as a static function.  Included patch does so.

 Fix committed/pushed from master to 9.2. 9.1 declares it as a static
 function.

Er, is that a good idea to back-patch that? Normally routine specs are
maintained stable on back-branches, and this is just a cosmetic
change.
-- 
Michael




Re: [HACKERS] Error with index on unlogged table

2015-03-25 Thread Michael Paquier
On Wed, Mar 25, 2015 at 10:53 PM, Andres Freund and...@2ndquadrant.com wrote:
 Hi,

 On 2015-03-25 11:38:30 +0900, Michael Paquier wrote:
 On Tue, Mar 24, 2015 at 8:46 PM, Thom Brown wrote:
  The index is unlogged until reindexing...
 
  [...]
  Which is think also raises the question, why are unlogged indexes made
  persistent by a reindex?

 That's a bug in HEAD; ~9.4 keeps the index unlogged even after
 REINDEX INDEX. What happens is that ReindexIndex relies on the
 relpersistence provided by makeRangeVar at parse time, which is just
 incorrect, as it uses RELPERSISTENCE_PERMANENT all the time. The patch
 attached fixes that...

 What the hell? That's somewhat nasty. Nice that it got caught before 9.5
 was released.

 Did you check whether a similar bug was made in other places of
 85b506bb?

Yeah, I took a look at the other code paths, particularly cluster and
matviews, and the relpersistence used there is taken directly from a
Relation.

 Could you additionally add a regression test to this end?
 Seems like something worth testing.

Definitely. And I guess that Fabrizio already did that...
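
A sketch of the reported misbehavior (hypothetical object names; before the
fix, on HEAD, the REINDEX flipped the index's relpersistence to permanent):

```sql
CREATE UNLOGGED TABLE t (a int);
CREATE INDEX t_a_idx ON t (a);

-- Before the fix this turned the index permanent on HEAD, because the
-- parse-time RangeVar always carried RELPERSISTENCE_PERMANENT:
REINDEX INDEX t_a_idx;

-- 'u' (unlogged) is expected for both relations:
SELECT relname, relpersistence
FROM pg_class
WHERE relname IN ('t', 't_a_idx');
```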
-- 
Michael




[HACKERS] Moving on to close the current CF 2015-02

2015-03-25 Thread Michael Paquier
Hi all,

Visibly there is no commitfest manager this time (was it me?), and people
may think that I am still the CFM for 2015-02, continuing on from 2014-12,
and that I am severely slacking on my duties. Honestly, I thought that I
was not, and that it was clear enough... Still, biting the bullet to make
things move on: should I launch a VACUUM FULL on the current entries of
the CF to brush up the things that could get into 9.5? The current CF
officially finished two weeks ago.

Regards,
-- 
Michael




Re: [HACKERS] Moving on to close the current CF 2015-02

2015-03-25 Thread Michael Paquier
On Thu, Mar 26, 2015 at 10:41 AM, Bruce Momjian br...@momjian.us wrote:
 On Thu, Mar 26, 2015 at 10:22:11AM +0900, Michael Paquier wrote:
 Hi all,

 Visibly there is no commit fest manager this time (was I?), and people
 may think that I still am the CFM for 2015-02, continuously after
 2014-12 and that I am severely slacking on my duties. Honestly I
 thought that I was not and that it was clear enough... Still, biting
 the bullet to make things move on, should I launch a VACUUM FULL on
 the current entries of the CF to brush up things that could get in
 9.5? The current CF officially finished two weeks ago.

 Usually the last commitfest is double the typical length.

Oh, OK. So that leaves 3 more weeks...
-- 
Michael




Re: [HACKERS] Why SyncRepWakeQueue is not static?

2015-03-25 Thread Tatsuo Ishii
 Fix committed/pushed from master to 9.2. 9.1 declares it as a static
 function.
 
 Er, is that a good idea to back-patch that? Normally routine specs are
 maintained stable on back-branches, and this is just a cosmetic
 change.

I'm not sure whether it's a cosmetic change or not. I thought declaring a
to-be-static function as extern was against our coding standard.
Moreover, if someone wants to change code near that place in the future,
changes made to head may not be easily back-patched or cherry-picked to
older branches if I do not back-patch this.

Best regards,
--
Tatsuo Ishii
SRA OSS, Inc. Japan
English: http://www.sraoss.co.jp/index_en.php
Japanese:http://www.sraoss.co.jp




Re: [HACKERS] Re: [COMMITTERS] pgsql: btree_gin: properly call DirectFunctionCall1()

2015-03-25 Thread Tom Lane
Bruce Momjian br...@momjian.us writes:
 OK, I figured out that I was only supposed to change inet_in, not the
 other calls to DirectFunctionCall3 (varbit_in and bit_in).  Patch
 attached.

That looks better ...

regards, tom lane




[HACKERS] Re: [COMMITTERS] pgsql: btree_gin: properly call DirectFunctionCall1()

2015-03-25 Thread Bruce Momjian
On Tue, Mar 24, 2015 at 10:35:10PM -0400, Bruce Momjian wrote:
 On Tue, Mar 24, 2015 at 10:02:03PM -0400, Tom Lane wrote:
  Bruce Momjian br...@momjian.us writes:
   btree_gin:  properly call DirectFunctionCall1()
   Previously we called DirectFunctionCall3() with dummy arguments.
  
  This patch is entirely wrong and has broken the buildfarm.
  Please revert.
 
 OK, done.

OK, I figured out that I was only supposed to change inet_in, not the
other calls to DirectFunctionCall3 (varbit_in and bit_in).  Patch
attached.

-- 
  Bruce Momjian  br...@momjian.us  http://momjian.us
  EnterpriseDB http://enterprisedb.com

  + Everyone has their own god. +
diff --git a/contrib/btree_gin/btree_gin.c b/contrib/btree_gin/btree_gin.c
new file mode 100644
index 80521fb..1a5bb3c
*** a/contrib/btree_gin/btree_gin.c
--- b/contrib/btree_gin/btree_gin.c
*** GIN_SUPPORT(macaddr, false, leftmostvalu
*** 318,327 
  static Datum
  leftmostvalue_inet(void)
  {
! 	return DirectFunctionCall3(inet_in,
! 			   CStringGetDatum("0.0.0.0/0"),
! 			   ObjectIdGetDatum(0),
! 			   Int32GetDatum(-1));
  }
  GIN_SUPPORT(inet, true, leftmostvalue_inet, network_cmp)
  
--- 318,324 
  static Datum
  leftmostvalue_inet(void)
  {
! 	return DirectFunctionCall1(inet_in, CStringGetDatum("0.0.0.0/0"));
  }
  GIN_SUPPORT(inet, true, leftmostvalue_inet, network_cmp)
  



Re: [HACKERS] Moving on to close the current CF 2015-02

2015-03-25 Thread Bruce Momjian
On Thu, Mar 26, 2015 at 10:22:11AM +0900, Michael Paquier wrote:
 Hi all,
 
 Visibly there is no commit fest manager this time (was I?), and people
 may think that I still am the CFM for 2015-02, continuously after
 2014-12 and that I am severely slacking on my duties. Honestly I
 thought that I was not and that it was clear enough... Still, biting
 the bullet to make things move on, should I launch a VACUUM FULL on
 the current entries of the CF to brush up things that could get in
 9.5? The current CF officially finished two weeks ago.

Usually the last commitfest is double the typical length.

-- 
  Bruce Momjian  br...@momjian.us  http://momjian.us
  EnterpriseDB http://enterprisedb.com

  + Everyone has their own god. +




Re: Custom/Foreign-Join-APIs (Re: [HACKERS] [v9.5] Custom Plan API)

2015-03-25 Thread Kouhei Kaigai
  Or bottom of make_join_rel().  IMO build_join_rel() is responsible for just
 building (or searching from a list) a RelOptInfo for given relids.  After that
 make_join_rel() calls add_paths_to_joinrel() with appropriate arguments per 
 join
 type to generate actual Paths implements the join.  make_join_rel() is called
 only once for particular relid combination, and there SpecialJoinInfo and
 restrictlist (conditions specified in JOIN-ON and WHERE), so it seems 
 promising
 for FDW cases.
 
  I like that idea, but I think we will have complex hook signature, it won't
 remain as simple as hook (root, joinrel).
 
 Signature of the hook (or the FDW API handler) would be like this:
 
 typedef void (*GetForeignJoinPaths_function ) (PlannerInfo *root,
RelOptInfo *joinrel,
RelOptInfo *outerrel,
RelOptInfo *innerrel,
JoinType jointype,
SpecialJoinInfo *sjinfo,
List *restrictlist);
 
 This is very similar to add_paths_to_joinrel(), but lacks semifactors and
 extra_lateral_rels.  semifactors can be obtained with
 compute_semi_anti_join_factors(), and extra_lateral_rels can be constructed
 from root->placeholder_list as add_paths_to_joinrel() does.
 
 From the viewpoint of postgres_fdw, jointype and restrictlist is necessary to
 generate SELECT statement, so it would require most work done in make_join_rel
 again if the signature was hook(root, joinrel).  sjinfo will be necessary for
 supporting SEMI/ANTI joins, but currently it is not in the scope of 
 postgres_fdw.
 
 I guess that other FDWs require at least jointype and restrictlist.

The attached patch adds the GetForeignJoinPaths call in make_join_rel() only
when 'joinrel' is actually built and both child relations are managed by the
same FDW driver, prior to any other built-in join paths.
I adjusted the hook definition a little bit, because jointype can be
reproduced from the SpecialJoinInfo. Right?

Probably it will solve the original concern about multiple calls of the FDW
handler in the case where it tries to replace an entire join subtree with a
foreign scan on the result of a remote join query.

How about your opinion?

Thanks,
--
NEC OSS Promotion Center / PG-Strom Project
KaiGai Kohei kai...@ak.jp.nec.com



pgsql-v9.5-custom-join.v11.patch
Description: pgsql-v9.5-custom-join.v11.patch



Re: Custom/Foreign-Join-APIs (Re: [HACKERS] [v9.5] Custom Plan API)

2015-03-25 Thread Shigeru HANADA

On 2015/03/25 18:53, Ashutosh Bapat ashutosh.ba...@enterprisedb.com wrote:

 
 
 On Wed, Mar 25, 2015 at 3:14 PM, Shigeru HANADA shigeru.han...@gmail.com 
 wrote:
 
 Or bottom of make_join_rel().  IMO build_join_rel() is responsible for just 
 building (or searching from a list) a RelOptInfo for given relids.  After 
 that make_join_rel() calls add_paths_to_joinrel() with appropriate arguments 
 per join type to generate actual Paths implements the join.  make_join_rel() 
 is called only once for particular relid combination, and there 
 SpecialJoinInfo and restrictlist (conditions specified in JOIN-ON and WHERE), 
 so it seems promising for FDW cases.
 
 I like that idea, but I think we will have complex hook signature, it won't 
 remain as simple as hook (root, joinrel). 

Signature of the hook (or the FDW API handler) would be like this:

typedef void (*GetForeignJoinPaths_function ) (PlannerInfo *root,
   RelOptInfo *joinrel,
   RelOptInfo *outerrel,
   RelOptInfo *innerrel,
   JoinType jointype,
   SpecialJoinInfo *sjinfo,
   List *restrictlist);

This is very similar to add_paths_to_joinrel(), but lacks semifactors and 
extra_lateral_rels.  semifactors can be obtained with 
compute_semi_anti_join_factors(), and extra_lateral_rels can be constructed 
from root->placeholder_list as add_paths_to_joinrel() does.

From the viewpoint of postgres_fdw, jointype and restrictlist is necessary to 
generate SELECT statement, so it would require most work done in make_join_rel 
again if the signature was hook(root, joinrel).  sjinfo will be necessary for 
supporting SEMI/ANTI joins, but currently it is not in the scope of 
postgres_fdw.

I guess that other FDWs require at least jointype and restrictlist.

—
Shigeru HANADA



[HACKERS] Ignoring entries generated by autoconf in code tree

2015-03-25 Thread Michael Paquier
Hi all,

When running autoconf from the root tree, autom4te.cache/ is
automatically generated.
Wouldn't it make sense to add an entry in .gitignore for that?
Regards,
-- 
Michael
diff --git a/.gitignore b/.gitignore
index 8d3af50..b1f04bb 100644
--- a/.gitignore
+++ b/.gitignore
@@ -28,6 +28,7 @@ lib*dll.def
 lib*.pc
 
 # Local excludes in root directory
+/autom4te.cache/
 /GNUmakefile
 /config.cache
 /config.log



Re: Custom/Foreign-Join-APIs (Re: [HACKERS] [v9.5] Custom Plan API)

2015-03-25 Thread Shigeru HANADA

On 2015/03/25 19:47, Kouhei Kaigai kai...@ak.jp.nec.com wrote:
 The reason why FDW handler was called multiple times on your example is,
 your modified make_join_rel() does not check whether build_join_rel()
 actually build a new RelOptInfo, or just a cache reference, doesn't it?
 

Yep.  After that change, the call counts look like this:

fdw=# explain select * from pgbench_branches b join pgbench_tellers t on t.bid 
= b.bid join pgbench_accounts a on a.bid = b.bid and a.bid = t.bid;
INFO:  postgresGetForeignJoinPaths() 1x2
INFO:  postgresGetForeignJoinPaths() 1x4
INFO:  postgresGetForeignJoinPaths() 2x4
INFO:  standard_join_search() old hook point
INFO:  standard_join_search() old hook point
INFO:  standard_join_search() old hook point
INFO:  postgresGetForeignJoinPaths() 0x4
INFO:  standard_join_search() old hook point
   QUERY PLAN
-
 Foreign Scan  (cost=100.00..102.11 rows=211 width=1068)
(1 row)

fdw=#

 If so, I'm inclined to your proposition.
 A new bool *found argument for build_join_rel() would reduce the number
 of FDW handler calls, while keeping reasonable information to build the
 remote-join query.

Another idea is to pass “found” as a parameter to the FDW handler, and let
the FDW decide whether to skip.  Some FDWs (and some custom-scan providers?)
might want to be conscious of the join combination.

—
Shigeru HANADA



Re: [HACKERS] Parallel Seq Scan

2015-03-25 Thread Thom Brown
On 25 March 2015 at 10:27, Amit Kapila amit.kapil...@gmail.com wrote:

 On Fri, Mar 20, 2015 at 5:36 PM, Amit Kapila amit.kapil...@gmail.com
 wrote:
 
 
  So the patches have to be applied in below sequence:
  HEAD Commit-id : 8d1f2390
  parallel-mode-v8.1.patch [2]
  assess-parallel-safety-v4.patch [1]
  parallel-heap-scan.patch [3]
  parallel_seqscan_v11.patch (Attached with this mail)
 
  The reason for not using the latest commit in HEAD is that latest
  version of assess-parallel-safety patch was not getting applied,
  so I generated the patch at commit-id where I could apply that
  patch successfully.
 
   [1] -
 http://www.postgresql.org/message-id/ca+tgmobjsuefipok6+i9werugeab3ggjv7jxlx+r6s5syyd...@mail.gmail.com
   [2] -
 http://www.postgresql.org/message-id/ca+tgmozjjzynpxchl3gr7nwruzkazpmpvkatdt5shvc5cd7...@mail.gmail.com
   [3] -
 http://www.postgresql.org/message-id/ca+tgmoyjetgeaxuszrona7bdtwzptqexpjntv1gkcavmgsd...@mail.gmail.com
 

 Fixed the reported issue on the assess-parallel-safety thread and another
 bug caught while testing joins, and integrated with the latest version of
 the parallel-mode patch (parallel-mode-v9).

 Apart from that, I have moved the initialization of the dsm segment from
 the InitNode phase to ExecFunnel() (on first execution), as per a
 suggestion from Robert.  The main idea is that since it creates a large
 shared memory segment, we should do that work only when it is really
 required.


 HEAD Commit-Id: 11226e38
 parallel-mode-v9.patch [2]
 assess-parallel-safety-v4.patch [1]
 parallel-heap-scan.patch [3]
 parallel_seqscan_v12.patch (Attached with this mail)

 [1] -
 http://www.postgresql.org/message-id/ca+tgmobjsuefipok6+i9werugeab3ggjv7jxlx+r6s5syyd...@mail.gmail.com
 [2] -
 http://www.postgresql.org/message-id/ca+tgmozfsxzhs6qy4z0786d7iu_abhbvpqfwlthpsvgiecz...@mail.gmail.com
 [3] -
 http://www.postgresql.org/message-id/ca+tgmoyjetgeaxuszrona7bdtwzptqexpjntv1gkcavmgsd...@mail.gmail.com


Okay, with my pgbench_accounts partitioned into 300, I ran:

SELECT DISTINCT bid FROM pgbench_accounts;

The query never returns, and I also get this:

grep -r 'starting background worker process parallel worker for PID
12165' postgresql-2015-03-25_112522.log  | wc -l
2496

2,496 workers?  This is with parallel_seqscan_degree set to 8.  If I set it
to 2, this number goes down to 626, and with 16, goes up to 4320.

Here's the query plan:

   QUERY
PLAN
-
 HashAggregate  (cost=38856527.50..38856529.50 rows=200 width=4)
   Group Key: pgbench_accounts.bid
   -  Append  (cost=0.00..38806370.00 rows=20063001 width=4)
 -  Seq Scan on pgbench_accounts  (cost=0.00..0.00 rows=1 width=4)
 -  Funnel on pgbench_accounts_1  (cost=0.00..192333.33
rows=10 width=4)
   Number of Workers: 8
   -  Partial Seq Scan on pgbench_accounts_1
(cost=0.00..1641000.00 rows=10 width=4)
 -  Funnel on pgbench_accounts_2  (cost=0.00..192333.33
rows=10 width=4)
   Number of Workers: 8
   -  Partial Seq Scan on pgbench_accounts_2
(cost=0.00..1641000.00 rows=10 width=4)
 -  Funnel on pgbench_accounts_3  (cost=0.00..192333.33
rows=10 width=4)
   Number of Workers: 8
...
   -  Partial Seq Scan on pgbench_accounts_498
(cost=0.00..10002.10 rows=210 width=4)
 -  Funnel on pgbench_accounts_499  (cost=0.00..1132.34 rows=210
width=4)
   Number of Workers: 8
   -  Partial Seq Scan on pgbench_accounts_499
(cost=0.00..10002.10 rows=210 width=4)
 -  Funnel on pgbench_accounts_500  (cost=0.00..1132.34 rows=210
width=4)
   Number of Workers: 8
   -  Partial Seq Scan on pgbench_accounts_500
(cost=0.00..10002.10 rows=210 width=4)

Still not sure why 8 workers are needed for each partial scan.  I would
expect 8 workers to be used for 8 separate scans.  Perhaps this is just my
misunderstanding of how this feature works.

-- 
Thom


Re: [HACKERS] Auditing extension for PostgreSQL (Take 2)

2015-03-25 Thread Sawada Masahiko
On Wed, Mar 25, 2015 at 12:23 PM, David Steele da...@pgmasters.net wrote:
 On Wed, Mar 25, 2015 at 12:38 AM, David Steele da...@pgmasters.net wrote:
 2. OBJECT auditing does not work before acl info is added to
 pg_class.relacl.
 In the following situation, pg_audit cannot audit the OBJECT log.
 $ cat postgresql.conf | grep audit
 shared_preload_libraries = 'pg_audit'
 pg_audit.role = 'hoge_user'
 pg_audit.log = 'read, write'
 $ psql -d postgres -U hoge_user
 =# create table hoge(col int);
 =# select * from hoge;
 LOG:  AUDIT: SESSION,3,1,READ,SELECT,,,select * from hoge;

 The OBJECT audit log is not written here since pg_class.relacl is still
 empty. (Only the SESSION log is written.)
 So after creating another unrelated role and granting it any privilege,
 the OBJECT audit log is written successfully.

 Yes, object auditing does not work until some grants have been made to
 the audit role.

 =# create role bar_user;
 =# grant select on hoge to bar_user;
 =# select * from hoge;
 LOG:  AUDIT: SESSION,4,1,READ,SELECT,,,select * from hoge;
 LOG:  AUDIT: OBJECT,4,1,READ,SELECT,TABLE,public.hoge,select * from hoge;

 The both OBJCET and SESSION log are logged.

 Looks right to me.  If you don't want the session logging then disable
 pg_audit.log.

 Session and object logging are completely independent from each other:
 one or the other, or both, or neither can be enabled at any time.

 It means that the OBJECT log is not written just after creating a table,
 even if that table is touched by its owner.
 To write the OBJECT log, we need to grant a privilege to a role at least,
 right?

 Exactly.  Privileges must be granted to the audit role in order for
 object auditing to work.


The table owner can always touch its own table.
So does that mean the table owner can read the table's contents while
evading OBJECT logging?

Also I looked into latest patch again.
Here are two review comment.

1.
 typedef struct
 {
     int64 statementId;
     int64 substatementId;
Both statementId and substatementId could be negative numbers.
I think these should be uint64 instead.

2.
I got ERROR when executing function uses cursor.

1) create empty table (hoge table)
2) create test function as follows.

create function test() returns int as $$
declare
    cur1 cursor for select * from hoge;
    tmp int;
begin
    open cur1;
    fetch cur1 into tmp;
    return tmp;
end$$
language plpgsql;

3) execute test function (got ERROR)
=# select test();
LOG:  AUDIT: SESSION,6,1,READ,SELECT,,,selecT test();
LOG:  AUDIT: SESSION,6,2,FUNCTION,EXECUTE,FUNCTION,public.test,selecT test();
LOG:  AUDIT: SESSION,6,3,READ,SELECT,,,select * from hoge
CONTEXT:  PL/pgSQL function test() line 6 at OPEN
ERROR:  pg_audit stack is already empty
STATEMENT:  selecT test();

It seems that the item on the stack has already been freed by deletion of
the pg_audit memory context (in MemoryContextDelete()) before stack_pop is
called when the top-level Portal is dropped.

Regards,

---
Sawada Masahiko




Re: [HACKERS] proposal: plpgsql - Assert statement

2015-03-25 Thread Pavel Stehule
2015-03-26 0:08 GMT+01:00 Tom Lane t...@sss.pgh.pa.us:

 Jim Nasby jim.na...@bluetreble.com writes:
  On 3/25/15 1:21 AM, Pavel Stehule wrote:
  2015-03-25 0:17 GMT+01:00 Tom Lane t...@sss.pgh.pa.us
  mailto:t...@sss.pgh.pa.us:
  (BTW, is considering
  NULL to be a failure the right thing?  SQL CHECK conditions consider
  NULL to be allowed ...)

  This is a question - I am happy with SQL CHECK for data, but I am not
  sure whether the same behavior is safe for a plpgsql (procedural)
  assert. Stricter behavior is safer - and some bugs in procedures are
  caused by unhandled NULLs in variables. So on this topic I prefer the
  implemented behavior. It is something like:

  +1. I think POLA here is that an assert must be true and only true to be
  valid. If someone was unhappy with that they could always coalesce(...,
  true).

 Fair enough.  Committed with the other changes.
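
A sketch of the committed semantics (assuming the 9.5 PL/pgSQL ASSERT
syntax): a NULL condition fails the assertion, unlike an SQL CHECK
constraint, and coalesce() restores CHECK-like leniency:

```sql
DO $$
DECLARE
    v int;  -- NULL, never assigned
BEGIN
    -- This would raise an assertion failure: NULL is treated as a
    -- failure, unlike an SQL CHECK constraint, which lets NULL pass.
    -- ASSERT v > 0;

    -- Opting back into CHECK-like behavior:
    ASSERT coalesce(v > 0, true), 'v must be positive when set';
END
$$;
```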


Thank you very much

regards

Pavel



 regards, tom lane



[HACKERS] compiler warnings in lwlock

2015-03-25 Thread Jeff Janes
When building with LOCK_DEBUG but without casserts, I was getting
unused-variable warnings.

I believe this is the correct way to silence them.

Cheers,

Jeff


silence_lwlock_lock_debug.patch
Description: Binary data



Re: [HACKERS] Abbreviated keys for Numeric

2015-03-25 Thread Andrew Gierth
 Peter == Peter Geoghegan p...@heroku.com writes:

 Peter You still pointlessly check memtupcount here:

 Peter + if (memtupcount < 1 || nss->input_count < 1 ||
!nss->estimating)
 Peter + return false;

It's in a register; the test is free.

 Peter This cast to void is unnecessary:

 Peter + (void) ssup;

It's an explicit statement that the parameter is otherwise unused.
Maybe that compiler warning isn't usually on by default, but I
personally regard it as good style to be explicit about it.

 Peter Please try and at least consider my feedback. I don't expect you
 Peter to do exactly what I ask, but I also don't expect you to
 Peter blithely ignore it.

You should really stop digging this hole deeper.

  The INT64_MIN/MAX changes should be committed fairly soon. (I
  haven't posted a patch for TRACE_SORT)

 Peter I wouldn't assume that.

Oh ye of little faith. I would not have said that had I not already been
informed of it by a committer, and indeed it is now committed.

-- 
Andrew (irc:RhodiumToad)




Re: [HACKERS] Remove fsync ON/OFF as a visible option?

2015-03-25 Thread Jim Nasby

On 3/22/15 4:50 PM, Greg Stark wrote:

On Sun, Mar 22, 2015 at 3:34 PM, Euler Taveira eu...@timbira.com.br wrote:

On 21-03-2015 17:53, Josh Berkus wrote:

Now, I have *long* been an advocate that we should ship a stripped
PostgreSQL.conf which has only the most commonly used settings, and
leave the rest of the settings in the docs and
share/postgresql/postgresql.conf.advanced.  Here's my example of such a
file, tailored to PostgreSQL 9.3:


+1. I agree that a postgresql.conf file with commonly used settings is
useful for newbies. How do we ship it?



Fwiw I disagree. I'm a fan of the idea that the default should be an
empty config file. You should only have to put things in
postgresql.conf if you want something unusual or specific. We're a
long way from there but I would rather move towards there than keep
operating under the assumption that nobody will run Postgres without
first completing the rite of passage of reviewing every option in
postgresql.conf to see if it's relevant to their setup.

Apache used to ship with a config full of commented out options and
going through and figuring out which options needed to be enabled or
changed was the first step in installing Apache. It was awful. When
they adopted a strict policy that the default config was empty so the
only things you need in your config are options to specify what you
want to serve it became so much easier. I would argue it also
represents a more professional attitude where the job of the admin is
to declare only what he wants to happen and how it should differ from
the norm and the job of the software is to go about its business
without needing to be set up for normal uses.


+1. Going the route of big default config files just sucks. We should 
either just have an empty .conf, or only write stuff that initdb has 
tweaked in it.

--
Jim Nasby, Data Architect, Blue Treble Consulting
Data in Trouble? Get it in Treble! http://BlueTreble.com




Re: [HACKERS] Parallel Seq Scan

2015-03-25 Thread Rajeev rastogi

On 25 March 2015 16:00, Amit Kapila Wrote:

  Which version of patch you are looking at?
 I am seeing below code in ExecInitFunnel() in Version-11 to which
 you have replied.

  + /* Funnel node doesn't have innerPlan node. */
 + Assert(innerPlan(node) == NULL);

I was seeing the version-10.
I just checked version-11 and version-12 and found to be already fixed.
I should have checked the latest version before sending the report…☺

Thanks and Regards,
Kumar Rajeev Rastogi

From: Amit Kapila [mailto:amit.kapil...@gmail.com]
Sent: 25 March 2015 16:00
To: Rajeev rastogi
Cc: Amit Langote; Robert Haas; Andres Freund; Kouhei Kaigai; Amit Langote; 
Fabrízio Mello; Thom Brown; Stephen Frost; pgsql-hackers
Subject: Re: [HACKERS] Parallel Seq Scan

On Wed, Mar 25, 2015 at 3:47 PM, Rajeev rastogi 
rajeev.rast...@huawei.com wrote:

 On 20 March 2015 17:37, Amit Kapila Wrote:


  So the patches have to be applied in below sequence:

  HEAD Commit-id : 8d1f2390
  parallel-mode-v8.1.patch [2]
  assess-parallel-safety-v4.patch [1]
  parallel-heap-scan.patch [3]
  parallel_seqscan_v11.patch (Attached with this mail)


 While I was going through this patch, I observed one invalid ASSERT in the 
 function “ExecInitFunnel” i.e.

 Assert(outerPlan(node) == NULL);

 Outer node of Funnel node is always non-NULL and currently it will be 
 PartialSeqScan Node.

Which version of patch you are looking at?

I am seeing below code in ExecInitFunnel() in Version-11 to which
you have replied.

+ /* Funnel node doesn't have innerPlan node. */
+ Assert(innerPlan(node) == NULL);


With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com


Re: Custom/Foreign-Join-APIs (Re: [HACKERS] [v9.5] Custom Plan API)

2015-03-25 Thread Kouhei Kaigai
 On 2015/03/25 19:09, Kouhei Kaigai kai...@ak.jp.nec.com wrote:
 
  On Wed, Mar 25, 2015 at 3:14 PM, Shigeru HANADA shigeru.han...@gmail.com
 wrote:
 Or bottom of make_join_rel().  IMO build_join_rel() is responsible for
  just building (or searching from a list) a RelOptInfo for given relids.  
  After
  that make_join_rel() calls add_paths_to_joinrel() with appropriate 
  arguments
 per
  join type to generate actual Paths implements the join.  make_join_rel() is
  called only once for a particular relid combination, and there SpecialJoinInfo
  and restrictlist (conditions specified in JOIN-ON and WHERE) are available, so
  it seems promising for FDW cases.
 
 
 
  I like that idea, but I think we will have complex hook signature, it won't
 remain
  as simple as hook (root, joinrel).
 
  In this case, GetForeignJoinPaths() will take root, joinrel, rel1, rel2,
  sjinfo and restrictlist.
  It is not too simple, but not complicated signature.
 
  Even if we reconstruct rel1 and rel2 using sjinfo, we also need to compute
  restrictlist using build_joinrel_restrictlist() again. It is a static 
  function
  in relnode.c. So, I don't think either of them has definitive advantage from
  the standpoint of simplicity.
 
 The bottom of make_join_rel() seems good from the viewpoint of information, 
 but
 it is called multiple times for join combinations which are essentially 
 identical,
 for INNER JOIN case like this:
 
 fdw=# explain select * from pgbench_branches b join pgbench_tellers t on t.bid
 = b.bid join pgbench_accounts a on a.bid = b.bid and a.bid = t.bid;
 INFO:  postgresGetForeignJoinPaths() 1x2
 INFO:  postgresGetForeignJoinPaths() 1x4
 INFO:  postgresGetForeignJoinPaths() 2x4
 INFO:  standard_join_search() old hook point
 INFO:  standard_join_search() old hook point
 INFO:  standard_join_search() old hook point
 INFO:  postgresGetForeignJoinPaths() 0x4
 INFO:  postgresGetForeignJoinPaths() 0x2
 INFO:  postgresGetForeignJoinPaths() 0x1
 INFO:  standard_join_search() old hook point
QUERY PLAN
 -
  Foreign Scan  (cost=100.00..102.11 rows=211 width=1068)
 (1 row)
 
 Here I’ve put a probe point at the beginning of the GetForeignJoinPaths handler
 and just before the set_cheapest() call in standard_join_search() as the “old
 hook point”.  In this example 1, 2, and 4 are base relations, and at join level
 3 the planner calls GetForeignJoinPaths() three times for the combinations:
 
 1) (1x2)x4
 2) (1x4)x2
 3) (2x4)x1
 
 Tom’s suggestion is aiming at providing a chance to consider join push-down at a
 more abstract level, IIUC.  So it would be good to call the handler only once in
 that case, for the flattened combination (1x2x4).

 Hum, how about skipping the handler (or hook) call if the joinrel was found by
 find_join_rel()?  At least that suppresses redundant calls for different join
 orders, and the handler can determine whether the combination can be flattened
 by checking that all RelOptInfo with RELOPT_JOINREL under the joinrel have
 JOIN_INNER as jointype.
 
The reason the FDW handler was called multiple times in your example is that
your modified make_join_rel() does not check whether build_join_rel()
actually built a new RelOptInfo or just returned a cached one, doesn't it?

If so, I'm inclined to agree with your proposition.
A new bool *found argument for build_join_rel() would reduce the number of
FDW handler calls while keeping enough information to build the remote-join
query.

Thanks,
--
NEC OSS Promotion Center / PG-Strom Project
KaiGai Kohei kai...@ak.jp.nec.com




Re: [HACKERS] Parallel Seq Scan

2015-03-25 Thread Amit Kapila
On Wed, Mar 25, 2015 at 4:08 PM, Rajeev rastogi rajeev.rast...@huawei.com
wrote:

 On 25 March 2015 16:00, Amit Kapila Wrote:

   Which version of patch you are looking at?

  I am seeing below code in ExecInitFunnel() in Version-11 to which

  you have replied.



   + /* Funnel node doesn't have innerPlan node. */
  + Assert(innerPlan(node) == NULL);



 I was seeing the version-10.

 I just checked version-11 and version-12 and found to be already fixed.

  I should have checked the latest version before sending the report…☺


No problem, Thanks for looking into the patch.



With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com


Re: [HACKERS] pg_rewind in contrib

2015-03-25 Thread Michael Paquier
On Thu, Mar 26, 2015 at 12:23 PM, Venkata Balaji N nag1...@gmail.com wrote:
 Test 1 :

 [...]

 If the master is crashed or killed abruptly, it may not be possible to do a
 rewind. Is my understanding correct?

Yep. This is mentioned in the documentation:
http://www.postgresql.org/docs/devel/static/app-pgrewind.html
The target server must shut down cleanly before running pg_rewind.

 Test 2 :

 - On a successfully running streaming replication with one master and one
 slave, i did a clean shutdown of master
 - promoted slave
 - performed some operations (data changes) on newly promoted slave and did a
 clean shutdown
 - Executed pg_rewind on the old master to sync with the latest changes on
 new master. I got the below message

 The servers diverged at WAL position 0/A298 on timeline 1.
 No rewind required.

 I am not getting this either.

In this case the master WAL visibly did not diverge from the slave WAL
line. A rewind is done if the master touches new relation pages after
the standby has been promoted, and before the master is shutdown.
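
For context, a rewind for this scenario would be invoked along these lines
(the data directory path and connection string are placeholders):

```
# Run on the old master's host; its server must have been shut down cleanly.
# The promoted slave (new master) is the source to copy diverged blocks from.
pg_rewind --target-pgdata=/path/to/old_master_data \
          --source-server='host=new_master port=5432 dbname=postgres'
```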
-- 
Michael




Re: [HACKERS] Parallel Seq Scan

2015-03-25 Thread Amit Kapila
On Wed, Mar 25, 2015 at 9:53 PM, Thom Brown t...@linux.com wrote:

 On 25 March 2015 at 15:49, Amit Kapila amit.kapil...@gmail.com wrote:

 On Wed, Mar 25, 2015 at 5:16 PM, Thom Brown t...@linux.com wrote:
  Okay, with my pgbench_accounts partitioned into 300, I ran:
 
  SELECT DISTINCT bid FROM pgbench_accounts;
 
  The query never returns,

 You seem to be hitting the issue I have pointed in near-by thread [1]
 and I have mentioned the same while replying on assess-parallel-safety
 thread.   Can you check after applying the patch in mail [1]


 Ah, okay, here's the patches I've now applied:

 parallel-mode-v9.patch
 assess-parallel-safety-v4.patch
 parallel-heap-scan.patch
 parallel_seqscan_v12.patch
 release_lock_dsm_v1.patch

 (with perl patch for pg_proc.h)

 The query now returns successfully.


Thanks for verification.

 ..
 
  Still not sure why 8 workers are needed for each partial scan.  I
would expect 8 workers to be used for 8 separate scans.  Perhaps this is
just my misunderstanding of how this feature works.
 

 The reason is that for each table scan, it tries to use workers
 equal to parallel_seqscan_degree if they are available and in this
 case as the scan for inheritance hierarchy (tables in hierarchy) happens
 one after another, it uses 8 workers for each scan.  I think as of now
 the strategy to decide number of workers to be used in scan is kept
 simple and in future we can try to come with some better mechanism
 to decide number of workers.


 Yes,  I was expecting the parallel aspect to apply across partitions (a
worker per partition up to parallel_seqscan_degree and reallocate to
another scan once finished with current job), not individual ones,


Here what you are describing is something like parallel partition
scan which is somewhat related but different feature.  This
feature will parallelize the scan for an individual table.

 so for the workers to be above the funnel, not below it.  So this is
parallelising, just not in a way that will be a win in this case. :(  For
the query I
 posted (SELECT DISTINCT bid FROM pgbench_partitions), the parallelised
version takes 8 times longer to complete.


I think the primary reason it is not performing as expected is that we have
either not set the right values for the new cost parameters, or changed
existing cost parameters (seq_page_cost), which makes the planner select a
parallel plan even though it is costlier.

This is similar to the behaviour when a user has intentionally disabled
index scans to test sequential scans and then reports that they perform
slower.

I think if you want to help in this direction, then what will be more useful
is to work out appropriate values of the cost parameters for parallel scan.
We have introduced 3 parameters (cpu_tuple_comm_cost, parallel_setup_cost,
parallel_startup_cost) for costing parallel plans, so with your tests, if we
can decide appropriate values for each of these parameters such that the
planner chooses a parallel plan only when it is better than the non-parallel
plan, that will be really valuable input.
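
As a sketch of that kind of experiment — note that all three GUCs below come
from the patch under discussion, not from core, so the names are hypothetical
until committed:

```sql
-- Patch-specific settings; adjust to whatever the final patch exposes.
SET parallel_seqscan_degree = 8;
SET cpu_tuple_comm_cost = 0.1;
SET parallel_setup_cost = 1000.0;

-- Compare planner choice and runtime against the serial plan:
EXPLAIN ANALYZE SELECT DISTINCT bid FROM pgbench_accounts;
SET parallel_seqscan_degree = 0;  -- force the non-parallel plan
EXPLAIN ANALYZE SELECT DISTINCT bid FROM pgbench_accounts;
```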


With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com


Re: Custom/Foreign-Join-APIs (Re: [HACKERS] [v9.5] Custom Plan API)

2015-03-25 Thread Shigeru HANADA

On 2015/03/26 10:51, Kouhei Kaigai kai...@ak.jp.nec.com wrote:
 The attached patch adds GetForeignJoinPaths call on make_join_rel() only when
 'joinrel' is actually built and both of child relations are managed by same
 FDW driver, prior to any other built-in join paths.
 I adjusted the hook definition a little bit, because jointype can be 
 reproduced
 using SpecialJoinInfo. Right?

OK.

 
 Probably, it will solve the original concern towards multiple calls of FDW
 handler in case when it tries to replace an entire join subtree with a 
 foreign-
 scan on the result of remote join query.
 
 How about your opinion?

Seems fine.  I’ve fixed my postgres_fdw code to fit the new version, and am 
working on handling a whole-join-tree.

It would be difficult in the 9.5 cycle, but a hook point where we can handle a 
whole joinrel might allow us to optimize a query which accesses multiple parent 
tables, each of which is inherited by foreign tables and partitioned on an 
identical join key, by building a path tree that joins the sharded tables first 
and then unions those results.

--
Shigeru HANADA
shigeru.han...@gmail.com








Re: [HACKERS] pg_rewind in contrib

2015-03-25 Thread Venkata Balaji N
 I have committed this, with some more kibitzing.  I hope I have not missed
 any comments given so far. Many thanks for the review, and please continue
 reviewing and testing it :-).


I have been testing the pg_rewind and have an analysis to share along with
few questions -

I had a streaming replication setup with one master and one slave running
successfully.

Test 1 :

- Killed postgres process on master and promoted slave. Both were in sync
earlier.
- performed some operations (data changes) on newly promoted slave node and
shutdown
- Executed pg_rewind on old master and got the below message


*target server must be shut down cleanly*
*Failure, exiting*

If the master is crashed or killed abruptly, it may not be possible to do a
rewind. Is my understanding correct?

Test 2 :

- On a successfully running streaming replication with one master and one
slave, i did a clean shutdown of master
- promoted slave
- performed some operations (data changes) on newly promoted slave and did
a clean shutdown
- Executed pg_rewind on the old master to sync with the latest changes on
new master. I got the below message


*The servers diverged at WAL position 0/A298 on timeline 1.*
*No rewind required.*

I am not getting this either.

Regards,
Venkata Balaji N