Re: [HACKERS] logical column ordering

2015-03-13 Thread Jeff Janes
On Mon, Feb 23, 2015 at 3:09 PM, Tomas Vondra tomas.von...@2ndquadrant.com
wrote:

 Hi,

 attached is the result of my first attempt to make the logical column
 ordering patch work. This touches a lot of code in the executor that is
 mostly new to me, so if you see something that looks like an obvious
 bug, it probably is (so let me know).


There is an apply conflict in src/backend/parser/parse_relation.c in the
comments for scanRTEForColumn.

It seems like it should be easy to ignore, but when I ignore it I get make
check failing all over the place.

(The first patch posted in this thread seems to work for me, I did some
testing on it before I realized it was obsolete.)

Cheers,

Jeff


Re: [HACKERS] get_object_address support for additional object types

2015-03-13 Thread Stephen Frost
Alvaro,

* Alvaro Herrera (alvhe...@2ndquadrant.com) wrote:
 Stephen Frost wrote:
  I thought the rest of it looked alright.  I agree it's a bit odd how the
  opfamily is handled but I agree with your assessment that there's not
  much better we can do with this object representation.
 
 Actually, on second thought I revisited this and changed the
 representation for opfamilies and opclasses: instead of putting the AM
 name in objargs, we can put it as the first element of objname instead.
 That way, objargs is unused for opfamilies and opclasses, and we're free
 to use it for the type arguments in amops and amprocs.  This makes the
 lists consistent for the four cases: in objname, amname first, then
 qualified opclass/opfamily name.  For amop/amproc, the member number
 follows.  Objargs is unused in opclass/opfamily, and it's a two-element
 list of types in amop/amproc.

Agreed, that makes more sense to me also.

 The attached patch changes the grammar to comply with the above, and
 adds the necessary get_object_address and getObjectIdentityParts support
 code for amop/amproc objects.

I took a quick look through and it looked fine to me.

 The only thing a bit unusual is that in does_not_exist_skipping() we
 need to do a list_copy_tail() to strip out the amname before reporting
 the skipping message, for DROP OPERATOR CLASS/FAMILY IF EXISTS.
 I don't think this is a problem.  In return, the code to deconstruct
 amop and amproc addresses is more sensible.

Works for me.

Thanks!

Stephen




Re: [HACKERS] New CF app deployment

2015-03-13 Thread Jeff Janes
On Sun, Feb 22, 2015 at 12:13 PM, Magnus Hagander mag...@hagander.net
wrote:

 On Sun, Feb 15, 2015 at 9:46 PM, Peter Geoghegan p...@heroku.com wrote:

 On Sun, Feb 15, 2015 at 4:59 PM, Peter Eisentraut pete...@gmx.net
 wrote:
  I think the old system where the patch submitter declared "this message
  contains my patch" is the only one that will work.

 I tend to agree. That being said, calling out latest attachments is
 also useful (or highlighting that a particular mail has a particular
 file attached in general).


 We can keep listing the attachment and just remove the automated check of
 what *kind* of attachment it is.


But when an email has multiple attachments, there should be some kind of
indication that this is the case.

In https://commitfest.postgresql.org/4/21/

The line:

Attachment (attlognum-test.sql
http://www.postgresql.org/message-id/attachment/36971/attlognum-test.sql)
at 2015-02-23 23:09:06
http://www.postgresql.org/message-id/54ebb312.7090...@2ndquadrant.com/ from
Tomas Vondra tomas.vondra at 2ndquadrant.com (Patch: No)

is misleading.

Cheers,

Jeff


[HACKERS] Paper from IBM: Memory-Efficient Hash Joins

2015-03-13 Thread Arthur Silva
I've come across this paper today, I thought I'd share it here too.

http://www.vldb.org/pvldb/vol8/p353-barber.pdf


--
Arthur Silva


Re: [HACKERS] pg_dump quietly ignore missing tables - is it bug?

2015-03-13 Thread David G. Johnston
On Fri, Mar 13, 2015 at 10:01 AM, Pavel Stehule pavel.steh...@gmail.com
wrote:



 2015-03-13 17:39 GMT+01:00 Robert Haas robertmh...@gmail.com:

 On Fri, Mar 13, 2015 at 11:26 AM, Pavel Stehule pavel.steh...@gmail.com
 wrote:
  we found a possible bug in pg_dump. It raises an error only when all
 specified
  tables don't exist. When it finds any table, it ignores the other
 missing ones.
 
  /usr/local/pgsql/bin/pg_dump -t Foo -t omega -s postgres > /dev/null;
 echo
  $?
 
  foo doesn't exist - it creates a broken backup due to the missing Foo table
 
   [pavel@localhost include]$ /usr/local/pgsql/bin/pg_dump -t Foo -t
 omegaa -s
  postgres > /dev/null; echo $?
  pg_dump: No matching tables were found
  1
 
  Is it OK? I think it is potentially dangerous. Any explicitly
  specified table should exist.

 Keep in mind that the argument to -t is a pattern, not just a table
 name.  I'm not sure how much that affects the calculus here, but it's
 something to think about.


 yes, it makes sense, although now I don't think it was a good idea.
 There should be some difference between a table name and a table pattern.


There is... a single table name is simply expressed as a pattern without
any wildcards.  The issue here is that pg_dump doesn't require that every
instance of -t find one (or more, if a wildcard is present) entry, only
that at least one entry is found among all of the patterns specified by -t.

I'll voice my agreement that each of the -t specifications should find at
least one table in order for the dump as a whole to succeed; though
depending on presented use cases for the current behavior I could see
allowing the command writer to specify a more lenient interpretation by
specifying something like --allow-missing-tables.

Command line switch formats don't really allow you to write -t? to mean
"I want these table(s) if present", do they?  I guess the input itself
could be interpreted that way though; a leading ? is not a valid wildcard
and double-quotes would be required for it to be a valid table name.

David J.


Re: Custom/Foreign-Join-APIs (Re: [HACKERS] [v9.5] Custom Plan API)

2015-03-13 Thread Tom Lane
Robert Haas robertmh...@gmail.com writes:
 Another bit of this that I think we could commit without fretting
 about it too much is the code adding set_join_pathlist_hook.  This is
 - I think - analogous to set_rel_pathlist_hook, and like that hook,
 could be used for other purposes than custom plan generation - e.g. to
 delete paths we do not want to use.  I've extracted this portion of
 the patch and adjusted the comments; if there are no objections, I
 will commit this bit also.

I don't object to the concept, but I think that is a pretty bad place
to put the hook call: add_paths_to_joinrel is typically called multiple
(perhaps *many*) times per joinrel and thus this placement would force
any user of the hook to do a lot of repetitive work.

I think the right placement is just before the set_cheapest call for
each joinrel, just as we did with set_rel_pathlist_hook.  It looks
like those calls are at:

allpaths.c:1649 (in standard_join_search)
geqo_eval.c:270 (in merge_clump)

There are a couple of other set_cheapest calls that probably don't need
hooked, since they are for dummy (proven empty) rels, and it's not clear
how a hook could improve on an empty plan.

Also, this would leave you with a much shorter parameter list ;-) ...
really no reason to pass more than root and rel.
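
To make that shape concrete, here is a minimal sketch of the kind of hook user
being discussed.  The hook name, the two-argument signature (just root and the
joinrel, per the suggestion above), and the policy of dropping hash-join paths
are all assumptions for illustration, not a committed API:

/*
 * Sketch only: hook name, signature and policy are assumptions, not the
 * committed API.  Modeled on how set_rel_pathlist_hook is used.
 */
#include "postgres.h"
#include "fmgr.h"
#include "nodes/pg_list.h"
#include "optimizer/paths.h"

PG_MODULE_MAGIC;

/* assumed hook type: called once per joinrel, just before set_cheapest() */
typedef void (*join_pathlist_hook_type) (PlannerInfo *root, RelOptInfo *joinrel);
extern join_pathlist_hook_type set_join_pathlist_hook;

static void
my_join_pathlist_hook(PlannerInfo *root, RelOptInfo *joinrel)
{
	List	   *keep = NIL;
	ListCell   *lc;

	/* illustrative policy: drop hash-join paths for every joinrel */
	foreach(lc, joinrel->pathlist)
	{
		Path	   *path = (Path *) lfirst(lc);

		if (!IsA(path, HashPath))
			keep = lappend(keep, path);
	}

	/* keep the original list if pruning would leave no paths at all */
	if (keep != NIL)
		joinrel->pathlist = keep;
}

void
_PG_init(void)
{
	set_join_pathlist_hook = my_join_pathlist_hook;
}

With that placement, the hook body runs once per joinrel, after all of the
add_paths_to_joinrel() work for that joinrel is visible.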

regards, tom lane




Re: [HACKERS] Using 128-bit integers for sum, avg and statistics aggregates

2015-03-13 Thread Peter Geoghegan
On Thu, Mar 12, 2015 at 6:23 PM, Andreas Karlsson andr...@proxel.se wrote:
 Fixed.

Did you intend to attach a patch here?

 I think you should talk about the new thing first (just after the
 extant, first sentence "Integer data types use Numeric...").  Refer to
 where 128-bit integers are used and how, and only then the other stuff
 (exceptions). After that, put the PolyNumAggState struct definition
 and basic functions. Then int2_accum() and so on.


 Good idea! Do you think the rewritten comment is clear enough, or do I need
 to go into more detail?

 /*
  * Integer data types use Numeric accumulators to share code and avoid risk
  * of overflow.  To speed up aggregation 128-bit integer accumulators are
  * used instead where sum(X) or sum(X*X) fit into 128-bits, and there is
  * platform support.
  *
  * For int2 and int4 inputs sum(X) will fit into a 64-bit accumulator, hence
  * we use faster special-purpose accumulator routines for SUM and AVG of
  * these datatypes.
  */

 #ifdef HAVE_INT128
 typedef struct Int128AggState

Not quite. Refer to the 128-bit integer accumulators as
"special-purpose accumulator routines" instead. Then, in the case of
the extant 64-bit accumulators, refer to them by the shorthand
"integer accumulators". Otherwise it's the wrong way around.
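
For reference, a sketch of how the conditional accumulator the comment
describes could be declared.  The names (Int128AggState, PolyNumAggState,
HAVE_INT128) are taken from the excerpt above, and NumericAggState plus the
int128 typedef are assumed to exist as in numeric.c; the actual patch may
differ in detail:

#ifdef HAVE_INT128
typedef struct Int128AggState
{
	bool		calcSumX2;		/* if true, calculate sumX2 */
	int64		N;				/* count of processed numbers */
	int128		sumX;			/* sum of processed numbers */
	int128		sumX2;			/* sum of squares of processed numbers */
} Int128AggState;

/* transition state used when 128-bit arithmetic is available */
typedef Int128AggState PolyNumAggState;
#else
/* fall back to the existing Numeric-based state otherwise */
typedef NumericAggState PolyNumAggState;
#endif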


-- 
Peter Geoghegan




Re: [HACKERS] Regarding pg_stat_statements

2015-03-13 Thread Josh Berkus
On 03/13/2015 02:29 AM, Sreerama Manoj wrote:
   As we know, pg_stat_statements monitors queries
  after normalizing them (removing the values present in the query). I
  want to know whether there is a way to store those normalized values, because
  I want to check the type of data (values) and the range of data that is
  hitting the database. I am using Postgres 9.4.

BTW, this really isn't appropriate for the pgsql-hackers list; please
ask your next question like this on pgsql-general or pgsql-performance.

However, pg_qualstats is what you want:
http://powa.readthedocs.org/en/latest/stats_extensions/pg_qualstats.html

-- 
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com




Re: [HACKERS] Patch to add functionality to specify ORDER BY in CREATE FUNCTION for SRFs

2015-03-13 Thread Jeff Janes
On Mon, Jan 5, 2015 at 7:12 AM, Atri Sharma atri.j...@gmail.com wrote:


 Hi All,

 Please forgive if this is a repost.

 Please find attached patch for supporting ORDER BY clause in CREATE
 FUNCTION for SRFs.


Hi Atri,

From the discussion, I don't know if this patch is still being proposed.
If so, it needs a rebase.

Thanks,

Jeff


Re: [HACKERS] Performance improvement for joins where outer side is unique

2015-03-13 Thread David Rowley
On 13 March 2015 at 20:34, Kyotaro HORIGUCHI 
horiguchi.kyot...@lab.ntt.co.jp wrote:

 Hello.  The performance drag was not so large after all.


Great, thanks for retesting this.


 For all that, I agree with the opinion that this kind of separate
 multiple-nested loops over relations, joins or ECs and so on for
 searching something should be avoided. I personally feel that
 additional time to such an extent (around 1%) would be tolerable
 if it affected a wide range of queries or it brought more obvious
 gain.


Do you have any ideas on an implementation of how we can avoid checking all
eclasses for each combination of joins?

The only thing I can think of would be to add a List of eclasses on
RelOptInfo for all eclasses that the RelOptInfo has. That would save the
looping over every single eclass in eclassjoin_is_unique_join() and save
from having to check if the relation is mentioned in the eclass before
continuing. I could certainly make this change, but I'd be worried that it
would just add bloat to the patch and cause a committer to question it.

I could also almost completely remove the extra plan time of your test case
by either adding a proper pre-check to ensure the relation has unique
indexes, rather than just indexes, or by adding a new list to RelOptInfo
which stores unique indexes; then I could just check whether that's NIL before
going any further in eclassjoin_is_unique_join(). But I come from a world
where every relation has a primary key, so I'd imagine that would not
be hit in enough real cases. I'd imagine that pre-check is only there
because it's so cheap in the first place.

For testing, I added some code to mark_unique_joins() to spit out a NOTICE:

if (eclassjoin_is_unique_join(root, joinlist, rtr))
{
	root->simple_rel_array[rtr->rtindex]->is_unique_join = true;
	elog(NOTICE, "Unique Join: Yes");
}
else
	elog(NOTICE, "Unique Join: No");

and the same below for special joins too.

On running the regression tests I see:

Unique Join: Yes  1557 times
Unique Join: No 11563 times

I would imagine the regression tests are not the best thing to base this
on, as they tend to exercise more unusual cases.
I have an OLTP type application here which I will give a bit of exercise
and see how that compares.



 Unfortunately I can't decide this to be 'ready for committer' for
 now. I think we should have this with a smaller footprint, in a
 method without separate exhaustive searching. I also had a very
 similar problem in the past but I haven't found such a way for my
 problem..


I don't think it's ready yet either. I've just been going over a few things,
and looking at Tom's recent commit b557226 in pathnode.c I've got a feeling
that this patch would need to re-factor some code that's been modified
around the usage of relation_has_unique_index_for(): by the time this code is
called, the semi joins have already been analysed to see if they're unique,
so it could be just a case of ripping all of that out
of create_unique_path() and just putting a check on rel->is_unique_join
in there. But if I do that then I'm not quite sure if
SpecialJoinInfo->semi_rhs_exprs and SpecialJoinInfo->semi_operators would
still be needed at all. These were only added in b557226. Changing this
would help reduce the extra planning time when the query contains
semi-joins. To be quite honest, this type of analysis belongs in
analyzejoins.c anyway. I'm tempted to hack at this part some more, but I'd
rather Tom had a quick glance at what I'm trying to do here first.



 At Wed, 11 Mar 2015 01:32:24 +1300, David Rowley dgrowle...@gmail.com
 wrote in CAApHDvpEXAjs6mV2ro4=3qbzpx=
 plrteinx0j2yhq6wrp85...@mail.gmail.com
   explain select t1.a, t10.b from t1 join t2 on (t1.b = t2.a) join t3 on
   (t2.b = t3.a) join t4 on (t3.b = t4.a) join t5 on (t4.b = t5.a) join
 t6 on
   (t5.b = t6.a) join t7 on (t6.b = t7.a) join t8 on (t7.b = t8.a) join
 t9 on
   (t8.b = t9.a) join t10 on (t9.b = t10.a);
  
   The head takes 3ms for planning and the patched version takes
   around 5ms while pure execution time is 1ms.. I think it is a too
   long extra time.
  
  
  This smells quite fishy to me. I'd be quite surprised if your machine
 took
  an extra 2 ms to do this.

 You're right. Sorry. I was amazed by the numbers..

 I took the times again for both master and patched on master
 (some conflict arose). Configured with no options, so compiled
 with -O2 and no assertions. Measured the planning time for the
 test query 10 times and calculated the average.

 patched: 1.883ms (stddev 0.034)
 master:  1.861ms (stddev 0.042)

 About 0.02ms, 1% extra time looks to be taken by the extra
 processing.


Is this still using the same test query as I attached in my previous email?

This is still 25 times more of a slowdown than what I witnessed by calling
mark_unique_joins() 1 million times in a tight loop, but of course much
closer to what I would have thought. You're getting 20,000 nanoseconds and
I'm getting 800 nanoseconds, but our overall planning times are very

Re: [HACKERS] Using 128-bit integers for sum, avg and statistics aggregates

2015-03-13 Thread Andreas Karlsson

On 03/13/2015 10:22 PM, Peter Geoghegan wrote:

On Thu, Mar 12, 2015 at 6:23 PM, Andreas Karlsson andr...@proxel.se wrote:

/*
  * Integer data types use Numeric accumulators to share code and avoid risk
  * of overflow.  To speed up aggregation 128-bit integer accumulators are
  * used instead where sum(X) or sum(X*X) fit into 128-bits, and there is
  * platform support.
  *
  * For int2 and int4 inputs sum(X) will fit into a 64-bit accumulator, hence
  * we use faster special-purpose accumulator routines for SUM and AVG of
  * these datatypes.
  */

#ifdef HAVE_INT128
typedef struct Int128AggState


Not quite. Refer to the 128-bit integer accumulators as
special-purpose accumulator routines instead. Then, in the case of
the extant 64-bit accumulators, refer to them by the shorthand
integer accumulators. Otherwise it's the wrong way around.


I disagree. The term "integer accumulators" is confusing. Right now I do
not have any better ideas for how to write that comment, so I submit the
next version of the patch with the comment as I wrote it above. Feel
free to come up with a better wording; my English is not always up to
par when writing technical texts.


--
Andreas Karlsson
diff --git a/config/c-compiler.m4 b/config/c-compiler.m4
index 509f961..48fcc68 100644
--- a/config/c-compiler.m4
+++ b/config/c-compiler.m4
@@ -125,6 +125,41 @@ undefine([Ac_cachevar])dnl
 ])# PGAC_TYPE_64BIT_INT
 
 
+# PGAC_HAVE___INT128
+# --
+# Check if __int128 is a working 128 bit integer type and
+# define HAVE___INT128 if so.
+AC_DEFUN([PGAC_HAVE___INT128],
+[AC_CACHE_CHECK(for __int128, pgac_cv_have___int128,
+[AC_TRY_RUN([
+__int128 a = 2001;
+__int128 b = 4005;
+
+int does_int128_work()
+{
+  __int128 c,d;
+
+  c = a * b;
+  d = (c + b) / b;
+  if (d != a+1)
+return 0;
+  return 1;
+}
+main() {
+  exit(! does_int128_work());
+}],
+[pgac_cv_have___int128=yes],
+[pgac_cv_have___int128=no],
+[# If cross-compiling, check the size reported by the compiler and
+# trust that the arithmetic works.
+AC_COMPILE_IFELSE([AC_LANG_BOOL_COMPILE_TRY([], [sizeof(__int128) == 16])],
+  pgac_cv_have___int128=yes,
+  pgac_cv_have___int128=no)])])
+
+if test x$pgac_cv_have___int128 = xyes ; then
+  AC_DEFINE(HAVE___INT128, 1, [Define to 1 if the system has the type `__int128'.])
+fi])# PGAC_TYPE_64BIT_INT
+
 
 # PGAC_C_FUNCNAME_SUPPORT
 # ---
diff --git a/configure b/configure
index fa271fe..1cd964d 100755
--- a/configure
+++ b/configure
@@ -13773,6 +13773,74 @@ _ACEOF
 fi
 
 
+{ $as_echo "$as_me:${as_lineno-$LINENO}: checking for __int128" >&5
+$as_echo_n "checking for __int128... " >&6; }
+if ${pgac_cv_have___int128+:} false; then :
+  $as_echo_n "(cached) " >&6
+else
+  if test "$cross_compiling" = yes; then :
+  # If cross-compiling, check the size reported by the compiler and
+# trust that the arithmetic works.
+cat confdefs.h - <<_ACEOF >conftest.$ac_ext
+/* end confdefs.h.  */
+
+int
+main ()
+{
+static int test_array [1 - 2 * !(sizeof(__int128) == 16)];
+test_array [0] = 0;
+return test_array [0];
+
+  ;
+  return 0;
+}
+_ACEOF
+if ac_fn_c_try_compile $LINENO; then :
+  pgac_cv_have___int128=yes
+else
+  pgac_cv_have___int128=no
+fi
+rm -f core conftest.err conftest.$ac_objext conftest.$ac_ext
+else
+  cat confdefs.h - <<_ACEOF >conftest.$ac_ext
+/* end confdefs.h.  */
+
+__int128 a = 2001;
+__int128 b = 4005;
+
+int does_int128_work()
+{
+  __int128 c,d;
+
+  c = a * b;
+  d = (c + b) / b;
+  if (d != a+1)
+return 0;
+  return 1;
+}
+main() {
+  exit(! does_int128_work());
+}
+_ACEOF
+if ac_fn_c_try_run $LINENO; then :
+  pgac_cv_have___int128=yes
+else
+  pgac_cv_have___int128=no
+fi
+rm -f core *.core core.conftest.* gmon.out bb.out conftest$ac_exeext \
+  conftest.$ac_objext conftest.beam conftest.$ac_ext
+fi
+
+fi
+{ $as_echo "$as_me:${as_lineno-$LINENO}: result: $pgac_cv_have___int128" >&5
+$as_echo "$pgac_cv_have___int128" >&6; }
+
+if test x$pgac_cv_have___int128 = xyes ; then
+
+$as_echo "#define HAVE___INT128 1" >>confdefs.h
+
+fi
+
 # We also check for sig_atomic_t, which *should* be defined per ANSI
 # C, but is missing on some old platforms.
 ac_fn_c_check_type "$LINENO" "sig_atomic_t" "ac_cv_type_sig_atomic_t" "#include <signal.h>"
diff --git a/configure.in b/configure.in
index e6a49d1..2ef7870 100644
--- a/configure.in
+++ b/configure.in
@@ -1761,6 +1761,8 @@ AC_DEFINE_UNQUOTED(MAXIMUM_ALIGNOF, $MAX_ALIGNOF, [Define as the maximum alignme
 AC_CHECK_TYPES([int8, uint8, int64, uint64], [], [],
 [#include <stdio.h>])
 
+PGAC_HAVE___INT128
+
 # We also check for sig_atomic_t, which *should* be defined per ANSI
 # C, but is missing on some old platforms.
 AC_CHECK_TYPES(sig_atomic_t, [], [], [#include <signal.h>])
diff --git a/src/backend/utils/adt/numeric.c b/src/backend/utils/adt/numeric.c
index 715917b..1fe3d69 100644
--- a/src/backend/utils/adt/numeric.c
+++ b/src/backend/utils/adt/numeric.c
@@ -402,7 +402,10 @@ static void apply_typmod(NumericVar *var, 

Re: [HACKERS] Re: Abbreviated keys for Datum tuplesort

2015-03-13 Thread Andrew Gierth
>>>>> "Peter" == Peter Geoghegan p...@heroku.com writes:

 Peter> I attach a slightly tweaked version of Andrew's original.

You changed this:

 static int
 comparetup_datum(const SortTuple *a, const SortTuple *b, Tuplesortstate *state)
 {
-int compare;
+int32   compare;
 
 compare = ApplySortComparator(a->datum1, a->isnull1,

Since ApplySortComparator returns int, and compare is used to store
the return value of comparetup_datum which is also declared int, this
seems inappropriate even as a stylistic tweak.

Also, your changes to the block comment for SortTuple now hide the fact
that datum1 is potentially the abbreviated value in tuple as well as
single-Datum cases.  Here are the versions for comparison (mine is
first):

***
*** 136,175 
  /*
   * The objects we actually sort are SortTuple structs.  These contain
   * a pointer to the tuple proper (might be a MinimalTuple or IndexTuple),
   * which is a separate palloc chunk --- we assume it is just one chunk and
   * can be freed by a simple pfree().  SortTuples also contain the tuple's
   * first key column in Datum/nullflag format, and an index integer.
   *
   * Storing the first key column lets us save heap_getattr or index_getattr
   * calls during tuple comparisons.  We could extract and save all the key
   * columns not just the first, but this would increase code complexity and
   * overhead, and wouldn't actually save any comparison cycles in the common
   * case where the first key determines the comparison result.  Note that
   * for a pass-by-reference datatype, datum1 points into the tuple storage.
   *
-  * There is one special case: when the sort support infrastructure provides an
-  * abbreviated key representation, where the key is (typically) a pass by
-  * value proxy for a pass by reference type.  In this case, the abbreviated key
-  * is stored in datum1 in place of the actual first key column.
-  *
   * When sorting single Datums, the data value is represented directly by
!  * datum1/isnull1 for pass by value types (or null values).  If the datatype is
!  * pass-by-reference and isnull1 is false, then tuple points to a separately
!  * palloc'd data value, otherwise tuple is NULL.  The value of datum1 is then
!  * either the same pointer as tuple, or is an abbreviated key value as
!  * described above.  Accordingly, tuple is always used in preference to
!  * datum1 as the authoritative value for pass-by-reference cases.
   *
   * While building initial runs, tupindex holds the tuple's run number.  During
   * merge passes, we re-use it to hold the input tape number that each tuple in
   * the heap was read from, or to hold the index of the next tuple pre-read
   * from the same tape in the case of pre-read entries.  tupindex goes unused
   * if the sort occurs entirely in memory.
   */
  typedef struct
  {
  void   *tuple;  /* the tuple proper */
  Datum   datum1; /* value of first key column */
  boolisnull1;/* is first key column NULL? */
  int tupindex;   /* see notes above */
  } SortTuple;
--- 136,170 
  /*
   * The objects we actually sort are SortTuple structs.  These contain
   * a pointer to the tuple proper (might be a MinimalTuple or IndexTuple),
   * which is a separate palloc chunk --- we assume it is just one chunk and
   * can be freed by a simple pfree().  SortTuples also contain the tuple's
   * first key column in Datum/nullflag format, and an index integer.
   *
   * Storing the first key column lets us save heap_getattr or index_getattr
   * calls during tuple comparisons.  We could extract and save all the key
   * columns not just the first, but this would increase code complexity and
   * overhead, and wouldn't actually save any comparison cycles in the common
   * case where the first key determines the comparison result.  Note that
   * for a pass-by-reference datatype, datum1 points into the tuple storage.
   *
   * When sorting single Datums, the data value is represented directly by
!  * datum1/isnull1.  If the datatype is pass-by-reference and isnull1 is false,
!  * then datum1 points to a separately palloc'd data value that is also pointed
!  * to by the tuple pointer; otherwise tuple is NULL.  There is one special
!  * case:  when the sort support infrastructure provides an abbreviated key
!  * representation, where the key is (typically) a pass by value proxy for a
!  * pass by reference type.
   *
   * While building initial runs, tupindex holds the tuple's run number.  During
   * merge passes, we re-use it to hold the input tape number that each tuple in
   * the heap was read from, or to hold the index of the next tuple pre-read
   * from the same tape in the case of pre-read entries.  tupindex goes unused
   * if the sort occurs entirely in memory.
   */
  typedef struct
  {
  void   *tuple;  /* the tuple proper */
  Datum   datum1; /* value of first key column 

Re: [HACKERS] Parallel Seq Scan

2015-03-13 Thread Amit Kapila
On Thu, Mar 12, 2015 at 9:50 PM, Thom Brown t...@linux.com wrote:

 On 12 March 2015 at 15:29, Amit Kapila amit.kapil...@gmail.com wrote:
  Please note that parallel_setup_cost and parallel_startup_cost are
  still set to zero by default, so you need to set it to higher values
  if you don't want the parallel plans once parallel_seqscan_degree
  is set.  I have yet to comeup with default values for them, needs
  some tests.

 Thanks.  Getting a problem:


Thanks for looking into patch.

So as per this report, I am seeing 3 different problems in it.

Problem-1:
-

 # SELECT name, setting FROM pg_settings WHERE name IN
 ('parallel_seqscan_degree','max_worker_processes','seq_page_cost');
             name           | setting
  -------------------------+---------
   max_worker_processes    | 20
   parallel_seqscan_degree | 8
   seq_page_cost           | 1000
 (3 rows)

 # EXPLAIN SELECT DISTINCT bid FROM pgbench_accounts;
 ERROR:  too many dynamic shared memory segments



This happens because we have a maximum limit on the number of
dynamic shared memory segments in the system.

In function dsm_postmaster_startup(), it is defined as follows:

maxitems = PG_DYNSHMEM_FIXED_SLOTS
+ PG_DYNSHMEM_SLOTS_PER_BACKEND * MaxBackends;

In the above case, it is choosing a parallel plan for each of the
AppendRelations
(because of seq_page_cost = 1000), and that causes the test to
cross the max limit of DSM segments.

One way to fix this could be to increase the number of DSM segments
that can be created in a system/backend, but it seems to me that in
reality there might not be many such plans which would need so many
DSM segments, unless the user tinkers too much with costing; and even if
he does, he can increase max_connections to avoid such a problem.

I would like to see opinion of other people on this matter.

Problem-2:

2015-03-12 16:09:30 GMT [7880]: [36-1] user=,db=,client= LOG:  server
process (PID 7889) was terminated by signal 11: Segmentation fault
2015-03-12 16:09:30 GMT [7880]: [37-1] user=,db=,client= DETAIL:
Failed process was running: SELECT pg_catalog.quote_ident(c.relname)
FROM pg_catalog.pg_class c WHERE c.relkind IN ('r', 'S', 'v', 'm',
'f') AND substring(pg_catalog.quote_ident(c.relname),1,10)='pgbench_br'
AND pg_catalog.pg_table_is_visible(c.oid) AND c.relnamespace <>
(SELECT oid FROM pg_catalog.pg_namespace WHERE nspname = 'pg_catalog')
UNION
SELECT pg_catalog.quote_ident(n.nspname) || '.' FROM
pg_catalog.pg_namespace n WHERE
substring(pg_catalog.quote_ident(n.nspname) || '.',1,10)='pgbench_br'
AND (SELECT pg_catalog.count(*) FROM pg_catalog.pg_namespace WHERE
substring(pg_catalog.quote_ident(nspname) || '.',1,10) =
substring('pgbench_br',1,pg_catalog.length(pg_catalog.quote_ident(nspname))+1))
> 1
UNION
SELECT pg_catalog.quote_ident(n.nspname) || '.' ||
pg_catalog.quote_ident(c.relname) FROM pg_catalog.pg_class c,
pg_catalog.pg_namespace n WHERE c.relnamespace = n.oid AND c.relkind
IN ('r', 'S', 'v', 'm', 'f') AND
substring(pg_catalog.quote_ident(n.nspname) || '.' ||
pg_catalog.quote_ident(c.relname),1,10)='pgbench_br' AND substri


This seems to be unrelated to the first issue (as the statement in the log
has nothing to do with Problem-1) and could be the same issue that
Amit Langote has reported, so we can test this once with the fix for that
issue; but I think it is important that we isolate the test due to which
this problem has occurred.

Problem-3

I am seeing an Assertion failure (in ExitParallelMode()) with this test,
but that seems to be an issue due to the lack of integration with the
access-parallel-safety patch.  I will test this after integrating with the
access-parallel-safety patch.


With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com


[HACKERS] Ye olde write_history() return code problem

2015-03-13 Thread Tom Lane
David Johnston's recent gripe reminded me of the problem we've been
working around for many years, that readline and libedit don't agree
on the return code convention for write_history().  We've attempted
to work around that by supposing that if errno becomes set to nonzero
during write_history() then there was a failure, regardless of whether
the library returned a failure code or not.  Well, this was always
fragile, and I suspect that it may be broken in David's environment.

But: it turns out that "they don't agree" is obsolete information!
libedit fixed their code to agree with readline back around 2006;
they both now do 0 if ok, errno code if not.

I would imagine therefore that just about any Linux distribution
in current use is shipping a libedit that works correctly.
I poked into Apple's source code and found that OS X has this
fixed in 10.5 (Leopard) and later, so there also it's fairly
likely that nobody's still using old libedit.  (Um, well,
prairiedog.  But I could install readline on that.)

So it seems like we could change our code to rely on the readline
return convention instead of doing something that neither library
promises to work.

Thoughts?  Should we back-patch that, or not?
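
For illustration, a minimal sketch of code that relies on that convention
(0 on success, an errno-style code on failure) rather than inspecting errno
around the call; the function name and message wording are placeholders, not
psql's actual code:

/*
 * Illustrative sketch only (not psql's code): trust write_history()'s
 * return value -- 0 on success, an errno-style code on failure.
 */
#include <stdbool.h>
#include <stdio.h>
#include <string.h>
#include <readline/history.h>

static bool
save_history_file(const char *fname)
{
	int			errcode = write_history(fname);

	if (errcode == 0)
		return true;			/* history written successfully */

	fprintf(stderr, "could not write history file \"%s\": %s\n",
			fname, strerror(errcode));
	return false;
}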

regards, tom lane




Re: Custom/Foreign-Join-APIs (Re: [HACKERS] [v9.5] Custom Plan API)

2015-03-13 Thread Tom Lane
Robert Haas robertmh...@gmail.com writes:
 On Fri, Mar 13, 2015 at 2:31 PM, Tom Lane t...@sss.pgh.pa.us wrote:
 I don't object to the concept, but I think that is a pretty bad place
 to put the hook call: add_paths_to_joinrel is typically called multiple
 (perhaps *many*) times per joinrel and thus this placement would force
 any user of the hook to do a lot of repetitive work.

 Interesting point.  I guess the question is whether some or all
 callers are going to actually *want* a separate call for each
 invocation of add_paths_to_joinrel(), or whether they'll be happy to
 operate on the otherwise-complete path list.

Hmm.  You're right, it's certainly possible that some users would like to
operate on each possible pair of input relations, rather than considering
the joinrel as a whole.  Maybe we need two hooks, one like your patch
and one like I suggested.

 ... But for joinrels, it's not so
 easy to rule out, say, a hash-join here.  Neither hook placement is
 much good for that; the path you want to get rid of may have already
 dominated paths you want to keep.

I don't particularly buy that line of argument.  If a path has been
deleted because it was dominated by another, and you are unhappy about
that decision, then a hook of this sort is not the appropriate solution;
you need to be going and fixing the cost estimates that you think are
wrong.  (This gets back to the point I keep making that I don't actually
believe you can do anything very useful with these hooks; anything
interesting is probably going to involve more fundamental changes to the
planner.)

 I think the foreign data wrapper join pushdown case, which also aims
 to substitute a scan for a join, is interesting to think about, even
 though it's likely to be handled by a new FDW method instead of via
 the hook.  Where should the FDW method get called from?  Currently,
 the FDW method in KaiGai's patch is GetForeignJoinPaths, and that gets
 called from add_paths_to_joinrel().  The patch at
 http://www.postgresql.org/message-id/CAEZqfEfy7p=urpwn-q-nngzb8kwhbfqf82ysb9ztfzg7zn6...@mail.gmail.com
 uses that to implement join pushdown in postgres_fdw; if you have A
 JOIN B JOIN C all on server X, we'll notice that the join with A and B
 can be turned into a foreign scan on A JOIN B, and similarly for A-C
 and B-C.  Then, if it turns out that the cheapest path for A-B is the
 foreign join, and the cheapest path for C is a foreign scan, we'll
 arrive at the idea of a foreign scan on A-B-C, and we'll realize the
 same thing in each of the other combinations as well.  So, eventually
 the foreign join gets pushed down.

But this is in fact exactly the sort of case where you should not
rediscover all that over again for each pair of input rels.  "Do all
these baserels belong to the same foreign server" is a question that need
only be considered once per joinrel.  Not that that matters for this
hook, because as you say we're not doing foreign-server support through
this hook, but I think it's a fine example of why you'd want a single
call per joinrel.

regards, tom lane




Re: Custom/Foreign-Join-APIs (Re: [HACKERS] [v9.5] Custom Plan API)

2015-03-13 Thread Robert Haas
On Fri, Mar 13, 2015 at 2:31 PM, Tom Lane t...@sss.pgh.pa.us wrote:
 Robert Haas robertmh...@gmail.com writes:
 Another bit of this that I think we could commit without fretting
 about it too much is the code adding set_join_pathlist_hook.  This is
 - I think - analogous to set_rel_pathlist_hook, and like that hook,
 could be used for other purposes than custom plan generation - e.g. to
 delete paths we do not want to use.  I've extracted this portion of
 the patch and adjusted the comments; if there are no objections, I
 will commit this bit also.

 I don't object to the concept, but I think that is a pretty bad place
 to put the hook call: add_paths_to_joinrel is typically called multiple
 (perhaps *many*) times per joinrel and thus this placement would force
 any user of the hook to do a lot of repetitive work.

Interesting point.  I guess the question is whether some or all
callers are going to actually *want* a separate call for each
invocation of add_paths_to_joinrel(), or whether they'll be happy to
operate on the otherwise-complete path list.  It's true that if your
goal is to delete paths, it's probably best to be called just once
after the path list is complete, and there might be a use case for
that, but I guess it's less useful than for baserels.  For a baserel,
as long as you don't nuke the sequential-scan path, there is always
going to be a way to complete the plan; so this would be a fine way to
implement a disable-an-index extension.  But for joinrels, it's not so
easy to rule out, say, a hash-join here.  Neither hook placement is
much good for that; the path you want to get rid of may have already
dominated paths you want to keep.

Suppose you want to add paths - e.g. you have an extension that goes
and looks for a materialized view that matches this subtree of the
query, and if it finds one, it substitutes a scan of the materialized
view for a scan of the baserel.  Or, as in KaiGai's case, you have an
extension that can perform the whole join in GPU-land and produce the
same results we would have gotten via normal execution.  Either way,
you want - and this is the central point of the whole patch here - to
inject a scan path into a joinrel.  It is not altogether obvious to me
what the best placement for this is.  In the materialized view case,
you probably need a perfect match between the baserels in the view and
the baserels in the joinrel to do anything.  There's no point in
re-checking that for every innerrels/outerrels combination.  I don't
know enough about the GPU case to reason about it intelligently; maybe
KaiGai can comment.

I think the foreign data wrapper join pushdown case, which also aims
to substitute a scan for a join, is interesting to think about, even
though it's likely to be handled by a new FDW method instead of via
the hook.  Where should the FDW method get called from?  Currently,
the FDW method in KaiGai's patch is GetForeignJoinPaths, and that gets
called from add_paths_to_joinrel().  The patch at
http://www.postgresql.org/message-id/CAEZqfEfy7p=urpwn-q-nngzb8kwhbfqf82ysb9ztfzg7zn6...@mail.gmail.com
uses that to implement join pushdown in postgres_fdw; if you have A
JOIN B JOIN C all on server X, we'll notice that the join with A and B
can be turned into a foreign scan on A JOIN B, and similarly for A-C
and B-C.  Then, if it turns out that the cheapest path for A-B is the
foreign join, and the cheapest path for C is a foreign scan, we'll
arrive at the idea of a foreign scan on A-B-C, and we'll realize the
same thing in each of the other combinations as well.  So, eventually
the foreign join gets pushed down.

But there's another possible approach: suppose that
join_search_one_level, after considering left-sided and right-sided
joins and after considering bushy joins, checks whether every relation
it's got is from the same foreign server, and if so, asks that foreign
server whether it would like to contribute any paths. Would that be
better or worse?  A disadvantage is that if you've got something like
A LEFT JOIN B LEFT JOIN C LEFT JOIN D LEFT JOIN E LEFT JOIN F LEFT
JOIN G LEFT JOIN H LEFT JOIN I but none of the joins can be pushed
down (say, each join clause calls a non-pushdown-safe function) you'll
end up examining a pile of joinrels - at every level of the join tree
- and individually rejecting each one.  With the
build-it-up-incrementally approach, you'll figure that all out at
level 2, and then after that there's nothing to do but give up
quickly.  On the other hand, I'm afraid the incremental approach might
miss a trick: consider small LEFT JOIN (big INNER JOIN huge ON big.x =
huge.x) ON small.y = big.y AND small.z = huge.z, where all three are
foreign tables on the same server.  If the output of the big/huge join
is big, none of those paths are going to survive at level 2, but the
overall join size might be very small, so we surely want a chance to
recover at level 3.  (We discussed test cases of this form quite a bit
in the context of 

Re: [HACKERS] Strange debug message of walreciver?

2015-03-13 Thread Tatsuo Ishii
 On Sun, Mar 8, 2015 at 8:13 PM, Tatsuo Ishii is...@postgresql.org wrote:

  On Sat, Mar 7, 2015 at 10:18 PM, Tatsuo Ishii is...@postgresql.org
 wrote:
 
  When I set log_min_messages to debug5 and looked into walreciver log,
  I saw this:
 
  3600 2015-03-08 09:47:38 JST DEBUG:  sendtime 2015-03-08
  09:47:38.31493+09 receipttime 2015-03-08 09:47:38.315027+09 replication
  apply delay -1945478837 ms transfer latency 0 ms
 
  The "replication apply delay -1945478837 ms" part looks strange
  because the delay is below 0. The number is formatted as %d in the elog
  call, and I suspect this is some kind of integer overflow.
 
 
  Looking at GetReplicationApplyDelay() in walreceiverfuncs.c I noticed
 that
  the integer overflow occurs because sometimes the return of the
  GetCurrentChunkReplayStartTime() is 0 (zero).
 
  I added an elog into GetReplicationApplyDelay to check this info:
 
  DEBUG:  GetReplicationApplyDelay 479099832 seconds, 352 milliseconds,
  (0.00, 479099832352083.00)
  DEBUG:  sendtime 2015-03-08 00:17:12.351987-03 receipttime 2015-03-08
  00:17:12.352043-03 replication apply delay -1936504800 ms transfer
 latency
  0 ms
  DEBUG:  GetReplicationApplyDelay 479099841 seconds, 450 milliseconds,
  (0.00, 479099841450320.00)
  DEBUG:  sendtime 2015-03-08 00:17:21.45018-03 receipttime 2015-03-08
  00:17:21.450294-03 replication apply delay -1936495702 ms transfer
 latency
  0 ms

 Right. Until walreceiver gets the first WAL record to replay,
 xlogctl->currentChunkStartTime remains 0.

  Maybe we should check for that beforehand and return zero from
 GetReplicationApplyDelay.
  The attached patch implements it.

 Your patch reports a replication apply delay of 0 in the case above, which
 is confusing if the delay is actually 0.

 What about something like this to explicitly state that the delay data is
 not available?

 elog(DEBUG2, "sendtime %s receipttime %s replication apply delay (N/A)
 transfer latency %d ms",

 
 Makes sense. Attached patch implements what you suggested.
 
 Looks good. If there's no objection, I will commit this.

Done.

Best regards,
--
Tatsuo Ishii
SRA OSS, Inc. Japan
English: http://www.sraoss.co.jp/index_en.php
Japanese:http://www.sraoss.co.jp




Re: [HACKERS] Re: Abbreviated keys for Datum tuplesort

2015-03-13 Thread Peter Geoghegan
On Sun, Jan 25, 2015 at 3:15 AM, Andrew Gierth
and...@tao11.riddles.org.uk wrote:
  Robert> I think this is a good idea.  Do you have a test case that
  Robert> shows the benefit?

 The best test case for datum sort performance is to use percentile_disc,
 since that has almost no overhead beyond performing the actual sort.

I attach a slightly tweaked version of Andrew's original.

This revision makes the reverted comments within orderedsetaggs.c
consistent with back branches (no need to call abbreviation out as an
interesting special case anymore, just as in the back branches, where
abbreviation doesn't even exist). Better to keep those consistent for
backpatching purposes. Also, I've changed back tuplesort.c header
comments to how they were back in November and until recently, to
reflect that now it really is the case that only the hash index case
doesn't have the sortKeys field reliably set. We now need to set
sortKeys for the datum case, so don't say that we don't...we need to
worry about the applicability of the onlyKey optimization for the
datum sort case now, too.

Other than that, there are a number of minor stylistic tweaks. The
datum case does not support pass-by-value abbreviation, which could be
useful in theory (e.g., abbreviation of float8 values, which was
briefly discussed before). This isn't worth calling out as a special
case in the tuplesort header comments IMV; there is now a brief note
on this added to tuplesort_begin_datum(). We still support
abbreviation for pass-by-value types for non-datumsort cases (there is
of course no known example of opclass abbreviation support for a
pass-by-value type, so this is only a theoretical deficiency).

I've marked this ready for committer.

Thanks
-- 
Peter Geoghegan
diff --git a/src/backend/executor/nodeAgg.c b/src/backend/executor/nodeAgg.c
index 9ff0eff..ee13fc0 100644
--- a/src/backend/executor/nodeAgg.c
+++ b/src/backend/executor/nodeAgg.c
@@ -363,10 +363,6 @@ initialize_aggregates(AggState *aggstate,
 			 * We use a plain Datum sorter when there's a single input column;
 			 * otherwise sort the full tuple.  (See comments for
 			 * process_ordered_aggregate_single.)
-			 *
-			 * In the future, we should consider forcing the
-			 * tuplesort_begin_heap() case when the abbreviated key
-			 * optimization can thereby be used, even when numInputs is 1.
 			 */
 			peraggstate->sortstate =
 				(peraggstate->numInputs == 1) ?
diff --git a/src/backend/utils/adt/orderedsetaggs.c b/src/backend/utils/adt/orderedsetaggs.c
index f9a5f7f..869a83b 100644
--- a/src/backend/utils/adt/orderedsetaggs.c
+++ b/src/backend/utils/adt/orderedsetaggs.c
@@ -266,13 +266,7 @@ ordered_set_startup(FunctionCallInfo fcinfo, bool use_tuples)
 	osastate->qstate = qstate;
 	osastate->gcontext = gcontext;
 
-	/*
-	 * Initialize tuplesort object.
-	 *
-	 * In the future, we should consider forcing the tuplesort_begin_heap()
-	 * case when the abbreviated key optimization can thereby be used, even
-	 * when !use_tuples.
-	 */
+	/* Initialize tuplesort object */
 	if (use_tuples)
 		osastate->sortstate = tuplesort_begin_heap(qstate->tupdesc,
 												   qstate->numSortCols,
diff --git a/src/backend/utils/sort/tuplesort.c b/src/backend/utils/sort/tuplesort.c
index 8ad5635..92f0355 100644
--- a/src/backend/utils/sort/tuplesort.c
+++ b/src/backend/utils/sort/tuplesort.c
@@ -336,9 +336,9 @@ struct Tuplesortstate
 	bool		markpos_eof;	/* saved eof_reached */
 
 	/*
-	 * The sortKeys variable is used by every case other than the datum and
-	 * hash index cases; it is set by tuplesort_begin_xxx.  tupDesc is only
-	 * used by the MinimalTuple and CLUSTER routines, though.
+	 * The sortKeys variable is used by every case other than the hash index
+	 * case; it is set by tuplesort_begin_xxx.  tupDesc is only used by the
+	 * MinimalTuple and CLUSTER routines, though.
 	 */
 	TupleDesc	tupDesc;
 	SortSupport sortKeys;		/* array of length nKeys */
@@ -901,29 +901,41 @@ tuplesort_begin_datum(Oid datumType, Oid sortOperator, Oid sortCollation,
 	state->copytup = copytup_datum;
 	state->writetup = writetup_datum;
 	state->readtup = readtup_datum;
+	state->abbrevNext = 10;
 
 	state->datumType = datumType;
 
+	/* lookup necessary attributes of the datum type */
+	get_typlenbyval(datumType, &typlen, &typbyval);
+	state->datumTypeLen = typlen;
+	state->datumTypeByVal = typbyval;
+
 	/* Prepare SortSupport data */
-	state->onlyKey = (SortSupport) palloc0(sizeof(SortSupportData));
+	state->sortKeys = (SortSupport) palloc0(sizeof(SortSupportData));
+
+	state->sortKeys->ssup_cxt = CurrentMemoryContext;
+	state->sortKeys->ssup_collation = sortCollation;
+	state->sortKeys->ssup_nulls_first = nullsFirstFlag;
 
-	state->onlyKey->ssup_cxt = CurrentMemoryContext;
-	state->onlyKey->ssup_collation = sortCollation;
-	state->onlyKey->ssup_nulls_first = nullsFirstFlag;
 	/*
-	 * Conversion to abbreviated representation infeasible in the Datum case.
-	 * It must be possible to subsequently fetch original datum values within
-	 * 

[HACKERS] xloginsert.c hole_length warning on gcc 4.8.3

2015-03-13 Thread Tomas Vondra
Hi there,

with gcc 4.8.3, I'm getting this warning in xloginsert.c:

-
xloginsert.c: In function ‘XLogInsert’:
xloginsert.c:668:20: warning: ‘hole_length’ may be used uninitialized in
this function [-Wmaybe-uninitialized]
if (hole_length != 0 && is_compressed)
^
xloginsert.c:497:10: note: ‘hole_length’ was declared here
   uint16 hole_length;
-

The compiler seems to be confused probably because the code is
structured like this:

  {
      uint16    hole_length;
      uint16    hole_offset;

 if (needs_backup)
 {
 ... initialize hole_length/hole_offset ...
 }

 if (needs_data)
 {
 ...
 }

 if (needs_backup)
 {
 ... use hole_length/hole_offset ...
 }
  }

and does not realize the two blocks referencing the variables use the
same condition.

Initializing the variables at the very beginning (and simplifying the
first block a bit) seems like a good fix - patch attached.
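
For anyone who wants to see the effect outside PostgreSQL, here is a minimal
standalone illustration of the same shape (not PostgreSQL code); whether gcc
actually warns depends on the version and optimization level:

/*
 * Minimal standalone illustration: "len" is initialized and used under the
 * same condition, but the compiler cannot always prove that, so
 * -Wmaybe-uninitialized may fire at -O2.
 */
#include <stdio.h>

int
process(int needs_backup, int needs_data)
{
    unsigned short len;     /* not initialized up front, as in the report */
    int            total = 0;

    if (needs_backup)
        len = 42;           /* initialized only on this branch */

    if (needs_data)
        total += 1;

    if (needs_backup)
        total += len;       /* use guarded by the same condition */

    return total;
}

int
main(void)
{
    printf("%d\n", process(1, 1));
    return 0;
}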



-- 
Tomas Vondrahttp://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training  Services
diff --git a/src/backend/access/transam/xloginsert.c b/src/backend/access/transam/xloginsert.c
index fb39e70..3f56d14 100644
--- a/src/backend/access/transam/xloginsert.c
+++ b/src/backend/access/transam/xloginsert.c
@@ -494,8 +494,8 @@ XLogRecordAssemble(RmgrId rmid, uint8 info,
 		XLogRecordBlockCompressHeader cbimg;
 		bool		samerel;
 		bool		is_compressed = false;
-		uint16	hole_length;
-		uint16	hole_offset;
+		uint16	hole_length = 0;
+		uint16	hole_offset = 0;
 
 		if (!regbuf->in_use)
 			continue;
@@ -546,7 +546,8 @@ XLogRecordAssemble(RmgrId rmid, uint8 info,
 
 			/*
 			 * The page needs to be backed up, so calculate its hole length
-			 * and offset.
+			 * and offset, but only if it's a standard page header, and if
+			 * there actually is a hole to compress out.
 			 */
 			if (regbuf->flags & REGBUF_STANDARD)
 			{
@@ -561,18 +562,6 @@ XLogRecordAssemble(RmgrId rmid, uint8 info,
 	hole_offset = lower;
 	hole_length = upper - lower;
 }
-else
-{
-	/* No hole to compress out */
-	hole_offset = 0;
-	hole_length = 0;
-}
-			}
-			else
-			{
-/* Not a standard page header, don't try to eliminate hole */
-hole_offset = 0;
-hole_length = 0;
 			}
 
 			/*



Re: [HACKERS] pg_dump quietly ignore missing tables - is it bug?

2015-03-13 Thread Josh Berkus
On 03/13/2015 10:01 AM, Pavel Stehule wrote:
 
 
 2015-03-13 17:39 GMT+01:00 Robert Haas robertmh...@gmail.com:
 
 On Fri, Mar 13, 2015 at 11:26 AM, Pavel Stehule
 pavel.steh...@gmail.com wrote:
  we found a possible bug in pg_dump. It raises an error only when all
 specified
  tables don't exist. When it finds any table, it ignores the other
 missing ones.
 
  /usr/local/pgsql/bin/pg_dump -t Foo -t omega -s postgres >
 /dev/null; echo
  $?
 
  foo doesn't exist - it creates a broken backup due to the missing Foo table
 
   [pavel@localhost include]$ /usr/local/pgsql/bin/pg_dump -t Foo -t
 omegaa -s
  postgres > /dev/null; echo $?
  pg_dump: No matching tables were found
  1
 
  Is it OK? I think it is potentially dangerous. Any
 explicitly
  specified table should exist.
 
 Keep in mind that the argument to -t is a pattern, not just a table
 name.  I'm not sure how much that affects the calculus here, but it's
 something to think about.
 
 
 yes, it makes sense, although now I don't think it was a good
 idea. There should be some difference between a table name and a table pattern.

There was a long discussion about this when the feature was introduced
in 7.3(?) IIRC.  Changing it now would break backwards-compatibility, so
you'd really need to introduce a new option.

And, if you introduce a new option which is a specific table name, would
you require a schema name or not?

-- 
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com




Re: [HACKERS] Reduce pinning in btree indexes

2015-03-13 Thread Kyotaro HORIGUCHI
Hello,

At Thu, 12 Mar 2015 15:27:37 -0700, Peter Geoghegan p...@heroku.com wrote in 
CAM3SWZQ5nwTB-y4ZOj=5ckMLce5GAEUnjKJ2=m1vmhfx_ay...@mail.gmail.com
 On Sat, Feb 14, 2015 at 4:19 PM, Kevin Grittner kgri...@ymail.com wrote:
  At some point we could consider building on this patch to recheck
  index conditions for heap access when a non-MVCC snapshot is used,
  check the visibility map for referenced heap pages when the TIDs
  are read for an index-only scan, and/or skip LP_DEAD hinting for
  non-WAL-logged indexes.  But all those are speculative future work;
  this is a conservative implementation that just didn't modify
  pinning where there were any confounding factors.
 
 I don't have the bandwidth for a full review, but I took a quick look at this.
 
 I think you should call out those confounding factors in the README.
 It's not hard to piece them together from
 _bt_drop_lock_and_maybe_pin(), but I think you should anyway.
 
 Don't use BUFFER_LOCK_SHARE -- use BT_READ, as the existing nbtree
 LockBuffer() callers do. You're inconsistent about that in V3.

I agree with you. It looks like the only point where it is used.

In addition to that, the comment just above the point mentioned
above puzzles me.

 /* XXX: Can walking left be lighter on the locking and pins? */
   if (BTScanPosIsPinned(so->currPos))
   LockBuffer(so->currPos.buf, BUFFER_LOCK_SHARE);
   else
   so->currPos.buf = _bt_getbuf(rel, so->currPos.currPage, BT_READ);

I'd be happy if I could read the meaning of the comment more
clearly. I understand that it says that you want to remove the
locking (and pinning), but can't now because the simultaneous
splitting of the left page would break something. I'd like to see
it stated more clearly, whether I am correct or not..

regards,

-- 
Kyotaro Horiguchi
NTT Open Source Software Center





Re: [HACKERS] Reduce pinning in btree indexes

2015-03-13 Thread Simon Riggs
On 4 March 2015 at 03:16, Kevin Grittner kgri...@ymail.com wrote:

 How do people feel about the idea of me pushing this for 9.5 (after
 I clean up all the affected comments and README files)?  I know
 this first appeared in the last CF, but the footprint is fairly
 small and the only user-visible behavior change is that a btree
 index scan of a WAL-logged table using an MVCC snapshot no longer
 blocks a vacuum indefinitely.  (If there are objections I will move
 this to the first CF for the next release.)

It helps Hot Standby also, BTW. I proposed this previously, but it was
shot down, so I'm glad to see it happening.

I'm not sure why you have proposed only half the solution though?
Hopefully we aren't just submitting the difficult half that you needed
feedback on? Let's see the whole "snapshot too old" patch as well,
since that affects core PostgreSQL also.

-- 
 Simon Riggs   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, RemoteDBA, Training  Services




Re: [HACKERS] Parallel Seq Scan

2015-03-13 Thread Amit Kapila
On Fri, Mar 13, 2015 at 2:12 PM, Amit Langote langote_amit...@lab.ntt.co.jp
wrote:

 On 13-03-2015 PM 05:32, Amit Langote wrote:
  On 12-03-2015 PM 11:46, Amit Kapila wrote:
  [parallel_seqscan_v10.patch]
 
  There may be a bug in TupleQueueFunnelNext().
 
  1) I observed a hang with stack looking like:
 
  #0  0x0039696df098 in poll () from /lib64/libc.so.6
  #1  0x006f1c6a in WaitLatchOrSocket (latch=0x7f29dc3c73b4,
  wakeEvents=1, sock=-1, timeout=0) at pg_latch.c:333
  #2  0x006f1aca in WaitLatch (latch=0x7f29dc3c73b4, wakeEvents=1,
  timeout=0) at pg_latch.c:197
  #3  0x0065088b in TupleQueueFunnelNext (funnel=0x17b4a20,
nowait=0
  '\000', done=0x17ad481 ) at tqueue.c:269
  #4  0x00636cab in funnel_getnext (funnelstate=0x17ad3d0) at
  nodeFunnel.c:347
  ...
  snip
 
  2) In some cases, there can be a segmentation fault with stack looking
like:
 
  #0  0x00396968990a in memcpy () from /lib64/libc.so.6
  #1  0x006507e7 in TupleQueueFunnelNext (funnel=0x263c800,
nowait=0
  '\000', done=0x2633461 ) at tqueue.c:233
  #2  0x00636cab in funnel_getnext (funnelstate=0x26333b0) at
  nodeFunnel.c:347
  #3  0x00636901 in ExecFunnel (node=0x26333b0) at
nodeFunnel.c:179
  ...
  snip
 
  I could get rid of (1) and (2) with the attached fix.

 Hit send too soon!

 By the way, the bug seems to be exposed only with a certain
pattern/sequence
 of workers being detached (perhaps in immediate succession) whereby the
 funnel->nextqueue remains incorrectly set.


I think this can happen if funnel->nextqueue is greater
than funnel->nqueues.
Please see if the attached patch fixes the issue; otherwise, could you share the
scenario in more detail where you hit this issue?
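
For illustration only, one way such an out-of-range queue index could be
guarded is sketched below.  The struct layout and helper are assumptions based
on this discussion, not the contents of the attached fix_tupqueue_issue_v1.patch:

/*
 * Illustrative sketch, not the attached patch: wrap nextqueue back into
 * range once detached queues have shrunk nqueues below it.  Field names
 * are assumed from the discussion, not copied from tqueue.c.
 */
typedef struct TupleQueueFunnel
{
	int		nqueues;		/* number of live queues */
	int		nextqueue;		/* next queue to read from (round robin) */
	/* ... queue handles, etc. ... */
} TupleQueueFunnel;

static void
funnel_clamp_nextqueue(TupleQueueFunnel *funnel)
{
	/* after a queue is detached, nextqueue may point past the last queue */
	if (funnel->nextqueue >= funnel->nqueues)
		funnel->nextqueue = 0;
}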


 The patch attached this time.

 By the way, when I have asserts enabled, I hit this compilation error:

 createplan.c: In function ‘create_partialseqscan_plan’:
 createplan.c:1180: error: ‘Path’ has no member named ‘path’

 I see following line there:

 Assert(best_path->path.parent->rtekind == RTE_RELATION);


Okay, will take care of this.

Thanks.


With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com


fix_tupqueue_issue_v1.patch
Description: Binary data



Re: [HACKERS] Parallel Seq Scan

2015-03-13 Thread Amit Langote
On 12-03-2015 PM 11:46, Amit Kapila wrote:
 [parallel_seqscan_v10.patch]

There may be a bug in TupleQueueFunnelNext().

1) I observed a hang with stack looking like:

#0  0x0039696df098 in poll () from /lib64/libc.so.6
#1  0x006f1c6a in WaitLatchOrSocket (latch=0x7f29dc3c73b4,
wakeEvents=1, sock=-1, timeout=0) at pg_latch.c:333
#2  0x006f1aca in WaitLatch (latch=0x7f29dc3c73b4, wakeEvents=1,
timeout=0) at pg_latch.c:197
#3  0x0065088b in TupleQueueFunnelNext (funnel=0x17b4a20, nowait=0
'\000', done=0x17ad481 ) at tqueue.c:269
#4  0x00636cab in funnel_getnext (funnelstate=0x17ad3d0) at
nodeFunnel.c:347
...
snip

2) In some cases, there can be a segmentation fault with stack looking like:

#0  0x00396968990a in memcpy () from /lib64/libc.so.6
#1  0x006507e7 in TupleQueueFunnelNext (funnel=0x263c800, nowait=0
'\000', done=0x2633461 ) at tqueue.c:233
#2  0x00636cab in funnel_getnext (funnelstate=0x26333b0) at
nodeFunnel.c:347
#3  0x00636901 in ExecFunnel (node=0x26333b0) at nodeFunnel.c:179
...
snip

I could get rid of (1) and (2) with the attached fix.





-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] forward vs backward slashes in msvc build code

2015-03-13 Thread Michael Paquier
On Fri, Mar 13, 2015 at 6:20 AM, Alvaro Herrera wrote:
 Peter Eisentraut wrote:
 This is contrib/chkpass not finding the crypt symbol, which is
 presumably in some library.  But I can't see how it would normally find
 it, without my patch.

 It seems crypt is provided by libpgport.  So chkpass should be mentioned
 in @contrib_uselibpgport, but isn't.  Maybe the fix is just to add it
 there?

I had a closer look at this patch, and yes indeed, the problem was
exactly that. Now honestly I cannot understand why this dependency
with libpgport was not necessary before... In any case, attached is a
patch rebased on HEAD that builds correctly with MSVC.
-- 
Michael
diff --git a/src/tools/msvc/MSBuildProject.pm b/src/tools/msvc/MSBuildProject.pm
index 37958f9..a16f9ac 100644
--- a/src/tools/msvc/MSBuildProject.pm
+++ b/src/tools/msvc/MSBuildProject.pm
@@ -127,7 +127,7 @@ EOF
 	foreach my $fileNameWithPath (sort keys %{ $self->{files} })
 	{
 		confess "Bad format filename '$fileNameWithPath'\n"
-		  unless ($fileNameWithPath =~ /^(.*)\\([^\\]+)\.(c|cpp|y|l|rc)$/);
+		  unless ($fileNameWithPath =~ m!^(.*)/([^/]+)\.(c|cpp|y|l|rc)$!);
 		my $dir  = $1;
 		my $fileName = $2;
 		if ($fileNameWithPath =~ /\.y$/ or $fileNameWithPath =~ /\.l$/)
diff --git a/src/tools/msvc/Mkvcbuild.pm b/src/tools/msvc/Mkvcbuild.pm
index 989a2ec..5634b3a 100644
--- a/src/tools/msvc/Mkvcbuild.pm
+++ b/src/tools/msvc/Mkvcbuild.pm
@@ -34,11 +34,11 @@ my $contrib_defines = { 'refint' => 'REFINT_VERBOSE' };
 my @contrib_uselibpq =
   ('dblink', 'oid2name', 'pgbench', 'pg_upgrade', 'postgres_fdw', 'vacuumlo');
 my @contrib_uselibpgport = (
-	'oid2name',  'pgbench',
-	'pg_standby','pg_archivecleanup',
-	'pg_test_fsync', 'pg_test_timing',
-	'pg_upgrade','pg_xlogdump',
-	'vacuumlo');
+	'chkpass',   'oid2name',
+	'pgbench',   'pg_standby',
+	'pg_archivecleanup', 'pg_test_fsync',
+	'pg_test_timing','pg_upgrade',
+	'pg_xlogdump',   'vacuumlo');
 my @contrib_uselibpgcommon = (
 	'oid2name',  'pgbench',
 	'pg_standby','pg_archivecleanup',
@@ -49,10 +49,10 @@ my $contrib_extralibs = { 'pgbench' => ['ws2_32.lib'] };
 my $contrib_extraincludes =
   { 'tsearch2' => ['contrib/tsearch2'], 'dblink' => ['src/backend'] };
 my $contrib_extrasource = {
-	'cube' => [ 'contrib\cube\cubescan.l', 'contrib\cube\cubeparse.y' ],
+	'cube' => [ 'contrib/cube/cubescan.l', 'contrib/cube/cubeparse.y' ],
 	'pgbench' =>
-	  [ 'contrib\pgbench\exprscan.l', 'contrib\pgbench\exprparse.y' ],
+	  [ 'contrib/pgbench/exprscan.l', 'contrib/pgbench/exprparse.y' ],
-	'seg' => [ 'contrib\seg\segscan.l', 'contrib\seg\segparse.y' ], };
+	'seg' => [ 'contrib/seg/segscan.l', 'contrib/seg/segparse.y' ], };
 my @contrib_excludes = ('pgcrypto', 'intagg', 'sepgsql');
 
 # Set of variables for frontend modules
@@ -63,18 +63,18 @@ my $frontend_extralibs = {
 	'pg_restore' => ['ws2_32.lib'],
 	'psql'   => ['ws2_32.lib'] };
 my $frontend_extraincludes = {
-	'initdb' => ['src\timezone'],
-	'psql'   => [ 'src\bin\pg_dump', 'src\backend' ] };
+	'initdb' => [ 'src/timezone' ],
+	'psql'   => [ 'src/bin/pg_dump', 'src/backend' ] };
-my $frontend_extrasource = { 'psql' => ['src\bin\psql\psqlscan.l'] };
+my $frontend_extrasource = { 'psql' => ['src/bin/psql/psqlscan.l'] };
 my @frontend_excludes = ('pgevent', 'pg_basebackup', 'pg_dump', 'scripts');
 
 sub mkvcbuild
 {
 	our $config = shift;
 
-	chdir('..\..\..') if (-d '..\msvc' && -d '..\..\..\src');
+	chdir('../../..') if (-d '../msvc' && -d '../../../src');
 	die 'Must run from root or msvc directory'
-	  unless (-d 'src\tools\msvc' && -d 'src');
+	  unless (-d 'src/tools/msvc' && -d 'src');
 
 	my $vsVersion = DetermineVisualStudioVersion();
 
@@ -101,37 +101,37 @@ sub mkvcbuild
 
 	$libpgport = $solution->AddProject('libpgport', 'lib', 'misc');
 	$libpgport->AddDefine('FRONTEND');
-	$libpgport->AddFiles('src\port', @pgportfiles);
+	$libpgport->AddFiles('src/port', @pgportfiles);
 
 	$libpgcommon = $solution->AddProject('libpgcommon', 'lib', 'misc');
 	$libpgcommon->AddDefine('FRONTEND');
-	$libpgcommon->AddFiles('src\common', @pgcommonfrontendfiles);
+	$libpgcommon->AddFiles('src/common', @pgcommonfrontendfiles);
 
-	$postgres = $solution->AddProject('postgres', 'exe', '', 'src\backend');
-	$postgres->AddIncludeDir('src\backend');
-	$postgres->AddDir('src\backend\port\win32');
-	$postgres->AddFile('src\backend\utils\fmgrtab.c');
+	$postgres = $solution->AddProject('postgres', 'exe', '', 'src/backend');
+	$postgres->AddIncludeDir('src/backend');
+	$postgres->AddDir('src/backend/port/win32');
+	$postgres->AddFile('src/backend/utils/fmgrtab.c');
 	$postgres->ReplaceFile(
-		'src\backend\port\dynloader.c',
-		'src\backend\port\dynloader\win32.c');
-	$postgres->ReplaceFile('src\backend\port\pg_sema.c',
-		'src\backend\port\win32_sema.c');
-	$postgres->ReplaceFile('src\backend\port\pg_shmem.c',
-		'src\backend\port\win32_shmem.c');
-	$postgres->ReplaceFile('src\backend\port\pg_latch.c',
-		'src\backend\port\win32_latch.c');
-	

Re: [HACKERS] patch : Allow toast tables to be moved to a different tablespace

2015-03-13 Thread Julien Tachoires
Hi,

On 10/03/2015 00:31, Andreas Karlsson wrote:
 On 03/03/2015 04:14 PM, Julien Tachoires wrote:
 Sorry for the delay, I missed this thread.
 Here is a new version of this patch considering Andreas' comments.
 
 Please also add it to the next open commitfest so we do not lose the patch.

Here is a new version rebased against head and considering Andreas' last
comments:

  - New regression tests about the fact that a notice is raised when
ALTER TABLE SET TOAST TABLESPACE is done for a non-TOASTed table.
  - New regression tests to check that a TOAST index follows the TOAST
table when it's moved.
  - Some fixes in the documentation.
  - psql's '\d' command shows now both table and TOAST table tablespaces
when they are different, even if one of them is pg_default.
  - patch added to the next open commitfest:
https://commitfest.postgresql.org/5/188/

Regards,

--
Julien



set_toast_tablespace_v0.13.patch.gz
Description: application/gzip

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] Regarding pg_stat_statements

2015-03-13 Thread Sreerama Manoj
Hi,
  As we know, pg_stat_statements monitors queries after normalizing them
(removing the constant values present in the query). I want to know whether
there is a way to store the values that get normalized away, because I want
to check the type of data (values) and the range of data that is hitting the
database. I am using Postgres 9.4.
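
For example (the table name and counts below are made up, just for
illustration), all that pg_stat_statements keeps is the normalized text:

    SELECT query, calls FROM pg_stat_statements ORDER BY calls DESC LIMIT 1;

                query               | calls
    --------------------------------+-------
     SELECT * FROM t WHERE id = ?   |  1042

so the actual values that were used in the query are lost.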


Re: [HACKERS] Join push-down support for foreign tables

2015-03-13 Thread Ashutosh Bapat
Hi Hanada-san,
I noticed that the patch doesn't have any tests for FDW joins in
postgres_fdw. While you are updating the patch, can you please add a few
tests for them?  I suggest adding tests for a combination of these
dimensions:
1. Types of joins
2. Joins between multiple foreign and local tables together, to test
whether we push down as much of the join tree as possible with mixed tables.
3. Join/WHERE conditions with un/safe-to-push expressions
4. Queries with sorting/aggregation on top of a join, to test the working of
setrefs.
5. Joins between foreign tables on different foreign servers (to check that
those do not get accidentally pushed down).

I have attached a file with some example queries on those lines.

On Tue, Mar 10, 2015 at 8:37 AM, Kouhei Kaigai kai...@ak.jp.nec.com wrote:

Thanks for finding out what we overlooked.
There is still a problem because the new 'relids' field is not updated
in setrefs.c (scanrelid is incremented by rtoffset here).
It is easy to shift the bitmapset by rtoffset; however, I would also
like to see another approach.
 
 
 
  I just made it work for explain, but other parts still need work. Sorry
  about that. If we follow INDEX_VAR, we should be able to get there.
 
 I tried to modify your patch a bit as below:
 * add adjustment of bitmap fields on setrefs.c
 * add support on outfuncs.c and copyfuncs.c.
 * add bms_shift_members() in bitmapset.c

 I think it is a reasonable enhancement, however, it is not tested with
 real-life code, like postgres_fdw.

 Hanada-san, could you add a feature to print the names of the foreign tables
 which are involved in remote queries, in postgresExplainForeignScan()?
 The ForeignScan->fdw_relids bitmap and ExplainState->rtable_names will
 tell you the joined foreign tables replaced by the (pseudo) foreign scan.

 Soon, I'll update the interface patch also.

My idea adds a 'List *fdw_sub_paths' field in ForeignPath to inform the
planner of the underlying foreign-scan paths (with scanrelid > 0).
create_foreignscan_plan() will call create_plan_recurse() to
construct plan nodes based on the path nodes being attached.
Even though these foreign-scan nodes are not actually executed,
setrefs.c can update scanrelid in the usual way and ExplainPreScanNode
does not need any exceptional handling for Foreign/CustomScan
nodes.
In addition, it allows us to keep information about the underlying foreign
table scans, even if the planner needs some other information in a
future version (not only relids).

What do you think?
 
 
 
  I am not sure about keeping planner nodes which are not turned into
  execution nodes. There's no precedent for that in the current code. It
  could be risky.
 
 Indeed, that is a fair enough opinion. At the moment, no other code builds
 plan nodes that are never actually executed.
 Please forget the above idea.

 Thanks,
 --
 NEC OSS Promotion Center / PG-Strom Project
 KaiGai Kohei kai...@ak.jp.nec.com




-- 
Best Wishes,
Ashutosh Bapat
EnterpriseDB Corporation
The Postgres Database Company


pgfdw_join.sql
Description: Binary data

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Parallel Seq Scan

2015-03-13 Thread Amit Langote
On 13-03-2015 PM 05:32, Amit Langote wrote:
 On 12-03-2015 PM 11:46, Amit Kapila wrote:
 [parallel_seqscan_v10.patch]
 
 There may be a bug in TupleQueueFunnelNext().
 
 1) I observed a hang with stack looking like:
 
 #0  0x0039696df098 in poll () from /lib64/libc.so.6
 #1  0x006f1c6a in WaitLatchOrSocket (latch=0x7f29dc3c73b4,
 wakeEvents=1, sock=-1, timeout=0) at pg_latch.c:333
 #2  0x006f1aca in WaitLatch (latch=0x7f29dc3c73b4, wakeEvents=1,
 timeout=0) at pg_latch.c:197
 #3  0x0065088b in TupleQueueFunnelNext (funnel=0x17b4a20, nowait=0
 '\000', done=0x17ad481 ) at tqueue.c:269
 #4  0x00636cab in funnel_getnext (funnelstate=0x17ad3d0) at
 nodeFunnel.c:347
 ...
 snip
 
 2) In some cases, there can be a segmentation fault with stack looking like:
 
 #0  0x00396968990a in memcpy () from /lib64/libc.so.6
 #1  0x006507e7 in TupleQueueFunnelNext (funnel=0x263c800, nowait=0
 '\000', done=0x2633461 ) at tqueue.c:233
 #2  0x00636cab in funnel_getnext (funnelstate=0x26333b0) at
 nodeFunnel.c:347
 #3  0x00636901 in ExecFunnel (node=0x26333b0) at nodeFunnel.c:179
 ...
 snip
 
 I could get rid of (1) and (2) with the attached fix.

Hit send too soon!

By the way, the bug seems to be exposed only with a certain pattern/sequence
of workers being detached (perhaps in immediate succession) whereby the
funnel->nextqueue remains incorrectly set.

The patch attached this time.

By the way, when I have asserts enabled, I hit this compilation error:

createplan.c: In function ‘create_partialseqscan_plan’:
createplan.c:1180: error: ‘Path’ has no member named ‘path’

I see following line there:

Assert(best_path->path.parent->rtekind == RTE_RELATION);

Thanks,
Amit
diff --git a/src/backend/executor/tqueue.c b/src/backend/executor/tqueue.c
index ee4e03e..8a6c6f3 100644
--- a/src/backend/executor/tqueue.c
+++ b/src/backend/executor/tqueue.c
@@ -234,6 +234,8 @@ TupleQueueFunnelNext(TupleQueueFunnel *funnel, bool nowait, bool *done)
    funnel->queue[funnel->nextqueue + 1],
    sizeof(shm_mq_handle *)
 		* (funnel->nqueues - funnel->nextqueue));
+
+			funnel->nextqueue = (funnel->nextqueue + 1) % funnel->nqueues;
 			if (funnel->nextqueue < waitpos)
 --waitpos;
 		}
@@ -260,7 +262,7 @@ TupleQueueFunnelNext(TupleQueueFunnel *funnel, bool nowait, bool *done)
 		 * and return NULL (if we're in non-blocking mode) or wait for the
 		 * process latch to be set (otherwise).
 		 */
-		if (funnel->nextqueue == waitpos)
+		if (result != SHM_MQ_DETACHED && funnel->nextqueue == waitpos)
 		{
 			if (nowait)
 return NULL;

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] recovery_target_action doesn't work for anything but shutdown

2015-03-13 Thread Michael Paquier
On Thu, Mar 12, 2015 at 11:53 PM, Andres Freund and...@2ndquadrant.com wrote:
 On 2015-03-12 15:52:02 +0100, Andres Freund wrote:
 I guess what you actually intended to test was StandbyModeRequested?

 Err, EnableHotStandby.

Indeed. Without !EnableHotStandby, a node in recovery would simply be
shut down even if a pause has been requested when hot_standby = on.
That also matches the intent of the comment.
-- 
Michael


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Proposal : REINDEX xxx VERBOSE

2015-03-13 Thread Robert Haas
On Wed, Mar 11, 2015 at 5:36 PM, Jim Nasby jim.na...@bluetreble.com wrote:
 The thing is, ()s are actually an odd-duck. Very little supports it, and
 while COPY allows it they're not required. EXPLAIN is a different story,
 because that's not WITH; we're actually using () *instead of* WITH.

Generally, I think the commands that don't have () are the older ones,
and those that do have it are the newer ones: EXPLAIN, VERBOSE, the
newest of our three COPY syntaxes, CREATE MATERIALIZED VIEW, foreign
data wrappers, servers, and foreign tables.  The older stuff like
CREATE DATABASE and REINDEX that uses ad-hoc syntax instead is a real
pain in the neck: every time you want to add an option, you've got to
add new parser rules and keywords, which is bad for the overall
efficiency of parsing.  So I think this argument is exactly backwards:
parenthesized options are the newer, better way to do this sort of
thing.
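
To illustrate the contrast (mytable is just a placeholder):

    -- newer, parenthesized style: new options need no new parser keywords
    EXPLAIN (ANALYZE, BUFFERS, FORMAT JSON) SELECT 1;
    COPY mytable TO STDOUT WITH (FORMAT csv, HEADER true);

    -- older, ad-hoc keyword style
    EXPLAIN ANALYZE VERBOSE SELECT 1;
    COPY mytable TO STDOUT CSV HEADER;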

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Proposal : REINDEX xxx VERBOSE

2015-03-13 Thread Jim Nasby

On 3/13/15 6:48 AM, Robert Haas wrote:

On Wed, Mar 11, 2015 at 5:36 PM, Jim Nasby jim.na...@bluetreble.com wrote:

The thing is, ()s are actually an odd-duck. Very little supports it, and
while COPY allows it they're not required. EXPLAIN is a different story,
because that's not WITH; we're actually using () *instead of* WITH.


Generally, I think the commands that don't have () are the older ones,
and those that do have it are the newer ones: EXPLAIN, VERBOSE, the
newest of our three COPY syntaxes, CREATE MATERIALIZED VIEW, foreign
data wrappers, servers, and foreign tables.  The older stuff like
CREATE DATABASE and REINDEX that uses ad-hoc syntax instead is a real
pain in the neck: every time you want to add an option, you've got to
add new parser rules and keywords, which is bad for the overall
efficiency of parsing.  So I think this argument is exactly backwards:
parenthesized options are the newer, better way to do this sort of
thing.


Yeah, that doesn't sound like a good tradeoff compared to making people 
type some extra ()s. :(


We should at least support ()s on the other commands though, so that 
we're consistent.

--
Jim Nasby, Data Architect, Blue Treble Consulting
Data in Trouble? Get it in Treble! http://BlueTreble.com


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] WIP Patch for GROUPING SETS phase 1

2015-03-13 Thread Svenne Krap
Patch from message (87d24iukc5@news-spur.riddles.org.uk) fails (tried to 
apply on top of ebc0f5e01d2f ), as b55722692 has broken up the line (in 
src/backend/optimizer/util/pathnode.c):

pathnode->path.rows = estimate_num_groups(root, uniq_exprs, rel->rows);

After patching the added parameter (NULL) in by hand, the build fails as 
src/backend/optimizer/path/indxpath.c:1953 misses the new argument as well - 
this change is not in the patch. 

/Svenne

The new status of this patch is: Waiting on Author


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Reduce pinning in btree indexes

2015-03-13 Thread Simon Riggs
On 15 February 2015 at 00:19, Kevin Grittner kgri...@ymail.com wrote:

 What they wanted was what happened in the other database product --
 if a snapshot got sufficiently old that cleaning up the MVCC data
 was a problem *and* the snapshot was used again *and* it read a
 page which had been modified far enough back that it was not
 possible to return correct data, then they wanted to receive a
 snapshot too old error.  I wrote a patch to do that...

So, please lets see the patch. It seems useful for core Postgres.

-- 
 Simon Riggs   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, RemoteDBA, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Proposal : REINDEX xxx VERBOSE

2015-03-13 Thread Robert Haas
On Fri, Mar 13, 2015 at 8:57 AM, Jim Nasby jim.na...@bluetreble.com wrote:
 Yeah, that doesn't sound like a good tradeoff compared to making people type
 some extra ()s. :(

 We should at least support ()s on the other commands though, so that we're
 consistent.

I think we've been moving slowly in that direction, but it's not this
patch's job to accelerate that transition.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] CATUPDATE confusion?

2015-03-13 Thread Robert Haas
On Thu, Mar 12, 2015 at 10:36 PM, Stephen Frost sfr...@snowman.net wrote:
 As near as I can tell, pgAdmin3 does still use pg_user (though I don't
 think it uses pg_group or pg_shadow except when connected to an ancient
 server) in some cases.  Where it is used, based on my quick review at
 least, it looks like it'd be pretty easy to fix.

 The rolcatupdate usage appears to all be associated with pg_authid
 though, and the changes required to remove the places where it shows up
 doesn't look particularly bad either.  There are no references to
 usecatupdate.  Where there are references to 'use*', they'd have to also
 be updated with the change to pg_user, naturally.

 Looking at phppgadmin, there are quite a few more uses of pg_user there,
 along with references to pg_group and even pg_shadow (for 8.0 and
 earlier backends).  Amusingly, the only place 'catupdate' is referenced
 there is in the Polish language file.  Updating phppgadmin to not
 reference pg_user or the other views looks like it'd be a bit more work,
 but not a huge effort either.

 Basically, in my view at least, these programs are likely to continue to
 use these backwards compatibility views until we either formally
 deprecate them or (more likely) until we actually remove them and things
 break.  As such, I'm not sure if this information really helps us make a
 decision here.

After poking at this a bit, I am guessing the reason they still use
pg_user is that regular users don't have permission to access
pg_authid directly.  We don't want to make it impossible for pgAdmin
to work for non-superusers.
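
For instance, as an ordinary (non-superuser) role:

    SELECT rolname FROM pg_authid;  -- fails: permission denied
    SELECT usename FROM pg_user;    -- works; the view hides the password
    SELECT rolname FROM pg_roles;   -- works; rolpassword shows as ********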

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Parallel Seq Scan

2015-03-13 Thread Amit Kapila
On Thu, Mar 12, 2015 at 3:44 AM, Haribabu Kommi kommi.harib...@gmail.com
wrote:

 In create_parallelscan_paths() function the funnel path is added once
 the partial seq scan
 path is generated. I feel the funnel path can be added once on top of
 the total possible
 parallel path in the entire query path.

 Is this the right patch to add such support also?


This seems to be an optimization for parallel paths which can be
done later as well.


With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com


Re: [HACKERS] Reduce pinning in btree indexes

2015-03-13 Thread Robert Haas
On Fri, Mar 13, 2015 at 8:52 AM, Simon Riggs si...@2ndquadrant.com wrote:
 On 15 February 2015 at 00:19, Kevin Grittner kgri...@ymail.com wrote:
 What they wanted was what happened in the other database product --
 if a snapshot got sufficiently old that cleaning up the MVCC data
 was a problem *and* the snapshot was used again *and* it read a
 page which had been modified far enough back that it was not
 possible to return correct data, then they wanted to receive a
 snapshot too old error.  I wrote a patch to do that...

 So, please lets see the patch. It seems useful for core Postgres.

It was submitted here:

http://www.postgresql.org/message-id/136937748.3364317.1423964815320.javamail.ya...@mail.yahoo.com

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Parallel Seq Scan

2015-03-13 Thread Robert Haas
On Fri, Mar 13, 2015 at 9:01 AM, Amit Kapila amit.kapil...@gmail.com wrote:
 On Thu, Mar 12, 2015 at 3:44 AM, Haribabu Kommi kommi.harib...@gmail.com
 wrote:
 In create_parallelscan_paths() function the funnel path is added once
 the partial seq scan
 path is generated. I feel the funnel path can be added once on top of
 the total possible
 parallel path in the entire query path.

 Is this the right patch to add such support also?

 This seems to be an optimization for parallel paths which can be
 done later as well.

+1.  Let's keep it simple for now.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: Custom/Foreign-Join-APIs (Re: [HACKERS] [v9.5] Custom Plan API)

2015-03-13 Thread Robert Haas
On Thu, Mar 12, 2015 at 8:09 PM, Thom Brown t...@linux.com wrote:
 On 12 March 2015 at 21:28, Tom Lane t...@sss.pgh.pa.us wrote:
 Robert Haas robertmh...@gmail.com writes:
 I took a look at this patch today and noticed that it incorporates not
 only documentation for the new functionality it adds, but also for the
 custom-scan functionality whose documentation I previously excluded
 from commit on the grounds that it needed more work, especially to
 improve the English.  That decision was not popular at the time, and I
 think we need to remedy it before going further with this.  I had
 hoped that someone else would care about this work enough to help with
 the documentation, but it seems not, so today I went through the
 documentation in this patch, excluded all of the stuff specific to
 custom joins, and heavily edited the rest.  The result is attached.

 Looks good; I noticed one trivial typo --- please s/returns/return/ under
 ExecCustomScan.  Also, perhaps instead of just set ps_ResultTupleSlot
 that should say fill ps_ResultTupleSlot with the next tuple in the
 current scan direction.

 Also:

 s/initalization/initialization/

Thanks to both of you for the review.  I have committed it with those
improvements.  Please let me know if you spot anything else.

Another bit of this that I think we could commit without fretting
about it too much is the code adding set_join_pathlist_hook.  This is
- I think - analogous to set_rel_pathlist_hook, and like that hook,
could be used for other purposes than custom plan generation - e.g. to
delete paths we do not want to use.  I've extracted this portion of
the patch and adjusted the comments; if there are no objections, I
will commit this bit also.

Kaigai, note that your patch puts this hook and the call to
GetForeignJoinPaths() in the wrong order.  As in the baserel case, the
hook should get the last word, after any FDW-specific handlers have
been called, so that it can delete or modify paths as well as add
them.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
diff --git a/src/backend/optimizer/path/joinpath.c b/src/backend/optimizer/path/joinpath.c
index 1da953f..2872430 100644
--- a/src/backend/optimizer/path/joinpath.c
+++ b/src/backend/optimizer/path/joinpath.c
@@ -21,6 +21,8 @@
 #include optimizer/pathnode.h
 #include optimizer/paths.h
 
+/* Hook for plugins to get control in add_paths_to_joinrel() */
+set_join_pathlist_hook_type set_join_pathlist_hook = NULL;
 
 #define PATH_PARAM_BY_REL(path, rel)  \
 	((path)->param_info && bms_overlap(PATH_REQ_OUTER(path), (rel)->relids))
@@ -260,6 +262,17 @@ add_paths_to_joinrel(PlannerInfo *root,
 			 restrictlist, jointype,
 			 sjinfo, semifactors,
 			 param_source_rels, extra_lateral_rels);
+
+	/*
+	 * Allow a plugin to editorialize on the set of Paths for this join
+	 * relation.  It could add new paths by calling add_path(), or delete
+	 * or modify paths added by the core code.
+	 */
+	if (set_join_pathlist_hook)
+		set_join_pathlist_hook(root, joinrel, outerrel, innerrel,
+			   restrictlist, jointype,
+			   sjinfo, semifactors,
+			   param_source_rels, extra_lateral_rels);
 }
 
 /*
diff --git a/src/include/optimizer/paths.h b/src/include/optimizer/paths.h
index 6cad92e..c42c69d 100644
--- a/src/include/optimizer/paths.h
+++ b/src/include/optimizer/paths.h
@@ -30,6 +30,19 @@ typedef void (*set_rel_pathlist_hook_type) (PlannerInfo *root,
 		RangeTblEntry *rte);
 extern PGDLLIMPORT set_rel_pathlist_hook_type set_rel_pathlist_hook;
 
+/* Hook for plugins to get control in add_paths_to_joinrel() */
+typedef void (*set_join_pathlist_hook_type) (PlannerInfo *root,
+			 RelOptInfo *joinrel,
+			 RelOptInfo *outerrel,
+			 RelOptInfo *innerrel,
+			 List *restrictlist,
+			 JoinType jointype,
+			 SpecialJoinInfo *sjinfo,
+			 SemiAntiJoinFactors *semifactors,
+			 Relids param_source_rels,
+			 Relids extra_lateral_rels);
+extern PGDLLIMPORT set_join_pathlist_hook_type set_join_pathlist_hook;
+
 /* Hook for plugins to replace standard_join_search() */
 typedef RelOptInfo *(*join_search_hook_type) (PlannerInfo *root,
 		  int levels_needed,

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Parallel Seq Scan

2015-03-13 Thread Amit Kapila
On Thu, Mar 12, 2015 at 10:35 PM, Thom Brown t...@linux.com wrote:


 Another problem.  I restarted the instance (just in case), and get this
error:

 # \df+ *.*
 ERROR:  cannot retain locks acquired while in parallel mode


This problem occurs because the above statement is trying to
execute a parallel-unsafe function (obj_description) in parallel mode.
This will be resolved once the parallel_seqscan patch is integrated
with the access-parallel-safety patch [1].

[1]
https://commitfest.postgresql.org/4/155/


With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com


Re: [HACKERS] Parallel Seq Scan

2015-03-13 Thread Amit Kapila
On Tue, Mar 10, 2015 at 12:26 PM, Amit Kapila amit.kapil...@gmail.com
wrote:

 On Tue, Mar 10, 2015 at 10:23 AM, Haribabu Kommi kommi.harib...@gmail.com
wrote:
 
  On Tue, Mar 10, 2015 at 3:09 PM, Amit Kapila amit.kapil...@gmail.com
wrote:
 
   I have currently modelled it based on existing rescan for seqscan
   (ExecReScanSeqScan()) which means it will begin the scan again.
   Basically if the workers are already started/initialized by previous
   scan, then re-initialize them (refer function ExecReScanFunnel() in
   patch).
  
   Can you elaborate more if you think current handling is not sufficient
   for any case?
 
  From the ExecReScanFunnel function it seems that the re-scan waits until
  all the workers have finished before starting the next scan. Will the
  workers stop their current ongoing task? Otherwise this may decrease
  performance instead of improving it, I feel.
 

 Okay, performance-wise it might affect such a case, but I think we can
 handle it by not calling WaitForParallelWorkersToFinish(),
 as DestroyParallelContext() will automatically terminate all the workers.


We can't directly call DestroyParallelContext() to terminate workers, as
it can happen that by that time some of the workers have still not
started, and that can lead to problems.  I think what we need here is a
way to know whether all workers have started (basically we need a new
function WaitForParallelWorkersToStart()).  This API needs to be provided
by the parallel-mode patch.


With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com


Re: [HACKERS] Reduce pinning in btree indexes

2015-03-13 Thread Kevin Grittner
Kyotaro HORIGUCHI horiguchi.kyot...@lab.ntt.co.jp wrote:
 At Thu, 12 Mar 2015 15:27:37 -0700, Peter Geoghegan p...@heroku.com wrote:
 On Sat, Feb 14, 2015 at 4:19 PM, Kevin Grittner kgri...@ymail.com wrote:

 At some point we could consider building on this patch to
 recheck index conditions for heap access when a non-MVCC
 snapshot is used, check the visibility map for referenced heap
 pages when the TIDs are read for an index-only scan, and/or
 skip LP_DEAD hinting for non-WAL-logged indexes.  But all those
 are speculative future work; this is a conservative
 implementation that just didn't modify pinning where there were
 any confounding factors.

 I think you should call out those confounding factors in the
 README. It's not hard to piece them together from
 _bt_drop_lock_and_maybe_pin(), but I think you should anyway.

OK:

https://github.com/kgrittn/postgres/commit/f5f59ded30b114ac83b90a00ba1fa5ef490b994e

 Don't use BUFFER_LOCK_SHARE -- use BT_READ, as the existing
 nbtree LockBuffer() callers do. You're inconsistent about that
 in V3.

 I agree with you. It looks like the only place where it is used.

OK:

https://github.com/kgrittn/postgres/commit/76118b58be819ed5e68569c926d0222bc41640ea

 In addition to that, the comment just above the point mentioned
 above puzzles me.

/* XXX: Can walking left be lighter on the locking and pins? */
if (BTScanPosIsPinned(so->currPos))
	LockBuffer(so->currPos.buf, BUFFER_LOCK_SHARE);
else
	so->currPos.buf = _bt_getbuf(rel, so->currPos.currPage,
								 BT_READ);

 I would be happy if I could read the meaning of the comment more
 clearly. I understand it to say that you want to remove the
 locking (and pinning), but can't for now because a simultaneous
 split of the left page would break something. I'd like to see it
 made clearer, whether or not I am correct about that.

Does this clear it up?:

https://github.com/kgrittn/postgres/commit/22066fc161a092e800e4c1e853136c4513f8771b

Since there are no changes that would affect the compiled code
here, I'm not posting a new patch yet.  I'll do that once things
seem to have settled down a bit more.

--
Kevin Grittner
EDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] pg_dump quietly ignore missing tables - is it bug?

2015-03-13 Thread Pavel Stehule
Hi

We found a possible bug in pg_dump. It raises an error only when none of the
specified tables exists. When it finds any one of them, it silently ignores
the other missing ones.

/usr/local/pgsql/bin/pg_dump -t Foo -t omega -s postgres > /dev/null; echo
$?

Foo doesn't exist - it creates a broken backup due to the missing Foo table.

 [pavel@localhost include]$ /usr/local/pgsql/bin/pg_dump -t Foo -t omegaa
-s postgres > /dev/null; echo $?
pg_dump: No matching tables were found
1

Is this OK? I think it is potentially dangerous. Any explicitly specified
table should exist.

Regards

Pavel


Re: [HACKERS] Reduce pinning in btree indexes

2015-03-13 Thread Kevin Grittner
Robert Haas robertmh...@gmail.com wrote:
 On Fri, Mar 13, 2015 at 8:52 AM, Simon Riggs si...@2ndquadrant.com wrote:
 On 15 February 2015 at 00:19, Kevin Grittner kgri...@ymail.com wrote:
 What they wanted was what happened in the other database
 product -- if a snapshot got sufficiently old that cleaning up
 the MVCC data was a problem *and* the snapshot was used again
 *and* it read a page which had been modified far enough back
 that it was not possible to return correct data, then they
 wanted to receive a snapshot too old error.  I wrote a patch
 to do that...

 So, please lets see the patch. It seems useful for core
 Postgres.

 It was submitted here:

 http://www.postgresql.org/message-id/136937748.3364317.1423964815320.javamail.ya...@mail.yahoo.com

The feedback was generally fairly positive except for the fact that
snapshot age (for purposes of being too old) was measured in
transaction IDs assigned.  There seemed to be a pretty universal
feeling that this needed to be changed to a time-based setting.
I've been working on that, although my initial efforts proved to be
a dead-end.  I've had some off-list discussions with Andrew Dunstan
and we have agreed on something that looks like it should work
well, but it's still being coded.

The other thing that is needed is to sprinkle the other index AMs
with TestForOldSnapshot() calls.  In the btree code the right spots
for those were all following BufferGetPage() calls, so the work was
in seeing where each of those could be called from to see whether
it was a spot that the check was needed.

Of course, docs and comment changes are also needed, but that's
probably only a day or two's effort.

Further discussion of the snapshot too old patch should probably
go on its thread.  The spillover is largely my fault for
referencing one from the other.

--
Kevin Grittner
EDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] CATUPDATE confusion?

2015-03-13 Thread Stephen Frost
Robert,

* Robert Haas (robertmh...@gmail.com) wrote:
 On Thu, Mar 12, 2015 at 10:36 PM, Stephen Frost sfr...@snowman.net wrote:
  Basically, in my view at least, these programs are likely to continue to
  use these backwards compatibility views until we either formally
  deprecate them or (more likely) until we actually remove them and things
  break.  As such, I'm not sure if this information really helps us make a
  decision here.
 
 After poking at this a bit, I am guessing the reason they still use
 pg_user is that regular users don't have permission to access
 pg_authid directly.  We don't want to make it impossible for pgAdmin
 to work for non-superusers.

I should have been more specific.  I don't believe they've moved to
using pg_roles completely (which was created specifically to address the
issue that regular users can't select from pg_authid).  Both of the
tools being discussed use pg_roles also (and, in fact, I believe they
have to for certain fields which pg_user doesn't include..
rolcreaterole being a pretty big one, but also rolconnlimit and
rolinherit, which wouldn't make sense in pg_user anyway..).  I agree
that they can't simply move to pg_authid today since only superusers
have access to that table today.  Of course, with column-level
privileges, we could potentially change that too..

In any case, looking at this again, it seems clear that we've not been
keeping pg_user up to date and no one seems to care.  I don't think it
makes sense to go back and try to add useconnlimit, usecanlogin,
usecreateuser for 9.5 when pg_user was only ever intended for
backwards compatibility and clearly hasn't been getting the love and
attention it would deserve if it was really useful.

As such, I'm coming down on the side of simply removing pg_user,
pg_shadow, and pg_group for 9.5.  Having a half-maintained mish-mash of
things from pg_authid making their way into pg_user/pg_shadow (which
suffers all the same problems of missing important fields) isn't doing
anyone any favors, and pg_group is an inefficient way of getting at the
information that's in pg_auth_members which implies that groups are
somehow different from roles, which hasn't been the case in 10 years.
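
For what it's worth, everything pg_group provides is a simple join away; a
sketch against pg_roles, so it also works for non-superusers:

    SELECT g.rolname AS group_name, m.rolname AS member_name
      FROM pg_auth_members am
      JOIN pg_roles g ON g.oid = am.roleid
      JOIN pg_roles m ON m.oid = am.member;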

Thanks!

Stephen


signature.asc
Description: Digital signature


Re: [HACKERS] Parallel Seq Scan

2015-03-13 Thread Amit Kapila
On Fri, Mar 13, 2015 at 7:15 PM, Robert Haas robertmh...@gmail.com wrote:

 On Fri, Mar 13, 2015 at 7:01 AM, Amit Kapila amit.kapil...@gmail.com
wrote:
  I think this can happen if funnel->nextqueue is greater than
  funnel->nqueues.
  Please see if attached patch fixes the issue, else could you share the
  scenario in more detail where you hit this issue.

 Speaking as the guy who wrote the first version of that code...

 I don't think this is the right fix; the point of that code is to
 remove a tuple queue from the funnel when it gets detached, which is a
 correct thing to want to do.  funnel->nextqueue should always be less
 than funnel->nqueues; how is that failing to be the case here?


I could not reproduce the issue, and the exact scenario is not mentioned
in the mail either.  However, what I think can lead to funnel->nextqueue
being greater than funnel->nqueues is something like the following:

Assume 5 queues, so the value of funnel->nqueues will be 5, and assume the
value of funnel->nextqueue is 2.  Now let us say 4 workers get detached
one by one; in that case the code will always go into the else branch and
will never change funnel->nextqueue, whereas the value of funnel->nqueues
will drop to 1.

Am I missing something?

With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com


Re: [HACKERS] recovery_target_action doesn't work for anything but shutdown

2015-03-13 Thread Petr Jelinek

On 13/03/15 15:06, Petr Jelinek wrote:

On 12/03/15 15:53, Andres Freund wrote:

On 2015-03-12 15:52:02 +0100, Andres Freund wrote:

I guess what you actually intended to test was StandbyModeRequested?


Err, EnableHotStandby.



Yeah, pause does not work currently. This change was made by the committer



Actually the code is from me; the committer just requested the change, sorry.

--
 Petr Jelinek  http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Parallel Seq Scan

2015-03-13 Thread Robert Haas
On Fri, Mar 13, 2015 at 8:59 AM, Amit Kapila amit.kapil...@gmail.com wrote:
 We can't directly call DestroyParallelContext() to terminate workers as
 it can so happen that by that time some of the workers are still not
 started.

That shouldn't be a problem.  TerminateBackgroundWorker() not only
kills an existing worker if there is one, but also tells the
postmaster that if it hasn't started the worker yet, it should not
bother.  So at the conclusion of the first loop inside
DestroyParallelContext(), every running worker will have received
SIGTERM and no more workers will be started.

 So that can lead to problem.  I think what we need here is a way to know
 whether all workers are started. (basically need a new function
WaitForParallelWorkersToStart()).  This API needs to be provided by
 parallel-mode patch.

I don't think so.  DestroyParallelContext() is intended to be good
enough for this purpose; if it's not, we should fix that instead of
adding a new function.

No matter what, re-scanning a parallel node is not going to be very
efficient.  But the way to deal with that is to make sure that such
nodes have a substantial startup cost, so that the planner won't pick
them in the case where it isn't going to work out well.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] CATUPDATE confusion?

2015-03-13 Thread Stephen Frost
* Stephen Frost (sfr...@snowman.net) wrote:
 I should have been more specific.  I don't believe they've moved to
 using pg_roles completely (which was created specifically to address the
 issue that regular users can't select from pg_authid).  

Err, forgot to finish that thought, sorry.  Let's try again:

I should have been more specific.  I don't believe they've moved to
using pg_roles completely (which was created specifically to address the
issue that regular users can't select from pg_authid) simply because
they haven't had reason to.

Thanks,

Stephen


signature.asc
Description: Digital signature


Re: [HACKERS] Parallel Seq Scan

2015-03-13 Thread Robert Haas
On Fri, Mar 13, 2015 at 7:01 AM, Amit Kapila amit.kapil...@gmail.com wrote:
 I think this can happen if funnel->nextqueue is greater than
 funnel->nqueues.
 Please see if attached patch fixes the issue, else could you share the
 scenario in more detail where you hit this issue.

Speaking as the guy who wrote the first version of that code...

I don't think this is the right fix; the point of that code is to
remove a tuple queue from the funnel when it gets detached, which is a
correct thing to want to do.  funnel->nextqueue should always be less
than funnel->nqueues; how is that failing to be the case here?

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Parallel Seq Scan

2015-03-13 Thread Amit Langote
On Fri, Mar 13, 2015 at 11:03 PM, Amit Kapila amit.kapil...@gmail.com wrote:
 On Fri, Mar 13, 2015 at 7:15 PM, Robert Haas robertmh...@gmail.com wrote:

 On Fri, Mar 13, 2015 at 7:01 AM, Amit Kapila amit.kapil...@gmail.com
 wrote:
  I think this can happen if funnel->nextqueue is greater than
  funnel->nqueues.
  Please see if attached patch fixes the issue, else could you share the
  scenario in more detail where you hit this issue.

 Speaking as the guy who wrote the first version of that code...

 I don't think this is the right fix; the point of that code is to
 remove a tuple queue from the funnel when it gets detached, which is a
 correct thing to want to do.  funnel->nextqueue should always be less
 than funnel->nqueues; how is that failing to be the case here?


 I could not reproduce the issue, neither the exact scenario is
 mentioned in mail.  However what I think can lead to funnel->nextqueue
 greater than funnel->nqueues is something like below:

 Assume 5 queues, so value of funnel->nqueues will be 5 and
 assume value of funnel->nextqueue is 2, so now let us say 4 workers
 got detached one-by-one, so for such a case it will always go in else loop
 and will never change funnel->nextqueue whereas value of funnel->nqueues
 will become 1.

 Am I missing something?


Sorry, I did not mention the exact example I'd used but I thought it
was just any arbitrary example:

CREATE TABLE t1(c1, c2) AS SELECT g1, repeat('x', 5) FROM
generate_series(1, 1000) g(g1);

CREATE TABLE t2(c1, c2) AS SELECT g1, repeat('x', 5) FROM
generate_series(1, 100) g(g1);

SELECT count(*) FROM t1 JOIN t2 ON t1.c1 = t2.c1 AND t1.c1 BETWEEN 100 AND 200;

The observed behavior was arbitrarily either a hang or a segfault (that's
why I guessed it may be due to the arbitrary sequence in which the workers
are detached).

Changed parameters to cause plan to include a Funnel:

parallel_seqscan_degree = 8
cpu_tuple_communication_cost = 0.01/0.001

Thanks,
Amit


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] recovery_target_action doesn't work for anything but shutdown

2015-03-13 Thread Petr Jelinek

On 12/03/15 15:53, Andres Freund wrote:

On 2015-03-12 15:52:02 +0100, Andres Freund wrote:

I guess what you actually intended to test was StandbyModeRequested?


Err, EnableHotStandby.



Yeah, pause does not work currently. This change was made by the committer,
and I think the intent was to make it a little safer by silently changing
to shutdown instead of silently changing to promote, which is essentially
what was happening before the patch.


But yes, it seems the check should have been against EnableHotStandby.


--
 Petr Jelinek  http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] mogrify and indent features for jsonb

2015-03-13 Thread Petr Jelinek

On 01/03/15 16:49, Andrew Dunstan wrote:


On 03/01/2015 05:03 AM, Petr Jelinek wrote:

On 23/02/15 18:15, Dmitry Dolgov wrote:

Hi, Petr, thanks for the review.

  I think it would be better if the indent printing didn't put the
start of array ([) and start of dictionary ({) on a separate line
Did you mean this?

 [
     {
         "a": 1,
         "b": 2
     }
 ]

I tried to verify this in several ways (http://jsonprettyprint.com/,
JSON.stringify, json.tool from python), and I always get this result
(new line after ([) and ({) ).


No, I mean new lines before the ([) and ({) - try pretty printing
something like '{"a": ["b", "c"], "d": {"e": "f"}}' using that tool you
pasted and see what your patch outputs to see what I mean.





Please try this patch and see if it's doing what you want.



Yes, this produces output that looks like what javascript/python produce.
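
For example, assuming the feature ends up exposed as jsonb_pretty() (the
function name may still change), the output now looks roughly like:

    SELECT jsonb_pretty('{"a": ["b", "c"], "d": {"e": "f"}}'::jsonb);

    {
        "a": [
            "b",
            "c"
        ],
        "d": {
            "e": "f"
        }
    }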


(the other stuff I commented about, mainly the h_atoi is still unfixed 
though)


--
 Petr Jelinek  http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Regarding pg_stat_statements

2015-03-13 Thread Ronan Dunklau
Le vendredi 13 mars 2015 14:59:28 Sreerama Manoj a écrit :
 Hi,
   As we know that pg_stat_statements will monitor the queries after
 normalizing the queries(Removes the values present in query). I want to
 know is there a way to store those normalized values because I want to
 check the type of data(values) ,range of data that is being hit to the
 database. I am using Postgres 9.4

Hello.

You may be interested in the pg_qualstats extension: 
https://github.com/dalibo/pg_qualstats

The purpose of the extension is to track values like pg_stat_statements, but 
at the predicate level rather than statement level. It stores normalized 
predicates as well as constants. 

The documentation is here:

http://powa.readthedocs.org/en/latest/stats_extensions/pg_qualstats.html#pg-qualstats

It won't give you all normalized values though, only those present in 
predicates.

-- 
Ronan Dunklau
http://dalibo.com - http://dalibo.org

signature.asc
Description: This is a digitally signed message part.


Re: [HACKERS] pg_dump quietly ignore missing tables - is it bug?

2015-03-13 Thread Robert Haas
On Fri, Mar 13, 2015 at 11:26 AM, Pavel Stehule pavel.steh...@gmail.com wrote:
 we found possible bug in pg_dump. It raise a error only when all specified
 tables doesn't exists. When it find any table, then ignore missing other.

 /usr/local/pgsql/bin/pg_dump -t Foo -t omega -s postgres > /dev/null; echo
 $?

 foo doesn't exists - it creates broken backup due missing Foo table

  [pavel@localhost include]$ /usr/local/pgsql/bin/pg_dump -t Foo -t omegaa -s
 postgres > /dev/null; echo $?
 pg_dump: No matching tables were found
 1

 Is it ok? I am thinking, so it is potentially dangerous. Any explicitly
 specified table should to exists.

Keep in mind that the argument to -t is a pattern, not just a table
name.  I'm not sure how much that affects the calculus here, but it's
something to think about.
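
(As an example: an unquoted -t Foo is downcased and will look for a table
named foo; to match the mixed-case name you would have to write -t '"Foo"',
and something like -t 'emp*' matches every table whose name starts with emp.)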

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] pg_dump quietly ignore missing tables - is it bug?

2015-03-13 Thread Pavel Stehule
2015-03-13 17:39 GMT+01:00 Robert Haas robertmh...@gmail.com:

 On Fri, Mar 13, 2015 at 11:26 AM, Pavel Stehule pavel.steh...@gmail.com
 wrote:
  we found possible bug in pg_dump. It raise a error only when all
 specified
  tables doesn't exists. When it find any table, then ignore missing other.
 
  /usr/local/pgsql/bin/pg_dump -t Foo -t omega -s postgres > /dev/null;
 echo
  $?
 
  foo doesn't exists - it creates broken backup due missing Foo table
 
   [pavel@localhost include]$ /usr/local/pgsql/bin/pg_dump -t Foo -t
 omegaa -s
  postgres > /dev/null; echo $?
  pg_dump: No matching tables were found
  1
 
  Is it ok? I am thinking, so it is potentially dangerous. Any explicitly
  specified table should to exists.

 Keep in mind that the argument to -t is a pattern, not just a table
 name.  I'm not sure how much that affects the calculus here, but it's
 something to think about.


Yes, that makes sense, although now I don't think it was a good idea.
There should be some difference between a table name and a table pattern.

Regards

Pavel



 --
 Robert Haas
 EnterpriseDB: http://www.enterprisedb.com
 The Enterprise PostgreSQL Company