Re: [HACKERS] Proposal for changes to recovery.conf API

2016-09-05 Thread Michael Paquier
On Tue, Sep 6, 2016 at 2:18 PM, Abhijit Menon-Sen  wrote:
> One open issue is the various assign_recovery_target_xxx functions,
> which Michael noted in his earlier review:
> [...]
> I don't like this code, but I'm not yet sure what to replace it with. I
> think we should address the underlying problem—that the UI doesn't map
> cleanly to what the code wants. There's been some discussion about this
> earlier, but not any consensus that I could see.

By moving all the recovery parameters to GUCs, we will need to define a
hierarchy among the target types. I see no way to avoid that except by
changing the parametrization, but that would be more confusing for the
users. So the behavior will no longer be last-one-wins; instead, the
first target type listed in the hierarchy, among all the ones enabled,
wins. I think that we could leverage that a little bit though and
reduce the complexity of the patch: my best advice here is to make all
those recovery_target_* parameters PGC_POSTMASTER so that they are
loaded only once when the server starts, and then we determine the
recovery target type in the startup process instead of trying to do so
at GUC level. This does not change the fact that we'd still need a
hierarchy among the target types, but it slightly reduces the
complexity of the patch. And this does not prevent further enhancements
by switching them later to PGC_SIGHUP, really. I'd really like to see
that improvement, but I don't think it applies to this change, which is
complicated enough and will likely introduce its share of bugs even
after several reviews.

> Do we want something like this (easy to implement and document, perhaps
> not especially convenient to use):
>
> recovery_target = 'xid' # or 'time'/'name'/'lsn'/'immediate'
> recovery_target_xid = xxx?  # the only setting we care about now
> recovery_target_otherthings = parsed_but_ignored
>
> Or something like this (a bit harder to implement):
>
> recovery_target = 'xid:xxx' # or 'time:xxx' etc.

Interesting ideas, particularly the last one. Mixing both:
recovery_target_type = 'foo' # can be 'immediate'
recovery_target_value = 'value_of_foo' # does not matter for 'immediate'

> Alternatively, the do-nothing option is to move the tests from guc.c to
> StartupXLOG and do it only once in some defined order (which would also
> break the current last-mention-wins behaviour).

My vote goes in favor of that..

+else if (recovery_target_lsn != 0)
+recovery_target = RECOVERY_TARGET_LSN;
This needs to check for InvalidXLogRecPtr.
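
That is, with the usual macro (untested sketch):

+else if (!XLogRecPtrIsInvalid(recovery_target_lsn))
+recovery_target = RECOVERY_TARGET_LSN;
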
-- 
Michael




Re: [HACKERS] Proposal for changes to recovery.conf API

2016-09-05 Thread Abhijit Menon-Sen
Hi.

Here's an updated version of my patch, which now applies on top of the
patch that Simon posted earlier (recovery_startup_r10_api.v1b.patch).

A few notes:

1. I merged in recovery_target_lsn as a new GUC setting.
2. I fixed various minor nits in the earlier patch, not worth mentioning
   individually.
3. I haven't added back trigger_file, because Simon's patch removes it.
   I can add it back in separately after discussion (otherwise Simon's
   and my patches will conflict).
4. I've tested this to the extent that setting things in postgresql.conf
   works, and recovery.conf is still read if it exists, and so on.

One open issue is the various assign_recovery_target_xxx functions,
which Michael noted in his earlier review:

+static void
+assign_recovery_target_xid(const char *newval, void *extra)
+{
+   recovery_target_xid = *((TransactionId *) extra);
+
+   if (recovery_target_xid != InvalidTransactionId)
+   recovery_target = RECOVERY_TARGET_XID;
+   else if (recovery_target_name && *recovery_target_name)
+   recovery_target = RECOVERY_TARGET_NAME;
+   else if (recovery_target_time != 0)
+   recovery_target = RECOVERY_TARGET_TIME;
+   else if (recovery_target_lsn != 0)
+   recovery_target = RECOVERY_TARGET_LSN;
+   else
+   recovery_target = RECOVERY_TARGET_UNSET;
+}

(Note how recovery_target_lsn caused this—and three other functions
besides—to grow an extra branch.)

I don't like this code, but I'm not yet sure what to replace it with. I
think we should address the underlying problem—that the UI doesn't map
cleanly to what the code wants. There's been some discussion about this
earlier, but not any consensus that I could see.

Do we want something like this (easy to implement and document, perhaps
not especially convenient to use):

recovery_target = 'xid' # or 'time'/'name'/'lsn'/'immediate'
recovery_target_xid = xxx?  # the only setting we care about now
recovery_target_otherthings = parsed_but_ignored

Or something like this (a bit harder to implement):

recovery_target = 'xid:xxx' # or 'time:xxx' etc.

Alternatively, the do-nothing option is to move the tests from guc.c to
StartupXLOG and do it only once in some defined order (which would also
break the current last-mention-wins behaviour).
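
For illustration, that could boil down to a single resolution routine
called once from StartupXLOG(). An untested sketch, reusing the
variable names from the assign hook quoted above, with one arbitrary
priority order:

static void
resolveRecoveryTarget(void)
{
	if (recovery_target_xid != InvalidTransactionId)
		recovery_target = RECOVERY_TARGET_XID;
	else if (recovery_target_name && *recovery_target_name)
		recovery_target = RECOVERY_TARGET_NAME;
	else if (recovery_target_time != 0)
		recovery_target = RECOVERY_TARGET_TIME;
	else if (!XLogRecPtrIsInvalid(recovery_target_lsn))
		recovery_target = RECOVERY_TARGET_LSN;
	else
		recovery_target = RECOVERY_TARGET_UNSET;
}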

Thoughts? (I've added Fujii to the Cc: list, in case he has any
comments, since this is based on his earlier patch.)

-- Abhijit
commit c20d735648f5ea867fda5afc499a63d3877536a3
Author: Abhijit Menon-Sen 
Date:   Thu Sep 1 09:01:04 2016 +0530

Convert recovery.conf settings to GUCs

Based on unite_recoveryconf_postgresqlconf_v3.patch by Fujii Masao.

diff --git a/contrib/pg_standby/pg_standby.c b/contrib/pg_standby/pg_standby.c
index e4136f9..8fcb85c 100644
--- a/contrib/pg_standby/pg_standby.c
+++ b/contrib/pg_standby/pg_standby.c
@@ -520,7 +520,7 @@ usage(void)
 	printf("  -w MAXWAITTIME max seconds to wait for a file (0=no limit) (default=0)\n");
 	printf("  -?, --help show this help, then exit\n");
 	printf("\n"
-		   "Main intended use as restore_command in recovery.conf:\n"
+		   "Main intended use as restore_command in postgresql.conf:\n"
 		   "  restore_command = 'pg_standby [OPTION]... ARCHIVELOCATION %%f %%p %%r'\n"
 		   "e.g.\n"
 	"  restore_command = 'pg_standby /mnt/server/archiverdir %%f %%p %%r'\n");
diff --git a/doc/src/sgml/backup.sgml b/doc/src/sgml/backup.sgml
index 0f09d82..f43e41e 100644
--- a/doc/src/sgml/backup.sgml
+++ b/doc/src/sgml/backup.sgml
@@ -1174,8 +1174,15 @@ SELECT pg_stop_backup();


 
- Create a recovery command file recovery.conf in the cluster
- data directory (see ). You might
+ Set up recovery parameters in postgresql.conf (see
+  and
+ ).
+
+   
+   
+
+ Create a recovery trigger file recovery.trigger
+ in the cluster data directory. You might
  also want to temporarily modify pg_hba.conf to prevent
  ordinary users from connecting until you are sure the recovery was successful.
 
@@ -1187,7 +1194,7 @@ SELECT pg_stop_backup();
  recovery be terminated because of an external error, the server can
  simply be restarted and it will continue recovery.  Upon completion
  of the recovery process, the server will rename
- recovery.conf to recovery.done (to prevent
+ recovery.trigger to recovery.done (to prevent
  accidentally re-entering recovery mode later) and then
  commence normal database operations.
 
@@ -1203,12 +1210,11 @@ SELECT pg_stop_backup();

 

-The key part of all this is to set up a recovery configuration file that
-describes how you want to recover and how far the recovery should
-run.  You can use recovery.conf.sample (normally
-located in the installation's share/ directory) as a
-prototype.  The one thing that you absolutely must specify in
-recovery.conf is the restore_command,
+

Re: [HACKERS] Patch: Implement failover on libpq connect level.

2016-09-05 Thread Victor Wagner
On Tue, 6 Sep 2016 07:58:28 +0530
Mithun Cy  wrote:

> 
> Now if only one host is in the connection string and it asks for a
> read-write connection (connect to master), I mean read_only is set to
> 0 explicitly. With the above logic will we allow it to connect to a
> standby? I still think a psql connection to a standby should be
> handled by changing the default value of readonly to 1 (which means
> connect to any).

It would definitely be more logical, but I haven't found an easy way to
do it. Maybe I should return to this place and rethink. I don't like
the idea of explicitly ignoring a connection string parameter due to
some condition.

Really, I think the test for a replication connection can just be
removed from this part of the code now.


> Thinking further, probably replacing the readonly parameter with
> targetServerType=any|master (with the default being any) should clear
> some confusion and bring consistency, since the same is used in the
> JDBC multi-host connection string.

It seems that in this context changing readonly=0|1 to
targetServerType=any|master makes sense.
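
For illustration, a connection string might then look like this
(hypothetical syntax, mirroring the JDBC convention):

postgresql://host1:5432,host2:5432/mydb?targetServerType=master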


 
> 2)
> @@ -1460,33 +1538,80 @@ connectDBStart(PGconn *conn)
>portstr,
>(int) (UNIXSOCK_PATH_BUFLEN - 1));
>   conn->options_valid = false;
> + free(nodes->port);
> 
> nodes->port was not allocated at this point.
> 

I'll recheck it.

-- 
   Victor Wagner 




Re: [HACKERS] patch: function xmltable

2016-09-05 Thread Craig Ringer
On 4 September 2016 at 16:06, Pavel Stehule  wrote:
> Hi
>
> minor update - using DefElem instead own private parser type

I'm really glad that you're doing this and I'll take a look at it for this CF.

It's quite a big patch so I expect this will take a few rounds of
review and updating.


Patch applies cleanly and builds cleanly on master both with and
without --with-xml.

Overall, I think this needs to be revised with appropriate comments.
Whitespace/formatting needs fixing since it's all over the place.
Documentation is insufficient (per notes below).

Re identifier naming, some of this code uses XmlTable naming patterns,
some uses TableExpr prefixes. Is that intended to indicate a boundary
between things re-usable for other structured data ingesting
functions? Do you expect a "JSONEXPR" or similar in future? That's
alluded to by

+/*--
+ * TableExpr - used for XMLTABLE function
+ *
+ * This can be used for json_table, jsonb_table functions in future
+ *--
+ */
+typedef struct TableExpr
+{
...

If so, should this really be two patches, one to add the table
expression infrastructure and another to add XMLTABLE that uses it?
Also, why in that case does so much of the TableExpr code call
directly into XmlTable code? It doesn't look very generic.

Overall I find identifier naming to be a bit inconsistent and think
it's necessary to make it clear that all the "TableExpr" stuff is for
XMLTABLE specifically, if that's the case, or make the delineation
clearer if not.

I'd also like to see tests that exercise the ruleutils get_rule_expr
parts of the code for the various XMLTABLE variants.

Similarly, since this seems to add a new xpath parser, that needs
comprehensive tests. Maybe re-purpose an existing xpath test data set?




More detailed comments:


Docs comments:

  The xmltable produces [a] table based on
[the] passed XML value.

The docs are pretty minimal and don't explain the various clauses of
XMLTABLE. What is "BY REF" ? Is PATH an xpath expression? If so, is
there a good cross reference link available? The PASSING clause? etc.

How does XMLTABLE decide what to iterate over, and how to iterate over it?

Presumably the FOR ORDINALITY clause makes a column emit a numeric counter.

What standard, if any, does this conform to? Does it resemble
implementations elsewhere? What limitations or unsupported features
does it have relative to those standards?



execEvalTableExpr seems to be defined twice, with a difference in
case. This is probably not going to fly:


+static Datum
+execEvalTableExpr(TableExprState *tstate,
+ExprContext *econtext,
+bool *isNull, ExprDoneCond *isDone)
+{

+static Datum
+ExecEvalTableExpr(TableExprState *tstate,
+ExprContext *econtext,
+bool *isNull, ExprDoneCond *isDone)
+{


It looks like you've split the function into a "guts" and "wrapper"
part, with the error handling PG_TRY / PG_CATCH block in the wrapper.
That seems reasonable for readability, but the naming isn't.

A comment is needed to explain what ExecEvalTableExpr is / does. If
it's XMLTABLE specific (which it looks like based on the code), its
name should reflect that. This pattern is repeated elsewhere; e.g.
TableExprState is really the state for an XMLTABLE expression. But
PostgreSQL actually has TABLE statements, and in future we might want
to support table-expressions, so I don't think this naming is
appropriate. This is made worse by the lack of comments on things like
the definition of TableExprState. Please use something that makes it
clear it's for XMLTABLE and add appropriate comments.

Formatting of variables, arguments, function signatures etc is
random/haphazard and doesn't follow project convention. It's neither
aligned nor unaligned in the normal way; I don't understand the random
spacing at all. Maybe you should try to run pgindent and then extract
just the changes related to your patch? Or run your IDE/editor's
indent function on your changes? Right now it's actually kind of hard
to read. Do you edit with tabstop set to 1 normally or something like
that?

There's a general lack of comments throughout the added code.

In execEvalTableExpr, why are we looping over namespaces? What's that
for? Comment would be nice.

Typo: Path caclulation => Path calculation

What does XmlTableSetRowPath() do? It seems to copy its argument.
Nothing further is done with the row_path argument after it's called
by execEvalTableExpr, so what context is that memory in and do we have
to worry about it if it's large?

execEvalTableExpr says it's doing "path calculation". What it actually
appears to do is evaluate the path expressions, if provided, and
otherwise use the column name as the implied path expression. (The
docs should mention that).

It wasn't immediately obvious to me what the branch around
tstate->for_ordinality_col is for and what the alternate path's
purpose is in

Re: [HACKERS] Refactoring of heapam code.

2016-09-05 Thread Pavan Deolasee
On Mon, Aug 8, 2016 at 3:13 PM, Anastasia Lubennikova <
a.lubennik...@postgrespro.ru> wrote:

>
>>
> Thank you for the review, I'll fix these problems in final version.
>
> Posting the first message I intended to raise the discussion. Patches
> were attached mainly to illustrate the problem and to be more concrete.
>

I started looking at the first patch posted above, but it seems you'll
rewrite these patches after discussion on the heapam refactoring ideas you
posted here https://wiki.postgresql.org/wiki/HeapamRefactoring concludes.

Some comments anyways on the first patch:

1. It does not apply cleanly with git-apply - many white space errors
2. I don't understand the difference between PageGetItemHeapHeaderOnly()
and PageGetItemHeap(). They seem to do exactly the same thing and can be
used interchangeably.
3. While I like the idea of using separate interfaces to get heap/index
tuples from a page, maybe they can internally use a common interface
instead of duplicating what PageGetItem() does already (see the sketch
after this list).
4. Should we rename PageGetItemHeap() to PageGetHeapTuple() or something
like that?
5. If we do this, we should probably have corresponding wrapper
functions/macros for the remaining callers of PageGetItem().
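
For points 3-5, one possible shape is a pair of thin type-specific
wrappers over the existing generic accessor (hypothetical sketch, using
the name suggested in point 4):

#define PageGetHeapTuple(page, itemId) \
	((HeapTupleHeader) PageGetItem((page), (itemId)))
#define PageGetIndexTuple(page, itemId) \
	((IndexTuple) PageGetItem((page), (itemId)))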

I also looked at the refactoring design doc. Looks like a fine approach to
me, but the API will need much more elaborate discussion. I am not sure if
the interfaces as presented are enough to support everything that even
heapam can do today. And much more thought will be required to ensure we
don't miss out on things that a new primary access method may need.

A few points regarding how the proposed API maps to heapam:

- How do we support parallel scans on heap?
- Does the primary AM need to support locking of rows?
- Would there be a separate API to interpret the PAM tuple itself?
(something that htup_details.h does today). It seems natural that every
PAM will have its own interpretation of tuple structure, and we would
like to hide that inside the PAM implementation.
- There are many upper modules that need access to system attributes
(xmin, xmax for starters). How do you plan to handle that? Do you think
we can provide enough abstraction so that the callers don't need to
know the tuple structure of an individual PAM?

I don't know what to do with the CF entry itself. I could change the status
to "waiting on author" but then the design draft may not get enough
attention. So I'm leaving it in the current state for others to look at.

Thanks,
Pavan

-- 
 Pavan Deolasee   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


Re: [HACKERS] IF (NOT) EXISTS in psql-completion

2016-09-05 Thread Kyotaro HORIGUCHI
Hello,

At Sun, 4 Sep 2016 12:54:57 +0200, Pavel Stehule  
wrote in 

Re: [HACKERS] Comment Typo

2016-09-05 Thread Bruce Momjian
On Tue, Sep  6, 2016 at 10:52:22AM +0900, Amit Langote wrote:
> Attached fixes a typo in header comment in libpq-be.h.
> 
> s/libpq_be.h/libpq-be.h/g

Applied.

-- 
  Bruce Momjian  http://momjian.us
  EnterpriseDB http://enterprisedb.com

+ As you are, so once was I. As I am, so you will be. +
+ Ancient Roman grave inscription +




Re: [HACKERS] Supporting SJIS as a database encoding

2016-09-05 Thread Tsunakawa, Takayuki
> From: pgsql-hackers-ow...@postgresql.org
> [mailto:pgsql-hackers-ow...@postgresql.org] On Behalf Of Kyotaro
> HORIGUCHI
> Implementing the radix tree code, then redefining the format of the
> mapping table to support radix trees, then modifying the mapping
> generator script are needed.
> 
> If no one opposes this, I'll do that.

+100
Great analysis, and I admire your guts.  I very much appreciate your effort!

Regards
Takayuki Tsunakawa





Re: [HACKERS] Supporting SJIS as a database encoding

2016-09-05 Thread Kyotaro HORIGUCHI
Hello,

At Mon, 5 Sep 2016 19:38:33 +0300, Heikki Linnakangas  wrote 
in <529db688-72fc-1ca2-f898-b0b99e300...@iki.fi>
> On 09/05/2016 05:47 PM, Tom Lane wrote:
> > "Tsunakawa, Takayuki"  writes:
> >> Before digging into the problem, could you share your impression on
> >> whether PostgreSQL can support SJIS?  Would it be hopeless?
> >
> > I think it's pretty much hopeless.
> 
> Agreed.

+1, even as a user of SJIS:)

> But one thing that would help a little, would be to optimize the UTF-8
> -> SJIS conversion. It uses a very generic routine, with a binary
> search over a large array of mappings. I bet you could do better than
> that, maybe using a hash table or a radix tree instead of the large
> binary-searched array.

I'm very impressed by the idea. The mean number of iterations for a
binary search on the current conversion table with 8000 characters is
about 13, and the table size is under 100 kB (maybe).

A three-level array with 2-byte values would take about 1.6~2 MB of memory.

A radix tree for UTF-8 -> some-encoding conversion requires about, or
up to, the following (using a 1-byte index to point to the next level):

  1 byte * ((0x7f + 1)
            + (0xdf - 0xc2 + 1) * (0xbf - 0x80 + 1)
            + (0xef - 0xe0 + 1) * (0xbf - 0x80 + 1)^2)
  = 128 + 1920 + 65536 bytes =~ 67 kbytes.

SJIS characters are at most 2 bytes long, so about 8000 characters take
an extra 16 kB. And some padding space will be added on top of that.

As a result, a radix tree seems promising because of the small amount
of additional memory required and far fewer comparisons. Big5 and
other encodings, including the EUC-* family, will also benefit from
it.
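
To make the lookup shape concrete, here is an untested sketch with the
per-level tables flattened into plain arrays sized per the arithmetic
above; all names are illustrative, not from any patch (a real radix
tree would store 1-byte indexes into shared next-level blocks, which
is where the 67 kB figure comes from):

static uint16 map1[0x80];			/* 1-byte (ASCII) range */
static uint16 map2[30 * 64];		/* lead bytes 0xc2-0xdf */
static uint16 map3[16 * 64 * 64];	/* lead bytes 0xe0-0xef */

static uint16
utf8_to_sjis(const unsigned char *s, int len)
{
	if (len == 1)
		return map1[s[0]];
	if (len == 2)
		return map2[(s[0] - 0xc2) * 64 + (s[1] - 0x80)];
	/* len == 3 */
	return map3[((s[0] - 0xe0) * 64 + (s[1] - 0x80)) * 64 + (s[2] - 0x80)];
}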

Implementing the radix tree code, then redefining the format of the
mapping table to support radix trees, then modifying the mapping
generator script are needed.

If no one opposes this, I'll do that.

regards,

-- 
Kyotaro Horiguchi
NTT Open Source Software Center






Re: [HACKERS] Supporting SJIS as a database encoding

2016-09-05 Thread Tom Lane
"Tsunakawa, Takayuki"  writes:
> Would using multibyte functions like mb... to process characters solve
> the problem?

Well, sure.  The problem is (1) finding all the places that need that
(I'd estimate dozens to hundreds of places in the core code, and then
there's the question of extensions); (2) preventing new
non-multibyte-aware code from being introduced after you've fixed those
places; and (3) the performance penalties you'd take, because a lot of
those places are bottlenecks and it's much cheaper to not worry about
character lengths in an inner loop.

> Isn't the current implementation blocking the support of
> other character sets that have similar characteristics?

Sure, SJIS is not the only encoding that we consider frontend-only.
See

https://www.postgresql.org/docs/devel/static/multibyte.html#MULTIBYTE-CHARSET-SUPPORTED

regards, tom lane




Re: [HACKERS] Supporting SJIS as a database encoding

2016-09-05 Thread Tsunakawa, Takayuki
From: pgsql-hackers-ow...@postgresql.org
> [mailto:pgsql-hackers-ow...@postgresql.org] On Behalf Of Heikki
> But one thing that would help a little, would be to optimize the UTF-8
> -> SJIS conversion. It uses a very generic routine, with a binary search
> over a large array of mappings. I bet you could do better than that, maybe
> using a hash table or a radix tree instead of the large binary-searched
> array.

That sounds worth pursuing.  Thanks!


Regards
Takayuki Tsunakawa





Re: [HACKERS] Cache Hash Index meta page.

2016-09-05 Thread Amit Kapila
On Tue, Sep 6, 2016 at 12:20 AM, Mithun Cy  wrote:
> On Sep 2, 2016 7:38 PM, "Jesper Pedersen" 
> wrote:
>> Could you provide a rebased patch based on Amit's v5 ?
>
> Please find the the patch, based on Amit's V5.
>

I think you want to say it is based on the patch in the mail below:
https://www.postgresql.org/message-id/CAA4eK1J6b8O4PcEPqRxNYbLVbfToNMJEEm%2Bqn0jZX31-obXrJw%40mail.gmail.com

It is better if we can provide a link to the patch that the current
patch is based on; that will help people easily identify the dependent
patches.

-- 
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com




Re: [HACKERS] Supporting SJIS as a database encoding

2016-09-05 Thread Tsunakawa, Takayuki
From: Tom Lane [mailto:t...@sss.pgh.pa.us]
> "Tsunakawa, Takayuki"  writes:
> > Before digging into the problem, could you share your impression on
> > whether PostgreSQL can support SJIS?  Would it be hopeless?
> 
> I think it's pretty much hopeless.  Even if we were willing to make every
> bit of code that looks for '\' and other specific at-risk characters
> multi-byte aware (with attendant speed penalties), we could expect that
> third-party extensions would still contain vulnerable code.  More, we could
> expect that new bugs of the same ilk would get introduced all the time.
> Many such bugs would amount to security problems.  So the amount of effort
> and vigilance required seems out of proportion to the benefits.

Hmm, this sounds like a death sentence.  But as I don't have good
knowledge of character set handling yet, I'm not completely convinced
about why PostgreSQL cannot support SJIS.  I wonder why and how other
DBMSs support SJIS and what the difference in implementation is.  Would
using multibyte functions like mb... to process characters solve the
problem?  Isn't the current implementation blocking the support of
other character sets that have similar characteristics?  I'll learn
about character set handling...

> Most of the recent discussion about allowed backend encodings has run more
> in the other direction, ie, "why don't we disallow everything but
> UTF8 and get rid of all the infrastructure for multiple backend encodings?".
> I'm not personally in favor of that, but there are very few hackers who
> want to add any more overhead in this area.

Personally, I totally agree.  I want non-Unicode character sets to
disappear from the world.  But real-world business doesn't seem to
forgive the lack of SJIS...

Regards
Takayuki Tsunakawa






Re: [HACKERS] Speed up Clog Access by increasing CLOG buffers

2016-09-05 Thread Amit Kapila
On Mon, Sep 5, 2016 at 11:34 PM, Tomas Vondra
 wrote:
>
>
> On 09/05/2016 06:03 AM, Amit Kapila wrote:
>>  So, in short we have to compare three
>> approaches here.
>>
>> 1) Group mode to reduce CLOGControlLock contention
>> 2) Use granular locking model
>> 3) Use atomic operations
>>
>> For approach-1, you can use patch [1].  For approach-2, you can use
>> 0001-Improve-64bit-atomics-support patch[2] and the patch attached
>> with this mail.  For approach-3, you can use
>> 0001-Improve-64bit-atomics-support patch[2] and the patch attached
>> with this mail by commenting USE_CONTENT_LOCK.  If the third doesn't
>> work for you then for now we can compare approach-1 and approach-2.
>>
>
> OK, I can compile all three cases - but only with gcc 4.7 or newer. Sadly
> the 4-socket 64-core machine runs Debian Jessie with just gcc 4.6 and my
> attempts to update to a newer version were unsuccessful so far.
>

So which of the patches are you able to compile on the 4-socket
machine?  I think it is better to measure the performance on the
bigger machine.

>> I have done some testing of these patches for a read-write pgbench
>> workload and didn't find a big difference.  Now the interesting test
>> case could be to use a few sub-transactions (maybe 4-8) for each
>> transaction, as with that we can see more contention for
>> CLOGControlLock.
>
> Understood. So a bunch of inserts/updates interleaved by savepoints?
>

Yes.
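
(For example, a custom pgbench script whose transaction wraps each of
its updates in its own SAVEPOINT ... RELEASE pair, four to eight times
per transaction.)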

> I presume you started looking into this based on a real-world
> performance issue, right? Would that be a good test case?
>

I had started looking into it based on LWLOCK_STATS data for a
read-write workload (pgbench tpc-b).  I think it is representative of
many real-world read-write workloads.

>>
>> Few points to note for performance testing, one should use --unlogged
>> tables, else the WAL writing and WALWriteLock contention masks the
>> impact of this patch.  The impact of this patch is visible at
>> higher-client counts (say at 64~128).
>>
>
> Even on good hardware (say, PCIe SSD storage that can do thousands of
> fsyncs per second)?

Not sure, because it could be masked by WALWriteLock contention.

> Does it then make sense to try optimizing this if
> the effect can only be observed without the WAL overhead (so almost
> never in practice)?
>

It is not that there is no improvement with WAL overhead (one can
observe it via LWLOCK_STATS apart from TPS), but it is clearly visible
with unlogged tables.  The situation is not that simple, because let us
say we don't do anything about the remaining contention on
CLOGControlLock; then when we try to reduce the contention around other
locks like WALWriteLock or maybe ProcArrayLock, there is a chance that
contention will shift to CLOGControlLock.  So the basic idea is that to
get the big benefits, we need to eliminate contention around each of
the locks.


-- 
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com




Re: [HACKERS] pg_sequence catalog

2016-09-05 Thread Tom Lane
Andres Freund  writes:
> On September 5, 2016 7:26:42 AM PDT, Tom Lane  wrote:
>> The main problem I can see with this is that serial columns will have
>> default expressions that are written out as
>> "nextval('foo_f1_seq'::regclass)".  I do not think we can afford to
>> break dumps containing that, but I'm not sure how to get the regclass
>> cast replaced with a regsequence cast.

> Why not just continue having a pgclass entry, but no relfilenode?

Yeah, maybe.  I was hoping to dispense with the pg_attribute rows, but
maybe that's not enough overhead to worry about.

In this viewpoint, we'd keep the sequence-specific data in a pg_sequence
catalog.  pg_sequence rows would be extensions of the associated pg_class
rows in much the same way that pg_index rows extend the pg_class entries
for indexes.  We should supply a view pg_sequences that performs the
implied join, and encourage users to select from that rather than directly
from pg_sequence (compare pg_indexes view).

regards, tom lane




Re: [HACKERS] Patch: Implement failover on libpq connect level.

2016-09-05 Thread Mithun Cy
On Mon, Sep 5, 2016 at 4:33 PM, Aleksander Alekseev <
a.aleks...@postgrespro.ru> wrote:
>After a brief examination I've found following ways to improve the patch.
Adding to above comments.

1)
+ /*
+ * consult connection options and check if RO connection is OK
+ * RO connection is OK if readonly connection is explicitely
+ * requested or if replication option is set, or just one
+ * host is specified in the connect string.
+ */
+ if ((conn->pghost==NULL || strchr(conn->pghost,',')==NULL) ||
+ ((conn->read_only && conn->read_only[0] > '0'))
+ ||(conn->replication && conn->replication[0])
+ )
+ {

Now if only one host is in the connection string and it asks for a read-write
connection (connect to master), I mean read_only is set to 0 explicitly. With
the above logic will we allow it to connect to a standby? I still think a psql
connection to a standby should be handled by changing the default value of
readonly to 1 (which means connect to any). Thinking further, probably
replacing the readonly parameter with targetServerType=any|master (with the
default being any) should clear some confusion and bring consistency, since
the same is used in the JDBC multi-host connection string.

2)
@@ -1460,33 +1538,80 @@ connectDBStart(PGconn *conn)
   portstr,
   (int) (UNIXSOCK_PATH_BUFLEN - 1));
  conn->options_valid = false;
+ free(nodes->port);

nodes->port was not allocated at this point.

-- 
Thanks and Regards
Mithun C Y
EnterpriseDB: http://www.enterprisedb.com


Re: Install extensions using update scripts (was Re: [HACKERS] Remove superuser() checks from pgstattuple)

2016-09-05 Thread Andres Freund
On 2016-09-05 22:24:09 -0400, Tom Lane wrote:
> Ordinarily I'd be willing to stick this on the queue for the next
> commitfest, but it seems like we ought to try to get it pushed now
> so that Stephen can make use of the feature for his superuser changes.
> Thoughts?

Seems sensible to me. I can have a look at it one of the next few days
if you want.

Andres




Install extensions using update scripts (was Re: [HACKERS] Remove superuser() checks from pgstattuple)

2016-09-05 Thread Tom Lane
Andres Freund  writes:
> On September 4, 2016 6:33:30 PM PDT, Tom Lane  wrote:
>> I think nearly all of the
>> infrastructure for this is already there in extension.c.

> Yes, it doesn't sound very hard...

I poked at this a bit, and indeed it's not that hard.  Attached is a
proposed patch (code-complete, I think, but no docs as yet).
Aside from touching CREATE EXTENSION itself, this modifies the
pg_available_extension_versions view to show versions that are installable
on the basis of update chains.  I think pg_extension_update_paths() needs
no changes, though.

Ordinarily I'd be willing to stick this on the queue for the next
commitfest, but it seems like we ought to try to get it pushed now
so that Stephen can make use of the feature for his superuser changes.
Thoughts?

regards, tom lane

diff --git a/src/backend/commands/extension.c b/src/backend/commands/extension.c
index fa861e6..4d82527 100644
*** a/src/backend/commands/extension.c
--- b/src/backend/commands/extension.c
*** typedef struct ExtensionVersionInfo
*** 100,105 
--- 100,106 
  static List *find_update_path(List *evi_list,
   ExtensionVersionInfo *evi_start,
   ExtensionVersionInfo *evi_target,
+  bool reject_indirect,
   bool reinitialize);
  static void get_available_versions_for_extension(ExtensionControlFile *pcontrol,
  	 Tuplestorestate *tupstore,
*** identify_update_path(ExtensionControlFil
*** 1071,1077 
  	evi_target = get_ext_ver_info(newVersion, &evi_list);
  
  	/* Find shortest path */
! 	result = find_update_path(evi_list, evi_start, evi_target, false);
  
  	if (result == NIL)
  		ereport(ERROR,
--- 1072,1078 
  	evi_target = get_ext_ver_info(newVersion, &evi_list);
  
  	/* Find shortest path */
! 	result = find_update_path(evi_list, evi_start, evi_target, false, false);
  
  	if (result == NIL)
  		ereport(ERROR,
*** identify_update_path(ExtensionControlFil
*** 1086,1091 
--- 1087,1095 
   * Apply Dijkstra's algorithm to find the shortest path from evi_start to
   * evi_target.
   *
+  * If reject_indirect is true, ignore paths that go through installable
+  * versions (presumably, caller will consider starting from such versions).
+  *
   * If reinitialize is false, assume the ExtensionVersionInfo list has not
   * been used for this before, and the initialization done by get_ext_ver_info
   * is still good.
*** static List *
*** 1097,1102 
--- 1101,1107 
  find_update_path(List *evi_list,
   ExtensionVersionInfo *evi_start,
   ExtensionVersionInfo *evi_target,
+  bool reject_indirect,
   bool reinitialize)
  {
  	List	   *result;
*** find_update_path(List *evi_list,
*** 1105,1110 
--- 1110,1117 
  
  	/* Caller error if start == target */
  	Assert(evi_start != evi_target);
+ 	/* Caller error if reject_indirect and target is installable */
+ 	Assert(!(reject_indirect && evi_target->installable));
  
  	if (reinitialize)
  	{
*** find_update_path(List *evi_list,
*** 1131,1136 
--- 1138,1146 
  			ExtensionVersionInfo *evi2 = (ExtensionVersionInfo *) lfirst(lc);
  			int			newdist;
  
+ 			/* if reject_indirect, treat installable versions as unreachable */
+ 			if (reject_indirect && evi2->installable)
+ continue;
  			newdist = evi->distance + 1;
  			if (newdist < evi2->distance)
  			{
*** find_update_path(List *evi_list,
*** 1167,1172 
--- 1177,1239 
  }
  
  /*
+  * Given a target version that is not directly installable, find the
+  * best installation sequence starting from a directly-installable version.
+  *
+  * evi_list: previously-collected version update graph
+  * evi_target: member of that list that we want to reach
+  *
+  * Returns the best starting-point version, or NULL if there is none.
+  * On success, *best_path is set to the path from the start point.
+  *
+  * If there's more than one possible start point, prefer shorter update paths,
+  * and break any ties arbitrarily on the basis of strcmp'ing the starting
+  * versions' names.
+  */
+ static ExtensionVersionInfo *
+ find_install_path(List *evi_list, ExtensionVersionInfo *evi_target,
+   List **best_path)
+ {
+ 	ExtensionVersionInfo *evi_start = NULL;
+ 	ListCell   *lc;
+ 
+ 	/* Target should not be installable */
+ 	Assert(!evi_target->installable);
+ 
+ 	*best_path = NIL;
+ 
+ 	/* Consider all installable versions as start points */
+ 	foreach(lc, evi_list)
+ 	{
+ 		ExtensionVersionInfo *evi1 = (ExtensionVersionInfo *) lfirst(lc);
+ 		List	   *path;
+ 
+ 		if (!evi1->installable)
+ 			continue;
+ 
+ 		/*
+ 		 * Find shortest path from evi1 to evi_target; but no need to consider
+ 		 * paths going through other installable versions.
+ 		 */
+ 		path = find_update_path(evi_list, evi1, evi_target, true, true);
+ 		if (path == NIL)
+ 			continue;
+ 
+ 		/* Remember best path */
+ 		if (evi_start == NULL ||
+ 

[HACKERS] Comment Typo

2016-09-05 Thread Amit Langote
Attached fixes a typo in header comment in libpq-be.h.

s/libpq_be.h/libpq-be.h/g

Thanks,
Amit
diff --git a/src/include/libpq/libpq-be.h b/src/include/libpq/libpq-be.h
index ecdfbc6..b91eca5 100644
--- a/src/include/libpq/libpq-be.h
+++ b/src/include/libpq/libpq-be.h
@@ -1,6 +1,6 @@
 /*-
  *
- * libpq_be.h
+ * libpq-be.h
  *	  This file contains definitions for structures and externs used
  *	  by the postmaster during client authentication.
  *



Re: [HACKERS] Speedup twophase transactions

2016-09-05 Thread Michael Paquier
On Sat, Sep 3, 2016 at 10:26 PM, Michael Paquier
 wrote:
> On Fri, Sep 2, 2016 at 5:06 AM, Simon Riggs  wrote:
>> On 13 April 2016 at 15:31, Stas Kelvich  wrote:
>>
>>> Fixed patch attached. There already was infrastructure to skip currently
>>> held locks during replay of standby_redo() and I’ve extended that
>>> with a check for prepared xids.
>>
>> Please confirm that everything still works on current HEAD for the new
>> CF, so review can start.
>
> The patch does not apply cleanly. Stas, could you rebase? I am
> switching the patch to "waiting on author" for now.

So, I have just done the rebase myself and did an extra round of
reviews of the patch. Here are couple of comments after this extra
lookup.

LockGXactByXid() is aimed to be used only in recovery, so it seems
appropriate to add an assertion with RecoveryInProgress(). Using
this routine in non-recovery code paths is risky because it assumes
that a PGXACT could be missing, which is fine in recovery as prepared
transactions could be moved to twophase files because of a checkpoint,
but not in normal cases. We could also add an assertion of the kind
gxact->locking_backend == InvalidBackendId before locking the PGXACT,
but as this routine is normally used just by the startup process (in
non-standalone mode!) it's fine without.

The handling of invalidation messages and removal of relfilenodes done
in FinishGXact can be grouped together, checking only once for
!RecoveryInProgress().

+ *
+ * The same procedure happens during replication and crash recovery.
  *
"during WAL replay" is more generic and applies here.

+
+next_file:
+   continue;
+
That can be removed and replaced by a "continue;".

+   /*
+* At first check prepared tx that wasn't yet moved to disk.
+*/
+   LWLockAcquire(TwoPhaseStateLock, LW_SHARED);
+   for (i = 0; i < TwoPhaseState->numPrepXacts; i++)
+   {
+   GlobalTransaction gxact = TwoPhaseState->prepXacts[i];
+   PGXACT *pgxact = &ProcGlobal->allPgXact[gxact->pgprocno];
+
+   if (TransactionIdEquals(pgxact->xid, xid))
+   {
+   LWLockRelease(TwoPhaseStateLock);
+   return true;
+   }
+   }
+   LWLockRelease(TwoPhaseStateLock);
This overlaps with TwoPhaseGetGXact but I'd rather keep the two things
separate: it does not seem worth complicating the existing interface,
and putting that in the cache during recovery has no meaning.

I have also reworked the test format, and fixed many typos and grammar
problems in the patch as well as in the tests.

After review, the result is attached. Perhaps a committer could take a look at it?
-- 
Michael
diff --git a/src/backend/access/transam/twophase.c b/src/backend/access/transam/twophase.c
index 9f55adc..eb7c339 100644
--- a/src/backend/access/transam/twophase.c
+++ b/src/backend/access/transam/twophase.c
@@ -45,8 +45,8 @@
  *		  fsynced
  *		* If COMMIT happens after checkpoint then backend reads state data from
  *		  files
- *		* In case of crash replay will move data from xlog to files, if that
- *		  hasn't happened before. XXX TODO - move to shmem in replay also
+ *
+ *		The same procedure happens during WAL replay.
  *
  *-
  */
@@ -578,6 +578,45 @@ LockGXact(const char *gid, Oid user)
 }
 
 /*
+ * LockGXactByXid
+ *
+ * Find prepared transaction by xid and lock corresponding GXACT.
+ * This is used during recovery as an alternative to LockGXact(), and
+ * should only be used in recovery. No entries found means that a checkpoint
+ * has moved the searched prepared transaction data to a twophase file.
+ *
+ * Returns the transaction data if found, or NULL if nothing has been locked.
+ */
+static GlobalTransaction
+LockGXactByXid(TransactionId xid)
+{
+	int		i;
+	GlobalTransaction gxact = NULL;
+
+	Assert(RecoveryInProgress());
+
+	LWLockAcquire(TwoPhaseStateLock, LW_EXCLUSIVE);
+	for (i = 0; i < TwoPhaseState->numPrepXacts; i++)
+	{
+		PGXACT	   *pgxact;
+
+		gxact = TwoPhaseState->prepXacts[i];
+		pgxact = &ProcGlobal->allPgXact[gxact->pgprocno];
+
+		if (TransactionIdEquals(xid, pgxact->xid))
+		{
+			/* ok to lock it */
+			gxact->locking_backend = MyBackendId;
+			MyLockedGxact = gxact;
+			break;
+		}
+	}
+	LWLockRelease(TwoPhaseStateLock);
+
+	return gxact;
+}
+
+/*
  * RemoveGXact
  *		Remove the prepared transaction from the shared memory array.
  *
@@ -1241,9 +1280,9 @@ ReadTwoPhaseFile(TransactionId xid, bool give_warnings)
  * Reads 2PC data from xlog. During checkpoint this data will be moved to
  * twophase files and ReadTwoPhaseFile should be used instead.
  *
- * Note clearly that this function accesses WAL during normal operation, similarly
- * to the way WALSender or Logical Decoding would do. It does not run during
- * crash recovery or standby processing.
+ * Note that this function accesses WAL not only during recovery but also
+ * during normal operation, similarly to the way WALSender or 

Re: [HACKERS] PassDownLimitBound for ForeignScan/CustomScan

2016-09-05 Thread Kouhei Kaigai
> -Original Message-
> From: pgsql-hackers-ow...@postgresql.org
> [mailto:pgsql-hackers-ow...@postgresql.org] On Behalf Of Kouhei Kaigai
> Sent: Monday, September 05, 2016 12:58 PM
> To: Jeevan Chalke
> Cc: pgsql-hackers@postgresql.org; Etsuro Fujita
> Subject: Re: [HACKERS] PassDownLimitBound for ForeignScan/CustomScan
> 
> > On Mon, Aug 29, 2016 at 7:25 AM, Kouhei Kaigai  wrote:
> >
> >
> > Hello,
> >
> > The attached patch adds an optional callback to support special 
> > optimization
> > if ForeignScan/CustomScan are located under the Limit node in plan-tree.
> >
> > Our sort node wisely switches its behavior when we know in advance the
> > exact number of rows to be produced, because all the Sort node has to
> > return is the top-k rows when it is located under the Limit node.
> > It is a much more lightweight workload than sorting the entire input
> > when nrows is not small.
> >
> > In my case, this information is very useful because GPU can complete its
> > sorting operations mostly on L1-grade memory if we can preliminary know
> > the top-k value is enough small and fits to size of the fast memory.
> >
> > Probably, it is also valuable for Fujita-san's case because this 
> > information
> > allows to attach "LIMIT k" clause on the remote query of postgres_fdw.
> > It will reduce amount of the network traffic and remote CPU consumption
> > once we got support of sort pushdown.
> >
> >
> >
> > One thing we need to pay attention is cost estimation on the planner 
> > stage.
> > In the existing code, only create_ordered_paths() and
> > create_merge_append_path()
> > considers the limit clause for cost estimation of sorting. They use the
> > 'limit_tuples' of PlannerInfo; we can reference the structure when 
> > extension
> > adds ForeignPath/CustomPath, so I think we don't need a special 
> > enhancement
> > on the planner stage.
> >
> Thanks for your comments.
> 
> > I believe this hook is gets called at execution time.
> > So to push LIMIT clause like you said above we should use "limit_tuples" at 
> > the time
> > of planning and then use this hook to optimize at runtime, right?
> >
> Yes. To be more precise, a valid "limit_tuples" in PlannerInfo is set only
> when the LIMIT clause takes constant values, which is true for most use
> cases. Then the hook I added shall be called at execution time for more
> exact optimization.
> 
> If an FDW/CSP cannot accept an uncertain number of rows to generate at
> planning time, it is under no obligation to provide its own path optimized
> for a small LIMIT.
> 
> > Apart from that, attached patch applies cleanly on latest sources and found 
> > no issues
> > with make or with regressions.
> >
> > However this patch is an infrastructure for any possible optimization when
> > foreign/customscan is under LIMIT.
> >
> > So look good to me.
> >
> > I quickly tried adding a hook support in postgres_fdw, and it gets called 
> > correctly
> > when we have foreignscan with LIMIT (limit being evaluated on local server).
> >
> > So code wise no issue. Also add this hook details in documentation.
> >
> OK, I'll try to write up some detailed documentation stuff; not only API 
> specification.
>
The v2 patch is attached. It describes the role of this hook and how an
extension can utilize the LIMIT clause for further optimization at
planning and execution time.

Thanks,
--
NEC Business Creation Division / PG-Strom Project
KaiGai Kohei 


pgsql-v10-fdw-css-limit-bound.v2.patch
Description: pgsql-v10-fdw-css-limit-bound.v2.patch



Re: [HACKERS] [PATCH] COPY vs \copy HINT

2016-09-05 Thread Craig Ringer
On 5 September 2016 at 16:32, Christoph Berg  wrote:

> The part about the server permissions might be useful to hint at.

> What about
>
>  "COPY TO instructs the PostgreSQL server to write to a file on the 
> server. "
>  "You may want a client-side facility such as psql's \\copy."
>
>  "COPY FROM instructs the PostgreSQL server to read from a file on the 
> server. "
>  "You may want a client-side facility such as psql's \\copy."
>
> This stresses more explicitly that the file is opened by the server
> process. (I'm not particularly happy that it says "server" in there
> twice, but couldn't think of anything else.)

Tom, any preference here?

I'm probably inclined to go for your original wording and accept that
it's just too hard to hint at the client/server process split in a
single short message.

-- 
 Craig Ringer   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] Stopping logical replication protocol

2016-09-05 Thread Craig Ringer
On 25 August 2016 at 13:04, Craig Ringer  wrote:

> By the way, I now think that the second part of your patch, to allow
> interruption during ReorderBufferCommit processing, is also very
> desirable.

I've updated your patch, rebasing it on top of 10.0 master and
splitting it into two pieces. I thought you were planning on following
up with an update to the second part, but maybe that was unclear from
our prior discussion, so I'm doing this to help get this moving again.

The first patch is what I posted earlier - the part of your patch that
allows the walsender to exit COPY BOTH mode and return to command mode
in response to a client-initiated CopyDone when it's in the
WalSndLoop.  For logical decoding this means that it will notice a
client CopyDone when it's between transactions or while it's ingesting
transaction changes. It won't react to CopyDone while it's in
ReorderBufferCommit and sending a transaction to an output plugin
though.

The second piece is the other half of your original patch. Currently
unmodified from what you wrote. It adds a hook in ReorderBufferCommit
that calls back into the walsender to check for new client input and
interrupt processing. I haven't yet investigated this in detail.


Review comments on the 2nd patch, i.e. the 2nd half of your original patch:

* Other places in logical decoding use the CB suffix for callback
types. This should do the same.

* I'm not too keen on the name is_active  for the callback. We
discussed the name continue_decoding_cb in our prior conversation.

* Instead of having LogicalContextAlwaysActive, would it be better to
test if the callback pointer is null and skip it if so?

* Lots of typos in LogicalContextAlwaysActive comments, and wording is unclear

* comment added to arguments of pg_logical_slot_get_changes_guts is
not in PostgreSQL style - it goes way wide on the line. Probably
better to put the comment above the call and mention which argument it
refers to.

* A comment somewhere is needed - probably on the IsStreamingActive()
callback - to point readers at the fact that WalSndWriteData() calls
ProcessRepliesIfAny() and that's how IsStreamingActive can get set. I
had to look at the email discussion history to remind myself how this
worked when I was re-reading the code.

* I'm not keen on

typedef ReorderBufferIsActive LogicalDecondingContextIsActive;

  I think instead a single callback name that encompasses both should
be used. ContinueDecodingCB? (A sketch follows this list.)

* There are no tests. I don't see any really simple way to test this, though.
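
For the callback type discussed above, one possible shape (hypothetical
signature, not from the patch):

typedef bool (*ContinueDecodingCB) (struct LogicalDecodingContext *ctx);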

I suggest that you apply these updated/split patches to a fresh copy
of master then see if you can update it based on this feedback.
Something like:

git checkout master
git pull
git checkout -b stop-decoding-v10
git am /path/to/0001-Respect-client-initiated-CopyDone-in-walsender.patch
git am /path/to/0002-Second-part-of-Vladimir-Gordiychuk-s-patch.patch

then modify the code per the above, and "git commit --amend" the
changes and send an updated set of patches with "git format-patch -2"
.

I am setting this patch as "waiting on author" in the commitfest.

-- 
 Craig Ringer   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services
From 81b3f61b0235e1b936e3336e43ce55c00bc230c4 Mon Sep 17 00:00:00 2001
From: Craig Ringer 
Date: Tue, 10 May 2016 10:34:10 +0800
Subject: [PATCH 1/2] Respect client-initiated CopyDone in walsender

The walsender never looked for CopyDone sent by the client unless it
had already decided it was done sending data and dispatched its own
CopyDone message.

Check for client-initiated CopyDone when in COPY BOTH mode, returning to
command mode. In logical decoding this will allow the client to end a logical
decoding session between transactions without just unilaterally closing its
connection.

This change does not allow a client to end COPY BOTH session in the middle of
processing a logical decoding block.

TODO effect on physical walsender?
---
 src/backend/replication/walsender.c | 32 +++-
 1 file changed, 27 insertions(+), 5 deletions(-)

diff --git a/src/backend/replication/walsender.c b/src/backend/replication/walsender.c
index 1ea2a5c..c43310c 100644
--- a/src/backend/replication/walsender.c
+++ b/src/backend/replication/walsender.c
@@ -759,6 +759,13 @@ logical_read_xlog_page(XLogReaderState *state, XLogRecPtr targetPagePtr, int req
 	/* make sure we have enough WAL available */
 	flushptr = WalSndWaitForWal(targetPagePtr + reqLen);
 
+	/*
+	 * If the client sent CopyDone while we were waiting,
+	 * bail out so we can wind up the decoding session.
+	 */
+	if (streamingDoneSending)
+		return -1;
+
 	/* more than one block available */
 	if (targetPagePtr + XLOG_BLCKSZ <= flushptr)
 		count = XLOG_BLCKSZ;
@@ -1220,8 +1227,11 @@ WalSndWaitForWal(XLogRecPtr loc)
 		 * It's important to do this check after the recomputation of
 		 * RecentFlushPtr, so we can send all 

Re: [HACKERS] System load consideration before spawning parallel workers

2016-09-05 Thread Haribabu Kommi
On Fri, Sep 2, 2016 at 3:01 AM, Peter Eisentraut <
peter.eisentr...@2ndquadrant.com> wrote:

> On 8/16/16 3:39 AM, Haribabu Kommi wrote:
> > Yes, we need to consider many parameters as a system load, not just only
> > the CPU. Here I attached a POC patch that implements the CPU load
> > calculation and decide the number of workers based on the available CPU
> > load. The load calculation code is not an optimized one, there are many
> ways
> > that can used to calculate the system load. This is just for an example.
>
> I see a number of discussion points here:
>
> We don't yet have enough field experience with the parallel query
> facilities to know what kind of use patterns there are and what systems
> for load management we need.  So I think building a highly specific
> system like this seems premature.  We have settings to limit process
> numbers, which seems OK as a start, and those knobs have worked
> reasonably well in other areas (e.g., max connections, autovacuum).  We
> might well want to enhance this area, but we'll need more experience and
> information.
>

Yes, I agree that parallel query is a new feature and we cannot judge
its effects just yet.



> If we think that checking the CPU load is a useful way to manage process
> resources, why not apply this to more kinds of processes?  I could
> imagine that limiting connections by load could be useful.  Parallel
> workers is only one specific niche of this problem.
>

Yes, I agree that parallel query is only one part of the problem.

How about the postmaster calculating the CPU load (and other load
metrics) on the system and updating it in a shared location where every
backend can access the details? Using that, we can decide which
operations to control. With some GUC-specified interval, the postmaster
updates the system load, so this will not affect the performance of
other backends.
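
A rough sketch of what that shared area might hold (all names made up):

typedef struct SystemLoadShmem
{
	double		cpu_load;		/* smoothed CPU utilization, 0.0 - 1.0 */
	TimestampTz	last_updated;	/* when the postmaster last refreshed it */
} SystemLoadShmem;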



> As I just wrote in another message in this thread, I don't trust system
> load metrics very much as a gatekeeper.  They are reasonable for
> long-term charting to discover trends, but there are numerous potential
> problems for using them for this kind of resource control thing.
>
> All of this seems very platform specific, too.  You have
> Windows-specific code, but the rest seems very Linux-specific.  The
> dstat tool I had never heard of before.  There is stuff with cgroups,
> which I don't know how portable they are across different Linux
> installations.  Something about Solaris was mentioned.  What about the
> rest?  How can we maintain this in the long term?  How do we know that
> these facilities actually work correctly and not cause mysterious problems?
>

The CPU load calculation patch is a POC patch; I didn't evaluate its
behavior on all platforms.



> Maybe a couple of hooks could be useful to allow people to experiment
> with this.  But the hooks should be more general, as described above.
> But I think a few GUC settings that can be adjusted at run time could be
> sufficient as well.


With the parallel GUC settings it is possible to control the behavior:
more parallel workers improve performance when there is very little
load on the system. But if the system load increases, using more
parallel workers can add overhead instead of an improvement over the
existing behavior.

In such cases, the number of parallel workers needs to be reduced by
changing the GUC settings. Instead of that, I just thought: how about
we do the same automatically?
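
Again only a sketch of the automatic version, with made-up names and a
made-up threshold:

static int
adjust_parallel_workers(int requested)
{
	SystemLoadShmem *load = GetSystemLoadShmem();	/* hypothetical accessor */

	if (load->cpu_load > 0.8)	/* back off under high load */
		return requested / 2;
	return requested;
}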


Regards,
Hari Babu
Fujitsu Australia


Re: [HACKERS] Suggestions for first contribution?

2016-09-05 Thread David Fetter
On Mon, Sep 05, 2016 at 01:25:03PM -0400, Christian Convey wrote:
> Hi guys,
> 
> Can anyone suggest a project for my first PG contribution?

How about adding PIVOT tables?  MS SQL Server and Oracle both have
them.  If you're interested, I have some ideas about the UI parts and
a few about the implementation.

Best,
David.
-- 
David Fetter  http://fetter.org/
Phone: +1 415 235 3778  AIM: dfetter666  Yahoo!: dfetter
Skype: davidfetter  XMPP: david(dot)fetter(at)gmail(dot)com

Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate




Re: [HACKERS] pg_sequence catalog

2016-09-05 Thread Andres Freund


On September 5, 2016 7:26:42 AM PDT, Tom Lane  wrote:
>Simon Riggs  writes:
>> On 4 September 2016 at 23:17, Greg Stark  wrote:
>>> So? Clients expect changes like this between major releases surely.
>>> Subtle changes that cause silent breakage for end-users seems
>scarier
>>> than unsubtle breakage that tool authors can fix.
>
>> Agreed; some change in the behaviour if SELECT * FROM sequence is
>> effectively part of this proposal. I was going to make the same
>> comment myself.
>
>Well, if we're going to blow off compatibility on that score, I suggest
>that we blow it off all the way.  Make sequences not be relations
>anymore,
>and what you do instead of "SELECT * FROM sequence" is "SELECT * FROM
>pg_sequences WHERE seqname = 'sequence'".  Or more likely, since
>sequences
>should still belong to schemas, we need a "regsequence" OID-alias type
>like "regclass" and you do "SELECT * FROM pg_sequences WHERE oid =
>'foo.bar'::regsequence".
>
>The main problem I can see with this is that serial columns will
>have default expressions that are written out as
>"nextval('foo_f1_seq'::regclass)".  I do not think we can afford to
>break
>dumps containing that, but I'm not sure how to get the regclass cast
>replaced with a regsequence cast.

Why not just continue having a pg_class entry, but no relfilenode?
-- 
Sent from my Android device with K-9 Mail. Please excuse my brevity.


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Showing parallel status in \df+

2016-09-05 Thread Tom Lane
I wrote:
> Pavel Stehule  writes:
>> Using the footer for this purpose is a little bit strange. What about the following
>> design?
>> 1. move out source code of PL functions from \df+
>> 2. allow not unique filter in \sf and allow to display multiple functions

> Wasn't that proposed and rejected upthread?

So ... why did you put this patch in "Waiting on Author" state?  AFAIK,
we had dropped the idea of relying on \sf for this, mainly because
Peter complained about \df+ no longer providing source code.  I follow
his point: if you're used to using \df+ to see source code, you probably
can figure it out quickly if that command shows the source in a different
place than before.  But if it doesn't show it at all, using \sf instead
might not occur to you right away.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] Bug in 9.6 tuplesort batch memory growth logic

2016-09-05 Thread Peter Geoghegan
While working on my parallel CREATE INDEX patch, I came across a
problem. I initially assumed that what I saw was just a bug in my
development branch. During a final merge in a parallel worker, with
very little maintenance_work_mem, workers attempted to allocate an
amount of memory slightly less than 2 ^ 64, and fail, since that
exceeds MaxAllocHugeSize. I noticed that this happened immediately
following batchmemtuples() leaving us in a LACKMEM() state due to a
fault in its calculation (or, if you prefer, the lack of any defense
against that fault).

Further analysis led me to believe that this is a 9.6 bug. We should
have a check for sane memtuples growth, in line with the
grow_memtuples() check, where "We need to be sure that we do not cause
LACKMEM to become true, else the space management algorithm will go
nuts".  Because of the way grow_memtuples() is called by
batchmemtuples() in practice (in particular, because
availMemLessRefund may be < 0 in these corner cases),
grow_memtuples()'s own protections may not save us as previously
assumed. FREEMEM() *takes away* memory from the availMem budget when
"availMemLessRefund < 0", just after the conventional grow_memtuples()
checks run.

Attached patch adds a defense. FWIW, there is no reason to think that
more careful accounting could fix this problem, since in general a
LACKMEM() condition may not immediately lead to tuples being dumped
out.

I haven't been able to reproduce this on 9.6, but that seems
unnecessary. I can reproduce it with my parallel CREATE INDEX patch
applied, with just the right test case and right number of workers
(it's rather delicate). After careful consideration, I can think of no
reason why 9.6 would be unaffected.

-- 
Peter Geoghegan
From 850af5f9f7b1b85f98a54f5b4a757bbd00df77ec Mon Sep 17 00:00:00 2001
From: Peter Geoghegan 
Date: Sun, 4 Sep 2016 16:05:49 -0700
Subject: [PATCH] Defend against faulty batch memtuples calculation

Add a new test to batchmemtuples() that must be passed in order to
proceed with actually growing memtuples.  This avoids problems with
spurious target allocation sizes in corner cases with low memory.

Backpatch to 9.6, where the tuplesort batch memory optimization was
added.
---
 src/backend/utils/sort/tuplesort.c | 20 +++-
 1 file changed, 19 insertions(+), 1 deletion(-)

diff --git a/src/backend/utils/sort/tuplesort.c b/src/backend/utils/sort/tuplesort.c
index c8fbcf8..3478277 100644
--- a/src/backend/utils/sort/tuplesort.c
+++ b/src/backend/utils/sort/tuplesort.c
@@ -2868,6 +2868,7 @@ batchmemtuples(Tuplesortstate *state)
 
 	/* For simplicity, assume no memtuples are actually currently counted */
 	Assert(state->memtupcount == 0);
+	Assert(state->activeTapes > 0);
 
 	/*
 	 * Refund STANDARDCHUNKHEADERSIZE per tuple.
@@ -2880,6 +2881,20 @@ batchmemtuples(Tuplesortstate *state)
 	availMemLessRefund = state->availMem - refund;
 
 	/*
+	 * We need to be sure that we do not cause LACKMEM to become true, else a
+	 * negative number of bytes could be calculated as batch allocation size,
+	 * causing havoc (a negative availMemLessRefund cannot be accepted).  Note
+	 * that we are never reliant on a tape's batch allocation being large
+	 * enough to fit even a single tuple; the use of the constant
+	 * ALLOCSET_DEFAULT_INITSIZE here is not critical (a threshold like this
+	 * avoids a useless repalloc that only grows memtuples by a tiny amount,
+	 * which seems worth avoiding here in passing).
+	 */
+	if (availMemLessRefund <=
+		(int64) state->activeTapes * ALLOCSET_DEFAULT_INITSIZE)
+	   return;
+
+	/*
 	 * To establish balanced memory use after refunding palloc overhead,
 	 * temporarily have our accounting indicate that we've allocated all
 	 * memory we're allowed to less that refund, and call grow_memtuples() to
@@ -2889,8 +2904,11 @@ batchmemtuples(Tuplesortstate *state)
 	USEMEM(state, availMemLessRefund);
 	(void) grow_memtuples(state);
 	/* Should not matter, but be tidy */
+	state->growmemtuples = false;
+	/* availMem must stay accurate for spacePerTape calculation */
 	FREEMEM(state, availMemLessRefund);
-	state->growmemtuples = false;
+	if (LACKMEM(state))
+		elog(ERROR, "unexpected out-of-memory situation in tuplesort");
 
 #ifdef TRACE_SORT
 	if (trace_sort)
-- 
2.7.4


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Logical Replication WIP

2016-09-05 Thread Steve Singer

On 09/05/2016 03:58 PM, Steve Singer wrote:

On 08/31/2016 04:51 PM, Petr Jelinek wrote:

Hi,

and one more version with bug fixes, improved code docs and couple 
more tests, some general cleanup and also rebased on current master 
for the start of CF.








A few more things I noticed when playing with the patches

1. Creating a subscription to yourself ends pretty badly:
the 'CREATE SUBSCRIPTION' command seems to get stuck, and you can't kill 
it.  The background process seems to be waiting for a transaction to 
commit (I assume the create subscription command).  I had to kill -9 the 
various processes to get things to stop.  Getting confused about 
hostnames and ports is a common operator error.


2. Failures during the initial subscription  aren't recoverable

For example

on db1
  create table a(id serial4 primary key,b text);
  insert into a(b) values ('1');
  create publication testpub for table a;

on db2
  create table a(id serial4 primary key,b text);
  insert into a(b) values ('1');
  create subscription testsub connection 'host=localhost port=5440 
dbname=test' publication testpub;


I then get in my db2 log

ERROR:  duplicate key value violates unique constraint "a_pkey"
DETAIL:  Key (id)=(1) already exists.
LOG:  worker process: logical replication worker 16396 sync 16387 (PID 
10583) exited with exit code 1

LOG:  logical replication sync for subscription testsub, table a started
ERROR:  could not crate replication slot "testsub_sync_a": ERROR: 
replication slot "testsub_sync_a" already exists



LOG:  worker process: logical replication worker 16396 sync 16387 (PID 
10585) exited with exit code 1

LOG:  logical replication sync for subscription testsub, table a started
ERROR:  could not crate replication slot "testsub_sync_a": ERROR: 
replication slot "testsub_sync_a" already exists



and it keeps looping.
If I then truncate "a" on db2 it doesn't help. (I'd expect at that point 
the initial subscription to work)


If I then do on db2
 drop subscription testsub cascade;

I still see a slot in use on db1

select * FROM pg_replication_slots ;
   slot_name    |  plugin  | slot_type | datoid | database | active | active_pid | xmin | catalog_xmin | restart_lsn | confirmed_flush_lsn
----------------+----------+-----------+--------+----------+--------+------------+------+--------------+-------------+---------------------
 testsub_sync_a | pgoutput | logical   |  16384 | test     | f      |            |      |         1173 | 0/1566E08   | 0/1566E40





--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Vacuum: allow usage of more than 1GB of work mem

2016-09-05 Thread Claudio Freire
On Mon, Sep 5, 2016 at 5:36 PM, Simon Riggs  wrote:
> On 5 September 2016 at 15:50, Claudio Freire  wrote:
>> On Sun, Sep 4, 2016 at 3:46 AM, Simon Riggs  wrote:
>>> On 3 September 2016 at 04:25, Claudio Freire  wrote:
 The patch also makes vacuum free the dead_tuples before starting
 truncation. It didn't seem necessary to hold onto it beyond that
 point, and it might help give the OS more cache, especially if work
 mem is configured very high to avoid multiple index scans.
>>>
>>> How long does that part ever take? Is there any substantial gain from this?
>>>
>>> Lets discuss that as a potential second patch.
>>
>> In the test case I mentioned, it takes longer than the vacuum part itself.
>
> Please provide a test case and timings so we can see what's happening.

The referenced test case is the one I mentioned on the OP:

- createdb pgbench
- pgbench -i -s 4000 pgbench
- psql pgbench -c 'delete from pgbench_accounts;'
- vacuumdb -v -t pgbench_accounts pgbench

fsync=off, autovacuum=off, maintenance_work_mem=4GB

From what I remember, it used ~2.7GB of RAM up until the truncate
phase, where it freed it. It performed a single index scan over the
PK.

I don't remember timings, and I didn't take them, so I'll have to
repeat the test to get them. It takes all day and makes my laptop
unusably slow, so I'll post them later, but they're not very
interesting. The only interesting bit is that it does a single index
scan instead of several, which on TB-or-more tables it's kinda nice.

Btw, without a further patch to prefetch pages on the backward scan
for truncate, however, my patience ran out before it finished
truncating. I haven't submitted that patch because there was an
identical patch in an older thread that was discussed and more or less
rejected since it slightly penalized SSDs. While I'm confident my
version of the patch is a little easier on SSDs, I haven't got an SSD
at hand to confirm, so that patch is still waiting on the queue until
I get access to an SSD. The tests I'll make include that patch, so the
timing regarding truncate won't be representative of HEAD (I really
can't afford to run the tests on a scale=4000 pgbench without that
patch, it crawls, and smaller scales don't fill the dead_tuples
array).


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Vacuum: allow usage of more than 1GB of work mem

2016-09-05 Thread Simon Riggs
On 5 September 2016 at 15:50, Claudio Freire  wrote:
> On Sun, Sep 4, 2016 at 3:46 AM, Simon Riggs  wrote:
>> On 3 September 2016 at 04:25, Claudio Freire  wrote:
>>> The patch also makes vacuum free the dead_tuples before starting
>>> truncation. It didn't seem necessary to hold onto it beyond that
>>> point, and it might help give the OS more cache, especially if work
>>> mem is configured very high to avoid multiple index scans.
>>
>> How long does that part ever take? Is there any substantial gain from this?
>>
>> Lets discuss that as a potential second patch.
>
> In the test case I mentioned, it takes longer than the vacuum part itself.

Please provide a test case and timings so we can see what's happening.

-- 
Simon Riggshttp://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Logical Replication WIP

2016-09-05 Thread Steve Singer

On 08/31/2016 04:51 PM, Petr Jelinek wrote:

Hi,

and one more version with bug fixes, improved code docs and couple 
more tests, some general cleanup and also rebased on current master 
for the start of CF.






To get the 'subscription' TAP tests to pass I need to set

export PGTZ=+02

Shouldn't the expected output be with reference to PST8PDT?




--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Vacuum: allow usage of more than 1GB of work mem

2016-09-05 Thread Claudio Freire
On Mon, Sep 5, 2016 at 11:50 AM, Claudio Freire  wrote:
> On Sun, Sep 4, 2016 at 3:46 AM, Simon Riggs  wrote:
>> On 3 September 2016 at 04:25, Claudio Freire  wrote:
>>> The patch also makes vacuum free the dead_tuples before starting
>>> truncation. It didn't seem necessary to hold onto it beyond that
>>> point, and it might help give the OS more cache, especially if work
>>> mem is configured very high to avoid multiple index scans.
>>
>> How long does that part ever take? Is there any substantial gain from this?
>>
>> Lets discuss that as a potential second patch.
>
> In the test case I mentioned, it takes longer than the vacuum part itself.
>
> Other than freeing RAM there's no gain. I didn't measure any speed
> difference while testing, but that's probably because the backward
> scan doesn't benefit from the cache; other activity on the system
> might. So, depending on the workload on the server, extra available
> RAM may be a significant gain on its own or not. It just didn't seem
> there was a reason to keep that RAM reserved, especially after making
> it a huge allocation.
>
> I'm fine either way. I can remove that from the patch or leave it
> as-is. It just seemed like a good idea at the time.


Rebased and split versions attached
From 7b47b59d89bf3edaeb11c4ffe3660f3d5b7b86de Mon Sep 17 00:00:00 2001
From: Claudio Freire 
Date: Fri, 2 Sep 2016 23:21:01 -0300
Subject: [PATCH 1/2] Vacuum: allow using more than 1GB work mem

Turns the palloc for dead_tuples into a huge allocation to allow
using more than 1GB worth of dead_tuples, saving index scans on
heavily bloated tables
---
 src/backend/commands/vacuumlazy.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/src/backend/commands/vacuumlazy.c b/src/backend/commands/vacuumlazy.c
index 231e92d..2c98da4 100644
--- a/src/backend/commands/vacuumlazy.c
+++ b/src/backend/commands/vacuumlazy.c
@@ -1952,7 +1952,7 @@ lazy_space_alloc(LVRelStats *vacrelstats, BlockNumber relblocks)
 	{
 		maxtuples = (vac_work_mem * 1024L) / sizeof(ItemPointerData);
 		maxtuples = Min(maxtuples, INT_MAX);
-		maxtuples = Min(maxtuples, MaxAllocSize / sizeof(ItemPointerData));
+		maxtuples = Min(maxtuples, MaxAllocHugeSize / sizeof(ItemPointerData));
 
 		/* curious coding here to ensure the multiplication can't overflow */
 		if ((BlockNumber) (maxtuples / LAZY_ALLOC_TUPLES) > relblocks)
@@ -1969,7 +1969,7 @@ lazy_space_alloc(LVRelStats *vacrelstats, BlockNumber relblocks)
 	vacrelstats->num_dead_tuples = 0;
 	vacrelstats->max_dead_tuples = (int) maxtuples;
 	vacrelstats->dead_tuples = (ItemPointer)
-		palloc(maxtuples * sizeof(ItemPointerData));
+		MemoryContextAllocHuge(CurrentMemoryContext, maxtuples * sizeof(ItemPointerData));
 }
 
 /*
-- 
2.6.6

From 24fa0815d915cc31ae455ef60585f54a33dbd09c Mon Sep 17 00:00:00 2001
From: Claudio Freire 
Date: Fri, 2 Sep 2016 23:21:01 -0300
Subject: [PATCH 2/2] Vacuum: free dead_tuples sooner

Frees the dead_tuples array sooner, decreasing impact of huge settings
for autovacuum_work_mem or maintenance_work_mem during lazy vacuum.
---
 src/backend/commands/vacuumlazy.c | 15 +++
 1 file changed, 15 insertions(+)

diff --git a/src/backend/commands/vacuumlazy.c b/src/backend/commands/vacuumlazy.c
index 2c98da4..dbe2040 100644
--- a/src/backend/commands/vacuumlazy.c
+++ b/src/backend/commands/vacuumlazy.c
@@ -155,6 +155,7 @@ static void lazy_truncate_heap(Relation onerel, LVRelStats *vacrelstats);
 static BlockNumber count_nondeletable_pages(Relation onerel,
 		 LVRelStats *vacrelstats);
 static void lazy_space_alloc(LVRelStats *vacrelstats, BlockNumber relblocks);
+static void lazy_space_dealloc(LVRelStats *vacrelstats);
 static void lazy_record_dead_tuple(LVRelStats *vacrelstats,
 	   ItemPointer itemptr);
 static bool lazy_tid_reaped(ItemPointer itemptr, void *state);
@@ -1310,6 +1311,8 @@ lazy_scan_heap(Relation onerel, int options, LVRelStats *vacrelstats,
 	for (i = 0; i < nindexes; i++)
 		lazy_cleanup_index(Irel[i], indstats[i], vacrelstats);
 
+	lazy_space_dealloc(vacrelstats);
+
 	/* If no indexes, make log report that lazy_vacuum_heap would've made */
 	if (vacuumed_pages)
 		ereport(elevel,
@@ -1973,6 +1976,18 @@ lazy_space_alloc(LVRelStats *vacrelstats, BlockNumber relblocks)
 }
 
 /*
+ * lazy_space_dealloc - free lazy vacuum work mem
+ */
+static void
+lazy_space_dealloc(LVRelStats *vacrelstats)
+{
+	if (vacrelstats->dead_tuples != NULL) {
+		pfree(vacrelstats->dead_tuples);
+		vacrelstats->dead_tuples = NULL;
+	}
+}
+
+/*
  * lazy_record_dead_tuple - remember one deletable tuple
  */
 static void
-- 
2.6.6


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Cache Hash Index meta page.

2016-09-05 Thread Mithun Cy
On Sep 2, 2016 7:38 PM, "Jesper Pedersen" 
wrote:
> Could you provide a rebased patch based on Amit's v5 ?

Please find the patch, based on Amit's V5.

I have fixed the following things:

1. Now in "_hash_first" we check if (opaque->hasho_prevblkno ==
InvalidBlockNumber) to see if the bucket is from an older-version hash
index that has been upgraded. Since, as of now, InvalidBlockNumber is one
greater than the maximum value the variable "metap->hashm_maxbucket" can
be set to (see _hash_expandtable), we can distinguish it from the rest. I
tested the upgrade issue reported by Amit; it is fixed now.

2. One case where a bucket's hasho_prevblkno is used is where we do a
backward scan. So now, before testing the previous block number, I test
whether the current page is a bucket page; if so, we end the bucket scan
(see changes in _hash_readprev). In the other places where
hasho_prevblkno is used it is not for a bucket page, so I have not put in
any extra check to verify whether it is a bucket page.
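
Schematically, the new check looks like this (a sketch of the idea, not
the literal patch text):

    /* Sketch (not the literal patch): in _hash_readprev, stop the
     * backward scan once we are on the bucket page itself, instead of
     * following hasho_prevblkno, which is now overloaded. */
    if (opaque->hasho_flag & LH_BUCKET_PAGE)
    {
        *bufp = InvalidBuffer;
        return;
    }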


cache_hash_index_metapage_onAmit_v5_01.patch
Description: Binary data

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [PATCH] Alter or rename enum value

2016-09-05 Thread Tom Lane
Emre Hasegeli  writes:
>> I started looking at this patch.  I'm kind of unhappy with having *both*
>> IF EXISTS and IF NOT EXISTS options on the statement, especially since
>> the locations of those phrases in the syntax seem to have been chosen
>> with a dartboard.  This feels way more confusing than it is useful.
>> Is there really a strong use-case for either option?  I note that
>> ALTER TABLE RENAME COLUMN, which is probably used a thousand times
>> more often than this will be, has so far not grown either kind of option,
>> which sure makes me think that this proposal is getting ahead of itself.

> I think they can be useful.  I am writing a lot of migration scripts
> for small projects.  It really helps to be able to run parts of them
> again.  ALTER TYPE ... ADD VALUE already have IF NOT EXISTS option.  I
> don't think we would lose anything to support both of them in here.

The opportunity cost here is potential user confusion.  The only
closely parallel rename operation we have is ALTER TABLE RENAME COLUMN,
and that doesn't have a column-level IF EXISTS option; it has a
table-level IF EXISTS option.  So I think it would be weird and confusing
for ALTER TYPE RENAME VALUE to be different from that.  And again, it's
hard to get excited about having these options for RENAME VALUE when no
one has felt a need for them yet in RENAME COLUMN.  I'm especially dubious
about IF NOT EXISTS against the destination name, considering that there
isn't *any* variant of RENAME that has an equivalent of that.  If it's
really useful, why hasn't that happened?
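
For reference, the statement shapes under discussion look roughly like
this (option placement per the patch, give or take):

    ALTER TYPE rainbow RENAME VALUE 'crimson' TO 'red';

    -- with the contested options, roughly:
    ALTER TYPE rainbow RENAME VALUE IF EXISTS 'crimson' TO 'red' IF NOT EXISTS;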

I think we should leave these features out for now and wait to see if
there's really field demand for them.  If there is, it probably will
make sense to add them in more than just this one kind of RENAME.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Speed up Clog Access by increasing CLOG buffers

2016-09-05 Thread Tomas Vondra


On 09/05/2016 06:03 AM, Amit Kapila wrote:
> On Mon, Sep 5, 2016 at 3:18 AM, Tomas Vondra
>  wrote:
>> Hi,
>>
>> This thread started a year ago, different people contributed various
>> patches, some of which already got committed. Can someone please post a
>> summary of this thread, so that it's a bit more clear what needs
>> review/testing, what are the main open questions and so on?
>>
> 
> Okay, let me try to summarize this thread.  This thread started off to
> ameliorate the CLOGControlLock contention with a patch to increase the
> clog buffers to 128 (which got committed in 9.6).   Then the second
> patch was developed to use Group mode to further reduce the
> CLOGControlLock contention, latest version of which is upthread [1] (I
> have checked that version still gets applied).  Then Andres suggested
> to compare the Group lock mode approach with an alternative (more
> granular) locking model approach for which he has posted patches
> upthread [2].  There are three patches on that link, the patches of
> interest are 0001-Improve-64bit-atomics-support and
> 0003-Use-a-much-more-granular-locking-model-for-the-clog-.  I have
> checked that second one of those doesn't get applied, so I have
> rebased it and attached it with this mail.  In the more granular
> locking approach, actually, you can comment USE_CONTENT_LOCK to make
> it use atomic operations (I could not compile it by disabling
> USE_CONTENT_LOCK on my windows box, you can try by commenting that as
> well, if it works for you).  So, in short we have to compare three
> approaches here.
> 
> 1) Group mode to reduce CLOGControlLock contention
> 2) Use granular locking model
> 3) Use atomic operations
> 
> For approach-1, you can use patch [1].  For approach-2, you can use
> 0001-Improve-64bit-atomics-support patch[2] and the patch attached
> with this mail.  For approach-3, you can use
> 0001-Improve-64bit-atomics-support patch[2] and the patch attached
> with this mail by commenting USE_CONTENT_LOCK.  If the third doesn't
> work for you then for now we can compare approach-1 and approach-2.
> 

OK, I can compile all three cases - but only with gcc 4.7 or newer. Sadly
the 4-socket 64-core machine runs Debian Jessie with just gcc 4.6 and my
attempts to update to a newer version were unsuccessful so far.

> I have done some testing of these patches for read-write pgbench
> workload and doesn't find big difference.  Now the interesting test
> case could be to use few sub-transactions (may be 4-8) for each
> transaction as with that we can see more contention for
> CLOGControlLock.

Understood. So a bunch of inserts/updates interleaved by savepoints?
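
Something like this as a custom pgbench script, I suppose (standard
pgbench tables; 9.6 \set syntax; the four-savepoint shape is my guess at
what you mean):

    \set aid random(1, 100000 * :scale)
    \set delta random(-5000, 5000)
    BEGIN;
    SAVEPOINT s1;
    UPDATE pgbench_accounts SET abalance = abalance + :delta WHERE aid = :aid;
    SAVEPOINT s2;
    UPDATE pgbench_accounts SET abalance = abalance + :delta WHERE aid = :aid;
    SAVEPOINT s3;
    UPDATE pgbench_accounts SET abalance = abalance + :delta WHERE aid = :aid;
    SAVEPOINT s4;
    UPDATE pgbench_accounts SET abalance = abalance + :delta WHERE aid = :aid;
    COMMIT;

Each UPDATE after a SAVEPOINT runs in its own subtransaction, so commit
status lookups get spread across the subtransaction XIDs.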

I presume you started looking into this based on a real-world
performance issue, right? Would that be a good test case?

> 
> Few points to note for performance testing, one should use --unlogged
> tables, else the WAL writing and WALWriteLock contention masks the
> impact of this patch.  The impact of this patch is visible at
> higher-client counts (say at 64~128).
> 

Even on good hardware (say, PCIe SSD storage that can do thousands of
fsyncs per second)? Does it then make sense to try optimizing this if
the effect can only be observed without the WAL overhead (so almost
never in practice)?

regards

-- 
Tomas Vondra  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Suggestions for first contribution?

2016-09-05 Thread Pavel Stehule
Hi

2016-09-05 19:25 GMT+02:00 Christian Convey :

> Hi guys,
>
> Can anyone suggest a project for my first PG contribution?
>
> My first two ideas didn't pan out:  Yury doesn't seem to need help
> with CMake, and the TODO list's "-Wcast-align" project (now deleted)
> appeared impractical.
>
> I can continue trying things from the TODO list, but if someone knows
> of a project that's definitely worth doing, all the better.
>
> I'd prefer doing back-end coding, but I'm flexible.  The only two
> caveats are: (1) I can't work on speedup-focused changes without my
> employer's permission, and (2) I only have access to Linux boxes.
>
> Any advice would be greatly appreciated.


I wrote the XMLTABLE function, and I am thinking about a JSON_TABLE
function. But there is one blocker: missing JsonPath support in our JSON
implementation.

So, one idea: implement JsonPath support and the related JSON query
functions. This would help with better standard conformance.
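
E.g. something in this direction (entirely hypothetical syntax and
function name, loosely following the SQL standard drafts):

    -- hypothetical; neither the function nor the path syntax exists yet:
    SELECT jsonb_path_query(details, '$.items[*] ? (@.price > 100)')
      FROM orders;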

Regards

Pavel



>
> Kind regards,
> Christian
>
>
> --
> Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-hackers
>


Re: [HACKERS] Better locale-specific-character-class handling for regexps

2016-09-05 Thread Tom Lane
Heikki Linnakangas  writes:
> On 09/05/2016 07:10 PM, Tom Lane wrote:
>> In any case, this is getting very far afield from the current patch.
>> I'm willing to add a regexp.linux.utf8.sql test file if you think it's
>> important to have some canned tests that exercise this new code, but
>> otherwise I don't see any near-term solution.

> Ok, that'll do.

OK, I'll put something together.  Thanks again for reviewing.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] Suggestions for first contribution?

2016-09-05 Thread Christian Convey
Hi guys,

Can anyone suggest a project for my first PG contribution?

My first two ideas didn't pan out:  Yury doesn't seem to need help
with CMake, and the TODO list's "-Wcast-align" project (now deleted)
appeared impractical.

I can continue trying things from the TODO list, but if someone knows
of a project that's definitely worth doing, all the better.

I'd prefer doing back-end coding, but I'm flexible.  The only two
caveats are: (1) I can't work on speedup-focused changes without my
employer's permission, and (2) I only have access to Linux boxes.

Any advice would be greatly appreciated.

Kind regards,
Christian


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Better locale-specific-character-class handling for regexps

2016-09-05 Thread Heikki Linnakangas

On 09/05/2016 07:10 PM, Tom Lane wrote:

Heikki Linnakangas  writes:

On 09/04/2016 08:44 PM, Tom Lane wrote:

I guess I could follow the lead of collate.linux.utf8.sql and produce
a test that's only promised to pass on one platform with one encoding,
but I'm not terribly excited by that.  AFAIK that test file does not
get run at all in the buildfarm or in the wild.



I'm not too worried if the tests don't get run regularly, but I don't
like the idea that only works on one platform.


Well, it would work on any platform that reports high Unicode letters
as letters.  The problem for putting this into the regular regression
tests is that the generic tests don't even assume UTF8 encoding, let
alone a Unicode-ish locale.


Ah, ok. I thought there were some more special requirements.


Since we're now de facto maintainers of this regexp library, and our
version could be used somewhere else than PostgreSQL too, it would
actually be nice to have a regression suite that's independent from the
pg_regress infrastructure, and wouldn't need a server to run.


If anyone ever really picks up the challenge of making the regexp library
a standalone project, I think one of the first orders of business would be
to pull out the Tcl project's regexp-related regression tests.  There's a
pretty extensive set of tests written by Henry Spencer himself, and more
that they added over the years; it's far more comprehensive than our
tests.  (I've looked at stealing that test set in toto, but it requires
some debug APIs that we don't expose in SQL, and probably don't want to.)


Oh, interesting.


In any case, this is getting very far afield from the current patch.
I'm willing to add a regexp.linux.utf8.sql test file if you think it's
important to have some canned tests that exercise this new code, but
otherwise I don't see any near-term solution.


Ok, that'll do.

- Heikki



--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Fun fact about autovacuum and orphan temp tables

2016-09-05 Thread Bruce Momjian
On Mon, Sep  5, 2016 at 12:48:32PM -0300, Alvaro Herrera wrote:
> > The least invasive solution would be to have a guc, something like
> > 'keep_orphan_temp_tables' with boolean value.
> > Which would determine a autovacuum worker policy toward encountered orphan
> > temp tables.
> 
> The stated reason for keeping them around is to ensure you have time to
> do some forensics research in case there was something useful in the
> crashing backend.  My feeling is that if the reason they are kept around
> is not a crash but rather some implementation defect that broke end-time
> cleanup, then they don't have their purported value and I would rather
> just remove them.
> 
> I have certainly faced my fair share of customers with dangling temp
> tables, and would like to see this changed in some way or another.

I don't think we look at those temp tables frequently enough to justify
keeping them around for all users.

-- 
  Bruce Momjian  http://momjian.us
  EnterpriseDB http://enterprisedb.com

+ As you are, so once was I. As I am, so you will be. +
+ Ancient Roman grave inscription +


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Minor fix to comments

2016-09-05 Thread Bruce Momjian
On Sun, Sep  4, 2016 at 05:30:57PM -0500, Jim Nasby wrote:
> I noticed some imbalanced '-'s in execnodes.h. Though, ISTM newer code
> doesn't use -'s in comments anymore, so maybe it'd be better to just ditch
> them?

Patch applied.

-- 
  Bruce Momjian  http://momjian.us
  EnterpriseDB http://enterprisedb.com

+ As you are, so once was I. As I am, so you will be. +
+ Ancient Roman grave inscription +


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Obsolete TODO item "-Wcast-align" ?

2016-09-05 Thread Bruce Momjian
On Sun, Sep  4, 2016 at 06:10:23PM -0400, Christian Convey wrote:
> On Sun, Sep 4, 2016 at 5:56 PM, Tom Lane  wrote:
> > Christian Convey  writes:
> >> I chose this item from the TODO page: "[E] Remove warnings created by
> >> -Wcast-align".  It didn't have a check-mark after the "[E]", which I
> >> took to mean it's an outstanding issue.
> >> However, I'm starting to wonder if it's already been fixed:
> >
> > No, but you only see it on some platforms/compilers.  On my OS X laptop
> > (clang-based not gcc-based compiler), turning that on generates just a
> > hair short of 13000 warnings :-(
> >
> > I think that TODO item is indeed obsolete, but more in the direction
> > of "we're never gonna do that".  There are too many places where we
> > do need to cast up from generic pointer to specific structure pointer,
> > and there doesn't seem to be any practical way to tell a compiler which
> > instances are useful to warn about.
> 
> Thanks for the response.  I'm unclear about how the TODO list is
> curated.  Is there someone whose attention I should direct to this
> thread?

Someone has removed the item.  It is a wiki so anyone can add/remove
things.

-- 
  Bruce Momjian  http://momjian.us
  EnterpriseDB http://enterprisedb.com

+ As you are, so once was I. As I am, so you will be. +
+ Ancient Roman grave inscription +


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Supporting SJIS as a database encoding

2016-09-05 Thread Heikki Linnakangas

On 09/05/2016 05:47 PM, Tom Lane wrote:

"Tsunakawa, Takayuki"  writes:

Before digging into the problem, could you share your impression on
whether PostgreSQL can support SJIS?  Would it be hopeless?


I think it's pretty much hopeless.


Agreed.

But one thing that would help a little would be to optimize the UTF-8
-> SJIS conversion. It uses a very generic routine, with a binary search 
over a large array of mappings. I bet you could do better than that, 
maybe using a hash table or a radix tree instead of the large 
binary-searched array.
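
Roughly along these lines (a sketch only, with invented names; not the
existing conversion code):

    #include "postgres.h"

    typedef struct MapEntry
    {
        uint32 utf8;  /* key: UTF-8 bytes packed into a uint32 */
        uint32 sjis;  /* mapped SJIS bytes; 0 means empty slot */
    } MapEntry;

    #define MAP_HASH_SIZE 65536   /* power of two, > 2x the mapping count */

    static uint32
    map_lookup(const MapEntry *table, uint32 utf8)
    {
        uint32 h = (utf8 * 2654435761u) & (MAP_HASH_SIZE - 1);

        /* linear probing; the table is kept under half full at build
         * time, so the probe always reaches an empty slot */
        while (table[h].sjis != 0)
        {
            if (table[h].utf8 == utf8)
                return table[h].sjis;
            h = (h + 1) & (MAP_HASH_SIZE - 1);
        }
        return 0;   /* no mapping; the caller raises the conversion error */
    }

The table would be built once (or generated into the mapping header), so
each lookup becomes O(1) instead of O(log n).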


- Heikki



--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] less expensive pg_buffercache on big shmem

2016-09-05 Thread Ivan Kartyshov

On 09/03/2016 05:04 AM, Tomas Vondra wrote:
> This patch needs a rebase, as 06d7fd6e bumped the version to 1.2.
Thank you for a valuable hint.

> > If we will replace consistent method, then we should replace it 
with the

> > partially consistent method (called "nonconsistent") because:
> > 1) it's based on fast spinlocks (it's not fully up to its name, though)
>
> In other words, it does exactly the thing proposed up-thread, i.e.
> locking only buffer headers.
>
> What do you mean by fast spinlocks? And why aren't they up to the name?

Not the spinlocks; the name "nonconsistent" is somewhat misleading. At
the moment we can't implement concurrent shared-memory access without
locks in general, so the most inconsistent method that has been proposed
was the "nonconsistent" one. But roughly speaking, *nonconsistent* is not
quite true to its name, because it still contains a locking mechanism
(spinlocks).


> > 2) it's *almost* the fastest one (the less time needed for execution of
> > method, the less data core will change and as a consequence the more
> > consistent snapshot will be)
>
> I'm not sure I understand what you're saying here? What do you mean by
> "almost the fastest one"?

I mean the fastest one of those that have been proposed ("consistent",
"semiconsistent", "nonconsistent").


> I'm a bit confused by the results you reported before, i.e. that
>
> 1) nonconsistent is 10% faster than consistent method
> 2) semiconsistent is 5-times slower than nonconsistent method
>
> What does that mean? Are you refering to duration of the queries reading
> data from pg_buffercache, or to impact on the other queries?

Here I mean the execution time (the duration of the query itself).

> How can be semiconsistent 5x slower than nonconsistent? Wouldn't that
> make it significantly slower than the consistent method?

Firstly, when we want to measure the quality of pg_buffercache, we must
measure several values:

1) Execution time (duration of the queries reading data from pg_buffercache)
2) How it influences the system (and other queries) during its work

Secondly, the semiconsistent method is slower than the nonconsistent and
consistent methods, but it has less influence on other queries than the
consistent one.

Thirdly, it is slower because it locks the partitions of shared memory in
a different way than the consistent or nonconsistent methods do. The
semiconsistent strategy implies that only one buffer is locked at a time.
Therefore it has no significant effect on the system, but it takes more
time.
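
Schematically (copy_buffer_info and snapshot[] are invented here, shown
only to illustrate the locking granularity):

    for (i = 0; i < NBuffers; i++)
    {
        BufferDesc *bufHdr = GetBufferDescriptor(i);
        uint32      buf_state = LockBufHdr(bufHdr);

        copy_buffer_info(&snapshot[i], bufHdr); /* invented helper */
        UnlockBufHdr(bufHdr, buf_state);
        /* other backends only ever wait on one header at a time */
    }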



---
Ivan Kartyshov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Better locale-specific-character-class handling for regexps

2016-09-05 Thread Tom Lane
Heikki Linnakangas  writes:
> On 09/04/2016 08:44 PM, Tom Lane wrote:
>> I guess I could follow the lead of collate.linux.utf8.sql and produce
>> a test that's only promised to pass on one platform with one encoding,
>> but I'm not terribly excited by that.  AFAIK that test file does not
>> get run at all in the buildfarm or in the wild.

> I'm not too worried if the tests don't get run regularly, but I don't 
> like the idea that only works on one platform.

Well, it would work on any platform that reports high Unicode letters
as letters.  The problem for putting this into the regular regression
tests is that the generic tests don't even assume UTF8 encoding, let
alone a Unicode-ish locale.

> Since we're now de facto maintainers of this regexp library, and our 
> version could be used somewhere else than PostgreSQL too, it would 
> actually be nice to have a regression suite that's independent from the 
> pg_regress infrastructure, and wouldn't need a server to run.

If anyone ever really picks up the challenge of making the regexp library
a standalone project, I think one of the first orders of business would be
to pull out the Tcl project's regexp-related regression tests.  There's a
pretty extensive set of tests written by Henry Spencer himself, and more
that they added over the years; it's far more comprehensive than our
tests.  (I've looked at stealing that test set in toto, but it requires
some debug APIs that we don't expose in SQL, and probably don't want to.)

In any case, this is getting very far afield from the current patch.
I'm willing to add a regexp.linux.utf8.sql test file if you think it's
important to have some canned tests that exercise this new code, but
otherwise I don't see any near-term solution.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Fun fact about autovacuum and orphan temp tables

2016-09-05 Thread Alvaro Herrera
Grigory Smolkin wrote:
> 
> On 09/05/2016 04:34 PM, Alvaro Herrera wrote:
> >Grigory Smolkin wrote:
> >
> >>Funny part is that it never drops them. So when backend is finally
> >>terminated, it tries to drop them and fails with error:
> >>
> >>FATAL:  out of shared memory
> >>HINT:  You might need to increase max_locks_per_transaction
> >>
> >>If I understand that rigth, we are trying to drop all these temp tables in
> >>one transaction and running out of locks to do so.
> >Hmm, yeah, I suppose it does that, and it does seem pretty inconvenient.
> >It is certainly pointless to hold onto these locks for temp tables.  I
> >wonder how ugly would be to fix this problem ...
> >
> 
> Thank you for your interest in this problem.
> I dont think this is a source of problem. Ugly fix here would only force
> backend to terminate properly.
> It will not help at all in cause of server crash or power outage.
> We need a way to tell autovacuum, that we don`t need orphan temp tables, so
> they can be removed using existing routine.

It is always possible to drop the containing schemas; and as soon as
some other backend uses the BackendId 15 (in your example) the tables
would be removed anyway.  This only becomes a longstanding problem when
the crashing backend uses a high-numbered BackendId that's not reused
promptly enough.
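
That is, a superuser can simply run:

    DROP SCHEMA pg_temp_15 CASCADE;

and the tables go away.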

> The least invasive solution would be to have a guc, something like
> 'keep_orphan_temp_tables' with boolean value.
> Which would determine a autovacuum worker policy toward encountered orphan
> temp tables.

The stated reason for keeping them around is to ensure you have time to
do some forensics research in case there was something useful in the
crashing backend.  My feeling is that if the reason they are kept around
is not a crash but rather some implementation defect that broke end-time
cleanup, then they don't have their purported value and I would rather
just remove them.

I have certainly faced my fair share of customers with dangling temp
tables, and would like to see this changed in some way or another.

-- 
Álvaro Herrerahttp://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Optimization for lazy_scan_heap

2016-09-05 Thread Masahiko Sawada
On Mon, Sep 5, 2016 at 6:27 PM, Simon Riggs  wrote:
> On 4 August 2016 at 05:57, Masahiko Sawada  wrote:
>
>> While reviewing freeze map code, Andres pointed out[1] that
>> lazy_scan_heap could accesses visibility map twice and its logic is
>> seems a bit tricky.
>> As discussed before, it's not nice especially when large relation is
>> entirely frozen.
>>
>> I posted the patch for that before but since this is an optimization,
>> not bug fix, I'd like to propose it again.
>> Please give me feedback.
>>
>> [1] 
>> https://www.postgresql.org/message-id/20160505000840.epatoq6d2e556446%40alap3.anarazel.de
>
> If we have a freeze map now, surely tables will no longer be entirely frozen?

Well, if the table is completely frozen, freezing of all pages will be skipped.

> What performance difference does this make, in a realistic use case?

I have yet to measure the performance effect, but it should be an
improvement for a very large frozen table.

> How would we test this to check it is exactly correct?

One possible idea is that we emit the number of pages skipped according
to the visibility map in the vacuum verbose message, and check that.

Regards,

--
Masahiko Sawada
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Vacuum: allow usage of more than 1GB of work mem

2016-09-05 Thread Claudio Freire
On Sun, Sep 4, 2016 at 3:46 AM, Simon Riggs  wrote:
> On 3 September 2016 at 04:25, Claudio Freire  wrote:
>> The patch also makes vacuum free the dead_tuples before starting
>> truncation. It didn't seem necessary to hold onto it beyond that
>> point, and it might help give the OS more cache, especially if work
>> mem is configured very high to avoid multiple index scans.
>
> How long does that part ever take? Is there any substantial gain from this?
>
> Lets discuss that as a potential second patch.

In the test case I mentioned, it takes longer than the vacuum part itself.

Other than freeing RAM there's no gain. I didn't measure any speed
difference while testing, but that's probably because the backward
scan doesn't benefit from the cache; other activity on the system
might. So, depending on the workload on the server, extra available
RAM may be a significant gain on its own or not. It just didn't seem
there was a reason to keep that RAM reserved, especially after making
it a huge allocation.

I'm fine either way. I can remove that from the patch or leave it
as-is. It just seemed like a good idea at the time.


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Supporting SJIS as a database encoding

2016-09-05 Thread Tom Lane
"Tsunakawa, Takayuki"  writes:
> Before digging into the problem, could you share your impression on
> whether PostgreSQL can support SJIS?  Would it be hopeless?

I think it's pretty much hopeless.  Even if we were willing to make every
bit of code that looks for '\' and other specific at-risk characters
multi-byte aware (with attendant speed penalties), we could expect that
third-party extensions would still contain vulnerable code.  More, we
could expect that new bugs of the same ilk would get introduced all the
time.  Many such bugs would amount to security problems.  So the amount of
effort and vigilance required seems out of proportion to the benefits.

Most of the recent discussion about allowed backend encodings has run
more in the other direction, ie, "why don't we disallow everything but
UTF8 and get rid of all the infrastructure for multiple backend
encodings?".  I'm not personally in favor of that, but there are very
few hackers who want to add any more overhead in this area.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] new autovacuum criterion for visible pages

2016-09-05 Thread Simon Riggs
On 12 August 2016 at 01:01, Tom Lane  wrote:
> Michael Paquier  writes:
>> In short, autovacuum will need to scan by itself the VM of each
>> relation and decide based on that.
>
> That seems like a worthwhile approach to pursue.  The VM is supposed to be
> small, and if you're worried it isn't, you could sample a few pages of it.
> I do not think any of the ideas proposed so far for tracking the
> visibility percentage on-the-fly are very tenable.

Sounds good, but we can't scan the VM for every table, every minute.
We need to record something that will tell us how many VM bits have
been cleared, which will then allow autovac to do a simple SELECT to
decide what needs vacuuming.
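
Then autovacuum's check could be as simple as (hypothetical column;
nothing like it exists yet):

    -- hypothetical: n_vm_bits_cleared does not exist today
    SELECT relid::regclass
      FROM pg_stat_all_tables
     WHERE n_vm_bits_cleared > 1000;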

Vik's proposal to keep track of the rows inserted seems like the best
approach to this issue.

-- 
Simon Riggshttp://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] pg_sequence catalog

2016-09-05 Thread Tom Lane
Simon Riggs  writes:
> On 4 September 2016 at 23:17, Greg Stark  wrote:
>> So? Clients expect changes like this between major releases surely.
>> Subtle changes that cause silent breakage for end-users seems scarier
>> than unsubtle breakage that tool authors can fix.

> Agreed; some change in the behaviour of SELECT * FROM sequence is
> effectively part of this proposal. I was going to make the same
> comment myself.

Well, if we're going to blow off compatibility on that score, I suggest
that we blow it off all the way.  Make sequences not be relations anymore,
and what you do instead of "SELECT * FROM sequence" is "SELECT * FROM
pg_sequences WHERE seqname = 'sequence'".  Or more likely, since sequences
should still belong to schemas, we need a "regsequence" OID-alias type
like "regclass" and you do "SELECT * FROM pg_sequences WHERE oid =
'foo.bar'::regsequence".

The main problem I can see with this is that serial columns will
have default expressions that are written out as
"nextval('foo_f1_seq'::regclass)".  I do not think we can afford to break
dumps containing that, but I'm not sure how to get the regclass cast
replaced with a regsequence cast.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Index Onlys Scan for expressions

2016-09-05 Thread Ildar Musin

Hi Tomas,

On 03.09.2016 14:37, Tomas Vondra wrote:

Hi Ildar,

I've looked at this patch again today to do a bit more thorough review,
and I think it's fine. There are a few comments (particularly in the new
code in check_index_only) that need improving, and also a few small
tweaks in the walkers.

Attached is a modified v5 patch with the proposed changes - it's
probably easier than explaining what the changes should/might be.

I've added an XXX comment in check_index_only_expr_walker - ISTM we're
first explicitly matching  Vars to index attributes, and then dealing
with expressions. But we loop over all index columns (including those
that can only really match Vars). Not sure if it makes any difference,
but is it worth differentiating between Var and non-Var expressions? Why
not to simply call match_index_to_operand() in both cases?



Thank you! Yes, you're right, and it's probably not worth it. I intended
to optimize just a little, since we already have the attribute numbers
that can be returned by the index. But as you have already noted in the
comment, it is a cheap check, and the difference would barely be
noticeable.



I've also tweaked a few comments to match project code style, and moved
a few variables into the block where they are used. But the latter is
probably matter of personal taste, I guess.


regards



Thanks for that. I have some difficulty expressing myself in English, so
it was very helpful.


Best regards,
--
Ildar Musin
i.mu...@postgrespro.ru


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] pg_sequence catalog

2016-09-05 Thread Simon Riggs
On 4 September 2016 at 23:17, Greg Stark  wrote:
> On Wed, Aug 31, 2016 at 3:01 PM, Tom Lane  wrote:
>>
>> Uh, not as subtly as all that, because "select * from sequence" will
>> now return a different set of columns, which will flat out break a
>> lot of clients that use that method to get sequence properties.
>
> So? Clients expect changes like this between major releases surely.
> Subtle changes that cause silent breakage for end-users seems scarier
> than unsubtle breakage that tool authors can fix.

Agreed; some change in the behaviour of SELECT * FROM sequence is
effectively part of this proposal. I was going to make the same
comment myself.

I think we should balance the problems caused by not having a sequence
catalog against the problems caused by changing the currently somewhat
broken situation.

I vote in favour of applying Peter's idea/patch, though if that is not
acceptable I don't think its worth spending further time on for any of
us, so if thats the case lets Return with Feedback and move on.

-- 
Simon Riggshttp://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] INSERT .. SET syntax

2016-09-05 Thread Vik Fearing
On 09/05/2016 03:58 PM, Simon Riggs wrote:
> On 3 July 2016 at 20:36, Marko Tiikkaja  wrote:
> 
>> Here's a patch for $SUBJECT.  I'll probably work on the docs a bit more
>> before the next CF, but I thought I'd post it anyway.
> 
> I think this should be Returned With Feedback.

You're probably right (even though you're quoting the wrong message), so
I've changed it to that.

Marko, please resubmit an updated patch.
-- 
Vik Fearing  +33 6 46 75 15 36
http://2ndQuadrant.fr PostgreSQL : Expertise, Formation et Support


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Fun fact about autovacuum and orphan temp tables

2016-09-05 Thread Grigory Smolkin


On 09/05/2016 04:34 PM, Alvaro Herrera wrote:

Grigory Smolkin wrote:


The funny part is that it never drops them. So when the backend is finally
terminated, it tries to drop them and fails with error:

FATAL:  out of shared memory
HINT:  You might need to increase max_locks_per_transaction

If I understand that right, we are trying to drop all these temp tables in
one transaction and running out of locks to do so.

Hmm, yeah, I suppose it does that, and it does seem pretty inconvenient.
It is certainly pointless to hold onto these locks for temp tables.  I
wonder how ugly would be to fix this problem ...



Thank you for your interest in this problem.
I don't think this is the source of the problem. An ugly fix here would
only force the backend to terminate properly.

It will not help at all in case of a server crash or power outage.
We need a way to tell autovacuum that we don't need orphan temp tables,
so they can be removed using the existing routine.


The least invasive solution would be to have a GUC, something like
'keep_orphan_temp_tables' with a boolean value, which would determine the
autovacuum workers' policy toward encountered orphan temp tables.
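
I.e., something like this in postgresql.conf (name and default obviously
up for discussion):

    # hypothetical setting; it does not exist today
    #   on  - keep orphan temp tables for forensics (current behavior)
    #   off - let autovacuum drop orphan temp tables when it finds them
    keep_orphan_temp_tables = on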


--
Grigory Smolkin
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company



Re: [HACKERS] INSERT .. SET syntax

2016-09-05 Thread Simon Riggs
On 3 July 2016 at 20:36, Marko Tiikkaja  wrote:

> Here's a patch for $SUBJECT.  I'll probably work on the docs a bit more
> before the next CF, but I thought I'd post it anyway.

I think this should be Returned With Feedback.

-- 
Simon Riggshttp://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Index Onlys Scan for expressions

2016-09-05 Thread Ildar Musin

Hi Vladimir,

On 03.09.2016 19:31, Vladimir Sitnikov wrote:

Ildar>The reason why this doesn't work is that '~~' operator (which is a
Ildar>synonym for 'like') isn't supported by operator class for btree. Since
Ildar>the only operators supported by btree are <, <=, =, >=, >, you can use
Ildar>it with queries like:

Ildar>And in 3rd query 'OFFSET' statement prevents rewriter from
Ildar>transforming the query, so it is possible to use index only scan on
Ildar>subquery and then filter the result of subquery with '~~' operator.

I'm afraid I do not follow you.
Note: query 3 is 100% equivalent to query 2; however, query 3 takes 55
times fewer reads.
It looks like either an optimizer bug, or some missing feature in the
"index only scan" logic.

Here's quote from "query 2" (note % are at both ends):  ... where
type=42) as x where upper_vc like '%ABC%';

Note: I do NOT use "indexed scan" for the like operator. I'm very well aware
that LIKE patterns with leading % cannot be optimized to a btree range scan.
What I want is "use the first indexed column as index scan, then use the
second column
for filtering".

As shown in "query 2" vs "query 3", PostgreSQL cannot come up with such
a plan on its own
for some reason.

This is not a theoretical issue, but it is something that I use a lot
with Oracle DB (it just creates a good plan for "query 2").

Vladimir


Thanks, I get it now. The reason why it acts like this is that I used the
match_clause_to_index() function to determine whether an index-only scan
can be used with the specified clauses. This function, among other things,
checks whether the operator matches the index opfamily. Apparently this
isn't correct. I wrote another prototype to test your case, and it seems
to work. But it's not ready for the public yet; I'll publish it in 1-2
days.


--
Ildar Musin
i.mu...@postgrespro.ru


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Fun fact about autovacuum and orphan temp tables

2016-09-05 Thread Alvaro Herrera
Grigory Smolkin wrote:

> The funny part is that it never drops them. So when the backend is finally
> terminated, it tries to drop them and fails with error:
> 
> FATAL:  out of shared memory
> HINT:  You might need to increase max_locks_per_transaction
> 
> If I understand that right, we are trying to drop all these temp tables in
> one transaction and running out of locks to do so.

Hmm, yeah, I suppose it does that, and it does seem pretty inconvenient.
It is certainly pointless to hold onto these locks for temp tables.  I
wonder how ugly it would be to fix this problem ...

-- 
Álvaro Herrera   http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Fun fact about autovacuum and orphan temp tables

2016-09-05 Thread Vik Fearing
On 09/05/2016 01:54 PM, Grigory Smolkin wrote:
> What is the purpose of keeping orphan tables around and not dropping
> them on the spot?

You can read the discussion about it here:
https://www.postgresql.org/message-id/flat/3507.1214581513%40sss.pgh.pa.us
-- 
Vik Fearing  +33 6 46 75 15 36
http://2ndQuadrant.fr PostgreSQL : Expertise, Formation et Support


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] Fun fact about autovacuum and orphan temp tables

2016-09-05 Thread Grigory Smolkin

Hello, hackers!

We were testing how well some application works with PostgreSQL and 
stumbled upon an autovacuum behavior which I fail to understand.
The application in question has a habit of heavily using temporary 
tables in funny ways.

For example, it creates A LOT of them, which is OK.
The funny part is that it never drops them. So when the backend is finally 
terminated, it tries to drop them and fails with this error:


FATAL:  out of shared memory
HINT:  You might need to increase max_locks_per_transaction

If I understand that right, we are trying to drop all these temp tables 
in one transaction and running out of locks to do so.
After that, postgresql.log is flooded at a rate of 1k/s with messages 
like this:


LOG:  autovacuum: found orphan temp table "pg_temp_15"."tt38147" in 
database "DB_TEST"


It produces a noticeable load on the system, and it's getting worse with 
every terminated backend or restart.
I did some RTFS and it appears that autovacuum has no intention of 
cleaning those orphan tables unless it's wraparound time:

src/backend/postmaster/autovacuum.c (around lines 2037-2067):

        /* We just ignore it if the owning backend is still active */
        if (backendID == MyBackendId || BackendIdGetProc(backendID) == NULL)
        {
            /*
             * We found an orphan temp table (which was probably left
             * behind by a crashed backend).  If it's so old as to need
             * vacuum for wraparound, forcibly drop it.  Otherwise just
             * log a complaint.
             */
            if (wraparound)
            {
                ObjectAddress object;

                ereport(LOG,
                        (errmsg("autovacuum: dropping orphan temp table \"%s\".\"%s\" in database \"%s\"",
                                get_namespace_name(classForm->relnamespace),
                                NameStr(classForm->relname),
                                get_database_name(MyDatabaseId))));
                object.classId = RelationRelationId;
                object.objectId = relid;
                object.objectSubId = 0;
                performDeletion(&object, DROP_CASCADE,
                                PERFORM_DELETION_INTERNAL);
            }
            else
            {
                ereport(LOG,
                        (errmsg("autovacuum: found orphan temp table \"%s\".\"%s\" in database \"%s\"",
                                get_namespace_name(classForm->relnamespace),
                                NameStr(classForm->relname),
                                get_database_name(MyDatabaseId))));
            }
        }


What is more troubling is that pg_statistic is starting to bloat badly.

LOG:  automatic vacuum of table "DB_TEST.pg_catalog.pg_statistic": index 
scans: 0

pages: 0 removed, 68225 remain, 0 skipped due to pins
tuples: 0 removed, 2458382 remain, 2408081 are dead but not yet 
removable

buffer usage: 146450 hits, 31 misses, 0 dirtied
avg read rate: 0.010 MB/s, avg write rate: 0.000 MB/s
system usage: CPU 3.27s/6.92u sec elapsed 23.87 sec

What is the purpose of keeping orphan tables around and not dropping 
them on the spot?



--
Grigory Smolkin
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company



Re: [HACKERS] OpenSSL 1.1 breaks configure and more

2016-09-05 Thread Heikki Linnakangas

On 09/05/2016 03:12 AM, Andreas Karlsson wrote:

On 08/30/2016 08:42 AM, Heikki Linnakangas wrote:

There's the ResourceOwner mechanism, see src/backend/utils/resowner/.
That would be the proper way to do this. Call
RegisterResourceReleaseCallback() when the context is allocated, and
have the callback free it. One pitfall to watch out for is that
RegisterResourceReleaseCallback() itself calls palloc(), and can error
out, so you have to do things in such an order that you don't leak in
that case either.

Want to take a stab at that?

Another approach is to put each allocated context in a list or array in a
global variable, and to register a callback to be called at
end-of-(sub)transaction, which closes all the contexts. But the resource
owner mechanism is probably easier.

There's also PG_TRY-CATCH, that you could maybe use in the callers of
px_find_digest(), to make sure they call px_free_digest() even on error.
But that also seems difficult to use with the pgp_encrypt() pipeline.


Sure, I have attached a patch where I try to use it.


Thanks! Unfortunately the callback mechanism is a bit more complicated 
to use than that. The list of registered callbacks is global, so the 
callback gets called for every ResourceOwner that's closed, not just the 
one that was active when you registered it. Also, unregistering the 
callback from within the callback is not safe. I fixed those things in 
the (first) attached patch.
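To make the shape of the fix concrete, the per-owner filtering looks
roughly like this (a sketch under the assumption that each handle
remembers the ResourceOwner that was current when it was created; the
actual patch may differ in details):

    typedef struct OSSLDigest
    {
        EVP_MD_CTX   *ctx;
        ResourceOwner owner;    /* owner at allocation time */
    } OSSLDigest;

    static void
    digest_release_callback(ResourceReleasePhase phase, bool isCommit,
                            bool isTopLevel, void *arg)
    {
        OSSLDigest *digest = (OSSLDigest *) arg;

        /* The callback list is global: this fires for every owner. */
        if (phase != RESOURCE_RELEASE_AFTER_LOCKS)
            return;
        if (digest->owner != CurrentResourceOwner)
            return;             /* belongs to some other owner */

        /* EVP_MD_CTX_destroy is spelled EVP_MD_CTX_free in 1.1 */
        EVP_MD_CTX_destroy(digest->ctx);
        digest->ctx = NULL;
        /* and do NOT call UnregisterResourceReleaseCallback() here */
    }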


On 09/05/2016 03:23 AM, Tom Lane wrote:

Judging by the number of people who have popped up recently with their
own OpenSSL 1.1 patches, I think there is going to be a lot of demand for
back-patching some sort of 1.1 support into our back branches.  All this
talk of refactoring does not sound very back-patchable.  Should we be
thinking of what we can extract that is back-patchable?


Yes, I think you're right.

The minimum set of changes is the first of the attached patches. It 
includes Andreas' first patch, with the configure changes and other 
changes needed to compile, and a working version of the resourceowner 
callback mechanism to make sure we don't leak OpenSSL handles in pgcrypto.


Maybe we could get away without the resourceowner mechanism, and just 
accept the minor memory leaks in the rare error case (out-of-memory 
might be the only situation where that happens). Or #ifdef it so that we 
keep the old embed-in-palloced-struct approach for OpenSSL versions 
older than 1.1. But on the whole, I'd prefer to keep the code the same 
in all branches.


The second patch attached here includes Andreas' second and third 
patches, to silence deprecation warnings. That's not strictly required, 
but seems safe enough that I think we should back-patch that too. It 
needs one additional #ifdef version check in generate_dh_params(), if we 
want it to still work with OpenSSL 0.9.7, but that's easy. I'm assuming 
we want to still support it in back-branches, even though we just 
dropped it from master.


- Heikki
From 4534d01a0471d7378100cddebb5641800950e6c6 Mon Sep 17 00:00:00 2001
From: Heikki Linnakangas 
Date: Mon, 5 Sep 2016 13:46:15 +0300
Subject: [PATCH 1/2] Fix compilation with OpenSSL 1.1.

- Check for SSL_new in configure, now that SSL_library_init is a macro.
- Do not access struct members directly. This includes some new code in
  pgcrypto, to use the resource owner mechanism to ensure that we don't
  leak OpenSSL handles, now that we don't palloc them as part of other
  structs anymore.
- RAND_SSLeay was renamed to RAND_OpenSSL

Andreas Karlsson, with some changes by me.
---
 configure|  44 ++--
 configure.in |   4 +-
 contrib/pgcrypto/openssl.c   | 115 +++
 contrib/sslinfo/sslinfo.c|  14 +---
 src/backend/libpq/be-secure-openssl.c|  39 +++
 src/interfaces/libpq/fe-secure-openssl.c |  39 +++
 6 files changed, 181 insertions(+), 74 deletions(-)

diff --git a/configure b/configure
index 45c8eef..caf6f26 100755
--- a/configure
+++ b/configure
@@ -9538,9 +9538,9 @@ else
   as_fn_error $? "library 'crypto' is required for OpenSSL" "$LINENO" 5
 fi
 
- { $as_echo "$as_me:${as_lineno-$LINENO}: checking for SSL_library_init in -lssl" >&5
-$as_echo_n "checking for SSL_library_init in -lssl... " >&6; }
-if ${ac_cv_lib_ssl_SSL_library_init+:} false; then :
+ { $as_echo "$as_me:${as_lineno-$LINENO}: checking for SSL_new in -lssl" >&5
+$as_echo_n "checking for SSL_new in -lssl... " >&6; }
+if ${ac_cv_lib_ssl_SSL_new+:} false; then :
   $as_echo_n "(cached) " >&6
 else
   ac_check_lib_save_LIBS=$LIBS
@@ -9554,27 +9554,27 @@ cat confdefs.h - <<_ACEOF >conftest.$ac_ext
 #ifdef __cplusplus
 extern "C"
 #endif
-char SSL_library_init ();
+char SSL_new ();
 int
 main ()
 {
-return SSL_library_init ();
+return SSL_new ();
   ;
   return 0;
 }
 _ACEOF
 if ac_fn_c_try_link "$LINENO"; then :
-  ac_cv_lib_ssl_SSL_library_init=yes
+  

Re: [HACKERS] Patch: Implement failover on libpq connect level.

2016-09-05 Thread Aleksander Alekseev
Hello, Victor.

> As community shows an interest with this patch, and it has been
> included in the current commitfest, I've to submit it.

Naturally we show interest in this patch. It's a great feature that
would be very nice to have!

> This version of patch includes
> 
> 1. Most important - a tap test suite (really small enough and has a
> room for improvements).
> 
> 2. Compatibility with older versions - now readonly check is skipped if
> only one hostname is specified in the connect string
> 
> 3. pg_host and pg_port variables now show host name and port library
> was connected to.
> 
> 4. Your fix with compilation of ecpg tests.
> 
> Patch is attached.

After a brief examination I've found the following ways to improve the
patch.

1) It looks like code is not properly formatted.

2)

> +  readonly
> +  
> +  
> +  If this parameter is 0 (the default), upon successful connection
> +  library checks if host is in recovery state, and if it is so, 
> +  tries next host in the connect string. If this parameter is
> +  non-zero, connection to warm standby nodes are considered
> +  successful.

IIRC there are "warm" and "hot" standbys in PostgreSQL terminology. The
difference is that you can't execute any queries on a "warm" standby. So
the description is probably wrong. I suggest fixing it to just "...
connection to standby nodes are...".

3)

> + falover_timeout
> + 
> + 
> + Maximum time to cycilically retry all the hosts in commect string.
> + (as decimal integer number of seconds). If not specified, then
> + hosts are tried just once.
> + 

Typos: cycilically, commect

4) 

>  static int 
>  connectDBStart(PGconn *conn)
>  {
> +   struct nodeinfo
> +   {
> +   char   *host;
> +   char   *port;
> +   };

Not sure if all compilers we support can parse this. I suggest declaring
the nodeinfo structure outside of the procedure, just to be safe.

5)

> +   nodes = calloc(sizeof(struct nodeinfo), 2); 
> +   nodes->host = strdup(conn->pghostaddr);
> hint.ai_family = AF_UNSPEC;

> +   /* Parse comma-separated list of host-port pairs into
> +  function-local array of records */
> +   nodes = malloc(sizeof(struct nodeinfo) * 4); 


> /* pghostaddr and pghost are NULL, so use Unix domain
>  * socket */
> -   node = NULL;
> +   nodes = calloc(sizeof(struct nodeinfo), 2); 

> +   sprintf(port,"%d", htons(((struct sockaddr_in
> *)ai->ai_addr)->sin_port));
> +   return strdup(port);

You should check the return values of the malloc, calloc and strdup calls,
or use safe pg_* equivalents.

6) 

> +   for (p = addrs; p->ai_next != NULL; p =
>   p->ai_next);
> +   p->ai_next = this_node_addrs;

Looks a bit scary. I suggest adding an empty scope ( {} ) and a comment
to make the intention clear here.
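Something like this, say:

    /* walk to the last element of the list; loop body intentionally empty */
    for (p = addrs; p->ai_next != NULL; p = p->ai_next)
    {
    }
    p->ai_next = this_node_addrs;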

7) Code compiles with following warnings (CLang 3.8):

> 1 warning generated.
> fe-connect.c:5239:39: warning: if statement has empty body
> [-Wempty-body]
>
> errorMessage,
> false, true));
>   
> ^
> fe-connect.c:5239:39: note: put the semicolon on a separate line to
> silence this warning
> 1 warning generated.

8) The get_next_element procedure's implementation is way too smart (read:
"complicated"). You could probably just store the current list length and
generate a random number between 0 and length-1.
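That is, roughly (names here are illustrative, not from the patch):

    /* pick a uniformly random element; assumes list_length > 0 */
    static struct nodeinfo *
    get_random_element(struct nodeinfo *nodes, int list_length)
    {
        return &nodes[random() % list_length];
    }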

-- 
Best regards,
Aleksander Alekseev


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] CF app and patch series

2016-09-05 Thread Craig Ringer
Hi all

Now that it's becoming more common to post patch series, not just
standalone patches, it might be worth looking at how the CF app can
help manage them.

Any ideas? Especially since the patch series may not get committed all
in one go, but progressively rebased on top of the bits that did get
committed until it's all in?

Making one CF app entry for this doesn't work well. It's confusing,
spammy in terms of the number of entries involved, unhelpful for
potential reviewers, and pretty useless unless each patch is independent
of all the others, in which case they shouldn't be a series in the first
place.

A "partly committed" status doesn't help much; it doesn't signal
whether the rest is waiting for author response, awaiting review after
revision, etc.

So far I've just not been flagging it committed at all until
everything is. Or creating new CF entries that are immediately set to
"committed" for each part that goes in. But neither seems ideal.

Ideas?

-- 
 Craig Ringer   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Assert(LWLockHeldByMeInMode(lock, LW_EXCLUSIVE))

2016-09-05 Thread Julien Rouhaud
On 05/09/2016 11:55, Julien Rouhaud wrote:
> On 20/06/2016 06:28, Thomas Munro wrote:
>> On Mon, Jun 20, 2016 at 3:43 PM, Craig Ringer  wrote:
>>> On 18 June 2016 at 11:28, Thomas Munro 
>>> wrote:
 Several times now when reading, debugging and writing code I've wished
 that LWLockHeldByMe assertions specified the expected mode, especially
 where exclusive locking is required.

 What do you think about something like the attached?  See also an
 example of use.  I will add this to the next commitfest.
>>>
>>> I've wanted this before too [...]
>>
> 
> same here.
> 
>> Before ab5194e6f (25 December 2014) held_lwlocks didn't record the mode.
>>
> 
> I just reviewed both patches.  They apply cleanly on current HEAD,
> work as intended, and make check runs smoothly.  Patches are pretty
> straightforward, so I don't have much to say.
> 
> My only remark is on the following comment:
> 
> + * LWLockHeldByMeInMode - test whether my process holds a lock in mode X
> 
> Maybe something like "test whether my process holds a lock in given
> mode" would be better?
> 
> Otherwise, I think they're ready for committer.
> 

Didn't see that Simon had just committed it, sorry about that.

-- 
Julien Rouhaud
http://dalibo.com - http://dalibo.org


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Assert(LWLockHeldByMeInMode(lock, LW_EXCLUSIVE))

2016-09-05 Thread Julien Rouhaud
On 20/06/2016 06:28, Thomas Munro wrote:
> On Mon, Jun 20, 2016 at 3:43 PM, Craig Ringer  wrote:
>> On 18 June 2016 at 11:28, Thomas Munro 
>> wrote:
>>> Several times now when reading, debugging and writing code I've wished
>>> that LWLockHeldByMe assertions specified the expected mode, especially
>>> where exclusive locking is required.
>>>
>>> What do you think about something like the attached?  See also an
>>> example of use.  I will add this to the next commitfest.
>>
>> I've wanted this before too [...]
> 

same here.

> Before ab5194e6f (25 December 2014) held_lwlocks didn't record the mode.
> 

I just reviewed both patches.  They apply cleanly on current HEAD,
work as intended, and make check runs smoothly.  Patches are pretty
straightforward, so I don't have much to say.

My only remark is on the following comment:

+ * LWLockHeldByMeInMode - test whether my process holds a lock in mode X

Maybe something like "test whether my process holds a lock in given
mode" would be better?

Otherwise, I think they're ready for committer.
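For anyone following along, a call site with the new assertion reads
like this (the particular lock is just an example):

    /* caller must hold ProcArrayLock exclusively from here on */
    Assert(LWLockHeldByMeInMode(ProcArrayLock, LW_EXCLUSIVE));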

-- 
Julien Rouhaud
http://dalibo.com - http://dalibo.org


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] condition variables

2016-09-05 Thread Amit Kapila
On Mon, Aug 15, 2016 at 10:35 PM, Robert Haas  wrote:
>>> Don't you need to set proc->cvSleeping = false in ConditionVariableSignal?
>>
>> I poked at this a bit... OK, a lot... and have some feedback:
>>
>> 1.  As above, we need to clear cvSleeping before setting the latch.
>
> Right, OK.
>

I have independently faced this problem while using your patch, and for
now I have updated my local copy.  If possible, please send an updated
patch, as this patch could be used for the development of various
parallelism projects.
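For the archives, the fix being discussed amounts to this ordering in
ConditionVariableSignal() (a sketch of the idea, not the final patch):

    /* clear the flag before waking, so the woken process cannot
     * observe a set latch while cvSleeping still reads true */
    proc->cvSleeping = false;
    SetLatch(&proc->procLatch);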

-- 
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Assert(LWLockHeldByMeInMode(lock, LW_EXCLUSIVE))

2016-09-05 Thread Simon Riggs
On 18 June 2016 at 04:28, Thomas Munro  wrote:
> Hi hackers,
>
> Several times now when reading, debugging and writing code I've wished
> that LWLockHeldByMe assertions specified the expected mode, especially
> where exclusive locking is required.
>
> What do you think about something like the attached?  See also an
> example of use.  I will add this to the next commitfest.

Committed, thanks.

-- 
Simon Riggs   http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Logical decoding slots can go backwards when used from SQL, docs are wrong

2016-09-05 Thread Craig Ringer
On 5 September 2016 at 16:33, Petr Jelinek  wrote:

>> The better alternative is to add a variant on
>> pg_logical_slot_get_changes(...) etc that accepts a start LSN. But
>> it's not convenient or easy for SQL callers to extract the last commit
>> LSN received from the last call to pass it to the next one, so this
>> isn't simple, and is probably best tackled as part of the SQL
>> interface API change Petr and Andres discussed in this thread.
>>
>
> It isn't? I thought lsn is first column of the output of that function?

It is. While it's a pain to use that from SQL (psql, etc.) and then do
something with the results, there's no point, since anything working in
simple SQL will get terminated when the server restarts anyway. So
really my point above is moot.

That doesn't reflect on the slot-dirtying patch; it just means there's
no need to do anything extra when we add the cursor-like interface
later in order to fully solve this.

-- 
 Craig Ringer   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [PATCH] Send catalog_xmin separately in hot standby feedback

2016-09-05 Thread Craig Ringer
On 5 September 2016 at 14:44, Craig Ringer  wrote:
> On 5 September 2016 at 12:40, Craig Ringer  wrote:
>> Hi all
>>
>> Currently hot standby feedback sends GetOldestXmin()'s result to the
>> upstream as the required xmin. GetOldestXmin() returns a slot's
>> catalog_xmin if that's the lowest xmin on the system.
>
>
> Note that this patch changes the API to GetOldestXmin(), adding a new
> boolean to allow it to disregard the catalog_xmin of slots.
>
> Per Simon's feedback I'm going to split that out into a separate
> patch, so will post a follow-up split one soon as the series.

Now formatted a series:

1.  Send catalog_xmin in hot standby feedback protocol
2.  Make walsender respect catalog_xmin in hot standby feedback messages
3.  Allow GetOldestXmin(...) to optionally disregard the catalog_xmin
4.  Send catalog_xmin separately in hot_standby_feedback messages


Descriptions are in the patch headers.


1 adds the protocol field only. The value is at this point always sent
as 0 by walreceiver and ignored by walsender. There's no need to handle
backward compatibility in the addition to the hot standby protocol
message here, as only the same major version of Pg has any business
sending us hot standby feedback. pg_receivexlog, pg_basebackup etc
don't use hs feedback. Includes protocol docs change.

2 makes walsender now pay attention to the sent catalog_xmin.
walreceiver doesn't set it yet and has no way to get it separately.

3 Provides a way to get the global xmin without considering the
catalog_xmin so walreceiver can use it.

4 makes walsender use the modified GetOldestXmin()


(3) needs additional attention:

By ignoring slot catalog_xmin in the GetOldestXmin() call and then
separately calling ProcArrayGetReplicationSlotXmin() to get the
catalog_xmin to send, we might produce a catalog xmin slightly later than
what was in the procarray at the time we previously called
GetOldestXmin() to examine backend/session state. ProcArrayLock is
released so it can be advanced in-between the calls. That's harmless -
it isn't necessary for the reported catalog_xmin to be exactly
consistent with backend state. If it advances it's safe to report the
new position since we know the confirmed positions are on-disk
locally.

The alternative here is to extend GetOldestXmin() to take an out-param
to report the slot catalog xmin. That really just duplicates the
functionality of ProcArrayGetReplicationSlotXmin but means we can grab
it within a single ProcArray lock. Variants of GetOldestXmin and
ProcArrayGetReplicationSlotXmin that take an already-locked flag would
work too, but would hold the lock across parts of GetOldestXmin() that
currently don't retain it. I could also convert the current boolean
param ignoreVacuum into a flags argument instead of adding another
boolean. No real preference from me.
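For what it's worth, the flags variant would look something like this
(names assumed, purely a sketch):

    /* hypothetical flag bits replacing the boolean arguments */
    #define GETOLDESTXMIN_IGNORE_VACUUM        0x01
    #define GETOLDESTXMIN_IGNORE_CATALOG_XMIN  0x02

    extern TransactionId GetOldestXmin(Relation rel, int flags);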

I cut out some comment changes to be submitted separately; otherwise
this series is much the same as the original patch upthread.

Also available at
https://github.com/2ndQuadrant/postgres/tree/dev/feedback-catalog-xmin
(and tagged dev/feedback-catalog-xmin). Branch subject to rebasing.

-- 
 Craig Ringer   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services
From 275c6422962f1fc326cc5f0c92186de0b127c472 Mon Sep 17 00:00:00 2001
From: Craig Ringer 
Date: Mon, 5 Sep 2016 15:30:53 +0800
Subject: [PATCH 1/6] Send catalog_xmin in hot standby feedback protocol

Add catalog_xmin to the hot standby feedback protocol so a read replica
that has logical slots can use its physical slot to the master to hold down the
master's catalog_xmin. This information will let a replica prevent vacuuming of
catalog tuples still required by the replica's logical slots.

This is the hot standby feedback protocol change, the new value is always set
to zero by the walreceiver and is ignored by the walsender.
---
 doc/src/sgml/protocol.sgml| 33 -
 src/backend/replication/walreceiver.c | 20 ++--
 src/backend/replication/walsender.c   | 14 --
 3 files changed, 54 insertions(+), 13 deletions(-)

diff --git a/doc/src/sgml/protocol.sgml b/doc/src/sgml/protocol.sgml
index 68b0941..c4e41ca 100644
--- a/doc/src/sgml/protocol.sgml
+++ b/doc/src/sgml/protocol.sgml
@@ -1783,10 +1783,11 @@ The commands accepted in walsender mode are:
   
   
   
-  The standby's current xmin. This may be 0, if the standby is
-  sending notification that Hot Standby feedback will no longer
-  be sent on this connection. Later non-zero messages may
-  reinitiate the feedback mechanism.
+  The standby's current global xmin, excluding the catalog_xmin from any
+  replication slots. If both this value and the following
+  catalog_xmin are 0 this is treated as a notification that Hot Standby
+  feedback will no longer 

Re: [HACKERS] Optimization for lazy_scan_heap

2016-09-05 Thread Simon Riggs
On 4 August 2016 at 05:57, Masahiko Sawada  wrote:

> While reviewing the freeze map code, Andres pointed out[1] that
> lazy_scan_heap could access the visibility map twice and its logic
> seems a bit tricky.
> As discussed before, that's not nice, especially when a large relation
> is entirely frozen.
>
> I posted the patch for that before but since this is an optimization,
> not bug fix, I'd like to propose it again.
> Please give me feedback.
>
> [1] 
> https://www.postgresql.org/message-id/20160505000840.epatoq6d2e556446%40alap3.anarazel.de

If we have a freeze map now, surely tables will no longer be entirely frozen?

What performance difference does this make, in a realistic use case?

How would we test this to check it is exactly correct?

-- 
Simon Riggs   http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Speed up Clog Access by increasing CLOG buffers

2016-09-05 Thread Amit Kapila
On Mon, Sep 5, 2016 at 2:00 PM, Pavan Deolasee  wrote:
>
>
> On Mon, Sep 5, 2016 at 3:18 AM, Tomas Vondra 
> wrote:
>>
>> Hi,
>>
>> This thread started a year ago, different people contributed various
>> patches, some of which already got committed. Can someone please post a
>> summary of this thread, so that it's a bit more clear what needs
>> review/testing, what are the main open questions and so on?
>>
>> I'm interested in doing some tests on the hardware I have available, but
>> I'm not willing to spend my time untangling the discussion.
>>
>
> I signed up for reviewing this patch. But as Amit explained later, there are
> two different and independent implementations to solve the problem. Since
> Tomas has volunteered to do some benchmarking, I guess I should wait for the
> results because that might influence which approach we choose.
>
> Does that sound correct?
>

Sounds correct to me.

-- 
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] Unused function arguments

2016-09-05 Thread Antonin Houska
While reading the logical replication (and related) code, I found a few unused
function arguments:

 * XactLogCommitRecord() - unused argument forceSync

 * SnapBuildBuildSnapshot() - xid

 * TeardownHistoricSnapshot() - is_error

No idea which ones are intended for future use and which ones can be
removed.

-- 
Antonin Houska
Cybertec Schönig & Schönig GmbH
Gröhrmühlgasse 26
A-2700 Wiener Neustadt
Web: http://www.postgresql-support.de, http://www.cybertec.at


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Yet another small patch - reorderbuffer.c:1099

2016-09-05 Thread Aleksander Alekseev
> I looked at this and can see some of the argument on both sides, but
> if it's setting off static-analyzer warnings for some people, that
> seems like a sufficient reason to change it.  We certainly make more
> significant changes than this in order to silence warnings.
> 
> I rewrote the comment a bit more and pushed it.

Tom, thank you for committing this patch. And thanks everyone for your
contribution!

-- 
Best regards,
Aleksander Alekseev


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Logical decoding slots can go backwards when used from SQL, docs are wrong

2016-09-05 Thread Simon Riggs
On 5 September 2016 at 03:41, Craig Ringer  wrote:
> On 2 September 2016 at 17:49, Craig Ringer  wrote:
>
>> So the main change becomes the one-liner in my prior mail.
>
> Per feedback from Simon, updated with a new test in src/test/recovery .
>
> If you revert the change to
> src/backend/replication/logical/logicalfuncs.c then the test will
> start failing.
>
> I'd quite like to backpatch the fix since it's simple and safe, but I
> don't feel that it's hugely important to do so. This is an annoyance
> not a serious data risking bug.

I've committed to HEAD only. We can discuss backpatching elsewhere also.

-- 
Simon Riggs   http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Logical decoding slots can go backwards when used from SQL, docs are wrong

2016-09-05 Thread Petr Jelinek

On 05/09/16 04:41, Craig Ringer wrote:

On 2 September 2016 at 17:49, Craig Ringer  wrote:


So the main change becomes the one-liner in my prior mail.


Per feedback from Simon, updated with a new test in src/test/recovery .

If you revert the change to
src/backend/replication/logical/logicalfuncs.c then the test will
start failing.

I'd quite like to backpatch the fix since it's simple and safe, but I
don't feel that it's hugely important to do so. This is an annoyance
not a serious data risking bug.

My only concern here is that we still lose position on crash. So
SQL-interface callers should still be able to cope with it. But they
won't see it happen if they're only testing with normal shutdowns,
host reboots, etc. In practice users aren't likely to notice this
anyway, though, since most people don't restart the server all the
time. I think it's better than what we have.

This issue could be eliminated completely by calling
ReplicationSlotSave() after ReplicationSlotMarkDirty(). This would
force an immediate flush after a non-peeked SQL interface decoding
call. It means more fsync()ing, though, and the SQL interface can only
be used by polling so that could be a significant performance hit.
We'd want to only flush when the position changed, but even then it'll
mean a sync every time anything gets returned.
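In code terms that alternative is just the following in the SQL-callable
decoding path (a sketch of the idea):

    /* mark the slot dirty and force it to disk immediately;
     * crash-safe, but costs an fsync per call */
    ReplicationSlotMarkDirty();
    ReplicationSlotSave();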

The better alternative is to add a variant on
pg_logical_slot_get_changes(...) etc that accepts a start LSN. But
it's not convenient or easy for SQL callers to extract the last commit
LSN received from the last call to pass it to the next one, so this
isn't simple, and is probably best tackled as part of the SQL
interface API change Petr and Andres discussed in this thread.



It isn't? I thought lsn is first column of the output of that function?

--
  Petr Jelinek  http://www.2ndQuadrant.com/
  PostgreSQL Development, 24x7 Support, Training & Services


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [PATCH] COPY vs \copy HINT

2016-09-05 Thread Christoph Berg
Re: Craig Ringer 2016-09-05 

> To cover the same-host case we could try something like:
> 
>COPY runs on the PostgreSQL server, using the PostgreSQL server's
> directories and permissions, it doesn't run on the client.
> 
> ... but I think that's actually less helpful for the users who'll need this.

The part about the server permissions might be useful to hint at.

> > I don't suppose there's any easy way for COPY to distinguish local
> > from remote connections
> 
> Not that I see, since "local" can be unix socket or tcp to localhost.
> Not cleanly anyway.
> 
> I don't think it matters. Many hosts have enough paths in common that
> in practice the hint on EACCES will be useful anyway. It'd be nice to
> work in something about running with the permissions of the PostgreSQL
> server, but I don't see a way to do that without making it all more
> complex.
> 
> I don't think it's worth tons of time anyway. This will be better than
> what we have, lets do it.

It's probably not helpful trying to distinguish local and remote
connections; the set of possible problems is diverse and this is just
one of them.

> I'm fairly happy with the wording you came up with:
> 
> "COPY copies to a file on the PostgreSQL server, not on the client. "
> "You may want a client-side facility such as psql's \\copy."

What about

 "COPY TO instructs the PostgreSQL server to write to a file on the server. 
"
 "You may want a client-side facility such as psql's \\copy."

 "COPY FROM instructs the PostgreSQL server to read from a file on the 
server. "
 "You may want a client-side facility such as psql's \\copy."

This stresses more explicitly that the file is opened by the server
process. (I'm not particularly happy that it says "server" in there
twice, but couldn't think of anything else.)
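In code, the hint site would then read roughly like this (a sketch using
the COPY FROM wording above; the surrounding error message is only
illustrative):

    ereport(ERROR,
            (errcode_for_file_access(),
             errmsg("could not open file \"%s\" for reading: %m",
                    filename),
             errhint("COPY FROM instructs the PostgreSQL server to read "
                     "from a file on the server. You may want a "
                     "client-side facility such as psql's \\copy.")));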

Christoph


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Speed up Clog Access by increasing CLOG buffers

2016-09-05 Thread Pavan Deolasee
On Mon, Sep 5, 2016 at 3:18 AM, Tomas Vondra 
wrote:

> Hi,
>
> This thread started a year ago, different people contributed various
> patches, some of which already got committed. Can someone please post a
> summary of this thread, so that it's a bit more clear what needs
> review/testing, what are the main open questions and so on?
>
> I'm interested in doing some tests on the hardware I have available, but
> I'm not willing to spend my time untangling the discussion.
>
>
I signed up for reviewing this patch. But as Amit explained later, there
are two different and independent implementations to solve the problem.
Since Tomas has volunteered to do some benchmarking, I guess I should wait
for the results because that might influence which approach we choose.

Does that sound correct? Or do we already know which implementation is more
likely to be pursued, in which case I can start reviewing that patch.

Thanks,
Pavan

-- 
 Pavan Deolasee   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


Re: [HACKERS] Supporting SJIS as a database encoding

2016-09-05 Thread Tatsuo Ishii
> Before digging into the problem, could you share your impression on whether 
> PostgreSQL can support SJIS?  Would it be hopeless?  Can't we find any 
> direction to go?  Can I find relevant source code by searching specific words 
> like "ASCII", "HIGH_BIT", "\\" etc?

For starters, you could grep "multibyte".

Best regards,
--
Tatsuo Ishii
SRA OSS, Inc. Japan
English: http://www.sraoss.co.jp/index_en.php
Japanese:http://www.sraoss.co.jp


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] pgbench - allow to store select results into variables

2016-09-05 Thread Amit Langote

Hi Fabien,

On 2016/09/03 2:47, Fabien COELHO wrote:
>> This patch needs to be rebased because of commit 64710452 (on 2016-08-19).
> 
> Here it is!

Thanks for sending the updated patch. Here are some (mostly cosmetic)
comments.  Before the comments, let me confirm whether the following
result is odd (a bug) or whether I am just using it wrong:

Custom script looks like:

\;
select a \into a
from tab where a = 1;
\set i debug(:a)

I get the following error:

undefined variable "a"
client 0 aborted in state 1; execution of meta-command failed

Even the following script gives the same result:

\;
select a from a where a = 1;
\into a
\set t debug(:a)

However with the following there is no error and a gets set to 2:

select a from a where a = 1
\;
select a+1 from a where a = 1;
\into a
\set t debug(:a)


Comments on the patch follow:

+
+ 
+  Stores the first fields of the resulting row from the current or
preceding
+  SELECT command into these variables.

Any command returning rows ought to work, no?  For example, the following
works:

update a set a = a+1 returning *;
\into a
\set t debug(:a)


-char   *line;   /* text of command line */
+char   *line;   /* first line for short display */
+char   *lines;  /* full multi-line text of command */

When I looked at this and related hunks (and also the hunks related to
semicolons), it made me wonder whether this patch contains two logical
changes.  Is this just a refactoring for the \into implementation or
does this provide some new externally visible feature?


 char   *argv[MAX_ARGS]; /* command word list */
+int compound;  /* last compound command (number of \;) */
+char***intos;  /* per-compound command \into variables */

Need an extra space for intos to align with earlier fields.  Also I wonder
if intonames or intoargs would be a slightly better name for the field.


+static bool
+read_response(CState *st, char ** intos[])

Usual style seems to be to use ***intos here.


+fprintf(stderr,
+"client %d state %d compound %d: "
+"cannot apply \\into to non SELECT statement\n",
+st->id, st->state, compound);

How about making this error message say something like other messages
related to \into, perhaps: "\\into cannot follow non-SELECT
statements\n"


 /*
  * Read and discard the query result; note this is not
included in
- * the statement latency numbers.
+ * the statement latency numbers (above), thus if reading the
+ * response fails the transaction is counted nevertheless.
  */

Does this comment need to mention that the result is not discarded when
\into is specified?


+my_command->intos = pg_malloc0(sizeof(char**) * (compounds+1));

Need space: s/char**/char **/g

This comment applies also to a couple of nearby places.


-my_command->line = pg_malloc(nlpos - p + 1);
+my_command->line = pg_malloc(nlpos - p  + 1 + 3);

A comment mentioning what the above means would be helpful.


+boolsql_command_in_progress = false;

This variable's name could be slightly confusing to readers. If I
understand its purpose correctly, perhaps it could be called
sql_command_continues.


+if (index == 0)
+syntax_error(desc, lineno, NULL, NULL,
+ "\\into cannot start a script",
+ NULL, -1);

How about: "\\into cannot be at the beginning of a script" ?

Thanks,
Amit




-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] pg_basebackup stream xlog to tar

2016-09-05 Thread Michael Paquier
On Sat, Sep 3, 2016 at 10:35 PM, Magnus Hagander  wrote:
> Ugh. That would be nice to have, but I think that's outside the scope of
> this patch.

A test for this patch that could have value would be to use
pg_basebackup -X stream -Ft, then untar pg_xlog.tar and look at the
size of the segments. If you have an idea for untarring something
without in-core perl support (the TAP stuff needs to be able to run on
at least Perl 5.8.8), I'd welcome it. Honestly I still have none, and
that's why the recovery tests do not use tarballs when using
pg_basebackup. In short, let's not add anything more to this patch.

> PFA is an updated version of this patch that:
> * documents a magic value passed to zlib (which is in their documentation as
> being a magic value, but has no define)
> * fixes the padding of tar files
> * adds a most basic test that the -X stream -Ft does produce a tarfile

So I had a more serious look at this patch, and it basically makes
the operations done for plain mode more generic by adding a set of
routines that both the tar and plain modes can use to work on the
streamed WAL files. Elegant.

+   
+The transaction log files will be written to
+ the base.tar file.
+   
Nit: number of spaces here.

-mark_file_as_archived(const char *basedir, const char *fname)
+mark_file_as_archived(StreamCtl *stream, const char *fname)
Just passing WalMethod as argument would be enough, but... My patch
adding the fsync calls to pg_basebackup could just make use of
StreamCtl, so let's keep it as you suggest.

 static bool
 existsTimeLineHistoryFile(StreamCtl *stream)
 {
[...]
+   return stream->walmethod->existsfile(histfname);
 }
existsfile always returns false for the tar method. This does not
matter much because pg_basebackup exits immediately in case of a
failure, but I think that this deserves a comment in ReceiveXlogStream
where existsTimeLineHistoryFile is called.

I find the use of existsfile() in open_walfile() rather confusing
because this relies on the fact that existsfile() always returns
false for the tar mode. We could add an additional field in WalMethod
to store the method type and use that more, but that may make the code
more confusing than what you propose. What do you think?

+   int (*unlink) (const char *pathname);
The unlink method is used nowhere. This could just be removed.

-static void
+void
 print_tar_number(char *s, int len, uint64 val)
This could be an independent patch. Or not.

I think that I found another bug regarding the contents of the
segments. I did pg_basebackup -F t -X stream, then untarred pg_xlog.tar,
which contained segment 1/0/2, then:
$ pg_xlogdump 00010002
pg_xlogdump: FATAL:  could not find a valid record after 0/200
I'd expect this segment to have records, up to a XLOG_SWITCH record.

> As for using XLOGDIR to drive the name of the tarfile. pg_basebackup is
> already hardcoded to use pg_xlog. And so are the tests. We probably want to
> fix that, but that's a separate step and this patch will be easier to review
> and test if we keep it out for now.

Yes. pg_basebackup is not the only thing missing the point here;
here is most of the list:
$ git grep "pg_xlog\"" -- *.c *.h
src/backend/access/transam/xlog.c:static const char *xlogSourceNames[]
= {"any", "archive", "pg_xlog", "stream"};
src/backend/replication/basebackup.c:   dir = AllocateDir("pg_xlog");
src/backend/replication/basebackup.c:(errmsg("could
not open directory \"%s\": %m", "pg_xlog")));
src/backend/replication/basebackup.c:   while ((de = ReadDir(dir,
"pg_xlog")) != NULL)
src/backend/replication/basebackup.c:   if (strcmp(pathbuf,
"./pg_xlog") == 0)
src/backend/storage/file/fd.c:  if (lstat("pg_xlog", ) < 0)
src/backend/storage/file/fd.c:  "pg_xlog")));
src/backend/storage/file/fd.c:  if (pgwin32_is_junction("pg_xlog"))
src/backend/storage/file/fd.c:  walkdir("pg_xlog", pre_sync_fname,
false, DEBUG1);
src/backend/storage/file/fd.c:  walkdir("pg_xlog",
datadir_fsync_fname, false, LOG);
src/bin/initdb/initdb.c:snprintf(pg_xlog, MAXPGPATH, "%s/pg_xlog", pg_data);
src/bin/initdb/initdb.c:subdirloc = psprintf("%s/pg_xlog", pg_data);
src/bin/pg_basebackup/pg_basebackup.c:  snprintf(param->xlogdir,
sizeof(param->xlogdir), "%s/pg_xlog", basedir);
src/bin/pg_basebackup/pg_basebackup.c:  if
(!((pg_str_endswith(filename, "/pg_xlog") ||
src/bin/pg_basebackup/pg_basebackup.c:  linkloc =
psprintf("%s/pg_xlog", basedir);
src/bin/pg_rewind/copy_fetch.c: strcmp(path, "pg_xlog") == 0)
src/bin/pg_rewind/filemap.c:if (strcmp(path, "pg_xlog") == 0 &&
type == FILE_TYPE_SYMLINK)
src/bin/pg_rewind/filemap.c:if (exists &&
!S_ISDIR(statbuf.st_mode) && strcmp(path, "pg_xlog") != 0)
src/bin/pg_rewind/filemap.c:if (strcmp(path, "pg_xlog") == 0 &&
type == FILE_TYPE_SYMLINK)

Re: [HACKERS] Supporting SJIS as a database encoding

2016-09-05 Thread Tsunakawa, Takayuki
> From: pgsql-hackers-ow...@postgresql.org
> [mailto:pgsql-hackers-ow...@postgresql.org] On Behalf Of Tatsuo Ishii
> > But what I'm wondering is why PostgreSQL doesn't support SJIS.  Was there
> any technical difficulty?  Is there anything you are worried about if adding
> SJIS?
> 
> Yes, there's a technical difficulty with backend code. In many places it
> is assumed that any string is "ASCII compatible", which means no ASCII
> character is used as part of a multibyte string. Here is such a random
> example from src/backend/utils/adt/varlena.c:
> 
>   /* Else, it's the traditional escaped style */
>   for (bc = 0, tp = inputText; *tp != '\0'; bc++)
>   {
>   if (tp[0] != '\\')
>   tp++;
> 
> Sometimes SJIS uses '\' as the second byte of it.

Thanks, I'll try to understand the seriousness of the problem, as I don't have 
good knowledge of character sets.  But your example seems to tell 
everything about the difficulty...

Before digging into the problem, could you share your impression on whether 
PostgreSQL can support SJIS?  Would it be hopeless?  Can't we find any 
direction to go?  Can I find relevant source code by searching specific words 
like "ASCII", "HIGH_BIT", "\\" etc?

Regards
Takayuki Tsunakawa




-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Supporting SJIS as a database encoding

2016-09-05 Thread Tatsuo Ishii
> But what I'm wondering is why PostgreSQL doesn't support SJIS.  Was there any 
> technical difficulty?  Is there anything you are worried about if adding SJIS?

Yes, there's a technical difficulty with backend code. In many places
it is assumed that any string is "ASCII compatible", which means no
ASCII character is used as part of a multibyte string. Here is such a
random example from src/backend/utils/adt/varlena.c:

/* Else, it's the traditional escaped style */
for (bc = 0, tp = inputText; *tp != '\0'; bc++)
{
if (tp[0] != '\\')
tp++;

Sometimes SJIS uses '\' as the second byte of a character.
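An SJIS-safe scan would have to advance by whole characters rather than
bytes, along these lines (a sketch; whether pg_mblen() is usable at every
such call site is exactly the open question):

    /* advance one character at a time so that a '\' that is really the
     * second byte of an SJIS character is never misread as an escape */
    for (tp = inputText; *tp != '\0'; tp += pg_mblen(tp))
    {
        if (tp[0] == '\\')
        {
            /* a genuine backslash: only reachable when '\' starts a
             * character, never when it is a trailing multibyte byte */
        }
    }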

Best regards,
--
Tatsuo Ishii
SRA OSS, Inc. Japan
English: http://www.sraoss.co.jp/index_en.php
Japanese:http://www.sraoss.co.jp


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] Supporting SJIS as a database encoding

2016-09-05 Thread Tsunakawa, Takayuki
Hello,

I'd like to propose adding SJIS as a database encoding.  You may wonder why 
SJIS is still necessary in the world of Unicode.  The purpose is to achieve 
comparable performance when migrating legacy database systems from other DBMSs 
with little modification of applications.

Recently, we failed to migrate some customer's legacy database from DBMS-X to 
PostgreSQL.  That customer wished for PostgreSQL, but PostgreSQL couldn't meet 
the performance requirement.

The system uses DBMS-X with the database character set being SJIS.  The main 
applications are written in embedded SQL, which require SJIS in their host 
variables.  They insisted they cannot use UTF8 for the host variables because 
that would require large modifications to the applications due to character 
handling.  So no character set conversion is necessary between the clients and 
the server.

On the other hand, PostgreSQL doesn't support SJIS as a database encoding.  
Therefore, character set conversion from UTF-8 to SJIS has to be performed.  
The batch application runs millions of SELECTs, each of which retrieves more 
than 100 columns.  And many of those columns are of character type.

If PostgreSQL supports SJIS, PostgreSQL will match or outperform the 
performance of DBMS-X with regard to the applications.  We confirmed it by 
using psql to run a subset of the batch processing.  When the client encoding 
is SJIS, one FETCH of 10,000 rows took about 500ms.  When the client encoding 
is UTF8 (the same as the database encoding), the same FETCH took 270ms.

Supporting SJIS may somewhat regain attention to PostgreSQL here in Japan, in 
the context of database migration.  BTW, MySQL supports SJIS as a database 
encoding.  PostgreSQL used to be the most popular open source database in 
Japan, but MySQL is now more popular.


But what I'm wondering is why PostgreSQL doesn't support SJIS.  Was there any 
technical difficulty?  Is there anything you are worried about if adding SJIS?

I'd like to write a patch for adding SJIS if there's no strong objection.  I'd 
appreciate it if you could point me to good design information for adding a 
server encoding (e.g. the URL of the most recent patch that added a new server 
encoding).

Regards
Takayuki Tsunakawa




-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Parallel build with MSVC

2016-09-05 Thread Christian Ullrich

* Michael Paquier wrote:


On Mon, Sep 5, 2016 at 9:18 AM, Noah Misch  wrote:



Every vcbuild and msbuild invocation ought to recognize this variable, so
please update the two places involving ecpg_regression.proj.  Apart from that,
the patch looks good.


Good catch. I did not notice those during my lookups of those patches.
Not my intention to bump into Christian's feet, but here are updated
patches.


My feet feel fine. Thanks for updating.

--
Christian



--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] pg_hba_file_settings view patch

2016-09-05 Thread Haribabu Kommi
On Sun, Sep 4, 2016 at 1:44 AM, Simon Riggs  wrote:

> On 15 August 2016 at 12:17, Haribabu Kommi 
> wrote:
>
> > comments?
>
> This looks like a good feature contribution, thanks.
>
> At present the patch doesn't apply cleanly, please rebase.
>

Rebased patch is attached.


> The patch doesn't contain any tests, which means I can't see what the
> output looks like, so I can't judge the exact usefulness of this
> particular patch. ISTM likely that there will be some detailed points
> to review in there somewhere.
>

Added a test in the regress and also in the docs.

Do we want line number, or priority order? i.e. do we want to count
> comment lines or just main rule lines? I prefer the latter.
> Various other questions along those lines needed, once I can see the
> output.
>

I preferred the line number that includes the comment lines also. This way
it will be easy to edit the file if it contains any errors, by going
directly to that line number.


> What is push_jsonb_string_value() etc..?
> If there is required infrastructure there is no explanation of why.
> Suggest you explain and/or split into two.
>

I added a JSONB-type column to display the hba.conf option values.
To store the options data in JSONB format, I didn't find any functions
currently available for use in core, so I added key/value functions to
add the data to a JSONB object.

The function-related code is split into a different patch and attached.
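For reference, building a jsonb object from C with the core API looks
roughly like this (a sketch; I have not compared it against the helpers
in the attached patch):

    JsonbParseState *state = NULL;
    JsonbValue  key, val;
    JsonbValue *result;

    (void) pushJsonbValue(&state, WJB_BEGIN_OBJECT, NULL);

    key.type = jbvString;
    key.val.string.val = pstrdup("database");
    key.val.string.len = strlen("database");
    (void) pushJsonbValue(&state, WJB_KEY, &key);

    val.type = jbvString;
    val.val.string.val = pstrdup("all");
    val.val.string.len = strlen("all");
    (void) pushJsonbValue(&state, WJB_VALUE, &val);

    result = pushJsonbValue(&state, WJB_END_OBJECT, NULL);
    /* JsonbValueToJsonb(result) then yields the final Jsonb datum */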


> pg_hba_file_settings seems a clumsy name. I'd prefer pg_hba_settings,
> since that name could live longer than the concept of pg_hba.conf,
> which seems likely to become part of ALTER SYSTEM in future, so we
> wouldn't really want the word "file" in there.
>

Yes, I also agree that *file* is not required. The hba rules are now
available in memory in the backends also. I changed the view name to
pg_hba_rules as per the other mail from Christoph.


Regards,
Hari Babu
Fujitsu Australia


pg_hba_rules_1.patch
Description: Binary data


jsonb_utilitiy_funcs_1.patch
Description: Binary data

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] LSN as a recovery target

2016-09-05 Thread Michael Paquier
On Mon, Sep 5, 2016 at 4:02 PM, Simon Riggs  wrote:
> On 5 September 2016 at 06:55, Michael Paquier  
> wrote:
>> On Sun, Sep 4, 2016 at 11:30 PM, Simon Riggs  wrote:
>
>>> I noticed we don't mention what LSN is anywhere, so I'd like to apply
>>> the following doc patch also.
>>
>> +1 for the idea. What do you think about adding a mention to the
>> system data type pg_lsn?
>>
>>
>> +   WAL records are appended to the WAL
>> +   logs as each new record is written. The insert position is described by
>> +   a Log Sequence Number (LSN) that is a byte offset into
>> +   the logs, increasing monotonically with each new record. Two
>> +   LSNs can be compared to calculate the volume of
>> +   WAL data that separates them, so are used to measure
>> +   progress of WAL during replication and recovery.
>> +  
>> Here we could for example append a sentence like "The system data type
>> pg_lsn is an internal
>> representation of the LSN".
>
> Good input, thanks. v2 attached.

Looks good to me. Thanks for the wordsmithing.
-- 
Michael


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Better locale-specific-character-class handling for regexps

2016-09-05 Thread Heikki Linnakangas

On 09/04/2016 08:44 PM, Tom Lane wrote:

Heikki Linnakangas  writes:

On 08/23/2016 03:54 AM, Tom Lane wrote:
+1 for this patch in general. Some regression test cases would be nice.


I'm not sure how to write such tests without introducing insurmountable
platform dependencies --- particularly on platforms with weak support for
UTF8 locales, such as OS X.  All the interesting cases require knowing
what iswalpha() etc will return for some high character codes.

What I did to test it during development was to set MAX_SIMPLE_CHR to
something in the ASCII range, so that the high-character-code paths could
be tested without making any assumptions about locale classifications for
non-ASCII characters.  I'm not sure that's a helpful idea for regression
testing purposes, though.

I guess I could follow the lead of collate.linux.utf8.sql and produce
a test that's only promised to pass on one platform with one encoding,
but I'm not terribly excited by that.  AFAIK that test file does not
get run at all in the buildfarm or in the wild.


I'm not too worried if the tests don't get run regularly, but I don't 
like the idea of a test that only works on one platform. This code is 
unlikely to be accidentally broken by unrelated changes in the backend, 
as the regexp code is very well isolated. But for someone who hacks on 
the regexp library in the future, having a test suite that tickles all 
these corner cases would be very handy.


Another class of regressions would be that something changes in the way 
a locale treats some characters, and that breaks an application. That 
would be very difficult to test for in a platform-independent way. That 
wouldn't really be our bug, though, but the locale's.


Since we're now de facto maintainers of this regexp library, and our 
version could be used somewhere else than PostgreSQL too, it would 
actually be nice to have a regression suite that's independent from the 
pg_regress infrastructure, and wouldn't need a server to run. Perhaps a 
stand-alone C program that compiles the regexp code with mock versions 
of pg_wc_is* functions. Or perhaps a magic collation OID that makes 
pg_wc_is* functions to return hard-coded values for particular inputs.


- Heikki



--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] LSN as a recovery target

2016-09-05 Thread Simon Riggs
On 5 September 2016 at 06:55, Michael Paquier  wrote:
> On Sun, Sep 4, 2016 at 11:30 PM, Simon Riggs  wrote:

>> I noticed we don't mention what LSN is anywhere, so I'd like to apply
>> the following doc patch also.
>
> +1 for the idea. What do you think about adding a mention to the
> system data type pg_lsn?
>
>
> +   WAL records are appended to the WAL
> +   logs as each new record is written. The insert position is described by
> +   a Log Sequence Number (LSN) that is a byte offset into
> +   the logs, increasing monotonically with each new record. Two
> +   LSNs can be compared to calculate the volume of
> +   WAL data that separates them, so are used to measure
> +   progress of WAL during replication and recovery.
> +  
> Here we could for example append a sentence like "The system data type
> pg_lsn is an internal
> representation of the LSN".

Good input, thanks. v2 attached.

-- 
Simon Riggs   http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


doc_lsn.v2.patch
Description: Binary data

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [PATCH] Send catalog_xmin separately in hot standby feedback

2016-09-05 Thread Craig Ringer
On 5 September 2016 at 12:40, Craig Ringer  wrote:
> Hi all
>
> Currently hot standby feedback sends GetOldestXmin()'s result to the
> upstream as the required xmin. GetOldestXmin() returns a slot's
> catalog_xmin if that's the lowest xmin on the system.


Note that this patch changes the API to GetOldestXmin(), adding a new
boolean to allow it to disregard the catalog_xmin of slots.

Per Simon's feedback I'm going to split that out into a separate
patch, so I will post a follow-up split series soon.

-- 
 Craig Ringer   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers