Re: [HACKERS] An isolation test for SERIALIZABLE READ ONLY DEFERRABLE

2017-01-15 Thread Michael Paquier
On Mon, Jan 16, 2017 at 9:40 AM, Thomas Munro wrote:
> I also have longer term plans to show the first and third of them
> running with the read-only transaction moved to a standby server.
> Kevin Grittner gave me the idea of multi-server isolation tests when I
> mentioned my WIP SERIALIZABLE DEFERRABLE-on-standbys patch off-list,

Being able to do that with the isolation tester has been mentioned a
couple of times, particularly by 2ndQuadrant folks who worked on BDR. I
think the tester has actually been patched along those lines there, so
perhaps you could reuse the work that has been done.

> which prompted me to realise that our existing SSI tests aren't very
> interesting for that because they all rely on the behaviour of
> SERIALIZABLE, not SERIALIZABLE DEFERRABLE.  So I figured we'd better
> have some tests that show that working on a single-node system
> first.

Makes sense.

> Upon reflection, either 2 or 3 might be considered more beautiful than
> 4, depending on how others feel about extending the remit of the
> existing pg_blocking_pid function.  I'd be happy to post a new patch
> using one of those approaches if others feel it'd be better that way.

Well, options 1, 2 and 3 basically duplicate what the wait events
tracked via latch.c for SafeSnapshot already provide when a client scans
pg_stat_activity, so even if that's not the most beautiful thing I have
seen, it does not seem worth making the backend more complicated for a
facility that we already have. What I am wondering, though, is whether
it would be interesting to make the list of wait events that backends
are waiting for at some transaction point configurable in the isolation
tester -- for example, having a step that can switch between wait events
to wait for, usable as part of a permutation. I am not sure there are
actual use cases for such a fancy facility, so if things prove to become
more complicated in the future we could consider it. But honestly I am
of the opinion that this time has not come yet.
-- 
Michael




Re: [HACKERS] Parallel Append implementation

2017-01-15 Thread Ashutosh Bapat
On Mon, Jan 16, 2017 at 9:49 AM, Amit Khandekar wrote:
> Thanks Ashutosh for the feedback.
>
> On 6 January 2017 at 17:04, Ashutosh Bapat wrote:
>> On Fri, Dec 23, 2016 at 10:51 AM, Amit Khandekar wrote:
>>> Currently an Append plan node does not execute its subplans in
>>> parallel. There is no distribution of workers across its subplans. The
>>> second subplan starts running only after the first subplan finishes,
>>> although the individual subplans may be running parallel scans.
>>>
>>> Secondly, we create a partial Append path for an appendrel, but we do
>>> that only if all of its member subpaths are partial paths. If one or
>>> more of the subplans is a non-parallel path, there will be only a
>>> non-parallel Append. So whatever node is sitting on top of Append is
>>> not going to do a parallel plan; for example, a select count(*) won't
>>> divide it into partial aggregates if the underlying Append is not
>>> partial.
>>>
>>> The attached patch removes both of the above restrictions.  There has
>>> already been a mail thread [1] that discusses an approach suggested by
>>> Robert Haas for implementing this feature. This patch uses this same
>>> approach.
>>
>> The first goal requires some kind of synchronization which will allow workers
>> to be distributed across the subplans. The second goal requires some kind of
>> synchronization to prevent multiple workers from executing non-parallel
>> subplans. The patch uses different mechanisms to achieve the goals. If
>> we create two different patches addressing each goal, those may be
>> easier to handle.
>
> Goal A: Allow non-partial subpaths in Partial Append.
> Goal B: Distribute workers across the Append subplans.
> Both of these require some kind of synchronization while choosing the
> next subplan. So goal B is achieved by doing all the synchronization
> stuff, and the implementation of goal A requires that goal B is
> implemented. So there is a dependency between these two goals. While
> implementing goal B, we should keep in mind that it should also work
> for goal A; it does not make sense to change the synchronization
> logic of goal A later.
>
> I am ok with splitting the patch into 2 patches:
> a) changes required for goal A
> b) changes required for goal B.
> But I think we should split it only when we are ready to commit them
> (commit for B, immediately followed by commit for A). Until then, we
> should consider both of these together because they are interconnected
> as explained above.

For B, we need to know how much gain it brings and in which cases. If
that gain is not worth the added complexity, we may have to defer goal
B. Goal A would certainly be useful since it will improve performance of
the targeted cases. The synchronization required for goal A is simpler
than that for B, and thus if we choose to implement only A, we can live
with a simpler synchronization.
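
To make the comparison concrete, here is a rough sketch of the kind of
shared state and claim logic such synchronization could use; the
structure and names (ParallelAppendInfo, pa_lock, pa_nworkers) are
purely illustrative, not taken from the patch:

typedef struct ParallelAppendInfo
{
    slock_t     pa_lock;        /* protects the array below */
    int         pa_nworkers[FLEXIBLE_ARRAY_MEMBER]; /* workers on subplan i */
} ParallelAppendInfo;

/*
 * Pick the subplan with the fewest workers so far.  A non-partial
 * subplan (is_partial[i] == false) may be claimed by one worker only.
 */
static int
choose_next_subplan(ParallelAppendInfo *painfo, bool *is_partial, int nplans)
{
    int         chosen = -1;
    int         i;

    SpinLockAcquire(&painfo->pa_lock);
    for (i = 0; i < nplans; i++)
    {
        if (!is_partial[i] && painfo->pa_nworkers[i] > 0)
            continue;           /* non-partial subplan already taken */
        if (chosen < 0 || painfo->pa_nworkers[i] < painfo->pa_nworkers[chosen])
            chosen = i;
    }
    if (chosen >= 0)
        painfo->pa_nworkers[chosen]++;
    SpinLockRelease(&painfo->pa_lock);
    return chosen;              /* -1: nothing left to claim */
}

Goal A alone only needs the "claimed at most once" part of this; goal B
additionally needs the load-balancing part, which is why implementing B
first makes A nearly free.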

BTW, right now the patch does not consider non-partial paths for a
child which has partial paths. Do we know for sure that a path
containing partial paths for a child that has them is always going to
be cheaper than one which includes the non-partial path? If not, should
we build another path which contains non-partial paths for all child
relations? This sounds like a 0/1 knapsack problem.

>
>
>> Here are some review comments
> I will handle the other comments, but first, just a quick response to
> some important ones:
>
>> 6. By looking at the parallel_worker field of a path, we can say whether it's
>> partial or not. We probably do not need to maintain a bitmap for that in
>> the Append path. The bitmap can be constructed, if required, at the time of
>> creating the partial append plan. The reason to take this small step is: 1. we
>> want to minimize our work at the time of creating paths, 2. while freeing a
>> path in add_path, we don't free the internal structures, in this case the
>> Bitmap, which will waste memory if the path is not chosen while planning.
>
> Let me try keeping the per-subplan max_worker info in Append path
> itself, like I mentioned above. If that works, the bitmap will be
> replaced by max_worker field. In case of non-partial subpath,
> max_worker will be 1. (this is the same info kept in AppendState node
> in the patch, but now we might need to keep it in Append path node as
> well).

It will be better if we can fetch that information from each subpath
when creating the plan. As I have explained before, a path is a minimal
structure, which should be easily disposable when throwing away the
path.
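
For instance, a minimal sketch of deriving that information at
plan-creation time from the subpaths themselves (the helper name is
hypothetical):

static int *
get_append_subplan_workers(List *subpaths)
{
    int        *max_workers = palloc(list_length(subpaths) * sizeof(int));
    ListCell   *lc;
    int         i = 0;

    foreach(lc, subpaths)
    {
        Path   *subpath = (Path *) lfirst(lc);

        /* a non-partial subpath gets exactly one worker */
        max_workers[i++] = Max(1, subpath->parallel_workers);
    }
    return max_workers;
}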

>
>>
>> 7. If we consider 6, we don't need concat_append_subpaths(), but still here
>> are some comments about that function. Instead of accepting two separate
>> arguments childpaths and child_partial_subpaths_set, which need to be in
>> sync, we can just pass the path which contains both of those. In the same
>> following code may be

Re: [HACKERS] [COMMITTERS] pgsql: Fix a bug in how we generate partition constraints.

2017-01-15 Thread Amit Langote
On 2017/01/14 13:36, Amit Langote wrote:
> On Sat, Jan 14, 2017 at 6:10 AM, Robert Haas wrote:
>> On Fri, Jan 13, 2017 at 3:09 PM, Alvaro Herrera wrote:
>>>
>>> I'm just saying that the problem at hand is already solved for a related
>>> feature, so ISTM this new code should use the recently added routine
>>> rather than doing the same thing in a different way.
>>
>> Oh, I see.  Amit, thoughts?
> 
> Hm, perhaps.  The code in map_partition_varattnos() that creates the
> map could be replaced by a call to the new
> convert_tuples_by_name_map().  In fact, it could even have used the
> old version of it (convert_tuples_by_name()).  I guess I just aped
> what other callers of map_variable_attnos() were doing, which is to
> generate the map themselves (not that they ought to be changed to use
> convert_tuples_by_name_map).
> 
> I will send a patch at my earliest convenience. Thanks to Alvaro for
> pointing that out.

And here is the patch.

Thanks,
Amit
From e04fde063af93e756f15a374611cdb2fa2708ece Mon Sep 17 00:00:00 2001
From: amit 
Date: Mon, 16 Jan 2017 14:37:50 +0900
Subject: [PATCH 1/7] Avoid code duplication in map_partition_varattnos()
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

Code to map attribute numbers in map_partition_varattnos() duplicates
what convert_tuples_by_name_map() does.  Avoid that.

Amit Langote, pointed out by Álvaro Herrera.
---
 src/backend/catalog/partition.c | 21 -
 1 file changed, 4 insertions(+), 17 deletions(-)

diff --git a/src/backend/catalog/partition.c b/src/backend/catalog/partition.c
index 874e69d8d6..01e0c6c360 100644
--- a/src/backend/catalog/partition.c
+++ b/src/backend/catalog/partition.c
@@ -885,32 +885,19 @@ get_qual_from_partbound(Relation rel, Relation parent, Node *bound)
 List *
 map_partition_varattnos(List *expr, Relation partrel, Relation parent)
 {
-	TupleDesc	tupdesc = RelationGetDescr(parent);
-	AttrNumber	attno;
 	AttrNumber *part_attnos;
 	bool		found_whole_row;
 
 	if (expr == NIL)
 		return NIL;
 
-	part_attnos = (AttrNumber *) palloc0(tupdesc->natts * sizeof(AttrNumber));
-	for (attno = 1; attno <= tupdesc->natts; attno++)
-	{
-		Form_pg_attribute attribute = tupdesc->attrs[attno - 1];
-		char	   *attname = NameStr(attribute->attname);
-		AttrNumber	part_attno;
-
-		if (attribute->attisdropped)
-			continue;
-
-		part_attno = get_attnum(RelationGetRelid(partrel), attname);
-		part_attnos[attno - 1] = part_attno;
-	}
-
+	part_attnos = convert_tuples_by_name_map(RelationGetDescr(partrel),
+			 RelationGetDescr(parent),
+ gettext_noop("could not convert row type"));
 	expr = (List *) map_variable_attnos((Node *) expr,
 		1, 0,
 		part_attnos,
-		tupdesc->natts,
+		RelationGetDescr(parent)->natts,
		&found_whole_row);
 	/* There can never be a whole-row reference here */
 	if (found_whole_row)
-- 
2.11.0




Re: [HACKERS] Patch to implement pg_current_logfile() function

2017-01-15 Thread Michael Paquier
On Mon, Jan 16, 2017 at 1:14 PM, Karl O. Pinc wrote:
> I do have a question here regarding code formatting.
> The patch now contains:
>
> if (log_filepath == NULL)
> {
> /* Bad data. Avoid segfaults etc. and return NULL to caller. */
> break;
> }
>
> I'm not sure how this fits in with PG coding style,
> whether the {} should be removed or what.  I've looked
> around and can't find an example of an if with a single-line
> then-block and a comment.  Maybe this means that
> I shouldn't be doing this, but if I put the comment above
> the if then it will "run into" the comment block which
> immediately precedes the above code and describes
> a larger body of code.  So perhaps someone should look
> at this and tell me how to improve it.

Including braces in this case makes the code more readable. That's the
pattern most commonly followed in PG code. See for example
XLogInsertRecord():
else
{
/*
 * This was an xlog-switch record, but the current insert location was
 * already exactly at the beginning of a segment, so there was no need
 * to do anything.
 */
}

> Attached also are 2 patches which abstract some hardcoded
> constants into symbols.  Whether to apply them is a matter
> of style and left to the maintainers, which is why these
> are separate patches.

Making the strings csvlog, stderr and eventlog into symbols? Why not,
since the patch presented here uses them in new places. Still, it is
not a huge maintenance burden to keep them as string literals, so I
would personally not bother much.
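
For what it's worth, such a change would presumably look something like
the following; the symbol names and the csvlog_path variable are
hypothetical:

#define LOG_DESTINATION_STDERR_STR   "stderr"
#define LOG_DESTINATION_CSVLOG_STR   "csvlog"

/* ...and a use site in the syslogger, writing current_logfiles: */
if (Log_destination & LOG_DESTINATION_CSVLOG)
    appendStringInfo(&buf, "%s %s\n",
                     LOG_DESTINATION_CSVLOG_STR, csvlog_path);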

> The first patch changes only code on the master
> branch, the 2nd patch changes the new code.

Thanks for keeping things separated.

> I have not looked further at the patch or checked
> that all changes previously recommended have been
> made.  Michael, if you're confident that everything is good
> go ahead and move the patches forward to the maintainers.  Otherwise
> please let me know and I'll see about finding time
> for further review.

OK. I have done a round of hands-on review on this patch, finishing
with the attached. In the things I have done:
- Eliminated useless noise diffs.
- Fixed some typos (logile, log file log, etc.)
- Adjusted docs.
- Docs were overdoing it in storage.sgml; let's keep the description simple.
- Having a paragraph at the beginning of "Error Reporting and Logging"
in config.sgml does not look like a good idea to me. As the updates of
current_logfiles are associated with log_destination, I'd rather see
this part added in the description of the GUC.
- Missed a (void) in the definition of update_metainfo_datafile().
- Moved update_metainfo_datafile() before the signal handler routines
in syslogger.c for clarity.
- The last "return" is useless in update_metainfo_datafile().
- In pg_current_logfile(), it is useless to call PG_RETURN_NULL after
emitting an ERROR message.
- Two calls to FreeFile(fd) had been forgotten in
pg_current_logfile(). In one case, errno needs to be saved beforehand
to be sure that the correct error string is generated for the user.
- A portion of 010_pg_basebackup.pl was not updated.
- Incorrect header ordering in basebackup.c.
- Decoding of CR and LF characters seems to work fine when
log_filename includes some.

With all those issues fixed, I finish with the attached, which I am
fine to pass down to a committer. I still think that this should be
only one function using an SRF with rows shaped as (type, path) for
simplicity, but as I am visibly outnumbered I won't insist further.
Also, I would rather see an ERROR returned to the user in case of bad
data in current_logfiles, but I did not change that either, as the
current behavior is Gilles' original intention.
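
For the record, a bare-bones sketch of such an SRF, using the standard
multi-call machinery; the function name is hypothetical and the
file-reading logic is elided:

Datum
pg_current_logfiles_srf(PG_FUNCTION_ARGS)
{
    FuncCallContext *funcctx;

    if (SRF_IS_FIRSTCALL())
    {
        MemoryContext oldcontext;
        TupleDesc     tupdesc;

        funcctx = SRF_FIRSTCALL_INIT();
        oldcontext = MemoryContextSwitchTo(funcctx->multi_call_memory_ctx);
        tupdesc = CreateTemplateTupleDesc(2, false);
        TupleDescInitEntry(tupdesc, (AttrNumber) 1, "type", TEXTOID, -1, 0);
        TupleDescInitEntry(tupdesc, (AttrNumber) 2, "path", TEXTOID, -1, 0);
        funcctx->tuple_desc = BlessTupleDesc(tupdesc);
        /* ... read current_logfiles and stash its parsed lines ... */
        MemoryContextSwitchTo(oldcontext);
    }
    funcctx = SRF_PERCALL_SETUP();

    /* ... for each stored (type, path) pair: heap_form_tuple() and
     * SRF_RETURN_NEXT(funcctx, HeapTupleGetDatum(tuple)); ... */

    SRF_RETURN_DONE(funcctx);
}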
-- 
Michael
diff --git a/doc/src/sgml/config.sgml b/doc/src/sgml/config.sgml
index 07afa3c77a..3188496bc2 100644
--- a/doc/src/sgml/config.sgml
+++ b/doc/src/sgml/config.sgml
@@ -4248,6 +4248,25 @@ SELECT * FROM parent WHERE key = 2400;
  must be enabled to generate
 CSV-format log output.

+   
+When either stderr or
+csvlog are included, the file
+current_logfiles gets created and records the location
+of the log file(s) currently in use by the logging collector and the
+associated logging destination. This provides a convenient way to
+find the logs currently in use by the instance. Here is an example of
+contents of this file:
+
+stderr pg_log/postgresql.log
+csvlog pg_log/postgresql.csv
+
+Note that this file gets updated when a new log file is created
+as an effect of rotation or when log_destination is
+reloaded.  current_logfiles is removed when
+stderr and csvlog
+are not included in log_destination or when the
+logging collector is disabled.
+   
 

 
diff --git a/doc/src/sgml/func.sgml b/doc/src/sgml/func.sgml
index 10e31868ba..c756fbe066 100644
--- a/doc/src/sgml/func.sgml

Re: [HACKERS] Too many autovacuum workers spawned during forced auto-vacuum

2017-01-15 Thread Amit Khandekar
On 13 January 2017 at 19:15, Alvaro Herrera wrote:
> I think this is the same problem as reported in
> https://www.postgresql.org/message-id/CAMkU=1yE4YyCC00W_GcNoOZ4X2qxF7x5DUAR_kMt-Ta=ypy...@mail.gmail.com

Ah yes, this is the same problem. Not sure why I didn't land on that
thread when I tried to search pghackers using relevant keywords.
>
>> === Fix ===
> [...]
>> Instead, the attached patch (prevent_useless_vacuums.patch) prevents
>> the repeated cycle by noting that there's no point in doing whatever
>> vac_update_datfrozenxid() does, if we didn't find anything to vacuum
>> and there's already another worker vacuuming the same database. Note
>> that it uses wi_tableoid field to check concurrency. It does not use
>> wi_dboid field to check for already-processing worker, because using
>> this field might cause each of the workers to think that there is some
>> other worker vacuuming, and eventually no one vacuums. We have to be
>> certain that the other worker has already taken a table to vacuum.
>
> Hmm, it seems reasonable to skip the end action if we didn't do any
> cleanup after all. This would normally give enough time between vacuum
> attempts for the first worker to make further progress and avoid causing
> a storm.  I'm not really sure that it fixes the problem completely, but
> perhaps it's enough.

I had thought about this: if we didn't clean up anything, skip the
end action unconditionally, without checking whether there was any
concurrent worker. But then I thought it is better to skip only if we
know there is another worker doing the same job, because:
a) there might be some reason we are calling
vac_update_datfrozenxid() unconditionally. I am not sure
whether it was intentionally kept like that; I didn't get any leads
from the history.
b) there's no harm in updating datfrozenxid if there was no other
worker. In that case, we *know* that there was indeed nothing to be
cleaned up, so the next time this database won't be chosen again, and
there's no harm in just calling this function.
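
As a rough sketch (reusing the existing autovacuum shared-memory worker
array; the surrounding did_vacuum flag is illustrative), the check
could look like:

bool        other_worker_busy = false;
dlist_iter  iter;

LWLockAcquire(AutovacuumLock, LW_SHARED);
dlist_foreach(iter, &AutoVacuumShmem->av_runningWorkers)
{
    WorkerInfo  worker = dlist_container(WorkerInfoData, wi_links, iter.cur);

    /* another worker has already claimed a table of this database */
    if (worker != MyWorkerInfo &&
        worker->wi_dboid == MyDatabaseId &&
        OidIsValid(worker->wi_tableoid))
    {
        other_worker_busy = true;
        break;
    }
}
LWLockRelease(AutovacuumLock);

if (did_vacuum || !other_worker_busy)
    vac_update_datfrozenxid();      /* otherwise skip the end action */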

>
> --
> Álvaro Herrera        https://www.2ndQuadrant.com/
> PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



-- 
Thanks,
-Amit Khandekar
EnterpriseDB Corporation
The Postgres Database Company




Re: [HACKERS] PoC: Grouped base relation

2017-01-15 Thread Ashutosh Bapat
On Fri, Jan 13, 2017 at 10:12 PM, Antonin Houska wrote:
> Ashutosh Bapat wrote:
>> On Mon, Jan 9, 2017 at 11:26 PM, Antonin Houska wrote:
>> > Attached is a draft patch that lets partial aggregation happen at base
>> > relation level. If the relations contain a relatively small number of
>> > groups, the number of input rows of the aggregation at the query level
>> > can be reduced this way.  Also, if append relation and postgres_fdw
>> > planning is enhanced accordingly, a patch like this can let us aggregate
>> > individual tables on remote servers (e.g. shard nodes) and thus reduce
>> > the amount of rows subject to the final aggregation.
>
>> For an appendrel probably you need an ability to switch group->append into
>> append->group
>
> Yes, like the new patch version (see attachment) does:
>
> postgres=# EXPLAIN (COSTS false) SELECT b.j, sum(a.x) FROM a JOIN b ON a.i = b.j GROUP BY b.j;
>             QUERY PLAN
> ----------------------------------------
>  Finalize HashAggregate
>Group Key: b.j
>->  Hash Join
>  Hash Cond: (a.i = b.j)
>  ->  Append
>->  Partial HashAggregate
>  Group Key: a.i
>  ->  Seq Scan on a
>->  Partial HashAggregate
>  Group Key: a_1.i
>  ->  Seq Scan on a_1
>->  Partial HashAggregate
>  Group Key: a_2.i
>  ->  Seq Scan on a_2
>  ->  Hash
>->  Gather
>  Workers Planned: 1
>  ->  Partial HashAggregate
>Group Key: b.j
>->  Parallel Seq Scan on b
>
>> For postgres_fdw, we already support aggregate pushdown.
>
> My understanding is that currently it only works if the whole query can be
> evaluated by the FDW. What I try to do is to push down aggregation of an
> individual table, and join the partially-aggregated set with other tables,
> which are not necessarily remote or may reside on a different remote server.

You will need to invoke the FDW's hook for aggregate pushdown for the
base relations. It would work as long as we don't ask it for transient
results. But I guess that can come later.

>> But we don't support fetching partial aggregates from foreign server. What
>> other enhancements do you need?
>
> Here I try to introduce infrastructure for aggregation pushdown and
> propagation of the transient aggregate state values from base relations to the
> final join. postgres_fdw can benefit from it but it's not the only use case,
> so I'd prefer adjusting it in a separate patch.
>
> Yes, an outstanding problem is that the remote nodes need to return transient
> state values - probably using bytea type. I think this functionality should
> even be separate from postgres_fdw (e.g. a new contrib module?), because the
> remote nodes do not need postgres_fdw.

Hmm, that's a missing piece. We need to work on it separately.

>
>> > A few "footnotes":
>> >
>> >
>> > As for the example, the processing continues by joining the partially 
>> > grouped
>> > sets:
>> >
>> >  i | sum(x) | count(i.*) | j | count(j.*)
>> > ---+--------+------------+---+------------
>> >  1 |      7 |          2 | 1 |          2
>
> [ Sorry, count(j.*) should be 2, not 3 as I wrote in the initial email. ]
>
>> >
>> >
>> > Before performing the final aggregation, we need to multiply sum(a.x) by
>> > count(j.*) because w/o the aggregation at base relation level the input
>> > of the query-level aggregation would look like
>> >
>> >  a.i | a.x | b.j
>> > -----+-----+-----
>> >  1   |   3 |  1
>> >  1   |   4 |  1
>> >  1   |   3 |  1
>> >  1   |   4 |  1
>> >
>> > In other words, grouping of the base relation "b" below the join prevents
>> > the join from bringing the per-group input set to the aggregate input
>> > multiple times. To compensate for this effect, I've added a new field
>> > "aggtransmultifn" to the pg_aggregate catalog. It "multiplies" the
>> > aggregate state w/o repeated processing of the same input set many times.
>> > sum() is an example of an aggregate that needs such processing, avg() is
>> > one that does not.
>>
>> For something like product aggregation, where the product (or higher-order
>> operations) across rows is accumulated instead of a sum, mere multiplication
>> wouldn't help. We will need some higher-order operation to "extrapolate" the
>> result based on count(j.*). In fact, the multiplication factor will depend
>> upon the number of relations being joined. E.g., select b.j, sum(a.x) from
>> a, b, c where a.i = b.j and a.i = c.k group by b.j
>
> Maybe you're saying what I already try to do. Let me modify the example
> accordingly (unlike the initial example, each table produces a group of a
> different size so the count() values are harder to confuse):
>
[... snip ...]
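
Regarding the aggtransmultifn idea quoted above, a minimal sketch of
such a "multiply" support function for sum(int4), whose transition
state is int8 (the function name is hypothetical):

Datum
int8_mul_factor(PG_FUNCTION_ARGS)
{
    int64       state = PG_GETARG_INT64(0);     /* current sum */
    int64       factor = PG_GETARG_INT64(1);    /* rows the join duplicates */

    /* sum over N duplicated input rows == N * sum over distinct rows */
    PG_RETURN_INT64(state * factor);
}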

This all works well, as 

Re: [HACKERS] Re: Clarifying "server starting" messaging in pg_ctl start without --wait

2017-01-15 Thread Fujii Masao
On Sat, Jan 14, 2017 at 11:19 PM, Peter Eisentraut wrote:
> On 1/11/17 11:20 AM, Ryan Murphy wrote:
>> Thanks for the review Beena, I'm glad the patch is ready to go!
>
> committed, thanks

Sorry for speaking up late.

This change may confuse users who run "pg_ctl start" to perform a crash
recovery, an archive recovery, or to start a standby server (with
hot_standby=off), because "pg_ctl start" would not return for a long
time. Also, during that long time the error message "FATAL:  the
database system is starting up" keeps appearing. This makes me think
that -W is better as the default of at least "pg_ctl start".

Regards,

-- 
Fujii Masao




Re: [HACKERS] Parallel Append implementation

2017-01-15 Thread Amit Khandekar
Thanks Ashutosh for the feedback.

On 6 January 2017 at 17:04, Ashutosh Bapat wrote:
> On Fri, Dec 23, 2016 at 10:51 AM, Amit Khandekar wrote:
>> Currently an Append plan node does not execute its subplans in
>> parallel. There is no distribution of workers across its subplans. The
>> second subplan starts running only after the first subplan finishes,
>> although the individual subplans may be running parallel scans.
>>
>> Secondly, we create a partial Append path for an appendrel, but we do
>> that only if all of its member subpaths are partial paths. If one or
>> more of the subplans is a non-parallel path, there will be only a
>> non-parallel Append. So whatever node is sitting on top of Append is
>> not going to do a parallel plan; for example, a select count(*) won't
>> divide it into partial aggregates if the underlying Append is not
>> partial.
>>
>> The attached patch removes both of the above restrictions.  There has
>> already been a mail thread [1] that discusses an approach suggested by
>> Robert Haas for implementing this feature. This patch uses this same
>> approach.
>
> The first goal requires some kind of synchronization which will allow workers
> to be distributed across the subplans. The second goal requires some kind of
> synchronization to prevent multiple workers from executing non-parallel
> subplans. The patch uses different mechanisms to achieve the goals. If
> we create two different patches addressing each goal, those may be
> easier to handle.

Goal A: Allow non-partial subpaths in Partial Append.
Goal B: Distribute workers across the Append subplans.
Both of these require some kind of synchronization while choosing the
next subplan. So goal B is achieved by doing all the synchronization
stuff, and the implementation of goal A requires that goal B is
implemented. So there is a dependency between these two goals. While
implementing goal B, we should keep in mind that it should also work
for goal A; it does not make sense to change the synchronization
logic of goal A later.

I am ok with splitting the patch into 2 patches:
a) changes required for goal A
b) changes required for goal B.
But I think we should split it only when we are ready to commit them
(commit for B, immediately followed by commit for A). Until then, we
should consider both of these together because they are interconnected
as explained above.

>
> We may want to think about a third goal: preventing too many workers
> from executing the same plan. As per comment in get_parallel_divisor()
> we do not see any benefit if more than 4 workers execute the same
> node. So, an append node can distribute more than 4 worker nodes
> equally among the available subplans. It might be better to do that as
> a separate patch.

I think that comment is about calculating the leader's contribution. It
does not say that 4 workers is too many in general.

But yes, I agree, and I have it in mind as the next improvement.
Basically, it does not make sense to give more than 3 workers to a
subplan when parallel_workers for that subplan is 3. For example, if
the gather's max workers is 10, and we have 2 Append subplans s1 and s2
with parallel_workers 3 and 5 respectively, then with the current patch
it will distribute 4 workers to each of these subplans. What we should
do is: once both of the subplans get 3 workers each, we should give
the 7th and 8th workers to s2.
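
A sketch of that capped distribution, for illustration only (not the
patch's code):

static void
distribute_workers(int nworkers, const int *max_workers, int *given, int nplans)
{
    int         w, i;

    memset(given, 0, nplans * sizeof(int));

    /* hand out workers one at a time, each to the subplan with the
     * most remaining capacity, until all subplans are saturated */
    for (w = 0; w < nworkers; w++)
    {
        int         best = -1;

        for (i = 0; i < nplans; i++)
        {
            if (given[i] >= max_workers[i])
                continue;       /* subplan already saturated */
            if (best < 0 ||
                max_workers[i] - given[i] > max_workers[best] - given[best])
                best = i;
        }
        if (best < 0)
            break;              /* everyone saturated; stop early */
        given[best]++;
    }
}

With nworkers = 10 and max_workers = {3, 5}, this assigns 3 workers to
s1 and 5 to s2, leaving the remaining two unassigned.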

Now that I think of it, to implement the above we need to keep track of
per-subplan max_workers in the Append path; and with that, the bitmap
will be redundant and can be replaced by max_workers. Let me check
whether that is easy to do. We don't want to keep the bitmap if we are
sure it would be replaced by some other data structure.


> Here are some review comments
I will handle the other comments, but first, just a quick response to
some important ones:

> 6. By looking at the parallel_worker field of a path, we can say whether it's
> partial or not. We probably do not need to maintain a bitmap for that in
> the Append path. The bitmap can be constructed, if required, at the time of
> creating the partial append plan. The reason to take this small step is: 1. we
> want to minimize our work at the time of creating paths, 2. while freeing a
> path in add_path, we don't free the internal structures, in this case the
> Bitmap, which will waste memory if the path is not chosen while planning.

Let me try keeping the per-subplan max_worker info in the Append path
itself, like I mentioned above. If that works, the bitmap will be
replaced by a max_worker field. In case of a non-partial subpath,
max_worker will be 1. (This is the same info kept in the AppendState
node in the patch, but now we might need to keep it in the Append path
node as well.)

>
> 7. If we consider 6, we don't need concat_append_subpaths(), but still here
> are some comments about that function. Instead of accepting two separate
> arguments

Re: [HACKERS] Patch to implement pg_current_logfile() function

2017-01-15 Thread Karl O. Pinc
On Sun, 15 Jan 2017 07:20:40 -0600
"Karl O. Pinc" wrote:

> On Sun, 15 Jan 2017 14:54:44 +0900
> Michael Paquier wrote:
> 
> > I think that's all I have for this patch.  

> I'd like to submit with it an addendum patch that
> makes symbols out of some constants.  I'll see if I can
> get that done soon.

Attached is version 23 of the patch.  The only change
is follow-through and cleanup of code posted inline in emails.

  src/backend/utils/adt/misc.c | 13 ++---
  1 file changed, 6 insertions(+), 7 deletions(-)

This includes making comments into full sentences.


I do have a question here regarding code formatting.
The patch now contains:

if (log_filepath == NULL)
{
/* Bad data. Avoid segfaults etc. and return NULL to caller. */
break;
}

I'm not sure how this fits in with PG coding style,
whether the {} should be removed or what.  I've looked
around and can't find an example of an if with a single-line
then-block and a comment.  Maybe this means that
I shouldn't be doing this, but if I put the comment above
the if then it will "run into" the comment block which
immediately precedes the above code and describes
a larger body of code.  So perhaps someone should look
at this and tell me how to improve it.


Attached also are 2 patches which abstract some hardcoded
constants into symbols.  Whether to apply them is a matter
of style and left to the maintainers, which is why these
are separate patches.  

The first patch changes only code on the master
branch, the 2nd patch changes the new code.


I have not looked further at the patch or checked
that all changes previously recommended have been
made.  Michael, if you're confident that everything is good
go ahead and move the patches forward to the maintainers.  Otherwise
please let me know and I'll see about finding time
for further review.

Regards,

Karl 
Free Software:  "You don't pay back, you pay forward."
 -- Robert A. Heinlein
diff --git a/doc/src/sgml/config.sgml b/doc/src/sgml/config.sgml
index 07afa3c..c44984d 100644
--- a/doc/src/sgml/config.sgml
+++ b/doc/src/sgml/config.sgml
@@ -4211,6 +4211,17 @@ SELECT * FROM parent WHERE key = 2400;
  server log
 
 
+
+ When logs are written to the file system, the  file includes the type,
+ the location and the name of the logs currently in use. This provides a
+ convenient way to find the logs currently in use by the instance.
+
+$ cat $PGDATA/current_logfiles
+stderr pg_log/postgresql-2017-01-13_085812.log
+
+
+
 
  Where To Log
 
diff --git a/doc/src/sgml/func.sgml b/doc/src/sgml/func.sgml
index 10e3186..693669b 100644
--- a/doc/src/sgml/func.sgml
+++ b/doc/src/sgml/func.sgml
@@ -15441,6 +15441,13 @@ SELECT * FROM pg_ls_dir('.') WITH ORDINALITY AS t(ls,n);
   
 
   
+   pg_current_logfile(text)
+   text
+   Primary log file name, or log in the requested format,
+   currently in use by the logging collector
+  
+
+  
pg_my_temp_schema()
oid
OID of session's temporary schema, or 0 if none
@@ -15658,6 +15665,39 @@ SET search_path TO schema , schema, ..
 the time when the postmaster process re-read the configuration files.)

 
+
+pg_current_logfile
+   
+
+   
+Logging
+pg_current_logfile function
+   
+
+   
+pg_current_logfile returns, as text,
+the path of either the csv or stderr log file currently in use by the
+logging collector.  This is a path including the
+ directory and the log file name.
+Log collection must be active or the return value
+is NULL.  When multiple log files exist, each in a
+different format, pg_current_logfile called
+without arguments returns the path of the file having the first format
+found in the ordered
+list: stderr, csvlog.
+NULL is returned when no log file has any of these
+formats.  To request a specific file format, supply,
+as text, either csvlog
+or stderr as the value of the optional parameter. The
+return value is NULL when the log format requested
+is not a configured .
+
+pg_current_logfile reflects the content of the
+ file.  All caveats
+regarding current_logfiles content are applicable
+to pg_current_logfile's return value.
+   
+

 pg_my_temp_schema

diff --git a/doc/src/sgml/storage.sgml b/doc/src/sgml/storage.sgml
index 5c52824..3ce2a7e 100644
--- a/doc/src/sgml/storage.sgml
+++ b/doc/src/sgml/storage.sgml
@@ -60,6 +60,38 @@ Item
  Subdirectory containing per-database subdirectories
 
 
+
+ 
+  
+   current_logfiles
+  
+  
+   Logging
+   current_logfiles file
+  
+  current_logfiles
+ 
+ 
+  
+  A file recording the log file(s) currently written to by the syslogger
+  and the file's log formats, stderr
+  or csvlog.  Each line of the file is a space separated list of
+  two elements: the log format and the full path to the log file 

Re: [HACKERS] Implement targetlist SRFs using ROWS FROM() (was Changed SRF in targetlist handling)

2017-01-15 Thread Andres Freund
On 2016-10-31 09:06:39 -0700, Andres Freund wrote:
> On 2016-09-14 19:28:25 -0400, Tom Lane wrote:
> > Andres Freund  writes:
> > > On 2016-09-12 19:35:22 -0400, Tom Lane wrote:
> > Here's a draft patch that is just meant to investigate what the planner
> > changes might look like if we do it in the pipelined-result way.
> > Accordingly, I didn't touch the executor, but just had it emit regular
> > Result nodes for SRF-execution steps.  However, the SRFs are all
> > guaranteed to appear at top level of their respective tlists, so that
> > those Results could be replaced with something that works like
> > nodeFunctionscan.
> 
> > So I think we should continue investigating this way of doing things.
> > I'll try to take a look at the executor end of it tomorrow.  However
> > I'm leaving Friday for a week's vacation, and may not have anything to
> > show before that.
> 
> Are you planning to work on the execution side of things? I otherwise
> can take a stab...

Doing so now.

Andres




Re: [HACKERS] postgres_fdw bug in 9.6

2017-01-15 Thread Etsuro Fujita

On 2017/01/14 6:39, Jeff Janes wrote:

I've tested this patch against
9.6.stable-2d443ae1b0121e15265864d2b21 and 10.dev-0563a3a8b59150bf3cc8,
and in both cases it fixes the original problem.


Thanks for testing!


I do get a compiler warning:

foreign.c: In function 'CreateLocalJoinPath':
foreign.c:832: warning: implicit declaration of function 'pathkeys_contained_in'


Will fix.
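
Presumably the fix is just pulling in the header that declares that
function in foreign.c, along the lines of:

#include "optimizer/paths.h"        /* declares pathkeys_contained_in() */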

Best regards,
Etsuro Fujita






Re: [HACKERS] An isolation test for SERIALIZABLE READ ONLY DEFERRABLE

2017-01-15 Thread Thomas Munro
On Thu, Jan 12, 2017 at 2:21 PM, Michael Paquier wrote:
> On Thu, Jan 12, 2017 at 4:09 AM, Robert Haas wrote:
>> On Tue, Jan 10, 2017 at 11:41 PM, Michael Paquier wrote:
>>> Do you think that expanding the wait query by default could be
>>> intrusive for the other tests? I am wondering about such a white list
>>> to generate false positives for the existing tests, including
>>> out-of-core extensions, as all the tests now rely only on
>>> pg_blocking_pids().
>>
>> It won't affect anything unless running at transaction isolation level
>> serializable with the "read only deferrable" option.
>
Indeed, as monitoring.sgml says. And that's now very likely close to
zero. It would be nice to add a comment in the patch to mention just
that. In short, I withdraw my concerns about this patch; the addition
of a test for repeatable read outweighs the tweaks done in the
isolation tester. I am marking this as ready for committer, I have not
spotted issues with it.

Thanks!  Aside from exercising the code, which is surely always a good
thing, I think these three tests are quite educational on their own
for those of us trying to learn about transaction isolation.

I also have longer term plans to show the first and third of them
running with the read-only transaction moved to a standby server.
Kevin Grittner gave me the idea of multi-server isolation tests when I
mentioned my WIP SERIALIZABLE DEFERRABLE-on-standbys patch off-list,
which prompted me to realise that our existing SSI tests aren't very
interesting for that because they all rely on the behaviour of
SERIALIZABLE, not SERIALIZABLE DEFERRABLE.  So I figured we'd better
have some tests that show that working on a single-node system
first.

Concerning the question of beauty, I thought of several ways for the
isolation tester to detect this type of waiting:

1.  Create a new function pg_waiting_for_safe_snapshot(pid) that would
acquire SerializableXactHashLock, do a linear search of active
SERIALIZABLEXACT structs (using FirstPredXact and NextPredXact)
looking for sact->pid == pid and then test sact->flags &
SXACT_FLAG_DEFERRABLE_WAITING.  Then change the isolation tester's
"waiting" query to add " OR pg_waiting_for_safe_snapshot($1)".

2.  Modify the existing pg_blocking_pids() function to detect this
type of waiting and figure out which other pids are responsible,
adding them to its current results.  That could work by first doing a
linear search to find the SERIALIZABLEXACT struct as above, and if its
SXACT_FLAG_DEFERRABLE_WAITING flag is set, then walking the list of
possibleUnsafeConflicts to find the pids responsible.  Then the
isolation tester wouldn't need to change at all.

3.  Create a new function pg_waiting_for_safe_snapshot_pids(),
separate from pg_blocking_pids(), to return the list of pids as above,
and modify the isolation tester to call this new function too.

4.  The wait_event based approach as proposed, looking at pg_stat_activity.

I couldn't think of any practical uses for the new facilities 1-3
outside this isolation test, which is why I didn't look into those
more intrusive approaches.  I admit that 4 is a bit clunky, but it's a
simple non-intrusive change and I can't think of any reason it would
give incorrect results.  Even though wait_event is updated and read
without memory synchronisation, as far as I know we can assume that
select(2) and WaitLatch contain full memory barriers so the isolation
tester will certainly see the SafeSnapshot value when it repeatedly
polls that query, and the value can't change again until the isolation
tester sees it, recognises it as a kind of waiting it knows about, and
moves on to running the step in the test.
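
Concretely, approach 4 could amount to something like the following in
the isolation tester's polling code; the exact query text and the
backend_pids_str variable are illustrative assumptions, not the patch
itself:

PQExpBufferData wait_query;

initPQExpBuffer(&wait_query);
appendPQExpBuffer(&wait_query,
                  "SELECT pg_catalog.pg_blocking_pids(%d) && '%s'::integer[] "
                  "OR EXISTS (SELECT 1 FROM pg_catalog.pg_stat_activity "
                  "WHERE pid = %d AND wait_event = 'SafeSnapshot')",
                  pid, backend_pids_str, pid);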

Upon reflection, either 2 or 3 might be considered more beautiful than
4, depending on how others feel about extending the remit of the
existing pg_blocking_pid function.  I'd be happy to post a new patch
using one of those approaches if others feel it'd be better that way.

-- 
Thomas Munro
http://www.enterprisedb.com




Re: [HACKERS] Logical Replication WIP

2017-01-15 Thread Petr Jelinek
On 15/01/17 23:57, Erik Rijkers wrote:
> On 2017-01-15 23:20, Petr Jelinek wrote:
> 
>> 0001-Add-PUBLICATION-catalogs-and-DDL-v18.patch
>> 0002-Add-SUBSCRIPTION-catalog-and-DDL-v18.patch
>> 0003-Define-logical-replication-protocol-and-output-plugi-v18.patch
>> 0004-Add-logical-replication-workers-v18.patch
>> 0005-Add-separate-synchronous-commit-control-for-logical--v18.patch
> 
> patches apply OK (to master), but I get this compile error:
> 

Ah missed that during final rebase, sorry. Here is fixed 0004 patch.

-- 
  Petr Jelinek  http://www.2ndQuadrant.com/
  PostgreSQL Development, 24x7 Support, Training & Services


0004-Add-logical-replication-workers-v18fixed.patch.gz
Description: application/gzip



Re: [HACKERS] Logical Replication WIP

2017-01-15 Thread Erik Rijkers

On 2017-01-15 23:20, Petr Jelinek wrote:


0001-Add-PUBLICATION-catalogs-and-DDL-v18.patch
0002-Add-SUBSCRIPTION-catalog-and-DDL-v18.patch
0003-Define-logical-replication-protocol-and-output-plugi-v18.patch
0004-Add-logical-replication-workers-v18.patch
0005-Add-separate-synchronous-commit-control-for-logical--v18.patch


patches apply OK (to master), but I get this compile error:

execReplication.c: In function ‘ExecSimpleRelationInsert’:
execReplication.c:392:41: warning: passing argument 3 of 
‘ExecConstraints’ from incompatible pointer type 
[-Wincompatible-pointer-types]

ExecConstraints(resultRelInfo, slot, estate);
 ^~
In file included from execReplication.c:21:0:
../../../src/include/executor/executor.h:197:13: note: expected 
‘TupleTableSlot * {aka struct TupleTableSlot *}’ but argument is of type 
‘EState * {aka struct EState *}’

 extern void ExecConstraints(ResultRelInfo *resultRelInfo,
 ^~~
execReplication.c:392:4: error: too few arguments to function 
‘ExecConstraints’

ExecConstraints(resultRelInfo, slot, estate);
^~~
In file included from execReplication.c:21:0:
../../../src/include/executor/executor.h:197:13: note: declared here
 extern void ExecConstraints(ResultRelInfo *resultRelInfo,
 ^~~
execReplication.c: In function ‘ExecSimpleRelationUpdate’:
execReplication.c:451:41: warning: passing argument 3 of 
‘ExecConstraints’ from incompatible pointer type 
[-Wincompatible-pointer-types]

ExecConstraints(resultRelInfo, slot, estate);
 ^~
In file included from execReplication.c:21:0:
../../../src/include/executor/executor.h:197:13: note: expected 
‘TupleTableSlot * {aka struct TupleTableSlot *}’ but argument is of type 
‘EState * {aka struct EState *}’

 extern void ExecConstraints(ResultRelInfo *resultRelInfo,
 ^~~
execReplication.c:451:4: error: too few arguments to function 
‘ExecConstraints’

ExecConstraints(resultRelInfo, slot, estate);
^~~
In file included from execReplication.c:21:0:
../../../src/include/executor/executor.h:197:13: note: declared here
 extern void ExecConstraints(ResultRelInfo *resultRelInfo,
 ^~~
make[3]: *** [execReplication.o] Error 1
make[2]: *** [executor-recursive] Error 2
make[1]: *** [install-backend-recurse] Error 2
make: *** [install-src-recurse] Error 2



Erik Rijkers







Re: [HACKERS] [PATCH] Generic type subscription

2017-01-15 Thread Artur Zakirov
> Yes, but it was related to the idea of having `ArrayRef` and `JsonbRef` nodes
> for specific types. Since there is now a generic `SubscriptingRef` node, I
> think it should be ok.

Sorry I misunderstood it.

> Just to be clear - as far as I understood, these compilation problems were
> caused not because the extension knew something about ArrayRef node in
> particular, but because the extension tried to extract all nodes to generate
> code from them. It means any change will require "refetching", so I think it's
> natural for this extension.

Agreed. It would be hard to maintain both nodes, and it is not so smart
to have two nodes, ArrayRef (deprecated) and SubscriptingRef. It is not
hard to replace the ArrayRef node with SubscriptingRef in other
extensions.

There is a little note:

>  #include "utils/lsyscache.h"
> +#include "utils/syscache.h"
>  #include "utils/memutils.h"

I think "utils/syscache.h" isn't necessary here. PostgreSQL could be
compiled without this include.

I suppose that this patch can be marked as "Ready for committer". Any opinions?

-- 
Artur Zakirov
Postgres Professional: http://www.postgrespro.com
Russian Postgres Company




[HACKERS] FYI: git worktrees as replacement for "rsync the CVSROOT"

2017-01-15 Thread Jim Nasby
Not sure how many people still use [1], as referenced by our git
wiki[2], but it appears git worktrees are a viable replacement for
that technique. In short, if you're already in your checkout:


git worktree add ../9.6 REL9_6_STABLE

would give you a checkout of 9.6 in the ../9.6 directory.

BTW, I learned about this from this "git year in review" article[3].

1: https://www.postgresql.org/message-id/20090602162347.gf23...@yugib.highrise.ca
2: https://wiki.postgresql.org/wiki/Working_with_Git#Continuing_the_.22rsync_the_CVSROOT.22_workflow
3: https://hackernoon.com/git-in-2016-fad96ae22a15?imm_mid=0ec3e0=em-prog-na-na-newsltr_20170114#.shgj609ad

--
Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX
Experts in Analytics, Data Architecture and PostgreSQL
Data in Trouble? Get it in Treble! http://BlueTreble.com
855-TREBLE2 (855-873-2532)




Re: [HACKERS] Logical Replication WIP

2017-01-15 Thread Petr Jelinek
On 15/01/17 20:20, Euler Taveira wrote:
> On 15-01-2017 15:13, Petr Jelinek wrote:
>> I am not quite sure I agree with this. Either it's system object and we
>> don't replicate it (which I would have considered to be anything with
>> Oid < FirstNormalObjectId) or it's user made and then it should be
>> replicated. Filtering by schema name is IMHO way too fragile (what stops
>> user creating additional tables there for example).
>>
> What happens if you replicate information_schema tables? AFAICS, those
> tables are already in the subscriber database. And will it generate an
> error or a warning? (I'm not sure how this functionality deals with
> schemas.) Also, why would I want to replicate an information schema
> table? Their contents are static and, by default, already in each database.
> 
> Information schema isn't a catalog but I think it is good to exclude it
> from FOR ALL TABLES clause because the use case is almost zero. Of
> course, it should be documented. Also, if someone wants to replicate an
> information schema table, it could do it with ALTER PUBLICATION.
> 

Well, the preinstalled information_schema is excluded by the
FirstNormalObjectId filter, as it's created by initdb. If the user drops
and recreates it, that means it was created as a user object.

My opinion is that FOR ALL TABLES should replicate all user tables (i.e.,
anything that has Oid >= FirstNormalObjectId); if those are added to
information_schema, that's up to the user. We also replicate user-created
tables in pg_catalog even though it's a system catalog, so I don't see why
information_schema should be filtered on the schema level.
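
A sketch of the filter being discussed (illustrative; the enumeration
loop around it is assumed):

/* Treat anything created after initdb as a user table; everything
 * below FirstNormalObjectId is a system object and is skipped. */
if (HeapTupleGetOid(tuple) < FirstNormalObjectId)
    continue;       /* not part of FOR ALL TABLES */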

-- 
  Petr Jelinek  http://www.2ndQuadrant.com/
  PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] Logical Replication WIP

2017-01-15 Thread Euler Taveira
On 15-01-2017 15:13, Petr Jelinek wrote:
> I am not quite sure I agree with this. Either it's system object and we
> don't replicate it (which I would have considered to be anything with
> Oid < FirstNormalObjectId) or it's user made and then it should be
> replicated. Filtering by schema name is IMHO way too fragile (what stops
> user creating additional tables there for example).
> 
What happens if you replicate information_schema tables? AFAICS, those
tables are already in the subscriber database. And will it generate an
error or a warning? (I'm not sure how this functionality deals with
schemas.) Also, why would I want to replicate an information schema
table? Their contents are static and, by default, already in each database.

Information schema isn't a catalog but I think it is good to exclude it
from FOR ALL TABLES clause because the use case is almost zero. Of
course, it should be documented. Also, if someone wants to replicate an
information schema table, it could do it with ALTER PUBLICATION.


-- 
   Euler Taveira   Timbira - http://www.timbira.com.br/
   PostgreSQL: Consultoria, Desenvolvimento, Suporte 24x7 e Treinamento




Re: [HACKERS] Fixing matching of boolean index columns to sort ordering

2017-01-15 Thread Tom Lane
Michael Paquier  writes:
> Bah. I was sure I was missing something, still I would have thought
> that the index scan is cheaper than the bitmap index scan with ORDER
> BY. As far as I can see, this patch is not the most elegant thing, but
> it has value. So marked as "ready for committer".

Pushed, thanks for the review.

regards, tom lane




Re: [HACKERS] Logical Replication WIP

2017-01-15 Thread Petr Jelinek
On 06/01/17 21:26, Peter Eisentraut wrote:
> 0005-Add-separate-synchronous-commit-control-for-logical--v16.patch.gz
> 
> This looks a little bit hackish.  I'm not sure how this would behave
> properly when either synchronous_commit or
> logical_replication_synchronous_commit is changed at run time with a reload.
> 

Yes, I said in the initial email that this is meant for discussion and
not as a final implementation, and it's certainly not required for the
initial commit. Perhaps I should have started a separate thread for
this part.

> I'm thinking maybe this and perhaps some other WAL receiver settings
> should be properties of a subscription, like ALTER SUBSCRIPTION ...
> SET/RESET.
>

True, but we still need the GUC defaults.

> Actually, maybe I'm a bit confused what this is supposed to achieve.
> synchronous_commit has both a local and a remote meaning.  What behavior
> are the various combinations of physical and logical replication
> supposed to accomplish?
> 

It's meant to decouple the synchronous commit setting for logical
replication workers from the one set for normal clients. Now that we
have owners for subscriptions, and a subscription runs as its owner,
maybe we could do that via ALTER USER. However, I think the apply worker
should by default run with synchronous commit turned off, as the
performance benefits are important there: there is a single worker that
has to replicate in a serialized manner, and the success of replication
is not confirmed by responding to COMMIT but by reporting the LSNs of
the various replication stages.

Perhaps logical_replication_synchronous_commit should only be a
boolean that translates to 'off' or 'local' for the real
synchronous_commit.
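
That mapping would be a one-liner in the apply worker's startup; a
sketch, assuming the boolean GUC is named
logical_replication_synchronous_commit:

/* Map the boolean to the real synchronous_commit for this worker. */
SetConfigOption("synchronous_commit",
                logical_replication_synchronous_commit ? "local" : "off",
                PGC_BACKEND, PGC_S_OVERRIDE);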

-- 
  Petr Jelinek  http://www.2ndQuadrant.com/
  PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] Logical Replication WIP

2017-01-15 Thread Petr Jelinek
On 11/01/17 22:30, Peter Eisentraut wrote:
> On 1/11/17 3:35 PM, Petr Jelinek wrote:
>> On 11/01/17 18:32, Peter Eisentraut wrote:
>>> On 1/11/17 3:29 AM, Petr Jelinek wrote:
 Okay, looking into my notes, I originally did this because we did not
 allow adding tables without pkeys to publications which effectively
 prohibited FOR ALL TABLES publication from working because of
 information_schema without this. Since this is no longer the case I
 think it's safe to skip the FirstNormalObjectId check.
>>>
>>> Wouldn't that mean that FOR ALL TABLES replicates the tables from
>>> information_schema?
>>>
>>
>> Yes, as they are not catalog tables, I thought that was your point.
> 
> But we shouldn't do that.  So we need to exclude information_schema from
> "all tables" somehow.  Just probably not by OID, since that is not fixed.
> 

I am not quite sure I agree with this. Either it's a system object and we
don't replicate it (which I would have considered to be anything with
Oid < FirstNormalObjectId) or it's user-made and then it should be
replicated. Filtering by schema name is IMHO way too fragile (what stops
a user from creating additional tables there, for example?).

-- 
  Petr Jelinek  http://www.2ndQuadrant.com/
  PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] Patch to implement pg_current_logfile() function

2017-01-15 Thread Karl O. Pinc
On Sun, 15 Jan 2017 14:54:44 +0900
Michael Paquier wrote:

> I think that's all I have for this patch.

I'd like to submit with it an addendum patch that
makes symbols out of some constants.  I'll see if I can
get that done soon.


Karl 
Free Software:  "You don't pay back, you pay forward."
 -- Robert A. Heinlein




Re: [HACKERS] ICU integration

2017-01-15 Thread Pavel Stehule
Hi

2016-12-28 3:50 GMT+01:00 Peter Eisentraut:

> Updated patch attached.
>
> The previous round of reviews showed that there was general agreement on
> the approach.  So I have focused on filling in the gaps, added ICU
> support to all the locale-using places, added documentation, fixed some
> minor issues that have been pointed out.  Everything appears to work
> correctly now, and the user-facing feature set is pretty well-rounded.
>
> I don't have much experience with the abbreviated key stuff.  I have
> filled in what I think should work, but it needs detailed review.
>
> Similarly, some of the stuff in the regular expression code was hacked
> in blindly.
>
> One minor problem is that we now support adding any-encoding collation
> entries, which violates some of the comments in CollationCreate().  See
> FIXME there.  It doesn't seem worth major contortions to fix that; maybe
> it just has to be documented better.
>

the regression test fails:

 Program received signal SIGSEGV, Segmentation fault.
0x007bbc2b in pattern_char_isalpha (locale_is_c=0 '\000',
locale=0x1a73220, is_multibyte=1 '\001', c=97 'a') at selfuncs.c:5291
5291 return isalpha_l((unsigned char) c, locale->lt);


(gdb) bt
#0  0x007bbc2b in pattern_char_isalpha (locale_is_c=0 '\000',
locale=0x1a73220, is_multibyte=1 '\001', c=97 'a') at selfuncs.c:5291
#1  like_fixed_prefix (patt_const=,
case_insensitive=, collation=,
prefix_const=0x7ffc0963e800,
rest_selec=0x7ffc0963e808) at selfuncs.c:5389
#2  0x007c1076 in patternsel (fcinfo=,
ptype=ptype@entry=Pattern_Type_Like_IC, negate=negate@entry=0 '\000')
at selfuncs.c:1228
#3  0x007c1680 in iclikesel (fcinfo=) at
selfuncs.c:1406
#4  0x0080db56 in OidFunctionCall4Coll (functionId=,
collation=collation@entry=12886, arg1=arg1@entry=28299032,
arg2=arg2@entry=1627, arg3=arg3@entry=28300096, arg4=arg4@entry=0) at
fmgr.c:1674
#5  0x0068e424 in restriction_selectivity (root=root@entry=0x1afcf18,
operatorid=1627, args=0x1afd340, inputcollid=12886,
varRelid=varRelid@entry=0) at plancat.c:1583
#6  0x0065457e in clause_selectivity (root=0x1afcf18,
clause=, varRelid=0, jointype=JOIN_INNER, sjinfo=0x0)
at clausesel.c:657
#7  0x0065485c in clauselist_selectivity (root=root@entry=0x1afcf18,
clauses=, varRelid=varRelid@entry=0,
jointype=jointype@entry=JOIN_INNER, sjinfo=sjinfo@entry=0x0) at
clausesel.c:107
#8  0x006599d4 in set_baserel_size_estimates
(root=root@entry=0x1afcf18,
rel=rel@entry=0x1afd500) at costsize.c:3771
#9  0x006526e5 in set_plain_rel_size (rte=,
rel=, root=) at allpaths.c:492
#10 set_rel_size (root=root@entry=0x1afcf18, rel=rel@entry=0x1afd500,
rti=rti@entry=1, rte=0x1acfb68) at allpaths.c:352
#11 0x00653ebd in set_base_rel_sizes (root=) at
allpaths.c:272
#12 make_one_rel (root=root@entry=0x1afcf18, joinlist=joinlist@entry=0x1afd810)
at allpaths.c:170
#13 0x00670ad4 in query_planner (root=root@entry=0x1afcf18,
tlist=tlist@entry=0x1afd1b0,
qp_callback=qp_callback@entry=0x6710c0 ,
qp_extra=qp_extra@entry=0x7ffc0963f020) at planmain.c:254
#14 0x006727cc in grouping_planner (root=root@entry=0x1afcf18,
inheritance_update=inheritance_update@entry=0 '\000',
tuple_fraction=, tuple_fraction@entry=0) at
planner.c:1729
#15 0x006752c6 in subquery_planner (glob=glob@entry=0x1afcbe8,
parse=parse@entry=0x1acfa50, parent_root=parent_root@entry=0x0,
hasRecursion=hasRecursion@entry=0 '\000',
tuple_fraction=tuple_fraction@entry=0) at planner.c:789
#16 0x0067619f in standard_planner (parse=0x1acfa50,
cursorOptions=256, boundParams=0x0) at planner.c:301
#17 0x007095bd in pg_plan_query (querytree=0x1acfa50,
cursorOptions=256, boundParams=0x0) at postgres.c:798
#18 0x0070968e in pg_plan_queries (querytrees=,
cursorOptions=cursorOptions@entry=256, boundParams=boundParams@entry=0x0)
at postgres.c:864
#19 0x0070b1e7 in exec_simple_query (query_string=0x1ace8c8 "SELECT
* FROM collate_test1 WHERE b ILIKE 'abc';") at postgres.c:1029
#20 PostgresMain (argc=, argv=argv@entry=0x1a78988,
dbname=, username=) at postgres.c:4067
#21 0x0046fc7d in BackendRun (port=0x1a6e6e0) at postmaster.c:4300
#22 BackendStartup (port=0x1a6e6e0) at postmaster.c:3972

[root@localhost backend]# dnf info libicu
Last metadata expiration check: 1:50:20 ago on Sun Jan 15 09:58:29 2017.
Installed Packages
Name: libicu
Arch: x86_64
Epoch   : 0
Version : 57.1
Release : 4.fc25
Size: 29 M
Repo: @System

Regards

Pavel



> --
> Peter Eisentraut  http://www.2ndQuadrant.com/
> PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


Re: [HACKERS] Patch to implement pg_current_logfile() function

2017-01-15 Thread Gilles Darold
On 15/01/2017 at 06:54, Michael Paquier wrote:
> On Sat, Jan 14, 2017 at 10:02 PM, Gilles Darold
>  wrote:
>> On 13/01/2017 at 14:09, Michael Paquier wrote:
>>> On Fri, Jan 13, 2017 at 5:48 PM, Gilles Darold wrote:
 On 13/01/2017 at 05:26, Michael Paquier wrote:
> Surely the temporary file of current_logfiles should not be included
> in base backups (look at basebackup.c). Documentation needs to reflect
> that as well. Also, should we have current_logfiles in a base backup?
> I would think no.
 Done but can't find any documentation about the file exclusion, do you
 have a pointer?
>>> protocol.sgml, in the protocol-replication part. You could search for
>>> the paragraph that contains postmaster.opts. There is no real harm in
>>> including current_logfiles in base backups, but that's really in the
>>> same bag as postmaster.opts or postmaster.pid, particularly if the log
>>> file name has a timestamp.
>> Thanks for the pointer. After reading this part I think it might only
>> concern files that are critical at restore time. I still think that the
>> file need not be removed automatically from the backup. If it is
>> restored it has strictly no impact at all; it will be removed or
>> overwritten at startup. We can let the user choose to remove it from the
>> backup or not, and the file can be a help to find the latest log file
>> related to a backup.
> OK, no problem for me. I can see that your patch does the right thing
> to ignore the current_logfiles.tmp. Could you please update
> t/010_pg_basebackup.pl and add this file to the list of files filled
> with DONOTCOPY?
Applied in attached patch v22.

> pg_current_logfile() can be run by *any* user. We had better revoke
> its access in system_views.sql!
 Why? There is no special information returned by this function other
 than the path, and that can be retrieved using show log_directory.
>>> log_directory could be an absolute path, and we surely don't want to
>>> let normal users have a look at it. For example, "show log_directory"
>>> can only be seen by superusers. As Stephen Frost has been for a couple
>>> of months (years?) on a holy war path against superuser checks in
>>> system functions, I think that revoking access to this function is
>>> the best thing to do. It is also easier to restrict first and relax
>>> later; the reverse is harder to justify from a compatibility point of
>>> view.
>> Right, I've appended a REVOKE to the function in
>> backend/catalog/system_views.sql; patch v21 reflects this change.
> Thanks. That looks good.
>
> +   {
> +   /* Logging collector is not enabled. We don't know where messages are
> +* logged.  Remove outdated file holding the current log filenames.
> +*/
> +   unlink(LOG_METAINFO_DATAFILE);
> return 0
> This comment format is not postgres-like.

Fixed too.
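
(For reference, the project's multi-line comment style for that block
looks like this:)

	{
		/*
		 * Logging collector is not enabled.  We don't know where messages
		 * are logged.  Remove the outdated file holding the current log
		 * file names.
		 */
		unlink(LOG_METAINFO_DATAFILE);
		return 0;
	}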


-- 
Gilles Darold
Consultant PostgreSQL
http://dalibo.com - http://dalibo.org

diff --git a/doc/src/sgml/config.sgml b/doc/src/sgml/config.sgml
index 07afa3c..c44984d 100644
--- a/doc/src/sgml/config.sgml
+++ b/doc/src/sgml/config.sgml
@@ -4211,6 +4211,17 @@ SELECT * FROM parent WHERE key = 2400;
  server log
 
 
+
+ When logs are written to the file system, the  file includes the type,
+ the location and the name of the logs currently in use. This provides a
+ convenient way to find the logs currently in use by the instance.
+
+$ cat $PGDATA/current_logfiles
+stderr pg_log/postgresql-2017-01-13_085812.log
+
+
+
 
  Where To Log
 
diff --git a/doc/src/sgml/func.sgml b/doc/src/sgml/func.sgml
index 10e3186..693669b 100644
--- a/doc/src/sgml/func.sgml
+++ b/doc/src/sgml/func.sgml
@@ -15441,6 +15441,13 @@ SELECT * FROM pg_ls_dir('.') WITH ORDINALITY AS t(ls,n);
   
 
   
+   pg_current_logfile(text)
+   text
+   Primary log file name, or log in the requested format,
+   currently in use by the logging collector
+  
+
+  
pg_my_temp_schema()
oid
OID of session's temporary schema, or 0 if none
@@ -15658,6 +15665,39 @@ SET search_path TO schema , schema, ..
 the time when the postmaster process re-read the configuration files.)

 
+
+pg_current_logfile
+   
+
+   
+Logging
+pg_current_logfile function
+   
+
+   
+pg_current_logfile returns, as text,
+the path of either the csv or stderr log file currently in use by the
+logging collector.  This is a path including the
+ directory and the log file name.
+Log collection must be active or the return value
+is NULL.  When multiple log files exist, each in a
+different format, pg_current_logfile called
+without arguments returns the path of the file having the first format
+found in the ordered
+list: stderr, csvlog.
+NULL is returned when no log file has any of these
+formats.  To request a specific file format, supply,
+as