from:"Fujii Masao"

Re: [HACKERS] list of credits for release notes

2017-10-03 Thread Fujii Masao

On Tue, Oct 3, 2017 at 9:57 PM, Tatsuo Ishii <is...@sraoss.co.jp> wrote:
>> On Mon, Oct 2, 2017 at 5:09 PM, Tatsuo Ishii <is...@sraoss.co.jp> wrote:
>>> Michael is correct.
>>>
>>> Sometimes people choose family name first in the emails.  However I am
>>> sure "Fujii" is the family name and "Masao" is the firstname.
>>
>> But I don't think that directly answers the question of how he would
>> prefer to be credited.
>
> Of course. It's different story.
>
>> Since I am American, I prefer to be credited
>> using the style "${FIRSTNAME} ${LASTNAME}", but the preferences of
>> someone from another country might be the same or different.  I don't
>> think we should presume that someone prefers something other than the
>> style in their email name unless they say so.
>
> My concern is that the list seems implicitly assumes that each name
> appears first name then last name. So people might misunderstand that
> "Fujii" is the first name and "Masao" is the last name. I don't how he
> actually feels about that, but if I were him, I would feel
> uncomfortable. If the list explicitly stats that we do not guarantee
> that the order of the last names and the first names are correct, then
> that kind of misunderstanding could be avoided.

Yeah, your concern might be right, but I prefer Fujii Masao,
i.e., family-name-first style, so I have used that name in my email
and past release notes so far.

Anyway, thanks for your kind consideration!

Regards,

-- 
Fujii Masao


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] subscription worker doesn't start immediately on eabled

2017-04-27 Thread Fujii Masao

On Thu, Apr 27, 2017 at 6:32 PM, Kyotaro HORIGUCHI
<horiguchi.kyot...@lab.ntt.co.jp> wrote:
> At Thu, 27 Apr 2017 00:51:03 +0900, Fujii Masao <masao.fu...@gmail.com> wrote 
> in

Re: [HACKERS] some review comments on logical rep code

2017-04-27 Thread Fujii Masao

On Thu, Apr 27, 2017 at 5:37 PM, Petr Jelinek
<petr.jeli...@2ndquadrant.com> wrote:
> On 26/04/17 18:36, Fujii Masao wrote:
>> On Thu, Apr 27, 2017 at 1:28 AM, Fujii Masao <masao.fu...@gmail.com> wrote:
>>> On Wed, Apr 26, 2017 at 3:47 PM, Kyotaro HORIGUCHI
>>> <horiguchi.kyot...@lab.ntt.co.jp> wrote:
>>>> At Wed, 26 Apr 2017 14:31:12 +0900, Masahiko Sawada 
>>>> <sawada.m...@gmail.com> wrote in 
>>>> <CAD21AoDMy8a6UwMrRh8pigQbDC+JAOQ4m_tXT41VRP4SYp23=w...@mail.gmail.com>
>>>>> On Wed, Apr 26, 2017 at 12:35 PM, Petr Jelinek
>>>>> <petr.jeli...@2ndquadrant.com> wrote:
>>>>>> On 26/04/17 01:01, Fujii Masao wrote:
>>>>>>>>> However this is overkill for small gain and false wakeup of the
>>>>>>>>> launcher is not so harmful (probably we can live with that), so
>>>>>>>>> we do nothing here for this issue.
>>>>>>>>
>>>>>>>> I agree this as a whole. But I think that the main issue here is
>>>>>>>> not false wakeups, but 'possible delay of launching new workers
>>>>>>>> by 3 minutes at most' (this is centainly a kind of false wakeups,
>>>>>>>> though). We can live with this failure when using two-paase
>>>>>>>> commmit, but I think it shouldn't happen silently.
>>>>>>>>
>>>>>>>>
>>>>>>>> How about providing AtPrepare_ApplyLauncher(void) like the
>>>>>>>> follows and calling it in PrepareTransaction?
>>>>>>>
>>>>>>> Or we should apply the attached patch and handle the 2PC case properly?
>>>>>>> I was thinking that it's overkill more than necessary, but that seems 
>>>>>>> not true
>>>>>>> as far as I implement that.
>>>>>>>
>>>>>> Looks like it does not even increase size of the 2pc file, +1 for this.
>>>>>
>>>>> In my honest opinion, I didn't have a big will that we should handle
>>>>> even two-phase commit case, because this case is very rare (I could
>>>>> not image such case) and doesn't mean to lead a harmful result such as
>>>>> crash of server and returning inconsistent result. it just delays the
>>>>> launching worker for at most 3 minutes. We also can deal with this for
>>>>> example by making maximum nap time of apply launcher user-configurable
>>>>> and document it.
>>>>> But if we can deal with it by minimum changes like attached your patch I 
>>>>> agree.
>>>>
>>>> This change looks reasonable to me, +1 from me to this.
>>>>
>>>> The patch reads on_commit_launcher_wakeup directly then updates
>>>> it via ApplyALuncherWakeupAtCommit() but it's too much to add a
>>>> function for the sake of this.
>>>
>>> OK, so what about the attached patch? I replaced all the calls to
>>> ApplyLauncherWakeupAtCommit() with the code "on_commit_launcher_wakeup = 
>>> true".
>>
>> BTW, while I was reading the code to implement the patch that
>> I posted upthread, I found that the following condition doesn't
>> work as expected. This is because "last_start_time" is always 0.
>>
>> /* Limit the start retry to once a wal_retrieve_retry_interval */
>> if (TimestampDifferenceExceeds(last_start_time, now,
>>   wal_retrieve_retry_interval))
>>
>> "last_start_time" is set to "now" when the launcher starts up new
>> worker. But "last_start_time" is defined and always initialized with 0
>> at the beginning of the main loop in ApplyLauncherMain(), so
>> the above condition becomes always true. This is obviously a bug.
>> Attached patch fixes this issue.
>>
>
> Yes, please go ahead and commit this.

Pushed. Thanks!

Regards,

-- 
Fujii Masao


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] some review comments on logical rep code

2017-04-26 Thread Fujii Masao

On Thu, Apr 27, 2017 at 1:28 AM, Fujii Masao <masao.fu...@gmail.com> wrote:
> On Wed, Apr 26, 2017 at 3:47 PM, Kyotaro HORIGUCHI
> <horiguchi.kyot...@lab.ntt.co.jp> wrote:
>> At Wed, 26 Apr 2017 14:31:12 +0900, Masahiko Sawada <sawada.m...@gmail.com> 
>> wrote in 
>> <CAD21AoDMy8a6UwMrRh8pigQbDC+JAOQ4m_tXT41VRP4SYp23=w...@mail.gmail.com>
>>> On Wed, Apr 26, 2017 at 12:35 PM, Petr Jelinek
>>> <petr.jeli...@2ndquadrant.com> wrote:
>>> > On 26/04/17 01:01, Fujii Masao wrote:
>>> >>>> However this is overkill for small gain and false wakeup of the
>>> >>>> launcher is not so harmful (probably we can live with that), so
>>> >>>> we do nothing here for this issue.
>>> >>>
>>> >>> I agree this as a whole. But I think that the main issue here is
>>> >>> not false wakeups, but 'possible delay of launching new workers
>>> >>> by 3 minutes at most' (this is centainly a kind of false wakeups,
>>> >>> though). We can live with this failure when using two-paase
>>> >>> commmit, but I think it shouldn't happen silently.
>>> >>>
>>> >>>
>>> >>> How about providing AtPrepare_ApplyLauncher(void) like the
>>> >>> follows and calling it in PrepareTransaction?
>>> >>
>>> >> Or we should apply the attached patch and handle the 2PC case properly?
>>> >> I was thinking that it's overkill more than necessary, but that seems 
>>> >> not true
>>> >> as far as I implement that.
>>> >>
>>> > Looks like it does not even increase size of the 2pc file, +1 for this.
>>>
>>> In my honest opinion, I didn't have a big will that we should handle
>>> even two-phase commit case, because this case is very rare (I could
>>> not image such case) and doesn't mean to lead a harmful result such as
>>> crash of server and returning inconsistent result. it just delays the
>>> launching worker for at most 3 minutes. We also can deal with this for
>>> example by making maximum nap time of apply launcher user-configurable
>>> and document it.
>>> But if we can deal with it by minimum changes like attached your patch I 
>>> agree.
>>
>> This change looks reasonable to me, +1 from me to this.
>>
>> The patch reads on_commit_launcher_wakeup directly then updates
>> it via ApplyALuncherWakeupAtCommit() but it's too much to add a
>> function for the sake of this.
>
> OK, so what about the attached patch? I replaced all the calls to
> ApplyLauncherWakeupAtCommit() with the code "on_commit_launcher_wakeup = 
> true".

BTW, while I was reading the code to implement the patch that
I posted upthread, I found that the following condition doesn't
work as expected. This is because "last_start_time" is always 0.

/* Limit the start retry to once a wal_retrieve_retry_interval */
if (TimestampDifferenceExceeds(last_start_time, now,
  wal_retrieve_retry_interval))

"last_start_time" is set to "now" when the launcher starts up new
worker. But "last_start_time" is defined and always initialized with 0
at the beginning of the main loop in ApplyLauncherMain(), so
the above condition becomes always true. This is obviously a bug.
Attached patch fixes this issue.

Regards,

-- 
Fujii Masao


bugfix.patch
Description: Binary data

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] some review comments on logical rep code

2017-04-26 Thread Fujii Masao

On Wed, Apr 26, 2017 at 3:47 PM, Kyotaro HORIGUCHI
<horiguchi.kyot...@lab.ntt.co.jp> wrote:
> At Wed, 26 Apr 2017 14:31:12 +0900, Masahiko Sawada <sawada.m...@gmail.com> 
> wrote in 
> <CAD21AoDMy8a6UwMrRh8pigQbDC+JAOQ4m_tXT41VRP4SYp23=w...@mail.gmail.com>
>> On Wed, Apr 26, 2017 at 12:35 PM, Petr Jelinek
>> <petr.jeli...@2ndquadrant.com> wrote:
>> > On 26/04/17 01:01, Fujii Masao wrote:
>> >>>> However this is overkill for small gain and false wakeup of the
>> >>>> launcher is not so harmful (probably we can live with that), so
>> >>>> we do nothing here for this issue.
>> >>>
>> >>> I agree this as a whole. But I think that the main issue here is
>> >>> not false wakeups, but 'possible delay of launching new workers
>> >>> by 3 minutes at most' (this is centainly a kind of false wakeups,
>> >>> though). We can live with this failure when using two-paase
>> >>> commmit, but I think it shouldn't happen silently.
>> >>>
>> >>>
>> >>> How about providing AtPrepare_ApplyLauncher(void) like the
>> >>> follows and calling it in PrepareTransaction?
>> >>
>> >> Or we should apply the attached patch and handle the 2PC case properly?
>> >> I was thinking that it's overkill more than necessary, but that seems not 
>> >> true
>> >> as far as I implement that.
>> >>
>> > Looks like it does not even increase size of the 2pc file, +1 for this.
>>
>> In my honest opinion, I didn't have a big will that we should handle
>> even two-phase commit case, because this case is very rare (I could
>> not image such case) and doesn't mean to lead a harmful result such as
>> crash of server and returning inconsistent result. it just delays the
>> launching worker for at most 3 minutes. We also can deal with this for
>> example by making maximum nap time of apply launcher user-configurable
>> and document it.
>> But if we can deal with it by minimum changes like attached your patch I 
>> agree.
>
> This change looks reasonable to me, +1 from me to this.
>
> The patch reads on_commit_launcher_wakeup directly then updates
> it via ApplyALuncherWakeupAtCommit() but it's too much to add a
> function for the sake of this.

OK, so what about the attached patch? I replaced all the calls to
ApplyLauncherWakeupAtCommit() with the code "on_commit_launcher_wakeup = true".

Regards,

-- 
Fujii Masao


002_Reset_on_commit_launcher_wakeup_v6.patch
Description: Binary data

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] subscription worker doesn't start immediately on eabled

2017-04-26 Thread Fujii Masao

On Wed, Apr 26, 2017 at 4:03 PM, Kyotaro HORIGUCHI
<horiguchi.kyot...@lab.ntt.co.jp> wrote:
> At Tue, 25 Apr 2017 14:45:03 -0400, Peter Eisentraut 
> <peter.eisentr...@2ndquadrant.com> wrote in 
> <3d6a1bd0-08ce-301d-3336-ec9f623a3...@2ndquadrant.com>
>> On 4/6/17 08:24, Kyotaro HORIGUCHI wrote:
>> > The attached patch wakes up launcher when a subscription is
>> > enabled. This fails when a subscription is enabled immedaitely
>> > after disabling but it won't be a matter.
>>
>> committed, thanks
>
> Thanks!

This patch makes me think that CREATE SUBSCRIPTION should also wake up
the launcher only when ENABLE is specified. Patch attached. Thought?

Regards,

-- 
Fujii Masao


wakeup_launcher_when_enabled.patch
Description: Binary data

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Fix a typo in subscriptioncmd.c

2017-04-26 Thread Fujii Masao

On Wed, Apr 26, 2017 at 3:07 PM, Masahiko Sawada <sawada.m...@gmail.com> wrote:
> Hi,
>
> Attached patch for $subject.
>
> s/accomodate/accommodate/

Pushed. Thanks!

Regards,

-- 
Fujii Masao


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] some review comments on logical rep code

2017-04-25 Thread Fujii Masao

On Mon, Apr 24, 2017 at 7:57 PM, Kyotaro HORIGUCHI
<horiguchi.kyot...@lab.ntt.co.jp> wrote:
> Hello,
>
> At Mon, 24 Apr 2017 11:18:32 +0900, Masahiko Sawada <sawada.m...@gmail.com> 
> wrote in <cad21aobu8mzdgx-stj3u+qkaej5rpnouotw4kfexc4xddnf...@mail.gmail.com>
>> >> BEGIN;
>> >> ALTER SUBSCRIPTION hoge_sub ENABLE;
>> >> PREPARE TRANSACTION 'g';
>> >> BEGIN;
>> >> SELECT 1;
>> >> COMMIT;  -- wake up the launcher at this point.
>> >> COMMIT PREPARED 'g';
>> >>
>> >> Even if we wake up the launcher even when a ROLLBACK PREPARED, it's
>> >> not a big deal to the launcher process actually. Compared with the
>> >> complexly of changing the logic so that on_commit_launcher_wakeup
>> >> corresponds to prepared transaction, we might want to accept this
>> >> behavior.
>> >
>> > on_commit_launcher_wakeup needs to be recoreded in 2PC state
>> > file to handle this issue properly. But I agree with you, that would
>> > be overkill for small gain. So I'm thinking to add the following
>> > comment into your patch and push it. Thought?
>> >
>> > -
>> > Note that on_commit_launcher_wakeup flag doesn't work as expected in 2PC 
>> > case.
>> > For example, COMMIT PREPARED on the transaction enabling the flag
>> > doesn't wake up
>> > the logical replication launcher if ROLLBACK on another transaction
>> > runs before it.
>> > To handle this case properly, the flag needs to be recorded in 2PC
>> > state file so that
>> > COMMIT PREPARED and ROLLBACK PREPARED can determine whether to wake up
>> > the launcher. However this is overkill for small gain and false wakeup
>> > of the launcher
>> > is not so harmful (probably we can live with that), so we do nothing
>> > here for this issue.
>> > -
>> >
>>
>> Agreed.
>>
>> I added this comment to the patch and attached it.
>
> The following "However" may need a follwoing comma.
>
>> However this is overkill for small gain and false wakeup of the
>> launcher is not so harmful (probably we can live with that), so
>> we do nothing here for this issue.
>
> I agree this as a whole. But I think that the main issue here is
> not false wakeups, but 'possible delay of launching new workers
> by 3 minutes at most' (this is centainly a kind of false wakeups,
> though). We can live with this failure when using two-paase
> commmit, but I think it shouldn't happen silently.
>
>
> How about providing AtPrepare_ApplyLauncher(void) like the
> follows and calling it in PrepareTransaction?

Or we should apply the attached patch and handle the 2PC case properly?
I was thinking that it's overkill more than necessary, but that seems not true
as far as I implement that.

Regards,

-- 
Fujii Masao


002_Reset_on_commit_launcher_wakeup_v5.patch
Description: Binary data

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Quorum commit for multiple synchronous replication.

2017-04-25 Thread Fujii Masao

On Tue, Apr 25, 2017 at 5:41 PM, Kyotaro HORIGUCHI
<horiguchi.kyot...@lab.ntt.co.jp> wrote:
> At Tue, 25 Apr 2017 09:22:59 +0900, Masahiko Sawada <sawada.m...@gmail.com> 
> wrote in <cad21aoag88zyuwhv9l5munx-qpsb+agzerfdd0jddvom25g...@mail.gmail.com>
>> >> Please observe the policy on open item ownership[1] and send a status 
>> >> update
>> >> within three calendar days of this message.  Include a date for your
>> >> subsequent status update.
>> >
>> > Okay, so our consensus is to always set the priorities of sync standbys
>> > to 1 in quorum-based syncrep case. Attached patch does this change.
>> > Barrying any objection, I will commit this.
>>
>> +1
>
> Ok, +1 from me.

Pushed. Thanks!

Regards,

-- 
Fujii Masao


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Separation walsender & normal backends

2017-04-25 Thread Fujii Masao

On Tue, Apr 25, 2017 at 11:34 PM, Tom Lane <t...@sss.pgh.pa.us> wrote:
> Andres Freund <and...@anarazel.de> writes:
>> I've for a while suspected that the separation & duplication of
>> infrastructure between walsenders and normal backends isn't nice.
>
> I think we should consider a more radical solution: trying to put
> general SQL query capability into the replication protocol was a
> bad idea and we should revert it while we still can.  The uglinesses
> you mention aren't merely implementation issues, they're an indication
> that that concept is broken in itself.

I think that it's worth considering this option in order to "stabilize"
logical replication stuff before the release. The table sync patch
(which allows walsender to run normal queries) introduced such
uglinesses and increased the complexity in logical rep code.
OTOH, I believe that logical replication is still useful even without
initial table sync feature. So reverting the table sync patch seems
possible idea.

Regards,

-- 
Fujii Masao

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Remove dead interfaces added by mistake in 7c4f52409

2017-04-24 Thread Fujii Masao

On Sun, Apr 23, 2017 at 8:15 AM, Petr Jelinek
<petr.jeli...@2ndquadrant.com> wrote:
> Hi,
>
> Seems like couple of declarations for couple of functions we never
> actually implemented and are not used got past review of logical
> replication support for initial copy path (commit 7c4f52409).
>
> Attached patch gets rid of them.

Pushed. Thanks!

Regards,

-- 
Fujii Masao


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Quorum commit for multiple synchronous replication.

2017-04-24 Thread Fujii Masao

On Mon, Apr 24, 2017 at 2:55 PM, Masahiko Sawada <sawada.m...@gmail.com> wrote:
> On Thu, Apr 20, 2017 at 9:31 AM, Kyotaro HORIGUCHI
> <horiguchi.kyot...@lab.ntt.co.jp> wrote:
>> Ok, I got the point.
>>
>> At Wed, 19 Apr 2017 17:39:01 +0900 (Tokyo Standard Time), Kyotaro HORIGUCHI 
>> <horiguchi.kyot...@lab.ntt.co.jp> wrote in 
>> <20170419.173901.16598616.horiguchi.kyot...@lab.ntt.co.jp>
>>> > >> |
>>> > >> | Quorum-based synchronous replication is basically more
>>> > >> | efficient than priority-based one when you specify multiple
>>> > >> | standbys in synchronous_standby_names and want
>>> > >> | to synchronously replicate transactions to two or more of
>>> > >> | them.
>>
>> "Some" means "not all".
>>
>>> > >> | In the priority-based case, the replication master
>>> > >> | must wait for a reply from the slowest standby in the
>>> > >> | required number of standbys in priority order, which may
>>> > >> | slower than the rest.
>>
>>
>> Quorum-based synchronous replication is expected to be more
>> efficient than priority-based one when your master doesn't need
>> to be in sync with all of the nominated standbys by
>> synchronous_standby_names.

This description may be invalid in the case where the requested number
of sync standbys is smaller than the number of "nominated" standbys by
s_s_names. For example, please imagine the case where there are five
standbys nominated by s_s_name, the requested number of sync standbys
is 2, and only two sync standbys are running. In this case, the master
needs to wait for those two standbys whatever the sync rep method is.
I think that we should rewrite that to something like "quorum-based
synchronous replication is more effecient when the requested number
of synchronous standbys is smaller than the number of potential
synchronous standbys running".

>  While quorum-based
>> replication master waits only for a specified number of fastest
>> standbys, priority-based replicatoin master must wait for
>> standbys at the top of the list, which may be slower than the
>> rest.
>
> This description looks good to me. I've updated the patch based on
> this description and attached it.

But I still think that the original description that I used in my patch is
better than this

Regards,

-- 
Fujii Masao


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Quorum commit for multiple synchronous replication.

2017-04-24 Thread Fujii Masao

On Mon, Apr 24, 2017 at 9:02 AM, Noah Misch <n...@leadboat.com> wrote:
> On Thu, Apr 20, 2017 at 11:34:34PM -0700, Noah Misch wrote:
>> On Fri, Apr 21, 2017 at 01:20:05PM +0900, Masahiko Sawada wrote:
>> > On Fri, Apr 21, 2017 at 12:02 PM, Noah Misch <n...@leadboat.com> wrote:
>> > > On Wed, Apr 19, 2017 at 01:52:53PM +0900, Masahiko Sawada wrote:
>> > >> On Wed, Apr 19, 2017 at 12:34 PM, Noah Misch <n...@leadboat.com> wrote:
>> > >> > On Sun, Apr 16, 2017 at 07:25:28PM +0900, Fujii Masao wrote:
>> > >> >> As I told firstly this is not a bug. There are some proposals for 
>> > >> >> better design
>> > >> >> of priority column in pg_stat_replication, but we've not reached the 
>> > >> >> consensus
>> > >> >> yet. So I think that it's better to move this open item to "Design 
>> > >> >> Decisions to
>> > >> >> Recheck Mid-Beta" section so that we can hear more opinions.
>> > >> >
>> > >> > I'm reading that some people want to report NULL priority, some 
>> > >> > people want to
>> > >> > report a constant 1 priority, and nobody wants the current behavior.  
>> > >> > Is that
>> > >> > an accurate summary?
>> > >>
>> > >> Yes, I think that's correct.
>> > >
>> > > Okay, but ...
>> > >
>> > >> FWIW the reason of current behavior is that it would be useful for the
>> > >> user who is willing to switch from ANY to FIRST. They can know which
>> > >> standbys will become sync or potential.
>> > >
>> > > ... does this mean you personally want to keep the current behavior?  If 
>> > > not,
>> > > has some other person stated a wish to keep the current behavior?
>> >
>> > No, I want to change the current behavior. IMO it's better to set
>> > priority 1 to all standbys in quorum set. I guess there is no longer
>> > person who supports the current behavior.
>>
>> In that case, this open item is not eligible for section "Design Decisions to
>> Recheck Mid-Beta".  That section is for items where we'll probably change
>> nothing, but we plan to recheck later just in case.  Here, we expect to 
>> change
>> the behavior; the open question is which replacement behavior to prefer.
>>
>> Fujii, as the owner of this open item, you are responsible for moderating the
>> debate until there's adequate consensus to make a particular change or to 
>> keep
>> the current behavior after all.  Please proceed to do that.  Beta testers
>> deserve a UI they may like, not a UI you already plan to change later.
>
> Please observe the policy on open item ownership[1] and send a status update
> within three calendar days of this message.  Include a date for your
> subsequent status update.

Okay, so our consensus is to always set the priorities of sync standbys
to 1 in quorum-based syncrep case. Attached patch does this change.
Barrying any objection, I will commit this.

I will commit something to close this open item by April 28th at the latest
(IOW before my vacation starts).

Regards,

-- 
Fujii Masao


sync_priority.patch
Description: Binary data

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] some review comments on logical rep code

2017-04-21 Thread Fujii Masao

On Fri, Apr 21, 2017 at 4:02 PM, Masahiko Sawada <sawada.m...@gmail.com> wrote:
> On Fri, Apr 21, 2017 at 1:19 AM, Fujii Masao <masao.fu...@gmail.com> wrote:
>> On Tue, Apr 18, 2017 at 5:16 PM, Masahiko Sawada <sawada.m...@gmail.com> 
>> wrote:
>>> On Tue, Apr 18, 2017 at 12:24 PM, Kyotaro HORIGUCHI
>>> <horiguchi.kyot...@lab.ntt.co.jp> wrote:
>>>> Hi,
>>>>
>>>> Thank you for the revised version.
>>>>
>>>> At Mon, 17 Apr 2017 23:29:28 +0900, Masahiko Sawada 
>>>> <sawada.m...@gmail.com> wrote in 
>>>> <CAD21AoA+yFJTbw0PCw-ttmh9TsTygM3=iavxf+5bx4o9-ux...@mail.gmail.com>
>>>>> On Mon, Apr 17, 2017 at 9:13 PM, Masahiko Sawada <sawada.m...@gmail.com> 
>>>>> wrote:
>>>>> > On Mon, Apr 17, 2017 at 7:39 PM, Kyotaro HORIGUCHI
>>>>> > <horiguchi.kyot...@lab.ntt.co.jp> wrote:
>>>>> >> At Mon, 17 Apr 2017 18:02:57 +0900, Masahiko Sawada 
>>>>> >> <sawada.m...@gmail.com> wrote in 
>>>>> >>

Re: [HACKERS] some review comments on logical rep code

2017-04-20 Thread Fujii Masao

On Thu, Apr 20, 2017 at 12:05 PM, Noah Misch <n...@leadboat.com> wrote:
> On Sun, Apr 16, 2017 at 06:14:49AM +, Noah Misch wrote:
>> On Fri, Apr 14, 2017 at 04:47:12AM +0900, Fujii Masao wrote:
>> > Though I've read only a part of the logical rep code yet, I'd like to
>> > share some (relatively minor) review comments that I got so far.
>> >
>> > In ApplyWorkerMain(), DatumGetInt32() should be used to get integer
>> > value from the argument, instead of DatumGetObjectId().
>> >
>> > No one resets on_commit_launcher_wakeup flag to false.
>> >
>> > ApplyLauncherWakeup() should be static function.
>> >
>> > Seems not only CREATE SUBSCRIPTION but also ALTER SUBSCRIPTION's
>> > subcommands (e.g., ENABLED one) should wake the launcher up on commit.
>> >
>> > Why is IsBackendPid() check necessary in ApplyLauncherWakeup()?
>> >
>> > Normal statements that the walsender for logical rep runs are logged
>> > by log_replication_commands. So if log_statement is also set,
>> > those commands are logged twice.
>> >
>> > LogicalRepCtx->launcher_pid isn't reset to 0 when the launcher emits
>> > an error. The callback function to reset it needs to be registered
>> > via on_shmem_exit().
>> >
>> > Typo: "an subscriber" should be "a subscriber" in some places.
>> >
>> > /* Get conninfo */
>> > datum = SysCacheGetAttr(SUBSCRIPTIONOID,
>> > tup,
>> > Anum_pg_subscription_subconninfo,
>> > );
>> > Assert(!isnull);
>> >
>> > This assertion makes me think that pg_subscription.subconnifo should
>> > have NOT NULL constraint. Also subslotname and subpublications
>> > should have NOT NULL constraint because of the same reason.
>> >
>> > SpinLockAcquire(>relmutex);
>> > MyLogicalRepWorker->relstate =
>> >   GetSubscriptionRelState(MyLogicalRepWorker->subid,
>> > MyLogicalRepWorker->relid,
>> > >relstate_lsn,
>> > false);
>> > SpinLockRelease(>relmutex);
>> >
>> > Non-"short-term" function like GetSubscriptionRelState() should not
>> > be called while spinlock is being held.
>>
>> [Action required within three days.  This is a generic notification.]
>>
>> The above-described topic is currently a PostgreSQL 10 open item.  Peter,
>> since you committed the patch believed to have created it, you own this open
>> item.  If some other commit is more relevant or if this does not belong as a
>> v10 open item, please let us know.  Otherwise, please observe the policy on
>> open item ownership[1] and send a status update within three calendar days of
>> this message.  Include a date for your subsequent status update.  Testers may
>> discover new open items at any time, and I want to plan to get them all fixed
>> well in advance of shipping v10.  Consequently, I will appreciate your 
>> efforts
>> toward speedy resolution.  Thanks.
>>
>> [1] 
>> https://www.postgresql.org/message-id/20170404140717.GA2675809%40tornado.leadboat.com
>
> This PostgreSQL 10 open item is past due for your status update.  Kindly send
> a status update within 24 hours, and include a date for your subsequent status
> update.  Refer to the policy on open item ownership:
> https://www.postgresql.org/message-id/20170404140717.GA2675809%40tornado.leadboat.com

I'm not the owner of this item, but the current status is;

I've pushed several patches, and there is now only one remaining patch.
I posted the review comment on that patch, and I'm expecting that
Masahiko-san will update the patch. So what about waiting for the updated
version of the patch by next Monday (April 24th)?

Regards,

-- 
Fujii Masao


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] some review comments on logical rep code

2017-04-20 Thread Fujii Masao

On Tue, Apr 18, 2017 at 5:16 PM, Masahiko Sawada  wrote:
> On Tue, Apr 18, 2017 at 12:24 PM, Kyotaro HORIGUCHI
>  wrote:
>> Hi,
>>
>> Thank you for the revised version.
>>
>> At Mon, 17 Apr 2017 23:29:28 +0900, Masahiko Sawada  
>> wrote in 
>>> On Mon, Apr 17, 2017 at 9:13 PM, Masahiko Sawada  
>>> wrote:
>>> > On Mon, Apr 17, 2017 at 7:39 PM, Kyotaro HORIGUCHI
>>> >  wrote:
>>> >> At Mon, 17 Apr 2017 18:02:57 +0900, Masahiko Sawada 
>>> >>  wrote in 
>>> >>

Re: [HACKERS] Quorum commit for multiple synchronous replication.

2017-04-18 Thread Fujii Masao

On Tue, Apr 18, 2017 at 7:02 PM, Masahiko Sawada <sawada.m...@gmail.com> wrote:
> On Tue, Apr 18, 2017 at 6:40 PM, Kyotaro HORIGUCHI
> <horiguchi.kyot...@lab.ntt.co.jp> wrote:
>> At Tue, 18 Apr 2017 14:58:50 +0900, Masahiko Sawada <sawada.m...@gmail.com> 
>> wrote in <cad21aobqsjugx0lcdrjedlb-yx2evglmdt8nz4zr_xpxrbm...@mail.gmail.com>
>>> On Tue, Apr 18, 2017 at 3:04 AM, Fujii Masao <masao.fu...@gmail.com> wrote:
>>> > On Wed, Apr 12, 2017 at 2:36 AM, Masahiko Sawada <sawada.m...@gmail.com> 
>>> > wrote:
>>> >> On Thu, Apr 6, 2017 at 4:17 PM, Masahiko Sawada <sawada.m...@gmail.com> 
>>> >> wrote:
>>> >>> On Thu, Apr 6, 2017 at 10:51 AM, Noah Misch <n...@leadboat.com> wrote:
>>> >>>> On Thu, Apr 06, 2017 at 12:48:56AM +0900, Fujii Masao wrote:
>>> >>>>> On Wed, Apr 5, 2017 at 3:45 PM, Noah Misch <n...@leadboat.com> wrote:
>>> >>>>> > On Mon, Dec 19, 2016 at 09:49:58PM +0900, Fujii Masao wrote:
>>> >>>>> >> Regarding this feature, there are some loose ends. We should work 
>>> >>>>> >> on
>>> >>>>> >> and complete them until the release.
>>> >>>>> >>
>>> >>>>> >> (1)
>>> >>>>> >> Which synchronous replication method, priority or quorum, should be
>>> >>>>> >> chosen when neither FIRST nor ANY is specified in s_s_names? Right 
>>> >>>>> >> now,
>>> >>>>> >> a priority-based sync replication is chosen for keeping backward
>>> >>>>> >> compatibility. However some hackers argued to change this decision
>>> >>>>> >> so that a quorum commit is chosen because they think that most 
>>> >>>>> >> users
>>> >>>>> >> prefer to a quorum.
>>> >>>>> >>
>>> >>>>> >> (2)
>>> >>>>> >> There will be still many source comments and documentations that
>>> >>>>> >> we need to update, for example, in high-availability.sgml. We need 
>>> >>>>> >> to
>>> >>>>> >> check and update them throughly.
>>> >>>>> >>
>>> >>>>> >> (3)
>>> >>>>> >> The priority value is assigned to each standby listed in s_s_names
>>> >>>>> >> even in quorum commit though those priority values are not used at 
>>> >>>>> >> all.
>>> >>>>> >> Users can see those priority values in pg_stat_replication.
>>> >>>>> >> Isn't this confusing? If yes, it might be better to always assign 
>>> >>>>> >> 1 as
>>> >>>>> >> the priority, for example.
>>> >>>>> >
>>> >>>>> > [Action required within three days.  This is a generic 
>>> >>>>> > notification.]
>>> >>>>> >
>>> >>>>> > The above-described topic is currently a PostgreSQL 10 open item.  
>>> >>>>> > Fujii,
>>> >>>>> > since you committed the patch believed to have created it, you own 
>>> >>>>> > this open
>>> >>>>> > item.  If some other commit is more relevant or if this does not 
>>> >>>>> > belong as a
>>> >>>>> > v10 open item, please let us know.  Otherwise, please observe the 
>>> >>>>> > policy on
>>> >>>>> > open item ownership[1] and send a status update within three 
>>> >>>>> > calendar days of
>>> >>>>> > this message.  Include a date for your subsequent status update.  
>>> >>>>> > Testers may
>>> >>>>> > discover new open items at any time, and I want to plan to get them 
>>> >>>>> > all fixed
>>> >>>>> > well in advance of shipping v10.  Consequently, I will appreciate 
>>> >>>>> > your efforts
>>> >>>>> > toward speedy resolution.  Thanks.
>>> >>>>> >
>>> >>>>> > [1] 
>>> >>>>> > https://www.postgresql.org/mes

Re: [HACKERS] some review comments on logical rep code

2017-04-18 Thread Fujii Masao

gt;
> and then
>
>>   /*
>>* CREATE_REPLICATION_SLOT ... LOGICAL exports a snapshot. If it was
>>* called outside of transaction the snapshot should be cleared here.
>>*/
>>   if (!IsTransactionBlock())
>>   SnapBuildClearExportedSnapshot();
>
> The first one should not be there, it looks like a result of some merge
> conflict being solved wrongly. Maybe your patch could fix that too?
>
>>>>>
>>>>>>> 10.
>>>>>>>>
>>>>>>>> SpinLockAcquire(>relmutex);
>>>>>>>> MyLogicalRepWorker->relstate =
>>>>>>>>   GetSubscriptionRelState(MyLogicalRepWorker->subid,
>>>>>>>> MyLogicalRepWorker->relid,
>>>>>>>> >relstate_lsn,
>>>>>>>> false);
>>>>>>>> SpinLockRelease(>relmutex);
>>>>>>>>
>>>>>>>> Non-"short-term" function like GetSubscriptionRelState() should not
>>>>>>>> be called while spinlock is being held.
>>>>>>>>
>>>>>>>
>>>>>>> One option is adding new LWLock for the relation state change but this
>>>>>>> lock is used at many locations where the operation actually doesn't
>>>>>>> take time. I think that the discussion would be needed to get
>>>>>>> consensus, so patch for it is not attached.
>>>>>>
>>>>>> From the point of view of time, I agree that it doesn't seem to
>>>>>> harm. Bt I thing it as more significant problem that
>>>>>> potentially-throwable function is called within the mutex
>>>>>> region. It potentially causes a kind of dead lock.  (That said,
>>>>>> theoretically dosn't occur and I'm not sure what happens by the
>>>>>> dead lock..)
>>>>>>
>
> Hmm, I think doing what I attached should be fine here.

Thanks for the patch!

> We don't need to
> hold spinlock for table read, just for changing the
> MyLogicalRepWorker->relstate.

Yes, but the update of MyLogicalRepWorker->relstate_lsn also should
be protected with the spinlock.

Regards,

-- 
Fujii Masao


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Quorum commit for multiple synchronous replication.

2017-04-17 Thread Fujii Masao

On Wed, Apr 12, 2017 at 2:36 AM, Masahiko Sawada <sawada.m...@gmail.com> wrote:
> On Thu, Apr 6, 2017 at 4:17 PM, Masahiko Sawada <sawada.m...@gmail.com> wrote:
>> On Thu, Apr 6, 2017 at 10:51 AM, Noah Misch <n...@leadboat.com> wrote:
>>> On Thu, Apr 06, 2017 at 12:48:56AM +0900, Fujii Masao wrote:
>>>> On Wed, Apr 5, 2017 at 3:45 PM, Noah Misch <n...@leadboat.com> wrote:
>>>> > On Mon, Dec 19, 2016 at 09:49:58PM +0900, Fujii Masao wrote:
>>>> >> Regarding this feature, there are some loose ends. We should work on
>>>> >> and complete them until the release.
>>>> >>
>>>> >> (1)
>>>> >> Which synchronous replication method, priority or quorum, should be
>>>> >> chosen when neither FIRST nor ANY is specified in s_s_names? Right now,
>>>> >> a priority-based sync replication is chosen for keeping backward
>>>> >> compatibility. However some hackers argued to change this decision
>>>> >> so that a quorum commit is chosen because they think that most users
>>>> >> prefer to a quorum.
>>>> >>
>>>> >> (2)
>>>> >> There will be still many source comments and documentations that
>>>> >> we need to update, for example, in high-availability.sgml. We need to
>>>> >> check and update them throughly.
>>>> >>
>>>> >> (3)
>>>> >> The priority value is assigned to each standby listed in s_s_names
>>>> >> even in quorum commit though those priority values are not used at all.
>>>> >> Users can see those priority values in pg_stat_replication.
>>>> >> Isn't this confusing? If yes, it might be better to always assign 1 as
>>>> >> the priority, for example.
>>>> >
>>>> > [Action required within three days.  This is a generic notification.]
>>>> >
>>>> > The above-described topic is currently a PostgreSQL 10 open item.  Fujii,
>>>> > since you committed the patch believed to have created it, you own this 
>>>> > open
>>>> > item.  If some other commit is more relevant or if this does not belong 
>>>> > as a
>>>> > v10 open item, please let us know.  Otherwise, please observe the policy 
>>>> > on
>>>> > open item ownership[1] and send a status update within three calendar 
>>>> > days of
>>>> > this message.  Include a date for your subsequent status update.  
>>>> > Testers may
>>>> > discover new open items at any time, and I want to plan to get them all 
>>>> > fixed
>>>> > well in advance of shipping v10.  Consequently, I will appreciate your 
>>>> > efforts
>>>> > toward speedy resolution.  Thanks.
>>>> >
>>>> > [1] 
>>>> > https://www.postgresql.org/message-id/20170404140717.GA2675809%40tornado.leadboat.com
>>>>
>>>> Thanks for the notice!
>>>>
>>>> Regarding the item (2), Sawada-san told me that he will work on it after
>>>> this CommitFest finishes. So we would receive the patch for the item from
>>>> him next week. If there will be no patch even after the end of next week
>>>> (i.e., April 14th), I will. Let's wait for Sawada-san's action at first.
>>>
>>> Sounds reasonable; I will look for your update on 14Apr or earlier.
>>>
>>>> The items (1) and (3) are not bugs. So I don't think that they need to be
>>>> resolved before the beta release. After the feature freeze, many users
>>>> will try and play with many new features including quorum-based syncrep.
>>>> Then if many of them complain about (1) and (3), we can change the code
>>>> at that timing. So we need more time that users can try the feature.
>>>
>>> I've moved (1) to a new section for things to revisit during beta.  If 
>>> someone
>>> feels strongly that the current behavior is Wrong and must change, speak up 
>>> as
>>> soon as you reach that conclusion.  Absent such arguments, the behavior 
>>> won't
>>> change.
>>>
>>>> BTW, IMO (3) should be fixed so that pg_stat_replication reports NULL
>>>> as the priority if quorum-based sync rep is chosen. It's less confusing.
>>>
>>> Since you do want (3) to change, please own it like any other open item,
>>> including the mandatory sta

Re: [HACKERS] Quorum commit for multiple synchronous replication.

2017-04-16 Thread Fujii Masao

On Sun, Apr 16, 2017 at 1:19 PM, Noah Misch <n...@leadboat.com> wrote:
> On Fri, Apr 14, 2017 at 11:58:23PM -0400, Noah Misch wrote:
>> On Wed, Apr 05, 2017 at 09:51:02PM -0400, Noah Misch wrote:
>> > On Thu, Apr 06, 2017 at 12:48:56AM +0900, Fujii Masao wrote:
>>
>> > > > On Mon, Dec 19, 2016 at 09:49:58PM +0900, Fujii Masao wrote:
>> > > >> (2)
>> > > >> There will be still many source comments and documentations that
>> > > >> we need to update, for example, in high-availability.sgml. We need to
>> > > >> check and update them throughly.
>> > > >>
>> > > >> (3)
>> > > >> The priority value is assigned to each standby listed in s_s_names
>> > > >> even in quorum commit though those priority values are not used at 
>> > > >> all.
>> > > >> Users can see those priority values in pg_stat_replication.
>> > > >> Isn't this confusing? If yes, it might be better to always assign 1 as
>> > > >> the priority, for example.
>>
>> > > Regarding the item (2), Sawada-san told me that he will work on it after
>> > > this CommitFest finishes. So we would receive the patch for the item from
>> > > him next week. If there will be no patch even after the end of next week
>> > > (i.e., April 14th), I will. Let's wait for Sawada-san's action at first.
>> >
>> > Sounds reasonable; I will look for your update on 14Apr or earlier.
>>
>> This PostgreSQL 10 open item is past due for your status update.  Kindly send
>> a status update within 24 hours, and include a date for your subsequent 
>> status
>> update.  Refer to the policy on open item ownership:
>> https://www.postgresql.org/message-id/20170404140717.GA2675809%40tornado.leadboat.com

Sorry for the delay.

I will review Sawada-san's patch and commit something in next three days.
So next target date is April 19th.

>> > Since you do want (3) to change, please own it like any other open item,
>> > including the mandatory status updates.
>>
>> Likewise.

As I told firstly this is not a bug. There are some proposals for better design
of priority column in pg_stat_replication, but we've not reached the consensus
yet. So I think that it's better to move this open item to "Design Decisions to
Recheck Mid-Beta" section so that we can hear more opinions.

Regards,

-- 
Fujii Masao


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] logical replication and PANIC during shutdown checkpoint in publisher

2017-04-14 Thread Fujii Masao

On Fri, Apr 14, 2017 at 10:33 PM, Petr Jelinek
<petr.jeli...@2ndquadrant.com> wrote:
> On 12/04/17 15:55, Fujii Masao wrote:
>> Hi,
>>
>> When I shut down the publisher while I repeated creating and dropping
>> the subscription in the subscriber, the publisher emitted the following
>> PANIC error during shutdown checkpoint.
>>
>> PANIC:  concurrent transaction log activity while database system is
>> shutting down
>>
>> The cause of this problem is that walsender for logical replication can
>> generate WAL records even during shutdown checkpoint.
>>
>> Firstly walsender keeps running until shutdown checkpoint finishes
>> so that all the WAL including shutdown checkpoint record can be
>> replicated to the standby. This was safe because previously walsender
>> could not generate WAL records. However this assumption became
>> invalid because of logical replication. That is, currenty walsender for
>> logical replication can generate WAL records, for example, by executing
>> CREATE_REPLICATION_SLOT command. This is an oversight in
>> logical replication patch, I think.
>
> Hmm, but CREATE_REPLICATION_SLOT should not generate WAL afaik. I agree
> that the issue with walsender still exist (since we now allow normal SQL
> to run there) but I think it's important to identify what exactly causes
> the WAL activity in your case

At least in my case, the following CREATE_REPLICATION_SLOT command
generated WAL record.

BEGIN READ ONLY ISOLATION LEVEL REPEATABLE READ;
CREATE_REPLICATION_SLOT testslot TEMPORARY LOGICAL pgoutput USE_SNAPSHOT;

Here is the pg_waldump output of the WAL record that CREATE_REPLICATION_SLOT
generated.

rmgr: Standby len (rec/tot): 24/50, tx:  0,
lsn: 0/01601438, prev 0/01601400, desc: RUNNING_XACTS nextXid 692
latestCompletedXid 691 oldestRunningXid 692

So I guess that CREATE_REPLICATION_SLOT code calls LogStandbySnapshot()
and which generates WAL record about snapshot of running transactions.

Regards,

-- 
Fujii Masao


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

[HACKERS] some review comments on logical rep code

2017-04-13 Thread Fujii Masao

Hi,

Though I've read only a part of the logical rep code yet, I'd like to
share some (relatively minor) review comments that I got so far.

In ApplyWorkerMain(), DatumGetInt32() should be used to get integer
value from the argument, instead of DatumGetObjectId().

No one resets on_commit_launcher_wakeup flag to false.

ApplyLauncherWakeup() should be static function.

Seems not only CREATE SUBSCRIPTION but also ALTER SUBSCRIPTION's
subcommands (e.g., ENABLED one) should wake the launcher up on commit.

Why is IsBackendPid() check necessary in ApplyLauncherWakeup()?

Normal statements that the walsender for logical rep runs are logged
by log_replication_commands. So if log_statement is also set,
those commands are logged twice.

LogicalRepCtx->launcher_pid isn't reset to 0 when the launcher emits
an error. The callback function to reset it needs to be registered
via on_shmem_exit().

Typo: "an subscriber" should be "a subscriber" in some places.

/* Get conninfo */
datum = SysCacheGetAttr(SUBSCRIPTIONOID,
tup,
Anum_pg_subscription_subconninfo,
);
Assert(!isnull);

This assertion makes me think that pg_subscription.subconnifo should
have NOT NULL constraint. Also subslotname and subpublications
should have NOT NULL constraint because of the same reason.

SpinLockAcquire(>relmutex);
MyLogicalRepWorker->relstate =
  GetSubscriptionRelState(MyLogicalRepWorker->subid,
MyLogicalRepWorker->relid,
>relstate_lsn,
false);
SpinLockRelease(>relmutex);

Non-"short-term" function like GetSubscriptionRelState() should not
be called while spinlock is being held.

Regards,

-- 
Fujii Masao


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] logical replication and PANIC during shutdown checkpoint in publisher

2017-04-13 Thread Fujii Masao

On Thu, Apr 13, 2017 at 12:36 PM, Michael Paquier
<michael.paqu...@gmail.com> wrote:
> On Thu, Apr 13, 2017 at 12:28 PM, Fujii Masao <masao.fu...@gmail.com> wrote:
>> On Thu, Apr 13, 2017 at 5:25 AM, Peter Eisentraut
>> <peter.eisentr...@2ndquadrant.com> wrote:
>>> On 4/12/17 09:55, Fujii Masao wrote:
>>>> To fix this issue, we should terminate walsender for logical replication
>>>> before shutdown checkpoint starts. Of course walsender for physical
>>>> replication still needs to keep running until shutdown checkpoint ends,
>>>> though.
>>>
>>> Can we turn it into a kind of read-only or no-new-commands mode instead,
>>> so it can keep streaming but not accept any new actions?
>>
>> So we make walsenders switch to that mode and wait for all the 
>> already-ongoing
>> their "write" commands to finish, and then we start a shutdown checkpoint?
>> This is an idea, but seems a bit complicated. ISTM that it's simpler to
>> terminate only walsenders for logical rep before shutdown checkpoint.
>
> Perhaps my memory is failing me here... But in clean shutdowns we do
> shut down WAL senders after the checkpoint has completed so as we are
> sure that they have flushed the LSN corresponding to the checkpoint
> ending, right?

Yes.

> Why introducing an inconsistency for logical workers?
> It seems to me that logical workers should fail under the same rules.

Could you tell me why? You think that even walsender for logical rep
needs to stream the shutdown checkpoint WAL record to the subscriber?
I was not thinking that's true.

Regards,

-- 
Fujii Masao


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Quorum commit for multiple synchronous replication.

2017-04-13 Thread Fujii Masao

On Thu, Apr 13, 2017 at 9:23 PM, Masahiko Sawada  wrote:
> On Thu, Apr 13, 2017 at 5:17 PM, Kyotaro HORIGUCHI
>  wrote:
>> Hello,
>>
>> At Thu, 6 Apr 2017 16:17:31 +0900, Masahiko Sawada  
>> wrote in

Re: [HACKERS] tablesync patch broke the assumption that logical rep depends on?

2017-04-13 Thread Fujii Masao

On Fri, Apr 14, 2017 at 1:28 AM, Peter Eisentraut
<peter.eisentr...@2ndquadrant.com> wrote:
> On 4/10/17 13:28, Fujii Masao wrote:
>>  src/backend/replication/logical/launcher.c
>>  * Worker started and attached to our shmem. This check is safe
>>  * because only launcher ever starts the workers, so nobody can steal
>>  * the worker slot.
>>
>> The tablesync patch enabled even worker to start another worker.
>> So the above assumption is not valid for now.
>>
>> This issue seems to cause the corner case where the launcher picks up
>> the same worker slot that previously-started worker has already picked
>> up to start another worker.
>
> I think what the comment should rather say is that workers are always
> started through logicalrep_worker_launch() and worker slots are always
> handed out while holding LogicalRepWorkerLock exclusively, so nobody can
> steal the worker slot.
>
> Does that make sense?

No unless I'm missing something.

logicalrep_worker_launch() picks up unused worker slot (slot's proc == NULL)
while holding LogicalRepWorkerLock. But it releases the lock before the slot
is marked as used (i.e., slot is set to non-NULL). Then newly-launched worker
calls logicalrep_worker_attach() and marks the slot as used.

So if another logicalrep_worker_launch() starts after LogicalRepWorkerLock
is released before the slot is marked as used, it can pick up the same slot
because that slot looks unused.

Regards,

-- 
Fujii Masao

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] logical replication and PANIC during shutdown checkpoint in publisher

2017-04-12 Thread Fujii Masao

On Thu, Apr 13, 2017 at 5:25 AM, Peter Eisentraut
<peter.eisentr...@2ndquadrant.com> wrote:
> On 4/12/17 09:55, Fujii Masao wrote:
>> To fix this issue, we should terminate walsender for logical replication
>> before shutdown checkpoint starts. Of course walsender for physical
>> replication still needs to keep running until shutdown checkpoint ends,
>> though.
>
> Can we turn it into a kind of read-only or no-new-commands mode instead,
> so it can keep streaming but not accept any new actions?

So we make walsenders switch to that mode and wait for all the already-ongoing
their "write" commands to finish, and then we start a shutdown checkpoint?
This is an idea, but seems a bit complicated. ISTM that it's simpler to
terminate only walsenders for logical rep before shutdown checkpoint.

Regards,

-- 
Fujii Masao

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Remove pg_stat_progress_vacuum from Table 28.2

2017-04-12 Thread Fujii Masao

On Wed, Apr 12, 2017 at 10:32 AM, Amit Langote
<langote_amit...@lab.ntt.co.jp> wrote:
> On 2017/04/12 0:22, Fujii Masao wrote:
>> On Fri, Apr 7, 2017 at 9:53 AM, Amit Langote
>> <langote_amit...@lab.ntt.co.jp> wrote:
>>> On 2017/04/07 0:56, Fujii Masao wrote:
>>>> On Tue, Apr 4, 2017 at 10:19 AM, Amit Langote
>>>> <langote_amit...@lab.ntt.co.jp> wrote:
>>>>> It seems pg_stat_progress_vacuum is not supposed to appear in the table
>>>>> titled "Collected Statistics Views".  It was added by c16dc1aca.  Attached
>>>>> patch fixes that.
>>>>
>>>> Instead, it should appear in the table of "Dynamic Statistics Views"
>>>> because it reports dynamic info, i.e., progress, about VACUUM activity?
>>>
>>> I thought the same at first, but then realized we have a entirely separate
>>> section 28.4. Progress Reporting.
>>
>> Yes, but I don't think that removing that from 28.2 is an improvement
>> for users.
>
> Perhaps you're right.  Here's a patch that moves pg_stat_progress_vacuum
> to the Dynamic Statistics Views table.

Thanks for the patch! Pushed.

Regards,

-- 
Fujii Masao


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Tab completion support for ALTER SUBSCRIPTION REFRESH PUBLICATION

2017-04-12 Thread Fujii Masao

On Thu, Apr 13, 2017 at 11:05 AM, Masahiko Sawada <sawada.m...@gmail.com> wrote:
> On Wed, Apr 12, 2017 at 11:48 PM, Fujii Masao <masao.fu...@gmail.com> wrote:
>> On Wed, Apr 12, 2017 at 5:12 PM, Masahiko Sawada <sawada.m...@gmail.com> 
>> wrote:
>>> Hi,
>>>
>>> I've attached a patch for $subject. Please check it.
>>
>> + COMPLETE_WITH_LIST8("WITH", "CONNECTION", "SET PUBLICATION", "ENABLE",
>> + "DISABLE", "OWNER TO", "RENAME TO", "REFRESH PUBLICATION WITH");
>>
>> "WITH" should be "WITH ("?
>> Also "REFRESH PUBLICATION WITH" should be "REFRESH PUBLICATION WITH ("?
>
> Agreed. I found other bugs of tab completion of PUB/SUB DDLs. Attached
> latest patch incorporated these fixes.

Thanks for the patch! Pushed.

>> BTW, I found that ALTER SUBSCRIPTION ... RENAME TO exists in tab-complete.c,
>> but not in the documentation. ALTER PUBLICATION has the same issue, too.
>> So I think that we need to apply the attached patch which improves the
>> documentation
>> for ALTER PUBLICATION and SUBSCRIPTION.
>
> +1

Also pushed.

Regards,

-- 
Fujii Masao


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Tab completion support for ALTER SUBSCRIPTION REFRESH PUBLICATION

2017-04-12 Thread Fujii Masao

On Wed, Apr 12, 2017 at 5:12 PM, Masahiko Sawada <sawada.m...@gmail.com> wrote:
> Hi,
>
> I've attached a patch for $subject. Please check it.

+ COMPLETE_WITH_LIST8("WITH", "CONNECTION", "SET PUBLICATION", "ENABLE",
+ "DISABLE", "OWNER TO", "RENAME TO", "REFRESH PUBLICATION WITH");

"WITH" should be "WITH ("?
Also "REFRESH PUBLICATION WITH" should be "REFRESH PUBLICATION WITH ("?

BTW, I found that ALTER SUBSCRIPTION ... RENAME TO exists in tab-complete.c,
but not in the documentation. ALTER PUBLICATION has the same issue, too.
So I think that we need to apply the attached patch which improves the
documentation
for ALTER PUBLICATION and SUBSCRIPTION.

Regards,

-- 
Fujii Masao

bugfix.patch
Description: Binary data

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] logical rep worker for table sync can start and stop very frequently unexpectedly

2017-04-12 Thread Fujii Masao

On Wed, Apr 12, 2017 at 2:40 AM, Masahiko Sawada <sawada.m...@gmail.com> wrote:
> On Wed, Apr 12, 2017 at 2:05 AM, Fujii Masao <masao.fu...@gmail.com> wrote:
>> Hi,
>>
>> I observed $subject when I ran the following steps.
>>
>> 1. start the publisher server
>> 2. start the subscriber server
>> 3. create table with primary key and publication for it
>> 4. insert several records into the table
>> 5. create table with primary key and subscription
>> 6. wait for table sync to finish
>> 7. drop the subscription and recreate the subscription
>>
>> Then the launcher starts the worker, and the worker starts
>> another worker for table sync. But that worker for table sync
>> exits immediately because of primary key violation.
>> Then the worker tries to start new worker for table sync immediately
>> and it also exits immediately because of primary key violation.
>> This sequence repeats very fast...
>>
>> Attached is the script that I used to reproduce the issue.
>>
>
> I encountered the same situation[1] and am proposing to put interval
> of launching table sync worker after failed to attempt.
>
> [1] 
> https://www.postgresql.org/message-id/CAD21AoDCnyRJDUY%3DESVVe68AukvOP2dFomTeBFpAd1TiFbjsGg%40mail.gmail.com

Okay, let's discuss this in that thread. Thanks!

Regards,

-- 
Fujii Masao


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

[HACKERS] logical replication and PANIC during shutdown checkpoint in publisher

2017-04-12 Thread Fujii Masao

Hi,

When I shut down the publisher while I repeated creating and dropping
the subscription in the subscriber, the publisher emitted the following
PANIC error during shutdown checkpoint.

PANIC:  concurrent transaction log activity while database system is
shutting down

The cause of this problem is that walsender for logical replication can
generate WAL records even during shutdown checkpoint.

Firstly walsender keeps running until shutdown checkpoint finishes
so that all the WAL including shutdown checkpoint record can be
replicated to the standby. This was safe because previously walsender
could not generate WAL records. However this assumption became
invalid because of logical replication. That is, currenty walsender for
logical replication can generate WAL records, for example, by executing
CREATE_REPLICATION_SLOT command. This is an oversight in
logical replication patch, I think.

To fix this issue, we should terminate walsender for logical replication
before shutdown checkpoint starts. Of course walsender for physical
replication still needs to keep running until shutdown checkpoint ends,
though.

Regards,

-- 
Fujii Masao


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

[HACKERS] logical rep worker for table sync can start and stop very frequently unexpectedly

2017-04-11 Thread Fujii Masao

Hi,

I observed $subject when I ran the following steps.

1. start the publisher server
2. start the subscriber server
3. create table with primary key and publication for it
4. insert several records into the table
5. create table with primary key and subscription
6. wait for table sync to finish
7. drop the subscription and recreate the subscription

Then the launcher starts the worker, and the worker starts
another worker for table sync. But that worker for table sync
exits immediately because of primary key violation.
Then the worker tries to start new worker for table sync immediately
and it also exits immediately because of primary key violation.
This sequence repeats very fast...

Attached is the script that I used to reproduce the issue.

Regards,

-- 
Fujii Masao


test.sh
Description: Bourne shell script

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Remove pg_stat_progress_vacuum from Table 28.2

2017-04-11 Thread Fujii Masao

On Fri, Apr 7, 2017 at 9:53 AM, Amit Langote
<langote_amit...@lab.ntt.co.jp> wrote:
> On 2017/04/07 0:56, Fujii Masao wrote:
>> On Tue, Apr 4, 2017 at 10:19 AM, Amit Langote
>> <langote_amit...@lab.ntt.co.jp> wrote:
>>> It seems pg_stat_progress_vacuum is not supposed to appear in the table
>>> titled "Collected Statistics Views".  It was added by c16dc1aca.  Attached
>>> patch fixes that.
>>
>> Instead, it should appear in the table of "Dynamic Statistics Views"
>> because it reports dynamic info, i.e., progress, about VACUUM activity?
>
> I thought the same at first, but then realized we have a entirely separate
> section 28.4. Progress Reporting.

Yes, but I don't think that removing that from 28.2 is an improvement
for users.

Regards,

-- 
Fujii Masao


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] max_sync_workers_per_subscription is missing in postgresql.conf

2017-04-11 Thread Fujii Masao

On Tue, Apr 11, 2017 at 4:50 PM, Masahiko Sawada <sawada.m...@gmail.com> wrote:
> On Tue, Apr 11, 2017 at 1:39 AM, Fujii Masao <masao.fu...@gmail.com> wrote:
>> On Mon, Apr 10, 2017 at 9:39 PM, Masahiko Sawada <sawada.m...@gmail.com> 
>> wrote:
>>> On Mon, Apr 10, 2017 at 9:32 PM, Petr Jelinek
>>> <petr.jeli...@2ndquadrant.com> wrote:
>>>> On 10/04/17 07:16, Masahiko Sawada wrote:
>>>>> Hi all,
>>>>>
>>>>> Attached a patch for $subject.
>>>>>
>>>>> I added this parameter into "Asynchronous Behavior" section of
>>>>> "RESOURCE" section. But GUC parameter for subscriber now is written in
>>>>> this section, in spite of there is "REPLICATION" section. I think that
>>>>> we can coordinate these parameters to not confuse user. For example in
>>>>> documentation, these parameters are described in "19.6.4. Subscribers"
>>>>> section of "19.6. Replication" section. Thought?
>>
>> Yes, I think that we should not only add the parameter into
>> postgresql.conf.sample
>> but also
>>
>> - add REPLICATION_SUBSCRIBERS into config_group
>> - mark max_logical_replication_workers and max_sync_workers_per_subscription
>>   as REPLICATION_SUBSCRIBERS parameters, in guc.c
>> - move those parameters into "Subscribers" section in postgresql.conf.sample
>
> +1
>
>> The attached patch does these.
>
> The patch looks good to me.

Pushed. Thanks!

Regards,

-- 
Fujii Masao


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

[HACKERS] tablesync patch broke the assumption that logical rep depends on?

2017-04-10 Thread Fujii Masao

Hi,

 src/backend/replication/logical/launcher.c
 * Worker started and attached to our shmem. This check is safe
 * because only launcher ever starts the workers, so nobody can steal
 * the worker slot.

The tablesync patch enabled even worker to start another worker.
So the above assumption is not valid for now.

This issue seems to cause the corner case where the launcher picks up
the same worker slot that previously-started worker has already picked
up to start another worker.

Regards,

-- 
Fujii Masao


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] max_sync_workers_per_subscription is missing in postgresql.conf

2017-04-10 Thread Fujii Masao

On Mon, Apr 10, 2017 at 9:39 PM, Masahiko Sawada <sawada.m...@gmail.com> wrote:
> On Mon, Apr 10, 2017 at 9:32 PM, Petr Jelinek
> <petr.jeli...@2ndquadrant.com> wrote:
>> On 10/04/17 07:16, Masahiko Sawada wrote:
>>> Hi all,
>>>
>>> Attached a patch for $subject.
>>>
>>> I added this parameter into "Asynchronous Behavior" section of
>>> "RESOURCE" section. But GUC parameter for subscriber now is written in
>>> this section, in spite of there is "REPLICATION" section. I think that
>>> we can coordinate these parameters to not confuse user. For example in
>>> documentation, these parameters are described in "19.6.4. Subscribers"
>>> section of "19.6. Replication" section. Thought?

Yes, I think that we should not only add the parameter into
postgresql.conf.sample
but also

- add REPLICATION_SUBSCRIBERS into config_group
- mark max_logical_replication_workers and max_sync_workers_per_subscription
  as REPLICATION_SUBSCRIBERS parameters, in guc.c
- move those parameters into "Subscribers" section in postgresql.conf.sample

The attached patch does these.

Regards,

-- 
Fujii Masao


Add_max_sync_workers_per_subscription_into_postgresql_conf_v3.patch
Description: Binary data

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Remove pg_stat_progress_vacuum from Table 28.2

2017-04-06 Thread Fujii Masao

On Tue, Apr 4, 2017 at 10:19 AM, Amit Langote
<langote_amit...@lab.ntt.co.jp> wrote:
> It seems pg_stat_progress_vacuum is not supposed to appear in the table
> titled "Collected Statistics Views".  It was added by c16dc1aca.  Attached
> patch fixes that.

Instead, it should appear in the table of "Dynamic Statistics Views"
because it reports dynamic info, i.e., progress, about VACUUM activity?

Regards,

-- 
Fujii Masao


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] pg_stat_wal_write statistics view

2017-04-06 Thread Fujii Masao

und_process)
>> + XLogCtl->stats.writes++;
>> + else
>> + XLogCtl->stats.backend_writes++;
>> +
>> + if (is_background_process)
>> + XLogCtl->stats.write_blocks += npages;
>> + else
>> + XLogCtl->stats.backend_write_blocks += npages;
>> +
>>   /* Update state for write */
>>   openLogOff += nbytes;
>>   npages = 0;
>> @@ -2499,8 +2571,29 @@ XLogWrite(XLogwrtRqst WriteRqst, bool flexible)
>>*/
>>   if (finishing_seg)
>>   {
>> + /* Start timer to acquire start time of the 
>> wal sync */
>> + if (track_io_timing)
>> + INSTR_TIME_SET_CURRENT(io_start);
>> +
>>   issue_xlog_fsync(openLogFile, openLogSegNo);
>>
>> + /* calculate the total time spent for wal sync 
>> */
>> + if (track_io_timing)
>> + {
>> + INSTR_TIME_SET_CURRENT(io_time);
>> + INSTR_TIME_SUBTRACT(io_time, io_start);
>> +
>> + if (is_background_process)
>> + XLogCtl->stats.total_sync_time 
>> += INSTR_TIME_GET_MILLISEC(io_time);
>> + else
>> + 
>> XLogCtl->stats.backend_total_sync_time += INSTR_TIME_GET_MILLISEC(io_time);
>> + }
>> + else
>> + {
>> + XLogCtl->stats.total_sync_time = 0;
>> + XLogCtl->stats.backend_total_sync_time 
>> = 0;
>> + }
>> +
>>   /* signal that we need to wakeup walsenders 
>> later */
>>   WalSndWakeupRequest();
>>
>> @@ -2568,7 +2661,28 @@ XLogWrite(XLogwrtRqst WriteRqst, bool flexible)
>>   openLogOff = 0;
>>   }
>>
>> + /* Start timer to acquire start time of the wal sync */
>> + if (track_io_timing)
>> + INSTR_TIME_SET_CURRENT(io_start);
>> +
>>   issue_xlog_fsync(openLogFile, openLogSegNo);
>> +
>> + /* calculate the total time spent for wal sync */
>> + if (track_io_timing)
>> + {
>> + INSTR_TIME_SET_CURRENT(io_time);
>> + INSTR_TIME_SUBTRACT(io_time, io_start);
>> +
>> + if (is_background_process)
>> + XLogCtl->stats.total_sync_time += 
>> INSTR_TIME_GET_MILLISEC(io_time);
>> + else
>> + XLogCtl->stats.backend_total_sync_time 
>> += INSTR_TIME_GET_MILLISEC(io_time);
>> + }
>> + else
>> + {
>> + XLogCtl->stats.total_sync_time = 0;
>> + XLogCtl->stats.backend_total_sync_time = 0;
>> + }
>>   }
>
> I'm *very* doubtful about this, but even if we do it, this needs careful
> benchmarking.
>
>>   /* signal that we need to wakeup walsenders later */
>> @@ -3043,6 +3157,9 @@ XLogBackgroundFlush(void)
>>   {
>>   XLogWrite(WriteRqst, flexible);
>>   }
>> +
>> + /* Collect all the wal write shared counters into local, and report it 
>> to stats collector */
>> + memcpy(, >stats, 
>> sizeof(PgStat_WalWritesCounts));
>>   LWLockRelease(WALWriteLock);
>
> Hm. I think in a good number of workloads hti sis never reached, because
> we return early.
>
>
> I think this is an interesting feature, but I don't think it's ready,
> and it was submitted very late in the v10 release cycle. Therefore I
> think this should be moved to the next CF.

+1

I think that there is no consensus yet about what values should be exposed.
For example, with the patch, WAL writing activity by backends is reported
separately from that by the other processes. But I'm not sure if this grouping
is good. It seems better to report walwriter activity separately from
the others, for tuning of walwriter.

Regards,

-- 
Fujii Masao


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

[HACKERS] logical replication and SIGHUP

2017-04-05 Thread Fujii Masao

Hi,

Both launcher and worker don't handle SIGHUP signal and cannot
reload the configuration. I think that this is a bug. Will add this as
an open item barring objection.

Regards,

-- 
Fujii Masao


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Quorum commit for multiple synchronous replication.

2017-04-05 Thread Fujii Masao

On Wed, Apr 5, 2017 at 3:45 PM, Noah Misch <n...@leadboat.com> wrote:
> On Mon, Dec 19, 2016 at 09:49:58PM +0900, Fujii Masao wrote:
>> Regarding this feature, there are some loose ends. We should work on
>> and complete them until the release.
>>
>> (1)
>> Which synchronous replication method, priority or quorum, should be
>> chosen when neither FIRST nor ANY is specified in s_s_names? Right now,
>> a priority-based sync replication is chosen for keeping backward
>> compatibility. However some hackers argued to change this decision
>> so that a quorum commit is chosen because they think that most users
>> prefer to a quorum.
>>
>> (2)
>> There will be still many source comments and documentations that
>> we need to update, for example, in high-availability.sgml. We need to
>> check and update them throughly.
>>
>> (3)
>> The priority value is assigned to each standby listed in s_s_names
>> even in quorum commit though those priority values are not used at all.
>> Users can see those priority values in pg_stat_replication.
>> Isn't this confusing? If yes, it might be better to always assign 1 as
>> the priority, for example.
>
> [Action required within three days.  This is a generic notification.]
>
> The above-described topic is currently a PostgreSQL 10 open item.  Fujii,
> since you committed the patch believed to have created it, you own this open
> item.  If some other commit is more relevant or if this does not belong as a
> v10 open item, please let us know.  Otherwise, please observe the policy on
> open item ownership[1] and send a status update within three calendar days of
> this message.  Include a date for your subsequent status update.  Testers may
> discover new open items at any time, and I want to plan to get them all fixed
> well in advance of shipping v10.  Consequently, I will appreciate your efforts
> toward speedy resolution.  Thanks.
>
> [1] 
> https://www.postgresql.org/message-id/20170404140717.GA2675809%40tornado.leadboat.com

Thanks for the notice!

Regarding the item (2), Sawada-san told me that he will work on it after
this CommitFest finishes. So we would receive the patch for the item from
him next week. If there will be no patch even after the end of next week
(i.e., April 14th), I will. Let's wait for Sawada-san's action at first.

The items (1) and (3) are not bugs. So I don't think that they need to be
resolved before the beta release. After the feature freeze, many users
will try and play with many new features including quorum-based syncrep.
Then if many of them complain about (1) and (3), we can change the code
at that timing. So we need more time that users can try the feature.

BTW, IMO (3) should be fixed so that pg_stat_replication reports NULL
as the priority if quorum-based sync rep is chosen. It's less confusing.

Regards,

-- 
Fujii Masao


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] [COMMITTERS] pgsql: Logical replication support for initial data copy

2017-03-30 Thread Fujii Masao

On Thu, Mar 23, 2017 at 9:59 PM, Peter Eisentraut <pete...@gmx.net> wrote:
> Logical replication support for initial data copy

+ case T_SQLCmd:
+ if (MyDatabaseId == InvalidOid)
+ ereport(ERROR,
+ (errmsg("not connected to database")));

This error message doesn't seem to follow the error message style in docs.
Also It seems a bit unclear to me. So what about replacing it with
something like the following?

ERROR:  must connect to database to execute command \"%s\"

Regards,

-- 
Fujii Masao


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] [COMMITTERS] pgsql: Allow vacuums to report oldestxmin

2017-03-30 Thread Fujii Masao

On Wed, Mar 29, 2017 at 3:31 PM, Masahiko Sawada <sawada.m...@gmail.com> wrote:
> On Wed, Mar 29, 2017 at 1:32 AM, Fujii Masao <masao.fu...@gmail.com> wrote:
>> On Tue, Mar 28, 2017 at 1:06 AM, Masahiko Sawada <sawada.m...@gmail.com> 
>> wrote:
>>> On Sun, Mar 26, 2017 at 2:26 AM, Masahiko Sawada <sawada.m...@gmail.com> 
>>> wrote:
>>>> On Sun, Mar 26, 2017 at 1:37 AM, Fujii Masao <masao.fu...@gmail.com> wrote:
>>>>> On Mon, Mar 6, 2017 at 9:37 PM, Masahiko Sawada <sawada.m...@gmail.com> 
>>>>> wrote:
>>>>>> On Fri, Mar 3, 2017 at 10:50 PM, Simon Riggs <si...@2ndquadrant.com> 
>>>>>> wrote:
>>>>>>> Allow vacuums to report oldestxmin
>>>>>>>
>>>>>>> Allow VACUUM and Autovacuum to report the oldestxmin value they
>>>>>>> used while cleaning tables, helping to make better sense out of
>>>>>>> the other statistics we report in various cases.
>>>>>>>
>>>>>>> Branch
>>>>>>> --
>>>>>>> master
>>>>>>>
>>>>>>> Details
>>>>>>> ---
>>>>>>> http://git.postgresql.org/pg/commitdiff/9eb344faf54a849898d9be012ddfa8204cfeb57c
>>>>>>>
>>>>>>> Modified Files
>>>>>>> --
>>>>>>> src/backend/commands/vacuumlazy.c | 9 +
>>>>>>> 1 file changed, 5 insertions(+), 4 deletions(-)
>>>>>>>
>>>>>>>
>>>>>>
>>>>>> Should we change the example in vacuum.sgml file as well? Attached patch.
>>>>>
>>>>> "tuples" in the above should be "row versions"?
>>>>> We should review not only this line but also all the lines in the example
>>>>> of VERBOSE output, I think.
>>>>
>>>> Right. These verbose log messages are out of date. I ran
>>>> VACUUM(VERBOSE, ANALYZE) with same scenario as current example as
>>>> possible. Attached patch updates verbose log messages.
>>>>
>>>>
>>>
>>> Surprisingly the changes "tuples" -> "row versions" in vacuumlazy.c is
>>> introduced by commit feb4f44d296b88b7f0723f4a4f3945a371276e0b in 2003.
>>
>> This is the evidence that no one cares about the details of VACUUM VERBOSE
>> output example. So I'm tempted to simplify the example (please see the
>> attached patch) instead of keeping updating the example.
>
> Yes. I agree.

Pushed. I back-patched to all supported versions according to
Alvaro's comment upthread.

Regards,

-- 
Fujii Masao


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] [patch] reorder tablespaces in basebackup tar stream for backup_label

2017-03-30 Thread Fujii Masao

On Wed, Mar 29, 2017 at 5:36 PM, Kyotaro HORIGUCHI
<horiguchi.kyot...@lab.ntt.co.jp> wrote:
> Hello,
>
> At Wed, 29 Mar 2017 09:23:42 +0200, Michael Banck <michael.ba...@credativ.de> 
> wrote in <149077.18436.14.ca...@credativ.de>
>> Hi,
>>
>> Am Mittwoch, den 29.03.2017, 15:22 +0900 schrieb Michael Paquier:
>> > On Wed, Mar 29, 2017 at 3:56 AM, Fujii Masao <masao.fu...@gmail.com> wrote:
>> > > If your need other information except START WAL LOCATION at the 
>> > > beginning of
>> > > base backup and they are very useful for many third-party softwares,
>> > > you can add them into that first result set. If you do this, you can
>> > > retrieve them
>> > > at the beginning even when WAL files are included in the backup.
>> >
>> > You mean in the result tuple of pg_start_backup(), right? Why not.
>>
>> The replication protocol chapter says: "When the backup is started, the
>> server will first send two ordinary result sets, followed by one or more
>> CopyResponse results. The first ordinary result set contains the
>> starting position of the backup, in a single row with two columns."
>>
>> However, I don't think it is very obvious to users (or at least it is
>> not to me) how to get at this from psql, if you want to script it.  If I

I don't think that using psql to run BASE_BACKUP command is good
approach. Instead, IMO you basically should implement the client program
which can handle BASE_BACKUP properly, or extend pg_basebackup
so that you can do what you want to do.

Regards,

-- 
Fujii Masao


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] [patch] reorder tablespaces in basebackup tar stream for backup_label

2017-03-28 Thread Fujii Masao

On Tue, Feb 21, 2017 at 7:17 PM, Michael Banck
<michael.ba...@credativ.de> wrote:
> Hi,
>
> currently, the backup_label and (I think) the tablespace_map files are
> (by design) conveniently located at the beginning of the main tablespace
> tarball when making a basebackup. However, (by accident or also by
> design?) the main tablespace is the last tarball[1] to be streamed via
> the BASE_BACKUP replication protocol command.
>
> For pg_basebackup, this is not a real problem, as either each tablespace
> is its own tarfile (in tar format mode), or the files are extracted
> anyway (in plain mode).
>
> However, third party tools using the BASE_BACKUP command might want to
> extract the backup_label, e.g. in order to figure out the START WAL
> LOCATION.

The first result set that BASE BACKUP sends back to a client contains
the START WAL LOCATION. So at first, ISTM that you don't need backup_label
for that purpose.

If your need other information except START WAL LOCATION at the beginning of
base backup and they are very useful for many third-party softwares,
you can add them into that first result set. If you do this, you can
retrieve them
at the beginning even when WAL files are included in the backup.

Regards,

-- 
Fujii Masao

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] pg_stat_wal_write statistics view

2017-03-28 Thread Fujii Masao

On Tue, Mar 28, 2017 at 1:40 PM, Haribabu Kommi
<kommi.harib...@gmail.com> wrote:
>
>
> On Mon, Mar 27, 2017 at 1:27 PM, Haribabu Kommi <kommi.harib...@gmail.com>
> wrote:
>>
>>
>>
>> On Sat, Mar 25, 2017 at 6:40 AM, Fujii Masao <masao.fu...@gmail.com>
>> wrote:
>>>
>>> On Wed, Feb 15, 2017 at 12:53 PM, Haribabu Kommi
>>> <kommi.harib...@gmail.com> wrote:
>>> >
>>> >
>>> > On Wed, Feb 8, 2017 at 9:36 PM, Amit Kapila <amit.kapil...@gmail.com>
>>> > wrote:
>>> >>
>>> >> On Tue, Feb 7, 2017 at 11:47 AM, Haribabu Kommi
>>> >> <kommi.harib...@gmail.com> wrote:
>>> >> > Hi Hackers,
>>> >> >
>>> >> > I just want to discuss adding of a new statistics view that provides
>>> >> > the information of wal writing details as follows
>>> >> >
>>> >>
>>> >> +1.  I think it will be useful to observe WAL activity.
>>> >
>>> >
>>> > Thanks for your opinion.
>>> >
>>> >> > postgres=# \d pg_stat_wal_writer
>>> >> > View "pg_catalog.pg_stat_wal_writer"
>>> >> > Column |   Type   | Collation |
>>> >> > Nullable
>>> >> > |
>>> >> > Default
>>> >> >
>>> >> >
>>> >> > ---+--+---+--+-
>>> >> >  num_backend_writes   | bigint   |   |
>>> >> > |
>>> >> >  num_total_writes  | bigint   |   |
>>> >> > |
>>> >> >  num_blocks  | bigint   |   |  |
>>> >> >  total_write_time   | bigint|   |  |
>>> >> >  stats_reset   | timestamp with time zone |   |
>>> >> > |
>>> >> >
>>> >> > The columns of the view are
>>> >> > 1. Total number of xlog writes that are called from the backend.
>>> >> > 2. Total number of xlog writes that are called from both backend
>>> >> >  and background workers. (This column can be changed to just
>>> >> >  display on the background writes).
>>> >> > 3. The number of the blocks that are written.
>>> >> > 4. Total write_time of the IO operation it took, this variable data
>>> >> > is
>>> >> > filled only when the track_io_timing GUC is enabled.
>>> >>
>>> >> So, here is *write_time* the total time system has spent in WAL
>>> >> writing before the last reset?
>>> >
>>> >
>>> > total write_time spent in WAL writing "after" the last reset in
>>> > milliseconds.
>>> >
>>> >> I think there should be a separate column for write and sync time.
>>> >>
>>> >
>>> > Added.
>>> >
>>> >>
>>> >> > Or it is possible to integrate the new columns into the existing
>>> >> > pg_stat_bgwriter view also.
>>> >> >
>>> >>
>>> >> I feel separate view is better.
>>> >
>>> >
>>> > Ok.
>>> >
>>> > Following the sample out of the view after regress run.
>>> >
>>> > postgres=# select * from pg_stat_walwrites;
>>> > -[ RECORD 1 ]--+--
>>> > backend_writes | 19092
>>> > writes | 663
>>> > write_blocks   | 56116
>>> > write_time | 0
>>> > sync_time  | 3064
>>> > stats_reset| 2017-02-15 13:37:09.454314+11
>>> >
>>> > Currently, writer, walwriter and checkpointer processes
>>> > are considered as background processes that can do
>>> > the wal write mainly.
>>
>>
>> Thanks for the review.
>>
>>> I'm not sure if this categorization is good. You told that this view is
>>> useful
>>> to tune walwriter parameters at the top of this thread. If so, ISTM that
>>> the information about walwriter's activity should be separated from
>>> others.
>>
>>
>> Yes, that's correct. First I thought of providing the statistics of
>> walwriter, but
>> later in

Re: [HACKERS] [COMMITTERS] pgsql: Allow vacuums to report oldestxmin

2017-03-28 Thread Fujii Masao

On Wed, Mar 29, 2017 at 1:32 AM, Fujii Masao <masao.fu...@gmail.com> wrote:
> On Tue, Mar 28, 2017 at 1:06 AM, Masahiko Sawada <sawada.m...@gmail.com> 
> wrote:
>> On Sun, Mar 26, 2017 at 2:26 AM, Masahiko Sawada <sawada.m...@gmail.com> 
>> wrote:
>>> On Sun, Mar 26, 2017 at 1:37 AM, Fujii Masao <masao.fu...@gmail.com> wrote:
>>>> On Mon, Mar 6, 2017 at 9:37 PM, Masahiko Sawada <sawada.m...@gmail.com> 
>>>> wrote:
>>>>> On Fri, Mar 3, 2017 at 10:50 PM, Simon Riggs <si...@2ndquadrant.com> 
>>>>> wrote:
>>>>>> Allow vacuums to report oldestxmin
>>>>>>
>>>>>> Allow VACUUM and Autovacuum to report the oldestxmin value they
>>>>>> used while cleaning tables, helping to make better sense out of
>>>>>> the other statistics we report in various cases.
>>>>>>
>>>>>> Branch
>>>>>> --
>>>>>> master
>>>>>>
>>>>>> Details
>>>>>> ---
>>>>>> http://git.postgresql.org/pg/commitdiff/9eb344faf54a849898d9be012ddfa8204cfeb57c
>>>>>>
>>>>>> Modified Files
>>>>>> --
>>>>>> src/backend/commands/vacuumlazy.c | 9 +
>>>>>> 1 file changed, 5 insertions(+), 4 deletions(-)
>>>>>>
>>>>>>
>>>>>
>>>>> Should we change the example in vacuum.sgml file as well? Attached patch.
>>>>
>>>> "tuples" in the above should be "row versions"?
>>>> We should review not only this line but also all the lines in the example
>>>> of VERBOSE output, I think.
>>>
>>> Right. These verbose log messages are out of date. I ran
>>> VACUUM(VERBOSE, ANALYZE) with same scenario as current example as
>>> possible. Attached patch updates verbose log messages.
>>>
>>>
>>
>> Surprisingly the changes "tuples" -> "row versions" in vacuumlazy.c is
>> introduced by commit feb4f44d296b88b7f0723f4a4f3945a371276e0b in 2003.
>
> This is the evidence that no one cares about the details of VACUUM VERBOSE
> output example. So I'm tempted to simplify the example (please see the
> attached patch) instead of keeping updating the example.

Patch attached.

Regards,

-- 
Fujii Masao


vacuum-example.patch
Description: Binary data

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] [COMMITTERS] pgsql: Allow vacuums to report oldestxmin

2017-03-28 Thread Fujii Masao

On Tue, Mar 28, 2017 at 1:06 AM, Masahiko Sawada <sawada.m...@gmail.com> wrote:
> On Sun, Mar 26, 2017 at 2:26 AM, Masahiko Sawada <sawada.m...@gmail.com> 
> wrote:
>> On Sun, Mar 26, 2017 at 1:37 AM, Fujii Masao <masao.fu...@gmail.com> wrote:
>>> On Mon, Mar 6, 2017 at 9:37 PM, Masahiko Sawada <sawada.m...@gmail.com> 
>>> wrote:
>>>> On Fri, Mar 3, 2017 at 10:50 PM, Simon Riggs <si...@2ndquadrant.com> wrote:
>>>>> Allow vacuums to report oldestxmin
>>>>>
>>>>> Allow VACUUM and Autovacuum to report the oldestxmin value they
>>>>> used while cleaning tables, helping to make better sense out of
>>>>> the other statistics we report in various cases.
>>>>>
>>>>> Branch
>>>>> --
>>>>> master
>>>>>
>>>>> Details
>>>>> ---
>>>>> http://git.postgresql.org/pg/commitdiff/9eb344faf54a849898d9be012ddfa8204cfeb57c
>>>>>
>>>>> Modified Files
>>>>> --
>>>>> src/backend/commands/vacuumlazy.c | 9 +
>>>>> 1 file changed, 5 insertions(+), 4 deletions(-)
>>>>>
>>>>>
>>>>
>>>> Should we change the example in vacuum.sgml file as well? Attached patch.
>>>
>>> "tuples" in the above should be "row versions"?
>>> We should review not only this line but also all the lines in the example
>>> of VERBOSE output, I think.
>>
>> Right. These verbose log messages are out of date. I ran
>> VACUUM(VERBOSE, ANALYZE) with same scenario as current example as
>> possible. Attached patch updates verbose log messages.
>>
>>
>
> Surprisingly the changes "tuples" -> "row versions" in vacuumlazy.c is
> introduced by commit feb4f44d296b88b7f0723f4a4f3945a371276e0b in 2003.

This is the evidence that no one cares about the details of VACUUM VERBOSE
output example. So I'm tempted to simplify the example (please see the
attached patch) instead of keeping updating the example.

> If this patch is applied, it should back-patch to all supported
> branches.

Not sure if it's worth back-patching that.

Regards,

-- 
Fujii Masao


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

[HACKERS] logical replication worker and statistics

2017-03-27 Thread Fujii Masao

Hi,

Logical replication worker should call pgstat_report_stat()?
Currently it doesn't seem to do that and no statistics about
table accesses by logical replication workers are collected.
For example, this can prevent autovacuum from working on
those tables properly.

Regards,

-- 
Fujii Masao


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] [COMMITTERS] pgsql: Allow vacuums to report oldestxmin

2017-03-25 Thread Fujii Masao

On Mon, Mar 6, 2017 at 9:37 PM, Masahiko Sawada <sawada.m...@gmail.com> wrote:
> On Fri, Mar 3, 2017 at 10:50 PM, Simon Riggs <si...@2ndquadrant.com> wrote:
>> Allow vacuums to report oldestxmin
>>
>> Allow VACUUM and Autovacuum to report the oldestxmin value they
>> used while cleaning tables, helping to make better sense out of
>> the other statistics we report in various cases.
>>
>> Branch
>> --
>> master
>>
>> Details
>> ---
>> http://git.postgresql.org/pg/commitdiff/9eb344faf54a849898d9be012ddfa8204cfeb57c
>>
>> Modified Files
>> --
>> src/backend/commands/vacuumlazy.c | 9 +
>> 1 file changed, 5 insertions(+), 4 deletions(-)
>>
>>
>
> Should we change the example in vacuum.sgml file as well? Attached patch.

"tuples" in the above should be "row versions"?
We should review not only this line but also all the lines in the example
of VERBOSE output, I think.

Regards,

-- 
Fujii Masao


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] pg_stat_wal_write statistics view

2017-03-24 Thread Fujii Masao

On Wed, Feb 15, 2017 at 12:53 PM, Haribabu Kommi
<kommi.harib...@gmail.com> wrote:
>
>
> On Wed, Feb 8, 2017 at 9:36 PM, Amit Kapila <amit.kapil...@gmail.com> wrote:
>>
>> On Tue, Feb 7, 2017 at 11:47 AM, Haribabu Kommi
>> <kommi.harib...@gmail.com> wrote:
>> > Hi Hackers,
>> >
>> > I just want to discuss adding of a new statistics view that provides
>> > the information of wal writing details as follows
>> >
>>
>> +1.  I think it will be useful to observe WAL activity.
>
>
> Thanks for your opinion.
>
>> > postgres=# \d pg_stat_wal_writer
>> > View "pg_catalog.pg_stat_wal_writer"
>> > Column |   Type   | Collation | Nullable
>> > |
>> > Default
>> >
>> > ---+--+---+--+-
>> >  num_backend_writes   | bigint   |   |
>> > |
>> >  num_total_writes  | bigint   |   |  |
>> >  num_blocks  | bigint   |   |  |
>> >  total_write_time   | bigint|   |  |
>> >  stats_reset   | timestamp with time zone |   |
>> > |
>> >
>> > The columns of the view are
>> > 1. Total number of xlog writes that are called from the backend.
>> > 2. Total number of xlog writes that are called from both backend
>> >  and background workers. (This column can be changed to just
>> >  display on the background writes).
>> > 3. The number of the blocks that are written.
>> > 4. Total write_time of the IO operation it took, this variable data is
>> > filled only when the track_io_timing GUC is enabled.
>>
>> So, here is *write_time* the total time system has spent in WAL
>> writing before the last reset?
>
>
> total write_time spent in WAL writing "after" the last reset in
> milliseconds.
>
>> I think there should be a separate column for write and sync time.
>>
>
> Added.
>
>>
>> > Or it is possible to integrate the new columns into the existing
>> > pg_stat_bgwriter view also.
>> >
>>
>> I feel separate view is better.
>
>
> Ok.
>
> Following the sample out of the view after regress run.
>
> postgres=# select * from pg_stat_walwrites;
> -[ RECORD 1 ]--+--
> backend_writes | 19092
> writes | 663
> write_blocks   | 56116
> write_time | 0
> sync_time  | 3064
> stats_reset| 2017-02-15 13:37:09.454314+11
>
> Currently, writer, walwriter and checkpointer processes
> are considered as background processes that can do
> the wal write mainly.

I'm not sure if this categorization is good. You told that this view is useful
to tune walwriter parameters at the top of this thread. If so, ISTM that
the information about walwriter's activity should be separated from others.

What about other processes which *can* write WAL, for example walsender
(e.g., BASE_BACKUP can cause WAL record), startup process (e.g., end-of-
recovery checkpoint) and logical replication worker (Not sure if it always
works with synchronous_commit=off, though). There might be other processes
which can write WAL.

Why didn't you separate "write_blocks", "write_time" and "sync_time" per
the process category, like "backend_writes" and "writes"?

This view doesn't seem to take into consideration the WAL writing and flushing
during creating new WAL segment file.

I think that it's better to make this view report also the number of WAL pages
which are written when wal_buffer is full. This information is useful to
tune the size of wal_buffers. This was proposed by Nagayasu before.
https://www.postgresql.org/message-id/4ff824f3.5090...@uptime.jp

Regards,

-- 
Fujii Masao

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Report the number of skipped frozen pages by manual VACUUM

2017-03-24 Thread Fujii Masao

On Thu, Mar 23, 2017 at 4:28 AM, Fujii Masao <masao.fu...@gmail.com> wrote:
> On Wed, Mar 15, 2017 at 7:51 PM, Masahiko Sawada <sawada.m...@gmail.com> 
> wrote:
>> On Wed, Mar 15, 2017 at 1:09 PM, Yugo Nagata <nag...@sraoss.co.jp> wrote:
>>> On Fri, 10 Mar 2017 20:08:42 +0900
>>> Masahiko Sawada <sawada.m...@gmail.com> wrote:
>>>
>>>> On Fri, Mar 10, 2017 at 3:58 PM, Jim Nasby <jim.na...@openscg.com> wrote:
>>>> > On 3/6/17 8:34 PM, Masahiko Sawada wrote:
>>>> >>
>>>> >> I don't think it can say "1 frozen pages" because the number of
>>>> >> skipped pages according to visibility map is always more than 32
>>>> >> (SKIP_PAGES_THRESHOLD).
>>>> >
>>>> >
>>>> > That's just an artifact of how the VM currently works. I'm not a fan of
>>>> > cross dependencies like that unless there's a pretty good reason.
>>>>
>>>> Thank you for the comment.
>>>> Agreed. Attached fixed version patch.
>>>
>>> It seems that it says "frozen pages" if pinskipped_pages is not zero,
>>> rather than about frozenskipped_pages.
>>>
>>> How about writing as below?
>>>
>>> appendStringInfo(, ngettext("Skipped %u page due to buffer pins, "
>>> "Skipped %u pages due to buffer pins, ",
>>> vacrelstats->pinskipped_pages),
>>>  vacrelstats->pinskipped_pages);
>>> appendStringInfo(, ngettext("%u frozen page.\n", "%u frozen 
>>> pages.\n",
>>> vacrelstats->frozenskipped_pages),
>>>  vacrelstats->frozenskipped_pages);
>>>
>>
>> Yeah, you're right. I've attached the updated version patch
>> incorporated your comment and fixing documentation.
>
> The patch looks very simple and good to me.
> Barring objection, I will commit this.

Pushed.

Regards,

-- 
Fujii Masao


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Report the number of skipped frozen pages by manual VACUUM

2017-03-22 Thread Fujii Masao

On Wed, Mar 15, 2017 at 7:51 PM, Masahiko Sawada <sawada.m...@gmail.com> wrote:
> On Wed, Mar 15, 2017 at 1:09 PM, Yugo Nagata <nag...@sraoss.co.jp> wrote:
>> On Fri, 10 Mar 2017 20:08:42 +0900
>> Masahiko Sawada <sawada.m...@gmail.com> wrote:
>>
>>> On Fri, Mar 10, 2017 at 3:58 PM, Jim Nasby <jim.na...@openscg.com> wrote:
>>> > On 3/6/17 8:34 PM, Masahiko Sawada wrote:
>>> >>
>>> >> I don't think it can say "1 frozen pages" because the number of
>>> >> skipped pages according to visibility map is always more than 32
>>> >> (SKIP_PAGES_THRESHOLD).
>>> >
>>> >
>>> > That's just an artifact of how the VM currently works. I'm not a fan of
>>> > cross dependencies like that unless there's a pretty good reason.
>>>
>>> Thank you for the comment.
>>> Agreed. Attached fixed version patch.
>>
>> It seems that it says "frozen pages" if pinskipped_pages is not zero,
>> rather than about frozenskipped_pages.
>>
>> How about writing as below?
>>
>> appendStringInfo(, ngettext("Skipped %u page due to buffer pins, "
>> "Skipped %u pages due to buffer pins, ",
>> vacrelstats->pinskipped_pages),
>>  vacrelstats->pinskipped_pages);
>> appendStringInfo(, ngettext("%u frozen page.\n", "%u frozen 
>> pages.\n",
>>     vacrelstats->frozenskipped_pages),
>>  vacrelstats->frozenskipped_pages);
>>
>
> Yeah, you're right. I've attached the updated version patch
> incorporated your comment and fixing documentation.

The patch looks very simple and good to me.
Barring objection, I will commit this.

Regards,

-- 
Fujii Masao


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] PATCH: Make pg_stop_backup() archive wait optional

2017-03-22 Thread Fujii Masao

On Thu, Mar 23, 2017 at 12:37 AM, Stephen Frost <sfr...@snowman.net> wrote:
> David, all,
>
> * David Steele (da...@pgmasters.net) wrote:
>> On 3/21/17 2:34 PM, Fujii Masao wrote:
>> >The patch basically looks good to me, but one comment is;
>> >backup.sgml (at least the description for "Making a non-exclusive
>> >low level backup) seems to need to be updated.
>>
>> Agreed.  Added in the attached patch and rebased on 8027556.

Thanks for updating the patch!

-SELECT * FROM pg_stop_backup(false);
+SELECT * FROM pg_stop_backup(false [, true ]);

I think that it's better to get rid of "[" and "]" from the above because
IMO this should be the command example that users actually can run.

+ If the backup process monitors the WAL archiving process independently,
+ the second parameter (which defaults to true) can be set to false to
+ prevent pg_stop_backup from blocking until all WAL is
+ archived.  Instead, the function will return as soon as the stop backup
+ record is written to the WAL.  This option must be used with caution:
+ if WAL archiving is not monitored correctly then the result might be a
+ useless backup.

You added this descriptions into the step #4 in the non-exclusive
backup procedure.. But since the step #5 already explains how
pg_stop_backup has to do with WAL archiving, I think that it's better
to update (or add something like the above descriptions into)
the step #5. Thought?

+ If the backup process monitors the WAL archiving process independently,

Can we explain "monitor the WAL archiving process" part a bit more
explicitly? For example, "monitor and ensure that all WAL segment files
required for the backup are successfully archived".

> I've started looking at this.  Seems pretty straight-forward and will
> try to get it committed later today.

Thanks!

Regards,

-- 
Fujii Masao

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] PATCH: Make pg_stop_backup() archive wait optional

2017-03-21 Thread Fujii Masao

On Tue, Mar 21, 2017 at 11:03 AM, Tsunakawa, Takayuki
<tsunakawa.ta...@jp.fujitsu.com> wrote:
> From: David Steele [mailto:da...@pgmasters.net]
>> Well, that's embarrassing.  When I recreated the function to add defaults
>> I messed up the AS clause and did not pay attention to the results of the
>> regression tests, apparently.
>>
>> Attached is a new version rebased on 88e66d1.  Catalog version bump has
>> also been omitted.
>
> I was late to reply because yesterday was a national holiday in Japan.  I 
> marked this as ready for committer.  The patch applied cleanly and worked as 
> expected.

The patch basically looks good to me, but one comment is;
backup.sgml (at least the description for "Making a non-exclusive
low level backup) seems to need to be updated.

Regards,

-- 
Fujii Masao


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Renaming of pg_xlog and pg_clog

2017-03-18 Thread Fujii Masao

On Fri, Mar 17, 2017 at 11:03 PM, Michael Paquier
<michael.paqu...@gmail.com> wrote:
> On Fri, Mar 17, 2017 at 10:58 PM, Robert Haas <robertmh...@gmail.com> wrote:
>> Fine!  I've committed the pg_clog renaming, but I'd really like to
>> draw the line here.  I'm not going to commit the pg_subtrans ->
>> pg_subxact naming and am -1 on anyone else doing so.  I think that
>> having the names of things in the code match what shows up in the file
>> system is an awfully desirable property which we should preserve
>> insofar as we can do so without compromising other goals.  Not
>> invalidating what users and DBAs think they know about how PostgreSQL
>> works is a good thing, too; there are a lot more people who know
>> something about the directory layout than will read this thread.
>
> I'm cool with that, thanks for the commit.

monitoring.sgml still has some references to "clog". We should update them?

Regards,

-- 
Fujii Masao


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] [bug fix] dblink leaks unnamed connections

2017-03-09 Thread Fujii Masao

On Wed, Mar 8, 2017 at 3:52 PM, Tsunakawa, Takayuki
<tsunakawa.ta...@jp.fujitsu.com> wrote:
> Hello,
>
> dblink fails to close the unnamed connection as follows when a new unnamed 
> connection is requested.  The attached patch fixes this.

This issue was reported about ten years ago and added as TODO item.
http://archives.postgresql.org/pgsql-hackers/2007-10/msg00895.php

I agree that this is a bug, and am tempted to back-patch to all the supported
versions. But it had not been fixed in many years since the first report of
the issue. So I'm not sure if it's ok to just treat this as a bug right now and
back-patch. Or we should fix this only in HEAD? Anyway I'd like to hear
more opinions about this.

Regards,

-- 
Fujii Masao

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Cannot shutdown subscriber after DROP SUBSCRIPTION

2017-03-08 Thread Fujii Masao

On Wed, Mar 8, 2017 at 9:05 PM, Masahiko Sawada <sawada.m...@gmail.com> wrote:
> On Wed, Feb 8, 2017 at 12:04 AM, Fujii Masao <masao.fu...@gmail.com> wrote:
>> On Tue, Feb 7, 2017 at 2:13 AM, Petr Jelinek
>> <petr.jeli...@2ndquadrant.com> wrote:
>>> On 06/02/17 17:33, Fujii Masao wrote:
>>>> On Sun, Feb 5, 2017 at 5:11 AM, Petr Jelinek
>>>> <petr.jeli...@2ndquadrant.com> wrote:
>>>>> On 03/02/17 19:38, Fujii Masao wrote:
>>>>>> On Sat, Feb 4, 2017 at 12:49 AM, Fujii Masao <masao.fu...@gmail.com> 
>>>>>> wrote:
>>>>>>> On Fri, Feb 3, 2017 at 10:56 AM, Kyotaro HORIGUCHI
>>>>>>> <horiguchi.kyot...@lab.ntt.co.jp> wrote:
>>>>>>>> At Fri, 3 Feb 2017 01:02:47 +0900, Fujii Masao <masao.fu...@gmail.com> 
>>>>>>>> wrote in 
>>>>>>>>

[HACKERS] ALTER PUBLICATION and segmentation fault

2017-03-07 Thread Fujii Masao

Hi,

When I logged in PostgreSQL as non-superuser and ran
ALTER PUBLICATION command, I got a segmentation fault.
The code checking the owner of publication might have a bug.

=# CREATE ROLE foo NOSUPERUSER LOGIN
=# \c - foo
=> \dRp
  List of publications
 Name  |  Owner   | Inserts | Updates | Deletes
---+--+-+-+-
 mypub | postgres | t   | t   | t
=> ALTER PUBLICATION mypub RENAME TO hoge;

LOG:  server process (PID 80356) was terminated by signal 11: Segmentation fault
DETAIL:  Failed process was running: ALTER PUBLICATION mypub RENAME TO hoge;

Regards,

-- 
Fujii Masao


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Resolved typo in a comment

2017-02-21 Thread Fujii Masao

On Mon, Feb 20, 2017 at 7:34 AM, neha khatri <nehakhat...@gmail.com> wrote:
> Hi,
>
> Attached is a patch that fixes a comment typo in varlena.c

Thanks! Pushed.

Regards,

-- 
Fujii Masao


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] SUBSCRIPTION command hangs up, segfault occurs in the server process and to cancel the execution.

2017-02-21 Thread Fujii Masao

3912) at xlogreader.c:556
> #5  0x00545068 in XLogReadRecord (state=0x2732008,
> RecPtr=23400240, errormsg=0x7fff7cf34cf8) at xlogreader.c:255
> #6  0x007d09db in DecodingContextFindStartpoint
> (ctx=0x2731d48) at logical.c:432
> #7  0x007e3a87 in CreateReplicationSlot (cmd=0x26edfa0) at
> walsender.c:796
> #8  0x007e45ae in exec_replication_command
> (cmd_string=0x268b5f8 "CREATE_REPLICATION_SLOT \"sub_db2_test\"
> LOGICAL pgoutput") at walsender.c:1272
> #9  0x00846783 in PostgresMain (argc=1, argv=0x2697f38,
> dbname=0x2697e68 "db1", username=0x2697e40 "masahiko") at
> postgres.c:4064
> #10 0x007b34e0 in BackendRun (port=0x268f4b0) at postmaster.c:4310
> #11 0x007b2c60 in BackendStartup (port=0x268f4b0) at postmaster.c:3982
> #12 0x007af2e0 in ServerLoop () at postmaster.c:1722
> #13 0x007ae9a4 in PostmasterMain (argc=5, argv=0x266c020) at
> postmaster.c:1330
> #14 0x006f487a in main (argc=5, argv=0x266c020) at main.c:228
>
> I guess that the cause of SEGV is that reply_message is reset without
> initialize by initStringInfo. The reply_message and other buffers are
> initialized at beggning of WalSndLoop but resetStringInfo can be
> called by WalSndWaitForWal->ProcessRepliesIfAny without calling
> WalSndLoop when creating replication slot. To fix it, we can
> initialize reply_message in exec_replication_command like follows.

Yes. I pushed the patch which makes walsender always initialize
all three buffers.

Regards,

-- 
Fujii Masao


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Walsender crash

2017-02-21 Thread Fujii Masao

On Mon, Feb 13, 2017 at 8:55 PM, Stas Kelvich <s.kelv...@postgrespro.ru> wrote:
> Hello.
>
> While playing with logical publications/subscriptions I’ve noted that 
> walsender crashes
> when i’m trying to CREATE SUBSCRIPTION to a current server.
>
> So having one running postgres instance on a standard port:
>
> stas=# CREATE SUBSCRIPTION mysub CONNECTION 'dbname=stas host=127.0.0.1' 
> PUBLICATION mypub;
> ^CCancel request sent
> ERROR:  canceling statement due to user request
> stas=# CREATE SUBSCRIPTION mysub CONNECTION 'dbname=stas host=127.0.0.1' 
> PUBLICATION mypub;
> WARNING:  terminating connection because of crash of another server process
>
> Crash happens due to uninitialised StringBuffer reply_message in walsender.c:
>
> (lldb) bt
> * thread #1: tid = 0x13e3f, 0x00010e74ebaf 
> postgres`resetStringInfo(str=0x00010ecd12a0) + 15 at stringinfo.c:96, 
> queue = 'com.apple.main-thread', stop reason = EXC_BAD_ACCESS (code=1, 
> address=0x0)
>   * frame #0: 0x00010e74ebaf 
> postgres`resetStringInfo(str=0x00010ecd12a0) + 15 at stringinfo.c:96
> frame #1: 0x00010e8778c9 postgres`ProcessRepliesIfAny + 169 at 
> walsender.c:1347
> frame #2: 0x00010e8770fd postgres`WalSndWaitForWal(loc=35929928) + 
> 221 at walsender.c:1142
> frame #3: 0x00010e876cf2 
> postgres`logical_read_xlog_page(state=0x7fbcac0446f8, 
> targetPagePtr=35921920, reqLen=8008, targetRecPtr=35929904, cur_page="\x95?", 
> pageTLI=0x7fbcac044fa4) + 50 at walsender.c:717
> frame #4: 0x00010e571e5e 
> postgres`ReadPageInternal(state=0x7fbcac0446f8, pageptr=35921920, 
> reqLen=8008) + 542 at xlogreader.c:556
> frame #5: 0x00010e571336 
> postgres`XLogReadRecord(state=0x7fbcac0446f8, RecPtr=35929904, 
> errormsg=0x7fff5178f1e0) + 326 at xlogreader.c:255
> frame #6: 0x00010e85ea9b 
> postgres`DecodingContextFindStartpoint(ctx=0x7fbcac044438) + 139 at 
> logical.c:432
> frame #7: 0x00010e874dfd 
> postgres`CreateReplicationSlot(cmd=0x7fbcad8004d0) + 333 at 
> walsender.c:796
> frame #8: 0x00010e8747a4 
> postgres`exec_replication_command(cmd_string="CREATE_REPLICATION_SLOT 
> \"mysub\" LOGICAL pgoutput") + 532 at walsender.c:1272
> frame #9: 0x00010e8e6adc postgres`PostgresMain(argc=1, 
> argv=0x7fbcad005f50, dbname="stas", username="stas") + 2332 at 
> postgres.c:4064
> frame #10: 0x00010e839abe 
> postgres`BackendRun(port=0x7fbcabc033a0) + 654 at postmaster.c:4310
> frame #11: 0x00010e838eb3 
> postgres`BackendStartup(port=0x7fbcabc033a0) + 483 at postmaster.c:3982
> frame #12: 0x00010e837ea5 postgres`ServerLoop + 597 at 
> postmaster.c:1722
> frame #13: 0x00010e8356b5 postgres`PostmasterMain(argc=3, 
> argv=0x7fbcabc02b90) + 5397 at postmaster.c:1330
> frame #14: 0x00010e76371f postgres`main(argc=3, 
> argv=0x7fbcabc02b90) + 751 at main.c:228
> frame #15: 0x00007fffa951c255 libdyld.dylib`start + 1
> frame #16: 0x7fffa951c255 libdyld.dylib`start + 1
>
> Patch with lacking initStringInfo() attached.

Thanks for the report and patch!
I pushed the modified version of the patch.

Regards,

-- 
Fujii Masao


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] DROP SUBSCRIPTION and ROLLBACK

2017-02-21 Thread Fujii Masao

On Tue, Feb 21, 2017 at 7:52 PM, Masahiko Sawada <sawada.m...@gmail.com> wrote:
> On Tue, Feb 21, 2017 at 3:42 AM, Fujii Masao <masao.fu...@gmail.com> wrote:
>> On Thu, Feb 16, 2017 at 11:40 AM, Masahiko Sawada <sawada.m...@gmail.com> 
>> wrote:
>>> On Thu, Feb 16, 2017 at 7:52 AM, Petr Jelinek
>>> <petr.jeli...@2ndquadrant.com> wrote:
>>>> On 15/02/17 06:43, Masahiko Sawada wrote:
>>>>> On Tue, Feb 14, 2017 at 1:13 AM, Fujii Masao <masao.fu...@gmail.com> 
>>>>> wrote:
>>>>>> On Mon, Feb 13, 2017 at 4:05 PM, Masahiko Sawada <sawada.m...@gmail.com> 
>>>>>> wrote:
>>>>>>> On Sat, Feb 11, 2017 at 8:18 PM, Petr Jelinek
>>>>>>> <petr.jeli...@2ndquadrant.com> wrote:
>>>>>>>> On 10/02/17 19:55, Masahiko Sawada wrote:
>>>>>>>>> On Thu, Feb 9, 2017 at 12:44 AM, Petr Jelinek
>>>>>>>>> <petr.jeli...@2ndquadrant.com> wrote:
>>>>>>>>>> On 08/02/17 07:40, Masahiko Sawada wrote:
>>>>>>>>>>> On Wed, Feb 8, 2017 at 9:01 AM, Michael Paquier
>>>>>>>>>>> <michael.paqu...@gmail.com> wrote:
>>>>>>>>>>>> On Wed, Feb 8, 2017 at 1:30 AM, Fujii Masao 
>>>>>>>>>>>> <masao.fu...@gmail.com> wrote:
>>>>>>>>>>>>> On Wed, Feb 8, 2017 at 12:26 AM, Petr Jelinek
>>>>>>>>>>>>> <petr.jeli...@2ndquadrant.com> wrote:
>>>>>>>>>>>>>> For example what happens if apply crashes during the DROP
>>>>>>>>>>>>>> SUBSCRIPTION/COMMIT and is not started because the delete from 
>>>>>>>>>>>>>> catalog
>>>>>>>>>>>>>> is now visible so the subscription is no longer there?
>>>>>>>>>>>>>
>>>>>>>>>>>>> Another idea is to treat DROP SUBSCRIPTION in the same way as 
>>>>>>>>>>>>> VACUUM, i.e.,
>>>>>>>>>>>>> make it emit an error if it's executed within user's transaction 
>>>>>>>>>>>>> block.
>>>>>>>>>>>>
>>>>>>>>>>>> It seems to me that this is exactly Petr's point: using
>>>>>>>>>>>> PreventTransactionChain() to prevent things to happen.
>>>>>>>>>>>
>>>>>>>>>>> Agreed. It's better to prevent to be executed inside user 
>>>>>>>>>>> transaction
>>>>>>>>>>> block. And I understood there is too many failure scenarios we need 
>>>>>>>>>>> to
>>>>>>>>>>> handle.
>>>>>>>>>>>
>>>>>>>>>>>>> Also DROP SUBSCRIPTION should call CommitTransactionCommand() just
>>>>>>>>>>>>> after removing the entry from pg_subscription, then connect to 
>>>>>>>>>>>>> the publisher
>>>>>>>>>>>>> and remove the replication slot.
>>>>>>>>>>>>
>>>>>>>>>>>> For consistency that may be important.
>>>>>>>>>>>
>>>>>>>>>>> Agreed.
>>>>>>>>>>>
>>>>>>>>>>> Attached patch, please give me feedback.
>>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> This looks good (and similar to what initial patch had btw). Works 
>>>>>>>>>> fine
>>>>>>>>>> for me as well.
>>>>>>>>>>
>>>>>>>>>> Remaining issue is, what to do about CREATE SUBSCRIPTION then, there 
>>>>>>>>>> are
>>>>>>>>>> similar failure scenarios there, should we prevent it from running
>>>>>>>>>> inside transaction as well?
>>>>>>>>>>
>>>>>>>>>
>>>>>>>>> Hmm,  after thought I suspect current discussing approach. For
>>>>>>>>> example, please image the case where CRAETE SUBSCRIPTION c

Re: [HACKERS] DROP SUBSCRIPTION and ROLLBACK

2017-02-20 Thread Fujii Masao

On Thu, Feb 16, 2017 at 11:40 AM, Masahiko Sawada <sawada.m...@gmail.com> wrote:
> On Thu, Feb 16, 2017 at 7:52 AM, Petr Jelinek
> <petr.jeli...@2ndquadrant.com> wrote:
>> On 15/02/17 06:43, Masahiko Sawada wrote:
>>> On Tue, Feb 14, 2017 at 1:13 AM, Fujii Masao <masao.fu...@gmail.com> wrote:
>>>> On Mon, Feb 13, 2017 at 4:05 PM, Masahiko Sawada <sawada.m...@gmail.com> 
>>>> wrote:
>>>>> On Sat, Feb 11, 2017 at 8:18 PM, Petr Jelinek
>>>>> <petr.jeli...@2ndquadrant.com> wrote:
>>>>>> On 10/02/17 19:55, Masahiko Sawada wrote:
>>>>>>> On Thu, Feb 9, 2017 at 12:44 AM, Petr Jelinek
>>>>>>> <petr.jeli...@2ndquadrant.com> wrote:
>>>>>>>> On 08/02/17 07:40, Masahiko Sawada wrote:
>>>>>>>>> On Wed, Feb 8, 2017 at 9:01 AM, Michael Paquier
>>>>>>>>> <michael.paqu...@gmail.com> wrote:
>>>>>>>>>> On Wed, Feb 8, 2017 at 1:30 AM, Fujii Masao <masao.fu...@gmail.com> 
>>>>>>>>>> wrote:
>>>>>>>>>>> On Wed, Feb 8, 2017 at 12:26 AM, Petr Jelinek
>>>>>>>>>>> <petr.jeli...@2ndquadrant.com> wrote:
>>>>>>>>>>>> For example what happens if apply crashes during the DROP
>>>>>>>>>>>> SUBSCRIPTION/COMMIT and is not started because the delete from 
>>>>>>>>>>>> catalog
>>>>>>>>>>>> is now visible so the subscription is no longer there?
>>>>>>>>>>>
>>>>>>>>>>> Another idea is to treat DROP SUBSCRIPTION in the same way as 
>>>>>>>>>>> VACUUM, i.e.,
>>>>>>>>>>> make it emit an error if it's executed within user's transaction 
>>>>>>>>>>> block.
>>>>>>>>>>
>>>>>>>>>> It seems to me that this is exactly Petr's point: using
>>>>>>>>>> PreventTransactionChain() to prevent things to happen.
>>>>>>>>>
>>>>>>>>> Agreed. It's better to prevent to be executed inside user transaction
>>>>>>>>> block. And I understood there is too many failure scenarios we need to
>>>>>>>>> handle.
>>>>>>>>>
>>>>>>>>>>> Also DROP SUBSCRIPTION should call CommitTransactionCommand() just
>>>>>>>>>>> after removing the entry from pg_subscription, then connect to the 
>>>>>>>>>>> publisher
>>>>>>>>>>> and remove the replication slot.
>>>>>>>>>>
>>>>>>>>>> For consistency that may be important.
>>>>>>>>>
>>>>>>>>> Agreed.
>>>>>>>>>
>>>>>>>>> Attached patch, please give me feedback.
>>>>>>>>>
>>>>>>>>
>>>>>>>> This looks good (and similar to what initial patch had btw). Works fine
>>>>>>>> for me as well.
>>>>>>>>
>>>>>>>> Remaining issue is, what to do about CREATE SUBSCRIPTION then, there 
>>>>>>>> are
>>>>>>>> similar failure scenarios there, should we prevent it from running
>>>>>>>> inside transaction as well?
>>>>>>>>
>>>>>>>
>>>>>>> Hmm,  after thought I suspect current discussing approach. For
>>>>>>> example, please image the case where CRAETE SUBSCRIPTION creates
>>>>>>> subscription successfully but fails to create replication slot for
>>>>>>> whatever reason, and then DROP SUBSCRIPTION drops the subscription but
>>>>>>> dropping replication slot is failed. In such case, CREAET SUBSCRIPTION
>>>>>>> and DROP SUBSCRIPTION return ERROR but the subscription is created and
>>>>>>> dropped successfully. I think that this behaviour confuse the user.
>>>>>>>
>>>>>>> I think we should just prevent calling DROP SUBSCRIPTION in user's
>>>>>>> transaction block. Or I guess that it could be better to separate the
>>>>>>> starting/stopping logical replication from subscription managem

Re: [HACKERS] Provide list of subscriptions and publications in psql's completion

2017-02-20 Thread Fujii Masao

On Fri, Feb 17, 2017 at 11:17 PM, Peter Eisentraut
<peter.eisentr...@2ndquadrant.com> wrote:
> On 2/13/17 12:07, Fujii Masao wrote:
>> Anyway IMO that we can expose all the
>> columns except the sensitive information (i.e., subconninfo field)
>> in pg_subscription to even non-superusers.
>
> You mean with column privileges?

Yes.

So there are several approaches...

1) Expose all the columns except subconninfo in pg_subscription to
non-superusers. In this idea, the tab-completion and psql meta-command
for subscription still sees pg_subscription. One good point of
idea is that even non-superusers can see all the useful information
about subscriptions other than sensitive information like conninfo.

2) Change pg_stat_subscription so that it also shows all the columns except
subconninfo in pg_subscription. Also change the tab-completion and
psql meta-command for subscription so that they see pg_stat_subscription
instead of pg_subscription. One good point is that pg_stat_subscription
exposes all the useful information about subscription to even
non-superusers. OTOH, pg_subscription and pg_stat_subscription have
to have several same columns. This would be redundant and a bit confusing.

3) Expose subdbid in pg_stat_subscription. Also change the tab-completion
and psql meta-command for subscription so that they see
pg_stat_subscription. This change is very simple. But non-superusers cannot
see useful information like subslotname because of privilege problem.

I like #1, but any better approach?

Regards,

-- 
Fujii Masao

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

[HACKERS] Re: [COMMITTERS] pgsql: Remove all references to "xlog" from SQL-callable functions in p

2017-02-13 Thread Fujii Masao

On Tue, Feb 14, 2017 at 2:06 AM, Robert Haas <robertmh...@gmail.com> wrote:
> On Mon, Feb 13, 2017 at 11:43 AM, Fujii Masao <masao.fu...@gmail.com> wrote:
>> On Fri, Feb 10, 2017 at 5:36 AM, Robert Haas <rh...@postgresql.org> wrote:
>>> Remove all references to "xlog" from SQL-callable functions in pg_proc.
>>>
>>> Commit f82ec32ac30ae7e3ec7c84067192535b2ff8ec0e renamed the pg_xlog
>>> directory to pg_wal.  To make things consistent, and because "xlog" is
>>> terrible terminology for either "transaction log" or "write-ahead log"
>>> rename all SQL-callable functions that contain "xlog" in the name to
>>> instead contain "wal".  (Note that this may pose an upgrade hazard for
>>> some users.)
>>
>> There are still some mentions to "xlog" in the doc. Maybe we should replace
>> them with "wal" as the attached patch does.
>
> +1.  Patch looks good to me.

Thanks! Pushed.

Regards,

-- 
Fujii Masao


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Provide list of subscriptions and publications in psql's completion

2017-02-13 Thread Fujii Masao

On Fri, Feb 10, 2017 at 1:52 PM, Michael Paquier
<michael.paqu...@gmail.com> wrote:
> On Tue, Feb 7, 2017 at 6:35 AM, Peter Eisentraut
> <peter.eisentr...@2ndquadrant.com> wrote:
>> On 2/6/17 10:54 AM, Fujii Masao wrote:
>>> Yes, that's an option. And, if we add dbid to pg_stat_subscription,
>>> I'm tempted to add all the pg_subscription's columns except subconninfo
>>> into pg_stat_subscription. Since subconninfo may contain security-sensitive
>>> information, it should be excluded. But it seems useful to expose other
>>> columns. Then, if we expose all the columns except subconninfo, maybe
>>> it's better to just revoke subconninfo column on pg_subscription instead of
>>> all columns. Thought?
>>
>> I think previous practice is to provide a view such as pg_subscriptions
>> that contains all the nonprivileged information.
>
> OK, I think that I see the point you are coming at:
> pg_stat_get_subscription (or stat tables) should not be used for
> psql's tab completion. So gathering all things discussed, we have:
> - No tab completion for publications.

No, the tab-completions for ALTER/DROP PUBLICATION should show the local
publications because those commands drop and alter the local ones. OTOH,
CREATE/ALTER SUBSCRIPTOIN ... PUBLICATION should show nothing because
the remote publications in the publisher side should be specified there.

> - Fix the meta-commands.

Yes.

> - Addition of a view pg_subscriptions with all the non-sensitive data.
> (- Or really extend pg_stat_subscriptions with the database ID and use
> it for tab completion?)

Probably I failed to get Peter's point... Anyway IMO that we can expose all the
columns except the sensitive information (i.e., subconninfo field)
in pg_subscription to even non-superusers. Then we can use pg_subscription
for the tab-completion for ALTER/DROP SUBSCRIPTION.

Regards,

-- 
Fujii Masao


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

[HACKERS] Re: [COMMITTERS] pgsql: Remove all references to "xlog" from SQL-callable functions in p

2017-02-13 Thread Fujii Masao

On Fri, Feb 10, 2017 at 5:36 AM, Robert Haas <rh...@postgresql.org> wrote:
> Remove all references to "xlog" from SQL-callable functions in pg_proc.
>
> Commit f82ec32ac30ae7e3ec7c84067192535b2ff8ec0e renamed the pg_xlog
> directory to pg_wal.  To make things consistent, and because "xlog" is
> terrible terminology for either "transaction log" or "write-ahead log"
> rename all SQL-callable functions that contain "xlog" in the name to
> instead contain "wal".  (Note that this may pose an upgrade hazard for
> some users.)

There are still some mentions to "xlog" in the doc. Maybe we should replace
them with "wal" as the attached patch does.

Regards,

-- 
Fujii Masao


xlog2wal.patch
Description: Binary data

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] DROP SUBSCRIPTION and ROLLBACK

2017-02-13 Thread Fujii Masao

On Mon, Feb 13, 2017 at 4:05 PM, Masahiko Sawada <sawada.m...@gmail.com> wrote:
> On Sat, Feb 11, 2017 at 8:18 PM, Petr Jelinek
> <petr.jeli...@2ndquadrant.com> wrote:
>> On 10/02/17 19:55, Masahiko Sawada wrote:
>>> On Thu, Feb 9, 2017 at 12:44 AM, Petr Jelinek
>>> <petr.jeli...@2ndquadrant.com> wrote:
>>>> On 08/02/17 07:40, Masahiko Sawada wrote:
>>>>> On Wed, Feb 8, 2017 at 9:01 AM, Michael Paquier
>>>>> <michael.paqu...@gmail.com> wrote:
>>>>>> On Wed, Feb 8, 2017 at 1:30 AM, Fujii Masao <masao.fu...@gmail.com> 
>>>>>> wrote:
>>>>>>> On Wed, Feb 8, 2017 at 12:26 AM, Petr Jelinek
>>>>>>> <petr.jeli...@2ndquadrant.com> wrote:
>>>>>>>> For example what happens if apply crashes during the DROP
>>>>>>>> SUBSCRIPTION/COMMIT and is not started because the delete from catalog
>>>>>>>> is now visible so the subscription is no longer there?
>>>>>>>
>>>>>>> Another idea is to treat DROP SUBSCRIPTION in the same way as VACUUM, 
>>>>>>> i.e.,
>>>>>>> make it emit an error if it's executed within user's transaction block.
>>>>>>
>>>>>> It seems to me that this is exactly Petr's point: using
>>>>>> PreventTransactionChain() to prevent things to happen.
>>>>>
>>>>> Agreed. It's better to prevent to be executed inside user transaction
>>>>> block. And I understood there is too many failure scenarios we need to
>>>>> handle.
>>>>>
>>>>>>> Also DROP SUBSCRIPTION should call CommitTransactionCommand() just
>>>>>>> after removing the entry from pg_subscription, then connect to the 
>>>>>>> publisher
>>>>>>> and remove the replication slot.
>>>>>>
>>>>>> For consistency that may be important.
>>>>>
>>>>> Agreed.
>>>>>
>>>>> Attached patch, please give me feedback.
>>>>>
>>>>
>>>> This looks good (and similar to what initial patch had btw). Works fine
>>>> for me as well.
>>>>
>>>> Remaining issue is, what to do about CREATE SUBSCRIPTION then, there are
>>>> similar failure scenarios there, should we prevent it from running
>>>> inside transaction as well?
>>>>
>>>
>>> Hmm,  after thought I suspect current discussing approach. For
>>> example, please image the case where CRAETE SUBSCRIPTION creates
>>> subscription successfully but fails to create replication slot for
>>> whatever reason, and then DROP SUBSCRIPTION drops the subscription but
>>> dropping replication slot is failed. In such case, CREAET SUBSCRIPTION
>>> and DROP SUBSCRIPTION return ERROR but the subscription is created and
>>> dropped successfully. I think that this behaviour confuse the user.
>>>
>>> I think we should just prevent calling DROP SUBSCRIPTION in user's
>>> transaction block. Or I guess that it could be better to separate the
>>> starting/stopping logical replication from subscription management.
>>>
>>
>> We need to stop the replication worker(s) in order to be able to drop
>> the slot. There is no such issue with startup of the worker as that one
>> is launched by launcher after the transaction has committed.
>>
>> IMO best option is to just don't allow DROP/CREATE SUBSCRIPTION inside a
>> transaction block and don't do any commits inside of those (so that
>> there are no rollbacks, which solves your initial issue I believe). That
>> way failure to create/drop slot will result in subscription not being
>> created/dropped which is what we want.

On second thought, +1.

> I basically agree this option, but why do we need to change CREATE
> SUBSCRIPTION as well?

Because the window between the creation of replication slot and the transaction
commit of CREATE SUBSCRIPTION should be short. Otherwise, if any error happens
during that window, the replication slot unexpectedly remains while there is no
corresponding subscription. Of course, even If we prevent CREATE SUBSCRIPTION
from being executed within user's transaction block, there is still such
window. But we can reduce the possibility of that problem.

Regards,

-- 
Fujii Masao


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] DROP SUBSCRIPTION and ROLLBACK

2017-02-07 Thread Fujii Masao

On Wed, Feb 8, 2017 at 12:26 AM, Petr Jelinek
<petr.jeli...@2ndquadrant.com> wrote:
> On 07/02/17 13:10, Masahiko Sawada wrote:
>> Hi all,
>>
>> While testing logical replciation I found that if the transaction
>> issued DROP SUBSCRIPTION　rollbacks then the logical repliation stops
>> and the subscription can never be removed later. The document says
>> that the replication worker associated with the subscription will not
>> stop until after the transaction that issued this command has
>> committed but it doesn't work.
>
> Ah then the docs are wrong and should be fixed. Maybe we should not
> allow DROP SUBSCRIPTION inside transaction similarly to CREATE INDEX
> CONCURRENTLY.
>
>> The cause of this is that DropSubscription stops the apply worker and
>> drops corresponding replication slot on publisher side without waiting
>> for commit or rollback. The launcher process launches the apply worker
>> again but the launched worker will fail to start logical replication
>> because corresponding replication slot is already removed. And the
>> orphan subscription can not be removed later.
>>
>> I think the logical replication should not stop and the corresponding
>> replication slot and replication origin should not be removed until
>> the transaction commits.
>>
>> The solution for this I came up with is that the launcher process
>> stops the apply worker after DROP SUBSCRIPTION is committed rather
>> than DropSubscription does. And the apply worker drops replication
>> slot and replication origin before exits. Attached draft patch fixes
>> this issue.
>>
>
> I don't think we can allow the slot drop to be postponed. There is too
> many failure scenarios where we would leave the remote slot in the
> database and that's not acceptable IMHO.
>
> For example what happens if apply crashes during the DROP
> SUBSCRIPTION/COMMIT and is not started because the delete from catalog
> is now visible so the subscription is no longer there?

Another idea is to treat DROP SUBSCRIPTION in the same way as VACUUM, i.e.,
make it emit an error if it's executed within user's transaction block.
Also DROP SUBSCRIPTION should call CommitTransactionCommand() just
after removing the entry from pg_subscription, then connect to the publisher
and remove the replication slot.

Regards,

-- 
Fujii Masao


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Cannot shutdown subscriber after DROP SUBSCRIPTION

2017-02-07 Thread Fujii Masao

On Tue, Feb 7, 2017 at 2:13 AM, Petr Jelinek
<petr.jeli...@2ndquadrant.com> wrote:
> On 06/02/17 17:33, Fujii Masao wrote:
>> On Sun, Feb 5, 2017 at 5:11 AM, Petr Jelinek
>> <petr.jeli...@2ndquadrant.com> wrote:
>>> On 03/02/17 19:38, Fujii Masao wrote:
>>>> On Sat, Feb 4, 2017 at 12:49 AM, Fujii Masao <masao.fu...@gmail.com> wrote:
>>>>> On Fri, Feb 3, 2017 at 10:56 AM, Kyotaro HORIGUCHI
>>>>> <horiguchi.kyot...@lab.ntt.co.jp> wrote:
>>>>>> At Fri, 3 Feb 2017 01:02:47 +0900, Fujii Masao <masao.fu...@gmail.com> 
>>>>>> wrote in 
>>>>>>

Re: [HACKERS] DROP SUBSCRIPTION and ROLLBACK

2017-02-07 Thread Fujii Masao

On Tue, Feb 7, 2017 at 9:10 PM, Masahiko Sawada <sawada.m...@gmail.com> wrote:
> Hi all,
>
> While testing logical replciation I found that if the transaction
> issued DROP SUBSCRIPTION　rollbacks then the logical repliation stops
> and the subscription can never be removed later. The document says
> that the replication worker associated with the subscription will not
> stop until after the transaction that issued this command has
> committed but it doesn't work.

Yeah, this is a bug.

ISTM that CREATE SUBSCRIPTION also has the similar issue. It creates
the replication slot on the publisher side before the transaction has been
committed. Even if the transaction is rollbacked, that replication slot is
not removed. That is, in a transaction block, we should not connect to
the publisher. Instead, the launcher or worker should do.

> The cause of this is that DropSubscription stops the apply worker and
> drops corresponding replication slot on publisher side without waiting
> for commit or rollback. The launcher process launches the apply worker
> again but the launched worker will fail to start logical replication
> because corresponding replication slot is already removed. And the
> orphan subscription can not be removed later.
>
> I think the logical replication should not stop and the corresponding
> replication slot and replication origin should not be removed until
> the transaction commits.

Yes.

> The solution for this I came up with is that the launcher process
> stops the apply worker after DROP SUBSCRIPTION is committed rather
> than DropSubscription does. And the apply worker drops replication
> slot and replication origin before exits. Attached draft patch fixes
> this issue.
>
> Please give me feedback.

The patch failed to apply to HEAD.

+ worker = logicalrep_worker_find(subid);
+ if (worker)
  {
- heap_close(rel, NoLock);
- return;
+ if (stmt->drop_slot)
+ worker->drop_slot = true;
+ worker->need_to_stop = true;

"drop_slot" and "need_to_stop" seem to be set to true even if the transaction
is rollbacked. This would cause the same problem that you're trying to fix.

I think that we should make the launcher periodically checks pg_subscription
and stop the worker if there is no its corresponding subscription. Then,
if necessary, the worker should remove its replication slot from the publisher.

Regards,

-- 
Fujii Masao

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Cannot shutdown subscriber after DROP SUBSCRIPTION

2017-02-06 Thread Fujii Masao

On Sun, Feb 5, 2017 at 5:11 AM, Petr Jelinek
<petr.jeli...@2ndquadrant.com> wrote:
> On 03/02/17 19:38, Fujii Masao wrote:
>> On Sat, Feb 4, 2017 at 12:49 AM, Fujii Masao <masao.fu...@gmail.com> wrote:
>>> On Fri, Feb 3, 2017 at 10:56 AM, Kyotaro HORIGUCHI
>>> <horiguchi.kyot...@lab.ntt.co.jp> wrote:
>>>> At Fri, 3 Feb 2017 01:02:47 +0900, Fujii Masao <masao.fu...@gmail.com> 
>>>> wrote in 
>>>>

Re: [HACKERS] [RFC] Should "SHOW huge_pages" display the effective value "off" when the huge page is unavailable?

2017-02-06 Thread Fujii Masao

On Tue, Feb 7, 2017 at 12:36 AM, Tom Lane <t...@sss.pgh.pa.us> wrote:
> Fujii Masao <masao.fu...@gmail.com> writes:
>> On Mon, Feb 6, 2017 at 4:01 PM, Tsunakawa, Takayuki
>> <tsunakawa.ta...@jp.fujitsu.com> wrote:
>>> I don't have a strong opinion on that, but I think a bit that it would be 
>>> better to reflect the effective setting, i.e. SHOW displays huge_pages as 
>>> off, not try.
>
>> Not sure if this is best way to do that, but I agree that it's helpful if
>> we can see whether the server actually uses huge page or not in
>> huge_page=try case.
>
> If the proposal is to actually change the stored value of huge_pages,
> I would say "absolutely not".  Suppose that you change "try" to "on",
> and there's a backend crash and restart so that the postmaster needs
> to reallocate shared memory, and this time it's unable to obtain
> huge pages for some reason.  Taking the database down would be entirely
> the wrong thing.  Also, how would you handle postgresql.conf reload
> situations?
>
> If the proposal is to have SHOW report something other than the setting
> of the variable, that's not a great plan either.  It's generally important
> that the output of SHOW be something that's acceptable to SET, as not
> having that equivalence will break assorted client-side code.

I was thinking that Tunakawa-san's proposal is this, i.e., use GUC show-hook
to show "off" if the server fails to use huge-page and "on" otherwise.

> I think this desire would be better addressed by some kind of specialized
> inquiry function, which would also be able to return more information than
> just a naked "on/off" bit.  People might for instance wish to know what
> hugepage size is in use.

+1

Regards,

-- 
Fujii Masao


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Provide list of subscriptions and publications in psql's completion

2017-02-06 Thread Fujii Masao

On Mon, Feb 6, 2017 at 1:52 PM, Michael Paquier
<michael.paqu...@gmail.com> wrote:
> On Sun, Feb 5, 2017 at 5:04 AM, Fujii Masao <masao.fu...@gmail.com> wrote:
>> With this patch, when normal users type TAB after SUBSCRIPTION,
>> they got "permission denied" error. I don't think that's acceptable.
>
> Right, I can see that now. pg_stat_get_subscription() does not offer
> as well a way to filter by databases... Perhaps we could add it? it is
> stored as LogicalRepWorker->dbid so making it visible is sort of easy.

Yes, that's an option. And, if we add dbid to pg_stat_subscription,
I'm tempted to add all the pg_subscription's columns except subconninfo
into pg_stat_subscription. Since subconninfo may contain security-sensitive
information, it should be excluded. But it seems useful to expose other
columns. Then, if we expose all the columns except subconninfo, maybe
it's better to just revoke subconninfo column on pg_subscription instead of
all columns. Thought?

BTW, I found that psql's \dRs command has the same problem;
the permission denied error occurs when normal user runs \dRs.
This issue should be fixed in the same way as tab-completion one.

Also I found that tab-completion for psql's meta-commands doesn't
show \dRp and \dRs.

>> In "CREATE SUBSCRIPTION ... PUBLICATION" and "ALTER SUBSCRIPTION ... SET
>> PUBLICATION" cases, the publication defined in the publisher side should be
>> specified. But, with this patch, the tab-completion for PUBLICATION shows
>> the publications defined in local server (ie, subscriber side in those 
>> cases).
>> This might be confusing.
>
> Which would mean that psql tab completion should try to connect the
> remote server using the connection string defined with the
> subscription to get this information, which looks unsafe to me. To be
> honest, I'd rather see a list of local objects defined than nothing..

IMO showing nothing is better. But many people think that even local
objects should be showed in that case, I can live with that.

Regards,

-- 
Fujii Masao

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] [RFC] Should "SHOW huge_pages" display the effective value "off" when the huge page is unavailable?

2017-02-06 Thread Fujii Masao

On Mon, Feb 6, 2017 at 4:01 PM, Tsunakawa, Takayuki
<tsunakawa.ta...@jp.fujitsu.com> wrote:
> Hello, all
>
> Could you give me your opinions on whether the SHOW command should display 
> the effective value or the specified value for huge_pages?  During the review 
> of "Supporting huge_pages on Windows", which is now shifted to CommitFest 
> 2017-3, Magnus gave me a comment that the huge_page variable should retain 
> the value "try" when the huge page is not available on the machine and the 
> server falls back to huge_page=off.  The Linux version does so.
>
> I don't have a strong opinion on that, but I think a bit that it would be 
> better to reflect the effective setting, i.e. SHOW displays huge_pages as 
> off, not try.

Not sure if this is best way to do that, but I agree that it's helpful if
we can see whether the server actually uses huge page or not in
huge_page=try case.

Regards,

-- 
Fujii Masao


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] [PATCH] Rename pg_switch_xlog to pg_switch_wal

2017-02-06 Thread Fujii Masao

On Mon, Feb 6, 2017 at 1:24 PM, Michael Paquier
<michael.paqu...@gmail.com> wrote:
> On Sat, Feb 4, 2017 at 6:39 AM, Robert Haas <robertmh...@gmail.com> wrote:
>> On Fri, Feb 3, 2017 at 5:21 AM, Stephen Frost <sfr...@snowman.net> wrote:
>>> Daniel,
>>>
>>> * Daniel Verite (dan...@manitou-mail.org) wrote:
>>>> What if we look at the change from the pessimistic angle?
>>>> An example of confusion that the change would create:
>>>> a lot of users currently choose pg_wal for the destination
>>>> directory of their archive command. Less-informed users
>>>> that set up archiving and/or log shipping in PG10 based on
>>>> advice online from previous versions will be fairly
>>>> confused about the missing pg_xlog, and the fact that the
>>>> pg_wal directory they're supposed to create already exists.
>>>
>>> One would hope that they would realize that's not going to work
>>> when they set up PG10.  If they aren't paying attention sufficient
>>> to realize that then it seems entirely likely that they would feel
>>> equally safe removing the contents of a directory named 'pg_xlog'.
>>
>> So... somebody want to tally up the votes here?
>
> Here is what I have, 6 votes clearly stated:
> 1. Rename nothing: Daniel,
> 2. Rename directory only: Andres
> 3. Rename everything: Stephen, Vladimir, David S, Michael P (with
> aliases for functions, I could live without at this point...)

I vote for 1.
I still wonder how much the renaming of pg_xlog actually helps very careless
people who remove pg_xlog becase its name includes "log". I'm afraid that
they would make another serious mistake (e.g., remove pg_wal because it has
many files and it occupies large amount of disk space) even after renaming
to pg_wal. The crazy idea, making initdb create the empty file with the name
"DONT_REMOVE_pg_xlog_IF_YOU_DONT_WANT_TO_LOSE_YOUR_IMPORTANT_DATA"
in $PGDATA seems more helpful. Anyway I'm afraid that the renaming would
cause more pain than gain.

Regards,

-- 
Fujii Masao


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Provide list of subscriptions and publications in psql's completion

2017-02-04 Thread Fujii Masao

On Sat, Feb 4, 2017 at 9:01 PM, Michael Paquier
<michael.paqu...@gmail.com> wrote:
> On Fri, Feb 3, 2017 at 3:56 AM, Peter Eisentraut
> <peter.eisentr...@2ndquadrant.com> wrote:
>> On 2/2/17 12:48 AM, Michael Paquier wrote:
>>> +#define Query_for_list_of_subscriptions \
>>> +" SELECT pg_catalog.quote_ident(subname) "\
>>> +"   FROM pg_catalog.pg_subscription "\
>>> +"  WHERE substring(pg_catalog.quote_ident(subname),1,%d)='%s'"
>>
>> This query should also be qualified by current database.
>
> Indeed. Here is an updated patch.

With this patch, when normal users type TAB after SUBSCRIPTION,
they got "permission denied" error. I don't think that's acceptable.

In "CREATE SUBSCRIPTION ... PUBLICATION" and "ALTER SUBSCRIPTION ... SET
PUBLICATION" cases, the publication defined in the publisher side should be
specified. But, with this patch, the tab-completion for PUBLICATION shows
the publications defined in local server (ie, subscriber side in those cases).
This might be confusing.

Regards,

-- 
Fujii Masao

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Provide list of subscriptions and publications in psql's completion

2017-02-04 Thread Fujii Masao

On Fri, Feb 3, 2017 at 3:55 AM, Peter Eisentraut
<peter.eisentr...@2ndquadrant.com> wrote:
> On 2/2/17 12:48 PM, Fujii Masao wrote:
>> +#define Query_for_list_of_subscriptions \
>> +" SELECT pg_catalog.quote_ident(subname) "\
>> +"   FROM pg_catalog.pg_subscription "\
>> +"  WHERE substring(pg_catalog.quote_ident(subname),1,%d)='%s'"
>>
>> Since non-superuser is not allowed to access to pg_subscription,
>> pg_stat_subscription should be accessed here, instead. Thought?
>
> Arguably, you could leave it like that, assuming it fails cleanly for
> nonsuperusers.  Nonsuperusers are not going to be able to run any
> commands on subscriptions anyway, so there is little use for it.

No. You can get rid of superuser privilege from the owner of the subscription
and that nonsuperuser owner can run some commands on the subscriptin.
It's a bit artificial, but you can. I'm not sure if we should add the check to
prevent the owner from becoming nonsuperuser. But if the owner always must
have a superuser privilege per the specification of logical replication,
I think that such check would be necessary.

Also I prefer to tab-complete the subscriptions even for nonsuperusers.
There are some objects that only superuser or owner can manage, but their
names are currently tab-completed even when current user is "normal" one.
So I'm afraid that handling only subscriptions differently might be more
confusing.

Regards,

-- 
Fujii Masao

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Cannot shutdown subscriber after DROP SUBSCRIPTION

2017-02-03 Thread Fujii Masao

On Sat, Feb 4, 2017 at 12:49 AM, Fujii Masao <masao.fu...@gmail.com> wrote:
> On Fri, Feb 3, 2017 at 10:56 AM, Kyotaro HORIGUCHI
> <horiguchi.kyot...@lab.ntt.co.jp> wrote:
>> At Fri, 3 Feb 2017 01:02:47 +0900, Fujii Masao <masao.fu...@gmail.com> wrote 
>> in

Re: [HACKERS] Cannot shutdown subscriber after DROP SUBSCRIPTION

2017-02-03 Thread Fujii Masao

On Fri, Feb 3, 2017 at 10:56 AM, Kyotaro HORIGUCHI
<horiguchi.kyot...@lab.ntt.co.jp> wrote:
> At Fri, 3 Feb 2017 01:02:47 +0900, Fujii Masao <masao.fu...@gmail.com> wrote 
> in

Re: [HACKERS] Provide list of subscriptions and publications in psql's completion

2017-02-02 Thread Fujii Masao

On Thu, Feb 2, 2017 at 2:48 PM, Michael Paquier
<michael.paqu...@gmail.com> wrote:
> Hi all,
>
> While testing a bit the logical replication facility, I have bumped on
> the fast that psql's completion does not show the list of things
> already created. Attached is a patch.

+#define Query_for_list_of_subscriptions \
+" SELECT pg_catalog.quote_ident(subname) "\
+"   FROM pg_catalog.pg_subscription "\
+"  WHERE substring(pg_catalog.quote_ident(subname),1,%d)='%s'"

Since non-superuser is not allowed to access to pg_subscription,
pg_stat_subscription should be accessed here, instead. Thought?

Regards,

-- 
Fujii Masao


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Cannot shutdown subscriber after DROP SUBSCRIPTION

2017-02-02 Thread Fujii Masao

On Thu, Feb 2, 2017 at 2:36 PM, Michael Paquier
<michael.paqu...@gmail.com> wrote:
> On Thu, Feb 2, 2017 at 2:13 PM, Tom Lane <t...@sss.pgh.pa.us> wrote:
>> Kyotaro HORIGUCHI <horiguchi.kyot...@lab.ntt.co.jp> writes:
>>> Then, the reason for the TRY-CATCH cluase is that I found that
>>> some functions called from there can throw exceptions.
>>
>> Yes, but all LWLocks should be released by normal error recovery.
>> It should not be necessary for this code to clean that up by hand.
>> If it were necessary, there would be TRY-CATCH around every single
>> LWLockAcquire in the backend, and we'd have an unreadable and
>> unmaintainable system.  Please don't add a TRY-CATCH unless it's
>> *necessary* -- and you haven't explained why this one is.

Yes.

> Putting hands into the code and at the problem, I can see that
> dropping a subscription on a node makes it unresponsive in case of a
> stop. And that's just because calls to LWLockRelease are missing as in
> the patch attached. A try/catch problem should not be necessary.

Thanks for the patch!

With the patch, LogicalRepLauncherLock is released at the end of
DropSubscription(). But ISTM that the lock should be released just after
logicalrep_worker_stop() and there is no need to protect the removal of
replication slot with the lock.

/*
* If we found worker but it does not have proc set it is starting up,
* wait for it to finish and then kill it.
*/
while (worker && !worker->proc)
{

ISTM that the above loop in logicalrep_worker_stop() is not necessary
because LogicalRepLauncherLock ensures that the above condition is
always false. Thought? Am I missing something?

If the above condition is true, which means that there is the worker slot
having the "subid" of the worker to kill, but its "proc" has not been set yet.
Without LogicalRepLauncherLock, this situation can happen after "subid"
is set by the launcher and before "proc" is set by the worker. But
LogicalRepLauncherLock protects those operations, so logicalrep_worker_stop()
called while holding the lock should always think the above condition is false.

Regards,

-- 
Fujii Masao

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Cannot shutdown subscriber after DROP SUBSCRIPTION

2017-02-01 Thread Fujii Masao

On Wed, Feb 1, 2017 at 5:36 PM, Kyotaro HORIGUCHI
<horiguchi.kyot...@lab.ntt.co.jp> wrote:
> Hello, while looking another bug, I found that standby cannot
> shutdown after DROP SUBSCRIPTION.
>
> standby=# CREATE SUBSCRPTION sub1 ...
> standby=# 
> standby=# DROP SUBSCRIPTION sub1;
>
> Ctrl-C to the standby fails to work. ApplyLauncherMain is waiting
> LogicalRepLauncherLock forever.
>
> The culprit is DropSbuscription. It acquires
> LogicalRepLauncherLock but never releases.
>
> The attached patch fixes it. Most part of the fucntion is now
> enclosed by PG_TRY-CATCH since some functions can throw
> exceptions.

The lwlock would be released when an exception occurs, so I don't think
that TRY-CATCH is necessary here. Or it's necessary for another reason?

Regards,

-- 
Fujii Masao


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] [BUGS] Bug in Physical Replication Slots (at least 9.5)?

2017-02-01 Thread Fujii Masao

On Thu, Jan 19, 2017 at 6:37 PM, Kyotaro HORIGUCHI
<horiguchi.kyot...@lab.ntt.co.jp> wrote:
> Hello,
>
> At Wed, 18 Jan 2017 12:34:51 +0900, Michael Paquier 
> <michael.paqu...@gmail.com> wrote in 
> <cab7npqqytf2gie7fd-4ojjppvwikjrdqpc24hlngthx01sb...@mail.gmail.com>
>> On Tue, Jan 17, 2017 at 7:36 PM, Kyotaro HORIGUCHI
>> <horiguchi.kyot...@lab.ntt.co.jp> wrote:
>> > I managed to reproduce this. A little tweak as the first patch
>> > lets the standby to suicide as soon as walreceiver sees a
>> > contrecord at the beginning of a segment.
>>
>> Good idea.
>
> Thanks. Fortunately(?), the problematic situation seems to happen
> at almost all segment boundary.
>
>> > I believe that a continuation record cannot be span over three or
>> > more *segments* (is it right?), so keeping one spare segment
>> > would be enough. The attached second patch does this.
>>
>> I have to admit that I did not think about this problem much yet (I
>> bookmarked this report weeks ago to be honest as something to look
>> at), but that does not look right to me. Couldn't a record be spawned
>> across even more segments? Take a random string longer than 64MB or
>> event longer for example.
>
> Though I haven't look closer to how a modification is splitted
> into WAL records. A tuple cannot be so long. As a simple test, I
> observed rechder->xl_tot_len at the end of XLogRecordAssemble
> inserting an about 400KB not-so-compressable string into a text
> column, but I saw a series of many records with shorter than
> several thousand bytes.
>
>> > Other possible measures might be,
>> >
>> > - Allowing switching wal source while reading a continuation
>> >   record. Currently ReadRecord assumes that a continuation record
>> >   can be read from single source. But this needs refactoring
>> >   involving xlog.c, xlogreader.c and relatives.
>>
>> This is scary thinking about back-branches.
>
> Yes. It would be no longer a bug fix. (Or becomes a quite ugly hack..)
>
>> > - Delaying recycling a segment until the last partial record on it
>> >   completes. This seems doable in page-wise (coarse resolution)
>> >   but would cost additional reading of past xlog files (page
>> >   header of past pages is required).
>>
>> Hm, yes. That looks like the least invasive way to go. At least that
>> looks more correct than the others.
>
> The attached patch does that. Usually it reads page headers only
> on segment boundaries, but once continuation record found (or
> failed to read the next page header, that is, the first record on
> the first page in the next segment has not been replicated), it
> becomes to happen on every page boundary until non-continuation
> page comes.

I'm afraid that many WAL segments would start with a continuation record
when there are the workload of short transactions (e.g., by pgbench), and
which would make restart_lsn go behind very much. No?

The discussion on this thread just makes me think that restart_lsn should
indicate the replay location instead of flush location. This seems safer.

Regards,

-- 
Fujii Masao


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Re: [bug fix] Cascading standby cannot catch up and get stuck emitting the same message repeatedly

2017-01-24 Thread Fujii Masao

On Thu, Nov 24, 2016 at 4:24 PM, Amit Kapila <amit.kapil...@gmail.com> wrote:
> On Thu, Nov 24, 2016 at 10:29 AM, Tsunakawa, Takayuki
> <tsunakawa.ta...@jp.fujitsu.com> wrote:
>> From: pgsql-hackers-ow...@postgresql.org
>>> [mailto:pgsql-hackers-ow...@postgresql.org] On Behalf Of Amit Kapila
>>> Thanks for the clarification, I could reproduce the issue and confirms that
>>> patch has fixed it.  Find logs of cascading standby at  PG9.2 Head and Patch
>>> attached (I have truncated few lines at end of server log generated in Head
>>> as those were repetitive).  I think the way you have directly explained
>>> the bug steps in code comments is not right (think if we start writing bug
>>> steps for each bug fix, how the code will look like).  So I have modified
>>> the comment to explain the situation and reason of check,  see if you find
>>> that as okay?
>>
>> Thank you, I'm happy with your comment.
>>
>
> Okay, I have marked the patch as 'Ready for Committer'.

Pushed. Thanks!

Regards,

-- 
Fujii Masao


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] src/test/README for logical replication

2017-01-23 Thread Fujii Masao

On Tue, Jan 24, 2017 at 7:17 AM, Craig Ringer <cr...@2ndquadrant.com> wrote:
> src/test/README wasn't updated for the new src/test/subscription entry.
>
> Patch attached.

Thanks! Pushed.

Regards,

-- 
Fujii Masao


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Logical Replication WIP

2017-01-23 Thread Fujii Masao

On Sat, Jan 21, 2017 at 1:39 AM, Petr Jelinek
<petr.jeli...@2ndquadrant.com> wrote:
> On 20/01/17 17:33, Jaime Casanova wrote:
>> On 20 January 2017 at 11:25, Petr Jelinek <petr.jeli...@2ndquadrant.com> 
>> wrote:
>>> On 20/01/17 17:05, Fujii Masao wrote:
>>>> On Fri, Jan 20, 2017 at 11:08 PM, Peter Eisentraut
>>>> <peter.eisentr...@2ndquadrant.com> wrote:
>>>>> On 1/19/17 5:01 PM, Petr Jelinek wrote:
>>>>>> There were some conflicting changes committed today so I rebased the
>>>>>> patch on top of them.
>>>>>>
>>>>>> Other than that nothing much has changed, I removed the separate sync
>>>>>> commit patch, included the rename patch in the patchset and fixed the
>>>>>> bug around pg_subscription catalog reported by Erik Rijkers.
>>>>>
>>>>> Committed.
>>>>
>>>> Sorry I've not followed the discussion about logical replication at all, 
>>>> but
>>>> why does logical replication launcher need to start up by default?
>>>>
>>>
>>> Because running subscriptions is allowed by default. You'd need to set
>>> max_logical_replication_workers to 0 to disable that.
>>>
>>
>> surely wal_level < logical shouldn't start a logical replication
>> launcher, and after an initdb wal_level is only replica
>>
>
> Launcher is needed for subscriptions, subscriptions don't depend on
> wal_level.

But why did you enable only subscription by default while publication is
disabled by default (i.e., wal_level != logical)? I think that it's better to
enable both by default OR disable both by default.

While I was reading the logical rep code, I found that
logicalrep_worker_launch returns *without* releasing LogicalRepWorkerLock
when there is no unused worker slot. This seems a bug.

/* Report this after the initial starting message for consistency. */
if (max_replication_slots == 0)
ereport(ERROR,
(errcode(ERRCODE_CONFIGURATION_LIMIT_EXCEEDED),
errmsg("cannot start logical replication workers when
max_replication_slots = 0")));

logicalrep_worker_launch checks max_replication_slots as above.
Why does it need to check that setting value in the *subscriber* side?
Maybe I'm missing something here, but ISTM that the subscription uses
one replication slot in *publisher* side but doesn't use in *subscriber* side.

*  The apply worker may spawn additional workers (sync) for initial data
*  synchronization of tables.

The above header comment in logical/worker.c is true?

The copyright in each file that the commit of logical rep added needs to
be updated.

Regards,

-- 
Fujii Masao

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Checksums by default?

2017-01-21 Thread Fujii Masao

On Sun, Jan 22, 2017 at 12:18 AM, Petr Jelinek
<petr.jeli...@2ndquadrant.com> wrote:
> On 21/01/17 11:39, Magnus Hagander wrote:
>> Is it time to enable checksums by default, and give initdb a switch to
>> turn it off instead?
>
> I'd like to see benchmark first, both in terms of CPU and in terms of
> produced WAL (=network traffic) given that it turns on logging of hint bits.

+1

If the performance overhead by the checksums is really negligible,
we may be able to get rid of wal_log_hints parameter, as well.

Regards,

-- 
Fujii Masao


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Fix documentation typo

2017-01-20 Thread Fujii Masao

On Fri, Jan 20, 2017 at 8:02 PM, Ishii Ayumi <ishii.ayumi...@gmail.com> wrote:
> Hi,
>
>> Here is a patch that adds temporary column's description to
>> pg_replication_slots document.
>
> Attached patch is updated version  to add detailed description.
>
> Regards,
> --
> Ayumi Ishii
>
> 2017-01-19 15:56 GMT+09:00 Ishii Ayumi <ishii.ayumi...@gmail.com>:
>> Hi,
>>
>> Here is a patch that adds temporary column's description to
>> pg_replication_slots document.

This is the oversight of commit a924c32. Thanks for the patch! Pushed.

Regards,

-- 
Fujii Masao


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Logical Replication WIP

2017-01-20 Thread Fujii Masao

On Fri, Jan 20, 2017 at 11:08 PM, Peter Eisentraut
<peter.eisentr...@2ndquadrant.com> wrote:
> On 1/19/17 5:01 PM, Petr Jelinek wrote:
>> There were some conflicting changes committed today so I rebased the
>> patch on top of them.
>>
>> Other than that nothing much has changed, I removed the separate sync
>> commit patch, included the rename patch in the patchset and fixed the
>> bug around pg_subscription catalog reported by Erik Rijkers.
>
> Committed.

Sorry I've not followed the discussion about logical replication at all, but
why does logical replication launcher need to start up by default?

$ initdb -D data
$ pg_ctl -D data start

When I ran the above commands, I got the following message and
found that the bgworker for logical replicatdion launcher was running.

LOG:  logical replication launcher started

Regards,

-- 
Fujii Masao


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Re: [COMMITTERS] pgsql: Fix an assertion failure related to an exclusive backup.

2017-01-17 Thread Fujii Masao

On Tue, Jan 17, 2017 at 10:37 PM, Michael Paquier
<michael.paqu...@gmail.com> wrote:
> On Tue, Jan 17, 2017 at 5:40 PM, Fujii Masao <fu...@postgresql.org> wrote:
>> Fix an assertion failure related to an exclusive backup.
>>
>> Previously multiple sessions could execute pg_start_backup() and
>> pg_stop_backup() to start and stop an exclusive backup at the same time.
>> This could trigger the assertion failure of
>> "FailedAssertion("!(XLogCtl->Insert.exclusiveBackup)".
>> This happend because, even while pg_start_backup() was starting
>> an exclusive backup, other session could run pg_stop_backup()
>> concurrently and mark the backup as not-in-progress unconditionally.
>>
>> This patch introduces ExclusiveBackupState indicating the state of
>> an exclusive backup. This state is used to ensure that there is only
>> one session running pg_start_backup() or pg_stop_backup() at
>> the same time, to avoid the assertion failure.
>
> Please note that this commit message is not completely exact. This fix
> does not only avoid triggerring this assertion failure, it also makes
> sure that no manual on-disk intervention is needed by the user to
> remove a backup_label file after a failure of pg_stop_backup(). Before
> this patch, what happened is that the exclusive backup counter in
> XLogCtl got decremented before removing backup_label. However, after
> the counter was decremented, if an error occurred, the shared memory
> counter would have been at 0 with a backup_label file on disk.
> Subsequent attempts to start pg_start_backup() would have failed, and
> putting the system backup into a consistent state would have required
> an operator to remove by hand the backup_label file. The heart of the
> logic here is in the callback of pg_stop_backup() when an error
> happens during the deletion of the backup_label file.

With the patch, what happens if pg_stop_backup exits with an error
after removing backup_label file before resetting the backup state
to none?

Regards,

-- 
Fujii Masao


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Measuring replay lag

2017-01-16 Thread Fujii Masao

On Thu, Dec 22, 2016 at 6:14 AM, Thomas Munro
<thomas.mu...@enterprisedb.com> wrote:
> On Thu, Dec 22, 2016 at 2:14 AM, Fujii Masao <masao.fu...@gmail.com> wrote:
>> I agree that the capability to measure the remote_apply lag is very useful.
>> Also I want to measure the remote_write and remote_flush lags, for example,
>> in order to diagnose the cause of replication lag.
>
> Good idea.  I will think about how to make that work.  There was a
> proposal to make writing and flushing independent[1].  I'd like that
> to go in.  Then the write_lag and flush_lag could diverge
> significantly, and it would be nice to be able to see that effect as
> time (though you could already see it with LSN positions).
>
>> For that, what about maintaining the pairs of send-timestamp and LSN in
>> *sender side* instead of receiver side? That is, walsender adds the pairs
>> of send-timestamp and LSN into the buffer every sampling period.
>> Whenever walsender receives the write, flush and apply locations from
>> walreceiver, it calculates the write, flush and apply lags by comparing
>> the received and stored LSN and comparing the current timestamp and
>> stored send-timestamp.
>
> I thought about that too, but I couldn't figure out how to make the
> sampling work.  If the primary is choosing (LSN, time) pairs to store
> in a buffer, and the standby is sending replies at times of its
> choosing (when wal_receiver_status_interval has been exceeded), then
> you can't accurately measure anything.

Yeah, even though the primary stores (100, 2017-01-17 00:00:00) as the pair of
(LSN, timestamp), for example, the standby may not send back the reply for
LSN 100 itself. The primary may receive the reply for larger LSN like 200,
instead. So the measurement of the lag in the primary side would not be so
accurate.

But we can calculate the "sync rep" lag by comparing the stored timestamp of
LSN 100 and the timestamp when the reply for LSN 200 is received. In sync rep,
since the transaction waiting for LSN 100 to be replicated is actually released
after the reply for LSN 200 is received, the above calculated lag is basically
accurate as sync rep lag.

Therefore I'm still thinking that it's better to maintain the pairs of LSN
and timestamp in the *primary* side. Thought?

Regards,

-- 
Fujii Masao

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] [PATCH] Rename pg_switch_xlog to pg_switch_wal

2017-01-16 Thread Fujii Masao

On Sat, Jan 14, 2017 at 3:10 AM, Vladimir Rusinov <vrusi...@google.com> wrote:
> Attached are two new version of the patch: one keeps aliases, one don't.
> Also, remove stray reference to xlog function in one of the tests.
>
> I've lost vote count. Should we create a form to calculate which one of the
> patches should be commited?

If we do that, we should vote on all the "renaming" stuff, i.e., not only
function names but also program names like pg_receivexlog, directory names
like clog, option names like xlogdir option of initdb, return value names of
the functions like xlog_position in pg_create_physical_replication_slot, etc.

Regards,

-- 
Fujii Masao

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] pg_basebackups and slots

2017-01-16 Thread Fujii Masao

On Mon, Jan 16, 2017 at 10:00 PM, Magnus Hagander <mag...@hagander.net> wrote:
>
>
> On Wed, Jan 4, 2017 at 3:32 PM, Peter Eisentraut
> <peter.eisentr...@2ndquadrant.com> wrote:
>>
>> This patch looks reasonable to me.
>>
>> Attached is a top-up patch with a few small fixups.
>>
>> I suggest to wait for the resolution of the "Replication/backup
>> defaults" thread.  I would not want to be in a situation where users who
>> have not been trained to use replication slots now have yet another
>> restart-only parameter to set before they can take a backup.
>>
>
> Thanks for the review and the top-up patch. I've applied it and pushed.

- if (replication_slot && includewal != STREAM_WAL)
+ if ((replication_slot || no_slot) && includewal != STREAM_WAL)

Why do we need to check "no_slot" here? Since no_slot=true means that
no temporary replication slot is specified, it's fine even with both -X none
and fetch.

Regards,

-- 
Fujii Masao


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Typo in condition_variable.c

2017-01-16 Thread Fujii Masao

On Mon, Jan 16, 2017 at 6:34 PM, Masahiko Sawada <sawada.m...@gmail.com> wrote:
> On Mon, Jan 16, 2017 at 6:27 PM, Fujii Masao <masao.fu...@gmail.com> wrote:
>> On Mon, Jan 16, 2017 at 5:42 PM, Masahiko Sawada <sawada.m...@gmail.com> 
>> wrote:
>>> Hi,
>>>
>>> Attached patch fixes comment typos in condition_variable.c
>>
>> Seems the patch still has the typo. "initiailly" should be "initially"?
>>
>
> Yes, it should be "initially".

OK, pushed. Thanks!

Regards,

-- 
Fujii Masao


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Typo in condition_variable.c

2017-01-16 Thread Fujii Masao

On Mon, Jan 16, 2017 at 5:42 PM, Masahiko Sawada <sawada.m...@gmail.com> wrote:
> Hi,
>
> Attached patch fixes comment typos in condition_variable.c

Seems the patch still has the typo. "initiailly" should be "initially"?

Regards,

-- 
Fujii Masao


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Re: Clarifying "server starting" messaging in pg_ctl start without --wait

2017-01-15 Thread Fujii Masao

On Sat, Jan 14, 2017 at 11:19 PM, Peter Eisentraut
<peter.eisentr...@2ndquadrant.com> wrote:
> On 1/11/17 11:20 AM, Ryan Murphy wrote:
>> Thanks for the review Beena, I'm glad the patch is ready to go!
>
> committed, thanks

Sorry for speaking up late.

This change may confuse the users who run "pg_ctl start" to perform a crash
recovery, archive recovery and standby server (with hot_standby=off) because
"pg_ctl start" would not return so long time. Also during that long time,
the error message "FATAL:  the database system is starting up" keeps outputing.
This makes me think that -W is better as the default of at least "pg_ctl start".

Regards,

-- 
Fujii Masao

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Proposal for changes to recovery.conf API

2017-01-12 Thread Fujii Masao

On Thu, Jan 12, 2017 at 6:46 PM, Simon Riggs <si...@2ndquadrant.com> wrote:
> On 12 January 2017 at 05:49, Fujii Masao <masao.fu...@gmail.com> wrote:
>> On Wed, Jan 11, 2017 at 7:36 PM, Simon Riggs <si...@2ndquadrant.com> wrote:
>>> On 11 January 2017 at 09:51, Fujii Masao <masao.fu...@gmail.com> wrote:
>>>
>>>>> 5. recovery.conf parameters are all moved to postgresql.conf, with these 
>>>>> changes
>>>>
>>>> In current design of the patch, when recovery parameters are misconfigured
>>>> (e.g., set recovery_target_timeline to invalid timeline id) and
>>>> the configuration file is reloaded, the startup process emits FATAL error 
>>>> and
>>>> the server goes down. I don't think this is fine. Basically even
>>>> misconfiguration of the parameters should not cause the server crash.
>>>> If invalid settings are supplied, I think that we just should warn them
>>>> and ignore those new settings, like current other GUC is. Thought?
>>>
>>> Thanks for your comments.
>>>
>>> The specification of the recovery target parameters should be different, 
>>> IMHO.
>>>
>>> If the user is performing a recovery and the target of the recovery is
>>> incorrectly specified then it is clear that the recovery cannot
>>> continue with an imprecisely specified target.
>>
>> Could you tell me what case of "the target of the recovery is incorrectly
>> specified" are you thinking of? For example, you're thinking the case where
>> invalid format of timestamp value is specified in recovery_target_time?
>> In this case, I think that we should issue a WARNING and ignore that invalid
>> setting when the configuration file is reloaded. Of course, if it's the 
>> server
>> startup time, such invalid setting should prevent the server from starting 
>> up.
>>
>> Regarding recovery_target_timeline, isn't it better to mark that parameter
>> as PGC_POSTMASTER? I'm not sure if we really want to change the recovery
>> target timeline without server restart and also not sure if that operation is
>> safe.
>
> It's one of the main objectives of the patch to allow user to reset
> parameters online so I don't think we should set PGC_POSTMASTER.

I'm OK to mark the recovery target parameters *except* recovery_target_timeline
as PGC_SIGHUP. But, as I told upthread, I'm wondering whether there is really
the use case where the target timeline needs to be changed online. Also I'm
wondering whether that online change of target timeline is actually safe.

Regards,

-- 
Fujii Masao


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] [PATCH] Rename pg_switch_xlog to pg_switch_wal

2017-01-11 Thread Fujii Masao

On Thu, Jan 12, 2017 at 2:49 PM, Michael Paquier
<michael.paqu...@gmail.com> wrote:
> On Thu, Jan 12, 2017 at 2:00 AM, Vladimir Rusinov <vrusi...@google.com> wrote:
>> On Tue, Jan 10, 2017 at 5:24 AM, Michael Paquier <michael.paqu...@gmail.com>
>> wrote:
>>> As there are two school of thoughts on this thread, keeping your patch
>>> with the compatibility table is the best move for now. Even if we end
>>> up by having a version without aliases, that will be just code to
>>> remove in the final version.
>>
>> Indeed, it is trivial to kill aliases.
>>
>> New version of the patch attached.
>
> The patch still updates a bunch of .po files. Those are normally
> refreshed with the translation updates, so they had better be removed.
> Other than that, the patch looks in good shape to me so switch to
> "Ready for committer".
>
> So, to sum up things on this thread, here are the votes about the use
> of aliases or a pure breakage:
> - No to aliases, shake the world: Stephen, Tom, David F, Vik, Bruce M => 5
> - Yes to aliases: Michael P, Andres, Peter E., Cynthia S, Jim N,
> Vladimir, Simon R, Fujii-san => 8
> If I misunderstood things, please feel free to speak up.

To be pricise, I'd like to avoid the renaming of the functions at all rather
than using aliases.

Regards,

-- 
Fujii Masao


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Proposal for changes to recovery.conf API

2017-01-11 Thread Fujii Masao

On Wed, Jan 11, 2017 at 7:36 PM, Simon Riggs <si...@2ndquadrant.com> wrote:
> On 11 January 2017 at 09:51, Fujii Masao <masao.fu...@gmail.com> wrote:
>
>>> 5. recovery.conf parameters are all moved to postgresql.conf, with these 
>>> changes
>>
>> In current design of the patch, when recovery parameters are misconfigured
>> (e.g., set recovery_target_timeline to invalid timeline id) and
>> the configuration file is reloaded, the startup process emits FATAL error and
>> the server goes down. I don't think this is fine. Basically even
>> misconfiguration of the parameters should not cause the server crash.
>> If invalid settings are supplied, I think that we just should warn them
>> and ignore those new settings, like current other GUC is. Thought?
>
> Thanks for your comments.
>
> The specification of the recovery target parameters should be different, IMHO.
>
> If the user is performing a recovery and the target of the recovery is
> incorrectly specified then it is clear that the recovery cannot
> continue with an imprecisely specified target.

Could you tell me what case of "the target of the recovery is incorrectly
specified" are you thinking of? For example, you're thinking the case where
invalid format of timestamp value is specified in recovery_target_time?
In this case, I think that we should issue a WARNING and ignore that invalid
setting when the configuration file is reloaded. Of course, if it's the server
startup time, such invalid setting should prevent the server from starting up.

Regarding recovery_target_timeline, isn't it better to mark that parameter
as PGC_POSTMASTER? I'm not sure if we really want to change the recovery
target timeline without server restart and also not sure if that operation is
safe.

Regards,

-- 
Fujii Masao

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

1 2 3 4 5 6 7 8 9 10 >

1 - 100 of 2751 matches

Mail list logo