Re: suboverflowed subtransactions concurrency performance optimize

2021-09-02 Thread Andrey Borodin
Sorry, for some reason Mail.app converted the message to HTML and the mailing 
list mangled that HTML into a mess. I'm resending the previous message as plain 
text. Sorry for the noise.

> On 31 Aug 2021, at 11:43, Pengchengliu  wrote:
> 
> Hi Andrey,
>  Thanks a lot for your reply and reference information.
> 
>  The default NUM_SUBTRANS_BUFFERS is 32. In my implementation, 
> local_cache_subtrans_pages can be adjusted dynamically.
>  If we configure local_cache_subtrans_pages as 64, every backend uses only 
> 64*8192=512KB of extra memory. 
>  So the local cache is similar to a first-level cache, and the subtrans SLRU is 
> the second-level cache.
>  And I think the extra memory is very well worth it. It really resolves the massive 
> subtrans stuck issue which I mentioned in a previous email.
> 
>  I have viewed the patch of [0] before. Adding GUC configuration parameters 
> for the SLRU buffers is very nice.
>  I think that for subtrans, its optimization is not enough. For 
> SubTransGetTopmostTransaction, we should get the SubtransSLRULock first, then 
> call SubTransGetParent in a loop,
>  to prevent acquiring/releasing SubtransSLRULock in the 
> SubTransGetTopmostTransaction->SubTransGetParent loop.
>  After I applied this patch, in which I optimized SubTransGetTopmostTransaction, 
> I still got the stuck result with my test case.

SubTransGetParent() acquires only a shared lock on SubtransSLRULock. The problem 
can arise only when someone reads a page from disk. But if you have a big enough 
cache, this will never happen. And this cache will be much smaller than 
512KB*max_connections.

If we really want to fix the exclusive SubtransSLRULock, I think the best option 
would be to split the SLRU control lock into an array of locks - one for each 
bank (as in v17-0002-Divide-SLRU-buffers-into-n-associative-banks.patch). With 
this approach we would have to rename s/bank/partition/g for consistency with 
the lock and buffer partitions. I really liked having my own banks, but 
consistency is worth it anyway.
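
For illustration, here is a minimal sketch of what per-bank lock selection
could look like; SLRU_NBANKS, SlruBankControlLocks and SlruBankLock are
hypothetical names, not taken from the patch:

#include "storage/lwlock.h"

#define SLRU_NBANKS 8

static LWLock *SlruBankControlLocks[SLRU_NBANKS];

static inline LWLock *
SlruBankLock(int pageno)
{
	/* each SLRU page maps to exactly one bank, so backends touching
	 * pages in different banks never contend on the same control lock */
	return SlruBankControlLocks[pageno % SLRU_NBANKS];
}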

Thanks!

Best regards, Andrey Borodin.



Re: Skipping logical replication transactions on subscriber side

2021-09-02 Thread Greg Nancarrow
On Mon, Aug 30, 2021 at 5:07 PM Masahiko Sawada  wrote:
>
> I've attached rebased patches. 0004 patch is not the scope of this
> patch. It's borrowed from another thread[1] to fix the assertion
> failure for newly added tests. Please review them.
>

BTW, these patches need rebasing (broken by recent commits, patches
0001, 0003 and 0004 no longer apply, and it's failing in the cfbot).


Regards,
Greg Nancarrow
Fujitsu Australia




RE: Improve logging when using Huge Pages

2021-09-02 Thread Shinoda, Noriyoshi (PN Japan FSIP)
Fujii-san, Julien-san

Thank you very much for your comment.
I followed your comments and changed the elog function to the ereport function, 
and also changed the log level. The output message is the same as in the case of 
non-HugePages memory acquisition failure. I did not simplify the error messages, 
as that would have complicated handling the preprocessor directives.

> I agree that the message should be promoted to a higher level.  But I 
> think we should also make that information available at the SQL level, 
> as the log files may be truncated / rotated before you need the info, 
> and it can be troublesome to find the information at the OS level, if 
> you're lucky enough to have OS access.

In the attached patch, I have added an internal GUC 'using_huge_pages' to show 
whether HugePages are in use. This parameter will be true only if the instance is 
using HugePages.
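
For illustration, here is a minimal sketch of how such an internal GUC could be
kept in sync once the server knows whether huge pages were obtained; only the
GUC name comes from the patch, the helper and its call site are assumptions:

#include "postgres.h"
#include "utils/guc.h"

/* hypothetical helper, e.g. called after shared memory is mapped */
static void
report_huge_pages_status(bool in_use)
{
	/* make "SHOW using_huge_pages" reflect reality */
	SetConfigOption("using_huge_pages", in_use ? "on" : "off",
					PGC_INTERNAL, PGC_S_OVERRIDE);
}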

Regards,
Noriyoshi Shinoda

-Original Message-
From: Fujii Masao [mailto:masao.fu...@oss.nttdata.com] 
Sent: Wednesday, September 1, 2021 2:06 AM
To: Julien Rouhaud ; Shinoda, Noriyoshi (PN Japan FSIP) 

Cc: pgsql-hack...@postgresql.org
Subject: Re: Improve logging when using Huge Pages



On 2021/08/31 15:57, Julien Rouhaud wrote:
> On Tue, Aug 31, 2021 at 1:37 PM Shinoda, Noriyoshi (PN Japan FSIP) 
>  wrote:
>>
>> In the current version, when GUC huge_pages=try, which is the default 
>> setting, no log is output regardless of the success or failure of the 
>> HugePages acquisition. If you want to output logs, you need to set 
>> log_min_messages=DEBUG3, but it will output a huge amount of extra logs.
>> With huge_pages=try setting, if the kernel parameter vm.nr_hugepages is not 
>> enough, the administrator will not notice that HugePages is not being used.
>> I think it should output a log if HugePages was not available.

+1

-   elog(DEBUG1, "mmap(%zu) with MAP_HUGETLB failed, huge pages disabled: %m",
+   elog(WARNING, "mmap(%zu) with MAP_HUGETLB failed, huge pages disabled: %m",

elog() should be used only for internal errors and low-level debug logging.
So per your proposal, elog() is not suitable here. Instead, ereport() should be 
used.

The log level should be LOG rather than WARNING because this message indicates 
the information about server activity that administrators are interested in.

The message should be updated so that it follows the Error Message Style Guide.
https://www.postgresql.org/docs/devel/error-style-guide.html 

With huge_pages=on, if shared memory fails to be allocated, an error message 
is currently reported. Even with huge_pages=try, shouldn't this error message be 
used, to simplify the code as follows?

 	errno = mmap_errno;
-	ereport(FATAL,
+	ereport((huge_pages == HUGE_PAGES_ON) ? FATAL : LOG,
 			(errmsg("could not map anonymous shared memory: %m"),
 			 (mmap_errno == ENOMEM) ?
 			 errhint("This error usually means that PostgreSQL's request "



> I agree that the message should be promoted to a higher level.  But I 
> think we should also make that information available at the SQL level, 
> as the log files may be truncated / rotated before you need the info, 
> and it can be troublesome to find the information at the OS level, if 
> you're lucky enough to have OS access.

+1

Regards,

--
Fujii Masao
Advanced Computing Technology Center
Research and Development Headquarters
NTT DATA CORPORATION


huge_pages_log_v2.diff
Description: huge_pages_log_v2.diff


Re: Read-only vs read only vs readonly

2021-09-02 Thread Kyotaro Horiguchi
At Thu, 2 Sep 2021 22:07:02 +, "Bossart, Nathan"  
wrote in 
> On 9/2/21, 11:30 AM, "Magnus Hagander"  wrote:
> > I had a customer point out to me that we're inconsistent in how we
> > spell read-only. Turns out we're not as inconsistent as I initially
> > thought :), but that they did manage to spot the one actual log
> > message we have that writes it differently than everything else -- but
> > that broke their grepping...
> >
> > Almost everywhere we use read-only. Attached patch changes the one log
> > message where we didn't, as well as a few places in the docs for it. I
> > did not bother with things like comments in the code.
> > 
> > Two questions:
> >
> > 1. Is it worth fixing? Or just silly nitpicking?
> 
> It seems entirely reasonable to me to consistently use "read-only" in
> the log messages and documentation.
> 
> > 2. What about translations? This string exists in translations --
> > should we just "fix" it there, without touching the translated string?
> > Or try to fix both? Or leave it for the translators who will get a
> > diff on it?
> 
> I don't have a strong opinion, but if I had to choose, I would say to
> leave it to the translators to decide.

+1 for both.  As a translator, that kind of change seems usual.  Many
changes involving full stops, spacing, capitalization and so on happen
especially near release season, like now.

regards.

-- 
Kyotaro Horiguchi
NTT Open Source Software Center




RE: Allow escape in application_name (was: [postgres_fdw] add local pid to fallback_application_name)

2021-09-02 Thread kuroda.hay...@fujitsu.com
Dear Fujii-san,

Thank you for your great works. Attached is the latest version.

> Thanks! What about updating the comments furthermore as follows?
> 
> -
> Use pgfdw_application_name as application_name if set.
> 
> PQconnectdbParams() processes the parameter arrays from start to end.
> If any key word is repeated, the last value is used. Therefore note that
> pgfdw_application_name must be added to the arrays after options of
> ForeignServer are, so that it can override application_name set in
> ForeignServer.
> -

It's friendlier than mine because it mentions
the specification of PQconnectdbParams(). 
Fixed to match yours.
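
As a side note, the last-value-wins rule is easy to demonstrate with a
standalone libpq program (a sketch; the connection values are illustrative
only):

#include <stdio.h>
#include <libpq-fe.h>

int
main(void)
{
	/* "application_name" appears twice; PQconnectdbParams() processes
	 * the arrays from start to end, so the later value is used */
	const char *const keys[] = {"dbname", "application_name",
								"application_name", NULL};
	const char *const vals[] = {"postgres", "from_server_options",
								"from_guc", NULL};
	PGconn	   *conn = PQconnectdbParams(keys, vals, 0);

	if (PQstatus(conn) == CONNECTION_OK)
		printf("effective application_name is \"from_guc\"\n");
	PQfinish(conn);
	return 0;
}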

> + }
> + /* Use "postgres_fdw" as fallback_application_name */
> 
> It's better to add new empty line between these two lines.

Fixed.

> +-- Disconnect once because the value is used only when establishing
> connections
> +DO $$
> + BEGIN
> + PERFORM postgres_fdw_disconnect_all();
> + END
> +$$;
> 
> Why does DO command need to be used here to execute
> postgres_fdw_disconnect_all()? Instead, we can just execute
> "SELECT 1 FROM postgres_fdw_disconnect_all();"?

The DO command was used because I wanted to
ignore the return value of postgres_fdw_disconnect_all().
Currently this function returns false, but if other tests are modified,
some connections may remain and the function may start to return true.

I searched the SQL file and found that this function was already called
your way elsewhere.  Hence I fixed it.

> For test coverage, it's better to test at least the following three cases?
> 
> (1) appname is set in neither GUC nor foreign server
> -> "postgres_fdw" set in fallback_application_name is used
> (2) appname is set in foreign server, but not in GUC
> -> appname in foreign server is used
> (3) appname is set both in GUC and foreign server
>-> appname in GUC is used

I set four testcases:

(1) Sets neither GUC nor server option
(2) Sets server option, but not GUC
(3) Sets GUC but not server option
(4) Sets both GUC and server option

I confirmed it almost works fine, but I found that
fallback_application_name will never be used in our test environment.
This is because our test runner pg_regress sets PGAPPNAME to "pg_regress"
and libpq prefers the environment variable to fallback_application_name.
(I tried to control it with \setenv, but failed...)
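
To make that precedence concrete, here is a small sketch (not part of the
patch): libpq consults fallback_application_name only when application_name
is set neither in the connection parameters nor via PGAPPNAME.

#include <stdlib.h>
#include <libpq-fe.h>

int
main(void)
{
	/* mimic what pg_regress does for every test process */
	setenv("PGAPPNAME", "pg_regress", 1);

	const char *const keys[] = {"dbname", "fallback_application_name", NULL};
	const char *const vals[] = {"postgres", "postgres_fdw", NULL};
	PGconn	   *conn = PQconnectdbParams(keys, vals, 0);

	/* the effective application_name is "pg_regress": the environment
	 * variable takes precedence over fallback_application_name */
	PQfinish(conn);
	return 0;
}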

> +SELECT FROM ft1 LIMIT 1;
> 
> "1" should be added just after SELECT in the above statement?
> Because postgres_fdw regression test basically uses "SELECT 1 FROM ..."
> in other places.

Fixed.

> + DefineCustomStringVariable("postgres_fdw.application_name",
> +"Sets the 
> application name. This is used when connects to the remote server.",
>
> What about simplifying this description as follows?
>
> ---
> Sets the application name to be used on the remote server.
> ---

+1.

> +   Configuration Parameters 
> +  
> 
> The empty characters just after  and before  should be removed?

I checked other sgml file and agreed. Fixed.

And I found that including string.h is no longer needed, so it has been removed.

Best Regards,
Hayato Kuroda
FUJITSU LIMITED



v07_0001_add_application_name_GUC.patch
Description: v07_0001_add_application_name_GUC.patch


Re: Possible missing segments in archiving on standby

2021-09-02 Thread Kyotaro Horiguchi
At Fri, 3 Sep 2021 02:06:45 +0900, Fujii Masao  
wrote in 
> 
> 
> On 2021/09/02 10:16, Kyotaro Horiguchi wrote:
> > Ok, I agree that the reader-side needs an amendment.
> 
> Thanks for the review! Attached is the updated version of the patch.
> Based on my latest patch, I changed the startup process so that
> it creates an archive notification file of the streamed WAL segment
> including XLOG_SWITCH record if the notification file has not been
> created yet.

+   if (readSource == XLOG_FROM_STREAM &&
+   record->xl_rmid == RM_XLOG_ID &&
+   (record->xl_info & ~XLR_INFO_MASK) == XLOG_SWITCH)

readSource is the source at the time the startup process reads the record, and 
it could be different from the source at the time the record was written. We 
cannot know, at that point, where the record came from, so does the readSource 
condition work as expected?  If we had some trouble streaming just before, 
readSource at that time is likely to be XLOG_FROM_PG_WAL.

+			if (XLogArchivingAlways())
+				XLogArchiveNotify(xlogfilename, true);
+			else
+				XLogArchiveForceDone(xlogfilename);

This path is used both for crash and archive recovery. If we pass through there 
during crash recovery on a primary with archive_mode=on, the file could
be marked .done before actually being archived. On the other hand, with
archive_mode=always, the file could be re-marked .ready even after it
has already been archived.  Why isn't this XLogArchiveCheckDone?

regards.

-- 
Kyotaro Horiguchi
NTT Open Source Software Center




Unused variable in TAP tests file

2021-09-02 Thread Amul Sul
A few TAP test files have a "tempdir_short" variable which isn't in
use. The attached patch removes it.

Regards,
Amul
From 0751895df64bcd6bc719933013edf1d76e31b784 Mon Sep 17 00:00:00 2001
From: Amul Sul 
Date: Fri, 3 Sep 2021 01:19:29 -0400
Subject: [PATCH] Remove unused variable

---
 src/bin/pg_ctl/t/002_status.pl   | 1 -
 src/bin/pg_dump/t/001_basic.pl   | 1 -
 src/bin/pg_dump/t/002_pg_dump.pl | 1 -
 src/bin/pg_dump/t/003_pg_dump_with_server.pl | 1 -
 src/test/modules/test_pg_dump/t/001_base.pl  | 1 -
 5 files changed, 5 deletions(-)

diff --git a/src/bin/pg_ctl/t/002_status.pl b/src/bin/pg_ctl/t/002_status.pl
index 56a06fafa3b..c6f4fac57b6 100644
--- a/src/bin/pg_ctl/t/002_status.pl
+++ b/src/bin/pg_ctl/t/002_status.pl
@@ -9,7 +9,6 @@ use TestLib;
 use Test::More tests => 3;
 
 my $tempdir   = TestLib::tempdir;
-my $tempdir_short = TestLib::tempdir_short;
 
 command_exit_is([ 'pg_ctl', 'status', '-D', "$tempdir/nonexistent" ],
 	4, 'pg_ctl status with nonexistent directory');
diff --git a/src/bin/pg_dump/t/001_basic.pl b/src/bin/pg_dump/t/001_basic.pl
index d1a7e1db405..d6731855eda 100644
--- a/src/bin/pg_dump/t/001_basic.pl
+++ b/src/bin/pg_dump/t/001_basic.pl
@@ -10,7 +10,6 @@ use TestLib;
 use Test::More tests => 82;
 
 my $tempdir   = TestLib::tempdir;
-my $tempdir_short = TestLib::tempdir_short;
 
 #
 # Basic checks
diff --git a/src/bin/pg_dump/t/002_pg_dump.pl b/src/bin/pg_dump/t/002_pg_dump.pl
index a4ee54d516f..223f60e3bcb 100644
--- a/src/bin/pg_dump/t/002_pg_dump.pl
+++ b/src/bin/pg_dump/t/002_pg_dump.pl
@@ -10,7 +10,6 @@ use TestLib;
 use Test::More;
 
 my $tempdir   = TestLib::tempdir;
-my $tempdir_short = TestLib::tempdir_short;
 
 ###
 # Definition of the pg_dump runs to make.
diff --git a/src/bin/pg_dump/t/003_pg_dump_with_server.pl b/src/bin/pg_dump/t/003_pg_dump_with_server.pl
index ba994aee823..a879ae28d8d 100644
--- a/src/bin/pg_dump/t/003_pg_dump_with_server.pl
+++ b/src/bin/pg_dump/t/003_pg_dump_with_server.pl
@@ -9,7 +9,6 @@ use TestLib;
 use Test::More tests => 3;
 
 my $tempdir   = TestLib::tempdir;
-my $tempdir_short = TestLib::tempdir_short;
 
 my $node = PostgresNode->new('main');
 my $port = $node->port;
diff --git a/src/test/modules/test_pg_dump/t/001_base.pl b/src/test/modules/test_pg_dump/t/001_base.pl
index ea7739d7254..17c404c81f2 100644
--- a/src/test/modules/test_pg_dump/t/001_base.pl
+++ b/src/test/modules/test_pg_dump/t/001_base.pl
@@ -10,7 +10,6 @@ use TestLib;
 use Test::More;
 
 my $tempdir   = TestLib::tempdir;
-my $tempdir_short = TestLib::tempdir_short;
 
 ###
 # This structure is based off of the src/bin/pg_dump/t test
-- 
2.18.0



Re: Estimating HugePages Requirements?

2021-09-02 Thread Kyotaro Horiguchi
At Thu, 2 Sep 2021 16:46:56 +, "Bossart, Nathan"  
wrote in 
> On 9/2/21, 12:54 AM, "Michael Paquier"  wrote:
> > Thanks.  Anyway, we don't really need huge_pages_required on Windows,
> > do we?  The following docs of Windows tell what do to when using large
> > pages:
> > https://docs.microsoft.com/en-us/windows/win32/memory/large-page-support
> >
> > The backend code does that as in PGSharedMemoryCreate(), now that I
> > look at it.  And there is no way to change the minimum large page size
> > there as far as I can see because that's decided by the processor, no?
> > There is a case for shared_memory_size on Windows to be able to adjust
> > the sizing of the memory of the host, though.
> 
> Yeah, huge_pages_required might not serve much purpose for Windows.
> We could always set it to -1 for Windows if it seems like it'll do
> more harm than good.

I agreed to this.

> > At the end it would be nice to not finish with two GUCs.  Both depend
> > on the reordering of the actions done by the postmaster, so I'd be
> > curious to hear the thoughts of others on this particular point.
> 
> Of course.  It'd be great to hear others' thoughts on this stuff.

Honestly, I would be satisfied if the following error message
contained the required number of huge pages.

FATAL:  could not map anonymous shared memory: Cannot allocate memory
HINT:  This error usually means that PostgreSQL's request for a shared memory 
segment exceeded available memory, swap space, or huge pages. To reduce the 
request size (currently 148897792 bytes), reduce PostgreSQL's shared memory 
usage, perhaps by reducing shared_buffers or max_connections.

Or emit a different message if huge_pages=on.

FATAL: could not map anonymous shared memory from huge pages
HINT:  This usually means that PostgreSQL's request for huge pages exceeded the 
number available. The number of 2048kB huge pages required for the requested 
memory size (currently 148897792 bytes) is 71.


Returning to this feature, even if I were informed of that via a GUC, I
wouldn't add memory by looking at shared_memory_size.  Anyway, since
shared_buffers occupies almost all of the shared memory allocated to
postgres, we are not supposed to need such a precise adjustment of the
required size of shared memory.  On the other hand, the available number
of huge pages is configurable, and we need to set it as required.  That
said, it might seem to me a bit strange that there's only
huge_pages_required and not shared_memory_size, in the view of
comprehensiveness or completeness.  So my feeling at this point is "I
need only huge_pages_required but might want shared_memory_size just
for completeness".


By the way, I noticed that postgres -C huge_page_size shows 0; I think it
should show the number used for the calculation if we are to show
huge_pages_required.

I noticed that postgres -C shared_memory_size showed 137 (= 144703488)
whereas the error message above showed 148897792 bytes (142MB). So it
seems that something is forgotten while calculating
shared_memory_size.  As a consequence, launching postgres with
huge_pages_required (69 pages) set as vm.nr_hugepages ended up in the
"could not map anonymous shared memory" error.

regards.

-- 
Kyotaro Horiguchi
NTT Open Source Software Center




Re: using an end-of-recovery record in all cases

2021-09-02 Thread Kyotaro Horiguchi
At Thu, 2 Sep 2021 11:30:59 -0400, Robert Haas  wrote in 
> On Mon, Aug 9, 2021 at 3:00 PM Robert Haas  wrote:
> > I decided to try writing a patch to use an end-of-recovery record
> > rather than a checkpoint record in all cases.
> >
> > The first problem I hit was that GetRunningTransactionData() does
> > Assert(TransactionIdIsNormal(CurrentRunningXacts->latestCompletedXid)).
> >
> > Unfortunately we can't just relax the assertion, because the
> > XLOG_RUNNING_XACTS record will eventually be handed to
> > ProcArrayApplyRecoveryInfo() for processing ... and that function
> > contains a matching assertion which would in turn fail. It in turn
> > passes the value to MaintainLatestCompletedXidRecovery() which
> > contains yet another matching assertion, so the restriction to normal
> > XIDs here looks pretty deliberate. There are no comments, though, so
> > the reader is left to guess why. I see one problem:
> > MaintainLatestCompletedXidRecovery uses FullXidRelativeTo, which
> > expects a normal XID. Perhaps it's best to just dodge the entire issue
> > by skipping LogStandbySnapshot() if latestCompletedXid happens to be
> > 2, but that feels like a hack, because AFAICS the real problem is that
> > StartupXLog() doesn't agree with the rest of the code on whether 2 is
> > a legal case, and maybe we ought to be storing a value that doesn't
> > need to be computed via TransactionIdRetreat().
> 
> Anyone have any thoughts about this?

I tried to reproduce this, but just replacing the end-of-recovery
checkpoint request with issuing an end-of-recovery record didn't make
"make check-world" fail for me.  Do you have an idea of any other
requirement to cause that?

regards.

-- 
Kyotaro Horiguchi
NTT Open Source Software Center




RE: [PATCH] support tab-completion for single quote input with equal sign

2021-09-02 Thread tanghy.f...@fujitsu.com
On Friday, September 3, 2021 2:14 AM, Jacob Champion  
wrote
>I applied your patch against HEAD (and did a clean build for good
>measure) but couldn't get the tab-completion you described -- on my
>machine, `PUBLICATION` still fails to complete. Tab completion is
>working in general, for example with the `SUBSCRIPTION` and
>`CONNECTION` keywords.
>
>Is there additional setup that I need to do?

Thanks for your check.

I applied the 0001 patch to HEAD (c95ede41) just now and it worked as I 
expected.
Did you leave a space between "dbname=postgres'" and [TAB]?

In the 0002 patch I added a TAP test for the scenario of single-quoted input 
with an equal sign. 
The test result is 'All tests successful.' on my machine. 
You can run the TAP tests as follows:
1. Apply the attached patch
2. Build the source with the option '--enable-tap-tests' (./configure 
--enable-tap-tests)
3. Move to the subdirectory 'src/bin/psql' 
4. Run 'make check' 

I'd appreciate it if you can share your test results with me.

Regards,
Tang


v2-0001-support-tab-completion-for-single-quote-input-wit.patch
Description:  v2-0001-support-tab-completion-for-single-quote-input-wit.patch



Re: PoC/WIP: Extended statistics on expressions

2021-09-02 Thread Justin Pryzby
On Wed, Sep 01, 2021 at 06:45:29PM +0200, Tomas Vondra wrote:
> However while polishing the second patch, I realized we're allowing
> statistics on expressions referencing system attributes. So this fails;
> 
> CREATE STATISTICS s ON ctid, x FROM t;
> 
> but this passes:
> 
> CREATE STATISTICS s ON (ctid::text), x FROM t;
> 
> IMO we should reject such expressions, just like we reject direct references
> to system attributes - patch attached.

Right, same as indexes.  +1

> Furthermore, I wonder if we should reject expressions without any Vars? This
> works now:
> 
> CREATE STATISTICS s ON (11::text) FROM t;
> 
> but it seems rather silly / useless, so maybe we should reject it.

To my surprise, this is also allowed for indexes...

But (maybe this is what I was remembering) it's prohibited to have a constant
expression as a partition key.

Expressions without a var seem like a case where the user did something
deliberately silly, and dissimilar from the case of making a stats expression
on a simple column - that seemed like it could be a legitimate
mistake/confusion (it's not unreasonable to write an extra parenthesis, but
it's strange if that causes it to behave differently).

I think it's not worth too much effort to prohibit this: if they're determined,
they can still write an expression with a var which is constant.  I'm not going
to say it's worth zero effort, though.

-- 
Justin




Re: improve pg_receivewal code

2021-09-02 Thread Bharath Rupireddy
On Thu, Sep 2, 2021 at 9:05 PM Ronan Dunklau  wrote:
> > 1) Fetch the server system identifier in the first RunIdentifySystem
> > call and use it to identify (via pg_receivewal's ReceiveXlogStream) any
> > unexpected changes that may happen in the server while pg_receivewal
> > is connected to it. This can be helpful in scenarios when
> > pg_receivewal tries to reconnect to the server (see the code around
> > pg_usleep with RECONNECT_SLEEP_TIME) but something unexpected has
> > happened in the server that changed its system identifier. Once
> > pg_receivewal establishes the connection to the server again,
> > ReceiveXlogStream has a code chunk to compare the system identifier
> > against the one that we received in the initial connection.
>
> I'm not sure what kind of problem could be happening here: if you were
> somehow routed to another server? Or if we "switched" the cluster listening
> on that port?

Yeah. Also, the pg_control file on the server can get corrupted for
whatever reason. This system identifier check is useful in case
pg_receivewal connects to the server again and again.
These things may sound overcautious, but there's nothing wrong with
using what ReceiveXlogStream provides. pg_basebackup already makes
use of this.
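
As a sketch of what the reconnect-time check amounts to (the helper below is
hypothetical; RunIdentifySystem() is the existing streamutil.c routine):

#include <stdlib.h>
#include <string.h>
#include <libpq-fe.h>
#include "streamutil.h"		/* RunIdentifySystem() */

/* hypothetical helper: after reconnecting, re-run IDENTIFY_SYSTEM and
 * compare against the system identifier saved at the first connection */
static bool
sysid_still_matches(PGconn *conn, const char *saved_sysid)
{
	char	   *sysid = NULL;
	bool		ok;

	if (!RunIdentifySystem(conn, &sysid, NULL, NULL, NULL))
		return false;
	ok = (strcmp(sysid, saved_sysid) == 0);
	free(sysid);
	return ok;
}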

> > 2) Move the RunIdentifySystem to identify timeline id and start LSN
> > from the server only if the pg_receivewal failed to get them from
> > FindStreamingStart. This way, an extra IDENTIFY_SYSTEM command is
> > avoided.
>
> That makes sense, even if we add another IDENTIFY_SYSTEM to check against the
> one set in the first place.
>
> > 3) Place the "replication connection shouldn't have any database name
> > associated" error check right after RunIdentifySystem so that we can
> > avoid fetching the wal segment size with RetrieveWalSegSize if at all we
> > were to fail with that error. This change is similar to what
> > pg_recvlogical.c does.
>
> Makes sense.
>
> > 4) Move the RetrieveWalSegSize to just before pg_receivewal.c enters
> > main loop to get the wal from the server. This avoids an unnecessary
> > query for pg_receivewal with "--create-slot" or "--drop-slot".
> > 5) Having an assertion after pg_receivewal has done a good amount of
> > work to find the start timeline and LSN might be helpful:
> > Assert(stream.timeline != 0 && stream.startpos != InvalidXLogRecPtr);
> >
> > Attaching a patch that does take care of above improvements. Thoughts?
>
> Overall I think it is good.

Thanks for your review.

> However, you have some formatting issues, here it mixes tabs and spaces:
>
> +   /*
> +* No valid data can be found in the existing output
> directory.
> +* Get start LSN position and current timeline ID from
> the server.
> +*/

My bad. I forgot to run "git diff --check" on the v1 patch.

> And here it is not formatted properly:
>
> +static char   *server_sysid = NULL;

Done.

Here's the v2 with above modifications.

Regards,
Bharath Rupireddy.


v2-0001-improve-pg_receivewal-code.patch
Description: Binary data


Re: a misbehavior of partition row movement (?)

2021-09-02 Thread Amit Langote
Hi Andrew,

On Fri, Sep 3, 2021 at 6:19 AM Andrew Dunstan  wrote:
> On 7/13/21 8:09 AM, Amit Langote wrote:
> > Unfortunately, I don’t think I’ll have time in this CF to solve some
> > very fundamental issues I found in the patch during the last cycle.
> > I’m fine with either marking this as RwF for now or move to the next CF.
>
> Amit, do you have time now to work on this?

I will take some time next week to take a fresh look at this and post an update.

Thank you.

-- 
Amit Langote
EDB: http://www.enterprisedb.com




Re: AIX: Symbols are missing in libpq.a

2021-09-02 Thread Noah Misch
On Wed, Sep 01, 2021 at 08:59:57AM +, REIX, Tony wrote:
> Here is a new patch, using export.txt whenever it exists.
> I have tested it with v13.4: it's OK.
> The patch for 14beta3 should be the same, since there was no change to 
> src/Makefile.shlib between v13 and v14.

Thanks.  This looks good.  I'm attaching what I intend to push, which just
adds a log message and some cosmetic changes compared to your version.  Here
are the missing symbols restored by the patch:

pg_encoding_to_char
pg_utf_mblen
pg_char_to_encoding
pg_valid_server_encoding
pg_valid_server_encoding_id

I was ambivalent about whether to back-patch to v13 or to stop at v14, but I
decided that v13 should have this change.  We should expect sad users when
libpq lacks a documented symbol.  Complaints about loss of undocumented
symbols (e.g. pqParseInput3) are unlikely, and we're even less likely to have
users opposing reintroduction of long-documented symbols.  An alternative
would be to have v13 merge the symbol lists, like your original proposal, so
we're not removing even undocumented symbols.  I doubt applications have
accrued dependencies on libpq-internal symbols in the year since v13 appeared,
particularly since those symbols are inaccessible on Linux.  Our AIX export
lists never included libpgport or libpgcommon symbols.
Author: Noah Misch 
Commit: Noah Misch 

AIX: Fix missing libpq symbols by respecting SHLIB_EXPORTS.

We make each AIX shared library export all globals found in .o files
that originate in the library.  That doesn't include symbols acquired by
-lpgcommon_shlib.  That is good on average, but it became a problem for
libpq when commit e6afa8918c461c1dd80c5063a950518fa4e950cd moved five
official libpq API symbols into src/common.  Fix this by implementing
the SHLIB_EXPORTS mechanism for AIX, so affected libraries export the
same symbols that they export on Linux.  This reintroduces symbols
pg_encoding_to_char, pg_utf_mblen, pg_char_to_encoding,
pg_valid_server_encoding, and pg_valid_server_encoding_id.  Back-patch
to v13, where the aforementioned commit first appeared.  While a minor
release is usually the wrong time to add or remove symbol exports in
libpq or libecpg, we should expect users to want each documented symbol.

Tony Reix

Discussion: 
https://postgr.es/m/pr3pr02mb6396742e2fc3e77d37a920bc86...@pr3pr02mb6396.eurprd02.prod.outlook.com

diff --git a/src/Makefile.shlib b/src/Makefile.shlib
index 29a7f6d..6b79db1 100644
--- a/src/Makefile.shlib
+++ b/src/Makefile.shlib
@@ -329,7 +329,11 @@ $(shlib): $(OBJS) | $(SHLIB_PREREQS)
rm -f $(stlib)
$(LINK.static) $(stlib) $^
$(RANLIB) $(stlib)
+ifeq (,$(SHLIB_EXPORTS))
$(MKLDEXPORT) $(stlib) $(shlib) >$(exports_file)
+else
+	( echo '#! $(shlib)'; $(AWK) '/^[^#]/ {printf "%s\n",$$1}' $(SHLIB_EXPORTS) ) >$(exports_file)
+endif
$(COMPILER) -o $(shlib) $(stlib) -Wl,-bE:$(exports_file) $(LDFLAGS) 
$(LDFLAGS_SL) $(SHLIB_LINK)
rm -f $(stlib)
$(AR) $(AROPT) $(stlib) $(shlib)
diff --git a/src/port/README b/src/port/README
index c446b46..97f18a6 100644
--- a/src/port/README
+++ b/src/port/README
@@ -28,5 +28,5 @@ applications.
 from libpgport are linked first.  This avoids having applications
 dependent on symbols that are _used_ by libpq, but not intended to be
 exported by libpq.  libpq's libpgport usage changes over time, so such a
-dependency is a problem.  Windows, Linux, and macOS use an export list to
-control the symbols exported by libpq.
+dependency is a problem.  Windows, Linux, AIX, and macOS use an export
+list to control the symbols exported by libpq.


Re: replay of CREATE TABLESPACE eats data at wal_level=minimal

2021-09-02 Thread Noah Misch
On Thu, Sep 02, 2021 at 11:28:27AM -0400, Robert Haas wrote:
> On Wed, Aug 25, 2021 at 8:03 AM Robert Haas  wrote:
> > On Wed, Aug 25, 2021 at 1:21 AM Noah Misch  wrote:
> > > Sounds good.  I think the log message is the optimal place:
> >
> > Looks awesome.
> 
> Is there anything still standing in the way of committing this?

I pushed it as commit 97ddda8.




Re: pgstat_send_connstats() introduces unnecessary timestamp and UDP overhead

2021-09-02 Thread Laurenz Albe
On Wed, 2021-09-01 at 10:56 +0200, Laurenz Albe wrote:
> On Tue, 2021-08-31 at 21:16 -0700, Andres Freund wrote:
> > On 2021-09-01 05:39:14 +0200, Laurenz Albe wrote:
> > > On Tue, 2021-08-31 at 18:55 -0700, Andres Freund wrote:
> > > > > > On Tue, Aug 31, 2021 at 04:55:35AM +0200, Laurenz Albe wrote:
> > > > > > In view of that, how about doubling PGSTAT_STAT_INTERVAL to 1000
> > > > > > milliseconds?  That would mean slightly less up-to-date statistics,
> > > > > > but I doubt that that will be a problem.
> > > > 
> > > > I think it's not helpful. Still increases the number of messages 
> > > > substantially in workloads
> > > > with a lot of connections doing occasional queries. Which is common.
> > > 
> > > How come?  If originally you send table statistics every 500ms, and now 
> > > you send
> > > table statistics and session statistics every second, that should amount 
> > > to the
> > > same thing.  Where is my misunderstanding?
> > 
> > Consider the case of one query a second.
> 
> I guess I am too stupid.  I don't see it.

Finally got it.  That would send a message every second, and with connection 
statistics,
twice as many.

Here is my next suggestion for a band-aid to mitigate this problem:
Introduce a second, much longer interval for reporting session statistics.

--- a/src/backend/postmaster/pgstat.c
+++ b/src/backend/postmaster/pgstat.c
@@ -77,6 +77,8 @@
 #define PGSTAT_STAT_INTERVAL	500		/* Minimum time between stats file
 										 * updates; in milliseconds. */
 
+#define PGSTAT_CONSTAT_INTERVAL	6	/* interval to report connection statistics */
+
 #define PGSTAT_RETRY_DELAY		10		/* How long to wait between checks for a
 										 * new file; in milliseconds. */
 
@@ -889,8 +891,13 @@ pgstat_report_stat(bool disconnect)
 		!TimestampDifferenceExceeds(last_report, now, PGSTAT_STAT_INTERVAL))
 		return;
 
-	/* for backends, send connection statistics */
-	if (MyBackendType == B_BACKEND)
+	/*
+	 * For backends, send connection statistics, but only every
+	 * PGSTAT_CONSTAT_INTERVAL or when the backend terminates.
+	 */
+	if (MyBackendType == B_BACKEND &&
+		(TimestampDifferenceExceeds(last_report, now, PGSTAT_CONSTAT_INTERVAL) ||
+		 disconnect))
 		pgstat_send_connstats(disconnect, last_report, now);
 
 	last_report = now;

That should keep the extra load moderate, except for workloads with lots of 
tiny connections
(for which this may be the least of their problems).

Yours,
Laurenz Albe





Re: row filtering for logical replication

2021-09-02 Thread Peter Smith
On Wed, Sep 1, 2021 at 9:23 PM Euler Taveira  wrote:
>
> On Sun, Aug 29, 2021, at 11:14 PM, Peter Smith wrote:
...
> Peter, I'm still reviewing this new cache mechanism. I will provide feedback
> as soon as I integrate it into this recent modification.

Hi Euler, for your next version can you please also integrate the
tab-autocomplete change back into the main patch.

This autocomplete change was originally posted quite a few weeks ago
here [1] but seems to have been overlooked.
I've rebased it and it applies OK to your latest v27* set. PSA.

Thanks!
--
[1] 
https://www.postgresql.org/message-id/CAHut%2BPuLoZuHD_A%3Dn8GshC84Nc%3D8guReDsTmV1RFsCYojssD8Q%40mail.gmail.com

Kind Regards,
Peter Smith.
Fujitsu Australia


v1-0001-Add-tab-auto-complete-support-for-the-Row-Filter-.patch
Description: Binary data


Re: stat() vs ERROR_DELETE_PENDING, round N + 1

2021-09-02 Thread Kyotaro Horiguchi
At Fri, 3 Sep 2021 01:01:43 +1200, Thomas Munro  wrote 
in 
> On Fri, Sep 3, 2021 at 12:44 AM Thomas Munro  wrote:
> > NtFileCreate()
> 
> Erm, that's spelled NtCreateFile.  I see Michael mentioned this
> before[1]; I don't think it's only available in kernel mode though,
> the docs[2] say "This function is the user-mode equivalent to the
> ZwCreateFile function", and other open source user space stuff is
> using it.  It's explicitly internal and subject to change though,
> hence my desire to avoid it.
> 
> [1] 
> https://www.postgresql.org/message-id/flat/a9c76882-27c7-9c92-7843-21d5521b70a9%40postgrespro.ru
> [2] 
> https://docs.microsoft.com/en-us/windows/win32/api/winternl/nf-winternl-ntcreatefile

This might be stupid, but if a delete-pending file can obstruct something,
couldn't we change unlink on Windows to rename the file to a temporary
random name and then remove it?  We do something like that explicitly
during WAL file removal. (It may cause degradation on bulk file deletion,
and we may need a further fix so that such being-deleted files are
excluded while running a directory scan, though..)
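
A minimal sketch of the rename-then-unlink idea in plain Win32 calls (the
".deleted.<pid>" naming scheme is made up for illustration):

#include <windows.h>
#include <stdio.h>
#include <stdbool.h>

static bool
unlink_via_rename(const char *path)
{
	char		tmppath[MAX_PATH];

	/* move the file out of the way first, so that even if the delete
	 * stays pending, the original name is immediately free for reuse */
	snprintf(tmppath, sizeof(tmppath), "%s.deleted.%lu",
			 path, (unsigned long) GetCurrentProcessId());

	if (!MoveFileEx(path, tmppath, MOVEFILE_REPLACE_EXISTING))
		return false;
	return DeleteFile(tmppath) != 0;
}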

However, looking at [1], with that strategy there may be a case where
such "deleted" files are left around forever, though.


[1] 
https://www.postgresql.org/message-id/002101d79fc2%24c96dff60%245c49fe20%24%40gmail.com

regards.

-- 
Kyotaro Horiguchi
NTT Open Source Software Center




Re: Estimating HugePages Requirements?

2021-09-02 Thread Michael Paquier
On Thu, Sep 02, 2021 at 04:46:56PM +, Bossart, Nathan wrote:
> Yeah, huge_pages_required might not serve much purpose for Windows.
> We could always set it to -1 for Windows if it seems like it'll do
> more harm than good.

I'd be fine with this setup in environments where there is no need for
it.

>> At the end it would be nice to not finish with two GUCs.  Both depend
>> on the reordering of the actions done by the postmaster, so I'd be
>> curious to hear the thoughts of others on this particular point.
> 
> Of course.  It'd be great to hear others' thoughts on this stuff.

Just to be clear here, the ordering of HEAD is that for the
postmaster:
- Load configuration.
- Handle -C config_param.
- checkDataDir(), to check permissions of the data dir, etc.
- checkControlFile(), to see if the control file exists.
- Switch to data directory as work dir.
- Lock file creation.
- Initial read of the control file (where the GUC data_checksums is
set).
- Register apply launcher
- shared_preload_libraries

With 0002, we have that:
- Load configuration.
- checkDataDir(), to check permissions of the data dir, etc.
- checkControlFile(), to see if the control file exists.
- Switch to data directory as work dir.
- Register apply launcher
- shared_preload_libraries
- Calculate the shmem GUCs (new step)
- Handle -C config_param.
- Lock file creation.
- Initial read of the control file (where the GUC data_checksums is
set).

One thing that would still be incorrect upon further review is that we'd
have data_checksums wrong with -C, meaning that the initial read of
the control file should be moved further up, and getting the control
file checks done before registering workers would be better.  Keeping
the lock file at the end would be fine AFAIK, but should we worry
about the interactions with _PG_init() here?

0001, that refactors the calculation of the shmem size into a
different routine, is fine as-is, so I'd like to move on and apply
it.
--
Michael


signature.asc
Description: PGP signature


Re: [PATCH] test/ssl: rework the sslfiles Makefile target

2021-09-02 Thread Michael Paquier
On Thu, Sep 02, 2021 at 04:42:14PM +, Jacob Champion wrote:
> Done that way in v5. It's a lot of moved code, so I've kept it as two
> commits for review purposes.

Nice.  This is neat.  The split helps a lot to understand how you've
changed things from the original implementation.  As a whole, this
looks rather committable to me.

One small-ish comment that I have is about all the .config files we
have at the root of src/test/ssl/, bloating the whole directory.  I think
that it would be a bit cleaner to put all of them in a separate
subdirectory, say just config/ or conf/.
--
Michael


signature.asc
Description: PGP signature


Re: [PATCH] pg_ctl should not truncate command lines at 1024 characters

2021-09-02 Thread Tom Lane
Phil Krylov  writes:
> IMHO pg_ctl should not blindly truncate generated command lines at 
> MAXPGPATH (1024 characters) and then run that, resulting in:

Fair enough.

> The attached patch tries to fix it in the least intrusive way.

Seems reasonable.  We didn't have psprintf when this code was written,
but now that we do, it's hardly any more complicated to do it without
the length restriction.

> While we're at it, is it supposed that pg_ctl is a very short-lived 
> process and is therefore allowed to leak memory? I've noticed some 
> places where I would like to add a free() call.

I think that these free() calls you propose to add are a complete
waste of code space.  Certainly a free() right before an exit() call
is that; if anything, it's *delaying* recycling the memory space for
some useful purpose.  But no part of pg_ctl runs long enough for it
to be worth worrying about small leaks.

I do not find your proposed test case to be a useful expenditure
of test cycles, either.  If it ever fails, we'd learn nothing,
except that that particular platform has a surprisingly small
command line length limit.

regards, tom lane




Re: dup(0) fails on Ubuntu 20.04 and macOS 10.15 with 13.0

2021-09-02 Thread Tom Lane
Mario Emmenlauer  writes:
> The idea to switch to dup(2) sounds very good to me.

I poked at this some more, and verified that adding "fclose(stdin);"
at the head of PostmasterMain is enough to trigger the reported
failure.  However, after changing fd.c to dup stderr not stdin,
we can pass check-world even with that in place.  So that confirms
that there is no very good reason for the postmaster to require
stdin to be available.

Hence, I pushed the fix to make fd.c use stderr here.  I only
back-patched to v14, because given the lack of other complaints,
I couldn't quite justify touching stable branches for this.

regards, tom lane




Re: [PATCH] pg_ctl should not truncate command lines at 1024 characters

2021-09-02 Thread Phil Krylov

On 2021-09-03 00:36, Ranier Vilela wrote:

> The MSVC docs say that the limit for the command line is 32,767
> characters; while that's OK for me, wouldn't it be better to check this
> limit?


Well, it's ARG_MAX in POSIX, and ARG_MAX is defined as 256K in Darwin, 
512K in FreeBSD, 128K in Linux; _POSIX_ARG_MAX is defined as 4096 on all 
three platforms. Windows may differ too. Anyway, allocating even 128K 
of precious stack space is too much, which is why I suggest using 
psprintf(). As for checking any hard limit, I don't think it would have 
much value - somehow we got the original command line, thus it is 
supported by the system, so we can just pass it on.


-- Ph.




Re: prevent immature WAL streaming

2021-09-02 Thread Alvaro Herrera
On 2021-Sep-02, Kyotaro Horiguchi wrote:

> So, this is a crude PoC of that.

I had ended up with something very similar, except I was trying to cram
the flag in via the checkpoint record instead of hacking
AdvanceXLInsertBuffer().  I removed that stuff and merged both; here's
the result.

> 1. This patch is written on the current master, but it doesn't
>   interfare with the seg-boundary-memorize patch since it removes the
>   calls to RegisterSegmentBoundary.

I rebased on top of the revert patch.

> 2. Since xlogreader cannot emit a log message immediately, we don't
>   have a means to leave a log message informing that recovery met an
>   aborted partial continuation record. (In this PoC, it is done by
>   fprintf :p)

Shrug.  We can just use an #ifndef FRONTEND / elog(LOG).  (I didn't keep
this part, sorry.)

> 3. Maybe we need pg_waldump to show partial continuation records,
>   but I'm not sure how to realize that.

Ah yes, we'll need to fix that.

-- 
Álvaro Herrera PostgreSQL Developer  —  https://www.EnterpriseDB.com/
"Las navajas y los monos deben estar siempre distantes"   (Germán Poo)
>From af9dba1c1523ea8b7963e01fdd5e357b142e69f8 Mon Sep 17 00:00:00 2001
From: Alvaro Herrera 
Date: Tue, 31 Aug 2021 20:55:10 -0400
Subject: [PATCH v2 1/3] Revert "Avoid creating archive status ".ready" files
 too early"

This reverts commit 515e3d84a0b58b58eb30194209d2bc47ed349f5b.
---
 src/backend/access/transam/timeline.c|   2 +-
 src/backend/access/transam/xlog.c| 220 ++-
 src/backend/access/transam/xlogarchive.c |  17 +-
 src/backend/postmaster/walwriter.c   |   7 -
 src/backend/replication/walreceiver.c|   6 +-
 src/include/access/xlog.h|   1 -
 src/include/access/xlogarchive.h |   4 +-
 src/include/access/xlogdefs.h|   1 -
 8 files changed, 24 insertions(+), 234 deletions(-)

diff --git a/src/backend/access/transam/timeline.c b/src/backend/access/transam/timeline.c
index acd5c2431d..8d0903c175 100644
--- a/src/backend/access/transam/timeline.c
+++ b/src/backend/access/transam/timeline.c
@@ -452,7 +452,7 @@ writeTimeLineHistory(TimeLineID newTLI, TimeLineID parentTLI,
 	if (XLogArchivingActive())
 	{
 		TLHistoryFileName(histfname, newTLI);
-		XLogArchiveNotify(histfname, true);
+		XLogArchiveNotify(histfname);
 	}
 }
 
diff --git a/src/backend/access/transam/xlog.c b/src/backend/access/transam/xlog.c
index 24165ab03e..e51a7a749d 100644
--- a/src/backend/access/transam/xlog.c
+++ b/src/backend/access/transam/xlog.c
@@ -724,18 +724,6 @@ typedef struct XLogCtlData
 	XLogRecPtr	lastFpwDisableRecPtr;
 
 	slock_t		info_lck;		/* locks shared variables shown above */
-
-	/*
-	 * Variables used to track segment-boundary-crossing WAL records.  See
-	 * RegisterSegmentBoundary.  Protected by segtrack_lck.
-	 */
-	XLogSegNo	lastNotifiedSeg;
-	XLogSegNo	earliestSegBoundary;
-	XLogRecPtr	earliestSegBoundaryEndPtr;
-	XLogSegNo	latestSegBoundary;
-	XLogRecPtr	latestSegBoundaryEndPtr;
-
-	slock_t		segtrack_lck;	/* locks shared variables shown above */
 } XLogCtlData;
 
 static XLogCtlData *XLogCtl = NULL;
@@ -932,7 +920,6 @@ static void RemoveXlogFile(const char *segname, XLogSegNo recycleSegNo,
 		   XLogSegNo *endlogSegNo);
 static void UpdateLastRemovedPtr(char *filename);
 static void ValidateXLOGDirectoryStructure(void);
-static void RegisterSegmentBoundary(XLogSegNo seg, XLogRecPtr pos);
 static void CleanupBackupHistory(void);
 static void UpdateMinRecoveryPoint(XLogRecPtr lsn, bool force);
 static XLogRecord *ReadRecord(XLogReaderState *xlogreader,
@@ -1167,56 +1154,23 @@ XLogInsertRecord(XLogRecData *rdata,
 	END_CRIT_SECTION();
 
 	/*
-	 * If we crossed page boundary, update LogwrtRqst.Write; if we crossed
-	 * segment boundary, register that and wake up walwriter.
+	 * Update shared LogwrtRqst.Write, if we crossed page boundary.
 	 */
 	if (StartPos / XLOG_BLCKSZ != EndPos / XLOG_BLCKSZ)
 	{
-		XLogSegNo	StartSeg;
-		XLogSegNo	EndSeg;
-
-		XLByteToSeg(StartPos, StartSeg, wal_segment_size);
-		XLByteToSeg(EndPos, EndSeg, wal_segment_size);
-
-		/*
-		 * Register our crossing the segment boundary if that occurred.
-		 *
-		 * Note that we did not use XLByteToPrevSeg() for determining the
-		 * ending segment.  This is so that a record that fits perfectly into
-		 * the end of the segment causes the latter to get marked ready for
-		 * archival immediately.
-		 */
-		if (StartSeg != EndSeg && XLogArchivingActive())
-			RegisterSegmentBoundary(EndSeg, EndPos);
-
-		/*
-		 * Advance LogwrtRqst.Write so that it includes new block(s).
-		 *
-		 * We do this after registering the segment boundary so that the
-		 * comparison with the flushed pointer below can use the latest value
-		 * known globally.
-		 */
 		SpinLockAcquire(&XLogCtl->info_lck);
+		/* advance global request to include new block(s) */
 		if (XLogCtl->LogwrtRqst.Write < EndPos)
 			XLogCtl->LogwrtRqst.Write = EndPos;
 		/* update local result copy while I have the chance *

Re: [PATCH] pg_ctl should not truncate command lines at 1024 characters

2021-09-02 Thread Ranier Vilela
On Thu, Sep 2, 2021 at 18:36, Phil Krylov  wrote:

> Hello,
>
> Lacking a tool to edit postgresql.conf programmatically, people resort
> to passing cluster options on the command line. While passing all
> non-default options in this way may sound like an abuse of the feature,
> IMHO pg_ctl should not blindly truncate generated command lines at
> MAXPGPATH (1024 characters) and then run that, resulting in:
>
The MSVC docs say that the limit for the command line is 32,767 characters;
while that's OK for me, wouldn't it be better to check this limit?


> /bin/sh: Syntax error: end of file unexpected (expecting word)
> pg_ctl: could not start server
> Examine the log output.
>
> The attached patch tries to fix it in the least intrusive way.
>
> While we're at it, is it supposed that pg_ctl is a very short-lived
> process and is therefore allowed to leak memory? I've noticed some
> places where I would like to add a free() call.
>
+1 to add free.

regards,
Ranier Vilela


Re: Added schema level support for publication.

2021-09-02 Thread Peter Smith
On Thu, Sep 2, 2021 at 6:50 PM houzj.f...@fujitsu.com
 wrote:
>
> From Wed, Sep 1, 2021 2:36 PM Peter Smith  wrote:
> > Schema objects are not part of the publication. Currently only TABLES are
> > in publications, so I thought that \dRp+ output would just be the list of
> > "Tables" in the publication. Schemas would not even be displayed at all
> > (except in the table names).
>
> I think one use case of a schema-level publication is that it can
> automatically publish new tables created in the schema (same as an ALL
> TABLES publication). So, IMO, \dRp+ should output schema-level publications
> separately to make the user aware of them.

OK. That is a fair point.

--
Kind Regards,
Peter Smith.
Fujitsu Australia




Re: Read-only vs read only vs readonly

2021-09-02 Thread Bossart, Nathan
On 9/2/21, 11:30 AM, "Magnus Hagander"  wrote:
> I had a customer point out to me that we're inconsistent in how we
> spell read-only. Turns out we're not as inconsistent as I initially
> thought :), but that they did manage to spot the one actual log
> message we have that writes it differently than everything else -- but
> that broke their grepping...
>
> Almost everywhere we use read-only. Attached patch changes the one log
> message where we didn't, as well as a few places in the docs for it. I
> did not bother with things like comments in the code.
> 
> Two questions:
>
> 1. Is it worth fixing? Or just silly nitpicking?

It seems entirely reasonable to me to consistently use "read-only" in
the log messages and documentation.

> 2. What about translations? This string exists in translations --
> should we just "fix" it there, without touching the translated string?
> Or try to fix both? Or leave it for the translators who will get a
> diff on it?

I don't have a strong opinion, but if I had to choose, I would say to
leave it to the translators to decide.

Nathan



Re: .ready and .done files considered harmful

2021-09-02 Thread Bossart, Nathan
On 9/2/21, 6:22 AM, "Dipesh Pandit"  wrote:
> I agree that multiple-files-per-readdir is cleaner and has the resilience
> of the current implementation. However, I have a few suggestions on the
> keep-trying-the-next-file approach patch shared in the previous thread.

Which approach do you think we should use?  I think we have decent
patches for both approaches at this point, so perhaps we should see if
we can get some additional feedback from the community on which one we
should pursue further.

> +   /* force directory scan the first time we call pgarch_readyXlog() */
> +   PgArchForceDirScan();
> +
>
> We should not force a directory scan in pgarch_ArchiverCopyLoop(). This gets
> called whenever the archiver wakes up from the wait state. This will result
> in a situation where the archiver performs a full directory scan despite
> having accurate information about the next anticipated log segment.
> Instead we can check if lastSegNo is initialized and continue the directory
> scan until it gets initialized in pgarch_readyXlog().

The problem I see with this is that pgarch_archiveXlog() might end up
failing.  If it does, we won't retry archiving the file until we do a
directory scan.  I think we could try to avoid forcing a directory
scan outside of these failure cases and archiver startup, but I'm not
sure it's really worth it.  When pgarch_readyXlog() returns false, it
most likely means that there are no .ready files present, so I'm not
sure we are gaining a whole lot by avoiding a directory scan in that
case.  I guess it might help a bit if there are a ton of .done files,
though.

> +   return lastSegNo;
> We should return "true" here.

Oops.  Good catch.

> I am wondering if we can add a log message for files which are
> archived as part of a directory scan. This will be useful for diagnostic
> purposes, to check whether the desired files get archived as part of a
> directory scan in special scenarios. I also think that we should add a
> few comments in pgarch_readyXlog().

I agree, but it should probably be something like DEBUG3 instead of
LOG.

> +   /*
> +* We must use <= because the archiver may have just completed a
> +* directory scan and found a later segment (but hasn't updated
> +* shared memory yet).
> +*/
> +   if (this_segno <= arch_segno)
> +   PgArchForceDirScan();
>
> I still think that we should use '<' operator here because
> arch_segno represents the segment number of the most recent
> .ready file found by the archiver. This gets updated in shared 
> memory only if archiver has successfully found a .ready file.
> In a normal scenario this_segno will be greater than arch_segno 
> whereas in cases where a .ready file is created out of order 
> this_segno may be less than arch_segno. I am wondering
> if there is a scenario where arch_segno is equal to this_segno
> unless I am missing something.

The pgarch_readyXlog() logic looks a bit like this:

1. Try to skip directory scan.  If that succeeds, we're done.
2. Do a directory scan.
3. If we found a regular WAL file, update PgArch and return
   what we found.

Let's say step 1 looks for WAL file 10, but 10.ready doesn't exist
yet.  The following directory scan ends up finding 11.ready.  Just
before we update the PgArch state, XLogArchiveNotify() is called and
creates 10.ready.  However, pgarch_readyXlog() has already decided to
return WAL segment 11 and update the state to look for 12 next.  If we
just used '<', we won't force a directory scan, and segment 10 will
not be archived until the next one happens.  If we use '<=', I don't
think we have the same problem.

I've also thought about another similar scenario.  Let's say step 1
looks for WAL file 10, but it doesn't exist yet (just like the
previous example).  The following directory scan ends up finding
12.ready, but just before we update PgArch, we create 11.ready.  In
this case, we'll still skip forcing a directory scan until 10.ready is
created later on.  I believe it all eventually works out as long as we
can safely assume that all files that should have .ready files will
eventually get them.
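
Put as a sketch, the check on the notify side therefore has to read as
follows (PgArchGetLastReadySegNo() is a hypothetical accessor for the shared
state; PgArchForceDirScan() is from the patch under discussion):

#include "access/xlogdefs.h"

static void
maybe_force_dir_scan(XLogSegNo this_segno)
{
	XLogSegNo	arch_segno = PgArchGetLastReadySegNo();

	/*
	 * "<=" rather than "<": the archiver may have just completed a
	 * directory scan and found a later segment without having updated
	 * shared memory yet.  With "<", a .ready file created for
	 * this_segno == arch_segno would not force a rescan and could be
	 * stranded until the next directory scan.
	 */
	if (this_segno <= arch_segno)
		PgArchForceDirScan();
}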

Nathan



[PATCH] pg_ctl should not truncate command lines at 1024 characters

2021-09-02 Thread Phil Krylov

Hello,

Lacking a tool to edit postgresql.conf programmatically, people resort 
to passing cluster options on the command line. While passing all 
non-default options in this way may sound like an abuse of the feature, 
IMHO pg_ctl should not blindly truncate generated command lines at 
MAXPGPATH (1024 characters) and then run that, resulting in:


/bin/sh: Syntax error: end of file unexpected (expecting word)
pg_ctl: could not start server
Examine the log output.

The attached patch tries to fix it in the least intrusive way.

While we're at it, is it supposed that pg_ctl is a very short-lived 
process and is therefore allowed to leak memory? I've noticed some 
places where I would like to add a free() call.


-- Ph.

From 6f2e70025208fa4d0034ddbe5254e2cb9759dd24 Mon Sep 17 00:00:00 2001
From: Phil Krylov 
Date: Thu, 2 Sep 2021 21:39:58 +0200
Subject: [PATCH] pg_ctl should not truncate command lines at 1024 characters

---
 src/bin/pg_ctl/pg_ctl.c| 47 ++
 src/bin/pg_ctl/t/001_start_stop.pl | 41 -
 2 files changed, 67 insertions(+), 21 deletions(-)

diff --git a/src/bin/pg_ctl/pg_ctl.c b/src/bin/pg_ctl/pg_ctl.c
index 7985da0a94..199d13c1fb 100644
--- a/src/bin/pg_ctl/pg_ctl.c
+++ b/src/bin/pg_ctl/pg_ctl.c
@@ -442,7 +442,7 @@ free_readfile(char **optlines)
 static pgpid_t
 start_postmaster(void)
 {
-	char		cmd[MAXPGPATH];
+	char	   *cmd;
 
 #ifndef WIN32
 	pgpid_t		pm_pid;
@@ -487,14 +487,15 @@ start_postmaster(void)
 	 * has the same PID as the current child process.
 	 */
 	if (log_file != NULL)
-		snprintf(cmd, MAXPGPATH, "exec \"%s\" %s%s < \"%s\" >> \"%s\" 2>&1",
- exec_path, pgdata_opt, post_opts,
- DEVNULL, log_file);
+		cmd = psprintf("exec \"%s\" %s%s < \"%s\" >> \"%s\" 2>&1",
+	   exec_path, pgdata_opt, post_opts,
+	   DEVNULL, log_file);
 	else
-		snprintf(cmd, MAXPGPATH, "exec \"%s\" %s%s < \"%s\" 2>&1",
- exec_path, pgdata_opt, post_opts, DEVNULL);
+		cmd = psprintf("exec \"%s\" %s%s < \"%s\" 2>&1",
+	   exec_path, pgdata_opt, post_opts, DEVNULL);
 
 	(void) execl("/bin/sh", "/bin/sh", "-c", cmd, (char *) NULL);
+	free(cmd);
 
 	/* exec failed */
 	write_stderr(_("%s: could not start server: %s\n"),
@@ -553,19 +554,21 @@ start_postmaster(void)
 		else
 			close(fd);
 
-		snprintf(cmd, MAXPGPATH, "\"%s\" /C \"\"%s\" %s%s < \"%s\" >> \"%s\" 2>&1\"",
- comspec, exec_path, pgdata_opt, post_opts, DEVNULL, log_file);
+		cmd = psprintf("\"%s\" /C \"\"%s\" %s%s < \"%s\" >> \"%s\" 2>&1\"",
+	   comspec, exec_path, pgdata_opt, post_opts, DEVNULL, log_file);
 	}
 	else
-		snprintf(cmd, MAXPGPATH, "\"%s\" /C \"\"%s\" %s%s < \"%s\" 2>&1\"",
- comspec, exec_path, pgdata_opt, post_opts, DEVNULL);
+		cmd = psprintf("\"%s\" /C \"\"%s\" %s%s < \"%s\" 2>&1\"",
+	   comspec, exec_path, pgdata_opt, post_opts, DEVNULL);
 
 	if (!CreateRestrictedProcess(cmd, &pi, false))
 	{
 		write_stderr(_("%s: could not start server: error code %lu\n"),
 	 progname, (unsigned long) GetLastError());
+		free(cmd);
 		exit(1);
 	}
+	free(cmd);
 	/* Don't close command process handle here; caller must do so */
 	postmasterProcess = pi.hProcess;
 	CloseHandle(pi.hThread);
@@ -828,7 +831,7 @@ find_other_exec_or_die(const char *argv0, const char *target, const char *versio
 static void
 do_init(void)
 {
-	char		cmd[MAXPGPATH];
+	char	   *cmd;
 
 	if (exec_path == NULL)
 		exec_path = find_other_exec_or_die(argv0, "initdb", "initdb (PostgreSQL) " PG_VERSION "\n");
@@ -840,17 +843,19 @@ do_init(void)
 		post_opts = "";
 
 	if (!silent_mode)
-		snprintf(cmd, MAXPGPATH, "\"%s\" %s%s",
- exec_path, pgdata_opt, post_opts);
+		cmd = psprintf("\"%s\" %s%s",
+	   exec_path, pgdata_opt, post_opts);
 	else
-		snprintf(cmd, MAXPGPATH, "\"%s\" %s%s > \"%s\"",
- exec_path, pgdata_opt, post_opts, DEVNULL);
+		cmd = psprintf("\"%s\" %s%s > \"%s\"",
+	   exec_path, pgdata_opt, post_opts, DEVNULL);
 
 	if (system(cmd) != 0)
 	{
 		write_stderr(_("%s: database system initialization failed\n"), progname);
+		free(cmd);
 		exit(1);
 	}
+	free(cmd);
 }
 
 static void
@@ -2175,7 +2180,7 @@ set_starttype(char *starttypeopt)
 static void
 adjust_data_dir(void)
 {
-	char		cmd[MAXPGPATH],
+	char	   *cmd,
 filename[MAXPGPATH],
 			   *my_exec_path;
 	FILE	   *fd;
@@ -2207,18 +2212,20 @@ adjust_data_dir(void)
 		my_exec_path = pg_strdup(exec_path);
 
 	/* it's important for -C to be the first option, see main.c */
-	snprintf(cmd, MAXPGPATH, "\"%s\" -C data_directory %s%s",
-			 my_exec_path,
-			 pgdata_opt ? pgdata_opt : "",
-			 post_opts ? post_opts : "");
+	cmd = psprintf("\"%s\" -C data_directory %s%s",
+   my_exec_path,
+   pgdata_opt ? pgdata_opt : "",
+   post_opts ? post_opts : "");
 
 	fd = popen(cmd, "r");
 	if (fd == NULL || fgets(filename, sizeof(filename), fd) == NULL)
 	{
 		write_stderr(_("%s: could not determine the data directory using command \"%s\"\n"), progname, cmd);
+		free(cmd);
 		exit(1)

Re: a misbehavior of partition row movement (?)

2021-09-02 Thread Andrew Dunstan


On 7/13/21 8:09 AM, Amit Langote wrote:
>
>
>
> @Amit patch is not successfully applying, can you please rebase that. 
>
>
> Thanks for the reminder.
>
> Masahiko Sawada, it's been a bit long since you reviewed the
> patch, are you still interested to review that? 
>
>
> Unfortunately, I don’t think I’ll have time in this CF to solve some
> very fundamental issues I found in the patch during the last cycle. 
> I’m fine with either marking this as RwF for now or move to the next CF.


Amit, do you have time now to work on this?


cheers


andrew


--
Andrew Dunstan
EDB: https://www.enterprisedb.com





Re: [PATCH] Partial foreign key updates in referential integrity triggers

2021-09-02 Thread Zhihong Yu
On Thu, Sep 2, 2021 at 12:11 PM Paul Martinez  wrote:

> On Wed, Sep 1, 2021 at 4:11 AM Daniel Gustafsson  wrote:
>
> > This patch no longer applies, can you please submit a rebased version?
> It
> > currently fails on catversion.h, to keep that from happening repeatedly
> you can
> > IMO skip that from the patch submission.
>
> Ah, understood. Will do that in the future. Attached are rebased patches,
> not
> including catversion.h changes, for both the ON UPDATE/DELETE case, and the
> ON DELETE only case.
>
> - Paul
>
Hi,
+   case RI_TRIGTYPE_DELETE:
+   queryno = is_set_null
+   ? RI_PLAN_ONDELETE_SETNULL_DOUPDATE
+   : RI_PLAN_ONDELETE_SETDEFAULT_DOUPDATE;

Should the new symbols be renamed?

RI_PLAN_ONDELETE_SETNULL_DOUPDATE -> RI_PLAN_ONDELETE_SETNULL_DODELETE
RI_PLAN_ONDELETE_SETDEFAULT_DOUPDATE -> RI_PLAN_ONDELETE_SETDEFAULT_DODELETE

Cheers


Re: Skipping logical replication transactions on subscriber side

2021-09-02 Thread Mark Dilger



> On Aug 30, 2021, at 12:06 AM, Masahiko Sawada  wrote:
> 
> I've attached rebased patches. 

Here are some review comments:

For the v12-0002 patch:

The documentation changes for ALTER SUBSCRIPTION .. RESET look strange to me.  
You grouped SET and RESET together, much like sql-altertable.html has them 
grouped, but I don't think it flows naturally here, as the two commands do not 
support the same set of parameters.  It might look better if you documented 
these separately.  It might also be good to order the parameters the same, so 
that the differences can more quickly be seen.

For the v12-0003 patch:

I believe this feature is needed, but it also seems like a very powerful 
foot-gun.  Can we do anything to make it less likely that users will hurt 
themselves with this tool?

I am thinking back to support calls I have attended.  When a production system 
is down, there is often some hesitancy to perform ad-hoc operations on the 
database, but once the decision has been made to do so, people try to get the 
whole process done as quickly as possible.  If multiple transactions on the 
publisher fail on the subscriber, they will do so in series, not in parallel.  
The process of clearing these errors will amount to copying the xid of each 
failed transaction to the ALTER SUBSCRIPTION ... SET (skip_xid = xxx) command 
and running it, then the next, then the next, and so on.  Perhaps the first couple 
times through the process, the customer will look to see that the failure is of 
the same type and on the same table, but after a short time they will likely 
just script something to clear the rest as quickly as possible.  In the heat of 
the moment, they may not include a check of the failure message, but merely a 
grep of the failing xid.

If the user could instead clear all failed transactions of the same type, that 
might make it less likely that they unthinkingly also skip subsequent errors of 
some different type.  Perhaps something like ALTER SUBSCRIPTION ... SET 
(skip_failures = 'duplicate key value violates unique constraint "test_pkey"')? 
 This is arguably a different feature request, and not something your patch is 
required to address, but I wonder how much we should limit people shooting 
themselves in the foot?  If we built something like this using your skip_xid 
feature, rather than instead of your skip_xid feature, would your feature need 
to be modified?

The docs could use some rewording, too:

+  If incoming data violates any constraints the logical replication
+  will stop until it is resolved. 

In my experience, logical replication doesn't stop, but instead goes into an 
infinite loop of retries.

+  The resolution can be done either
+  by changing data on the subscriber so that it doesn't conflict with
+  incoming change or by skipping the whole transaction.

I'm having trouble thinking of an example conflict where skipping a transaction 
would be better than writing a BEFORE INSERT trigger on the conflicting table 
which suppresses or redirects conflicting rows somewhere else.  Particularly 
for larger transactions containing multiple statements, suppressing the 
conflicting rows using a trigger would be less messy than skipping the 
transaction.  I think your patch adds a useful tool to the toolkit, but maybe 
we should mention more alternatives in the docs?  Something like, "changing the 
data on the subscriber so that it doesn't conflict with incoming changes, or 
dropping the conflicting constraint or unique index, or writing a trigger on 
the subscriber to suppress or redirect conflicting incoming changes, or as a 
last resort, by skipping the whole transaction"?

Perhaps I'm reading your phrase "changing the data on the subscriber" too 
narrowly.  To me, that means running DML (either a DELETE or an UPDATE) on the 
existing data in the table where the conflict arises.  These other options are 
DDL and do not easily come to mind when I read that phrase.

—
Mark Dilger
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company







Re: Skipping logical replication transactions on subscriber side

2021-09-02 Thread Mark Dilger


> On Aug 30, 2021, at 12:06 AM, Masahiko Sawada  wrote:
> 
> I've attached rebased patches.

Thanks for these patches, Sawada-san!

The first patch in your series, v12-0001, seems useful to me even before 
committing any of the rest.  I would like to integrate the new 
pg_stat_subscription_errors view it creates into regression tests for other 
logical replication features under development.

In particular, it can be hard to write TAP tests that need to wait for 
subscriptions to catch up or fail.  With your view committed, a new 
PostgresNode function to wait for catchup or for failure can be added, and then 
developers of different projects can all use that.  I am attaching a version of 
such a function, plus some tests of your patch (since it does not appear to 
have any).  Would you mind reviewing these and giving comments or including 
them in your next patch version?



0001-Adding-tests-of-subscription-errors.patch
Description: Binary data

—
Mark Dilger
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company





Re: [PATCH] Partial foreign key updates in referential integrity triggers

2021-09-02 Thread Paul Martinez
On Wed, Sep 1, 2021 at 4:11 AM Daniel Gustafsson  wrote:

> This patch no longer applies, can you please submit a rebased version?  It
> currently fails on catversion.h, to keep that from happening repeatedly you 
> can
> IMO skip that from the patch submission.

Ah, understood. Will do that in the future. Attached are rebased patches, not
including catversion.h changes, for both the ON UPDATE/DELETE case, and the
ON DELETE only case.

- Paul


referential-actions-set-cols-v3.patch
Description: Binary data


referential-actions-on-delete-only-set-cols-v2.patch
Description: Binary data


Re: automatically generating node support functions

2021-09-02 Thread Jacob Champion
On Tue, 2021-08-17 at 16:36 +0200, Peter Eisentraut wrote:
> Here is another set of preparatory patches that clean up various special 
> cases and similar in the node support.
> 
> 0001-Remove-T_Expr.patch
> 
> Removes unneeded T_Expr.
> 
> 0002-Add-COPY_ARRAY_FIELD-and-COMPARE_ARRAY_FIELD.patch
> 0003-Add-WRITE_INDEX_ARRAY.patch
> 
> These add macros to handle a few cases that were previously hand-coded.

These look sane to me.

> 0004-Make-node-output-prefix-match-node-structure-name.patch
> 
> Some nodes' output/read functions use a label that is slightly different 
> from their node name, e.g., "NOTIFY" instead of "NOTIFYSTMT".  This 
> cleans that up so that an automated approach doesn't have to deal with 
> these special cases.

Is there any concern about the added serialization length, or is that
trivial in practice? The one that particularly caught my eye is
RANGETBLENTRY, which was previously RTE. But I'm not very well-versed
in all the places these strings can be generated and stored.

> 0005-Add-Cardinality-typedef.patch
> 
> Adds a typedef Cardinality for double fields that store an estimated row 
> or other count.  Works alongside Cost and Selectivity.

Should RangeTblEntry.enrtuples also be a Cardinality?

--Jacob


Read-only vs read only vs readonly

2021-09-02 Thread Magnus Hagander
I had a customer point out to me that we're inconsistent in how we
spell read-only. Turns out we're not as inconsistent as I initially
thought :), but that they did manage to spot the one actual log
message we have that writes it differently than everything else -- but
that broke their grepping...

Almost everywhere we use read-only. Attached patch changes the one log
message where we didn't, as well as a few places in the docs for it. I
did not bother with things like comments in the code.

Two questions:

1. Is it worth fixing? Or just silly nitpicking?

2. What about translations? This string exists in translations --
should we just "fix" it there, without touching the translated string?
Or try to fix both? Or leave it for the translators who will get a
diff on it?

-- 
 Magnus Hagander
 Me: https://www.hagander.net/
 Work: https://www.redpill-linpro.com/
diff --git a/doc/src/sgml/catalogs.sgml b/doc/src/sgml/catalogs.sgml
index 2b2c70a26e..2f0def9b19 100644
--- a/doc/src/sgml/catalogs.sgml
+++ b/doc/src/sgml/catalogs.sgml
@@ -9602,7 +9602,7 @@ SCRAM-SHA-256$:&l
   
 
   
-   The pg_available_extensions view is read only.
+   The pg_available_extensions view is read-only.
   
  
 
@@ -9726,8 +9726,8 @@ SCRAM-SHA-256$:&l
   
 
   
-   The pg_available_extension_versions view is read
-   only.
+   The pg_available_extension_versions view is
+   read-only.
   
  
 
@@ -10042,7 +10042,7 @@ SCRAM-SHA-256$:&l
   
 
   
-   The pg_cursors view is read only.
+   The pg_cursors view is read-only.
   
 
  
@@ -11164,7 +11164,7 @@ SELECT * FROM pg_locks pl LEFT JOIN pg_prepared_xacts ppx
   
 
   
-   The pg_prepared_statements view is read only.
+   The pg_prepared_statements view is read-only.
   
  
 
diff --git a/doc/src/sgml/high-availability.sgml b/doc/src/sgml/high-availability.sgml
index 22af7dbf51..c43f214020 100644
--- a/doc/src/sgml/high-availability.sgml
+++ b/doc/src/sgml/high-availability.sgml
@@ -1093,7 +1093,7 @@ primary_slot_name = 'node_a_slot'

 

-Read only transactions and transaction rollbacks need not wait for
+Read-only transactions and transaction rollbacks need not wait for
 replies from standby servers. Subtransaction commits do not wait for
 responses from standby servers, only top-level commits. Long
 running actions such as data loading or index building do not wait
@@ -1962,7 +1962,7 @@ LOG:  entering standby mode
 ... then some time later ...
 
 LOG:  consistent recovery state reached
-LOG:  database system is ready to accept read only connections
+LOG:  database system is ready to accept read-only connections
 
 
 Consistency information is recorded once per checkpoint on the primary.
@@ -2191,7 +2191,7 @@ HINT:  You can then restart the server after making the necessary configuration

 

-Currently, temporary table creation is not allowed during read only
+Currently, temporary table creation is not allowed during read-only
 transactions, so in some cases existing scripts will not run correctly.
 This restriction might be relaxed in a later release. This is
 both an SQL Standard compliance issue and a technical issue.
@@ -2290,7 +2290,7 @@ HINT:  You can then restart the server after making the necessary configuration
 
  Full knowledge of running transactions is required before snapshots
  can be taken. Transactions that use large numbers of subtransactions
- (currently greater than 64) will delay the start of read only
+ (currently greater than 64) will delay the start of read-only
  connections until the completion of the longest running write transaction.
  If this situation occurs, explanatory messages will be sent to the server log.
 
diff --git a/doc/src/sgml/mvcc.sgml b/doc/src/sgml/mvcc.sgml
index d358bbe4a6..cfdcb74221 100644
--- a/doc/src/sgml/mvcc.sgml
+++ b/doc/src/sgml/mvcc.sgml
@@ -526,7 +526,7 @@ ERROR:  could not serialize access due to concurrent update
 transaction sees a completely stable view of the database.  However,
 this view will not necessarily always be consistent with some serial
 (one at a time) execution of concurrent transactions of the same level.
-For example, even a read only transaction at this level may see a
+For example, even a read-only transaction at this level may see a
 control record updated to show that a batch has been completed but
 not see one of the detail records which is logically
 part of the batch because it read an earlier revision of the control
diff --git a/src/backend/postmaster/postmaster.c b/src/backend/postmaster/postmaster.c
index 9c2c98614a..63043ed8d1 100644
--- a/src/backend/postmaster/postmaster.c
+++ b/src/backend/postmaster/postmaster.c
@@ -5199,7 +5199,7 @@ sigusr1_handler(SIGNAL_ARGS)
 		PgStatPID = pgstat_start();
 
 		ereport(LOG,
-(errmsg("database system is ready to accept read only connections")));
+(errmsg("database system is re

Re: Is it safe to use the extended protocol with COPY?

2021-09-02 Thread Robert Haas
On Wed, Sep 1, 2021 at 2:43 PM Tom Lane  wrote:
> Daniele Varrazzo  writes:
> > Someone [1] has pointed out this conversation [2] which suggests that
> > COPY with extended protocol might break in the future.
>
> As was pointed out in that same thread, the odds of us actually
> breaking that case are nil.  I wouldn't recommend changing your
> code on this basis.

I agree that there doesn't seem to be any risk of a wire protocol
change in the near future, but it might still be a good idea to change
any code that does this on the grounds that the current wire protocol
makes reliable error handling impossible - unless you wait to send
Sync until you see how the server responds to the earlier messages.[1]

[1] 
https://www.postgresql.org/message-id/CA%2BTgmoa4eA%2BcPXaiGQmEBp9XisVd3ZE9dbvnbZEvx9UcMiw2tg%40mail.gmail.com

-- 
Robert Haas
EDB: http://www.enterprisedb.com




Re: [PATCH] support tab-completion for single quote input with equal sign

2021-09-02 Thread Jacob Champion
On Fri, 2021-07-23 at 05:34 +, tanghy.f...@fujitsu.com wrote:
> On Thursday, July 22, 2021 1:05 PM, tanghy.f...@fujitsu.com 
>  wrote
> > I found a problem when using tab-completion as follows:
> > 
> > CREATE SUBSCRIPTION my_subscription 
> > CONNECTION 'host=localhost port=5432 dbname=postgres' [TAB]
> > 
> > The word 'PUBLICATION' couldn't be auto completed as expected.

Hello,

I applied your patch against HEAD (and did a clean build for good
measure) but couldn't get the tab-completion you described -- on my
machine, `PUBLICATION` still fails to complete. Tab completion is
working in general, for example with the `SUBSCRIPTION` and
`CONNECTION` keywords.

Is there additional setup that I need to do?

--Jacob


Re: Possible missing segments in archiving on standby

2021-09-02 Thread Fujii Masao



On 2021/09/02 10:16, Kyotaro Horiguchi wrote:

Ok, I agree that the reader-side needs an amendment.


Thanks for the review! Attached is the updated version of the patch.
Based on my latest patch, I changed the startup process so that
it creates an archive notification file of the streamed WAL segment
including an XLOG_SWITCH record if the notification file has not been created yet.

Regards,

--
Fujii Masao
Advanced Computing Technology Center
Research and Development Headquarters
NTT DATA CORPORATION
diff --git a/src/backend/access/transam/xlog.c b/src/backend/access/transam/xlog.c
index 24165ab03e..6c407045dd 100644
--- a/src/backend/access/transam/xlog.c
+++ b/src/backend/access/transam/xlog.c
@@ -7576,6 +7576,34 @@ StartupXLOG(void)
 				/* Handle interrupt signals of startup process */
 				HandleStartupProcInterrupts();
 
+				/*
+				 * Create an archive notification file of a streamed WAL
+				 * segment if it includes an XLOG_SWITCH record and its
+				 * notification file has not been created yet. This is
+				 * necessary to handle the corner case that walreceiver may
+				 * fail to create such notification file if it exits after it
+				 * receives XLOG_SWITCH record but while it's receiving the
+				 * remaining bytes in the segment. Without this handling, WAL
+				 * archiving of the segment will be delayed until subsequent
+				 * checkpoint creates its notification file when removing it
+				 * even though it can be archived soon.
+				 */
+				if (readSource == XLOG_FROM_STREAM &&
+					record->xl_rmid == RM_XLOG_ID &&
+					(record->xl_info & ~XLR_INFO_MASK) == XLOG_SWITCH)
+				{
+					char		xlogfilename[MAXFNAMELEN];
+
+					XLogFileName(xlogfilename, curFileTLI, readSegNo, wal_segment_size);
+					if (!XLogArchiveIsReadyOrDone(xlogfilename))
+					{
+						if (XLogArchivingAlways())
+							XLogArchiveNotify(xlogfilename, true);
+						else
+							XLogArchiveForceDone(xlogfilename);
+					}
+				}
+
 				/*
 				 * Pause WAL replay, if requested by a hot-standby session via
 				 * SetRecoveryPause().
diff --git a/src/backend/replication/walreceiver.c b/src/backend/replication/walreceiver.c
index 60de3be92c..5b07eef3aa 100644
--- a/src/backend/replication/walreceiver.c
+++ b/src/backend/replication/walreceiver.c
@@ -125,6 +125,7 @@ static void WalRcvDie(int code, Datum arg);
 static void XLogWalRcvProcessMsg(unsigned char type, char *buf, Size len);
 static void XLogWalRcvWrite(char *buf, Size nbytes, XLogRecPtr recptr);
 static void XLogWalRcvFlush(bool dying);
+static void XLogWalRcvClose(XLogRecPtr recptr);
 static void XLogWalRcvSendReply(bool force, bool requestReply);
 static void XLogWalRcvSendHSFeedback(bool immed);
 static void ProcessWalSndrMessage(XLogRecPtr walEnd, TimestampTz sendTime);
@@ -883,42 +884,11 @@ XLogWalRcvWrite(char *buf, Size nbytes, XLogRecPtr recptr)
 	{
 		int			segbytes;
 
-		if (recvFile < 0 || !XLByteInSeg(recptr, recvSegNo, wal_segment_size))
+		/* Close the current segment if it's completed */
+		XLogWalRcvClose(recptr);
+
+		if (recvFile < 0)
 		{
-			/*
-			 * fsync() and close current file before we switch to next one. We
-			 * would otherwise have to reopen this file to fsync it later
-			 */
-			if (recvFile >= 0)
-			{
-				char		xlogfname[MAXFNAMELEN];
-
-				XLogWalRcvFlush(false);
-
-				XLogFileName(xlogfname, recvFileTLI, recvSegNo, wal_segment_size);
-
-				/*
-				 * XLOG segment files will be re-read by recovery in startup
-				 * process soon, so we don't advise the OS to release cache
-				 * pages associated with the file like XLogFileClose() do

Re: Estimating HugePages Requirements?

2021-09-02 Thread Bossart, Nathan
On 9/2/21, 12:54 AM, "Michael Paquier"  wrote:
> Thanks.  Anyway, we don't really need huge_pages_required on Windows,
> do we?  The following docs of Windows tell what do to when using large
> pages:
> https://docs.microsoft.com/en-us/windows/win32/memory/large-page-support
>
> The backend code does that as in PGSharedMemoryCreate(), now that I
> look at it.  And there is no way to change the minimum large page size
> there as far as I can see because that's decided by the processor, no?
> There is a case for shared_memory_size on Windows to be able to adjust
> the sizing of the memory of the host, though.

Yeah, huge_pages_required might not serve much purpose for Windows.
We could always set it to -1 for Windows if it seems like it'll do
more harm than good.

> At the end it would be nice to not finish with two GUCs.  Both depend
> on the reordering of the actions done by the postmaster, so I'd be
> curious to hear the thoughts of others on this particular point.

Of course.  It'd be great to hear others' thoughts on this stuff.

Nathan



Re: [Patch] ALTER SYSTEM READ ONLY

2021-09-02 Thread Robert Haas
On Tue, Aug 31, 2021 at 8:16 AM Amul Sul  wrote:
> Attached is the rebased version for the latest master head. Also,
> added tap tests to test some part of this feature and a separate patch
> to test recovery_end_command execution.

It looks like you haven't given any thought to writing that in a way
that will work on Windows?

> What is usual practice, can have a few tests in TAP and a few in
> pg_regress for the same feature?

Sure, there's no problem with that.

-- 
Robert Haas
EDB: http://www.enterprisedb.com




Re: improve pg_receivewal code

2021-09-02 Thread Ronan Dunklau
Le lundi 30 août 2021, 09:32:05 CEST Bharath Rupireddy a écrit :
> Hi,
> 
> I see there's a scope to do following improvements to pg_receivewal.c:

Thank you Bharath for this patch.

> 
> 1) Fetch the server system identifier in the first RunIdentifySystem
> call and use it to identify (via pg_receivewal's ReceiveXlogStream) any
> unexpected changes that may happen in the server while pg_receivewal
> is connected to it. This can be helpful in scenarios when
> pg_receivewal tries to reconnect to the server (see the code around
> pg_usleep with RECONNECT_SLEEP_TIME) but something unexpected has
> happened in the server that changed its system identifier. Once
> pg_receivewal establishes the connection to the server again, then the
> ReceiveXlogStream has a code chunk to compare the system identifier
> that we received in the initial connection.

I'm not sure what kind of problem could be happening here: if you were
somehow routed to another server? Or if we "switched" the cluster listening
on that port?

> 2) Move the RunIdentifySystem to identify timeline id and start LSN
> from the server only if the pg_receivewal failed to get them from
> FindStreamingStart. This way, an extra IDENTIFY_SYSTEM command is
> avoided.

That makes sense, even if we add another IDENTIFY_SYSTEM to check against the 
one set in the first place.

> 3) Place the "replication connection shouldn't have any database name
> associated" error check right after RunIdentifySystem so that we can
> avoid fetching wal segment size with RetrieveWalSegSize if at all we
> were to fail with that error. This change is similar to what
> pg_recvlogical.c does.

Makes sense.

> 4) Move the RetrieveWalSegSize to just before pg_receivewal.c enters
> main loop to get the wal from the server. This avoids an unnecessary
> query for pg_receivewal with "--create-slot" or "--drop-slot".
> 5) Have an assertion after the pg_receivewal done a good amount of
> work to find start timeline and LSN might be helpful:
> Assert(stream.timeline != 0 && stream.startpos != InvalidXLogRecPtr);
> 
> Attaching a patch that does take care of above improvements. Thoughts?

Overall I think it is good.

However, you have some formatting issues, here it mixes tabs and spaces:

+   /*
+* No valid data can be found in the existing output 
directory.
+* Get start LSN position and current timeline ID from 
the server.
+*/

And here it is not formatted properly:

+static char   *server_sysid = NULL;



> 
> Regards,
> Bharath Rupireddy.


-- 
Ronan Dunklau






Re: using an end-of-recovery record in all cases

2021-09-02 Thread Robert Haas
On Mon, Aug 9, 2021 at 3:00 PM Robert Haas  wrote:
> I decided to try writing a patch to use an end-of-recovery record
> rather than a checkpoint record in all cases.
>
> The first problem I hit was that GetRunningTransactionData() does
> Assert(TransactionIdIsNormal(CurrentRunningXacts->latestCompletedXid)).
>
> Unfortunately we can't just relax the assertion, because the
> XLOG_RUNNING_XACTS record will eventually be handed to
> ProcArrayApplyRecoveryInfo() for processing ... and that function
> contains a matching assertion which would in turn fail. It in turn
> passes the value to MaintainLatestCompletedXidRecovery() which
> contains yet another matching assertion, so the restriction to normal
> XIDs here looks pretty deliberate. There are no comments, though, so
> the reader is left to guess why. I see one problem:
> MaintainLatestCompletedXidRecovery uses FullXidRelativeTo, which
> expects a normal XID. Perhaps it's best to just dodge the entire issue
> by skipping LogStandbySnapshot() if latestCompletedXid happens to be
> 2, but that feels like a hack, because AFAICS the real problem is that
> StartupXLog() doesn't agree with the rest of the code on whether 2 is
> a legal case, and maybe we ought to be storing a value that doesn't
> need to be computed via TransactionIdRetreat().

Anyone have any thoughts about this?

-- 
Robert Haas
EDB: http://www.enterprisedb.com




Re: replay of CREATE TABLESPACE eats data at wal_level=minimal

2021-09-02 Thread Robert Haas
On Wed, Aug 25, 2021 at 8:03 AM Robert Haas  wrote:
> On Wed, Aug 25, 2021 at 1:21 AM Noah Misch  wrote:
> > Sounds good.  I think the log message is the optimal place:
>
> Looks awesome.

Is there anything still standing in the way of committing this?

-- 
Robert Haas
EDB: http://www.enterprisedb.com




Re: [Proposal] Fully WAL logged CREATE DATABASE - No Checkpoints

2021-09-02 Thread Robert Haas
On Thu, Sep 2, 2021 at 2:06 AM Dilip Kumar  wrote:
> 0003- The main patch for WAL logging the created database operation.

Andres pointed out that this approach ends up accessing relations
without taking a lock on them. It doesn't look like you did anything
about that.

+ /* Built-in oids are mapped directly */
+ if (classForm->oid < FirstGenbkiObjectId)
+ relfilenode = classForm->oid;
+ else if (OidIsValid(classForm->relfilenode))
+ relfilenode = classForm->relfilenode;
+ else
+ continue;

Am I missing something, or is this totally busted?

[rhaas pgsql]$ createdb
[rhaas pgsql]$ psql
psql (15devel)
Type "help" for help.

rhaas=# select oid::regclass from pg_class where relfilenode not in
(0, oid) and oid < 10000;
 oid
-
(0 rows)

rhaas=# vacuum full pg_attrdef;
VACUUM
rhaas=# select oid::regclass from pg_class where relfilenode not in
(0, oid) and oid < 10000;
  oid

 pg_attrdef_adrelid_adnum_index
 pg_attrdef_oid_index
 pg_toast.pg_toast_2604
 pg_toast.pg_toast_2604_index
 pg_attrdef
(5 rows)

  /*
+ * Now drop all buffers holding data of the target database; they should
+ * no longer be dirty so DropDatabaseBuffers is safe.

The way things worked before, this was true, but now AFAICS it's
false. I'm not sure whether that means that DropDatabaseBuffers() here
is actually unsafe or whether it just means that you haven't updated
the comment to explain the reason.

+ * Since we copy the file directly without looking at the shared buffers,
+ * we'd better first flush out any pages of the source relation that are
+ * in shared buffers.  We assume no new changes will be made while we are
+ * holding exclusive lock on the rel.

Ditto.

+ /* As always, WAL must hit the disk before the data update does. */

Actually, the way it's coded now, part of the on-disk changes are done
before WAL is issued, and part are done after. I doubt that's the
right idea. There's nothing special about writing the actual payload
bytes vs. the other on-disk changes (creating directories and files).
In any case the ordering deserves a better-considered comment than
this one.

+ XLogRegisterData((char *) PG_MAJORVERSION, nbytes);

Surely this is utterly pointless.

+ CopyDatabase(src_dboid, dboid, src_deftablespace, dst_deftablespace);
  PG_END_ENSURE_ERROR_CLEANUP(createdb_failure_callback,
  PointerGetDatum(&fparms));

I'd leave braces around the code for which we're ensuring error
cleanup even if it's just one line.

+ if (info == XLOG_DBASEDIR_CREATE)
  {
  xl_dbase_create_rec *xlrec = (xl_dbase_create_rec *) XLogRecGetData(record);

Seems odd to rename the record but not change the name of the struct.
I think I would be inclined to keep the existing record name, but if
we're going to change it we should be more thorough.

-- 
Robert Haas
EDB: http://www.enterprisedb.com




Re: shared-memory based stats collector

2021-09-02 Thread Jaime Casanova
On Mon, Jul 26, 2021 at 06:27:54PM -0700, Andres Freund wrote:
> Hi,
> 
> On 2021-07-26 17:52:01 +0900, Kyotaro Horiguchi wrote:
> > > > Yeah, thank you very much for checking that. However, this patch is
> > > > now developed in Andres' GitHub repository.  So I'm at a loss what to
> > > > do for the failure..
> > > 
> > > I'll post a rebased version soon.
> > 
> > (Sorry if you feel being hurried, which I didn't meant to.)
> 
> No worries!
> 
> I had intended to post a rebase by now. But while I did mostly finish
> that (see [1]) I unfortunately encountered a new issue around
> partitioned tables, see [2]. Currently I'm hoping for a few thoughts on
> that thread about the best way to address the issues.
> 
> Greetings,
> 
> Andres Freund
> 
> [1] https://github.com/anarazel/postgres/tree/shmstat
> [2] 
> https://www.postgresql.org/message-id/20210722205458.f2bug3z6qzxzpx2s%40alap3.anarazel.de
> 
> 

Hi Andres,

Are you planning to post a rebase soon?

-- 
Jaime Casanova
Director de Servicios Profesionales
SystemGuards - Consultores de PostgreSQL




Re: mark the timestamptz variant of date_bin() as stable

2021-09-02 Thread John Naylor
On Wed, Sep 1, 2021 at 3:25 PM Tom Lane  wrote:
>
> John Naylor  writes:
> > On Wed, Sep 1, 2021 at 2:44 PM Tom Lane  wrote:
> >> I see that these two answers are both exactly multiples of 24 hours away
> >> from the given origin.  But if I'm binning on the basis of "days" or
> >> larger units, I would sort of expect to get local midnight, and I'm not
> >> getting that once I cross a DST boundary.
>
> > Hmm, that seems like a reasonable expectation. I can get local midnight
> > if I recast to timestamp:
>
> > # select date_bin('1 day', '2021-11-10 00:00 +00'::timestamptz::timestamp,
> > '2021-09-01 00:00 -04'::timestamptz::timestamp);
> >   date_bin
> > -
> >  2021-11-09 00:00:00
> > (1 row)
>
> Yeah, and then back to timestamptz if that's what you really need :-(
>
> > It's a bit unintuitive, though.
>
> Agreed.  If we keep it like this, adding some documentation around
> the point would be a good idea I think.

Having heard no votes on changing this behavior (and it would be a bit of
work), I'll start on a documentation patch. And I'll go ahead and re-mark
the function as immutable tomorrow barring objections.

--
John Naylor
EDB: http://www.enterprisedb.com


Re: .ready and .done files considered harmful

2021-09-02 Thread Dipesh Pandit
Hi,

Thanks for the feedback.

> I attached two patches that demonstrate what I'm thinking this change
> should look like.  One is my take on the keep-trying-the-next-file
> approach, and the other is a new version of the multiple-files-per-
> readdir approach (with handling for "cheating" archive commands).  I
> personally feel that the multiple-files-per-readdir approach winds up
> being a bit cleaner and more resilient than the keep-trying-the-next-
> file approach.  However, the keep-trying-the-next-file approach will
> certainly be more efficient (especially for the extreme cases
> discussed in this thread), and I don't have any concrete concerns with
> this approach that seem impossible to handle.

I agree that multiple-files-per-readdir is cleaner and has the resilience of
the current implementation. However, I have a few suggestions on the
keep-trying-the-next-file approach patch shared in the previous thread.

+   /* force directory scan the first time we call pgarch_readyXlog() */
+   PgArchForceDirScan();
+

We should not force a directory scan in pgarch_ArchiverCopyLoop(). This gets
called whenever the archiver wakes up from the wait state. This will result
in a situation where the archiver performs a full directory scan despite
having accurate information about the next anticipated log segment.
Instead we can check if lastSegNo is initialized and continue the directory
scan until it gets initialized in pgarch_readyXlog().

+   return lastSegNo;
We should return "true" here.

I am thinking we could add a log message for files which are archived as
part of a directory scan. This will be useful for diagnostic purposes, to
check whether the desired files get archived as part of a directory scan in
special scenarios. I also think that we should add a few comments in
pgarch_readyXlog().

I have incorporated these changes and attached a patch
v1-0001-keep-trying-the-next-file-approach.patch.

+   /*
+* We must use <= because the archiver may have just completed a
+* directory scan and found a later segment (but hasn't updated
+* shared memory yet).
+*/
+   if (this_segno <= arch_segno)
+   PgArchForceDirScan();

I still think that we should use '<' operator here because
arch_segno represents the segment number of the most recent
.ready file found by the archiver. This gets updated in shared
memory only if archiver has successfully found a .ready file.
In a normal scenario this_segno will be greater than arch_segno
whereas in cases where a .ready file is created out of order
this_segno may be less than arch_segno. I am wondering
if there is a scenario where arch_segno is equal to this_segno
unless I am missing something.

Thanks,
Dipesh
From 4fb41105ba4ed9d02a2a551bb4f6cf693ec5c31e Mon Sep 17 00:00:00 2001
From: Dipesh Pandit 
Date: Thu, 2 Sep 2021 17:16:20 +0530
Subject: [PATCH] keep trying the next file approach

---
 src/backend/access/transam/xlog.c|  20 
 src/backend/access/transam/xlogarchive.c |  23 
 src/backend/postmaster/pgarch.c  | 195 ++-
 src/include/access/xlogdefs.h|   1 +
 src/include/postmaster/pgarch.h  |   4 +
 5 files changed, 216 insertions(+), 27 deletions(-)

diff --git a/src/backend/access/transam/xlog.c b/src/backend/access/transam/xlog.c
index 24165ab..49caa61 100644
--- a/src/backend/access/transam/xlog.c
+++ b/src/backend/access/transam/xlog.c
@@ -9483,6 +9483,16 @@ CreateCheckPoint(int flags)
 	RemoveOldXlogFiles(_logSegNo, RedoRecPtr, recptr);
 
 	/*
+	 * Force the archiver to perform a directory scan.
+	 *
+	 * Ordinarily, this should not be needed, but it seems like a good idea
+	 * to make sure we scan the archive_status directory every once in a
+	 * while to make sure we haven't left anything behind.  Calling it here
+	 * ensures we do a directory scan at least once per checkpoint.
+	 */
+	PgArchForceDirScan();
+
+	/*
 	 * Make more log segments if needed.  (Do this after recycling old log
 	 * segments, since that may supply some of the needed files.)
 	 */
@@ -9848,6 +9858,16 @@ CreateRestartPoint(int flags)
 	RemoveOldXlogFiles(_logSegNo, RedoRecPtr, endptr);
 
 	/*
+	 * Force the archiver to perform a directory scan.
+	 *
+	 * Ordinarily, this should not be needed, but it seems like a good idea
+	 * to make sure we scan the archive_status directory every once in a
+	 * while to make sure we haven't left anything behind.  Calling it here
+	 * ensures we do a directory scan at least once per restartpoint.
+	 */
+	PgArchForceDirScan();
+
+	/*
 	 * Make more log segments if needed.  (Do this after recycling old log
 	 * segments, since that may supply some of the needed files.)
 	 */
diff --git a/src/backend/access/transam/xlogarchive.c b/src/backend/access/transam/xlogarchive.c
index b9c19b2..44630e7 100644
--- a/src/backend/access/transam/xlogarchive.c
+++ b/src/backend/access/transam/xlogarchive.c
@@ -492,6 +492,29 @@ XLogArchiveNotify(const char *xlog, bool nudge)
 	

Re: Fix pkg-config file for static linking

2021-09-02 Thread Filip Gospodinov

On 02.09.21 13:07, Peter Eisentraut wrote:

On 20.07.21 22:04, Filip Gospodinov wrote:
Anyway, this issue is orthogonal to my original patch. I'm proposing 
to hardcode -lpgcommon and -lpgport in Libs.private instead. Patch is 
attached.


Makes sense.  I think we could do it without hardcoding those library 
names, as in the attached patch.  But it comes out to the same result 
AFAICT.


That is my impression too. Enumerating them in a variable would just 
lead to an indirection. Please let me know if you'd still prefer a 
solution without hardcoding.





Re: stat() vs ERROR_DELETE_PENDING, round N + 1

2021-09-02 Thread Thomas Munro
On Fri, Sep 3, 2021 at 12:44 AM Thomas Munro  wrote:
> NtFileCreate()

Erm, that's spelled NtCreateFile.  I see Michael mentioned this
before[1]; I don't think it's only available in kernel mode though,
the docs[2] say "This function is the user-mode equivalent to the
ZwCreateFile function", and other open source user space stuff is
using it.  It's explicitly internal and subject to change though,
hence my desire to avoid it.

[1] 
https://www.postgresql.org/message-id/flat/a9c76882-27c7-9c92-7843-21d5521b70a9%40postgrespro.ru
[2] 
https://docs.microsoft.com/en-us/windows/win32/api/winternl/nf-winternl-ntcreatefile




Re: corruption of WAL page header is never reported

2021-09-02 Thread Fujii Masao




On 2021/09/02 16:26, Kyotaro Horiguchi wrote:

I believe errmsg_buf is an interface for emitting error messages dedicated
to xlogreader, which doesn't have access to the elog facility, and
xlogreader doesn't (and shouldn't) expect non-xlogreader callback functions
to set the variable.  In that sense I don't think the originally proposed
patch is proper, for the reason that a non-xlogreader callback function may
set errmsg_buf.  This is what I meant by the word "modularity".

For that reason, in my second proposal I avoided calling
XLogReaderValidatePageHeader() at all while not in standby mode, because
calling the validator function in non-standby mode results in the
non-xlogreader function returning errmsg_buf.  Of course we could instead
always consume errmsg_buf in the function, but I don't like to shadow the
caller's task.


Understood. Thanks for clarifying this!


Does that make sense?


Yes, I'm fine with your latest patch.

Regards,

--
Fujii Masao
Advanced Computing Technology Center
Research and Development Headquarters
NTT DATA CORPORATION




Re: Fix typo in comments

2021-09-02 Thread Fujii Masao




On 2021/09/02 20:54, houzj.f...@fujitsu.com wrote:

Hi,

When reviewing other patches, I noticed two typos:


Thanks! Both fixes look good to me.
Barring any objection, I will commit the patch.

Regards,

--
Fujii Masao
Advanced Computing Technology Center
Research and Development Headquarters
NTT DATA CORPORATION




Re: stat() vs ERROR_DELETE_PENDING, round N + 1

2021-09-02 Thread Thomas Munro
On Thu, Sep 2, 2021 at 11:12 PM Tom Lane  wrote:
> Thomas Munro  writes:
> > I'm no expert, but not AFAICS.  We managed to delete the file while
> > some other backend had it open, which FILE_SHARE_DELETE allowed.  We
> > just can't open it or create a new file with the same name until it's
> > really gone (all handles closed).
>
> Right, but we shouldn't ever need to access such a file --- we couldn't do
> so on Unix, certainly.  So for the open() case, it's sufficient to return
> ENOENT, and the problem is only to make sure that that's what we return if
> the underlying error is ERROR_DELETE_PENDING.

Yeah.  The problem is that it still shows up in directory listings
AFAIK, so something like basebackup.c sees it, and even if it didn't,
it reads the directory, and then stats the files, and then opens the
files at different times.  The non-public API way to ask for the real
reason after such a failure would apparently be to call
NtFileCreate(), which can return STATUS_DELETE_PENDING.  I don't know
what complications that might involve, but I see now that we have code
that digs such non-public APIs out of ntdll.dll already (for long dead
OS versions only).  Hmm.

(Another thing you can't do is delete the directory that contains such
a file, which is a problem for DROP TABLESPACE and the reason I
developed the global barrier thing.)

> It's harder if the desire is to create a new file of the same name.
> I'm inclined to think that the best answer might be "if it hurts,
> don't do that".  We should not have such a case for ordinary relation
> files or WAL files, but maybe there are individual other cases where
> some redesign is indicated?

I guess GetNewRelFileNode()’s dilemma branch is an example; it'd
probably be nicer to users to treat a pending-deleted file as a
collision.

if (access(rpath, F_OK) == 0)
{
/* definite collision */
collides = true;
}
else
{
/*
 * Here we have a little bit of a dilemma: if errno is something
 * other than ENOENT, should we declare a collision and loop? In
 * practice it seems best to go ahead regardless of the errno.  If
 * there is a colliding file we will get an smgr failure when we
 * attempt to create the new relation file.
 */
collides = false;
}
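
For illustration, a hypothetical reworking of that branch (not from any
posted patch; is_delete_pending() is an assumed helper that would need one
of the NT-API tricks discussed in this thread):

#include <stdbool.h>
#include <unistd.h>

static bool
relfilenode_collides(const char *rpath)
{
    if (access(rpath, F_OK) == 0)
        return true;            /* definite collision */
#ifdef WIN32
    if (is_delete_pending(rpath))
        return true;            /* zombie file: loop and pick another OID */
#endif
    return false;               /* ENOENT etc.: assume the name is free */
}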




Re: Allow escape in application_name (was: [postgres_fdw] add local pid to fallback_application_name)

2021-09-02 Thread Fujii Masao




On 2021/09/02 18:27, kuroda.hay...@fujitsu.com wrote:

I added the following comments:

```diff
-   /* Use "postgres_fdw" as fallback_application_name. */
+   /*
+* Use pgfdw_application_name as application_name.
+*
+* Note that this check must be behind constructing generic 
options
+* because pgfdw_application_name has higher priority.
+*/
```


Thanks! What about updating the comments further as follows?

-
Use pgfdw_application_name as application_name if set.

PQconnectdbParams() processes the parameter arrays from start to end.
If any key word is repeated, the last value is used. Therefore note that
pgfdw_application_name must be added to the arrays after options of
ForeignServer are, so that it can override application_name set in
ForeignServer.
-
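
To make that concrete, a minimal standalone sketch of the last-value-wins
behavior (connection options are placeholders; illustration only):

#include <stdio.h>
#include <libpq-fe.h>

int
main(void)
{
    /* application_name is deliberately repeated; the later value wins */
    const char *keywords[] = {"dbname", "application_name",
                              "application_name", NULL};
    const char *values[]   = {"postgres", "from_foreign_server",
                              "from_pgfdw_guc", NULL};
    PGconn     *conn = PQconnectdbParams(keywords, values, 0);

    if (PQstatus(conn) == CONNECTION_OK)
        printf("server-side application_name is now from_pgfdw_guc\n");
    PQfinish(conn);
    return 0;
}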


Attached is the fixed version. 0002 will be rebased later.


Thanks for updating the patch!

+   }
+   /* Use "postgres_fdw" as fallback_application_name */

It's better to add new empty line between these two lines.

+-- Disconnect once because the value is used only when establishing connections
+DO $$
+   BEGIN
+   PERFORM postgres_fdw_disconnect_all();
+   END
+$$;

Why does a DO command need to be used here to execute
postgres_fdw_disconnect_all()? Instead, we can just execute
"SELECT 1 FROM postgres_fdw_disconnect_all();"?

For test coverage, it's better to test at least the following three cases?

(1) appname is set in neither GUC nor foreign server
   -> "postgres_fdw" set in fallback_application_name is used
(2) appname is set in foreign server, but not in GUC
   -> appname in foreign server is used
(3) appname is set both in GUC and foreign server
  -> appname in GUC is used

+SELECT FROM ft1 LIMIT 1;

"1" should be added just after SELECT in the above statement?
Because postgres_fdw regression test basically uses "SELECT 1 FROM ..."
in other places.

+   DefineCustomStringVariable("postgres_fdw.application_name",
+  "Sets the application name. This is used when connects to the remote server.",

What about simplifying this description as follows?

---
Sets the application name to be used on the remote server.
---

+   Configuration Parameters 
+  

The empty characters just after  and before  should be removed?


Regards,

--
Fujii Masao
Advanced Computing Technology Center
Research and Development Headquarters
NTT DATA CORPORATION




Re: Skipping logical replication transactions on subscriber side

2021-09-02 Thread Greg Nancarrow
On Mon, Aug 30, 2021 at 5:07 PM Masahiko Sawada  wrote:
>
> I've attached rebased patches. 0004 patch is not the scope of this
> patch. It's borrowed from another thread[1] to fix the assertion
> failure for newly added tests. Please review them.
>

Some initial comments for the v12-0003 patch:

(1) Patch comment
"This commit introduces another way to skip the transaction in question."

I think it should further explain: "This commit introduces another way
to skip the transaction in question, other than manually updating the
subscriber's database or using pg_replication_origin_advance()."


doc/src/sgml/logical-replication.sgml
(2)

Suggested minor update:

BEFORE:
+   transaction that conflicts with the existing data.  When a conflict produce
+   an error, it is shown in
pg_stat_subscription_errors
+   view as follows:
AFTER:
+   transaction that conflicts with the existing data.  When a conflict produces
+   an error, it is recorded in the
pg_stat_subscription_errors
+   view as follows:

(3)
+   found from those outputs (transaction ID 740 in the above case).
The transaction

Shouldn't it be transaction ID 716?

(4)
+   can be skipped by setting skip_xid to
the subscription

Is it better to say here ... "on the subscription" ?

(5)
Just skipping a transaction could make a subscriber inconsistent, right?

Would it be better as follows?

BEFORE:
+   In either way, those should be used as a last resort. They skip the whole
+   transaction including changes that may not violate any constraint and easily
+   make subscriber inconsistent if a user specifies the wrong transaction ID or
+   the position of origin.

AFTER:
+   Either way, those transaction skipping methods should be used as a
last resort.
+   They skip the whole transaction, including changes that may not violate any
+   constraint.  They may easily make the subscriber inconsistent,
especially if a
+   user specifies the wrong transaction ID or the position of origin.

(6)
The grammar is not great in the following description, so here's a
suggested improvement:

BEFORE:
+  incoming change or by skipping the whole transaction.  This option
+  specifies transaction ID that logical replication worker skips to
+  apply.  The logical replication worker skips all data modification

AFTER:
+  incoming changes or by skipping the whole transaction.  This option
+  specifies the ID of the transaction whose application is to
be skipped
+  by the logical replication worker.  The logical replication worker
+  skips all data modification


src/backend/postmaster/pgstat.c
(7)
BEFORE:
+ * Tell the collector about clear the error of subscription.
AFTER:
+ * Tell the collector to clear the subscription error.


src/backend/replication/logical/worker.c
(8)
+ * subscription is invalidated and* MySubscription->skipxid gets
changed or reset.

There is a "*" after "and".

(9)
Do these lines really need to be moved up?

+ /* extract XID of the top-level transaction */
+ stream_xid = logicalrep_read_stream_start(s, &first_segment);
+

src/backend/postmaster/pgstat.c
(10)

+ bool m_clear; /* clear all fields except for last_failure and
+ * last_errmsg */

I think it should say: clear all fields except for last_failure,
last_errmsg and stat_reset_timestamp.


Regards,
Greg Nancarrow
Fujitsu Australia




Fix typo in comments

2021-09-02 Thread houzj.f...@fujitsu.com
Hi,

When reviewing other patches, I noticed two typos:

1.
src/backend/parser/gram.y
ALTER TABLE  ALTER [COLUMN]  RESET ( column_parameter = value [, 
... ] )

RESET cannot specify value.

2.
src/backend/utils/adt/xid8funcs.c
* Same as pg_current_xact_if_assigned() but doesn't assign a new xid if there

pg_current_xact_if_assigned() should be pg_current_xact_id()

Best regards,
Hou zhijie



0001-fix-typo-in-comments.patch
Description: 0001-fix-typo-in-comments.patch


Re: Postgres Win32 build broken?

2021-09-02 Thread Andrew Dunstan


On 9/1/21 8:01 PM, Ranier Vilela wrote:
> On Wed, Sep 1, 2021 at 19:49, Andrew Dunstan wrote:
>
>
> On 9/1/21 4:00 PM, Andrew Dunstan wrote:
> > On 8/31/21 9:52 PM, Michael Paquier wrote:
> >> On Tue, Aug 31, 2021 at 07:49:40PM -0300, Ranier Vilela wrote:
> >>> I'm not a perl specialist and it seems to me that the Win32 build is
> >>> broken.
> >>> The Win32 build is still important because of the 32-bit clients still
> >>> in use.
> >>> I'm investigating the problem.
> >> Being able to see the command you are using for build.pl, your
> >> buildenv.pl and/or config.pl, as well as your build dependencies
> >> should help to know what's wrong.
> >>
> >> MSVC builds are tested by various buildfarm members on a daily basis,
> >> and nothing is red.  I also have an x86 and x64 configuration with
> >> VS2015 that prove to work as of HEAD at de1d4fe, FWIW.  Now, by
> >> experience, one could say that N Windows PG developers finish with at
> >> least (N+1) different environments.  Basically Simon Riggs's theorem
> >> applied to Windows development..
> >
> >
> > I am seeing the same result as Ranier using VS2017 and VS 2019.
> >
> >
>
> But not with VS2013. If you need to build 32 bit client libraries,
> using
> an older VS release is probably your best bet.
>
> Thanks Andrew, but I finally got a workaround for the problem.
> set MSBFLAGS=/p:Platform="Win32"
>
> Now Postgres builds fine in 32 bits with the latest msvc (2019).
> Is it worth documenting this?
>
>


I think we should be able to detect this and do it automatically in
src/tools/msvc/build.pl, or possibly in the project files.


cheers


andrew

--
Andrew Dunstan
EDB: https://www.enterprisedb.com





Re: Added schema level support for publication.

2021-09-02 Thread Amit Kapila
On Thu, Sep 2, 2021 at 11:58 AM vignesh C  wrote:
>

Below are a few comments on v23. If you have already addressed anything
in v24, then ignore it.

1. The commit message says: A new system table "pg_publication_schema"
has been added, to maintain the schemas that the user wants to publish
through the publication.". The alternative name for this system table
could be "pg_publication_namespace". The reason why this alternative
comes to my mind is that the current system table to store schema
information is named pg_namespace. So shouldn't we be consistent here?
What do others think about this?

2. In function check_publication_add_schema(), the primary error
message should be "cannot add schema \"%s\" to publication". See
check_publication_add_relation() for similar error messages.

3.
+ObjectAddress
+publication_add_schema(Oid pubid, Oid schemaoid, bool if_not_exists)

Isn't it better to use 'schemaid' so that it is consistent with 'pubid'?

4.
ConvertPubObjSpecListToOidList()
{
..
+ schemaoid = linitial_oid(search_path);
+ nspname = get_namespace_name(schemaoid);
+ if (nspname == NULL) /* recently-deleted namespace? */
+ ereport(ERROR,
+ errcode(ERRCODE_UNDEFINED_SCHEMA),
+ errmsg("no schema has been selected"));
+
+ list_free(search_path);
..
}

You can free the memory for 'nspname' as that is not used afterward.

5.
+ schemaRels = GetSchemaPublicationRelations(schemaoid, PUBLICATION_PART_ALL);
+
+ /* Invalidate relcache so that publication info is rebuilt. */
+ InvalidatePublicationRels(schemaRels);

It is better to write this comment above
GetSchemaPublicationRelations, something like below:

+ /* Invalidate relcache so that publication info is rebuilt. */
+ schemaRels = GetSchemaPublicationRelations(schemaoid, PUBLICATION_PART_ALL);
+ InvalidatePublicationRels(schemaRels);

6.
+static List *
+GetPubPartitionOptionRelations(List *result, PublicationPartOpt pub_partopt,
+Oid relid)

I think it is better to name this function as
GetPublicationPartOptRelations as that way it will be more consistent
with existing functions and structure name PublicationPartOpt.

7. All the callers of PublicationAddSchemas() have a superuser check,
then why there is again a check of owner/superuser in that function?

8.
+/*
+ * Gets the list of FOR ALL TABLES IN SCHEMA publication oids associated with a
+ * specified schema oid
+ */
+List *
+GetSchemaPublications(Oid schemaid)

I find it a bit difficult to read this comment. Can we omit "FOR ALL
TABLES IN SCHEMA" from this comment?

9. In the doc patch
(v23-0003-Tests-and-documentation-for-schema-level-support), I see
below line:
   
-   To add a table to a publication, the invoking user must have ownership
-   rights on the table.  The FOR ALL TABLES clause requires
-   the invoking user to be a superuser.
+   To add a table/schema to a publication, the invoking user must have
+   ownership rights on the table/schema.  The FOR ALL TABLES
+   and FOR ALL TABLES IN SCHEMA clause requires the invoking
+   user to be a superuser.

Is it correct to specify the schema in the first line? AFAIU, all
forms of schema addition require superuser privilege.


-- 
With Regards,
Amit Kapila.




RE: Skipping logical replication transactions on subscriber side

2021-09-02 Thread houzj.f...@fujitsu.com
From Mon, Aug 30, 2021 3:07 PM Masahiko Sawada  wrote:
> I've attached rebased patches. 0004 patch is not the scope of this 
> patch. It's borrowed from another thread[1] to fix the assertion 
> failure for newly added tests. Please review them.

Hi,

I reviewed the 0002 patch and have a suggestion for it.

+   if (IsSet(opts.specified_opts, 
SUBOPT_SYNCHRONOUS_COMMIT))
+   {
+   
values[Anum_pg_subscription_subsynccommit - 1] =
+   CStringGetTextDatum("off");
+   
replaces[Anum_pg_subscription_subsynccommit - 1] = true;
+   }

Currently, the patch sets the default value outside of parse_subscription_options(),
but I think it might be more standard to set the value in
parse_subscription_options(). Like:

if (!is_reset)
{
...
+   }
+   else
+   opts->synchronous_commit = "off";

And then, we can set the value like:


values[Anum_pg_subscription_subsynccommit - 1] =

CStringGetTextDatum(opts.synchronous_commit);


Besides, instead of adding a switch case like the following:
+   case ALTER_SUBSCRIPTION_RESET_OPTIONS:
+   {

we can add a bool flag (isReset) in AlterSubscriptionStmt and check the flag
when invoking parse_subscription_options(). In this approach, the code can be
shorter.
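For illustration, the parse node might carry the flag like this (hypothetical sketch; the real AlterSubscriptionStmt has more fields, abbreviated here):

typedef struct AlterSubscriptionStmt
{
    NodeTag     type;
    char       *subname;        /* name of the subscription */
    List       *options;        /* list of DefElem nodes */
    bool        isReset;        /* true for ALTER SUBSCRIPTION ... RESET (...) */
} AlterSubscriptionStmt;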

Attached is a diff file based on v12-0002 which changes the code as suggested
above.

Best regards,
Hou zj


0001-diff-for-0002_patch
Description: 0001-diff-for-0002_patch


Re: [BUG]Update Toast data failure in logical replication

2021-09-02 Thread Ajin Cherian
On Wed, Aug 11, 2021 at 10:45 PM Dilip Kumar  wrote:
>
> Yeah we can avoid that by detecting any toasted replica identity key
> in HeapDetermineModifiedColumns, check the attached patch.
>

The patch applies cleanly, all tests pass, I tried out a few toast
combination tests and they seem to be working fine.
No review comments, the patch looks good to me.

regards,
Ajin Cherian
Fujitsu Australia




Re: pg_receivewal: remove extra conn = NULL; in StreamLog

2021-09-02 Thread Daniel Gustafsson
> On 1 Sep 2021, at 10:58, Bharath Rupireddy 
>  wrote:
> 
> On Sun, Aug 29, 2021 at 1:27 AM Daniel Gustafsson  wrote:
>> 
>>> On 28 Aug 2021, at 14:10, Bharath Rupireddy 
>>>  wrote:
>> 
>>> It seems there's a redundant assignment statement conn = NULL in
>>> pg_receivewal's StreamLog function. Attaching a tiny patch herewith.
>>> Thoughts?
>> 
>> Agreed, while harmless this is superfluous since conn is already set to NULL
>> after the PQfinish call a few lines up (which was added in a4205fa00d526c3).
>> Unless there are objections I’ll apply this tomorrow or Monday.
> 
> Thanks for picking this up. I added this to CF to not lose it in the
> wild - https://commitfest.postgresql.org/34/3317/

Pushed to master, and entry closed. Thanks.

--
Daniel Gustafsson   https://vmware.com/





Re: Online verification of checksums

2021-09-02 Thread Daniel Gustafsson
> On 9 Jul 2021, at 22:00, Ibrar Ahmed  wrote:

> I am changing the status to "Waiting on Author" based on the latest comments 
> of @David Steele 
> and secondly the patch does not apply cleanly.

This patch hasn't moved since being marked as WoA in the last CF and still
doesn't apply. Unless there is a new version brewing, it seems apt to close
this as RwF and await a new entry in a future CF.

--
Daniel Gustafsson   https://vmware.com/





Re: stat() vs ERROR_DELETE_PENDING, round N + 1

2021-09-02 Thread Tom Lane
Thomas Munro  writes:
> On Thu, Sep 2, 2021 at 10:31 PM Tom Lane  wrote:
>> That seems quite horrid :-(.  But if it works, doesn't that mean that
>> somewhere we are opening a problematic file without the correct
>> sharing flags?

> I'm no expert, but not AFAICS.  We managed to delete the file while
> some other backend had it open, which FILE_SHARE_DELETE allowed.  We
> just can't open it or create a new file with the same name until it's
> really gone (all handles closed).

Right, but we shouldn't ever need to access such a file --- we couldn't do
so on Unix, certainly.  So for the open() case, it's sufficient to return
ENOENT, and the problem is only to make sure that that's what we return if
the underlying error is ERROR_DELETE_PENDING.
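For illustration, that mapping could look like the following minimal sketch (not the actual src/port/open.c code; whether the Win32 layer surfaces ERROR_DELETE_PENDING at all, rather than folding it into ERROR_ACCESS_DENIED, is the open question in this thread):

#include <windows.h>
#include <errno.h>

/* Translate the interesting GetLastError() codes into errno values. */
static void
map_win32_open_error(void)
{
    switch (GetLastError())
    {
        case ERROR_FILE_NOT_FOUND:
        case ERROR_PATH_NOT_FOUND:
        case ERROR_DELETE_PENDING: /* file is as good as gone: say ENOENT */
            errno = ENOENT;
            break;
        case ERROR_ACCESS_DENIED:
            errno = EACCES;
            break;
        default:
            errno = EINVAL;     /* simplification; real code maps more codes */
            break;
    }
}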

It's harder if the desire is to create a new file of the same name.
I'm inclined to think that the best answer might be "if it hurts,
don't do that".  We should not have such a case for ordinary relation
files or WAL files, but maybe there are individual other cases where
some redesign is indicated?

regards, tom lane




Re: [PATCH] test/ssl: rework the sslfiles Makefile target

2021-09-02 Thread Andrew Dunstan


On 9/1/21 8:09 PM, Jacob Champion wrote:
> On Fri, 2021-08-27 at 15:02 +0900, Michael Paquier wrote:
>> On Fri, Aug 13, 2021 at 12:08:16AM +, Jacob Champion wrote:
>>> If _that's_ the case, then our build system is holding all of us
>>> hostage. Which is frustrating to me. Should I shift focus to help with
>>> that, first?
>> Fresh ideas in this area are welcome, yes.
> Since the sslfiles target is its own little island in the dependency
> graph (it doesn't need anything from Makefile.global), would it be
> acceptable to just move it to a standalone sslfiles.mk that the
> Makefile defers to? E.g.
>
> sslfiles:
> $(MAKE) -f sslfiles.mk
> Then we wouldn't have to touch Makefile.global at all, because
> sslfiles.mk wouldn't need to include it. This also reduces .NOTPARALLEL
> pollution as a bonus.
>

I had the same thought yesterday, so I like the idea :-)


cheers


andrew


--
Andrew Dunstan
EDB: https://www.enterprisedb.com





Re: Fix pkg-config file for static linking

2021-09-02 Thread Peter Eisentraut

On 20.07.21 22:04, Filip Gospodinov wrote:
> Anyway, this issue is orthogonal to my original patch. I'm proposing to
> hardcode -lpgcommon and -lpgport in Libs.private instead. Patch is
> attached.


Makes sense.  I think we could do it without hardcoding those library 
names, as in the attached patch.  But it comes out to the same result 
AFAICT.
diff --git a/src/Makefile.shlib b/src/Makefile.shlib
index 29a7f6d38c..4b2a62fa14 100644
--- a/src/Makefile.shlib
+++ b/src/Makefile.shlib
@@ -400,7 +400,7 @@ endif # PORTNAME == cygwin || PORTNAME == win32
 # those that point inside the build or source tree.  Use sort to
 # remove duplicates.  Also record the -l flags necessary for static
 # linking, but not those already covered by Requires.private.
-   echo 'Libs.private: $(sort $(filter-out -L.% -L$(top_srcdir)/%,$(filter -L%,$(LDFLAGS) $(SHLIB_LINK)))) $(filter-out $(PKG_CONFIG_REQUIRES_PRIVATE:lib%=-l%),$(filter -l%,$(SHLIB_LINK)))' >>$@
+   echo 'Libs.private: $(sort $(filter-out -L.% -L$(top_srcdir)/%,$(filter -L%,$(LDFLAGS) $(SHLIB_LINK)))) $(filter-out $(PKG_CONFIG_REQUIRES_PRIVATE:lib%=-l%),$(filter -l%,$(SHLIB_LINK_INTERNAL) $(SHLIB_LINK)))' >>$@
 
 
 ##


Re: stat() vs ERROR_DELETE_PENDING, round N + 1

2021-09-02 Thread Thomas Munro
On Thu, Sep 2, 2021 at 10:31 PM Tom Lane  wrote:
> Thomas Munro  writes:
> > A disruptive solution that works in my tests: we could reuse the
> > global barrier proposed in CF #2962.  If you see EACCES, ask every
> > backend to close all vfds at their next CFI() and wait for them all to
> > finish, and then retry.  If you get EACCES again it really means
> > EACCES, but you'll very probably get ENOENT.
>
> That seems quite horrid :-(.  But if it works, doesn't that mean that
> somewhere we are opening a problematic file without the correct
> sharing flags?

I'm no expert, but not AFAICS.  We managed to delete the file while
some other backend had it open, which FILE_SHARE_DELETE allowed.  We
just can't open it or create a new file with the same name until it's
really gone (all handles closed).
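For illustration, the whole sequence fits in a few Win32 calls (self-contained sketch; the file name is made up):

#include <windows.h>
#include <stdio.h>

int
main(void)
{
    /* "Backend A" holds the file open with full sharing, as PostgreSQL does. */
    HANDLE h1 = CreateFileA("victim.tmp", GENERIC_READ,
                            FILE_SHARE_READ | FILE_SHARE_WRITE | FILE_SHARE_DELETE,
                            NULL, OPEN_ALWAYS, FILE_ATTRIBUTE_NORMAL, NULL);

    /*
     * "Backend B" unlinks it: this succeeds thanks to FILE_SHARE_DELETE, but
     * the file only becomes delete-pending; the name is not released yet.
     */
    DeleteFileA("victim.tmp");

    /*
     * Reopening (or creating a new file with the same name) now fails with
     * ERROR_ACCESS_DENIED (5) until h1 is closed -- the EACCES seen above.
     */
    HANDLE h2 = CreateFileA("victim.tmp", GENERIC_READ,
                            FILE_SHARE_READ | FILE_SHARE_WRITE | FILE_SHARE_DELETE,
                            NULL, OPEN_EXISTING, FILE_ATTRIBUTE_NORMAL, NULL);

    if (h2 == INVALID_HANDLE_VALUE)
        printf("reopen failed: error %lu\n", GetLastError());

    CloseHandle(h1);            /* only now is the name finally released */
    return 0;
}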




Re: public schema default ACL

2021-09-02 Thread Peter Eisentraut

On 30.06.21 03:37, Noah Misch wrote:
> On Sat, Mar 27, 2021 at 11:41:07AM +0100, Laurenz Albe wrote:
>> On Sat, 2021-03-27 at 00:50 -0700, Noah Misch wrote:
>>> On Sat, Feb 13, 2021 at 04:56:29AM -0800, Noah Misch wrote:
>>>> I'm attaching the patch for $SUBJECT, which applies atop the four patches
>>>> from the two other threads below.  For convenience of testing, I've
>>>> included a rollup patch, equivalent to applying all five patches.
>>>
>>> I committed prerequisites from one thread, so here's a rebased rollup patch.
>>
>> I am happy to see this problem tackled!
>
> Rebased.  I've pushed all prerequisites, so there's no longer a distinct
> rollup patch.


I think this patch represents the consensus.

The documentation looks okay.  Some places still refer to PostgreSQL 13, 
which should now be changed to 14.


I tried a couple of upgrade scenarios and it appeared to do the right thing.

This patch is actually two separate changes: First, change the owner of 
the public schema to "pg_database_owner"; second, change the default 
privileges set on the public schema by initdb.  I was a bit surprised 
that the former hadn't already been done in PG14.  In any case, if there 
is still any doubt about the latter part, the former can surely go ahead 
separately if needed.






Re: stat() vs ERROR_DELETE_PENDING, round N + 1

2021-09-02 Thread Tom Lane
Thomas Munro  writes:
> A disruptive solution that works in my tests: we could reuse the
> global barrier proposed in CF #2962.  If you see EACCES, ask every
> backend to close all vfds at their next CFI() and wait for them all to
> finish, and then retry.  If you get EACCES again it really means
> EACCES, but you'll very probably get ENOENT.

That seems quite horrid :-(.  But if it works, doesn't that mean that
somewhere we are opening a problematic file without the correct
sharing flags?

regards, tom lane




Re: stat() vs ERROR_DELETE_PENDING, round N + 1

2021-09-02 Thread Thomas Munro
On Thu, Sep 2, 2021 at 10:10 AM Thomas Munro  wrote:
> Perhaps we need some combination of the old way (that apparently knew
> how to detect pending deletes) and the new way (that knows about large
> files)?

I tried that, but as far as I can tell, the old approach didn't really
work either :-(

A disruptive solution that works in my tests: we could reuse the
global barrier proposed in CF #2962.  If you see EACCES, ask every
backend to close all vfds at their next CFI() and wait for them all to
finish, and then retry.  If you get EACCES again it really means
EACCES, but you'll very probably get ENOENT.
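A sketch of that retry, assuming the barrier kind from CF #2962 (EmitProcSignalBarrier() and WaitForProcSignalBarrier() are the existing ProcSignalBarrier APIs; PROCSIGNAL_BARRIER_SMGRRELEASE and the wrapper itself are hypothetical):

#include "postgres.h"
#include "storage/procsignal.h"

#include <fcntl.h>

static int
open_retry_on_eacces(const char *path, int flags, mode_t mode)
{
    int         fd = open(path, flags, mode);

    if (fd < 0 && errno == EACCES)
    {
        /* Ask every backend to close its vfds at the next CFI()... */
        uint64      gen = EmitProcSignalBarrier(PROCSIGNAL_BARRIER_SMGRRELEASE);

        /* ...wait for all of them to finish... */
        WaitForProcSignalBarrier(gen);

        /* ...and retry: a second EACCES now really means EACCES. */
        fd = open(path, flags, mode);
    }
    return fd;
}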

The cheapest solution would be to assume EACCES really means ENOENT,
but that seems unacceptably incorrect.

I suspect it might be possible to use underdocumented/unstable NtXXX()
interfaces to get at the information, but I don't know much about
that.

Is there another way that is cheap, correct and documented?




RE: Allow escape in application_name (was: [postgres_fdw] add local pid to fallback_application_name)

2021-09-02 Thread kuroda.hay...@fujitsu.com
Dear Fujii-san,

Thank you for reviewing!

> This GUC parameter should be set after the options of foreign server
> are set so that its setting can override the server-level ones.
> Isn't it better to comment this?

I added the following comments:

```diff
-   /* Use "postgres_fdw" as fallback_application_name. */
+   /*
+    * Use pgfdw_application_name as application_name.
+    *
+    * Note that this check must be behind constructing generic options
+    * because pgfdw_application_name has higher priority.
+    */
```

> Do we really need this check hook? Even without that, any non-ASCII characters
> in application_name would be replaced with "?" in the foreign PostgreSQL 
> server
> when it's passed to that.
> 
> On the other hand, if it's really necessary, application_name set in foreign
> server object also needs to be processed in the same way.

I added the check_hook because I wanted to make sure that the input for the
parse function contains only ASCII characters. But at this phase we don't
have such a function, so I removed it.
(Actually I'm not sure it is really needed; I will check before the next phase.)
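For reference, such a hook might look like the following sketch (the function name is hypothetical; the check-hook signature and GUC_check_errdetail() are the real GUC interfaces):

#include "postgres.h"
#include "utils/guc.h"

static bool
check_pgfdw_application_name(char **newval, void **extra, GucSource source)
{
    const char *p;

    if (*newval == NULL)
        return true;

    for (p = *newval; *p; p++)
    {
        /* reject any byte outside printable ASCII */
        if (*p < 32 || *p > 126)
        {
            GUC_check_errdetail("application name must contain only printable ASCII characters.");
            return false;
        }
    }
    return true;
}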

> Why is GUC_IS_NAME flag necessary?

I thought about it again and noticed that even if an extremely long string is
specified, it will be truncated on the server side. This check is duplicated,
so I removed it. Maybe such a check is the user's responsibility.

> postgres_fdw.application_name overrides application_name set in foreign server
> object.
> Shouldn't this information be documented?

I added descriptions to postgres_fdw.sgml.

> Isn't it better to have the regression test for this feature?

+1, added. The test SETs the parameter and checks pg_stat_activity.

Attached is the fixed version. 0002 will be rebased later.

Best Regards,
Hayato Kuroda
FUJITSU LIMITED


v06_0001_add_application_name_GUC.patch
Description: v06_0001_add_application_name_GUC.patch


Re: Column Filtering in Logical Replication

2021-09-02 Thread Alvaro Herrera
I think the WITH RECURSIVE query would be simpler and more performant if it
used pg_partition_tree and pg_partition_root.


-- 
Álvaro Herrera  Valdivia, Chile  —  https://www.EnterpriseDB.com/
"Porque Kim no hacía nada, pero, eso sí,
con extraordinario éxito" ("Kim", Kipling)




RE: Added schema level support for publication.

2021-09-02 Thread houzj.f...@fujitsu.com
From Wed, Sep 1, 2021 2:36 PM Peter Smith  wrote:
> Schema objects are not part of the publication. Currently only TABLES are in
> publications, so I thought that \dRp+ output would just be the list of "Tables"
> in the publication. Schemas would not even be displayed at all (except in the
> table name).

I think one use case of a schema level publication is that it can automatically
publish new tables created in the schema (same as an ALL TABLES publication). So,
IMO, \dRp+ should output schema level publications separately to make the user
aware of them.

Best regards,
Hou zj


Re: Column Filtering in Logical Replication

2021-09-02 Thread Alvaro Herrera
On 2021-Sep-02, Rahila Syed wrote:

> After thinking about this, I think it is best to remove the entire table
> from the publication if a column specified in the column filter is dropped
> from the table.

Hmm, I think it would be cleanest to give responsibility to the user: if
the column to be dropped is in the filter, then raise an error, aborting
the drop.  Then it is up to them to figure out what to do.




-- 
Álvaro Herrera  Valdivia, Chile  —  https://www.EnterpriseDB.com/
"El destino baraja y nosotros jugamos" (A. Schopenhauer)




RE: Skipping logical replication transactions on subscriber side

2021-09-02 Thread houzj.f...@fujitsu.com
From Mon, Aug 30, 2021 3:07 PM Masahiko Sawada  wrote:
> I've attached rebased patches. 0004 patch is not the scope of this patch. It's
> borrowed from another thread[1] to fix the assertion failure for newly added
> tests. Please review them.

Hi,

I reviewed the v12-0001 patch, here are some comments:

1)
--- a/src/backend/utils/error/elog.c
+++ b/src/backend/utils/error/elog.c
@@ -1441,7 +1441,6 @@ getinternalerrposition(void)
return edata->internalpos;
 }
 
-

This seems to be an unintended change in elog.c.

2)

+   TupleDescInitEntry(tupdesc, (AttrNumber) 10, "stats_reset",
+  TIMESTAMPTZOID, -1, 0);

The document doesn't mention the column "stats_reset".

3)

+typedef struct PgStat_StatSubErrEntry
+{
+   Oid         subrelid;       /* InvalidOid if the apply worker, otherwise
+                                * the table sync worker. hash table key. */

From the comments on subrelid, I think one subscription has only one apply
worker error entry, right? If so, can we move the apply error entry to
PgStat_StatSubEntry? With that approach, we don't need to build an inner
hash table when there are no table sync error entries. A sketch of that
layout follows.
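For illustration (hypothetical layout; field names follow the quoted patch):

typedef struct PgStat_StatSubEntry
{
    Oid         subid;          /* subscription OID, hash table key */

    /* the single apply worker error (the subrelid == InvalidOid case) */
    PgStat_StatSubErrEntry apply_error;

    /* created lazily, keyed by subrelid, only for table sync workers */
    HTAB       *sync_errors;
} PgStat_StatSubEntry;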

4)
Is it possible to add some test cases for the deletion of subscription error
entries?


Best regards,
Hou zj



Re: pg_receivewal starting position

2021-09-02 Thread Michael Paquier
On Thu, Sep 02, 2021 at 10:08:26AM +0200, Ronan Dunklau wrote:
> I could see the use for sending active_pid for use within pg_basebackup: at 
> least we could fail early if the slot is already in use. But at the same
> time, maybe it won't be in use anymore once we need it.

There is no real concurrency protection with this design.  You could
read that the slot is not active during READ_REPLICATION_SLOT just to
find out later, while pg_basebackup is streaming WAL, that it became
in use in-between.  And the backend-side protections would kick in at
this stage.

Hmm.  The logic doing the decision-making with pg_receivewal may
become more tricky when it comes to pg_replication_slots.wal_status,
max_slot_wal_keep_size and pg_replication_slots.safe_wal_size.  The
number of cases we'd like to consider directly impacts the amount of
data sent through READ_REPLICATION_SLOT.  That's not really different
from deciding on a failure, a success or a retry with active_pid at an
earlier or a later stage of a base backup.  pg_receivewal, on the
contrary, can just rely on what the backend tells it when it begins
streaming.  So I'd prefer keeping things simple and limiting the number
of fields to a minimum for this command.
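For illustration, a minimal client-side fetch along those lines could look like this sketch (READ_REPLICATION_SLOT is the command proposed in this thread; the result layout, with restart_lsn in the second column, is an assumption rather than the final patch):

#include "postgres_fe.h"
#include "access/xlogdefs.h"
#include "libpq-fe.h"
#include "pqexpbuffer.h"

static bool
GetSlotInformation(PGconn *conn, const char *slot_name,
                   XLogRecPtr *restart_lsn)
{
    PQExpBuffer query = createPQExpBuffer();
    PGresult   *res;
    uint32      hi,
                lo;
    bool        ok = false;

    appendPQExpBuffer(query, "READ_REPLICATION_SLOT %s", slot_name);
    res = PQexec(conn, query->data);

    /* one row expected; a NULL restart_lsn means "nothing to resume from" */
    if (PQresultStatus(res) == PGRES_TUPLES_OK &&
        PQntuples(res) == 1 &&
        !PQgetisnull(res, 0, 1) &&
        sscanf(PQgetvalue(res, 0, 1), "%X/%X", &hi, &lo) == 2)
    {
        *restart_lsn = ((uint64) hi << 32) | lo;
        ok = true;
    }

    PQclear(res);
    destroyPQExpBuffer(query);
    return ok;
}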
--
Michael


signature.asc
Description: PGP signature


Re: pg_receivewal starting position

2021-09-02 Thread Ronan Dunklau
Le jeudi 2 septembre 2021, 09:28:29 CEST Michael Paquier a écrit :
> On Thu, Sep 02, 2021 at 02:45:54PM +0900, Kyotaro Horiguchi wrote:
> > At Wed, 1 Sep 2021 10:30:05 +0530, Bharath Rupireddy
> >  wrote in If I read the patch
> > correctly the situation above is warned by the following message then
> > continue to the next step giving up to identify start position from slot
> > data.
> 
> Better to fall back to the past behavior if attempting to use a
> pg_receivewal >= 15 with a PG instance older than 14.
> 
> >> "server does not support fetching the slot's position, resuming from the
> >> current server position instead"
> > (The message looks a bit too long, though..)
> 
> Agreed.  Falling back to a warning is not the best answer we can have
> here, as there could be various failure types, and for some of them a
> hard failure is warranted:
> - Failure in the backend while running READ_REPLICATION_SLOT.  This
> should imply a hard failure, no?
> - Slot that does not exist.  In this case, we could fall back to the
> current write position of the server by default if the slot information
> cannot be retrieved.
> Something that's disturbing me in patch 0002 is that we would ignore
> the results of GetSlotInformation() if any error happens, even if
> there is a problem in the backend, like an OOM.  We should be careful
> about the semantics here.

Ok!

> 
> > However, if the operator doesn't know the server is old, pg_receivewal
> > starts streaming from unexpected position, which is a kind of
> > disaster. So I'm inclined to agree with Bharath, but rather I imagine
> > an option to explicitly specify how to determine the start position.
> > 
> > --start-source=[server,wal,slot]  specify starting-LSN source, default is
> > 
> >  trying all of them in the order of wal, slot, server.
> > 
> > I don't think the option needs to accept multiple values at once.
> 
> What is the difference between "wal" and "server"?  "wal" stands for
> the start position of the set of files stored in the location
> directory, and "server" is the location that we'd receive from the
> server?  I don't think that we need that because, when using a slot,
> we know that we can rely on the LSN that the slot retains for
> pg_receivewal as that should be the same point as what has been
> streamed last.  Could there be an argument instead for changing the
> default and relying on the slot information rather than scanning the
> local WAL archive path for the start point when using --slot?  When
> using pg_receivewal as a service, relying on a scan of the WAL archive
> directory if there is no slot and fallback to an invalid LSN if there
> is nothing is fine by me, but I think that just relying on the slot
> information is saner as the backend makes sure that nothing is
> missing.  That's also more useful when streaming changes from a single
> slot from multiple locations (stream to location 1 with a slot, stop
> pg_receivewal, stream to location 2 that completes 1 with the same
> slot).

One benefit I see in first trying to get it from the local WAL files is that
we may end up in a state where WAL has been flushed to disk but we couldn't
advance the replication slot. In that case it is better to resume from the
point on disk. Maybe taking the max(slot_lsn, local_file_lsn) would work best
for the use case you're describing; a sketch follows.
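For illustration (Max() and XLogRecPtrIsInvalid() are the usual PostgreSQL macros; the function itself is hypothetical):

#include "postgres_fe.h"
#include "access/xlogdefs.h"

static XLogRecPtr
choose_start_lsn(XLogRecPtr slot_lsn, XLogRecPtr local_file_lsn)
{
    if (XLogRecPtrIsInvalid(slot_lsn))
        return local_file_lsn;  /* no slot information: trust local files */
    if (XLogRecPtrIsInvalid(local_file_lsn))
        return slot_lsn;        /* empty directory: trust the slot */

    /* flushed-but-unacknowledged segments win over the slot position */
    return Max(slot_lsn, local_file_lsn);
}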

> 
> +pg_log_error("Slot \"%s\" is not a physical replication slot",
> + replication_slot);
> In 0003, the format of this error is not really project-like.
> Something like this would perhaps be more suitable:
> "cannot use the slot provided, physical slot expected."
> 
> I am not really convinced about the need to get the active state
> and the PID used in the backend when fetching the slot data,
> particularly if that's just for some frontend-side checks.  The
> backend has safeguards already for all that.

I could see the use for sending active_pid for use within pg_basebackup: at 
least we could fail early if the slot is already in use. But at the same time, 
maybe it won't be in use anymore once we need it.

> 
> While looking at that, I have applied de1d4fe to add
> PostgresNode::command_fails_like(), coming from 0003, and put my hands
> on 0001 as per the attached, as the starting point.  That basically
> comes down to all the points raised upthread, plus some tweaks for
> things I bumped into to get the semantics of the command to what looks
> like the right shape.

Thanks, I was about to send a new patchset with basically the same thing. It
would be nice to know when we are working on the same thing concurrently in
the future, to avoid duplicated effort. I'll rebase and send the updated
versions of patches 0002 and 0003 of my original proposal once we reach an
agreement over the behaviour / options of pg_receivewal.

Also, considering the number of different fields to be filled in by the
GetSlotInformation function, my local branch groups them into a dedicated
struct.

Re: Column Filtering in Logical Replication

2021-09-02 Thread Peter Smith
On Thu, Sep 2, 2021 at 7:21 AM Rahila Syed  wrote:
>
...
>
> Also added some more tests. Please find attached a rebased and updated patch.

I fetched and applied the v4 patch.

It applied cleanly, and the build and make check was OK.

But I encountered some errors running the TAP subscription tests, as follows:

...
t/018_stream_subxact_abort.pl .. ok
t/019_stream_subxact_ddl_abort.pl .. ok
t/020_messages.pl .. ok
t/021_column_filter.pl . 1/9
#   Failed test 'insert on column c is not replicated'
#   at t/021_column_filter.pl line 126.
#  got: ''
# expected: '1|abc'

#   Failed test 'update on column c is not replicated'
#   at t/021_column_filter.pl line 130.
#  got: ''
# expected: '1|abc'
# Looks like you failed 2 tests of 9.
t/021_column_filter.pl . Dubious, test returned 2 (wstat 512, 0x200)
Failed 2/9 subtests
t/021_twophase.pl .. ok
t/022_twophase_cascade.pl .. ok
t/023_twophase_stream.pl ... ok
t/024_add_drop_pub.pl .. ok
t/100_bugs.pl .. ok

Test Summary Report
---
t/021_column_filter.pl   (Wstat: 512 Tests: 9 Failed: 2)
  Failed tests:  6-7
  Non-zero exit status: 2
Files=26, Tests=263, 192 wallclock secs ( 0.57 usr  0.09 sys + 110.17
cusr 25.45 csys = 136.28 CPU)
Result: FAIL
make: *** [check] Error 1

--
Kind Regards,
Peter Smith.
Fujitsu Australia




Re: Estimating HugePages Requirements?

2021-09-02 Thread Michael Paquier
On Wed, Sep 01, 2021 at 06:28:21PM +, Bossart, Nathan wrote:
> On 8/31/21, 11:54 PM, "Michael Paquier"  wrote:
>> Hmm.  I am not sure about the addition of huge_pages_required, knowing
>> that we would have shared_memory_size.  I'd rather let the calculation
>> part to the user with a scan of /proc/meminfo.
> 
> I included this based on some feedback from Andres upthread [0].  I
> went ahead and split the patch set into 3 pieces in case we end up
> leaving it out.

Thanks.  Anyway, we don't really need huge_pages_required on Windows,
do we?  The Windows documentation explains what to do when using large
pages:
https://docs.microsoft.com/en-us/windows/win32/memory/large-page-support

The backend code does that as in PGSharedMemoryCreate(), now that I
look at it.  And there is no way to change the minimum large page size
there as far as I can see because that's decided by the processor, no?
There is a case for shared_memory_size on Windows to be able to adjust
the sizing of the memory of the host, though.

>> +#elif defined(WIN32)
>> +   hp_size = GetLargePageMinimum();
>> +#endif
>> +
>> +#if defined(MAP_HUGETLB) || defined(WIN32)
>> +   hp_required = (size_b / hp_size) + 1;
>> As of [1], there is the following description:
>> "If the processor does not support large pages, the return value is
>> zero."
>> So there is a problem here.
> 
> I've fixed this in v4.
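For illustration, the guarded computation being discussed might look like this sketch (GetLargePageMinimum() is the real Win32 call; the Linux branch and the function name are illustrative assumptions):

#ifdef WIN32
#include <windows.h>
#else
#include <sys/mman.h>
#endif

static long
estimate_huge_pages(long shmem_bytes)
{
    long        hp_size = 0;

#ifdef WIN32
    hp_size = (long) GetLargePageMinimum();    /* 0 if unsupported */
#elif defined(MAP_HUGETLB)
    hp_size = 2 * 1024 * 1024;  /* assumption: default x86-64 hugepagesize;
                                 * real code would parse /proc/meminfo */
#endif

    if (hp_size <= 0)
        return 0;               /* large pages unsupported: report nothing */

    /* Round up rather than the off-by-one-prone (size / hp_size) + 1. */
    return (shmem_bytes + hp_size - 1) / hp_size;
}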

At the end it would be nice to not finish with two GUCs.  Both depend
on the reordering of the actions done by the postmaster, so I'd be
curious to hear the thoughts of others on this particular point.
--
Michael


signature.asc
Description: PGP signature


Re: pg_receivewal starting position

2021-09-02 Thread Michael Paquier
On Thu, Sep 02, 2021 at 02:45:54PM +0900, Kyotaro Horiguchi wrote:
> At Wed, 1 Sep 2021 10:30:05 +0530, Bharath Rupireddy 
>  wrote in 
> If I read the patch correctly, the situation above is warned about by the
> following message, and then it continues to the next step, giving up on
> identifying the start position from the slot data.

Better to fall back to the past behavior if attempting to use a
pg_receivewal >= 15 with a PG instance older than 14.

>> "server does not support fetching the slot's position, resuming from the
>> current server position instead"
> 
> (The message looks a bit too long, though..)

Agreed.  Falling back to a warning is not the best answer we can have
here, as there could be various failure types, and for some of them a
hard failure is warranted:
- Failure in the backend while running READ_REPLICATION_SLOT.  This
should imply a hard failure, no?
- Slot that does not exist.  In this case, we could fall back to the
current write position of the server by default if the slot information
cannot be retrieved.
Something that's disturbing me in patch 0002 is that we would ignore
the results of GetSlotInformation() if any error happens, even if
there is a problem in the backend, like an OOM.  We should be careful
about the semantics here.

> However, if the operator doesn't know the server is old, pg_receivewal
> starts streaming from unexpected position, which is a kind of
> disaster. So I'm inclined to agree with Bharath, but rather I imagine
> an option to explicitly specify how to determine the start position.
> 
> --start-source=[server,wal,slot]  specify starting-LSN source, default is
>  trying all of them in the order of wal, slot, server. 
> 
> I don't think the option needs to accept multiple values at once.

What is the difference between "wal" and "server"?  "wal" stands for
the start position of the set of files stored in the location
directory, and "server" is the location that we'd receive from the
server?  I don't think that we need that because, when using a slot,
we know that we can rely on the LSN that the slot retains for
pg_receivewal as that should be the same point as what has been
streamed last.  Could there be an argument instead for changing the
default and rely on the slot information rather than scanning the
local WAL archive path for the start point when using --slot?  When
using pg_receivewal as a service, relying on a scan of the WAL archive
directory if there is no slot and fallback to an invalid LSN if there
is nothing is fine by me, but I think that just relying on the slot
information is saner as the backend makes sure that nothing is
missing.  That's also more useful when streaming changes from a single
slot from multiple locations (stream to location 1 with a slot, stop
pg_receivewal, stream to location 2 that completes 1 with the same
slot).

+pg_log_error("Slot \"%s\" is not a physical replication slot",
+ replication_slot);
In 0003, the format of this error is not really project-like.
Something like this would perhaps be more suitable:
"cannot use the slot provided, physical slot expected."

I am not really convinced about the need to get the active state
and the PID used in the backend when fetching the slot data,
particularly if that's just for some frontend-side checks.  The
backend has safeguards already for all that.

While looking at that, I have applied de1d4fe to add 
PostgresNode::command_fails_like(), coming from 0003, and put my hands
on 0001 as per the attached, as the starting point.  That basically
comes down to all the points raised upthread, plus some tweaks for
things I bumped into to get the semantics of the command to what looks
like the right shape.
--
Michael
From adbd72f70ee3592965f2a52500820d1387dcbf85 Mon Sep 17 00:00:00 2001
From: Michael Paquier 
Date: Thu, 2 Sep 2021 16:25:25 +0900
Subject: [PATCH v4] Add READ_REPLICATION_SLOT command

---
 src/include/nodes/nodes.h   |   1 +
 src/include/nodes/replnodes.h   |  10 ++
 src/backend/replication/repl_gram.y |  16 ++-
 src/backend/replication/repl_scanner.l  |   1 +
 src/backend/replication/walsender.c | 112 
 src/test/recovery/t/001_stream_rep.pl   |  47 +++-
 src/test/recovery/t/006_logical_decoding.pl |  15 ++-
 doc/src/sgml/protocol.sgml  |  66 
 src/tools/pgindent/typedefs.list|   1 +
 9 files changed, 266 insertions(+), 3 deletions(-)

diff --git a/src/include/nodes/nodes.h b/src/include/nodes/nodes.h
index 6a4d82f0a8..5f78bdd573 100644
--- a/src/include/nodes/nodes.h
+++ b/src/include/nodes/nodes.h
@@ -495,6 +495,7 @@ typedef enum NodeTag
 * TAGS FOR REPLICATION GRAMMAR PARSE NODES (replnodes.h)
 */
T_IdentifySystemCmd,
+   T_ReadReplicationSlotCmd,
T_BaseBackupCmd,
T_CreateReplicationSlotCmd,
T_DropReplicationSlotCmd,
diff --git a/src/include/nodes/replnodes.h b/src/in

Re: corruption of WAL page header is never reported

2021-09-02 Thread Kyotaro Horiguchi
At Thu, 2 Sep 2021 14:44:31 +0900, Fujii Masao  
wrote in 
> 
> 
> On 2021/09/02 13:17, Kyotaro Horiguchi wrote:
> > Did you read the comment just above?
> 
> Yes!

Glad to hear that, or..:p

> > xlog.c:12523
> >> * Check the page header immediately, so that we can retry immediately 
> >> if
> >> * it's not valid. This may seem unnecessary, because XLogReadRecord()
> >> * validates the page header anyway, and would propagate the failure up 
> >> to
> >> * ReadRecord(), which would retry. However, there's a corner case with
> >> * continuation records, if a record is split across two pages such that
> > So when not in standby mode, the same check is performed by xlogreader
> > which has the responsibility to validate the binary data read by
> > XLogPageRead. The page-header validation is a compromise to save a
> > specific case.
> 
> Yes, so XLogPageRead() can skip the validation check of the page header
> if not in standby mode. On the other hand, there is no problem if it
> still performs the validation check as it does now. No?

Practically yes, and it has always been like that as you say.

> > I don't think it is a good choice to conflate read-failure and header
> > validation failure from the view of modularity.
> 
> I don't think that the proposed change does that. But maybe I failed

It's about your idea in a recent mail, not about the proposed
patch(es).

> to get
> your point yet... Anyway the proposed change just tries to reset
> errormsg_buf whenever XLogPageRead() retries, whatever error happened
> before. Also if errormsg_buf is set at that moment, it's reported.

I believe errmsg_buf is an interface for emitting error messages dedicated
to xlogreader, which doesn't have access to the elog facility, and
xlogreader doesn't (and ought not to) expect non-xlogreader callback
functions to set the variable.  In that sense I don't think the originally
proposed patch is proper, for the reason that the non-xlogreader callback
function may set errmsg_buf.  This is what I meant by the word
"modularity".

For that reason, in my second proposal I avoided calling
XLogReaderValidatePageHeader() at all while not in standby mode, because
calling the validator function in non-standby mode results in the
non-xlogreader function setting errmsg_buf.  Of course we could instead
always consume errmsg_buf in that function, but I don't like to shadow
the caller's task.

Does that make sense?

regards.

-- 
Kyotaro Horiguchi
NTT Open Source Software Center




Re: pg_receivewal starting position

2021-09-02 Thread Ronan Dunklau
Le jeudi 2 septembre 2021, 08:42:22 CEST Bharath Rupireddy a écrit :
> On Thu, Sep 2, 2021 at 11:15 AM Kyotaro Horiguchi
> 
>  wrote:
> > At Wed, 1 Sep 2021 10:30:05 +0530, Bharath Rupireddy
> >  wrote in
> > > On Mon, Aug 30, 2021 at 3:26 PM Ronan Dunklau wrote:
> > > > > 7) How about we let pg_receivewal use READ_REPLICATION_SLOT as an
> > > > > option?
> > > > 
> > > > From my point of view, I already expected it to use something like
> > > > that when using a replication slot. Maybe an option to turn it off
> > > > could be useful?
> > > IMO, pg_receivewal should have a way to turn off/on using
> > > READ_REPLICATION_SLOT. Imagine if the postgres server doesn't support
> > > READ_REPLICATION_SLOT (a lower version) but for some reason the
> > > pg_receivewal (running separately) is upgraded to a version that uses
> > > READ_REPLICATION_SLOT; knowing that the server doesn't support
> > > READ_REPLICATION_SLOT, why should the user let pg_receivewal run
> > > unnecessary code?
> > 
> > If I read the patch correctly the situation above is warned by the
> > following message then continue to the next step giving up to identify
> > start position from slot data.
> > 
> > > "server does not support fetching the slot's position, resuming from the
> > > current server position instead"
> > (The message looks a bit too long, though..)
> > 
> > However, if the operator doesn't know the server is old, pg_receivewal
> > starts streaming from unexpected position, which is a kind of
> > disaster. So I'm inclined to agree with Bharath, but rather I imagine
> > an option to explicitly specify how to determine the start position.
> > 
> > --start-source=[server,wal,slot]  specify starting-LSN source, default is
> > 
> >  trying all of them in the order of wal, slot, server.
> > 
> > I don't think the option needs to accept multiple values at once.
> 
> If --start-source = 'wal' fails, then pg_receivewal should show an
> error saying "cannot find start position from <>
> directory, try with "server" or "slot" for --start-source". We might
> end up having similar errors for other options as well. Isn't this going
> to create unnecessary complexity?
> 
> The existing way pg_receivewal fetches the start position, i.e. first from
> the WAL directory and then from the server start position, isn't known to
> the user at all; there is no verbose message and nothing specified in the
> docs. Why do we need to expose this with the --start-source option? IMO, we
> can keep it that way and just have a way to turn off the new behaviour that
> we are proposing here, i.e. fetching the start position from the slot's
> restart_lsn.

Then it should probably be documented. We write in the docs that it is
strongly recommended to use a replication slot, but do not mention how we
resume from what has already been processed.

If someone really cares about having control over how the start position is
defined instead of relying on the auto-detection, it would be wiser to add a
--startpos parameter similar to the --endpos one, which would override
everything else, instead of different flags for different behaviours.

Regards,

-- 
Ronan Dunklau