Re: MSVC Build support with visual studio 2019
On Wed, 3 Jul 2019 at 10:01, Michael Paquier wrote:
> On Tue, Jul 02, 2019 at 02:10:11PM +0900, Michael Paquier wrote:
> > OK, committed to HEAD for now after perltidy'ing the patch. Let's see
> > what the buildfarm has to say about it first. Once we are sure that
> > the thing is stable, I'll try to backpatch it. This works on my own
> > dev machines with VS 2015 and 2019, but who knows what hides in the
> > shadows...
>
> The buildfarm did not have much to say, so backpatched down to 9.4,
> adjusting things on the way.

Thanks Michael.

Regards,
Haribabu Kommi
Re: Libpq support to connect to standby server as priority
On Mon, 3 Jun 2019 at 16:32, Haribabu Kommi wrote:
>
> On Thu, Apr 11, 2019 at 9:13 AM Haribabu Kommi wrote:
>
>> I fixed all the comments that you have raised above and attached the
>> updated patches.
>
> Rebased patches are attached.

Rebased patches are attached.

Regards,
Haribabu Kommi

0001-Restructure-the-code-to-remove-duplicate-code.patch
Description: Binary data
0002-New-TargetSessionAttrsType-enum.patch
Description: Binary data
0003-Make-transaction_read_only-as-GUC_REPORT-variable.patch
Description: Binary data
0004-New-prefer-read-target_session_attrs-type.patch
Description: Binary data
0005-New-read-only-target_session_attrs-type.patch
Description: Binary data
0006-Primary-prefer-standby-and-standby-options.patch
Description: Binary data
0007-New-function-to-rejecting-the-checked-write-connecti.patch
Description: Binary data
0008-Server-recovery-mode-handling.patch
Description: Binary data
Re: MSVC Build support with visual studio 2019
On Thu, 27 Jun 2019 at 17:28, Michael Paquier wrote:
> On Wed, Jun 26, 2019 at 10:29:05PM +1000, Haribabu Kommi wrote:
> > Thanks for the review. Yes, that patch applies till 9.5, it is my mistake
> > in naming the patch.
>
> I have been able to finally set up an environment with VS 2019 (as
> usual this stuff needs time, anyway..), and I can confirm that the
> patch is able to compile properly.

Thanks for the review.

> -      Visual Studio 2017 (including Express editions),
> -      as well as standalone Windows SDK releases 6.0 to 8.1.
> +      Visual Studio 2019 (including Express editions),
> +      as well as standalone Windows SDK releases 8.1a to 10.
>
> I would like to understand why this range of requirements is updated.
> Is there any reason to do so. If we change these docs, what does it
> mean in terms of versions of Visual Studio supported?

We stopped supporting builds with all Visual Studio versions older than
2013, so I updated the SDK versions accordingly.

> -or a VS2015Solution or a VS2017Solution, all in Solution.pm, depending on
> -the user's build environment) and adding objects implementing the corresponding
> -Project interface (VC2013Project or VC2015Project or VC2017Project from
> -MSBuildProject.pm) to it.
> +or a VS2015Solution or a VS2017Solution or a VS2019Solution, all in Solution.pm,
> +depending on the user's build environment) and adding objects implementing
> +the corresponding Project interface (VC2013Project or VC2015Project or VC2017Project
> +or VC2019Project from MSBuildProject.pm) to it.
>
> This formulation is weird the more we accumulate new objects, let's
> put that in a proper list of elements separated with commas except
> for the two last ones which should use "or".
>
> s/greather/greater/.
>
> The patch still has typos, and the format is not satisfying yet, so I
> have done a set of fixes as per the attached.

The changes in the patch are good.

> -      elsif ($major < 6)
> +      elsif ($major < 12)
>        {
>            croak
> -              "Unable to determine Visual Studio version: Visual Studio versions before 6.0 aren't supported.";
> +              "Unable to determine Visual Studio version: Visual Studio versions before 12.0 aren't supported.";
>
> Well, this is a separate bug fix, still I don't mind fixing that in
> the same patch as we meddle with those code paths now. Good catch.
>
> -      croak $visualStudioVersion;
> +      carp $visualStudioVersion;
>
> Same here. Just wouldn't it be better to print the version found in
> the same message?

Yes, that is a good change. I thought of doing the same, but I did not
know how to do it. A similar change is required for CreateProject as well.

> +     # The major visual studio that is supported has nmake version >= 14.20 and < 15.
>       if ($major > 14)
>
> Comment line is too long. It seems to me that the condition here
> should be ($major >= 14 && $minor >= 30). That's not completely
> correct either as we have a version higher than 14.20 for VS 2019 but
> that's better than just using 14.29 or a fake number I guess.

The change is good, but the comment is wrong:

+       # The major visual studio that is supported has nmake
+       # version >= 14.30, so stick with it as the latest version

The highest supported Visual Studio version has an nmake version
<= 14.30.

Except for the above two changes, overall the patch is in good shape.

Regards,
Haribabu Kommi
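The nmake-to-VS version mapping being discussed can be sketched as follows. This is an illustrative Python translation of the Perl logic (the real code lives in src/tools/msvc), not the actual patch; the boundary values are taken from the discussion above: pre-12.0 toolsets are rejected outright, and anything at or beyond nmake 14.30 is clamped to 14.20, the VS 2019 toolset.

```python
def visual_studio_version(major: int, minor: int) -> str:
    """Map an nmake version (major.minor) to a VS project-file version string."""
    if major < 12:
        # Mirrors the croak() in the patch: pre-VS2013 toolsets are unsupported.
        raise ValueError(
            f"nmake {major}.{minor:02d}: Visual Studio versions before 12.0 "
            "aren't supported")
    if major > 14 or (major == 14 and minor >= 30):
        # Newer than the latest known toolset: fall back to 14.20 (VS 2019),
        # which is what the carp() change above aims for.
        return "14.20"
    return f"{major}.{minor:02d}"
```

Note the composite condition: VS 2019's nmake reports 14.2x, so a simple `major > 14` test alone would not clamp a hypothetical 14.30 toolset, which is exactly the subtlety debated above.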
Re: MSVC Build support with visual studio 2019
On Wed, 5 Jun 2019 at 17:22, Juan José Santamaría Flecha <juanjo.santama...@gmail.com> wrote:
>
> On Wed, May 29, 2019 at 10:30 AM Haribabu Kommi wrote:
>
>> Updated patches are attached.
>
> All patches apply, build and pass tests. The patch
> '0001-support-building-with-visual-studio-2019_v10_to_v9.6_v3.patch'
> applies on version 9.5.
>
> Not sure if more review is needed before moving to 'ready for committer',
> but I have no more comments to make on current patches.

Thanks for the review. Yes, that patch applies down to 9.5; it was my
mistake in naming the patch.

Regards,
Haribabu Kommi
Re: pg_basebackup failure after setting default_table_access_method option
On Thu, Jun 6, 2019 at 1:46 AM vignesh C wrote:
>
> Hi,
>
> *I noticed a pg_basebackup failure when the default_table_access_method
> option is set.*
>
> *Test steps:*
>
> Step 1: Init database
> ./initdb -D data
>
> Step 2: Start server
> ./postgres -D data &
>
> Step 3: Set GUC option
> export PGOPTIONS='-c default_table_access_method=zheap'
>
> Step 4: Perform backup
> ./pg_basebackup -D backup -p 5432 --no-sync
> 2019-06-05 20:35:04.088 IST [11601] FATAL: cannot read pg_class without
> having selected a database
> pg_basebackup: error: could not connect to server: FATAL: cannot read
> pg_class without having selected a database
>
> *Reason why it is failing:*
> pg_basebackup does not use any database to connect to the server, as it
> backs up the whole data instance.
> Because the option default_table_access_method is set, the server tries
> to validate it, and validation reaches this check in ScanPgRelation:
> if (!OidIsValid(MyDatabaseId))
>     elog(FATAL, "cannot read pg_class without having selected a database");
>
> Since pg_basebackup uses no database, the command fails.

Thanks for the detailed steps to reproduce the bug; I am also able to
reproduce the problem.

> Fix:
> The patch has the fix for the above issue.
>
> Let me know your opinion on this.

Thanks for the patch; it fixes the problem.

Regards,
Haribabu Kommi
Fujitsu Australia
Re: Libpq support to connect to standby server as priority
On Thu, Apr 11, 2019 at 9:13 AM Haribabu Kommi wrote:
> I fixed all the comments that you have raised above and attached the
> updated patches.

Rebased patches are attached.

Regards,
Haribabu Kommi
Fujitsu Australia

0001-New-pg_basebackup-g-option-to-control-the-group-acce.patch
Description: Binary data
0002-New-TargetSessionAttrsType-enum.patch
Description: Binary data
0003-Make-transaction_read_only-as-GUC_REPORT-varaible.patch
Description: Binary data
0004-New-prefer-read-target_session_attrs-type.patch
Description: Binary data
0005-New-read-only-target_session_attrs-type.patch
Description: Binary data
0006-Primary-prefer-standby-and-standby-options.patch
Description: Binary data
0007-New-function-to-rejecting-the-checked-write-connecti.patch
Description: Binary data
0008-Server-recovery-mode-handling.patch
Description: Binary data
Re: How to know referenced sub-fields of a composite type?
On Wed, May 29, 2019 at 4:51 PM Kohei KaiGai wrote:
> Hi Amit,
>
> On Wed, 29 May 2019 at 13:26, Amit Langote wrote:
> >
> > Kaigai-san,
> >
> > On 2019/05/29 12:13, Kohei KaiGai wrote:
> > > One interesting data type in Apache Arrow is the "Struct" data type.
> > > It is equivalent to the composite type in PostgreSQL. The "Struct"
> > > type has sub-fields, and each individual sub-field has its own values
> > > array.
> > >
> > > It means we can skip loading the unreferenced sub-fields, if the
> > > query planner can handle referenced and unreferenced sub-fields
> > > correctly. On the other hand, it looks to me like RelOptInfo and the
> > > other optimizer-related structures don't have this kind of
> > > information. RelOptInfo->attr_needed tells an extension which
> > > attributes are referenced by other relations, but its granularity is
> > > not sufficient for sub-fields.
> >
> > Isn't that true for some other cases as well, like when a query accesses
> > only some sub-fields of a json(b) column? In that case too, the planner
> > itself can't optimize away access to other sub-fields. What it can do
> > though is match a suitable index to the operator used to access the
> > individual sub-fields, so that the index (if one is matched and chosen)
> > can optimize away accessing unnecessary sub-fields. IOW, it seems to me
> > that the optimizer leaves it up to the indexes (and plan nodes) to
> > further optimize access to within a field. How is this case any
> > different?
>
> I think it is a little bit different scenario.
> Even if an index on sub-fields can indicate the tuples to be fetched,
> the fetched tuple contains all the sub-fields because a heap tuple is
> row-oriented data.
> For example, if a WHERE-clause checks a sub-field "x" and then an
> aggregate function references another sub-field "y", the Scan/Join node
> has to return a tuple that contains both "x" and "y".
> IndexScan also pops up a tuple with a full composite type, so there is
> no problem if we cannot know which sub-fields are referenced in the
> later stage.
> Maybe, if IndexOnlyScan supports returning a partial composite type,
> it needs similar infrastructure that can be used for better composite
> type support on columnar storage.

A related issue, that a columnar store scan needs to know the targeted
columns for projection, is discussed in the zedstore thread [1].
Projecting all columns from a columnar store is much more expensive than
from a row store.

[1] - https://www.postgresql.org/message-id/CALfoeivu-n5o8Juz9wW%2BkTjnis6_%2BrfMf%2BzOTky1LiTVk-ZFjA%40mail.gmail.com

Regards,
Haribabu Kommi
Fujitsu Australia
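The struct-column pruning described above can be sketched with a toy model (illustrative only, not PostgreSQL or Arrow code): each sub-field is stored as its own value array, so a scan that knows which sub-fields are referenced never has to touch the others.

```python
# Arrow-style "Struct" layout: one value array per sub-field.
struct_column = {
    "x": [1, 2, 3, 4],
    "y": [10, 20, 30, 40],
    "z": ["a", "b", "c", "d"],  # never referenced by the query below
}

def scan(column, referenced, predicate):
    """Materialize only the referenced sub-fields for rows passing predicate."""
    n = len(next(iter(column.values())))
    rows = []
    for i in range(n):
        if predicate(column["x"][i]):  # WHERE-clause on sub-field "x"
            # Only the referenced sub-field arrays are touched here;
            # with a heap tuple the whole composite would be fetched.
            rows.append({f: column[f][i] for f in referenced})
    return rows

# SELECT ... WHERE x > 2, referencing only "x" and "y": the "z" array is skipped.
result = scan(struct_column, referenced={"x", "y"}, predicate=lambda x: x > 2)
```

This is exactly the granularity `attr_needed` lacks: the planner would need to record not just "this composite column is needed" but which sub-fields of it are.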
Re: MSVC Build support with visual studio 2019
On Mon, May 27, 2019 at 8:14 PM Juan José Santamaría Flecha <juanjo.santama...@gmail.com> wrote:
>
> On Thu, May 23, 2019 at 3:44 AM Haribabu Kommi wrote:
>
>> Updated patches are attached for all branches.
>
> I have gone through all patches and there are a couple of typos:

Thanks for the review.

> 1. s/prodcutname/productname/
>
> 1.1 In file: 0001-support-building-with-visual-studio-2019_v9.4.patch
> @@ -97,8 +97,9 @@
>     Visual Studio 2013. Building with
>     Visual Studio 2015 is supported down to
>     Windows Vista and Windows Server 2008.
> -   Building with Visual Studio 2017 is supported
> -   down to Windows 7 SP1 and Windows Server 2008 R2 SP1.
> +   Building with Visual Studio 2017 and
> +   Visual Studio 2019 are supported down to
>
> 1.2 In file: 0001-support-building-with-visual-studio-2019_v10_to_v9.5.patch
> @@ -97,8 +97,9 @@
>     Visual Studio 2013. Building with
>     Visual Studio 2015 is supported down to
>     Windows Vista and Windows Server 2008.
> -   Building with Visual Studio 2017 is supported
> -   down to Windows 7 SP1 and Windows Server 2008 R2 SP1.
> +   Building with Visual Studio 2017 and
> +   Visual Studio 2019 are supported down to

Corrected.

> 2. s/stuido/studio/
>
> 2.1 In file: 0001-support-building-with-visual-studio-2019_v9.4.patch
> @@ -143,12 +173,12 @@ sub DetermineVisualStudioVersion
> sub _GetVisualStudioVersion
> {
>     my ($major, $minor) = @_;
> -   # visual 2017 hasn't changed the nmake version to 15, so still using the older version for comparison.
> +   # The major visual stuido that is suppored has nmake version >= 14.20 and < 15.
>
> 2.2 In file: 0001-support-building-with-visual-studio-2019_v10_to_v9.5.patch
> @@ -132,12 +162,12 @@ sub DetermineVisualStudioVersion
> sub _GetVisualStudioVersion
> {
>     my ($major, $minor) = @_;
> -   # visual 2017 hasn't changed the nmake version to 15, so still using the older version for comparison.
> +   # The major visual stuido that is suppored has nmake version >= 14.20 and < 15.
>
> 2.3 In file: 0001-support-building-with-visual-studio-2019_v11.patch
> @@ -139,12 +165,12 @@ sub _GetVisualStudioVersion
> {
>     my ($major, $minor) = @_;
>
> -   # visual 2017 hasn't changed the nmake version to 15, so still using the older version for comparison.
> +   # The major visual stuido that is suppored has nmake version >= 14.20 and < 15.
>
> 2.4 In file: 0001-Support-building-with-visual-studio-2019_HEAD.patch
> @@ -106,17 +132,17 @@ sub _GetVisualStudioVersion
> {
>     my ($major, $minor) = @_;
>
> -   # visual 2017 hasn't changed the nmake version to 15, so still using the older version for comparison.
> +   # The major visual stuido that is suppored has nmake version >= 14.20 and < 15.

Corrected. The misspelling of 'supported' ('suppored') was also fixed.
Updated patches are attached.

Regards,
Haribabu Kommi
Fujitsu Australia

0001-support-building-with-visual-studio-2019_v11_v3.patch
Description: Binary data
0001-support-building-with-visual-studio-2019_v10_to_v9.6_v3.patch
Description: Binary data
0001-support-building-with-visual-studio-2019_v9.4_v3.patch
Description: Binary data
0001-Support-building-with-visual-studio-2019_HEAD_v3.patch
Description: Binary data
Re: MSVC Build support with visual studio 2019
On Wed, May 22, 2019 at 7:36 AM Juanjo Santamaria Flecha <juanjo.santama...@gmail.com> wrote:
> I have gone through patch
> '0001-Support-building-with-visual-studio-2019.patch' only, but I am sure
> some comments will also apply to back branches.

Thanks for the review.

> 1. The VisualStudioVersion value looks odd:
>
> + $self->{VisualStudioVersion} = '16.0.32.32432';
>
> Are you using a pre-release version [1]?

I first developed this patch on the preview version. I have updated it
to version 16.0.28729.10.

> 2. There is a typo: s/stuido/studio/:
>
> +     # The major visual stuido that is suppored has nmake version >= 14.20 and < 15.
>
> There is something in the current code that I think should be also
> updated. The code for _GetVisualStudioVersion contains:
>
> if ($major > 14)
> {
>     carp
> "The determined version of Visual Studio is newer than the latest
> supported version. Returning the latest supported version instead.";
>     return '14.00';
> }
>
> Shouldn't the returned value be '14.20' for Visual Studio 2019?

Yes, it is better to return '14.20' for Visual Studio 2019; updated.

Updated patches are attached for all branches.

Regards,
Haribabu Kommi
Fujitsu Australia

0001-Support-building-with-visual-studio-2019_HEAD.patch
Description: Binary data
0001-support-building-with-visual-studio-2019_v11.patch
Description: Binary data
0001-support-building-with-visual-studio-2019_v9.4.patch
Description: Binary data
0001-support-building-with-visual-studio-2019_v10_to_v9.5.patch
Description: Binary data
Re: PG 12 draft release notes
On Tue, May 21, 2019 at 8:17 AM Andres Freund wrote:
>
> > Add command to create new table types (Haribabu Kommi, Andres Freund,
> > Álvaro Herrera, Dimitri Dolgov)
>
> A few points:
>
> 1) Is this really source code, given that CREATE ACCESS METHOD TYPE
>    TABLE is a DDL command, and USING (...) for CREATE TABLE etc is an
>    option to DDL commands?

+1. It would be better to provide a description of the newly added syntax.
Do we need to provide a 'Note' explaining that currently there are no
alternatives to the heap?

> 2) I think the description sounds a bit too much like it's about new
>    forms of tables, rather than their storage. How about something
>    roughly like:
>
>    Allow different table access methods to be used. This allows to
>    develop and use new ways of storing and accessing table data,
>    optimized for different use-cases, without having to modify
>    PostgreSQL. The existing heap access method remains the default.
>
> 3) This misses a large set of commits around making tableam possible, in
>    particular the commits around
>
>    commit 4da597edf1bae0cf0453b5ed6fc4347b6334dfe1
>    Author: Andres Freund
>    Date: 2018-11-16 16:35:11 -0800
>
>        Make TupleTableSlots extensible, finish split of existing slot type.
>
>    Given that those commits entail an API break relevant for extensions,
>    should we have them as a separate "source code" note?

+1 to adding it, but I am not sure whether all the breakage introduced
by tableam needs to be described separately, or whether a combined note
explaining it to extension developers is fine.

> 4) I think the attribution isn't quite right. For one, a few names with
>    substantial work are missing (Amit Khandekar, Ashutosh Bapat,
>    Alexander Korotkov), and the order doesn't quite seem right. On the
>    latter part I might be somewhat petty, but I spent *many* months of
>    my life on this.
>
>    How about:
>    Andres Freund, Haribabu Kommi, Alvaro Herrera, Alexander Korotkov,
>    David Rowley, Dimitri Golgov
>    if we keep 3) separate, and
>    Andres Freund, Haribabu Kommi, Alvaro Herrera, Ashutosh Bapat,
>    Alexander Korotkov, Amit Khandekar, David Rowley, Dimitri Golgov
>    otherwise?

+1 to either of the above. Without Andres's enormous efforts, tableam
couldn't have made it into v12.

Regards,
Haribabu Kommi
Fujitsu Australia
Transparent data encryption support as an extension
Hi Hackers,

I read many mail discussions about supporting data-at-rest encryption in
PostgreSQL, including the discussions around full-instance, tablespace,
and table-level encryption. In my observation, all the proposals try to
modify the core code to support encryption.

I am thinking of an approach that provides tablespace-level encryption,
including WAL, as an extension instead of changing the core code: adding
hooks in the xlogwrite and xlogread flows and in the reorderbuffer flows,
and also adding smgr plugin routines to support encryption and decryption
of other pages. This approach definitely doesn't work for full-instance
encryption.

Any opinions/comments/problems in evaluating the encryption-as-an-extension
approach?

Regards,
Haribabu Kommi
Fujitsu Australia
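A minimal sketch of the hook-based approach described above, with all names hypothetical (this is not PostgreSQL's API): the core write/read paths consult optional hooks, and the encryption extension installs its callbacks at load time. The XOR transform stands in for a real cipher (e.g. AES) purely to keep the example self-contained.

```python
# Core side: optional hooks, consulted on every page write/read.
page_encrypt_hook = None
page_decrypt_hook = None

def smgr_write(page: bytes) -> bytes:
    """What the storage manager would persist to disk."""
    return page_encrypt_hook(page) if page_encrypt_hook else page

def smgr_read(stored: bytes) -> bytes:
    """What the storage manager hands back to the buffer layer."""
    return page_decrypt_hook(stored) if page_decrypt_hook else stored

# Extension side: register callbacks (a real extension would do this
# in its _PG_init-equivalent, with a proper cipher and key management).
KEY = 0x5A

def xor_page(page: bytes) -> bytes:
    return bytes(b ^ KEY for b in page)

page_encrypt_hook = xor_page
page_decrypt_hook = xor_page  # XOR is its own inverse

stored = smgr_write(b"tuple data")   # on-disk bytes differ from the plaintext
plain = smgr_read(stored)            # round-trips back to the original page
```

The same hook pattern would be repeated for the xlogwrite/xlogread and reorderbuffer flows mentioned above; the core change is only the hook call sites, which is what keeps the encryption logic itself out of core.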
Re: Attempt to consolidate reading of XLOG page
On Fri, Apr 12, 2019 at 2:06 AM Antonin Houska wrote:
> While working on the instance encryption I found it annoying to apply
> decryption of the XLOG page in three different functions. Attached is a
> patch that tries to merge them all into one function, XLogRead(). The
> existing implementations differ in the way a new segment is opened, so I
> added a pointer to a callback function as a new argument. This callback
> handles the specific ways to determine the segment file name and to open
> the file.
>
> I can split the patch into multiple diffs to make detailed review easier,
> but first I'd like to hear if anything is seriously wrong about this
> design. Thanks.

I didn't check the code, but it is good to combine all three page-read
functions into one instead of spreading the logic.

Regards,
Haribabu Kommi
Fujitsu Australia
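The design above can be sketched like this (an illustrative Python model, not the actual patch): a single read routine walks WAL segment boundaries and delegates the caller-specific segment-open behaviour to a callback, which is what lets walsender, walreceiver, and xlogreader share one implementation.

```python
SEGMENT_SIZE = 16  # toy segment size; the real wal_segment_size defaults to 16MB

def xlog_read(start: int, length: int, open_segment) -> bytes:
    """Read `length` bytes starting at WAL position `start`.

    open_segment(segno) is the per-caller callback: each of the three
    existing readers names and opens segment files differently.
    """
    out = b""
    pos = start
    while length > 0:
        segno, off = divmod(pos, SEGMENT_SIZE)
        seg = open_segment(segno)            # caller-specific open
        n = min(length, SEGMENT_SIZE - off)  # stop at the segment boundary
        out += seg[off:off + n]
        pos += n
        length -= n
    return out

# Fake "segments": segment k holds 16 copies of the k-th letter.
def open_segment(segno: int) -> bytes:
    return bytes([ord('a') + segno]) * SEGMENT_SIZE

# A read that crosses the segment 0 / segment 1 boundary:
data = xlog_read(start=12, length=8, open_segment=open_segment)
```

The boundary-crossing loop is the shared logic; only `open_segment` varies per caller, which mirrors the callback argument the patch proposes.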
Re: Libpq support to connect to standby server as priority
On Fri, Mar 29, 2019 at 7:06 PM Tsunakawa, Takayuki <tsunakawa.ta...@jp.fujitsu.com> wrote:
> Hi Hari-san,
>
> I've reviewed all the files. The patch would be OK when the following
> have been fixed, except for the complexity of fe-connect.c (which
> probably cannot be improved.)
>
> Unfortunately, I'll be absent next week. The earliest date I can do the
> test will be April 8 or 9. I hope someone could take care of this
> patch...

Thanks for the review. Apologies that I was not able to finish it on time
because of other work.

> (1) 0001
> With this read-only option type, application can connect to
> to a read-only server in the list of hosts, in case
> ...
> before issuing the SHOW transaction_readonly to find out whether
>
> "to" appears twice in a row.
> transaction_readonly -> transaction_read_only
>
> (2) 0001
> +succesful connection or failure.
>
> succesful -> successful
>
> (3) 0008
> to conenct to a standby server with a faster check instead of
>
> conenct -> connect
>
> (4) 0008
> Logically, recovery exit should be notified after the following
> statement:
>
> XLogCtl->SharedRecoveryInProgress = false;
>
> (5) 0008
> +     /* Update in_recovery status. */
> +     if (LocalRecoveryInProgress)
> +             SetConfigOption("in_recovery",
> +                             "on",
> +                             PGC_INTERNAL, PGC_S_OVERRIDE);
>
> This SetConfigOption() is called for every RecoveryInProgress() call on
> the standby. It may cause undesirable overhead. How about just calling
> SetConfigOption() once in InitPostgres() to set the initial value for
> in_recovery? InitPostgres() and its subsidiary functions call
> SetConfigOption() likewise.
>
> (6) 0008
> async.c is for LISTEN/UNLISTEN/NOTIFY. How about adding the new
> functions in postgres.c like RecoveryConflictInterrupt()?
>
> (7) 0008
> +     if (pid != 0)
> +     {
> +             (void) SendProcSignal(pid, reason, procvxid.backendId);
> +     }
>
> The braces are not necessary because the block only contains a single
> statement.

I fixed all the comments that you have raised above and attached the
updated patches.

Regards,
Haribabu Kommi
Fujitsu Australia

0001-Restructure-the-code-to-remove-duplicate-code.patch
Description: Binary data
0002-New-TargetSessionAttrsType-enum.patch
Description: Binary data
0003-Make-transaction_read_only-as-GUC_REPORT-varaible.patch
Description: Binary data
0004-New-prefer-read-target_session_attrs-type.patch
Description: Binary data
0005-New-read-only-target_session_attrs-type.patch
Description: Binary data
0006-Primary-prefer-standby-and-standby-options.patch
Description: Binary data
0007-New-function-to-rejecting-the-checked-write-connecti.patch
Description: Binary data
0008-Server-recovery-mode-handling.patch
Description: Binary data
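The host-selection behaviour these patches aim at can be sketched as follows. This is a toy model, not libpq code: the option names follow target_session_attrs, but the function and the `servers` dict are hypothetical stand-ins for real connection attempts and the `SHOW transaction_read_only` check.

```python
def choose_host(hosts, servers, target):
    """Return the first host acceptable for the requested session type."""
    fallback = []
    for h in hosts:
        read_only = servers[h]  # stands in for SHOW transaction_read_only
        if target == "any":
            return h
        if target == "read-write" and not read_only:
            return h
        if target == "read-only" and read_only:
            return h
        if target == "prefer-read":
            if read_only:
                return h
            fallback.append(h)  # remember a writable host as a fallback
    if target == "prefer-read" and fallback:
        return fallback[0]      # no standby available: settle for the primary
    return None                 # no acceptable host in the list

servers = {"primary": False, "standby1": True, "standby2": True}
host = choose_host(["primary", "standby1", "standby2"], servers, "prefer-read")
```

The `prefer-read` branch is the interesting one: it rejects writable servers on the first pass but keeps them as a fallback, so the connection still succeeds when no standby is reachable.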
Re: Transaction commits VS Transaction commits (with parallel) VS query mean time
On Wed, Apr 10, 2019 at 3:32 PM Amit Kapila wrote:
> On Mon, Apr 8, 2019 at 8:51 AM Amit Kapila wrote:
> >
> > On Mon, Apr 8, 2019 at 7:54 AM Jamison, Kirk wrote:
> > > So I am marking this thread as "Ready for Committer".
> >
> > Thanks, Hari and Jamison for verification. The patches for
> > back-branches look good to me. I will once again verify them before
> > commit. I will commit this patch tomorrow unless someone has
> > objections. Robert/Alvaro, do let me know if you see any problem with
> > this fix?
>
> Pushed.

Thanks Amit. I will look into it further to handle all the internally
generated transactions.

Regards,
Haribabu Kommi
Fujitsu Australia
Re: pgsql: tableam: basic documentation.
On Wed, Apr 10, 2019 at 12:56 PM Michael Paquier wrote:
> Hi Andres,
>
> On Thu, Apr 04, 2019 at 12:42:06AM +0000, Andres Freund wrote:
> > tableam: basic documentation.
> >
> > This adds documentation about the user oriented parts of table access
> > methods (i.e. the default_table_access_method GUC and the USING clause
> > for CREATE TABLE etc), adds a basic chapter about the table access
> > method interface, and adds a note to storage.sgml that its contents
> > don't necessarily apply for non-builtin AMs.
> >
> > Author: Haribabu Kommi and Andres Freund
> > Discussion: https://postgr.es/m/20180703070645.wchpu5muyto5n...@alap3.anarazel.de
>
> + Any developer of a new table access method can refer to
> + the existing heap implementation present in
> + src/backend/heap/heapam_handler.c for more details of
> + how it is implemented.
>
> This path is incorrect, it should be this instead (missing "access"):
> src/backend/access/heap/heapam_handler.c

Thanks for the review. Yes, I missed it when I added the path. Patch
attached.

Regards,
Haribabu Kommi
Fujitsu Australia

0001-tabeam-docs-file-path-correction.patch
Description: Binary data
Re: MSVC Build support with visual studio 2019
On Wed, Mar 27, 2019 at 11:42 AM Haribabu Kommi wrote:
>
> On Wed, Mar 27, 2019 at 3:03 AM Andrew Dunstan <andrew.duns...@2ndquadrant.com> wrote:
>>
>> On 3/20/19 8:36 PM, Haribabu Kommi wrote:
>> > Hi Hackers,
>> >
>> > Here I attached a patch that supports building of PostgreSQL with VS 2019.
>> > VS 2019 is going to release on Apr 2nd 2019, it will be good if version 12
>> > supports compiling. The attached patch is for review; it may need some
>> > updates once the final version is released.
>> >
>> > Commit d9dd406fe281d22d5238d3c26a7182543c711e74 has reduced the
>> > minimum visual studio support to 2013 to support C99 standards,
>> > because of this reason, the current attached patch cannot be
>> > backpatched as it is.
>> >
>> > I can provide a separate back branches patch later once this patch
>> > comes to a stage of commit. Currently all the supported branches are
>> > possible to compile with VS 2017.
>> >
>> > comments?
>>
>> I have verified that this works with VS2019.
>>
>> There are a few typos in the comments that need cleaning up.
>
> Thanks for the review.
>
> I corrected the typos in the comments, hopefully I covered everything.
> Attached is the updated patch. Once the final VS 2019 is released, I
> will check whether the patch needs any more updates.

Visual Studio 2019 is now officially released. There is no major change
in the patch, except for some small comment updates. Patches for the
back branches are also attached.
Regards,
Haribabu Kommi
Fujitsu Australia

From 6b21a229400bae51b225dd96d68249d2a61d9ac2 Mon Sep 17 00:00:00 2001
From: Hari Babu
Date: Fri, 5 Apr 2019 10:15:57 +1100
Subject: [PATCH] support building with visual studio 2019

---
 doc/src/sgml/install-windows.sgml | 21
 src/tools/msvc/MSBuildProject.pm  | 25 +++
 src/tools/msvc/README             | 12 +-
 src/tools/msvc/Solution.pm        | 28 ++
 src/tools/msvc/VSObjectFactory.pm | 40 +--
 5 files changed, 103 insertions(+), 23 deletions(-)

diff --git a/doc/src/sgml/install-windows.sgml b/doc/src/sgml/install-windows.sgml
index ac00ac8232..efa9f5dbc2 100644
--- a/doc/src/sgml/install-windows.sgml
+++ b/doc/src/sgml/install-windows.sgml
@@ -19,10 +19,10 @@
  There are several different ways of building PostgreSQL on
  Windows. The simplest way to build with
- Microsoft tools is to install Visual Studio Express 2017
+ Microsoft tools is to install Visual Studio Express 2019
  for Windows Desktop and use the included
  compiler. It is also possible to build with the full
- Microsoft Visual C++ 2005 to 2017.
+ Microsoft Visual C++ 2005 to 2019.
  In some cases that requires the installation of the
  Windows SDK in addition to the compiler.

@@ -69,19 +69,19 @@
  Visual Studio Express or some versions of the
  Microsoft Windows SDK. If you do not already have a
  Visual Studio environment set up, the easiest
- ways are to use the compilers from Visual Studio Express 2017
+ ways are to use the compilers from Visual Studio Express 2019
  for Windows Desktop or those in the Windows SDK
- 8.1, which are both free downloads from Microsoft.
+ 10, which are both free downloads from Microsoft.

  Both 32-bit and 64-bit builds are possible with the Microsoft
  Compiler suite. 32-bit PostgreSQL builds are possible with
  Visual Studio 2005 to
- Visual Studio 2017 (including Express editions),
- as well as standalone Windows SDK releases 6.0 to 8.1.
+ Visual Studio 2019 (including Express editions),
+ as well as standalone Windows SDK releases 6.0 to 10.
  64-bit PostgreSQL builds are supported with
- Microsoft Windows SDK version 6.0a to 8.1 or
+ Microsoft Windows SDK version 6.0a to 10 or
  Visual Studio 2008 and above. Compilation
  is supported down to Windows XP and
  Windows Server 2003 when building with
@@ -89,8 +89,9 @@
  Visual Studio 2013. Building with
  Visual Studio 2015 is supported down to
  Windows Vista and Windows Server 2008.
- Building with Visual Studio 2017 is supported
- down to Windows 7 SP1 and Windows Server 2008 R2 SP1.
+ Building with Visual Studio 2017 and Visual Studio 2019
+ are supported down to Windows 7 SP1 and
+ Windows Server 2008 R2 SP1.

@@ -162,7 +163,7 @@ $ENV{MSBFLAGS}="/m";
  If your build environment doesn't ship with a supported version of the
  Microsoft Windows SDK it is recommended that you upgrade to the latest
  version (currently
- version 7.1), available for download from
+ version 10), available for download from
  https://www.microsoft.com/download";>.

diff --git a
Re: Status of the table access method work
On Sat, Apr 6, 2019 at 7:25 AM Andres Freund wrote:
> Hi,
>
> In this email I want to give a brief status update of the table access
> method work - I assume that most of you sensibly haven't followed it
> into all nooks and crannies.
>
> I want to thank Haribabu, Alvaro, Alexander, David, Dmitry and all the
> others that collaborated on making tableam happen. It was/is a huge
> project.

A big thank you, Andres, for your enormous efforts on this patch. Without
your involvement, it couldn't have made it into v12.

> With regards to storing the rows themselves, the second biggest
> limitation is a limitation that is not actually a part of tableam
> itself: WAL. Many tableam's would want to use WAL, but we only have
> extensible WAL as part of generic_xlog.h. While that's useful to allow
> prototyping etc, it's imo not efficient enough to build a competitive
> storage engine for OLTP (OLAP probably much less of a problem). I don't
> know what the best approach here is - allowing "well known" extensions
> to register rmgr entries would be the easiest solution, but it's
> certainly a bit crummy.

I had the same doubt when I looked into some of the UNDO patches, where
they try to modify the core code to add UNDO-specific WAL types.
Different AMs may need different sets of operations to be WAL-logged, so
it may be better for the AMs to register their own rmgr types?

Regards,
Haribabu Kommi
Fujitsu Australia
Re: Transaction commits VS Transaction commits (with parallel) VS query mean time
On Thu, Apr 4, 2019 at 3:29 PM Amit Kapila wrote:
> On Wed, Apr 3, 2019 at 10:45 AM Haribabu Kommi wrote:
> >
> > Thanks for the review.
> >
> > While changing the approach to use the is_parallel_worker flag, I
> > thought that the rest of the stats are also not required to be updated;
> > those are in any case write operations, and those values are zero
> > anyway for parallel workers.
> >
> > Instead of expanding the patch scope, I just changed it to avoid the
> > commit or rollback stats as discussed, and later we can target the
> > handling of all the internal transactions and their corresponding
> > stats.
>
> The patch looks good to me. I have changed the commit message and ran
> pgindent in the attached patch. Can you once see if that looks fine to
> you? Also, we should backpatch this till 9.6. So, can you once verify
> if the change is fine in all back branches? Also, test with the
> force_parallel_mode option. I have already tested it with
> force_parallel_mode = 'regress' in HEAD, please test it in back
> branches as well.

Thanks for the updated patch. I tested the back branches, even with
force_parallel_mode, and it is working as expected. However, the patch
fails to apply in the back branches, so I have attached patches for those
branches. For v11 the patch applies with some offsets.

Regards,
Haribabu Kommi
Fujitsu Australia

0001-Avoid-counting-transaction-stats-for-parallel-worker_10.patch
Description: Binary data
0001-Avoid-counting-transaction-stats-for-parallel-worker_96.patch
Description: Binary data
Re: pg_basebackup ignores the existing data directory permissions
On Fri, Mar 29, 2019 at 9:05 PM Peter Eisentraut <peter.eisentr...@2ndquadrant.com> wrote:
> On 2019-03-26 03:26, Michael Paquier wrote:
> > Do we really want to extend the replication protocol to control that?
>
> Perhaps we are losing sight of the original problem, which is that if
> you create the target directory with the wrong permissions then ... it
> has the wrong permissions. And you are free to change the permissions
> at any time. Many of the proposed solutions sound excessively
> complicated relative to that.

Yes, I agree that the proposed solution of adding new group options to
pg_basebackup to fix the original problem of existing data directory
permissions is too big. Why can't we just fix the permissions of the
directory by default to match the source instance? I feel that change
would be useful to more people than it would cause problems for.

From my understanding of the thread discussion:

+1 by: Michael Paquier, Robert Haas, Haribabu Kommi
-1 by: Magnus Hagander, Peter Eisentraut

Does anyone want to weigh in with their opinion on this patch, so we can
settle on the best option for controlling the existing standby data
directory permissions?

Regards,
Haribabu Kommi
Fujitsu Australia
Re: Transaction commits VS Transaction commits (with parallel) VS query mean time
On Wed, Apr 3, 2019 at 1:59 PM Amit Kapila wrote: > On Thu, Mar 28, 2019 at 11:43 AM Haribabu Kommi > wrote: > > > > On Wed, Mar 27, 2019 at 11:27 PM Amit Kapila > wrote: > >> > >> On Wed, Mar 27, 2019 at 6:53 AM Haribabu Kommi < > kommi.harib...@gmail.com> wrote: > >> > On Tue, Mar 26, 2019 at 9:08 PM Amit Kapila > wrote: > >> >> > >> >> As part of this thread, maybe we can > >> >> just fix the case of the parallel cooperating transaction. > >> > > >> > > >> > With the current patch, all the parallel implementation transaction > are getting > >> > skipped, in my tests parallel workers are the major factor in the > transaction > >> > stats counter. Even before parallelism, the stats of the autovacuum > and etc > >> > are still present but their contribution is not greatly influencing > the stats. > >> > > >> > I agree with you in fixing the stats with parallel workers and > improve stats. > >> > > >> > >> I was proposing to fix only the transaction started with > >> StartParallelWorkerTransaction by using is_parallel_worker flag as > >> discussed above. I understand that it won't completely fix the > >> problem reported by you, but it will be a step in that direction. My > >> main worry is that if we fix it the way you are proposing and we also > >> invent a new way to deal with all other internal transactions, then > >> the fix done by us now needs to be changed/reverted. Note, that this > >> fix needs to be backpatched as well, so we should avoid doing any fix > >> which needs to be changed or reverted. > > > > > > I tried the approach as your suggested as by not counting the actual > parallel work > > transactions by just releasing the resources without touching the > counters, > > the counts are not reduced much. 
> > > > HEAD - With 4 parallel workers running query generates 13 stats ( 4 * 3 > + 1) > > Patch - With 4 parallel workers running query generates 9 stats ( 4 * 2 > + 1) > > Old approach patch - With 4 parallel workers running query generates 1 > stat (1) > > > > Currently the parallel worker start transaction 3 times in the following > places. > > 1. InitPostgres > > 2. ParallelWorkerMain (2) > > > > with the attached patch, we reduce one count from ParallelWorkerMain. > > > > Right, that is expected from this fix. BTW, why you have changed the > approach in this patch to not count anything by the parallel worker as > compared to the earlier version where you were just ignoring the stats > for transactions. As of now, either way is fine, but in future (after > parallel inserts/updates), we want other stats to be updated. I think > your previous idea was better, just you need to start using > is_parallel_worker flag. >

Thanks for the review. While changing the approach to use the is_parallel_worker flag, I assumed the rest of the stats did not need to be updated either: they are all write-operation counters anyway, and those values are zero for parallel workers.

Instead of expanding the patch scope, I changed it to skip only the commit/rollback stats, as discussed; later we can tackle handling all the internal transactions and their corresponding stats.

Regards, Haribabu Kommi Fujitsu Australia

0001-Avoid-counting-parallel-worker-start-transactions_v4.patch Description: Binary data
Re: Pluggable Storage - Andres's take
On Tue, Apr 2, 2019 at 11:53 AM Andres Freund wrote: > Hi, > > On 2019-04-02 11:39:57 +1100, Haribabu Kommi wrote: > > > What made you rename indexam.sgml to am.sgml, instead of creating a > > > separate tableam.sgml? Seems easier to just have a separate file? > > > > > > > No specific reason, I just thought of adding all the access methods under > > one file. > > I can change it to tableam.sgml. > > I'd rather keep it separate. It seems likely that both table and indexam > docs will grow further over time, and they're not that closely > related. Additionally not moving sect1->sect2 etc will keep links stable > (which we could also achieve with different sect1 names, I realize > that). >

OK.

> > I can understand your point of avoiding function-by-function API > reference, > > as the user can check directly the code comments, Still I feel some > people > > may refer the doc for API changes. I am fine to remove based on your > > opinion. > > I think having to keeping both tableam.h and the sgml file current is > too much overhead - and anybody that's going to create a new tableam is > going to be able to look into the source anyway. >

Here I attached updated patches as per the discussion. Is the description of the table access methods enough, or do you want me to add some more details?

Regards, Haribabu Kommi Fujitsu Australia

0001-tableam-doc-update-of-table-access-methods.patch Description: Binary data 0002-Doc-updates-for-pluggable-table-access-method-syntax.patch Description: Binary data
Re: Pluggable Storage - Andres's take
On Tue, Apr 2, 2019 at 10:18 AM Andres Freund wrote: > Hi, > > On 2019-03-16 23:21:31 +1100, Haribabu Kommi wrote: > > updated patches are attached. > > Now that nearly all of the tableam patches are committed (with the > exception of the copy.c changes which are for bad reasons discussed at > [1]) I'm looking at the docs changes. >

Thanks.

> What made you rename indexam.sgml to am.sgml, instead of creating a > separate tableam.sgml? Seems easier to just have a separate file? >

No specific reason; I just thought of putting all the access methods in one file. I can change it to tableam.sgml.

> I'm currently not planning to include the function-by-function API > reference you've in your patchset, as I think it's more reasonable to > centralize all of it in tableam.h. I think I've included most of the > information there - could you check whether you agree? >

I checked, and the comments and explanations provided in tableam.h are good enough to understand. I also updated the docs section with some more details taken from the tableam.h comments.

I understand your point about avoiding a function-by-function API reference, since users can check the code comments directly; still, I feel some people may consult the docs for API changes. I am fine with removing it, based on your opinion.

I attached the current set of doc patches for your reference.

Regards, Haribabu Kommi Fujitsu Australia

0001-Rename-indexam.sgml-to-am.sgml.patch Description: Binary data 0002-Doc-updates-for-pluggable-table-access-method-syntax.patch Description: Binary data 0003-Table-access-method-API-explanation.patch Description: Binary data
Re: Pluggable Storage - Andres's take
On Wed, Mar 27, 2019 at 11:17 AM Andres Freund wrote: > Hi, > > On 2019-02-22 14:52:08 -0500, Robert Haas wrote: > > On Fri, Feb 22, 2019 at 11:19 AM Amit Khandekar > wrote: > > > Thanks for the review. Attached v2. > > > > Thanks. I took this, combined it with Andres's > > v12-0040-WIP-Move-xid-horizon-computation-for-page-level-.patch, did > > some polishing of the code and comments, and pgindented. Here's what > > I ended up with; see what you think. > > I pushed this after some fairly minor changes, directly including the > patch to route the horizon computation through tableam. The only real > change is that I removed the table relfilenode from the nbtree/hash > deletion WAL record - it was only required to access the heap without > accessing the catalog and was unused now. Also added a WAL version > bump. > > It seems possible that some other AM might want to generalize the > prefetch logic from heapam.c, but I think it's fair to defer that until > such an AM wants to do so >

Since I see that you are fixing some typos in the committed code, I want to share some more corrections I found in the patches committed so far.

Regards, Haribabu Kommi Fujitsu Australia

0001-dA-to-show-Table-type-access-method.patch Description: Binary data 0002-Typos-and-comemnt-corrections.patch Description: Binary data
Re: Libpq support to connect to standby server as priority
On Wed, Mar 27, 2019 at 5:17 PM Tsunakawa, Takayuki < tsunakawa.ta...@jp.fujitsu.com> wrote: > I've looked through 0004-0007. I've only found the following: > > (5) 0005 > With this read-only option type, application can connect to > connecting to a read-only server in the list of hosts, in case > if there is any read-only servers available, the connection > attempt fails. > > "connecting to" can be removed. > > in case if there is any read-only servers > -> If There's no read only server >

Thanks for the review. I addressed all the comments you raised and attached an updated version of the patches, along with an implementation of the WIP in_recovery GUC_REPORT variable. This patch uses some of the implementation provided earlier in thread [1].

While implementing the in_recovery configuration variable, I see a lot of code added just for this variable; is it worth it?

[1] - https://www.postgresql.org/message-id/2239254.dtfY1H9x0t%40hammer.magicstack.net

Regards, Haribabu Kommi Fujitsu Australia

From aa812716104b3fbd787dae4483341894c9e5ed9a Mon Sep 17 00:00:00 2001 From: Hari Babu Date: Thu, 21 Feb 2019 23:11:55 +1100 Subject: [PATCH 1/8] Restructure the code to remove duplicate code The duplicate code logic of checking for the server version before issuing the SHOW transaction_readonly to find out whether the server is read-write or not is restructured under a new connection status, so that duplicate code is removed. This is required for the next set of patches --- doc/src/sgml/libpq.sgml | 26 +--- src/interfaces/libpq/fe-connect.c | 99 --- src/interfaces/libpq/libpq-fe.h | 3 +- 3 files changed, 57 insertions(+), 71 deletions(-) diff --git a/doc/src/sgml/libpq.sgml b/doc/src/sgml/libpq.sgml index c1d1b6b2db..6221e359f0 100644 --- a/doc/src/sgml/libpq.sgml +++ b/doc/src/sgml/libpq.sgml @@ -1576,17 +1576,27 @@ postgresql://%2Fvar%2Flib%2Fpostgresql/dbname target_session_attrs + +The supported options for this parameter are, any and +read-write.
The default value of this parameter, +any, regards all connections as acceptable. +If multiple hosts were specified in the connection string, based on the +specified value, any remaining servers will be tried before confirming +succesful connection or failure. + + If this parameter is set to read-write, only a connection in which read-write transactions are accepted by default -is considered acceptable. The query -SHOW transaction_read_only will be sent upon any -successful connection; if it returns on, the connection -will be closed. If multiple hosts were specified in the connection -string, any remaining servers will be tried just as if the connection -attempt had failed. The default value of this parameter, -any, regards all connections as acceptable. - +is considered acceptable. + + + +To find out whether the server supports read-write transactions are not, +query SHOW transaction_read_only will be sent upon any +successful connection; if it returns on, it means server +doesn't support read-write transactions. + diff --git a/src/interfaces/libpq/fe-connect.c b/src/interfaces/libpq/fe-connect.c index e3bf6a7449..c68448786d 100644 --- a/src/interfaces/libpq/fe-connect.c +++ b/src/interfaces/libpq/fe-connect.c @@ -3188,6 +3188,43 @@ keep_going: /* We will come back to here until there is return PGRES_POLLING_WRITING; } +conn->status = CONNECTION_CHECK_TARGET; +goto keep_going; + } + + case CONNECTION_SETENV: + { + +/* + * Do post-connection housekeeping (only needed in protocol 2.0). + * + * We pretend that the connection is OK for the duration of these + * queries. 
+ */ +conn->status = CONNECTION_OK; + +switch (pqSetenvPoll(conn)) +{ + case PGRES_POLLING_OK: /* Success */ + break; + + case PGRES_POLLING_READING: /* Still going */ + conn->status = CONNECTION_SETENV; + return PGRES_POLLING_READING; + + case PGRES_POLLING_WRITING: /* Still going */ + conn->status = CONNECTION_SETENV; + return PGRES_POLLING_WRITING; + + default: + goto error_return; +} + } + + /* Intentional fall through */ + + case CONNECTION_CHECK_TARGET: + { /* * If a read-write connection is required, see if we have one. * @@ -3229,68 +3266,6 @@ keep_going: /* We will come back to here until there is return PGRES_POLLING_OK; } - case CONNECTION_SETENV: - - /* - * Do post-connection housekeeping (only needed in protocol 2.0). - * - * We pretend that the connection is OK for the duration of these - * queries. -
Re: Transaction commits VS Transaction commits (with parallel) VS query mean time
On Wed, Mar 27, 2019 at 11:27 PM Amit Kapila wrote: > On Wed, Mar 27, 2019 at 6:53 AM Haribabu Kommi > wrote: > > On Tue, Mar 26, 2019 at 9:08 PM Amit Kapila > wrote: > >> > >> As part of this thread, maybe we can > >> just fix the case of the parallel cooperating transaction. > > > > > > With the current patch, all the parallel implementation transaction are > getting > > skipped, in my tests parallel workers are the major factor in the > transaction > > stats counter. Even before parallelism, the stats of the autovacuum and > etc > > are still present but their contribution is not greatly influencing the > stats. > > > > I agree with you in fixing the stats with parallel workers and improve > stats. > > > > I was proposing to fix only the transaction started with > StartParallelWorkerTransaction by using is_parallel_worker flag as > discussed above. I understand that it won't completely fix the > problem reported by you, but it will be a step in that direction. My > main worry is that if we fix it the way you are proposing and we also > invent a new way to deal with all other internal transactions, then > the fix done by us now needs to be changed/reverted. Note, that this > fix needs to be backpatched as well, so we should avoid doing any fix > which needs to be changed or reverted. >

I tried the approach you suggested: the actual parallel-worker transactions are not counted; their resources are simply released without touching the counters. Even so, the counts are not reduced by much.

HEAD - with 4 parallel workers, running the query generates 13 stats (4 * 3 + 1)
Patch - with 4 parallel workers, running the query generates 9 stats (4 * 2 + 1)
Old approach patch - with 4 parallel workers, running the query generates 1 stat (1)

Currently a parallel worker starts a transaction three times, in the following places:
1. InitPostgres
2. ParallelWorkerMain (2)

With the attached patch, we remove one of the counts from ParallelWorkerMain.
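As a sanity check on the per-query stat counts quoted above (13, 9, and 1), they can be modeled as workers times transactions-per-worker, plus one for the leader. This is a hypothetical helper written for illustration, not PostgreSQL code:

```c
#include <assert.h>

/* Hypothetical model (not PostgreSQL code) of how many transaction-commit
 * stat records one parallel query produces: each worker reports one commit
 * per transaction it starts, and the leader always reports exactly one. */
static int
commit_stats_per_query(int nworkers, int txns_per_worker)
{
    return nworkers * txns_per_worker + 1;  /* +1 for the leader's commit */
}
```

With 4 workers this reproduces the three figures above: 3 transactions per worker gives 13 (HEAD), 2 gives 9 (this patch), and 0 gives 1 (the old approach).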
Regards, Haribabu Kommi Fujitsu Australia 0001-Avoid-counting-parallel-worker-transactions-stats_v3.patch Description: Binary data
Re: [HACKERS] Block level parallel vacuum
On Wed, Mar 27, 2019 at 1:31 AM Masahiko Sawada wrote: > On Tue, Mar 26, 2019 at 10:19 AM Haribabu Kommi > wrote: > > > > > > + for (i = 0; i < nindexes; i++) > > + { > > + LVIndStats *s = &(copied_indstats[i]); > > + > > + if (s->updated) > > + lazy_update_index_statistics(Irel[i], &(s->stats)); > > + } > > + > > + pfree(copied_indstats); > > > > why can't we use the shared memory directly to update the stats once all > the workers > > are finished, instead of copying them to a local memory? > > Since we cannot use heap_inplace_update() which is called by > vac_update_relstats() during parallel mode I copied the stats. Is that > safe if we destroy the parallel context *after* exited parallel mode? > OK, understood the reason behind the copy. Thanks for the updated patches. I reviewed them again and they are fine. I marked the patch as "ready for committer". Regards, Haribabu Kommi Fujitsu Australia
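The copy-then-exit pattern Sawada-san describes, where heap_inplace_update() cannot run inside parallel mode so the leader first snapshots the per-index stats out of the DSM segment, can be sketched as follows. The struct and function names here are illustrative stand-ins, not the patch's actual code:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>
#include <string.h>

/* Illustrative stand-in for the per-index stats kept in shared memory
 * (the real patch uses LVIndStats / IndexBulkDeleteResult). */
typedef struct IndexStats
{
    bool updated;       /* did any worker fill in stats for this index? */
    long num_pages;     /* stand-in for the real statistics payload */
} IndexStats;

/* Copy the stats out of shared memory so the caller can destroy the
 * parallel context / exit parallel mode first, and only then apply the
 * pg_class updates from the private copy. */
static IndexStats *
copy_stats_from_shared(const IndexStats *shared, int nindexes)
{
    IndexStats *copied = malloc(sizeof(IndexStats) * (size_t) nindexes);

    memcpy(copied, shared, sizeof(IndexStats) * (size_t) nindexes);
    return copied;      /* caller updates catalogs from this, then frees it */
}
```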
Re: Transaction commits VS Transaction commits (with parallel) VS query mean time
On Tue, Mar 26, 2019 at 9:08 PM Amit Kapila wrote: > On Mon, Mar 25, 2019 at 6:55 PM Haribabu Kommi > wrote: > > > > > > Thanks to everyone for their opinions and suggestions to improve. > > > > Without parallel workers, there aren't many internal implementation > > logic code that causes the stats to bloat. Parallel worker stats are not > > just the transactions, it update other stats also, for eg; seqscan stats > > of a relation. I also eagree that just fixing it for transactions may not > > be a proper solution. > > > > Using of XID data may not give proper TPS of the instance as it doesn't > > involve the read only transactions information. > > > > How about adding additional two columns that provides all the internal > > and background worker transactions into that column? > > > > I can see the case where the users will be interested in application > initiated transactions, so if we want to add new columns, it could be > to display that information and the current columns can keep > displaying the overall transactions in the database. Here, I think > we need to find a way to distinguish between internal and > user-initiated transactions. OTOH, I am not sure adding new columns > is better than changing the meaning of existing columns. We can go > either way based on what others feel good, but I think we can do that > as a separate head-only feature.

I agree with you. Adding new columns definitely needs more discussion about which processes should be skipped, what needs to be added, and so on.

> As part of this thread, maybe we can > just fix the case of the parallel cooperating transaction. >

With the current patch, all the parallel-implementation transactions are skipped; in my tests, parallel workers are the major factor in the transaction stats counter. Even before parallelism, the stats of autovacuum etc. are still present, but their contribution does not greatly influence the stats.
I agree with you on fixing the stats for parallel workers and improving them.

Regards, Haribabu Kommi Fujitsu Australia
Re: MSVC Build support with visual studio 2019
On Wed, Mar 27, 2019 at 3:03 AM Andrew Dunstan < andrew.duns...@2ndquadrant.com> wrote: > > On 3/20/19 8:36 PM, Haribabu Kommi wrote: > > Hi Hackers, > > > > Here I attached a patch that supports building of PostgreSQL with VS > 2019. > > VS 2019 is going to release on Apr 2nd 2019, it will be good if version > 12 > > supports compiling. The attached for is for review, it may needs some > > updates > > once the final version is released. > > > > Commit d9dd406fe281d22d5238d3c26a7182543c711e74 has reduced the > > minimum visual studio support to 2013 to support C99 standards, > > because of this > > reason, the current attached patch cannot be backpatched as it is. > > > > I can provide a separate back branches patch later once this patch > > comes to a stage of commit. Currently all the supported branches are > > possible to compile with VS 2017. > > > > comments? > > > > > > > I have verified that this works with VS2019. > > > There are a few typos in the comments that need cleaning up. >

Thanks for the review. I corrected the typos in the comments; hopefully I covered everything. Attached is the updated patch. Once the final VS 2019 is released, I will check whether the patch needs any more updates.

Regards, Haribabu Kommi Fujitsu Australia

0001-Support-building-with-visual-studio-2019_v2.patch Description: Binary data
Re: pg_basebackup ignores the existing data directory permissions
On Tue, Mar 26, 2019 at 1:27 PM Michael Paquier wrote: > On Sun, Mar 24, 2019 at 10:30:47PM +1100, Haribabu Kommi wrote: > > With the above additional options, the pg_basebackup is able to control > > the access permissions of the backup files, but when it comes to tar mode > > all the files are sent from the server and stored as it is in backup, to > > support > > tar mode group access mode control, the BASE BACKUP protocol is > > enhanced with new option GROUP_MODE 'none' or GROUP_MODE 'group' > > to control the file permissions before they are sent to backup. Sending > > GROUP_MODE to the server depends on the -g option received to the > > pg_basebackup utility. >

Thanks for the review.

> Do we really want to extend the replication protocol to control that? >

As the backup data is sent in tar format, and pg_basebackup may also store it in tar format, I feel changing the permissions at tar creation time is easier than regenerating the received tar with different permissions on the pg_basebackup side. For formats other than tar, changing only pg_basebackup can support group access permissions for the standby directory independently of the source.

> I am really questioning if we should keep this stuff isolated within > pg_basebackup or not. At the same time, it may be confusing to have > BASE_BACKUP only use the permissions inherited from the data > directory, so some input from folks maintaining an external backup > tool is welcome. >

It would be good to hear what the authors of external backup tools think of this change.

Regards, Haribabu Kommi Fujitsu Australia
Re: [HACKERS] Block level parallel vacuum
On Fri, Mar 22, 2019 at 4:06 PM Masahiko Sawada wrote: > > Attached the updated version patch. 0001 patch allows all existing > vacuum options an boolean argument. 0002 patch introduces parallel > lazy vacuum. 0003 patch adds -P (--parallel) option to vacuumdb > command. >

Thanks for sharing the updated patches.

0001 patch:

+PARALLEL [ N ]

This patch adds the PARALLEL syntax but no explanation; I see it is explained in 0002. Not a problem, just mentioning it.

+ Specifies parallel degree for PARALLEL option. The + value must be at least 1. If the parallel degree + integer is omitted, then + VACUUM decides the number of workers based on number of + indexes on the relation which further limited by + .

Can we also add some more details about leader participation? Parallel workers come into the picture only when there are at least two indexes on the table.

+ /* + * Do post-vacuum cleanup and statistics update for each index if + * we're not in parallel lazy vacuum. If in parallel lazy vacuum, do + * only post-vacum cleanup and then update statistics after exited + * from parallel mode. + */ + lazy_vacuum_all_indexes(vacrelstats, Irel, nindexes, indstats, + lps, true);

How about renaming the above function, since it does the cleanup also? lazy_vacuum_or_cleanup_all_indexes?

+ if (!IsInParallelVacuum(lps)) + { + /* + * Update index statistics. If in parallel lazy vacuum, we will + * update them after exited from parallel mode. + */ + lazy_update_index_statistics(Irel[idx], stats[idx]); + + if (stats[idx]) + pfree(stats[idx]); + }

The above check in lazy_vacuum_all_indexes can be combined with the outer if check where the memcpy happens. I still feel the logic around the stats makes it a little bit complex.

+ if (IsParallelWorker()) + msg = "scanned index \"%s\" to remove %d row versions by parallel vacuum worker"; + else + msg = "scanned index \"%s\" to remove %d row versions";

I feel this way of building the error message may not be picked up for translation.
Is there any problem with duplicating the entire ereport message, with the changed text?

+ for (i = 0; i < nindexes; i++) + { + LVIndStats *s = &(copied_indstats[i]); + + if (s->updated) + lazy_update_index_statistics(Irel[i], &(s->stats)); + } + + pfree(copied_indstats);

Why can't we use the shared memory directly to update the stats once all the workers are finished, instead of copying them to local memory?

+ tab->at_params.nworkers = 0; /* parallel lazy autovacuum is not supported */

The user is not required to provide a worker count for parallel vacuum to work, so just setting the above parameter doesn't stop parallel workers; the user must also pass the PARALLEL option. Mentioning that here would be helpful later when we start supporting it, and for anyone reading the code.

Regards, Haribabu Kommi Fujitsu Australia
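The degree-selection behavior described in the quoted documentation hunk above (workers chosen from the number of indexes when no degree is given, capped by max_parallel_maintenance_workers, and no parallelism with fewer than two indexes) could be sketched roughly like this. The function name, the single-index short-circuit, and the exact capping rule are our assumptions for illustration, not the patch's actual code:

```c
#include <assert.h>

/* Rough sketch of the proposed worker-count rule (names and exact rule
 * are assumptions): with fewer than two indexes the leader works alone;
 * otherwise use the requested degree, or one worker per index when the
 * degree is omitted, capped by max_parallel_maintenance_workers. */
static int
choose_parallel_vacuum_workers(int nindexes, int requested, int max_workers)
{
    int nworkers;

    if (nindexes < 2)
        return 0;               /* leader alone handles a single index */
    nworkers = (requested > 0) ? requested : nindexes;
    return (nworkers > max_workers) ? max_workers : nworkers;
}
```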
Re: Transaction commits VS Transaction commits (with parallel) VS query mean time
On Sat, Mar 23, 2019 at 11:10 PM Amit Kapila wrote: > On Sat, Mar 23, 2019 at 9:50 AM Alvaro Herrera > wrote: > > > > On 2019-Mar-23, Amit Kapila wrote: > > > > > I think some users might also be interested in the write transactions > > > happened in the system, basically, those have consumed xid. > > > > Well, do they really want to *count* these transactions, or is it enough > > to keep an eye on the "age" of some XID column? Other than for XID > > freezing purposes, I don't see such internal transactions as very > > interesting. > > > > That's what I also had in mind. I think doing anything more than just > fixing the count for the parallel cooperating transaction by parallel > workers doesn't seem intuitive to me. I mean if we want we can commit > the fix such that all supporting transactions by parallel worker > shouldn't be counted, but I am not able to convince myself that that > is the good fix. Instead, I think rather than fixing that one case we > should think more broadly about all the supportive transactions > happening in the various operations. Also, as that is a kind of > behavior change, we should discuss that as a separate topic. >

Thanks to everyone for their opinions and suggestions for improvement.

Without parallel workers, there isn't much internal implementation logic that causes the stats to bloat. Parallel worker stats are not just transactions; they update other stats too, e.g. the seqscan stats of a relation. I also agree that fixing this only for transactions may not be a proper solution.

Using XID data may not give the proper TPS of the instance, as it doesn't include information about read-only transactions.

How about adding two additional columns that account for all the internal and background worker transactions?

Regards, Haribabu Kommi Fujitsu Australia
Re: Libpq support to connect to standby server as priority
On Fri, Mar 22, 2019 at 6:07 PM Haribabu Kommi wrote: > > On Fri, Mar 22, 2019 at 7:32 AM Haribabu Kommi > wrote: > >> >> On Fri, Mar 22, 2019 at 6:57 AM Robert Haas >> wrote: >> >>> On Thu, Mar 21, 2019 at 2:26 AM Haribabu Kommi >>> wrote: >>> > Based on the above new options that can be added to >>> target_session_attrs, >>> > >>> > primary - it is just an alias to the read-write option. >>> > standby, prefer-standby - These options should check whether server is >>> running in recovery mode or not >>> > instead of checking whether server accepts read-only connections or >>> not? >>> >>> I think it will be best to have one set of attributes that check >>> default_transaction_read_only and a differently-named set that check >>> pg_is_in_recovery(). For each, there should be one value that looks >>> for a 'true' return and one value that looks for a 'false' return and >>> perhaps values that accept either but prefer one or the other. >>> >>> IOW, there's no reason to make primary an alias for read-write. If >>> you want read-write, you can just say read-write. But we can make >>> 'primary' or 'master' look for a server that's not in recovery and >>> 'standby' look for one that is. >>> >> >> OK, I agree with opinion. I will produce a patch for those new options. >> > > Here I attached WIP patch for the new options along with other older > patches. > The basic cases are working fine, doc and tests are missing. > > Currently this patch doesn't implement the GUC_REPORT for recovery mode > yet. I am yet to optimize the complex if check. >

Except for the in_hotstandby GUC_REPORT, the rest of the changes are implemented; updated patches are attached.

While going through the old patch where GUC_REPORT was implemented, Tom commented on the logic of sending a signal to all backends to process the hot-standby exit with SIGHUP; if we add the logic of updating the GUC variable value in SIGHUP, we may need to change all the SIGHUP handler code paths.
It is also possible that there is no need to update the variable for any processes other than backends. Adding a new SIGUSR1 reason to check and update the GUC variable could work for most backends and background workers. Opinions?

Regards, Haribabu Kommi Fujitsu Australia

Note - the attachment order may sometimes be wrong.

0001-Restructure-the-code-to-remove-duplicate-code.patch Description: Binary data 0002-New-TargetSessionAttrsType-enum.patch Description: Binary data 0003-Make-transaction_read_only-as-GUC_REPORT-varaible.patch Description: Binary data 0005-New-read-only-target_session_attrs-type.patch Description: Binary data 0004-New-prefer-read-target_session_attrs-type.patch Description: Binary data 0006-Primary-prefer-standby-and-standby-options.patch Description: Binary data 0007-New-function-to-rejecting-the-checked-write-connecti.patch Description: Binary data
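Robert's framing earlier in this thread, one set of target_session_attrs values keyed off SHOW transaction_read_only and a differently named set keyed off the server's recovery state, can be summarized as a small decision table. The enum and function below are illustrative only, not libpq code:

```c
#include <assert.h>
#include <stdbool.h>

/* Illustrative summary (not libpq code) of the option semantics discussed
 * in this thread: read-write/read-only are decided from the reported
 * transaction_read_only setting, while primary/standby are decided from
 * whether the server is in recovery. */
typedef enum
{
    TSA_ANY,
    TSA_READ_WRITE,
    TSA_READ_ONLY,
    TSA_PRIMARY,
    TSA_STANDBY
} TargetSessionAttrs;

static bool
session_acceptable(TargetSessionAttrs attr,
                   bool transaction_read_only, bool in_recovery)
{
    switch (attr)
    {
        case TSA_READ_WRITE:
            return !transaction_read_only;
        case TSA_READ_ONLY:
            return transaction_read_only;
        case TSA_PRIMARY:
            return !in_recovery;
        case TSA_STANDBY:
            return in_recovery;
        default:
            return true;        /* "any" accepts every server */
    }
}
```

Note how the two axes are independent: a primary with default_transaction_read_only = on would satisfy TSA_PRIMARY but not TSA_READ_WRITE, which is exactly why "primary" should not be a mere alias for "read-write".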
Re: current_logfiles not following group access and instead follows log_file_mode permissions
On Sun, Mar 24, 2019 at 11:16 PM Michael Paquier wrote: > On Fri, Mar 22, 2019 at 01:01:44PM +0900, Michael Paquier wrote: > > On Fri, Mar 22, 2019 at 02:35:41PM +1100, Haribabu Kommi wrote: > > > Thanks for the correction. Yes, that is correct and it works fine. > > > > Thanks for double-checking. Are there any objections with this patch? > > Done and committed down to v11 where group access has been added. > There could be an argument to do the same in v10, but as the root of > the problem is the interaction between a data folder using 0640 as > base mode for files and log_file_mode being more restrictive, then it > cannot apply to v10. > > After testing and reviewing the patch, I noticed that all versions > sent up to now missed two things done by logfile_open(): > - Buffering is line-buffered. For current_logfiles it may not matter > much as the contents are first written into a temporary file and then > the file is renamed, but for debugging having the insurance of > consistent contents is nice even for the temporary file. > - current_logfiles uses \r\n. While it does not have a consequence > for the parsing of the file by pg_current_logfile, it breaks the > readability of the file on Windows, which is not nice. > So I have kept the patch with the previous defaults for consistency. > Perhaps they could be changed, but the current set is a good set. >

Thanks Michael and others. This really helps in choosing restrictive log file permissions even when the data directory has group access enabled.

Regards, Haribabu Kommi Fujitsu Australia
Re: pg_basebackup ignores the existing data directory permissions
On Sat, Mar 23, 2019 at 2:23 AM Peter Eisentraut < peter.eisentr...@2ndquadrant.com> wrote: > On 2019-03-22 05:00, Michael Paquier wrote: > > On Fri, Mar 22, 2019 at 02:45:24PM +1100, Haribabu Kommi wrote: > >> How about letting the pg_basebackup to decide group permissions of the > >> standby directory irrespective of the primary directory permissions. > >> > >> Default - permissions are same as primary > >> --allow-group-access - standby directory have group access permissions > >> --no-group--access - standby directory doesn't have group permissions > >> > >> The last two options behave irrespective of the primary directory > >> permissions. > > > > Yes, I'd imagine that we would want to be able to define three > > different behaviors, by either having a set of options, or a simple > > option with a switch, say --group-access: > > - "inherit" causes the permissions to be inherited from the source > > node, and that's the default. > > - "none" enforces the default 0700/0600. > > - "group" enforces group read access. > > Yes, we could use those three behaviors. >

Thanks for all your opinions; here I attached an updated patch as discussed. A new option, -g / --group-mode, is added to pg_basebackup to specify the group access permissions:

inherit - same permissions as the source instance (default)
none - no group permissions, irrespective of the source instance
group - group permissions, irrespective of the source instance

With these additional options, pg_basebackup can control the access permissions of the backup files. In tar mode, however, all the files are sent from the server and stored as-is in the backup, so to control group access in tar mode the BASE BACKUP protocol is extended with a new option, GROUP_MODE 'none' or GROUP_MODE 'group', to adjust the file permissions before they are sent. Whether GROUP_MODE is sent to the server depends on the -g option given to the pg_basebackup utility.

comments?
Regards, Haribabu Kommi Fujitsu Australia 0001-New-pg_basebackup-g-option-to-control-the-group-acce.patch Description: Binary data
Re: Libpq support to connect to standby server as priority
On Fri, Mar 22, 2019 at 7:32 AM Haribabu Kommi wrote: > > On Fri, Mar 22, 2019 at 6:57 AM Robert Haas wrote: > >> On Thu, Mar 21, 2019 at 2:26 AM Haribabu Kommi >> wrote: >> > Based on the above new options that can be added to >> target_session_attrs, >> > >> > primary - it is just an alias for the read-write option. >> > standby, prefer-standby - These options should check whether the server is >> running in recovery mode >> > instead of checking whether the server accepts read-only connections? >> >> I think it will be best to have one set of attributes that check >> default_transaction_read_only and a differently-named set that check >> pg_is_in_recovery(). For each, there should be one value that looks >> for a 'true' return and one value that looks for a 'false' return and >> perhaps values that accept either but prefer one or the other. >> >> IOW, there's no reason to make primary an alias for read-write. If >> you want read-write, you can just say read-write. But we can make >> 'primary' or 'master' look for a server that's not in recovery and >> 'standby' look for one that is. >> > > OK, I agree with that opinion. I will produce a patch for those new options. > Attached is a WIP patch for the new options, along with the other earlier patches. The basic cases work fine; documentation and tests are still missing. This patch doesn't implement GUC_REPORT for recovery mode yet, and I still need to optimize the complex if-check. Regards, Haribabu Kommi 0001-Restructure-the-code-to-remove-duplicate-code.patch Description: Binary data 0002-New-TargetSessionAttrsType-enum.patch Description: Binary data 0005-New-read-only-target_session_attrs-type.patch Description: Binary data 0003-Make-transaction_read_only-as-GUC_REPORT-varaible.patch Description: Binary data 0004-New-prefer-read-target_session_attrs-type.patch Description: Binary data 0006-Primary-prefer-standby-and-standby-options.patch Description: Binary data
Re: pg_basebackup ignores the existing data directory permissions
On Fri, Mar 22, 2019 at 11:42 AM Michael Paquier wrote: > On Thu, Mar 21, 2019 at 02:56:24PM -0400, Robert Haas wrote: > > On Tue, Mar 19, 2019 at 2:29 AM Michael Paquier > wrote: > >> Hm. We have been assuming that the contents of a base backup inherit > >> the permission of the source when using pg_basebackup because this > >> allows users to keep nodes in a consistent state without deciding > >> which option to use. Do you mean that you would like to enforce the > >> permissions of only the root directory if it exists? Or the root > >> directory with all its contents? The former may be fine. The latter > >> is definitely not. > > > > Why not? > > Because since we released v11 we respect the permissions set on > the source instance from which the backup is taken for all the folder's > content. If we begin to enforce it we may break some cases. If a new > option is introduced, it seems to me that the default should remain > what has been released with v11, but that it is additionally possible > to enforce group permissions or non-group permissions at will on the > backup taken for all the contents in the data folder, including the > root folder, created manually or not before running the pg_basebackup > command. > How about letting pg_basebackup decide the group permissions of the standby directory irrespective of the primary directory permissions? Default - permissions are the same as the primary; --allow-group-access - the standby directory has group access permissions; --no-group-access - the standby directory doesn't have group permissions. The last two options behave irrespective of the primary directory permissions. Opinions? Regards, Haribabu Kommi Fujitsu Australia
Re: current_logfiles not following group access and instead follows log_file_mode permissions
On Fri, Mar 22, 2019 at 12:24 PM Michael Paquier wrote: > On Thu, Mar 21, 2019 at 12:52:14PM +1100, Haribabu Kommi wrote: > > Earlier attached patch is wrong. > > - oumask = umask(pg_file_create_mode); > + oumask = umask(pg_mode_mask); > Indeed that was wrong. > > > Correct patch attached. Sorry for the inconvenience. > > This looks better for the umask setting, still it could be more > simple. > > #include > - > +#include "common/file_perm.h" > #include "lib/stringinfo.h" > Nit: it is better for readability to keep an empty line between the > system includes and the Postgres ones. > > A second thing, more important, is that you can reset umask just after > opening the file, as attached. This way there is no need to reset the > umask in all the code paths leaving update_metainfo_datafile(). Does > that look fine to you? > Thanks for the correction, Yes, that is correct and it works fine. Regards, Haribabu Kommi Fujitsu Australia
Re: Libpq support to connect to standby server as priority
On Fri, Mar 22, 2019 at 6:57 AM Robert Haas wrote: > On Thu, Mar 21, 2019 at 2:26 AM Haribabu Kommi > wrote: > > Based on the above new options that can be added to target_session_attrs, > > > > primary - it is just an alias for the read-write option. > > standby, prefer-standby - These options should check whether the server is > running in recovery mode > > instead of checking whether the server accepts read-only connections? > > I think it will be best to have one set of attributes that check > default_transaction_read_only and a differently-named set that check > pg_is_in_recovery(). For each, there should be one value that looks > for a 'true' return and one value that looks for a 'false' return and > perhaps values that accept either but prefer one or the other. > > IOW, there's no reason to make primary an alias for read-write. If > you want read-write, you can just say read-write. But we can make > 'primary' or 'master' look for a server that's not in recovery and > 'standby' look for one that is. > OK, I agree with that opinion. I will produce a patch for those new options. Regards, Haribabu Kommi Fujitsu Australia
Re: pg_basebackup ignores the existing data directory permissions
On Thu, Mar 21, 2019 at 3:02 AM Peter Eisentraut < peter.eisentr...@2ndquadrant.com> wrote: > On 2019-03-19 08:34, Haribabu Kommi wrote: > > How about the following change? > > > > pg_basebackup --> copies the contents of the src directory (with group > > access) > > and even the root directory permissions. > > > > pg_basebackup --no-group-access --> copies the contents of the src > > directory > > (with no group access) even for the root directory. > > I'm OK with that. Perhaps a positive option --allow-group-access would > also be useful. > Do you want both --allow-group-access and --no-group-access options added to pg_basebackup, where omitting --allow-group-access automatically means --no-group-access? Or do you want pg_basebackup to decide the group access independently, irrespective of the source directory? That is, even if the source directory has no group access, pg_basebackup --allow-group-access makes the standby directory group-accessible; and if the source directory has group access, pg_basebackup --no-group-access creates the standby directory without group access; while the default behavior of pg_basebackup is simply to copy the source directory's permissions? > Let's make sure the behavior of these options is aligned with initdb. > And write tests for each variant. > OK. Regards, Haribabu Kommi Fujitsu Australia
Re: Pluggable Storage - Andres's take
On Fri, Mar 22, 2019 at 5:16 AM Andres Freund wrote: > Hi, > > Attached is a version of just the first patch. I'm still updating it, > but it's getting closer to commit: > > - There were no tests testing EPQ interactions with DELETE, and only an > accidental test for EPQ in UPDATE with a concurrent DELETE. I've added > tests. Plan to commit that ahead of the big change. > > - I was pretty unhappy about how the EPQ integration looked before, I've > changed that now. > > I still wonder if we should restore EvalPlanQualFetch and move the > table_lock_tuple() calls in ExecDelete/Update into it. But it seems > like it'd not gain that much, because there's custom surrounding code, > and it's not that much code. > > - I changed heapam_tuple_tuple to return *WouldBlock rather than just > the last result. I think that's one of the reason Haribabu had > neutered a few asserts. > > - I moved comments from heapam.h to tableam.h where appropriate > > - I updated the name of HeapUpdateFailureData to TM_FailureData, > HTSU_Result to TM_Result, TM_Results members now properly distinguish > between update vs modifications (delete & update). > > - I separated the HEAP_INSERT_ flags into TABLE_INSERT_* and > HEAP_INSERT_ with the latter being a copy of TABLE_INSERT_ with the > sole addition of _SPECULATIVE. table_insert_speculative callers now > don't specify that anymore. > > > Pending work: > - Wondering if table_insert/delete/update should rather be > table_tuple_insert etc. Would be a bit more consistent with the > callback names, but a bigger departure from existing code. 
> > - I'm not yet happy with TableTupleDeleted computation in heapam.c, I > want to revise that further > > - formatting > > - commit message > > - a few comments need a bit of polishing (ExecCheckTIDVisible, > heapam_tuple_lock) > > - Rename TableTupleMayBeModified to TableTupleOk, but also probably a > s/TableTuple/TableMod/ > > - I'll probably move TUPLE_LOCK_FLAG_LOCK_* into tableam.h > > - two more passes through the patch > Thanks for the corrections. > On 2019-03-21 15:07:04 +1100, Haribabu Kommi wrote: > > As you are modifying the 0003 patch for modify API's, I went and reviewed > > the > > existing patch and found a couple of corrections that are needed, in case > you > > haven't > > taken care of them already. > > Some I have... > > > > > + /* Update the tuple with table oid */ > > + slot->tts_tableOid = RelationGetRelid(relation); > > + if (slot->tts_tableOid != InvalidOid) > > + tuple->t_tableOid = slot->tts_tableOid; > > > > Setting slot->tts_tableOid is not required in this function; > > the check happens right after the set. The above code is present in both > > heapam_heap_insert and heapam_heap_insert_speculative. > > I'm not following? Those functions are independent? > In those functions, slot->tts_tableOid is set and then the very next statement checks whether it is invalid. Callers of table_insert should have already set it, so is setting the value and then checking it on the next line really required? The value cannot be InvalidOid. > > > + slot->tts_tableOid = RelationGetRelid(relation); > > > > In heapam_heap_update, I don't think there is a need to update > > slot->tts_tableOid. > > Why? > slot->tts_tableOid should already have been updated before the call to heap_update; is setting it again after heap_update required? I also noticed that slot->tts_tableOid is set after the table_insert_XXX calls in the ExecInsert function. Is this to make sure the AM hasn't modified that value? 
> > + default: > > + elog(ERROR, "unrecognized heap_update status: %u", result); > > > > heap_update --> table_update? > > > > > > + default: > > + elog(ERROR, "unrecognized heap_delete status: %u", result); > > > > same as above? > > I've fixed that in a number of places. > > > > + /*hari FIXME*/ > > + /*Assert(result != HeapTupleUpdated && hufd.traversed);*/ > > > > Remove the commented-out code in both the ExecDelete and ExecUpdate functions. > > I don't think that's the right fix. I've refactored that code > significantly now, and restored the assert in a, imo, correct version. > > OK. > > + /**/ > > + if (epqreturnslot) > > + { > > + *epqreturnslot = epqslot; > > + return NULL; > > + } > > > > comment update missed? > > Well, you'd deleted a comment around there ;). I've added something back > now... > This is not the only problem I could have introduced; all the comment issues listed were introduced by me ;). Regards, Haribabu Kommi Fujitsu Australia
Re: Libpq support to connect to standby server as priority
On Wed, Mar 20, 2019 at 5:01 PM Tsunakawa, Takayuki < tsunakawa.ta...@jp.fujitsu.com> wrote: > From: Robert Haas [mailto:robertmh...@gmail.com] > > I really dislike having both target_session_attrs and > > target_server_type. It doesn't solve any actual problem. master, > > slave, prefer-slave, or whatever you like could be put in > > target_session_attrs just as easily, and then we wouldn't end up with > > two keywords doing closely related things. 'master' is no more or > > less a server attribute than 'read-write'. > > Hmm, that may be OK. At first, I felt it strange to treat the server type > (primary or standby) as a session attribute. But we can see the server > type as one attribute in a sense that a session is established for. I'm > inclined to agree with: > > target_session_attr = {any | read-write | read-only | prefer-read | > primary | standby | prefer-standby} > Thanks for your suggestions. Based on the above new options that can be added to target_session_attrs: primary - just an alias for the read-write option; standby, prefer-standby - should these options check whether the server is running in recovery mode, rather than whether the server accepts read-only connections? Regards, Haribabu Kommi Fujitsu Australia
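If the consolidated list Tsunakawa-san quotes were adopted, the choice of server would be driven entirely by target_session_attrs in the libpq connection string. The fragments below sketch what that could look like; note that only 'any' and 'read-write' exist in released libpq at this point, and the other values ('primary', 'standby', 'prefer-standby', 'prefer-read', 'read-only') are still proposals in this thread. Host names are placeholders.

```
# accept whichever listed host connects first (existing behavior)
host=node1,node2 port=5432 target_session_attrs=any

# only a server accepting read-write sessions (existing behavior)
host=node1,node2 port=5432 target_session_attrs=read-write

# only a server that is not in recovery (proposed 'primary' value)
host=node1,node2 port=5432 target_session_attrs=primary

# prefer a server in recovery, else fall back to any (proposed)
host=node1,node2 port=5432 target_session_attrs=prefer-standby
```

Under Robert's scheme, the read-write/read-only/prefer-read values would test default_transaction_read_only, while primary/standby/prefer-standby would test pg_is_in_recovery().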
Re: Transaction commits VS Transaction commits (with parallel) VS query mean time
On Wed, Mar 20, 2019 at 7:38 PM Amit Kapila wrote: > On Sun, Feb 10, 2019 at 10:54 AM Haribabu Kommi > wrote: > On Sat, Feb 9, 2019 at 4:07 PM Amit Kapila > wrote: > >> > >> I don't think so. It seems to me that we should consider it as a > >> single transaction. Do you want to do the leg work for this and try > >> to come up with a patch? On a quick look, I think we might want to > >> change AtEOXact_PgStat so that the commits for parallel workers are > >> not considered. I think the caller has that information. > > > > > > I tried to fix it by adding a check for whether we are in a parallel worker and, > based on it, > > counting them into stats. Patch attached. > > > Thanks for the review. > @@ -2057,14 +2058,18 @@ AtEOXact_PgStat(bool isCommit) > { > .. > + /* Don't count parallel worker transactions into stats */ > + if (!IsParallelWorker()) > + { > + /* > + * Count transaction commit or abort. (We use counters, not just bools, > + * in case the reporting message isn't sent right away.) > + */ > + if (isCommit) > + pgStatXactCommit++; > + else > + pgStatXactRollback++; > + } > .. > } > > I wonder why you haven't used the 'is_parallel_worker' flag from the > caller, see CommitTransaction/AbortTransaction? The difference is > that if we use that then it will just avoid counting transaction for > the parallel work (StartParallelWorkerTransaction), otherwise, it > might miss the count of any other transaction we started in the > parallel worker for some intermediate work (for example, the > transaction we started to restore library and guc state). > I understand your comment. The current patch avoids counting every transaction that is used for the parallel operation, whereas with the 'is_parallel_worker' flag it would skip only the actual transaction that performed the parallel operation (all supporting transactions would still be counted). 
> I think it boils down to whether we want to avoid any transaction that > is started by a parallel worker or just the transaction which is > shared among leader and worker. It seems to me that currently, we do > count all the internal transactions (started with > StartTransactionCommand and CommitTransactionCommand) in this counter, > so we should just try to avoid the transaction for which state is > shared between leader and workers. > > Anyone else has any opinion on this matter? > As I said in my first mail, tools including pgAdmin4 display the TPS as a statistic on the dashboard. Those TPS values come from the xact_commit column of the pg_stat_database view. With parallel worker transactions included, the TPS will be a higher number, so users may mistake it for better throughput. This is the case with one of our tools: it changes the configuration randomly to find the best configuration for the server for a given workload. During our tests, with some configurations the TPS looked very good, but the overall performance was slow because the system had too few resources to keep up with that configuration. Opinions? Regards, Haribabu Kommi Fujitsu Australia
Re: Pluggable Storage - Andres's take
Hi, The psql \dA command currently doesn't show the type for access methods of type 'Table'. postgres=# \dA heap List of access methods Name | Type --+--- heap | (1 row) Attached is a simple patch that fixes the problem, with output as follows. postgres=# \dA heap List of access methods Name | Type --+--- heap | Table (1 row) The attached patch directly modifies the query that is sent to the server. Servers older than 12 don't have access methods of type 'Table', but the same query still works there because of the added CASE expression: SELECT amname AS "Name", CASE amtype WHEN 'i' THEN 'Index' WHEN 't' THEN 'Table' END AS "Type" FROM pg_catalog.pg_am ... Does anyone feel that a separate query is required for servers < 12? Regards, Haribabu Kommi Fujitsu Australia 0001-dA-to-show-Table-type-access-method.patch Description: Binary data
Re: Pluggable Storage - Andres's take
On Sat, Mar 16, 2019 at 5:43 PM Haribabu Kommi wrote: > > > On Sat, Mar 9, 2019 at 2:13 PM Andres Freund wrote: > >> Hi, >> >> While 0001 is pretty bulky, the interesting bits concentrate on a >> comparatively small area. I'd appreciate it if somebody could give the >> comments added in tableam.h a read (both on callbacks, and their >> wrappers, as they have different audiences). It'd make sense to first >> read the commit message, to understand the goal (and I'd obviously also >> appreciate suggestions for improvements there as well). >> >> I'm pretty happy with the current state of the scan patch. I plan to do >> two more passes through it (formatting, comment polishing, etc. I don't >> know of any functional changes needed), and then commit it, lest >> somebody objects. >> > > I found a couple of typos in the committed patch; the attached patch fixes them. > I am not sure about one typo, please check it once. > > And I reviewed the 0002 patch, which is pretty simple and can be > committed. > Since you are modifying the 0003 patch for the modify APIs, I went and reviewed the existing patch and found a couple of corrections that are needed, in case you haven't taken care of them already. + /* Update the tuple with table oid */ + slot->tts_tableOid = RelationGetRelid(relation); + if (slot->tts_tableOid != InvalidOid) + tuple->t_tableOid = slot->tts_tableOid; Setting slot->tts_tableOid is not required in this function; the check happens right after the set. The above code is present in both heapam_heap_insert and heapam_heap_insert_speculative. + slot->tts_tableOid = RelationGetRelid(relation); In heapam_heap_update, I don't think there is a need to update slot->tts_tableOid. + default: + elog(ERROR, "unrecognized heap_update status: %u", result); heap_update --> table_update? + default: + elog(ERROR, "unrecognized heap_delete status: %u", result); same as above? 
+ /*hari FIXME*/ + /*Assert(result != HeapTupleUpdated && hufd.traversed);*/ Remove the commented-out code in both the ExecDelete and ExecUpdate functions. + /**/ + if (epqreturnslot) + { + *epqreturnslot = epqslot; + return NULL; + } Was a comment update missed here? Regards, Haribabu Kommi Fujitsu Australia
Re: current_logfiles not following group access and instead follows log_file_mode permissions
On Thu, Mar 21, 2019 at 12:41 PM Haribabu Kommi wrote: > > On Wed, Mar 20, 2019 at 4:33 PM Michael Paquier > wrote: > >> And actually it seems to me that you have a race condition in that >> stuff. I think that you had better use umask(), then fopen, and then >> once again umask() to put back the previous permissions, removing the >> extra chmod() call. >> > > Changed the patch to use umask() instead of chmod() according to > your suggestion. > > updated patch attached. > The earlier attached patch was wrong; the correct patch is attached. Sorry for the inconvenience. Regards, Haribabu Kommi Fujitsu Australia 0001-Adjust-current_logfiles-file-permissions_v3.patch Description: Binary data
Re: MSVC Build support with visual studio 2019
On Thu, Mar 21, 2019 at 12:31 PM Michael Paquier wrote: > On Thu, Mar 21, 2019 at 09:47:02AM +0900, Michael Paquier wrote: > > When it comes to support newer versions of MSVC, we have come up > > lately to backpatch that down to two stable versions but not further > > down (see f2ab389 for v10 and v9.6), so it looks sensible to target > > v11 and v10 as well if we were to do it today, and v11/v12 if we do it > > in six months per the latest trends. > > By the way, you mentioned upthread that all the branches can compile > with MSVC 2017, but that's not actually the case for 9.5 and 9.4 if > you don't back-patch f2ab389 further down. > Commit f2ab389 was later back-patched down to 9.3 in commit 19acfd65. I guess that building the Windows installer for all the versions with the same Visual Studio may be the reason behind that back-patch. Regards, Haribabu Kommi Fujitsu Australia
Re: current_logfiles not following group access and instead follows log_file_mode permissions
On Wed, Mar 20, 2019 at 4:33 PM Michael Paquier wrote: > On Fri, Mar 15, 2019 at 06:51:37PM +1100, Haribabu Kommi wrote: > > IMO, this update is just a recommendation to the user, and it is > > still possible that the log file has stricter permissions > > even when the data directory allows group access. So I feel > > it is still better to set the permissions of current_logfiles > > to the database file permissions rather than the log file permissions. > > I was just reading this thread again, and the suggestion that > current_logfiles is itself not a log file is also a sensible > position. I was just looking at the patch that you sent at the top of > the thread here: > > https://www.postgresql.org/message-id/cajrrpgceotf1p7awoeqyd3pqr-0xkqg_herv98djbamj+na...@mail.gmail.com > Thanks for the review. > And actually it seems to me that you have a race condition in that > stuff. I think that you had better use umask(), then fopen, and then > once again umask() to put back the previous permissions, removing the > extra chmod() call. > I changed the patch to use umask() instead of chmod() according to your suggestion. Updated patch attached. Regards, Haribabu Kommi Fujitsu Australia 0001-Adjust-current_logfiles-file-permissions_v2.patch Description: Binary data
MSVC Build support with visual studio 2019
Hi Hackers, Attached is a patch that adds support for building PostgreSQL with VS 2019. VS 2019 is going to be released on Apr 2nd 2019, so it would be good if version 12 supported compiling with it. The attached patch is for review; it may need some updates once the final version is released. Commit d9dd406fe281d22d5238d3c26a7182543c711e74 reduced the minimum supported Visual Studio version to 2013 in order to support the C99 standard; for this reason, the current patch cannot be backpatched as-is. I can provide a separate back-branch patch later once this patch is close to commit. Currently, all the supported branches can be compiled with VS 2017. Comments? Regards, Haribabu Kommi Fujitsu Australia 0001-Support-building-with-visual-studio-2019.patch Description: Binary data
Re: Transaction commits VS Transaction commits (with parallel) VS query mean time
On Tue, Mar 19, 2019 at 6:47 PM Jamison, Kirk wrote: > Hi Hari-san, > > > > On Sunday, February 10, 2019 2:25 PM (GMT+9), Haribabu Kommi wrote: > > > I try to fix it by adding a check for parallel worker or not and based on it > > > count them into stats. Patch attached. > > > > > > With this patch, currently it doesn't count parallel worker transactions, and > > > rest of the stats like seqscan and etc are still get counted. IMO they still > > > may need to be counted as those stats represent the number of tuples > > > returned and etc. > > > > > > Comments? > > > > I took a look at your patch, and it’s pretty straightforward. > > However, currently the patch does not apply, so I reattached an updated one > > to keep the CFbot happy. > > > > The previous patch also had a missing header to detect parallel workers > > so I added that. --> #include "access/parallel.h" > Thanks for the update and the review, Kirk. Regards, Haribabu Kommi Fujitsu Australia
Re: Transaction commits VS Transaction commits (with parallel) VS query mean time
On Tue, Mar 19, 2019 at 2:32 PM Rahila Syed wrote: > Hi Haribabu, > > The latest patch fails while applying the header files part. Kindly rebase. > Thanks for the review. > The patch looks good to me. However, I wonder what are the other scenarios > where xact_commit is incremented, because even if I commit a single > transaction with your patch applied the increment in xact_commit is > 1. As > you mentioned upthread, need to check what internal operation resulted in > the increase. But the increase in xact_commit is surely lesser with the > patch. > Currently, the transaction counters are also incremented by background processes like autovacuum and the checkpointer. When you turn autovacuum off, you can see exactly how much the transaction commit count increases. > postgres=# BEGIN; > BEGIN > postgres=# select xact_commit from pg_stat_database where datname = > 'postgres'; > xact_commit > - > 158 > (1 row) > > postgres=# explain analyze select * from foo where i = 1000; >QUERY > PLAN > > > Gather (cost=1000.00..136417.85 rows=1 width=37) (actual > time=4.596..3792.710 rows=1 loops=1) >Workers Planned: 2 >Workers Launched: 2 >-> Parallel Seq Scan on foo (cost=0.00..135417.75 rows=1 width=37) > (actual time=2448.038..3706.009 rows=0 loops=3) > Filter: (i = 1000) > Rows Removed by Filter: 333 > Planning Time: 0.353 ms > Execution Time: 3793.572 ms > (8 rows) > > postgres=# commit; > COMMIT > postgres=# select xact_commit from pg_stat_database where datname = > 'postgres'; > xact_commit > - > 161 > (1 row) > Thanks for the test and confirmation. Regards, Haribabu Kommi Fujitsu Australia
Re: pg_basebackup ignores the existing data directory permissions
On Tue, Mar 19, 2019 at 5:29 PM Michael Paquier wrote: > On Mon, Mar 18, 2019 at 11:45:05AM -0400, Robert Haas wrote: > > So you want to default to no group access regardless of the directory > > permissions, with an option to enable group access that must be > > explicitly specified? That seems like a reasonable option to me; note > > that initdb does seem to chdir() an existing directory. > > Hm. We have been assuming that the contents of a base backup inherit > the permission of the source when using pg_basebackup because this > allows users to keep nodes in a consistent state without deciding > which option to use. Do you mean that you would like to enforce the > permissions of only the root directory if it exists? Or the root > directory with all its contents? The former may be fine. The latter > is definitely not. > As I understand from the discussion, the option applies to the root directory along with all its contents. How about the following change? pg_basebackup --> copies the contents of the src directory (with group access) and even the root directory permissions. pg_basebackup --no-group-access --> copies the contents of the src directory (with no group access) even for the root directory. So the default behavior works for most people, and those who need restrictive behavior can use the new option. Regards, Haribabu Kommi Fujitsu Australia
Re: What to name the current heap after pluggable storage / what to rename?
On Tue, Mar 19, 2019 at 2:32 PM Andres Freund wrote: > On 2019-03-18 16:24:40 -0700, Andres Freund wrote: > > Hi, > > > > On 2019-03-13 08:29:47 -0400, Robert Haas wrote: > > > On Tue, Mar 12, 2019 at 8:39 PM Andres Freund > wrote: > > > > > I like that option. > > > > > > > > In that vein, does anybody have an opinion about the naming of > > > > a) HeapUpdateFailureData, which will be used for different AMs > > > > b) HTSU_Result itself, which'll be the return parameter for > > > >update/delete via tableam > > > > c) Naming of HTSU_Result members (like HeapTupleBeingUpdated) > > > > > > > > I can see us doing several things: > > > > 1) Live with the old names, explain the naming as historical baggage > > > > 2) Replace names, but add typedefs / #defines for backward > compatibility > > > > 3) Rename without backward compatibility > > > > > > I think I have a mild preference for renaming HeapUpdateFailureData to > > > TableUpdateFailureData, but I'm less excited about renaming > > > HTSU_Result or its members. I don't care if you want to do > > > s/HeapTuple/TableTuple/g and s/HTSU_Result/TTSU_Result/g though. > > > > Anybody else got an opion on those? I personally don't have meaningful > > feelings on this. I'm mildly inclined to the renamings suggested > > above. > > What I now have is: > > /* > * Result codes for table_{update,delete,lock}_tuple, and for visibility > * routines inside table AMs. > */ > typedef enum TM_Result > { > /* Signals that the action succeeded (i.e. 
update/delete > performed) */ > TableTupleMayBeModified, > > /* The affected tuple wasn't visible to the relevant snapshot */ > TableTupleInvisible, > > /* The affected tuple was already modified by the calling backend > */ > TableTupleSelfModified, > > /* The affected tuple was updated by another transaction */ > TableTupleUpdated, > > /* The affected tuple was deleted by another transaction */ > TableTupleDeleted, > > /* > * The affected tuple is currently being modified by another > session. This > * will only be returned if (update/delete/lock)_tuple are > instructed not > * to wait. > */ > TableTupleBeingModified, > > /* lock couldn't be acquired, action skipped. Only used by > lock_tuple */ > TableTupleWouldBlock > } TM_Result; > With the above structure, it is easy to understand where and how these values are used, so +1 for this change. > I'm kinda wondering about replacing the TableTuple prefix with TableMod, > seems less confusing to me. Another option: how about just TupleUpdated, etc., dropping the Table prefix? The type name already suggests TM (IMO, it stands for Tuple Modification?). I'm also wondering about replacing > *MayBeModified with *OK. > How about TupleModified? I am fine with *OK as well. > Right now I have > > typedef enum TM_Result HTSU_Result; > > #define HeapTupleMayBeUpdated TableTupleMayBeModified > #define HeapTupleInvisible TableTupleInvisible > #define HeapTupleSelfUpdated TableTupleSelfModified > #define HeapTupleUpdated TableTupleUpdated > #define HeapTupleDeleted TableTupleDeleted > #define HeapTupleBeingUpdated TableTupleBeingModified > #define HeapTupleWouldBlock TableTupleWouldBlock > > in heapam.h (whereas the above is in tableam.h), for backward > compat. But I'm not sure it's worth it. > These old macros are mostly used in internal code, and I doubt that anyone depends on those macros directly. I vote for removing these backward-compatibility macros. Regards, Haribabu Kommi Fujitsu Australia
Re: [HACKERS] Block level parallel vacuum
On Mon, Mar 18, 2019 at 1:58 PM Masahiko Sawada wrote: > On Tue, Feb 26, 2019 at 7:20 PM Masahiko Sawada > wrote: > > > > On Tue, Feb 26, 2019 at 1:35 PM Haribabu Kommi > wrote: > > > > > > On Thu, Feb 14, 2019 at 9:17 PM Masahiko Sawada > wrote: > > >> > > >> Thank you. Attached the rebased patch. > > > > > > > > > I ran some performance tests to compare the parallelism benefits, > > > > Thank you for testing! > > > > > but I got some strange results of performance overhead, may be it is > > > because, I tested it on my laptop. > > > > Hmm, I think the parallel vacuum would help for heavy workloads like a > > big table with multiple indexes. In your test result, all executions > > are completed within 1 sec, which seems to be one use case that the > > parallel vacuum wouldn't help. I suspect that the table is small, > > right? Anyway I'll also do performance tests. > > > > Here is the performance test results. I've setup a 500MB table with > several indexes and made 10% of table dirty before each vacuum. > Compared execution time of the patched postgrse with the current HEAD > (at 'speed_up' column). 
In my environment, > > indexes | parallel_degree | patched |head| speed_up > -+-+++-- > 0 | 0 | 238.2085 | 244.7625 | 1.0275 > 0 | 1 | 237.7050 | 244.7625 | 1.0297 > 0 | 2 | 238.0390 | 244.7625 | 1.0282 > 0 | 4 | 238.1045 | 244.7625 | 1.0280 > 0 | 8 | 237.8995 | 244.7625 | 1.0288 > 0 | 16 | 237.7775 | 244.7625 | 1.0294 > 1 | 0 | 1328.8590 | 1334.9125 | 1.0046 > 1 | 1 | 1325.9140 | 1334.9125 | 1.0068 > 1 | 2 | 1333.3665 | 1334.9125 | 1.0012 > 1 | 4 | 1329.5205 | 1334.9125 | 1.0041 > 1 | 8 | 1334.2255 | 1334.9125 | 1.0005 > 1 | 16 | 1335.1510 | 1334.9125 | 0.9998 > 2 | 0 | 2426.2905 | 2427.5165 | 1.0005 > 2 | 1 | 1416.0595 | 2427.5165 | 1.7143 > 2 | 2 | 1411.6270 | 2427.5165 | 1.7197 > 2 | 4 | 1411.6490 | 2427.5165 | 1.7196 > 2 | 8 | 1410.1750 | 2427.5165 | 1.7214 > 2 | 16 | 1413.4985 | 2427.5165 | 1.7174 > 4 | 0 | 4622.5060 | 4619.0340 | 0.9992 > 4 | 1 | 2536.8435 | 4619.0340 | 1.8208 > 4 | 2 | 2548.3615 | 4619.0340 | 1.8126 > 4 | 4 | 1467.9655 | 4619.0340 | 3.1466 > 4 | 8 | 1486.3155 | 4619.0340 | 3.1077 > 4 | 16 | 1481.7150 | 4619.0340 | 3.1174 > 8 | 0 | 9039.3810 | 8990.4735 | 0.9946 > 8 | 1 | 4807.5880 | 8990.4735 | 1.8701 > 8 | 2 | 3786.7620 | 8990.4735 | 2.3742 > 8 | 4 | 2924.2205 | 8990.4735 | 3.0745 > 8 | 8 | 2684.2545 | 8990.4735 | 3.3493 > 8 | 16 | 2672.9800 | 8990.4735 | 3.3635 > 16 | 0 | 17821.4715 | 17740.1300 | 0.9954 > 16 | 1 | 9318.3810 | 17740.1300 | 1.9038 > 16 | 2 | 7260.6315 | 17740.1300 | 2.4433 > 16 | 4 | 5538.5225 | 17740.1300 | 3.2030 > 16 | 8 | 5368.5255 | 17740.1300 | 3.3045 > 16 | 16 | 5291.8510 | 17740.1300 | 3.3523 > (36 rows) > The performance results are good. Do we want to add the recommended size in the document for the parallel option? the parallel option for smaller tables can lead to performance overhead. Regards, Haribabu Kommi Fujitsu Australia
Re: Libpq support to connect to standby server as priority
On Thu, Feb 28, 2019 at 1:00 PM Tsunakawa, Takayuki < tsunakawa.ta...@jp.fujitsu.com> wrote: > From: Haribabu Kommi [mailto:kommi.harib...@gmail.com] > > Attached are the updated patches. > > Thanks, all look fixed. > > > > The target_server_type option yet to be implemented. > > Please let me review once more and proceed to testing when the above is > added, to make sure the final code looks good. I'd like to see how complex > the if conditions in multiple places would be after adding > target_server_type, and consider whether we can simplify them together with > you. Even now, the if conditions seem complicated to me... that's probably > due to the existence of read_write_host_index. > Yes, the if checks are a little complex because of the additional identification checks; I will see whether there is an easier way to update them without introducing code duplication. While implementing the new target_server_type connection option for libpq (to specify master, slave, etc.), there is no problem when the option is used on its own, but when it is combined with the existing target_session_attrs, some combinations may be invalid, or no server satisfying them may exist:

target_session_attrs | questionable target_server_type values
---------------------+---------------------------------------
read-write           | prefer-slave, slave
prefer-read          | master, slave
read-only            | master, prefer-slave

I know that some of the cases above are still possible, e.g. a master server that by default accepts read-only sessions. Instead of adding a check to validate which combinations are sensible, how about allowing all combinations and, when a combination is impossible (i.e. no server supports the requirement), simply letting the connection fail? Comments? Also, as we need to support the new option when connecting to servers older than version 12, the option sends the command "select pg_is_in_recovery()" to the server to find out whether it is in recovery or not.
Also, from the implementation point of view, the new target_server_type option is validated separately, i.e. the check for the required server is handled in its own switch case. When both options are given, should we first find the server required by target_session_attrs and then validate that same server again against target_server_type? Regards, Haribabu Kommi Fujitsu Australia
Re: Pluggable Storage - Andres's take
On Sat, Mar 9, 2019 at 2:13 PM Andres Freund wrote: > Hi, > > While 0001 is pretty bulky, the interesting bits concentrate on a > comparatively small area. I'd appreciate if somebody could give the > comments added in tableam.h a read (both on callbacks, and their > wrappers, as they have different audiences). It'd make sense to first > read the commit message, to understand the goal (and I'd obviously also > appreciate suggestions for improvements there as well). > > I'm pretty happy with the current state of the scan patch. I plan to do > two more passes through it (formatting, comment polishing, etc. I don't > know of any functional changes needed), and then commit it, lest > somebody objects. > I found a couple of typos in the committed patch; the attached patch fixes them. I am not sure about one of the typos, so please double-check it. I also reviewed the 0002 patch, which is pretty simple and can be committed. Regards, Haribabu Kommi Fujitsu Australia 0001-table-access-methods-typos-correction.patch Description: Binary data
Re: current_logfiles not following group access and instead follows log_file_mode permissions
On Tue, Mar 12, 2019 at 5:03 PM Michael Paquier wrote: > On Tue, Feb 26, 2019 at 12:22:53PM +1100, Haribabu Kommi wrote: > > I checked the code why the current_logfiles is not implemented as > > shared memory and found that the current syslogger doesn't attach to > > the shared memory of the postmaster. To support storing the > > current_logfiles in shared memory, the syslogger process also needs > > to attach to the shared memory, this seems to be a new > > infrastructure change. > > I don't think you can do that anyway and we should not do it. Shared > memory can be reset after a backend exits abnormally, but the > syslogger lives across that. What you sent upthread to improve the > documentation is in my opinion sufficient: > > https://www.postgresql.org/message-id/cajrrpge-v2_lmfd9nhrbejjy3vvokjwy3w_h+fs2nxcjg3p...@mail.gmail.com > > I would not have split the paragraph you broke into two, but instead > just add this part in-between: > + > +Permissions 0640 are recommended to be > compatible with > +initdb option > --allow-group-access. > + > Any objections in doing that? > If I remember correctly, you mentioned in one of the earlier mails that having a separate paragraph is better. Attached is the updated patch as per your suggestion. IMO, this update is just a recommendation to the user; it is still possible for the log file to have stricter permissions even when the data directory allows group access. So I still feel it is better to align the permissions of current_logfiles with the database file permissions rather than the log file permissions. Regards, Haribabu Kommi Fujitsu Australia 0001-log_file_mode-recommended-value-update_v2.patch Description: Binary data
Re: pg_basebackup ignores the existing data directory permissions
On Fri, Mar 15, 2019 at 10:33 AM Peter Eisentraut < peter.eisentr...@2ndquadrant.com> wrote: > On 2019-03-09 02:19, Haribabu Kommi wrote: > > Yes, I agree that it may be a problem if the existing data directory > > permissions > > are 0700 to changing it to 0750. But it may not be a problem for the > > scenarios, > > where the existing data permissions >=0750, to the upstream permissions. > > Because user must need to change anyway to start the server, otherwise > > server > > start fails, and also the files inside the data folder follows the > > permissions of the > > upstream data directory. > > > > usually production systems follows same permissions are of upstream, I > don't > > see a problem in following the same for development environment also? > > I think the potential problems of getting this wrong are bigger than the > issue we are trying to fix. > Thanks for your opinion. I am not sure exactly what those problems are, but I can go with your suggestion anyway. How about changing the data directory permissions only for the -R scenario, if running pg_basebackup into an existing data directory is a common scenario there? Or should we just leave it as is? Regards, Haribabu Kommi Fujitsu Australia
Re: Pluggable Storage - Andres's take
On Sat, Mar 9, 2019 at 2:18 PM Andres Freund wrote: > Hi, > > On 2019-03-09 11:03:21 +1100, Haribabu Kommi wrote: > > Here I attached the rebased patches that I shared earlier. I am adding > the > > comments to explain the API's in the code, will share those patches > later. > > I've started to add those for the callbacks in the first commit. I'd > appreciate a look! > Thanks for the updated patches. + /* + * Callbacks for hon-modifying operations on individual tuples + * Typo in tableam.h file. hon->non > I think I'll include the docs patches, sans the callback documentation, > in the next version. I'll probably merge them into one commit, if that's > OK with you? > OK. For easier review, I will still maintain 3 or 4 patches instead of the current patch series. > > I observed a crash with the latest patch series in the COPY command. > > Hm, which version was this? I'd at some point accidentally posted a > 'tmp' commit that was just a performance hack > Yes, the version I checked did have that commit; maybe that is the reason for the failure. Btw, your patches always are attached out of order: > > https://www.postgresql.org/message-id/CAJrrPGd%2Brkz54wE-oXRojg4XwC3bcF6bjjRziD%2BXhFup9Q3n2w%40mail.gmail.com > 10, 1, 3, 4, 2 ... > Sorry about that. I had wondered why the attachments end up in that order when I add the patch files to the mail. I thought it might be Gmail behavior, but by experimenting I found that, when adding multiple patches, the last selected patch is given preference and is listed as the first attachment. I will take care that this problem doesn't happen again. Regards, Haribabu Kommi Fujitsu Australia
Re: pg_basebackup ignores the existing data directory permissions
On Fri, Mar 8, 2019 at 11:59 PM Peter Eisentraut < peter.eisentr...@2ndquadrant.com> wrote: > pg_basebackup copies the data directory permission mode from the > upstream server. But it doesn't copy the ownership. So if say the > upstream server allows group access and things are owned by > postgres:postgres, and I want to make a copy for local development and > make a backup into a directory owned by peter:staff without group > access, then it would be inappropriate for pg_basebackup to change the > permissions on that directory. > Yes, I agree that it may be a problem if the existing data directory permissions are 0700 and we change them to 0750. But it may not be a problem for the scenarios where the existing data directory permissions are >= 0750 and we change them to the upstream permissions, because the user must change them anyway to start the server (otherwise server start fails), and the files inside the data folder follow the permissions of the upstream data directory. Production systems usually follow the same permissions as the upstream; I don't see a problem in following the same for development environments as well. Comments? Regards, Haribabu Kommi Fujitsu Australia
Re: Pluggable Storage - Andres's take
On Thu, Mar 7, 2019 at 6:33 AM Andres Freund wrote: > Hi, > > On 2019-03-05 23:07:21 -0800, Andres Freund wrote: > > My next steps are: > > - final polish & push the basic DDL and pg_dump patches > > Done and pushed. Some collation dependent fallout, I'm hoping I've just > fixed that. > Thanks for the corrections of things that I missed, and for the extra changes. Attached are the rebased patches that I shared earlier. I am adding comments to explain the APIs in the code, and will share those patches later. I observed a crash with the latest patch series in the COPY command. I am not sure whether the problem is with the tableOid-reduction patch; I will check it and correct the problem. Regards, Haribabu Kommi Fujitsu Australia 0010-Table-access-method-API-explanation.patch Description: Binary data 0001-Reduce-the-use-of-HeapTuple-t_tableOid.patch Description: Binary data 0003-Docs-of-default_table_access_method-GUC.patch Description: Binary data 0004-Rename-indexam.sgml-to-am.sgml.patch Description: Binary data 0002-Removal-of-scan_update_snapshot-callback.patch Description: Binary data 0005-Reorganize-am-as-both-table-and-index.patch Description: Binary data 0006-Doc-update-of-Create-access-method-type-table.patch Description: Binary data 0009-Doc-of-CREATE-TABLE-AS-.-USING-syntax.patch Description: Binary data 0007-Doc-update-of-create-materialized-view-.-USING-synta.patch Description: Binary data 0008-Doc-update-of-CREATE-TABLE-.-USING-syntax.patch Description: Binary data
Re: [bug fix] Produce a crash dump before main() on Windows
On Mon, Mar 4, 2019 at 3:23 PM Kyotaro HORIGUCHI < horiguchi.kyot...@lab.ntt.co.jp> wrote: > Hello. > > At Tue, 6 Nov 2018 15:53:37 +1100, Haribabu Kommi < > kommi.harib...@gmail.com> wrote in < > cajrrpgcxzi4z_sttnuuvyoaw+sadk7+cjgypuf7ao43vujl...@mail.gmail.com> > > Thanks for confirmation of that PostgreSQL runs as service. > > > > Based on the following details, we can decide whether this fix is > required > > or not. > > 1. Starting of Postgres server using pg_ctl without service is of > > production use or not? > > 2. Without this fix, how difficult is the problem to find out that > > something is preventing the > > server to start? In case if it is easy to find out, may be better to > > provide some troubleshoot > > guide for windows users can help. > > > > I am in favor of doc fix if it easy to find the problem instead of > assuming > > the user usage. > > I don't have an idea about which is better behavior, but does > this work for you? > > > https://docs.microsoft.com/en-us/windows/desktop/wer/collecting-user-mode-dumps > No, this option did not generate local dumps for PostgreSQL in my testing, but it is useful for generating dumps for applications that don't have their own specific WER reporting. > > These dumps are configured and controlled independently of the > > rest of the WER infrastructure. You can make use of the local > > dump collection even if WER is disabled or if the user cancels > > WER reporting. The local dump can be different than the dump sent > > to Microsoft. > > I found that symantec offers this in the steps for error > reporting of its products. > I am adding some doc changes that users can refer to in order to work out what the issue may be on Windows when the PostgreSQL server doesn't start and no core file is available. Regards, Haribabu Kommi Fujitsu Australia
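For reference, the user-mode dump collection described on the Microsoft page linked above is driven by registry values under the LocalDumps key; a sketch of such a configuration might look like the following (the per-executable subkey and the specific values are illustrative, not something from the original mail):

```
Windows Registry Editor Version 5.00

; Enable local crash dumps for postgres.exe specifically.
[HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\Windows\Windows Error Reporting\LocalDumps\postgres.exe]
; DumpType 2 requests a full dump; DumpCount caps retained dumps at 10.
"DumpType"=dword:00000002
"DumpCount"=dword:0000000a
```

A DumpFolder value (of type REG_EXPAND_SZ) can additionally redirect where the dumps are written.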
Re: Inheriting table AMs for partitioned tables
On Tue, Mar 5, 2019 at 10:47 AM Andres Freund wrote: > Hi, > > In the pluggable storage patch [1], one thing that I'm wondering about > is how exactly to inherit the storage AM across partitions. I think > that's potentially worthy of a discussion with a wider audience than I'd > get in that thread. It seems also related to the recent discussion in [2] > > Consider (excerpted from the tests): > > CREATE TABLE tableam_parted_heap2 (a text, b int) PARTITION BY list (a) > USING heap2; > > SET default_table_access_method = 'heap'; > CREATE TABLE tableam_parted_a_heap2 PARTITION OF tableam_parted_heap2 FOR > VALUES IN ('a'); > > SET default_table_access_method = 'heap2'; > CREATE TABLE tableam_parted_b_heap2 PARTITION OF tableam_parted_heap2 FOR > VALUES IN ('b'); > > CREATE TABLE tableam_parted_c_heap2 PARTITION OF tableam_parted_heap2 FOR > VALUES IN ('c') USING heap; > CREATE TABLE tableam_parted_d_heap2 PARTITION OF tableam_parted_heap2 FOR > VALUES IN ('d') USING heap2; > > It seems pretty clear that tableam_parted_heap2, tableam_parted_d_heap2 > would be stored via heap2, and tableam_parted_c_heap2 via heap. > > But for tableam_parted_a_heap2 tableam_parted_b_heap2 the answer isn't > quite as clear. I think it'd both be sensible for new partitions to > inherit the AM from the root, but it'd also be sensible to use the > current default. > > Out of laziness (it's how it works rn) I'm inclined to to go with using > the current default, but I'd be curious if others disagree. > As others have said, I also agree that default_table_access_method should be preferred when the access method is not explicitly specified during table creation. This discussion raises the point that there is currently no option for the user to change the access method of a table after it has been created. At the moment there are no alternative table access methods available for the user to switch to, but that may well be needed later.
I will provide a patch to alter the access method of a table for v13. Regards, Haribabu Kommi Fujitsu Australia
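For completeness, which access method each partition actually ended up with can be inspected through the catalogs once table AMs exist; a hypothetical check against the example tables above:

```sql
-- pg_class.relam points at the table's access method in pg_am.
SELECT c.relname, a.amname
FROM pg_class c
JOIN pg_am a ON a.oid = c.relam
WHERE c.relname LIKE 'tableam_parted%'
ORDER BY c.relname;
```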
Re: [Proposal] Table-level Transparent Data Encryption (TDE) and Key Management Service (KMS)
On Sat, Mar 2, 2019 at 7:27 AM Robert Haas wrote: > On Thu, Feb 7, 2019 at 3:28 AM Masahiko Sawada > wrote: > > WAL encryption will follow as an additional feature. > > I don't think WAL encryption is an optional feature. You can argue > about whether it's useful to encrypt the disk files in the first place > given that there's no privilege boundary between the OS user and the > database, but a lot of people seem to think it is and maybe they're > right. However, who can justify encrypting only SOME of the disk > files and not others? I mean, maybe you could justify not encryption > the SLRU files on the grounds that they probably don't leak much in > the way of interesting information, but the WAL files certainly do -- > your data is there, just as much as in the data files themselves. > +1. WAL encryption is not optional; the WAL must be encrypted. > To be honest, I think there is a lot to like about the patches > Cybertec has proposed. Those patches don't have all of the fancy > key-management stuff that you are proposing here, but maybe that > stuff, if we want it, could be added, rather than starting over from > scratch. It seems to me that those patches get a lot of things right. > In particular, it looked to me when I looked at them like they made a > pretty determined effort to encrypt every byte that might go down to > the disk. It seems to me that you if you want encryption, you want > that. > The patches Cybertec proposed do the encryption at the instance level; AFAIK, the current discussion is about reducing the scope of encryption to the object level (tablespace, database or table) to avoid the performance impact of encryption for databases and tables that don't need it. IMO, the overall performance of encryption depends on WAL encryption, whether we choose instance-level or object-level encryption.
Since WAL encryption is not optional, even if encryption is configured at the object level, the WAL for the corresponding objects needs to be encrypted. This should be done at WAL insertion, not when the WAL is written to disk, because some records would be encrypted and some not. Alternatively, we could encrypt the entire WAL whenever even one object is set for encryption. Regards, Haribabu Kommi Fujitsu Australia
Re: Drop type "smgr"?
On Thu, Feb 28, 2019 at 5:37 PM Tom Lane wrote: > Thomas Munro writes: > > On Thu, Feb 28, 2019 at 7:08 PM Tom Lane wrote: > >> I agree that smgrtype as it stands is pretty pointless, but what > >> will we be using instead to get to those other implementations? > > > Our current thinking is that smgropen() should know how to map a small > > number of special database OIDs to different smgr implementations > > Hmm. Maybe mapping based on tablespaces would be a better idea? > Thanks for bringing up this idea of multiple smgr implementations. I have also thought of implementing our own smgr implementation to support transparent data encryption on disk, based on a tablespace mapping. Regards, Haribabu Kommi Fujitsu Australia
Re: Libpq support to connect to standby server as priority
On Mon, Feb 25, 2019 at 11:38 AM Tsunakawa, Takayuki < tsunakawa.ta...@jp.fujitsu.com> wrote: > Hi Hari-san, > > I've reviewed all files. I think I'll proceed to testing when I've > reviewed the revised patch and the patch for target_server_type. > > Thanks for the review. > > (1) patch 0001 > CONNECTION_CHECK_WRITABLE, /* Check if we could make a > writable > * > connection. */ > + CONNECTION_CHECK_TARGET,/* Check if we have a proper target > +* > connection */ > CONNECTION_CONSUME /* Wait for any pending > message and consume > * them. */ > > According to the following comment, a new enum value should be added at > the end. > > /* > * Although it is okay to add to these lists, values which become unused > * should never be removed, nor should constants be redefined - that would > * break compatibility with existing code. > */ > fixed. > > > (2) patch 0002 > It seems to align better with the existing code to set the default value > to pg_conn.requested_session_type explicitly in makeEmptyPGconn(), even if > the default value is 0. Although I feel it's redundant, other member > variables do so. > > fixed. > > (3) patch 0003 > IntervalStyle was not reported by releases before > 8.4; > -application_name was not reported by releases > before 9.0.) > +application_name was not reported by releases > before 9.0 > +transaction_read_only was not reported by releases > before 12.0.) > > ";" is missing at the end of the third line. > > fixed. > > (4) patch 0004 > - /* Type of connection to make. Possible values: any, read-write. > */ > + /* Type of connection to make. Possible values: any, read-write, > perfer-read. */ > char *target_session_attrs; > > perfer -> prefer > > fixed. > > (5) patch 0004 > @@ -3608,6 +3691,9 @@ makeEmptyPGconn(void) > conn = NULL; > } > > + /* Initial value */ > + conn->read_write_host_index = -1; > > The new member should be initialized earlier in this function. 
Otherwise, > as you can see in the above fragment, conn can be NULL in an out-of-memory > case. > > fixed. > > (6) patch 0004 > Don't we add read-only as well as prefer-read, which corresponds to Slave > or Secondary of PgJDBC's targetServerType? I thought the original proposal > was to add both. > > Added read-only option. > > (7) patch 0004 > @@ -2347,6 +2367,7 @@ keep_going: > /* We will come back to here until there is > > conn->try_next_addr = true; > goto keep_going; > } > + > > appendPQExpBuffer(&conn->errorMessage, > > libpq_gettext("could not create socket: %s\n"), > > Is this an unintended newline addition? > > removed. > > (8) patch 0004 > + const char *type = > (conn->requested_session_type == SESSION_TYPE_PREFER_READ) ? > + > "read-only" : "writable"; > > I'm afraid these strings are not translatable into other languages. > > OK. I added two separate error message prepare statements for both read-write and read-only so, it shouldn't be a problem. > > (9) patch 0004 > + /* Record read-write host > index */ > + if > (conn->read_write_host_index == -1) > + > conn->read_write_host_index = conn->whichhost; > > At this point, the session can be either read-write or read-only, can't > it? Add the check "!conn->transaction_read_only" in this if condition? > Yes, fixed. > > (10) patch 0004 > + if ((conn->target_session_attrs != NULL) && > + (conn->requested_session_type > == SESSION_TYPE_PREFER_READ) && > + (conn->read_write_host_index != > -2)) > > The first condition is not necessary because the second one exists. > > The parenthesis surrounding each of these conditions are redundant. It > would be better to remove them for readability. This also applies to the > following part: > > + if
Re: Pluggable Storage - Andres's take
On Wed, Feb 27, 2019 at 11:10 AM Andres Freund wrote: > Hi, > > On 2019-01-21 10:32:37 +1100, Haribabu Kommi wrote: > > I am not able to remove the complete t_tableOid from HeapTuple, > > because of its use in triggers, as the slot is not available in triggers > > and I need to store the tableOid also as part of the tuple. > > What precisely do you man by "use in triggers"? You mean that a trigger > might access a HeapTuple's t_tableOid directly, even though all of the > information is available in the trigger context? > I forget the exact scenario, but during trigger function execution the PL/pgSQL executor accesses TableOidAttributeNumber from the stored tuple using the heap_get* functions. Because of the lack of slot support in triggers, we still need to maintain t_tableOid with the proper OID. The HeapTuple's t_tableOid member is updated whenever a heap tuple is generated from a slot. Regards, Haribabu Kommi Fujitsu Australia
Re: [HACKERS] Block level parallel vacuum
On Thu, Feb 14, 2019 at 9:17 PM Masahiko Sawada wrote: > Thank you. Attached the rebased patch. > I ran some performance tests to compare the parallelism benefits, but I got some strange results of performance overhead, may be it is because, I tested it on my laptop. FYI, Table schema: create table tbl(f1 int, f2 char(100), f3 float4, f4 char(100), f5 float8, f6 char(100), f7 bigint); Tbl with 3 indexes 1000 record deletion master - 22ms patch - 25ms with 0 parallel workers patch - 43ms with 1 parallel worker patch - 72ms with 2 parallel workers 1 record deletion master - 52ms patch - 56ms with 0 parallel workers patch - 79ms with 1 parallel worker patch - 86ms with 2 parallel workers 10 record deletion master - 410ms patch - 379ms with 0 parallel workers patch - 330ms with 1 parallel worker patch - 289ms with 2 parallel workers Tbl with 5 indexes 1000 record deletion master - 28ms patch - 34ms with 0 parallel workers patch - 86ms with 2 parallel workers patch - 106ms with 4 parallel workers 1 record deletion master - 58ms patch - 63ms with 0 parallel workers patch - 101ms with 2 parallel workers patch - 118ms with 4 parallel workers 10 record deletion master - 632ms patch - 490ms with 0 parallel workers patch - 455ms with 2 parallel workers patch - 403ms with 4 parallel workers Tbl with 7 indexes 1000 record deletion master - 35ms patch - 44ms with 0 parallel workers patch - 93ms with 2 parallel workers patch - 110ms with 4 parallel workers patch - 123ms with 6 parallel workers 1 record deletion master - 76ms patch - 78ms with 0 parallel workers patch - 135ms with 2 parallel workers patch - 143ms with 4 parallel workers patch - 139ms with 6 parallel workers 10 record deletion master - 641ms patch - 656ms with 0 parallel workers patch - 613ms with 2 parallel workers patch - 735ms with 4 parallel workers patch - 679ms with 6 parallel workers Regards, Haribabu Kommi Fujitsu Australia
Re: current_logfiles not following group access and instead follows log_file_mode permissions
On Mon, Feb 4, 2019 at 12:16 PM Haribabu Kommi wrote: > > And regarding current_logfiles permissions, I feel this file should have > permissions of data directory files as it is present in the data directory > whether it stores the information of log file, until this file is > completely > removed with another approach to store the log file details. > > I am not sure whether this has been already discussed or not? How about > using shared memory to store the log file names? So that we don't need > of this file? > I checked the code to see why current_logfiles is not implemented in shared memory, and found that the syslogger currently doesn't attach to the postmaster's shared memory. To support storing current_logfiles in shared memory, the syslogger process would also need to attach to shared memory, which seems to be a new infrastructure change. In case we are not going to change the file's permissions to the group-access mode and instead stay strict with log_file_mode, I tried the attached patch, which moves the current_logfiles file into the log_directory. The only drawback of this approach is that if the user changes log_directory, current_logfiles remains in the old log_directory. I don't see that as a problem. Comments? Regards, Haribabu Kommi Fujitsu Australia 0001-Move-the-current_logfiles-file-into-log_directory.patch Description: Binary data
Re: [HACKERS] Block level parallel vacuum
On Thu, Feb 14, 2019 at 9:17 PM Masahiko Sawada wrote: > On Wed, Feb 13, 2019 at 9:32 PM Haribabu Kommi > wrote: > > > > > > On Sat, Feb 9, 2019 at 11:47 PM Masahiko Sawada > wrote: > >> > >> On Tue, Feb 5, 2019 at 12:14 PM Haribabu Kommi < > kommi.harib...@gmail.com> wrote: > >> > > >> > > >> > On Fri, Feb 1, 2019 at 8:19 AM Masahiko Sawada > wrote: > >> >> > >> >> > >> >> The passing stats = NULL to amvacuumcleanup and ambulkdelete means > the > >> >> first time execution. For example, btvacuumcleanup skips cleanup if > >> >> it's not NULL.In the normal vacuum we pass NULL to ambulkdelete or > >> >> amvacuumcleanup when the first time calling. And they store the > result > >> >> stats to the memory allocated int the local memory. Therefore in the > >> >> parallel vacuum I think that both worker and leader need to move it > to > >> >> the shared memory and mark it as updated as different worker could > >> >> vacuum different indexes at the next time. > >> > > >> > > >> > OK, understood the point. But for btbulkdelete whenever the stats are > NULL, > >> > it allocates the memory. So I don't see a problem with it. > >> > > >> > The only problem is with btvacuumcleanup, when there are no dead > tuples > >> > present in the table, the btbulkdelete is not called and directly the > btvacuumcleanup > >> > is called at the end of vacuum, in that scenario, there is code flow > difference > >> > based on the stats. so why can't we use the deadtuples number to > differentiate > >> > instead of adding another flag? > >> > >> I don't understand your suggestion. What do we compare deadtuples > >> number to? Could you elaborate on that please? > > > > > > The scenario where the stats should pass NULL to btvacuumcleanup > function is > > when there no dead tuples, I just think that we may use that deadtuples > structure > > to find out whether stats should pass NULL or not while avoiding the > extra > > memcpy. > > > > Thank you for your explanation. I understood. 
Maybe I'm worrying too > much but I'm concernced compatibility; currently we handle indexes > individually. So if there is an index access method whose ambulkdelete > returns NULL at the first call but returns a palloc'd struct at the > second time or other, that doesn't work fine. > > The documentation says that passed-in 'stats' is NULL at the first > time call of ambulkdelete but doesn't say about the second time or > more. Index access methods may expect that the passed-in 'stats' is > the same as what they has returned last time. So I think to add an > extra flag for keeping comptibility. > I checked some of the ambulkdelete functions; they do not return NULL once they have been called, though the palloc'd structure may not get filled in with the details. IMO, no extra code is required for parallel vacuum compared to normal vacuum. Regards, Haribabu Kommi Fujitsu Australia
Re: pg_basebackup ignores the existing data directory permissions
On Wed, Feb 20, 2019 at 7:40 PM Magnus Hagander wrote: > > On Wed, Feb 20, 2019 at 5:17 AM Haribabu Kommi > wrote: > >> >> On Fri, Feb 15, 2019 at 10:15 AM Michael Paquier >> wrote: >> >>> On Thu, Feb 14, 2019 at 11:21:19PM +1100, Haribabu Kommi wrote: >>> > On Thu, Feb 14, 2019 at 8:57 PM Magnus Hagander >>> wrote: >>> >> I think it could be argued that neither initdb *or* pg_basebackup >>> should >>> >> change the permissions on an existing directory, because the admin >>> may have >>> >> done that intentionally. But when they do create the directory, they >>> should >>> >> follow the same patterns. >>> > >>> > Hmm, even if the administrator set some specific permissions to the >>> data >>> > directory, PostgreSQL server doesn't allow server to start if the >>> > permissions are not (0700) for versions less than 11 and (0700 or >>> > 0750) for version 11 or later. >>> >>> Yes, particularly with pg_basebackup -R this adds an extra step in the >>> user flow. >>> >> > Perhaps we should make the enforcement of permissions conditional on -R? > OTOH that's documented as "write recovery.conf", but we could change that > to be "prepare for replication" or something? > Yes, the enforcement of permissions can be done only when -R option is provided. The documentation is changed in v12 already as "write configuration for replication". > > > To let the user to use the PostgreSQL server, user must change the >>> > permissions of the data directory. So, I don't see a problem in >>> > changing the permissions by these tools. >>> >>> I certainly agree with the point of Magnus that both tools should >>> behave consistently, and I cannot actually imagine why it would be >>> useful for an admin to keep a more permissive data folder while all >>> the contents already have umasks set at the same level as the primary >>> (or what initdb has been told to use), but perhaps I lack imagination. 
>>> If we doubt about potential user impact, the usual, best, answer is to >>> let back-branches behave the way they do now, and only do something on >>> HEAD. >>> >> >> I also agree that both initdb and pg_basebackup should behave the same. >> Our main concern is that a standby data directory that doesn't follow >> the primary data directory permissions can lead to failures when the standby >> gets promoted. >> > > I don't think that follows at all. There are many scenarios where you'd > want the standby to have different permissions than the primary. > I am having a hard time understanding how different permissions would be possible. I assumed the standby would have more restrictive permissions; maybe the standby is not a hot standby? Can you please provide some more details? Until v11, PostgreSQL requires the data directory permissions to be 0700, otherwise the server fails to start, even if pg_basebackup was successful. In my testing, I found that the server also starts with data directory permissions lower than 0700, e.g. 0300. It seems the data directory validation only checks whether any group permission bits are set, but not whether the owner has all permissions. Is this the scenario you are expecting? > And I'm not sure it's our business to enforce that. A much much more > common mistake people make is run pg_basebackup as the wrong user, thereby > getting the wrong owner of all files. But that doesn't mean we should > enforce the owner/group of the files. > I didn't understand this point clearly either. All the database files are owned by the system user who executes the pg_basebackup command. If that user doesn't have permissions to access the existing data folder, the backup fails. Regards, Haribabu Kommi Fujitsu Australia
Re: Removal of duplicate variable declarations in fe-connect.c
On Fri, Feb 22, 2019 at 3:22 PM Michael Paquier wrote: > On Fri, Feb 22, 2019 at 11:33:17AM +1100, Haribabu Kommi wrote: > > During the development of another feature, I found that same local > > variables are declared twice. > > IMO, there is no need of again declaring the local variables. Patch > > attached. > > Indeed, fixed. That's not a good practice, and each variable is > assigned in its own block before getting used, so there is no > overlap. > Thanks. Regards, Haribabu Kommi Fujitsu Australia
Re: Libpq support to connect to standby server as priority
On Fri, Feb 22, 2019 at 5:47 PM Tsunakawa, Takayuki < tsunakawa.ta...@jp.fujitsu.com> wrote: > From: Haribabu Kommi [mailto:kommi.harib...@gmail.com] > Here I attached first set of patches that implemented the prefer-read > option > > after reporting the transaction_read_only GUC to client. Along the lines > > of adding prefer-read option patch, > > Great, thank you! I'll review and test it. > Thanks. > > 3. Existing read-write code is modified to use the new reported GUC > instead > > of executing the show command. > > Is the code path left to use SHOW for older servers? > Yes, for server versions older than 12, it still uses the SHOW command approach. Regards, Haribabu Kommi Fujitsu Australia
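For context, a sketch of how this surfaces in a connection string. The read-write value exists in released libpq since PostgreSQL 10; prefer-read is only what these patches propose, so treat it as hypothetical until committed:

```
# Released behavior (PostgreSQL 10+): try node1 and node2, accept only a
# server that allows read-write transactions.
host=node1,node2 port=5432 dbname=postgres target_session_attrs=read-write

# Proposed by these patches: prefer a server reporting
# transaction_read_only = on, falling back to any reachable server.
host=node1,node2 port=5432 dbname=postgres target_session_attrs=prefer-read
```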
Re: Libpq support to connect to standby server as priority
On Thu, Feb 14, 2019 at 1:04 PM Tsunakawa, Takayuki < tsunakawa.ta...@jp.fujitsu.com> wrote: > From: Haribabu Kommi [mailto:kommi.harib...@gmail.com] > > No. It's not good if the user has to be bothered by > > default_transaction_read_only when he simply wants to a standby. > > > > > > > > OK. Understood. > > so if we are going to differentiate between readonly and standby types, > > then I still > > feel that adding a prefer-read to target_session_attrs is still valid > > improvement. > > I agree that it's valid improvement to add prefer-read to > target_session_attr, as a means to "get a read-only session." > > > But the above improvement can be enhanced once the base work of > GUC_REPORT > > is finished. > > Is it already in progress in some thread, or are you trying to start from > scratch? (I may have done it, but I don't remember it well...) > > > Yes, I want to work on this patch, hopefully by next commitfest. In case > > if I didn't get time, > > I can ask for your help. > > I'm glad to hear that. Sure. I'd like to review your patch, and possibly > add/modify code if necessary. Are you going to add > target_server_type={primary | standby | prefer_standby} as well as add > prefer-read to target_session_attr? > > > > (I wonder which of server_type or server_role feels natural in > > English.) > > > > > > > > server_type may be good as it stands with connection option > > (target_server_type). > > Thanks, agreed. That also follows PgJDBC's targetServerType. > Here I attached the first set of patches that implement the prefer-read option after reporting the transaction_read_only GUC to the client. Along with the prefer-read option patch: 1. Refactored the existing code to reduce duplication. 2. Added an enum to represent the user-requested target_session_attrs type; this is used in comparisons instead of always doing a strcmp. 3. The existing read-write code is modified to use the newly reported GUC instead of executing the SHOW command. 
The basic patches are working; some documentation work may still be needed. Next I will add another parameter, target_server_type, to choose primary, standby, or prefer-standby as discussed upthread, along with a new GUC variable. Regards, Haribabu Kommi Fujitsu Australia 0004-New-prefer-read-target_session_attrs-type.patch Description: Binary data 0001-Restructure-the-code-to-remove-duplicate-code.patch Description: Binary data 0002-New-TargetSessionAttrsType-enum.patch Description: Binary data 0003-Make-transaction_read_only-as-GUC_REPORT-varaible.patch Description: Binary data
Removal of duplicate variable declarations in fe-connect.c
Hi Hackers, During the development of another feature, I found that the same local variables are declared twice. IMO, there is no need to declare the local variables again. Patch attached. Regards, Haribabu Kommi Fujitsu Australia 0001-Removal-of-duplicate-local-variable-declaration.patch Description: Binary data
Re: pg_basebackup ignores the existing data directory permissions
On Fri, Feb 15, 2019 at 10:15 AM Michael Paquier wrote: > On Thu, Feb 14, 2019 at 11:21:19PM +1100, Haribabu Kommi wrote: > > On Thu, Feb 14, 2019 at 8:57 PM Magnus Hagander > wrote: > >> I think it could be argued that neither initdb *or* pg_basebackup should > >> change the permissions on an existing directory, because the admin may > have > >> done that intentionally. But when they do create the directory, they > should > >> follow the same patterns. > > > > Hmm, even if the administrator set some specific permissions to the data > > directory, PostgreSQL server doesn't allow server to start if the > > permissions are not (0700) for versions less than 11 and (0700 or > > 0750) for version 11 or later. > > Yes, particularly with pg_basebackup -R this adds an extra step in the > user flow. > > > To let the user to use the PostgreSQL server, user must change the > > permissions of the data directory. So, I don't see a problem in > > changing the permissions by these tools. > > I certainly agree with the point of Magnus that both tools should > behave consistently, and I cannot actually imagine why it would be > useful for an admin to keep a more permissive data folder while all > the contents already have umasks set at the same level as the primary > (or what initdb has been told to use), but perhaps I lack imagination. > If we doubt about potential user impact, the usual, best, answer is to > let back-branches behave the way they do now, and only do something on > HEAD. > I also agree that both initdb and pg_basebackup should behave the same. Our main concern is that a standby data directory that doesn't follow the primary data directory permissions can lead to failures when the standby gets promoted. Given the lack of complaints from users, how about making this change only in HEAD? Regards, Haribabu Kommi Fujitsu Australia
Re: Pluggable Storage - Andres's take
On Tue, Nov 27, 2018 at 4:59 PM Amit Langote wrote: > Hi, > > On 2018/11/02 9:17, Haribabu Kommi wrote: > > Here I attached the cumulative fixes of the patches, new API additions > for > > zheap and > > basic outline of the documentation. > > I've read the documentation patch while also looking at the code and here > are some comments. > Thanks for the review and apologies for the delay. I have taken care of your most of the comments in the latest version of the doc patches. > > + > + > +TupleTableSlotOps * > +slot_callbacks (Relation relation); > + > + API to access the slot specific methods; > + Following methods are available; > + TTSOpsVirtual, > + TTSOpsHeapTuple, > + TTSOpsMinimalTuple, > + TTSOpsBufferTuple, > + > > Unless I'm misunderstanding what the TupleTableSlotOps abstraction is or > its relations to the TableAmRoutine abstraction, I think the text > description could better be written as: > > "API to get the slot operations struct for a given table access method" > > It's not clear to me why various TTSOps* structs are listed here? Is the > point that different AMs may choose one of the listed alternatives? For > example, I see that heap AM implementation returns TTOpsBufferTuple, so it > manipulates slots containing buffered tuples, right? Other AMs are free > to return any one of these? For example, some AMs may never use buffer > manager and hence not use TTOpsBufferTuple. Is that understanding correct? > Yes, AM can decide what type of Slot method it wants to use. Regards, Haribabu Kommi Fujitsu Australia
Re: Pluggable Storage - Andres's take
On Mon, Feb 4, 2019 at 2:31 PM Haribabu Kommi wrote: > > On Tue, Jan 22, 2019 at 1:43 PM Haribabu Kommi > wrote: > >> >> >> OK. I will work on the doc changes. >> > > Sorry for the delay. > > Attached a draft patch of doc and comments changes that I worked upon. > Currently I added comments to the callbacks that are present in the > TableAmRoutine > structure and I copied it into the docs. I am not sure whether it is a > good approach or not? > I am yet to add description for the each parameter of the callbacks for > easier understanding. > > Or, Giving description of each callbacks in the docs with division of > those callbacks > according to them divided in the TableAmRoutine structure? Currently > following divisions > are available. > 1. Table scan > 2. Parallel table scan > 3. Index scan > 4. Manipulation of physical tuples > 5. Non-modifying operations on individual tuples > 6. DDL > 7. Planner > 8. Executor > > Suggestions? > Here I attached the doc patches for pluggable storage. I divided the APIs into the groups specified above and explained them in the docs. I can add more details if the approach seems fine. Regards, Haribabu Kommi Fujitsu Australia 0008-Table-access-method-API-explanation.patch Description: Binary data 0001-Docs-of-default_table_access_method-GUC.patch Description: Binary data 0002-Rename-indexam.sgml-to-am.sgml.patch Description: Binary data 0003-Reorganize-am-as-both-table-and-index.patch Description: Binary data 0004-Doc-update-of-Create-access-method-type-table.patch Description: Binary data 0005-Doc-update-of-create-materialized-view-.-USING-synta.patch Description: Binary data 0006-Doc-update-of-CREATE-TABLE-.-USING-syntax.patch Description: Binary data 0007-Doc-of-CREATE-TABLE-AS-.-USING-syntax.patch Description: Binary data
Re: pg_basebackup ignores the existing data directory permissions
On Thu, Feb 14, 2019 at 8:57 PM Magnus Hagander wrote: > On Thu, Feb 14, 2019 at 9:10 AM Michael Paquier > wrote: > >> On Thu, Feb 14, 2019 at 06:34:07PM +1100, Haribabu Kommi wrote: >> > we have an application that is used to create the data directory with >> >> Well, initdb would do that happily, so there is no actual any need to >> do that to begin with. Anyway.. >> >> > owner access (0700), but with initdb group permissions option, it >> > automatically >> > converts to (0750) by the initdb. But pg_basebackup doesn't change it >> when >> > it tries to do a backup from a group access server. >> >> So that's basically the opposite of the case I was thinking about, >> where you create a path for a base backup with permissions strictly >> higher than 700, say 755, and the base backup path does not have >> enough restrictions. And in your case the permissions are too >> restrictive because of the application creating the folder itself but >> they should be relaxed if group access is enabled. Actually, that's >> something that we may want to do consistently across all branches. If >> an application calls pg_basebackup after creating a path, they most >> likely change the permissions anyway to allow the postmaster to >> start. >> > > I think it could be argued that neither initdb *or* pg_basebackup should > change the permissions on an existing directory, because the admin may have > done that intentionally. But when they do create the directory, they should > follow the same patterns. > Hmm, even if the administrator sets specific permissions on the data directory, the PostgreSQL server doesn't allow the server to start if the permissions are not 0700 for versions before 11, or 0700/0750 for version 11 or later. To use the PostgreSQL server, the user must change the permissions of the data directory anyway. So I don't see a problem in these tools changing the permissions. Regards, Haribabu Kommi Fujitsu Australia
Re: pg_basebackup ignores the existing data directory permissions
On Thu, Feb 14, 2019 at 3:04 PM Michael Paquier wrote: > On Wed, Feb 13, 2019 at 06:42:36PM +1100, Haribabu Kommi wrote: > > This should back-patch till 11 where the group access is introduced. > > Because of lack of complaints, I agree with you that there is no need of > > further back-patch. > > I am confused by the link with group access. Apologies for confusing you by linking it with group access; this patch doesn't actually interact with group access. From v11 onwards, the PostgreSQL server accepts two sets of permissions for the data directory because of group access. We have an application that creates the data directory with owner-only access (0700); with initdb's group permissions option, initdb automatically converts it to 0750. But pg_basebackup doesn't change it when it takes a backup from a group-access server. > The patch you are > sending is compatible down to v11, but we could also do it further > down by just using chmod with S_IRWXU on the target folder if it > exists and is empty. > Yes, I agree with you that using chmod as you describe fixes it in the back-branches. Regards, Haribabu Kommi Fujitsu Australia
Re: Libpq support to connect to standby server as priority
On Fri, Feb 8, 2019 at 8:16 PM Tsunakawa, Takayuki < tsunakawa.ta...@jp.fujitsu.com> wrote: > From: Haribabu Kommi [mailto:kommi.harib...@gmail.com] > > target_session_attrs checks for the default_transaction_readonly or not? > > PG 11 uses transaction_read_only, not default_transaction_readonly. > That's fine, because its purpose is to get a read-only session as the name > suggests, not to connect to a standby. > Thanks for the correction; yes, it uses transaction_read_only. > > target_server_type checks for whether the server is in recovery or not? > > Yes. > > > > I feel having two options make this feature complex to use it from the > user > > point of view? > > > > The need of two options came because of a possibility of a master server > > with default_transaction_readonly set to true. Even if the default > > transaction > > is readonly, it is user changeable parameter, so there shouldn't be any > > problem. > > No. It's not good if the user has to be bothered by > default_transaction_read_only when he simply wants to a standby. > OK, understood. So if we are going to differentiate between read-only and standby types, then I still feel that adding prefer-read to target_session_attrs is a valid improvement. But that improvement can be built once the base GUC_REPORT work is finished. > > how about just adding one parameter that takes the options similar like > > JDBC? > > target_server_type - Master, standby and prefer-standby. (The option > names > > can revised based on the common words on the postgresql docs?) > > "Getting a read-only session" is not equal to "connecting to a standby", > so two different parameters make sense. > > > > And one more thing, what happens when the server promotes to master but > > the connection requested is standby? I feel we can maintain the existing > > connections > > and later new connections can be redirected? comments? > > Ideally, it should be possible for the user to choose the behavior like > Oracle below. 
But that's a separate feature. > > > 9.2 Role Transitions Involving Physical Standby Databases > > https://docs.oracle.com/en/database/oracle/oracle-database/18/sbydb/managing-oracle-data-guard-role-transitions.html#GUID-857F6F45-DC1C-4345-BD39-F3BE7D79F1CD > -- > Keeping Physical Standby Sessions Connected During Role Transition > > As of Oracle Database 12c Release 2 (12.2.0.1), when a physical standby > database is converted into a primary you have the option to keep any > sessions connected to the physical standby connected, without disruption, > during the switchover/failover. > > To enable this feature, set the STANDBY_DB_PRESERVE_STATES initialization > parameter in your init.ora file before the standby instance is started. > This parameter applies to physical standby databases only. The allowed > values are: > > NONE — No sessions on the standby are retained during a > switchover/failover. This is the default value. > > ALL — User sessions are retained during switchover/failover. > > SESSION — User sessions are retained during switchover/failover. > -- > Yes, the above is a completely different role-transition enhancement that can be taken up separately. > Would you like to work on this patch? I'm not sure if I can take time, > but I'm willing to do it if you don't have enough time. > > As Tom mentioned, we need to integrate and clean patches in three mail > threads: > > * Make a new GUC_REPORT parameter, server_type, to show the server role > (primary or standby). > * Add target_server_type libpq connection parameter, whose values are > either primary, standby, or prefer_standby. > * Failover timeout, load balancing, etc. that someone proposed in the > other thread? > Yes, I want to work on this patch, hopefully by the next commitfest. In case I don't get time, I can ask for your help. > (I wonder which of server_type or server_role feels natural in English.) > server_type may be good, as it aligns with the connection option (target_server_type). 
Regards, Haribabu Kommi Fujitsu Australia
Re: [HACKERS] Block level parallel vacuum
On Sat, Feb 9, 2019 at 11:47 PM Masahiko Sawada wrote: > On Tue, Feb 5, 2019 at 12:14 PM Haribabu Kommi > wrote: > > > > > > On Fri, Feb 1, 2019 at 8:19 AM Masahiko Sawada > wrote: > >> > >> > >> The passing stats = NULL to amvacuumcleanup and ambulkdelete means the > >> first time execution. For example, btvacuumcleanup skips cleanup if > >> it's not NULL. In the normal vacuum we pass NULL to ambulkdelete or > >> amvacuumcleanup when the first time calling. And they store the result > >> stats to the memory allocated in the local memory. Therefore in the > >> parallel vacuum I think that both worker and leader need to move it to > >> the shared memory and mark it as updated as different worker could > >> vacuum different indexes at the next time. > > > > > > OK, understood the point. But for btbulkdelete whenever the stats are NULL, > > it allocates the memory. So I don't see a problem with it. > > > > The only problem is with btvacuumcleanup, when there are no dead tuples > > present in the table, the btbulkdelete is not called and directly the btvacuumcleanup > > is called at the end of vacuum, in that scenario, there is code flow difference > > based on the stats. so why can't we use the deadtuples number to differentiate > > instead of adding another flag? > > I don't understand your suggestion. What do we compare deadtuples > number to? Could you elaborate on that please? > The scenario where NULL stats should be passed to the btvacuumcleanup function is when there are no dead tuples. I just think that we may be able to use that deadtuples structure to find out whether NULL stats should be passed or not, while avoiding the extra memcpy. > > And also this scenario is not very often, so avoiding > > memcpy for normal operations would be better. It may be a small gain, just > > thought of it. > > > > This scenario could happen periodically on an insert-only table. 
> Additional memcpy is executed once per indexes in a vacuuming but I > agree that the avoiding memcpy would be good. > Yes, understood. If possible, removing the need for the memcpy would be good. The latest patch doesn't apply anymore and needs a rebase. Regards, Haribabu Kommi Fujitsu Australia
Re: pg_basebackup ignores the existing data directory permissions
On Wed, Feb 13, 2019 at 12:42 PM Michael Paquier wrote: > On Tue, Feb 12, 2019 at 06:03:37PM +1100, Haribabu Kommi wrote: > > During the testing allow group access permissions on the standby database > > directory, one of my colleague found the issue, that pg_basebackup > > doesn't verify whether the existing data directory has the required > > permissions or not? This issue is not related allow group access > > permissions. This problem was present for a long time, (I think from > > the time the pg_basebackup was introduced). > > In which case this would cause the postmaster to fail to start. > Yes, the postmaster fails to start, but I feel it would be good if pg_basebackup took care of correcting the file permissions automatically, like initdb does. > > Attached patch fixes the problem similar like initdb by changing the > > permissions of the data > > directory to the required permissions. > > It looks right to me and takes care of the case where group access is > allowed. Still, we have not seen any complains about the current > behavior either and pg_basebackup is around for some time already, so > I would tend to not back-patch that. Any thoughts from others? > This should be back-patched to 11, where group access was introduced. Because of the lack of complaints, I agree with you that there is no need to back-patch further. Regards, Haribabu Kommi Fujitsu Australia
pg_basebackup ignores the existing data directory permissions
Hi Hackers, While testing the allow-group-access permissions on a standby database directory, one of my colleagues found that pg_basebackup doesn't verify whether the existing data directory has the required permissions. This issue is not specific to group access permissions; the problem has been present for a long time (I think since pg_basebackup was introduced). The attached patch fixes the problem, similarly to initdb, by changing the permissions of the data directory to the required permissions. Regards, Haribabu Kommi Fujitsu Australia 0001-pg_basebackup-Correct-the-existing-directory-permiss.patch Description: Binary data
Re: Transaction commits VS Transaction commits (with parallel) VS query mean time
On Sat, Feb 9, 2019 at 4:07 PM Amit Kapila wrote: > On Fri, Feb 8, 2019 at 6:55 AM Haribabu Kommi > wrote: > > > > On Thu, Feb 7, 2019 at 9:31 PM Haribabu Kommi > wrote: > >> > >> > >> This is because of larger xact_commit value than default configuration. With the changed server configuration, that leads to generate more parallel workers and every parallel worker operation is treated as an extra commit, because of this reason, the total number of commits increased, but the overall query performance is decreased. > >> > >> Is there any relation of transaction commits to performance? > >> > >> Is there any specific reason to consider the parallel worker activity also as a transaction commit? Especially in my observation, if we didn't consider the parallel worker activity as separate commits, the test doesn't show an increase in transaction commits. > > > > > > The following statements shows the increase in the xact_commit value with > > parallel workers. I can understand that workers updating the seq_scan stats > > as they performed the seq scan. > > > Thanks for your opinion. > Yeah, that seems okay, however, one can say that for the scan they > want to consider it as a single scan even if part of the scan is > accomplished by workers or may be a separate counter for parallel > workers scan. > OK. > > Is the same applied to parallel worker transaction > > commits also? > > > > I don't think so. It seems to me that we should consider it as a > single transaction. Do you want to do the leg work for this and try > to come up with a patch? On a quick look, I think we might want to > change AtEOXact_PgStat so that the commits for parallel workers are > not considered. I think the caller has that information. > I tried to fix it by adding a check for whether the process is a parallel worker, and counting the transaction into the stats based on that. Patch attached. 
With this patch, parallel worker transactions are no longer counted, while the rest of the stats, like seq_scan etc., still get counted. IMO those still may need to be counted, as they represent the number of tuples returned and so on. Comments? Regards, Haribabu Kommi Fujitsu Australia 0001-Avoid-counting-parallel-worker-transactions-stats.patch Description: Binary data
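For readers following along, the shape of the change being discussed is roughly the following (a non-runnable sketch against pgstat.c of that era — the exact variable names and placement in the attached patch may differ; IsParallelWorker() is the real backend macro):

```c
/* Sketch only, not the literal attached patch. */
void
AtEOXact_PgStat(bool isCommit)
{
    /*
     * Skip the commit/rollback counters in parallel workers so that a
     * query using N workers still counts as a single transaction.  Other
     * counters (seq_scan, tuples fetched, ...) are left as they are.
     */
    if (!IsParallelWorker())
    {
        if (isCommit)
            pgStatXactCommit++;
        else
            pgStatXactRollback++;
    }

    /* ... the existing end-of-transaction table-stats processing ... */
}
```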
Re: Transaction commits VS Transaction commits (with parallel) VS query mean time
On Thu, Feb 7, 2019 at 9:31 PM Haribabu Kommi wrote: > Hi Hackers, > > Does increase in Transaction commits per second means good query > performance? > Why I asked this question is, many monitoring tools display that number of > transactions > per second in the dashboard (including pgadmin). > > During the testing of bunch of queries with different set of > configurations, I observed that > TPS of some particular configuration has increased compared to default > server configuration, but the overall query execution performance is > decreased after comparing all queries run time. > > This is because of larger xact_commit value than default configuration. > With the changed server configuration, that leads to generate more parallel > workers and every parallel worker operation is treated as an extra commit, > because of this reason, the total number of commits increased, but the > overall query performance is decreased. > > Is there any relation of transaction commits to performance? > > Is there any specific reason to consider the parallel worker activity also > as a transaction commit? Especially in my observation, if we didn't > consider the parallel worker activity as separate commits, the test doesn't > show an increase in transaction commits. > The following statements show the increase in the xact_commit value with parallel workers. I can understand the workers updating the seq_scan stats, as they performed the seq scan. Does the same apply to parallel worker transaction commits as well? The transaction commit counter is also updated by internal operations like autovacuum, checkpoint, etc., but the increase in the counters from those operations is not as visible as it is with parallel workers. Is the spike in TPS caused by parallel workers something we are fine with? 
postgres=# select relname, seq_scan from pg_stat_user_tables where relname = 'tbl';
 relname | seq_scan
---------+----------
 tbl     |       16
(1 row)

postgres=# begin;
BEGIN
postgres=# select xact_commit from pg_stat_database where datname = 'postgres';
 xact_commit
-------------
         524
(1 row)

postgres=# explain analyze select * from tbl where f1 = 1000;
                                                    QUERY PLAN
-------------------------------------------------------------------------------------------------------------------
 Gather  (cost=0.00..3645.83 rows=1 width=214) (actual time=1.703..79.736 rows=1 loops=1)
   Workers Planned: 2
   Workers Launched: 2
   ->  Parallel Seq Scan on tbl  (cost=0.00..3645.83 rows=1 width=214) (actual time=28.180..51.672 rows=0 loops=3)
         Filter: (f1 = 1000)
         Rows Removed by Filter: 3
 Planning Time: 0.090 ms
 Execution Time: 79.776 ms
(8 rows)

postgres=# commit;
COMMIT
postgres=# select xact_commit from pg_stat_database where datname = 'postgres';
 xact_commit
-------------
         531
(1 row)

postgres=# select relname, seq_scan from pg_stat_user_tables where relname = 'tbl';
 relname | seq_scan
---------+----------
 tbl     |       19
(1 row)

Regards, Haribabu Kommi Fujitsu Australia
Transaction commits VS Transaction commits (with parallel) VS query mean time
Hi Hackers, Does an increase in transaction commits per second mean good query performance? The reason I ask is that many monitoring tools display the number of transactions per second on their dashboard (including pgAdmin). While testing a bunch of queries with different sets of configurations, I observed that the TPS of one particular configuration increased compared to the default server configuration, but the overall query execution performance decreased when comparing all query run times. This is because of a larger xact_commit value than with the default configuration. The changed server configuration leads to generating more parallel workers, and every parallel worker operation is treated as an extra commit; for this reason, the total number of commits increased, but the overall query performance decreased. Is there any relation between transaction commits and performance? Is there any specific reason to consider parallel worker activity as a transaction commit? Notably, in my observation, if we didn't count parallel worker activity as separate commits, the test doesn't show an increase in transaction commits. Suggestions? Regards, Haribabu Kommi Fujitsu Australia
Re: Libpq support to connect to standby server as priority
On Mon, Jan 21, 2019 at 5:48 PM Tsunakawa, Takayuki < tsunakawa.ta...@jp.fujitsu.com> wrote: > From: Haribabu Kommi [mailto:kommi.harib...@gmail.com] > > Thanks for finding out the problem, how about the following way of checking > > for prefer-read/prefer-standby. > > > > 1. (default_transaction_read_only = true and recovery = true) > > > > 2. If none of the host satisfies the above scenario, then recovery = true > > 3. Last check is for default_transaction_read_only = true > > That would be fine. But as I mentioned in another mail, I think "get read-only session" and "connect to standby" differ. So I find it better to > separate parameters for those request; target_session_attr and > target_server_type. > Thanks for your opinion. So target_session_attrs checks whether default_transaction_readonly is set or not, and target_server_type checks whether the server is in recovery or not? I feel that having two options makes this feature complex to use from the user's point of view. The need for two options came from the possibility of a master server with default_transaction_readonly set to true. But even if the default transaction mode is read-only, it is a user-changeable parameter, so there shouldn't be any problem. The same applies to logical replication as well: the user can change the default transaction mode once the connection is established, if it is not according to their requirement. How about just adding one parameter that takes options similar to JDBC? target_server_type - master, standby and prefer-standby. (The option names can be revised based on the common wording in the PostgreSQL docs.) And one more thing: what happens when the server is promoted to master but the connection requested a standby? I feel we can maintain the existing connections, and later new connections can be redirected? Comments? Regards, Haribabu Kommi Fujitsu Australia
Re: Memory contexts reset for trigger invocations
On Tue, Feb 5, 2019 at 4:29 PM Andres Freund wrote: > Hi, > > trigger.c goes through some trouble to free the tuples returned by > trigger functions. There's plenty codepaths that look roughly like: > if (oldtuple != newtuple && oldtuple != slottuple) > heap_freetuple(oldtuple); > if (newtuple == NULL) > { > if (trigtuple != fdw_trigtuple) > heap_freetuple(trigtuple); > return NULL;/* "do nothing" */ > } > > but we, as far as I can tell, do not reset the memory context in which > the trigger functions have been called. > > Wouldn't it be better to reset an appropriate context after each > invocation? Yes, that'd require some care to manage the lifetime of > tuples returned by triggers, but that seems OK? > Currently the trigger functions are executed under the per-tuple memory context, but the returned tuples are allocated from the executor memory context in some scenarios: /* * Copy tuple to upper executor memory. But if user just did * "return new" or "return old" without changing anything, there's * no need to copy; we can return the original tuple (which will * save a few cycles in trigger.c as well as here). */ if (rettup != trigdata->tg_newtuple && rettup != trigdata->tg_trigtuple) rettup = SPI_copytuple(rettup); We would need to take care of these cases as well before switching to such a context? Regards, Haribabu Kommi Fujitsu Australia
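If trigger.c adopted the reset approach, the general shape could look like this (a non-runnable sketch using the real MemoryContextSwitchTo/MemoryContextReset APIs; per_call_context is a hypothetical context introduced for illustration, and the returned tuple must be copied out first — exactly because of the SPI_copytuple case quoted above):

```c
/* Sketch of per-invocation context handling, not actual trigger.c code. */
MemoryContext oldcxt;
HeapTuple     newtuple;

oldcxt = MemoryContextSwitchTo(per_call_context);
newtuple = ExecCallTriggerFunc(&LocTriggerData, tgindx, finfo, instr,
                               per_call_context);
MemoryContextSwitchTo(oldcxt);

/* Copy the result out of the soon-to-be-reset context. */
if (newtuple != NULL)
    newtuple = heap_copytuple(newtuple);

/*
 * One reset frees everything the trigger allocated, replacing the
 * scattered heap_freetuple() calls.
 */
MemoryContextReset(per_call_context);
```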
Re: [HACKERS] Block level parallel vacuum
On Fri, Feb 1, 2019 at 8:19 AM Masahiko Sawada wrote:
> On Wed, Jan 30, 2019 at 2:06 AM Haribabu Kommi wrote:
> >
> > + * Before starting parallel index vacuum and parallel cleanup index we launch
> > + * parallel workers. All parallel workers will exit after processed all indexes
> >
> > parallel vacuum index and parallel cleanup index?
>
> ISTM we're using like "index vacuuming", "index cleanup" and "FSM
> vacuuming" in vacuumlazy.c so maybe "parallel index vacuuming" and
> "parallel index cleanup" would be better?

OK.

> > + /*
> > + * If there is already-updated result in the shared memory we
> > + * use it. Otherwise we pass NULL to index AMs and copy the
> > + * result to the shared memory segment.
> > + */
> > + if (lvshared->indstats[idx].updated)
> > + result = &(lvshared->indstats[idx].stats);
> >
> > I didn't really find a need for the flag to differentiate the stats pointer from
> > the first run to the second run. I don't see any problem in passing the stats
> > directly, and the same stats are updated on the worker side and the leader side.
> > Anyway, no two processes will do the index vacuum at the same time. Am I
> > missing something?
> >
> > Even if this flag is to identify whether the stats are updated or not before
> > writing them, I don't see a need for it compared to normal vacuum.
>
> Passing stats = NULL to amvacuumcleanup and ambulkdelete means the
> first-time execution. For example, btvacuumcleanup skips cleanup if
> it's not NULL. In the normal vacuum we pass NULL to ambulkdelete or
> amvacuumcleanup on the first call, and they store the result
> stats in memory allocated in local memory. Therefore in the
> parallel vacuum I think that both worker and leader need to move it to
> the shared memory and mark it as updated, as a different worker could
> vacuum different indexes the next time.

OK, understood the point. But btbulkdelete allocates the memory whenever the stats are NULL.
So I don't see a problem with it. The only problem is with btvacuumcleanup: when there are no dead tuples present in the table, btbulkdelete is not called and btvacuumcleanup is called directly at the end of vacuum, and in that scenario the code flow differs based on the stats. So why can't we use the number of dead tuples to differentiate, instead of adding another flag? Also, this scenario does not occur very often, so avoiding the memcpy for normal operations would be better. It may be a small gain, just a thought.

> > + initStringInfo(&buf);
> > + appendStringInfo(&buf,
> > + ngettext("launched %d parallel vacuum worker %s (planned: %d",
> > + "launched %d parallel vacuum workers %s (planned: %d",
> > + lvstate->pcxt->nworkers_launched),
> > + lvstate->pcxt->nworkers_launched,
> > + for_cleanup ? "for index cleanup" : "for index vacuum",
> > + lvstate->pcxt->nworkers);
> > + if (lvstate->options.nworkers > 0)
> > + appendStringInfo(&buf, ", requested %d", lvstate->options.nworkers);
> >
> > What is the difference between planned workers and requested workers?
> > Aren't both the same?
>
> The request is the parallel degree that is specified explicitly by the
> user, whereas the planned is the actual number we planned based on the
> number of indexes the table has. For example, if we do 'VACUUM
> (PARALLEL 3000) tbl' where tbl has 4 indexes, the request is 3000
> and the planned is 4. Also, if max_parallel_maintenance_workers is 2,
> the planned is 2.

OK.

Regards,
Haribabu Kommi
Fujitsu Australia
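The requested-versus-planned distinction described above boils down to a clamp. A simplified sketch under that reading (the actual planning logic in the patch may consider more factors; `plan_vacuum_workers` is an invented helper):

```c
#include <assert.h>

static int
min2(int a, int b)
{
    return a < b ? a : b;
}

/*
 * Planned workers = the user-requested parallel degree, capped by the
 * number of indexes on the table (each worker processes whole indexes)
 * and by max_parallel_maintenance_workers.
 */
static int
plan_vacuum_workers(int requested, int nindexes,
                    int max_parallel_maintenance_workers)
{
    return min2(requested,
                min2(nindexes, max_parallel_maintenance_workers));
}
```

This reproduces the examples in the reply: `VACUUM (PARALLEL 3000)` on a 4-index table plans 4 workers, or 2 when max_parallel_maintenance_workers is 2.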
Re: Pluggable Storage - Andres's take
On Tue, Jan 22, 2019 at 1:43 PM Haribabu Kommi wrote:
>
> OK. I will work on the doc changes.

Sorry for the delay.

Attached is a draft patch of the doc and comment changes that I worked on. Currently I added comments to the callbacks that are present in the TableAmRoutine structure and copied them into the docs. I am not sure whether this is a good approach. I am yet to add a description of each parameter of the callbacks for easier understanding.

Or should we give a description of each callback in the docs, dividing the callbacks the same way they are divided in the TableAmRoutine structure? Currently the following divisions are available:

1. Table scan
2. Parallel table scan
3. Index scan
4. Manipulation of physical tuples
5. Non-modifying operations on individual tuples
6. DDL
7. Planner
8. Executor

Suggestions?

Regards,
Haribabu Kommi
Fujitsu Australia

0001-Doc-and-comments-update.patch
Description: Binary data
Re: current_logfiles not following group access and instead follows log_file_mode permissions
On Fri, Feb 1, 2019 at 7:22 PM Michael Paquier wrote:
> On Fri, Jan 18, 2019 at 09:50:40AM -0500, Stephen Frost wrote:
> > Yes, we should update the documentation in this regard, though it's
> > really an independent thing as that documentation should have been
> > updated in the original group-access patch, so I'll see about fixing
> > it and back-patching it.
>
> Stephen, could you apply Hari's patch then? I am not sure what the
> consensus is, but documenting the restriction is the minimum we can
> do.
>
> -The default permissions are 0600, meaning only the
> -server owner can read or write the log files. The other commonly
> -useful setting is 0640, allowing members of the owner's
> -group to read the files. Note however that to make use of such a
> -setting, you'll need to alter to
> -store the files somewhere outside the cluster data directory. In
> -any case, it's unwise to make the log files world-readable, since
> -they might contain sensitive data.
> +The default permissions are either 0600, meaning only the
> +server owner can read or write the log files, or 0640, that
> +allows any user in the same group can read the log files, based on the new
> +cluster created with --allow-group-access option of initdb
> +command. Note however that to make use of any setting other than default,
> +you'll need to alter to store the files
> +somewhere outside the cluster data directory.
>
> I would formulate that differently, by just adding an extra paragraph
> to mention that using 0640 is recommended to be
> compatible with initdb's --allow-group-access instead of sticking it
> in the middle of the existing paragraph.

Thanks for the review. I changed the log_file_mode doc patch as per your comment. How about the attached?
And regarding the current_logfiles permissions, I feel this file should have the permissions of data directory files, as it is present in the data directory even though it stores information about log files, at least until this file is completely removed in favor of another approach to storing the log file details. I am not sure whether this has already been discussed?

How about using shared memory to store the log file names, so that we don't need this file at all?

Regards,
Haribabu Kommi
Fujitsu Australia

0001-log_file_mode-recommended-value-update.patch
Description: Binary data
Re: initdb --allow-group-access behaviour in windows
On Sun, Feb 3, 2019 at 10:34 AM Michael Paquier wrote:
> On Sat, Feb 02, 2019 at 08:50:14AM +0200, David Steele wrote:
> > How about:
> >
> > +files created by initdb. This option is ignored
> > +on Windows, which does not support
> > +POSIX-style group permissions.
>
> Fine for me. Anybody else has an opinion to offer?

+1 to the above changes. Thanks for working on it.

Regards,
Haribabu Kommi
Fujitsu Australia
initdb --allow-group-access behaviour in windows
Hi Hackers,

During testing of the new feature allowing group access mode for the data directory (initdb --allow-group-access) by one of my colleagues, we didn't observe any change in the data folder permissions: with or without group access, the data folder has the same group permissions.

As Microsoft Windows doesn't support POSIX-style permissions, we always mark the permission GUCs as not supported on Windows. The new GUC that was added with the --allow-group-access feature also mentions the same.

Since initdb --allow-group-access doesn't have any effect on Microsoft Windows, I feel it would be better to state this in the initdb docs. Is a patch needed?

Regards,
Haribabu Kommi
Fujitsu Australia
Re: [HACKERS] Block level parallel vacuum
On Thu, Jan 24, 2019 at 1:16 PM Masahiko Sawada wrote:
>
> Attached the latest patches.

Thanks for the updated patches. Some more code review comments.

+ started by a single utility command. Currently, the parallel
+ utility commands that support the use of parallel workers are
+ CREATE INDEX and VACUUM
+ without FULL option, and only when building
+ a B-tree index. Parallel workers are taken from the pool of

I feel the above sentence may not give the proper picture; how about the following modification?

CREATE INDEX only when building a B-tree index, and VACUUM without the FULL option.

+ * parallel vacuum, we perform both index vacuum and index cleanup in parallel.
+ * Individual indexes is processed by one vacuum process. At beginning of

How about "vacuum index" and "cleanup index", similar to other places?

+ * memory space for dead tuples. When starting either index vacuum or cleanup
+ * vacuum, we launch parallel worker processes. Once all indexes are processed

Same here as well?

+ * Before starting parallel index vacuum and parallel cleanup index we launch
+ * parallel workers. All parallel workers will exit after processed all indexes

parallel vacuum index and parallel cleanup index?

+ /*
+ * If there is already-updated result in the shared memory we
+ * use it. Otherwise we pass NULL to index AMs and copy the
+ * result to the shared memory segment.
+ */
+ if (lvshared->indstats[idx].updated)
+ result = &(lvshared->indstats[idx].stats);

I didn't really find a need for the flag to differentiate the stats pointer from the first run to the second run. I don't see any problem in passing the stats directly, and the same stats are updated on the worker side and the leader side. Anyway, no two processes will do the index vacuum at the same time. Am I missing something?

Even if this flag is to identify whether the stats are updated or not before writing them, I don't see a need for it compared to normal vacuum.

+ * Enter the parallel mode, allocate and initialize a DSM segment. Return
+ * the memory space for storing dead tuples or NULL if no workers are prepared.
+ */
+ pcxt = CreateParallelContext("postgres", "heap_parallel_vacuum_main",
+ request, true);

But we are passing the serializable_okay flag as true, which means it doesn't return NULL. Is that expected?

+ initStringInfo(&buf);
+ appendStringInfo(&buf,
+ ngettext("launched %d parallel vacuum worker %s (planned: %d",
+ "launched %d parallel vacuum workers %s (planned: %d",
+ lvstate->pcxt->nworkers_launched),
+ lvstate->pcxt->nworkers_launched,
+ for_cleanup ? "for index cleanup" : "for index vacuum",
+ lvstate->pcxt->nworkers);
+ if (lvstate->options.nworkers > 0)
+ appendStringInfo(&buf, ", requested %d", lvstate->options.nworkers);

What is the difference between planned workers and requested workers? Aren't both the same?

- COMPARE_SCALAR_FIELD(options);
- COMPARE_NODE_FIELD(rels);
+ if (a->options.flags != b->options.flags)
+ return false;
+ if (a->options.nworkers != b->options.nworkers)
+ return false;

The options field is changed from a SCALAR compare to a field-by-field check, but why is the rels check removed? Since options is changed from an int to a structure, using SCALAR may not work in other functions like _copyVacuumStmt, etc.

+typedef struct VacuumOptions
+{
+ VacuumFlag flags; /* OR of VacuumFlag */
+ int nworkers; /* # of parallel vacuum workers */
+} VacuumOptions;

Do we need to add a NodeTag to the above structure, since it is part of the VacuumStmt structure?

+vacuumdb will require background workers,
+so make sure your
+ setting is more than one.

Remove "vacuumdb" and change it to "This option will ..."?

I will continue testing this patch and share the details.

Regards,
Haribabu Kommi
Fujitsu Australia
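The switch from a scalar compare to per-field compares discussed above can be shown in isolation. A minimal sketch; the struct mirrors the quoted patch, while `equal_vacuum_options` is a hypothetical helper (the real code lives in the node-equality functions):

```c
#include <assert.h>
#include <stdbool.h>

typedef int VacuumFlag;

/* Mirrors the VacuumOptions structure from the quoted patch. */
typedef struct VacuumOptions
{
    VacuumFlag flags;           /* OR of VacuumFlag */
    int        nworkers;        /* # of parallel vacuum workers */
} VacuumOptions;

/*
 * Once options grows from a plain int into a struct, a
 * COMPARE_SCALAR_FIELD-style "a->options != b->options" no longer
 * compiles; each member has to be compared explicitly.
 */
static bool
equal_vacuum_options(const VacuumOptions *a, const VacuumOptions *b)
{
    if (a->flags != b->flags)
        return false;
    if (a->nworkers != b->nworkers)
        return false;
    return true;
}
```

This is why the review asks about the companion functions (_copyVacuumStmt and friends): every place that treated options as a scalar needs the same member-wise treatment.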
Re: [HACKERS] Block level parallel vacuum
On Fri, Jan 18, 2019 at 11:42 PM Masahiko Sawada wrote:
> On Fri, Jan 18, 2019 at 10:38 AM Haribabu Kommi wrote:
> >
> > On Tue, Jan 15, 2019 at 6:00 PM Masahiko Sawada wrote:
> >>
> >> Rebased.
> >
> > I started reviewing the patch; I haven't finished my review yet.
> > Following are some of the comments.
>
> Thank you for reviewing the patch.
>
> > +PARALLEL N
> > +
> > + Execute index vacuum and cleanup index in parallel with
> >
> > I doubt that users can understand the terms "index vacuum" and "cleanup index".
> > Maybe it needs some more detailed information.
>
> Agreed. Table 27.22 "Vacuum phases" has a good description of vacuum
> phases. So maybe adding a reference to it would work.

OK.

> > -typedef enum VacuumOption
> > +typedef enum VacuumOptionFlag
> > {
> >
> > I don't find the new name quite good; how about VacuumFlags?
>
> Agreed with removing "Option" from the name, but I think VacuumFlag
> would be better because this enum represents only one flag. Thoughts?

OK.

> > postgres=# vacuum (parallel 2, verbose) tbl;
> >
> > With verbose, no parallel-workers-related information is available.
> > I feel giving that information is needed even when it is not a parallel
> > vacuum.
>
> Agreed. How about the following verbose output? I've added the number
> of launched, planned and requested vacuum workers and the purpose (vacuum
> or cleanup).
> postgres(1:91536)=# vacuum (verbose, parallel 30) test; -- table 'test' has 3 indexes
> INFO: vacuuming "public.test"
> INFO: launched 2 parallel vacuum workers for index vacuum (planned: 2, requested: 30)
> INFO: scanned index "test_idx1" to remove 2000 row versions
> DETAIL: CPU: user: 0.12 s, system: 0.00 s, elapsed: 0.12 s
> INFO: scanned index "test_idx2" to remove 2000 row versions by parallel vacuum worker
> DETAIL: CPU: user: 0.07 s, system: 0.05 s, elapsed: 0.12 s
> INFO: scanned index "test_idx3" to remove 2000 row versions by parallel vacuum worker
> DETAIL: CPU: user: 0.09 s, system: 0.05 s, elapsed: 0.14 s
> INFO: "test": removed 2000 row versions in 10 pages
> DETAIL: CPU: user: 0.00 s, system: 0.00 s, elapsed: 0.00 s
> INFO: launched 2 parallel vacuum workers for index cleanup (planned: 2, requested: 30)
> INFO: index "test_idx1" now contains 991151 row versions in 2745 pages
> DETAIL: 2000 index row versions were removed.
> 24 index pages have been deleted, 18 are currently reusable.
> CPU: user: 0.00 s, system: 0.00 s, elapsed: 0.00 s.
> INFO: index "test_idx2" now contains 991151 row versions in 2745 pages
> DETAIL: 2000 index row versions were removed.
> 24 index pages have been deleted, 18 are currently reusable.
> CPU: user: 0.00 s, system: 0.00 s, elapsed: 0.00 s.
> INFO: index "test_idx3" now contains 991151 row versions in 2745 pages
> DETAIL: 2000 index row versions were removed.
> 24 index pages have been deleted, 18 are currently reusable.
> CPU: user: 0.00 s, system: 0.00 s, elapsed: 0.00 s.
> INFO: "test": found 2000 removable, 367 nonremovable row versions in 41 out of 4425 pages
> DETAIL: 0 dead row versions cannot be removed yet, oldest xmin: 500
> There were 6849 unused item pointers.
> Skipped 0 pages due to buffer pins, 0 frozen pages.
> 0 pages are entirely empty.
> CPU: user: 0.12 s, system: 0.01 s, elapsed: 0.17 s.
> VACUUM

The verbose output is good.
> Since the previous patch conflicts with 285d8e12, I've attached the
> latest version patch that incorporates the review comments I got.

Thanks for the latest patch. I have some more minor comments.

+ Execute index vacuum and cleanup index in parallel with

Better to use "vacuum index" and "cleanup index"? This would be consistent with the description of the vacuum phases. It is better to follow the same notation throughout the patch.

+ dead_tuples = lazy_space_alloc(lvstate, nblocks, parallel_workers);

With this change, lazy_space_alloc takes care of initializing the parallel vacuum; can we write something about that in the comments?

+ initprog_val[2] = dead_tuples->max_dead_tuples;

The dead_tuples variable may need renaming for better readability?

+ if (lvshared->indstats[idx].updated)
+ result = &(lvshared->indstats[idx].stats);
+ else
+ copy_result = true;

I don't see a need for the copy_result variable; how about directly using the updated flag to decide whether to copy or not? Once the result is copied, update the flag.

+use Test::More tests => 34;

I don't find any new tests added in this patch.

I am wondering about the performance penalty of using the parallel option of vacuum on a small table?

Regards,
Haribabu Kommi
Fujitsu Australia
Re: Pluggable Storage - Andres's take
On Tue, Jan 22, 2019 at 12:15 PM Andres Freund wrote:
> Hi,
>
> Thanks!
>
> On 2019-01-22 11:51:57 +1100, Haribabu Kommi wrote:
> > Attached the patch for removal of scan_update_snapshot
> > and also the rebased patch of reduction in use of t_tableOid.
>
> I'll soon look at the latter.

Thanks.

> > > - consider removing table_gimmegimmeslot()
> > > - add substantial docs for every callback
> >
> > Will work on the above two.
>
> I think it's easier if I do the first, because I can just do it while
> rebasing, reducing unnecessary conflicts.

OK. I will work on the doc changes.

> > > While I saw an initial attempt at writing sgml docs for the table AM
> > > API, I'm not convinced that's the best approach. I think it might make
> > > more sense to have high-level docs in sgml, but then do all the
> > > per-callback docs in tableam.h.
> >
> > OK, I will update the sgml docs accordingly.
> > The index AM has per-callback docs in the sgml; refactor them also?
>
> I don't think it's a good idea to tackle the index docs at the same time
> - this patchset is already humongously large...

OK.
> > diff --git a/src/backend/access/heap/heapam_handler.c b/src/backend/access/heap/heapam_handler.c
> > index 62c5f9fa9f..3dc1444739 100644
> > --- a/src/backend/access/heap/heapam_handler.c
> > +++ b/src/backend/access/heap/heapam_handler.c
> > @@ -2308,7 +2308,6 @@ static const TableAmRoutine heapam_methods = {
> >     .scan_begin = heap_beginscan,
> >     .scan_end = heap_endscan,
> >     .scan_rescan = heap_rescan,
> > -   .scan_update_snapshot = heap_update_snapshot,
> >     .scan_getnextslot = heap_getnextslot,
> >
> >     .parallelscan_estimate = table_block_parallelscan_estimate,
> > diff --git a/src/backend/executor/nodeBitmapHeapscan.c b/src/backend/executor/nodeBitmapHeapscan.c
> > index 59061c746b..b48ab5036c 100644
> > --- a/src/backend/executor/nodeBitmapHeapscan.c
> > +++ b/src/backend/executor/nodeBitmapHeapscan.c
> > @@ -954,5 +954,9 @@ ExecBitmapHeapInitializeWorker(BitmapHeapScanState *node,
> >     node->pstate = pstate;
> >
> >     snapshot = RestoreSnapshot(pstate->phs_snapshot_data);
> > -   table_scan_update_snapshot(node->ss.ss_currentScanDesc, snapshot);
> > +   Assert(IsMVCCSnapshot(snapshot));
> > +
> > +   RegisterSnapshot(snapshot);
> > +   node->ss.ss_currentScanDesc->rs_snapshot = snapshot;
> > +   node->ss.ss_currentScanDesc->rs_temp_snap = true;
> > }
>
> I was rather thinking that we'd just move this logic into
> table_scan_update_snapshot(), without it invoking a callback.

OK. Changed accordingly. But the table_scan_update_snapshot() function is moved into tableam.c, to avoid including the additional header file snapmgr.h in tableam.h.

Regards,
Haribabu Kommi
Fujitsu Australia

0002-Removal-of-scan_update_snapshot-callback.patch
Description: Binary data
Re: Pluggable Storage - Andres's take
On Mon, Jan 21, 2019 at 1:01 PM Andres Freund wrote:
> Hi,
>
> On 2018-12-10 18:13:40 -0800, Andres Freund wrote:
> > On 2018-11-26 17:55:57 -0800, Andres Freund wrote:
> > > FWIW, now that oids are removed, and the tuple table slot abstraction
> > > got in, I'm working on rebasing the pluggable storage patchset on top of
> > > that.
> >
> > I've pushed a version of that to the git tree, including a rebased
> > version of zheap:
> > https://github.com/anarazel/postgres-pluggable-storage
> > https://github.com/anarazel/postgres-pluggable-zheap
>
> I've pushed the newest, substantially revised, version to the same
> repository. Note that while the newest pluggable-zheap version is newer
> than my last email, it's not based on the latest version, and the
> pluggable-zheap development is now happening in the main zheap
> repository.

Thanks for the new version of the patches and the changes.

> Todo:
> - consider removing scan_update_snapshot

Attached is the patch for removal of scan_update_snapshot, and also the rebased patch of the reduction in use of t_tableOid.

> - consider removing table_gimmegimmeslot()
> - add substantial docs for every callback

Will work on the above two.

> While I saw an initial attempt at writing sgml docs for the table AM
> API, I'm not convinced that's the best approach. I think it might make
> more sense to have high-level docs in sgml, but then do all the
> per-callback docs in tableam.h.

OK, I will update the sgml docs accordingly. The index AM has per-callback docs in the sgml; refactor them also?

Regards,
Haribabu Kommi
Fujitsu Australia

0002-Removal-of-scan_update_snapshot.patch
Description: Binary data

0001-Reduce-the-use-of-HeapTuple-t_tableOid.patch
Description: Binary data
Re: Pluggable Storage - Andres's take
On Tue, Jan 15, 2019 at 6:05 PM Andres Freund wrote:
> Hi,
>
> On 2019-01-15 18:02:38 +1100, Haribabu Kommi wrote:
> > On Tue, Dec 11, 2018 at 1:13 PM Andres Freund wrote:
> > >
> > > On 2018-11-26 17:55:57 -0800, Andres Freund wrote:
> > > Further tasks I'm not yet planning to tackle, that I'd welcome help on:
> > > - pg_upgrade testing
> >
> > I did the pg_upgrade testing from an older version with some tables and
> > views present, and all of them were properly transferred into the new server
> > with heap as the default access method.
> >
> > I will add Dmitry's pg_dump patch and test pg_upgrade to confirm
> > that the proper access method is retained in the upgraded database.
> >
> > > - I think we should consider removing HeapTuple->t_tableOid, it should
> > > imo live entirely in the slot
> >
> > I removed the t_tableOid from HeapTuple, and during testing I found some
> > problems with triggers; I will post the patch once it is fixed.
>
> Please note that I'm working on a heavily revised version of the patch
> right now, trying to clean up a lot of things (you might have seen some
> of the threads I started). I hope to post it ~Thursday. Local-ish
> patches shouldn't be a problem though.

Yes, I am watching your other threads of refactoring and cleanups. I will rebase this patch once the revised code is available.

I am not able to remove t_tableOid from HeapTuple completely, because of its use in triggers: the slot is not available in triggers, so I need to store the tableOid as part of the tuple. Currently, setting t_tableOid is done only when the tuple is formed from the slot, and elsewhere its use is replaced with the slot member. Comments?

Regards,
Haribabu Kommi
Fujitsu Australia

0001-Reduce-the-use-of-HeapTuple-t_tableOid.patch
Description: Binary data
Re: Libpq support to connect to standby server as priority
On Fri, Jan 18, 2019 at 5:33 PM Tsunakawa, Takayuki <tsunakawa.ta...@jp.fujitsu.com> wrote:
> From: Tsunakawa, Takayuki [mailto:tsunakawa.ta...@jp.fujitsu.com]
> > As for prefer-standby/prefer-read, if the host parameter specifies host1,host2
> > in this order, and host1 is the primary with
> > default_transaction_read_only=true, does the app get a connection to host1?
> > I want to connect to host2 (standby) only if host1 is down.
>
> Oops, reverse -- I wanted to say "I want to connect to host1 (primary)
> only if host2 is down."

Thanks for finding out the problem. How about the following way of checking for prefer-read/prefer-standby?

1. (default_transaction_read_only = true and recovery = true)
2. If none of the hosts satisfies the above scenario, then recovery = true
3. The last check is for default_transaction_read_only = true

Regards,
Haribabu Kommi
Fujitsu Australia
Re: Libpq support to connect to standby server as priority
On Fri, Jan 18, 2019 at 2:34 PM Tsunakawa, Takayuki <tsunakawa.ta...@jp.fujitsu.com> wrote:
> From: Laurenz Albe [mailto:laurenz.a...@cybertec.at]
> > I think that transaction_read_only is good.
> >
> > If it is set to false, we are sure to be on a replication primary or
> > stand-alone server, which is enough to know for the load balancing use case.
> > As Tatsuo-san said, setting default_transaction_read_only leads to a
> > misjudgement of the primary.
> >
> > I deem it unlikely that someone will set default_transaction_read_only to
> > FALSE and then complain that the feature is not working as expected, but
> > again I cannot prove that claim.
>
> I wonder what default_transaction_read_only exists for. For making the
> database read-only by default and allowing only specific users to write to the
> database with "CREATE/ALTER USER SET default_transaction_read_only = false"?

default_transaction_read_only is a user-settable parameter; even if it is set to true by default, any user can change it later. Deciding the server type based on it — whether the server supports read-write or read-only — can go wrong, as the user can change it later.

> I'm sorry to repeat myself, but anyway, I think we need a method to
> connect to a standby as the original desire, because the primary instance
> may be read-only by default while only limited users update data. That's
> for reducing the burden on the primary and minimizing the impact on users
> who update data. For example,
>
> * run data reporting on the standby
> * backup the database from the standby
> * cascade replication from the standby

IMO, if we try to use only pg_is_in_recovery() to decide where to connect, we may not support all the target_session_attrs values that are possible. How about using both to decide?
* master/read-write -- recovery = false and default_transaction_read_only = false
* standby/read-only -- recovery = true
* prefer-standby/prefer-read -- recovery = true or default_transaction_read_only = true
* any -- nothing to be verified

I feel the above verifications can cover both physical and logical replication; we can then decide which set of options we want to support. Also, if we don't want to rely on the user-settable default_transaction_read_only parameter, we can add a new parameter that can be changed only with a server restart.

Regards,
Haribabu Kommi
Fujitsu Australia
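The mapping above can be sketched as a classification predicate. This is a hypothetical illustration of the proposal, not libpq code; the enum and function names are invented:

```c
#include <assert.h>
#include <stdbool.h>

typedef enum TargetServerType
{
    TARGET_MASTER,          /* master / read-write */
    TARGET_STANDBY,         /* standby / read-only */
    TARGET_PREFER_STANDBY,  /* prefer-standby / prefer-read */
    TARGET_ANY              /* any */
} TargetServerType;

/*
 * Decide whether a server matches the requested type, combining
 * pg_is_in_recovery() with default_transaction_read_only, per the
 * proposed mapping.
 */
static bool
server_matches(TargetServerType target,
               bool in_recovery, bool default_read_only)
{
    switch (target)
    {
        case TARGET_MASTER:
            return !in_recovery && !default_read_only;
        case TARGET_STANDBY:
            return in_recovery;
        case TARGET_PREFER_STANDBY:
            return in_recovery || default_read_only;
        case TARGET_ANY:
        default:
            return true;    /* nothing to be verified */
    }
}
```

Note how this encodes the concern raised earlier in the thread: a primary with default_transaction_read_only = true fails the TARGET_MASTER check but satisfies TARGET_PREFER_STANDBY, which is exactly the ambiguity the two-signal check is meant to resolve.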