[HACKERS] proposal: enhanced stack trace for PL - print param args
Hi

In the last discussion related to PL/pgSQL, a weakness of the PL/pgSQL stack trace was mentioned: the function parameters are not printed.

CREATE OR REPLACE FUNCTION public.foo(a double precision, b double precision)
 RETURNS double precision
 LANGUAGE plpgsql
AS $function$
begin
  return 100/a;
end;
$function$

Current:

postgres=# select foo(0, 100);
ERROR:  division by zero
CONTEXT:  PL/pgSQL function foo(double precision) line 3 at RETURN

Proposed result:

postgres=# select foo(0, 100);
ERROR:  division by zero
CONTEXT:  PL/pgSQL function foo(double precision) line 3 at RETURN
ARGUMENTS:  a=0, b=100

* only function parameters are printed - no local variables
* the line of arguments will be limited - X chars?
* the content of each variable should be limited - X chars? - maybe 40 chars

This feature can have an impact on performance, so it should be explicitly enabled with some GUC - like extra_back_trace or something similar. Probably, before any call, the function parameters and the related output functions should be copied to a safe memory context. It can also increase pressure on the error memory context and possibly on log size.

Is there interest in this feature? Comments, notes?

Regards

Pavel
Re: [HACKERS] Patch to implement pg_current_logfile() function
On Sat, Jan 14, 2017 at 10:02 PM, Gilles Darold wrote:
> Le 13/01/2017 à 14:09, Michael Paquier a écrit :
>> On Fri, Jan 13, 2017 at 5:48 PM, Gilles Darold wrote:
>>> Le 13/01/2017 à 05:26, Michael Paquier a écrit :
>>>> Surely the temporary file of current_logfiles should not be included
>>>> in base backups (look at basebackup.c). Documentation needs to reflect
>>>> that as well. Also, should we have current_logfiles in a base backup?
>>>> I would think no.
>>> Done but can't find any documentation about the file exclusion, do you
>>> have a pointer?
>> protocol.sgml, in the protocol-replication part. You could search for
>> the paragraph that contains postmaster.opts. There is no real harm in
>> including current_logfiles in base backups, but that's really in the
>> same bag as postmaster.opts or postmaster.pid, particularly if the log
>> file name has a timestamp.
> Thanks for the pointer. After reading this part I think it might only
> concern files that are critical at restore time. I still think that the
> file might not be removed automatically from the backup. If it is
> restored it has strictly no impact at all, it will be removed or
> overwritten at startup. We can let the user choose to remove it from the
> backup or not; the file can be a help to find the latest log file
> related to a backup.

OK, no problem for me. I can see that your patch does the right thing to ignore current_logfiles.tmp. Could you please update t/010_pg_basebackup.pl and add this file to the list of files filled with DONOTCOPY?

>>>> pg_current_logfile() can be run by *any* user. We had better revoke
>>>> its access in system_views.sql!
>>> Why? There is no special information returned by this function unless
>>> the path, but it can be retrieved using show log_directory.
>> log_directory could be an absolute path, and we surely don't want to
>> let normal users have a look at it. For example, "show log_directory"
>> can only be seen by superusers. As Stephen Frost has been for a couple of
>> months (years?) on a holy war path against super-user checks in
>> system functions, I think that revoking the access to this function is
>> the best thing to do. It is as well easier to restrict first and
>> relax later; the reverse is harder to justify from a compatibility point of
>> view.
> Right, I've appended a REVOKE to the function in file
> backend/catalog/system_views.sql; patch v21 reflects this change.

Thanks. That looks good.

+	{
+		/* Logging collector is not enabled. We don't know where messages are
+* logged. Remove outdated file holding the current log filenames.
+*/
+		unlink(LOG_METAINFO_DATAFILE);
		return 0

This comment format is not postgres-like.

I think that's all I have for this patch.
--
Michael

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Support for pg_receivexlog --format=plain|tar
On Sun, Jan 15, 2017 at 1:31 AM, Magnus Hagander wrote:
> On Tue, Jan 10, 2017 at 3:01 AM, Michael Paquier wrote:
>> Based on some analysis, it is enough to look at the last 4 bytes of
>> the compressed file to get the size of the output data with a single call to
>> lseek() and then read(). So as there is a simple way to do things and
>> that's far cheaper than decompressing perhaps hundreds of segments I'd
>> rather do it this way. Attached is the implementation. This code is
>> using 2 booleans for the 4 states of the file names: full non-compressed,
>> partial non-compressed, full compressed and partial compressed. This
>> keeps the last check of FindStreamingStart() more simple, but that's
>> quite fancy lately to have an enum for such things :D
> I think you need to document that analysis at least in a code comment. I
> assume this is in reference to the ISIZE member of the gzip format?

Good question. I am not sure on this one.

> I was going to say what happens if the file is corrupt and we get random
> junk data there, but as we only compare it to XLogSegSize that should be
> safe. But we might want to put a note in about that too?

Perhaps. I have made a better try in FindStreamingStart.

> Finally, I think we should make the error message clearly say "compressed
> segment file" - just to make things extra clear.

Sure.

>>> I found another problem with it -- it is completely broken in sync mode.
>>> You need to either forbid sync mode together with compression, or teach
>>> dir_sync() about it. The latter would sound better, but I wonder how much
>>> that's going to kill compression due to the small blocks? Is it a
>>> reasonable use-case?
>> Hm. Looking at the docs I think that --compress defined with
>> --synchronous maps to the use of Z_SYNC_FLUSH with gzflush(). FWIW I
>> don't have a direct use case for it, but it is not a major effort to
>> add it, so done. There is no actual reason to forbid this combination
>> of options either.
> Is it enough to just gzflush()? Don't we also need to still fsync()? I can't
> see any documentation that gzflush() does that, and a quick look at the code
> of zlib indicates it doesn't (but I didn't check in detail, so if you did
> please point me to it).

Hm. It looks like you are right. zlib goes down to _tr_flush_bits() to flush some output, but this finishes only with put_byte(). As the fd is opaque in gzFile, it would be just better to open() the file first, and then use gzdopen() to get the gzFile. Let's use as well the existing fd field to save it. gzclose() also closes the parent fd, per the documentation of zlib.
--
Michael

receivexlog-gzip-v5.patch
Description: application/stream
Re: [HACKERS] Parallel Index Scans
On Fri, Jan 13, 2017 at 7:58 PM, Anastasia Lubennikova wrote:
> 27.12.2016 17:33, Amit Kapila:
>> The similar problem has occurred while testing "parallel index only
>> scan" patch and Rafia has included the fix in her patch [1] which
>> ideally should be included in this patch, so I have copied the fix
>> from her patch. Apart from that, I observed that similar problem can
>> happen for backward scans, so fixed the same as well.
> I confirm that this problem is solved.
>
> But I'm trying to find the worst cases for this feature. And I suppose we
> should test parallel index scans with concurrent insertions. More parallel
> readers we have, higher the concurrency.
> I doubt that it can significantly decrease performance, because the number
> of parallel readers is not that big,

I am not sure if such a test is meaningful for this patch because parallelism is generally used for large data reads and in such cases there are generally not many concurrent writes.

> I didn't find any case of noticeable performance degradation,
> so set status to "Ready for committer".

Thank you for the review!
--
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com
Re: [HACKERS] BUG: pg_stat_statements query normalization issues with combined queries
Fabien COELHO writes:
>> It ends up being about 30 fewer lines of code overall, despite there
>> being four places that have to make ad-hoc RawStmt nodes. On the whole
>> I feel like this is cleaner,
> I agree: Better typing, more homogeneous code (PlannedStmt for all),
> less ad-hoc checks to work around utility statements...

OK, pushed like that.

			regards, tom lane
Re: [HACKERS] Packages: Again
On Fri, Jan 13, 2017 at 8:56 PM, Serge Rielau wrote:
>> That's total nonsense.
>>
>> MERGE isn't UPSERT….
> Peter,
> you are misreading what I wrote. I did not allege that PostgreSQL did the
> wrong thing. And you are essentially confirming that there was debate and
> MERGE deemed to be not what was wanted. So PG, with reason, went with
> something not in the standard.
>
> That is precisely my point!

I'm sorry for being so blunt. That was unnecessary. I thought that you were citing that as a negative counterexample, rather than a neutral or positive one.

Still, it's true that MERGE has very little overlap with UPSERT, both as specified by the standard, and as implemented in practice by both SQL Server and Oracle. The Oracle docs introduce MERGE with a statement that is something along the lines of "MERGE is a way to combine INSERT, UPDATE, and DELETE into one convenient DML statement". MERGE is most compelling when performing bulk loading. That being the case, in my mind MERGE remains something that we really haven't turned our back on at all.
--
Peter Geoghegan
Re: [HACKERS] pageinspect: Hash index support
On Fri, Jan 13, 2017 at 12:49 AM, Jesper Pedersen wrote:
> Rebased, and removed the compile warn in hashfuncs.c

I did some testing and review for the patch. I did not see any major issue, but there are a few minor cases for which I have suggestions.

1. Source file: /doc/src/sgml/pageinspect.sgml

The example given:
SELECT * FROM hash_page_type(get_raw_page('con_hash_index', 0));
can be changed to:
SELECT hash_page_type(get_raw_page('con_hash_index', 0));

2. @verify_hash_page

I can easily produce this error right after a split, so there will be pages which are not inited before they are used. An error saying this is unexpected might be slightly misleading, so I think the error message should be improved.

postgres=# SELECT hash_page_type(get_raw_page('i1', 12));
ERROR:  index table contains unexpected zero page

3. @verify_hash_page

(errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+ errmsg("page is not a HASH page"),
+ errdetail("Expected %08x, got %08x.",

In the error message, "HASH" can be downcased to "hash". That makes it consistent with other error messages like "page is not a hash page of expected type".

4. The condition is raw_page_size != BLCKSZ, but the error message is "input page too small". I think the message should be changed to "invalid page size".

if (raw_page_size != BLCKSZ)
+ ereport(ERROR,
+ (errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+ errmsg("input page too small"),
+ errdetail("Expected size %d, got %d",
+ BLCKSZ, raw_page_size)));

5. @hash_bitmap_info

metabuf can be released once bitmapblkno is set; after that it is of no use:
_hash_relbuf(indexRel, metabuf);

Also, is a write lock required here for the bitmap page?
mapbuf = _hash_getbuf(indexRel, bitmapblkno, HASH_WRITE, LH_BITMAP_PAGE);

6. You have renamed convert_ovflblkno_to_bitno to _hash_ovflblkno_to_bitno. But unfortunately, you also moved the function down, so the diff appears as if you have completely replaced it. I think we should put it back in its old place.

7. I think this is not your bug, but probably a bug in the hash index itself: the page flag is set to 66 (for the test below), so the page is both LH_BUCKET_PAGE and LH_BITMAP_PAGE. Is this not a bug in the hash index? I have inserted 3000 records; the hash index is on an integer column.

select hasho_flag FROM hash_page_stats(get_raw_page('i1', 1));
 hasho_flag
 66

--
Thanks and Regards
Mithun C Y
EnterpriseDB: http://www.enterprisedb.com
Re: [HACKERS] Support for pg_receivexlog --format=plain|tar
On Tue, Jan 10, 2017 at 3:01 AM, Michael Paquier wrote:
> On Sat, Jan 7, 2017 at 8:19 PM, Magnus Hagander wrote:
>> On Sat, Jan 7, 2017 at 12:31 AM, Michael Paquier
>> <michael.paqu...@gmail.com> wrote:
>>> There is something I forgot. With this patch,
>>> FindStreamingStart()@pg_receivexlog.c is actually broken. In short it
>>> forgets to consider files that have been compressed at the last run of
>>> pg_receivexlog and will try to stream changes from the beginning. I
>>> can see that gzip -l provides this information... But I have yet to
>>> find something in zlib that allows a cheap lookup as startup of
>>> streaming should be fast. Looking at how gzip -l does it may be faster
>>> than looking at the docs.
>> Do we really care though? As in, does startup of streaming have to be
>> *that* fast? Even gunzipping 16Mb (worst case) doesn't exactly take a
>> long time. If your pg_receivexlog is restarting so often that it becomes
>> a problem, I think you already have another and much bigger problem on
>> your hands.
> Based on some analysis, it is enough to look at the last 4 bytes of
> the compressed file to get the size of the output data with a single call to
> lseek() and then read(). So as there is a simple way to do things and
> that's far cheaper than decompressing perhaps hundreds of segments I'd
> rather do it this way. Attached is the implementation. This code is
> using 2 booleans for the 4 states of the file names: full non-compressed,
> partial non-compressed, full compressed and partial compressed. This
> keeps the last check of FindStreamingStart() more simple, but that's
> quite fancy lately to have an enum for such things :D

I think you need to document that analysis at least in a code comment. I assume this is in reference to the ISIZE member of the gzip format?

I was going to say what happens if the file is corrupt and we get random junk data there, but as we only compare it to XLogSegSize that should be safe. But we might want to put a note in about that too?

Finally, I think we should make the error message clearly say "compressed segment file" - just to make things extra clear.

>> I found another problem with it -- it is completely broken in sync mode.
>> You need to either forbid sync mode together with compression, or teach
>> dir_sync() about it. The latter would sound better, but I wonder how much
>> that's going to kill compression due to the small blocks? Is it a
>> reasonable use-case?
> Hm. Looking at the docs I think that --compress defined with
> --synchronous maps to the use of Z_SYNC_FLUSH with gzflush(). FWIW I
> don't have a direct use case for it, but it is not a major effort to
> add it, so done. There is no actual reason to forbid this combination
> of options either.

Is it enough to just gzflush()? Don't we also need to still fsync()? I can't see any documentation that gzflush() does that, and a quick look at the code of zlib indicates it doesn't (but I didn't check in detail, so if you did please point me to it).

--
Magnus Hagander
Me: http://www.hagander.net/
Work: http://www.redpill-linpro.com/
Re: [HACKERS] Replication/backup defaults
On Wed, Jan 11, 2017 at 2:09 AM, Michael Paquier wrote:
> On Wed, Jan 11, 2017 at 10:06 AM, David Steele wrote:
>> On 1/10/17 3:06 PM, Stephen Frost wrote:
>>> * Magnus Hagander (mag...@hagander.net) wrote:
>>>> On Tue, Jan 10, 2017 at 8:03 PM, Robert Haas wrote:
>>>>> I may be outvoted, but I'm still not in favor of changing the default
>>>>> wal_level. That caters only to people who lack sufficient foresight
>>>>> to know that they need a replica before the system becomes so critical
>>>>> that they can't bounce it to update the configuration.
>>>> True. But the current level only caters to those people who run large ETL
>>>> jobs without doing any tuning on their system (at least none that would
>>>> require a restart), or another one of the fairly specific workloads.
>>>>
>>>> And as I keep re-iterating, it's not just about replicas, it's also about
>>>> the ability to make proper backups. Which is a pretty fundamental feature.
>>>>
>>>> I do think you are outvoted, yes :) At least that's the result of my
>>>> tallying up the people who have spoken out on the thread.
>>> I tend to agree with Magnus on this, being able to perform an online
>>> backup is pretty darn important.
>> Agreed and +1.
> +1'ing.

I've pushed this.

--
Magnus Hagander
Me: http://www.hagander.net/
Work: http://www.redpill-linpro.com/
Re: [HACKERS] Re: Clarifying "server starting" messaging in pg_ctl start without --wait
On 1/11/17 11:20 AM, Ryan Murphy wrote:
> Thanks for the review Beena, I'm glad the patch is ready to go!

committed, thanks

--
Peter Eisentraut              http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Re: [HACKERS] Patch to implement pg_current_logfile() function
Le 13/01/2017 à 14:09, Michael Paquier a écrit :
> On Fri, Jan 13, 2017 at 5:48 PM, Gilles Darold wrote:
>> Le 13/01/2017 à 05:26, Michael Paquier a écrit :
>>> Surely the temporary file of current_logfiles should not be included
>>> in base backups (look at basebackup.c). Documentation needs to reflect
>>> that as well. Also, should we have current_logfiles in a base backup?
>>> I would think no.
>> Done but can't find any documentation about the file exclusion, do you
>> have a pointer?
> protocol.sgml, in the protocol-replication part. You could search for
> the paragraph that contains postmaster.opts. There is no real harm in
> including current_logfiles in base backups, but that's really in the
> same bag as postmaster.opts or postmaster.pid, particularly if the log
> file name has a timestamp.

Thanks for the pointer. After reading this part I think it might only concern files that are critical at restore time. I still think that the file might not be removed automatically from the backup. If it is restored it has strictly no impact at all, it will be removed or overwritten at startup. We can let the user choose to remove it from the backup or not; the file can be a help to find the latest log file related to a backup.

>>> pg_current_logfile() can be run by *any* user. We had better revoke
>>> its access in system_views.sql!
>> Why? There is no special information returned by this function unless
>> the path, but it can be retrieved using show log_directory.
> log_directory could be an absolute path, and we surely don't want to
> let normal users have a look at it. For example, "show log_directory"
> can only be seen by superusers. As Stephen Frost has been for a couple of
> months (years?) on a holy war path against super-user checks in
> system functions, I think that revoking the access to this function is
> the best thing to do. It is as well easier to restrict first and
> relax later; the reverse is harder to justify from a compatibility point of
> view.
Right, I've appended a REVOKE to the function in file backend/catalog/system_views.sql; patch v21 reflects this change.

Thanks.

--
Gilles Darold
Consultant PostgreSQL
http://dalibo.com - http://dalibo.org

diff --git a/doc/src/sgml/config.sgml b/doc/src/sgml/config.sgml
index 30dd54c..393195f 100644
--- a/doc/src/sgml/config.sgml
+++ b/doc/src/sgml/config.sgml
@@ -4211,6 +4211,17 @@ SELECT * FROM parent WHERE key = 2400;
 server log
+
+ When logs are written to the file system, the file includes the type,
+ the location and the name of the logs currently in use. This provides a
+ convenient way to find the logs currently in used by the instance.
+
+$ cat $PGDATA/current_logfiles
+stderr pg_log/postgresql-2017-01-13_085812.log
+
+
 Where To Log

diff --git a/doc/src/sgml/func.sgml b/doc/src/sgml/func.sgml
index 10e3186..693669b 100644
--- a/doc/src/sgml/func.sgml
+++ b/doc/src/sgml/func.sgml
@@ -15441,6 +15441,13 @@ SELECT * FROM pg_ls_dir('.') WITH ORDINALITY AS t(ls,n);
+ pg_current_logfile(text)
+ text
+ Primary log file name, or log in the requested format,
+ currently in use by the logging collector
+
+
 pg_my_temp_schema()
 oid
 OID of session's temporary schema, or 0 if none
@@ -15658,6 +15665,39 @@ SET search_path TO schema , schema, ..
 the time when the postmaster process re-read the configuration files.)
+
+pg_current_logile
+
+
+Logging
+pg_current_logfile function
+
+
+pg_current_logfile returns, as text,
+the path of either the csv or stderr log file currently in use by the
+logging collector. This is a path including the
+ directory and the log file name.
+Log collection must be active or the return value
+is NULL. When multiple log files exist, each in a
+different format, pg_current_logfile called
+without arguments returns the path of the file having the first format
+found in the ordered
+list: stderr, csvlog.
+NULL is returned when no log file has any of these
+formats.
+To request a specific file format supply,
+as text, either csvlog
+or stderr as the value of the optional parameter. The
+return value is NULL when the log format requested
+is not a configured .
+
+pg_current_logfiles reflects the content of the
+ file. All caveats
+regards current_logfiles content are applicable
+to pg_current_logfiles' return value.
+
+
 pg_my_temp_schema

diff --git a/doc/src/sgml/storage.sgml b/doc/src/sgml/storage.sgml
index 5c52824..3ce2a7e 100644
--- a/doc/src/sgml/storage.sgml
+++ b/doc/src/sgml/storage.sgml
@@ -60,6 +60,38 @@ Item
 Subdirectory containing per-database subdirectories
+
+
+ current_logfiles
+
+ Logging
+ current_logfiles file
+
+ current_logfiles
+
+
+ A file recording the log file(s) currently written to by the syslogger
+ and t
Re: [HACKERS] parallelize queries containing subplans
On Fri, Jan 13, 2017 at 7:13 PM, Tom Lane wrote:
> Dilip Kumar writes:
>> ERROR: did not find '}' at end of input node at character 762

I could reproduce this error with a simple query like:

SELECT * FROM information_schema.role_usage_grants
WHERE object_type LIKE 'FOREIGN%' AND object_name IN ('s6', 'foo')
ORDER BY 1, 2, 3, 4, 5;

The reason is that the stored rules have the old structure of Param. See below:

{RELABELTYPE :arg {PARAM :paramkind 2 :paramid 1 :paramtype 11975 :paramtypmod -1 :paramcollid 100 :location -1}

The new variable parallel_safe is not present in the above node. If you recreate a fresh database you won't see the above problem. I have observed a problem in the equal function which I have fixed in the attached patch. I have also added a test, so that we can catch any problem similar to the one you have reported.

> I've not looked at the patches, but just seeing this error message,
> this looks like somebody's fat-fingered the correspondence between
> outfuncs.c and readfuncs.c processing.

I think what we need is a catversion bump, as Param is part of stored rules.

--
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com

pq_pushdown_correl_subplan_v2.patch
Description: Binary data