Re: more backtraces

2019-12-14 Thread Robert Haas
On Fri, Dec 13, 2019 at 7:26 AM Peter Eisentraut
 wrote:
> On 2019-12-04 22:34, Tom Lane wrote:
> > Andres Freund  writes:
> >> It'd be bad if the addition of backtraces for SEGV/BUS suddenly made it
> >> harder to attach a debugger and get useful results.
> >
> > Yeah.  TBH, I'm not sure I want this, at least not in debug builds.
>
> I understand that the SEGV/BUS thing can be a bit scary.  We can skip it.
>
> Are people interested in backtraces on abort()?  That was asked for in
> an earlier thread.

I mean, I think backtraces are great, and we should have more of them.
It's possible that trying to do it in certain cases will cause
problems, but we could back off those cases as we find them, or maybe
try to work around them using sigaltstack(), or maybe back it off in
debug builds.

It would make life a lot easier for me if I never had to explain to a
customer (1) how to install gdb or (2) that they needed to get $BOSS
to approve installation of development tools on production systems. I
would hate to see us shy away from improvements that might reduce the
need for such conversations on the theory that bad stuff *might*
happen.

In my experience, the importance of having a stack trace in the log is
greatest for a segmentation fault, because otherwise you have no
indication whatsoever of where the problem happened. Having the query
text has been a boon, but it's still not a lot to go on unless the
same query crashes every time. In other situations, like a PANIC,
Assertion failure, or (and this is a big one) non-descriptive error
message ("cache lookup failed for thingy %u"), a backtrace is sometimes
really helpful as well. You don't *always* need it, but you *often*
need it.

It is absolutely important that we don't break debuggability in the
service of getting more stack traces. At the same time, there are a
lot more PostgreSQL users out there than there are PostgreSQL
developers, and a lot of those people are running non-cassert,
non-debug builds. Being able to get debugging information from
failures that happen on those installations that enables us to fix
things without having to go through a time-consuming process of
guesswork and attempted reproduction is really valuable. A stack trace
can turn a lengthy nightmare into a quick fix.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: allowing broader use of simplehash

2019-12-14 Thread Robert Haas
On Thu, Dec 12, 2019 at 2:33 PM Andres Freund  wrote:
> I was basically just thinking that we could pass the context to use via
> CurrentMemoryContext, instead of explicitly passing it in.

I thought about that, but as a general rule, replacing a function
parameter with a global variable is the wrong direction. One could
argue this particular case is a counterexample, and I won't fight
tooth and nail if you want to take that position, but I don't think I
believe it myself.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: Wrong assert in TransactionGroupUpdateXidStatus

2019-12-14 Thread Robert Haas
On Thu, Dec 12, 2019 at 9:23 PM Amit Kapila  wrote:
> Do you think we need such an Assert after having StaticAssert for
> (THRESHOLD_SUBTRANS_CLOG_OPT <= PGPROC_MAX_CACHED_SUBXIDS)  and then
> an if statement containing (nsubxids <= THRESHOLD_SUBTRANS_CLOG_OPT)
> just before this Assert?  Sure, we can keep this for extra safety, but
> I don't see the need for it.

I don't have strong feelings about it.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: pg_ls_tmpdir to show shared filesets

2019-12-14 Thread Justin Pryzby
On Thu, Dec 12, 2019 at 11:39:31PM -0600, Justin Pryzby wrote:
> I suggested that pg_ls_tmpdir should show shared filesets like
> > 169347 5492 -rw-r-   1 postgres postgres  5619712 Dec  7 01:35 
> > /var/lib/pgsql/12/data/base/pgsql_tmp/pgsql_tmp11025.0.sharedfileset/0.0
..
> Actually, my suggestion would be to make pg_ls_tmpdir expose "isdir", same as
> pg_stat_file.

Done like that
>From 83740ad5897f8724b073e110ff3f477875fcaeaf Mon Sep 17 00:00:00 2001
From: Justin Pryzby 
Date: Sat, 14 Dec 2019 16:22:15 -0600
Subject: [PATCH v1] pg_ls_tmpdir to show directories

See also 9cd92d1a33699f86aa53d44ab04cc3eb50c18d11
---
 doc/src/sgml/func.sgml   | 12 +++-
 src/backend/utils/adt/genfile.c  | 29 ++---
 src/include/catalog/catversion.h |  2 +-
 src/include/catalog/pg_proc.dat  |  8 
 4 files changed, 30 insertions(+), 21 deletions(-)

diff --git a/doc/src/sgml/func.sgml b/doc/src/sgml/func.sgml
index 5a98158..83e567e 100644
--- a/doc/src/sgml/func.sgml
+++ b/doc/src/sgml/func.sgml
@@ -21922,12 +21922,14 @@ postgres=# SELECT * FROM pg_walfile_name_offset(pg_stop_backup());

setof record

-List the name, size, and last modification time of files in the
-temporary directory for tablespace.  If
+For files in the temporary directory for
+tablespace, list the name, size, last modification time,
+and a boolean indicating whether it is a directory.  Directories hold temporary files
+used by parallel processes.  If
 tablespace is not provided, the
-pg_default tablespace is used.  Access is granted
-to members of the pg_monitor role and may be
-granted to other non-superuser roles.
+pg_default tablespace is used.  Access is granted to
+members of the pg_monitor role and may be granted to
+other non-superuser roles.

   
   
diff --git a/src/backend/utils/adt/genfile.c b/src/backend/utils/adt/genfile.c
index 5d4f26a..671fc60 100644
--- a/src/backend/utils/adt/genfile.c
+++ b/src/backend/utils/adt/genfile.c
@@ -522,9 +522,9 @@ pg_ls_dir_1arg(PG_FUNCTION_ARGS)
 	return pg_ls_dir(fcinfo);
 }
 
-/* Generic function to return a directory listing of files */
+/* Generic function to return a directory listing of files (and optionally dirs) */
 static Datum
-pg_ls_dir_files(FunctionCallInfo fcinfo, const char *dir, bool missing_ok)
+pg_ls_dir_files(FunctionCallInfo fcinfo, const char *dir, bool missing_ok, bool dir_ok)	/* XXX: could be a bitmap of flags */
 {
 	FuncCallContext *funcctx;
 	struct dirent *de;
@@ -540,13 +540,17 @@ pg_ls_dir_files(FunctionCallInfo fcinfo, const char *dir, bool missing_ok)
 
 		fctx = palloc(sizeof(directory_fctx));
 
-		tupdesc = CreateTemplateTupleDesc(3);
+		tupdesc = CreateTemplateTupleDesc(dir_ok ? 4:3);
 		TupleDescInitEntry(tupdesc, (AttrNumber) 1, "name",
 		   TEXTOID, -1, 0);
 		TupleDescInitEntry(tupdesc, (AttrNumber) 2, "size",
 		   INT8OID, -1, 0);
 		TupleDescInitEntry(tupdesc, (AttrNumber) 3, "modification",
 		   TIMESTAMPTZOID, -1, 0);
+		if (dir_ok)
+			TupleDescInitEntry(tupdesc, (AttrNumber) 4, "isdir",
+		   BOOLOID, -1, 0); 
+
 		funcctx->tuple_desc = BlessTupleDesc(tupdesc);
 
 		fctx->location = pstrdup(dir);
@@ -575,8 +579,8 @@ pg_ls_dir_files(FunctionCallInfo fcinfo, const char *dir, bool missing_ok)
 
 	while ((de = ReadDir(fctx->dirdesc, fctx->location)) != NULL)
 	{
-		Datum		values[3];
-		bool		nulls[3];
+		Datum		values[4];
+		bool		nulls[4];
 		char		path[MAXPGPATH * 2];
 		struct stat attrib;
 		HeapTuple	tuple;
@@ -593,14 +597,17 @@ pg_ls_dir_files(FunctionCallInfo fcinfo, const char *dir, bool missing_ok)
 	 errmsg("could not stat directory \"%s\": %m", dir)));
 
 		/* Ignore anything but regular files */
-		if (!S_ISREG(attrib.st_mode))
+		if (!S_ISREG(attrib.st_mode) &&
+			!(dir_ok && S_ISDIR(attrib.st_mode)))
 			continue;
 
 		values[0] = CStringGetTextDatum(de->d_name);
 		values[1] = Int64GetDatum((int64) attrib.st_size);
 		values[2] = TimestampTzGetDatum(time_t_to_timestamptz(attrib.st_mtime));
-		memset(nulls, 0, sizeof(nulls));
+		if (dir_ok)
+			values[3] = BoolGetDatum(S_ISDIR(attrib.st_mode));
 
+		memset(nulls, 0, sizeof(nulls));
 		tuple = heap_form_tuple(funcctx->tuple_desc, values, nulls);
 		SRF_RETURN_NEXT(funcctx, HeapTupleGetDatum(tuple));
 	}
@@ -613,14 +620,14 @@ pg_ls_dir_files(FunctionCallInfo fcinfo, const char *dir, bool missing_ok)
 Datum
 pg_ls_logdir(PG_FUNCTION_ARGS)
 {
-	return pg_ls_dir_files(fcinfo, Log_directory, false);
+	return pg_ls_dir_files(fcinfo, Log_directory, false, false);
 }
 
 /* Function to return the list of files in the WAL directory */
 Datum
 pg_ls_waldir(PG_FUNCTION_ARGS)
 {
-	return pg_ls_dir_files(fcinfo, XLOGDIR, false);
+	return pg_ls_dir_files(fcinfo, XLOGDIR, false, false);
 }
 
 /*
@@ -638,7 +645,7 @@ pg_ls_tmpdir(FunctionCallInfo fcinfo, Oid tblspc)
 		tblspc)));
 
 	TempTablespacePath(path, 

Re: [HACKERS] proposal: schema variables

2019-12-14 Thread Pavel Stehule
On Mon, Nov 18, 2019 at 19:47 Pavel Stehule wrote:

>
>
> On Sun, Nov 3, 2019 at 17:27 Pavel Stehule wrote:
>
>>
>>
>> On Thu, Oct 10, 2019 at 11:41 Pavel Stehule wrote:
>>
>>> Hi
>>>
>>> minor change - replace heap_tuple_fetch_attr by detoast_external_attr.
>>>
>>>
>> similar update - heap_open, heap_close were replaced by table_open,
>> table_close
>>
>
> fresh rebase
>

only rebase

Regards

Pavel


> Regards
>
> Pavel
>
>
>> Regards
>>
>> Pavel
>>
>


schema-variables-20191214.patch.gz
Description: application/gzip


Re: tuplesort test coverage

2019-12-14 Thread Tom Lane
Andres Freund  writes:
> On 2019-12-12 09:27:04 -0500, Tom Lane wrote:
>> What seems like a simpler and more reliable fix is to make
>> test_mark_restore a temp table, thus keeping autovac away from it.
>> Is there a reason in terms of the test's goals not to do that?

> I can't see any reason. The sorting code shouldn't care about the source
> of tuples. I guess there could at some point be tests for parallel
> sorting, but that'd just use a different table.

OK, done that way.

>> Also ... why in the world does the script drop its tables at the end
>> with IF EXISTS?  They'd better exist at that point.  I object
>> to the DROP IF EXISTS up at the top, too.  The regression tests
>> do not need to be designed to deal with an unpredictable start state,
>> and coding them to do so can have no effect other than possibly
>> masking problems.

> Well, it makes it a heck of a lot easier to run tests in isolation while
> evolving them. While I personally think it's good to leave cleanup for
> partial states in for cases where it was helpful during development, I
> also don't care about it strongly.

As far as that goes, making the tables temp is an even better solution.

regards, tom lane




Re: Memory-Bounded Hash Aggregation

2019-12-14 Thread Tomas Vondra

On Wed, Nov 27, 2019 at 02:58:04PM -0800, Jeff Davis wrote:

On Wed, 2019-08-28 at 12:52 -0700, Taylor Vesely wrote:

Right now the patch always initializes 32 spill partitions. Have you
given any thought into how to intelligently pick an optimal number of
partitions yet?


Attached a new patch that addresses this.

1. Divide hash table memory used by the number of groups in the hash
table to get the average memory used per group.
2. Multiply by the number of groups spilled -- which I pessimistically
estimate as the number of tuples spilled -- to get the total amount of
memory that we'd like to have to process all spilled tuples at once.
3. Divide the desired amount of memory by work_mem to get the number of
partitions we'd like to have such that each partition can be processed
in work_mem without spilling.
4. Apply a few sanity checks, fudge factors, and limits.
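
The four steps above amount to a few lines of arithmetic. As a hedged
sketch only -- the function name and the clamp limits here are illustrative,
not taken from the patch, and the fudge factors are omitted:

```c
/*
 * Sketch of the partition-count arithmetic described above.  All names
 * and limits are hypothetical; work_mem_bytes is work_mem in bytes.
 */
static int
choose_spill_partitions(double mem_used, double ngroups_in_table,
						double ntuples_spilled, double work_mem_bytes)
{
	/* step 1: average memory used per group so far */
	double		mem_per_group = mem_used / ngroups_in_table;

	/* step 2: pessimistically treat every spilled tuple as a new group */
	double		mem_wanted = mem_per_group * ntuples_spilled;

	/* step 3: partitions needed so each fits in work_mem */
	int			npartitions = (int) (mem_wanted / work_mem_bytes);

	/* step 4: clamp to sane bounds (illustrative limits) */
	if (npartitions < 4)
		npartitions = 4;
	if (npartitions > 1024)
		npartitions = 1024;
	return npartitions;
}
```

E.g., 4MB used for 1000 groups (4kB/group) with a million spilled tuples
suggests ~4GB of demand, hence on the order of a thousand work_mem-sized
partitions.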

Using this runtime information should be substantially better than
using estimates and projections.

Additionally, I removed some branches from the common path. I think I
still have more work to do there.

I also rebased of course, and fixed a few other things.



I've done a bit more testing on this, after resolving a couple of minor
conflicts due to recent commits (rebased version attached).

In particular, I've made a comparison with different dataset sizes,
group sizes, GUC settings etc. The script and results from two different
machines are available here:

  * https://bitbucket.org/tvondra/hashagg-tests/src/master/

The script essentially runs a simple grouping query with different
number of rows, groups, work_mem and parallelism settings. There's
nothing particularly magical about it.

I did run it both on master and patched code, allowing us to compare
results and assess impact of the patch. Overall, the changes are
expected and either neutral or beneficial, i.e. the timings are the same
or faster.

The number of cases that regressed is fairly small, but sometimes the
regressions are annoyingly large - up to 2x in some cases. Consider for
example this trivial example with 100M rows:

CREATE TABLE t AS
SELECT (1 * random())::int AS a
  FROM generate_series(1,1) s(i);

On the master, the plan with default work_mem (i.e. 4MB) and

SET max_parallel_workers_per_gather = 8;

looks like this:


EXPLAIN SELECT * FROM (SELECT a, count(*) FROM t GROUP BY a OFFSET 10) 
foo;

 QUERY PLAN

 Limit  (cost=16037474.49..16037474.49 rows=1 width=12)
   ->  Finalize GroupAggregate  (cost=2383745.73..16037474.49 rows=60001208 
width=12)
 Group Key: t.a
 ->  Gather Merge  (cost=2383745.73..14937462.25 rows=10032 
width=12)
   Workers Planned: 8
   ->  Partial GroupAggregate  (cost=2382745.59..2601495.66 
rows=1254 width=12)
 Group Key: t.a
 ->  Sort  (cost=2382745.59..2413995.60 rows=1254 
width=4)
   Sort Key: t.a
   ->  Parallel Seq Scan on t  (cost=0.00..567478.04 
rows=1254 width=4)
(10 rows)

Which kinda makes sense - we can't do hash aggregate, because there are
100M distinct values, and that won't fit into 4MB of memory (and the
planner knows about that).

And it completes in about 108381 ms, give or take. With the patch, the
plan changes like this:


EXPLAIN SELECT * FROM (SELECT a, count(*) FROM t GROUP BY a OFFSET 10) 
foo;

QUERY PLAN
---
 Limit  (cost=2371037.74..2371037.74 rows=1 width=12)
   ->  HashAggregate  (cost=1942478.48..2371037.74 rows=42855926 width=12)
 Group Key: t.a
 ->  Seq Scan on t  (cost=0.00..1442478.32 rows=10032 width=4)
(4 rows)

i.e. it's way cheaper than the master plan, it's not parallel, but when
executed it takes much longer (about 147442 ms). After forcing a
parallel query (by setting parallel_setup_cost = 0) the plan changes to
a parallel one without a partial aggregate, but it's even slower.

The explain analyze for the non-parallel plan looks like this:

 QUERY PLAN
-
 Limit  (cost=2371037.74..2371037.74 rows=1 width=12) (actual 
time=160180.718..160180.718 rows=0 loops=1)
   ->  HashAggregate  (cost=1942478.48..2371037.74 rows=42855926 width=12) 
(actual time=54462.728..157594.756 rows=63215980 loops=1)
 Group Key: t.a
 Memory Usage: 4096kB  Batches: 8320  Disk Usage:4529172kB
 ->  Seq Scan on t  (cost=0.00..1442478.32 rows=10032 width=4) 
(actual time=0.014..12198.044 rows=1 loops=1)
 Planning Time: 0.110 ms
 Execution Time: 

Re: psql's \watch is broken

2019-12-14 Thread Fabien COELHO




> I've not dug into code itself, I just bisected it.

> Thanks for the report. I'll look into it.

> Looked at it already.

> Ah, the magic of timezones!
>
> And yes, I can see the difference.  This comes from the switch from
> cancel_pressed to CancelRequested in psql, especially PSQLexecWatch() in
> this case.  And actually, now that I look at it, I think that we should
> simply get rid of cancel_pressed in psql completely and replace it with
> CancelRequested.  This also removes the need of having cancel_pressed
> defined in print.c, which was not really wanted originally.  Attached is
> a patch which addresses the issue for me, and cleans up the code while
> on it.  Fabien, Jeff, can you confirm please?


Yep. Patch applies cleanly, compiles, works for me as well.

--
Fabien.




Re: [WIP] UNNEST(REFCURSOR): allowing SELECT to consume data from a REFCURSOR

2019-12-14 Thread Dent John
Hi folks,

I’ve made a revision of this patch. The significant change is to access the
Portal using Portal APIs rather than through SPI. It seems the persisted
state necessary to survive being used to retrieve a row at a time inside an
SRF just isn’t a good fit for SPI.

It turned out there was upstream machinery in the FunctionScan node that
prevented Postgres being able to pipeline SRFs, even if they return
ValuePerCall. So, in practice, this patch is of limited benefit without
another patch that changes that behaviour (see [1]). Nevertheless, the code
is independent so I’m submitting the two changes separately. I’ll push this
into the Jan commitfest.

denty.

[1] https://commitfest.postgresql.org/26/2372/

unnest-refcursor-v3.patch
Description: Binary data