Re: [PATCH v1] pg_ls_tmpdir to show directories

2020-03-03 Thread Bossart, Nathan
On 3/3/20, 12:24 PM, "Alvaro Herrera"  wrote:
> IMO: is_dir should be there (and subdirs should be listed), but
> parent_dir should not appear.  Also, the "path" should show the complete
> pathname, including containing dirs, starting from whatever the "root"
> is for the operation.
>
> So for the example in the initial email, it would look like
>
> pathisdir
> pgsql_tmp11025.0.sharedfileset/ t
> pgsql_tmp11025.0.sharedfileset/0.0  f
> pgsql_tmp11025.0.sharedfileset/1.0  f
>
> plus additional columns, same as pg_ls_waldir et al.
>
> I'd rather not have the code assume that there's a single level of
> subdirs, or assuming that an entry in the subdir cannot itself be a dir;
> that might end up hiding files for no good reason.

+1

Nathan



Re: [PATCH v1] pg_ls_tmpdir to show directories

2020-03-03 Thread Alvaro Herrera
On 2020-Mar-03, Justin Pryzby wrote:

> But I don't think it makes sense to go through more implementation/review
> cycles without some agreement from a larger group regarding the
> desired/intended interface.  Should there be a column for "parent dir" ?  Or a
> column for "is_dir" ?  Should dirs be shown at all, or only files ?

IMO: is_dir should be there (and subdirs should be listed), but
parent_dir should not appear.  Also, the "path" should show the complete
pathname, including containing dirs, starting from whatever the "root"
is for the operation.

So for the example in the initial email, it would look like

pathisdir
pgsql_tmp11025.0.sharedfileset/ t
pgsql_tmp11025.0.sharedfileset/0.0  f
pgsql_tmp11025.0.sharedfileset/1.0  f

plus additional columns, same as pg_ls_waldir et al.

I'd rather not have the code assume that there's a single level of
subdirs, or assuming that an entry in the subdir cannot itself be a dir;
that might end up hiding files for no good reason.

I don't understand what purpose is served by having pg_ls_waldir() hide
directories.

-- 
Álvaro Herrerahttps://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services




Re: [PATCH v1] pg_ls_tmpdir to show directories

2020-03-03 Thread Justin Pryzby
On Tue, Mar 03, 2020 at 02:51:54PM -0500, David Steele wrote:
> Hi Fabien,
> 
> On 1/16/20 9:38 AM, Justin Pryzby wrote:
> >On Thu, Jan 16, 2020 at 09:34:32AM +0100, Fabien COELHO wrote:
> >>Also, I'm not fully sure why ".*" files should be skipped, maybe it should
> >>be an option? Or the user can filter it with SQL if it does not want them?
> >
> >I think if someone wants the full generality, they can do this:
> >
> >postgres=# SELECT name, s.size, s.modification, s.isdir FROM (SELECT 
> >'base/pgsql_tmp'p)p, pg_ls_dir(p)name, pg_stat_file(p||'/'||name)s;
> >  name | size |  modification  | isdir
> >--+--++---
> >  .foo | 4096 | 2020-01-16 08:57:04-05 | t
> >
> >In my mind, pg_ls_tmpdir() is for showing tmpfiles, not just a shortcut to
> >SELECT pg_ls_dir((SELECT 'base/pgsql_tmp'p)); -- or, for all tablespaces:
> >WITH x AS (SELECT format('/PG_%s_%s', 
> >split_part(current_setting('server_version'), '.', 1), catalog_version_no) 
> >suffix FROM pg_control_system()), y AS (SELECT a, pg_ls_dir(a) AS d FROM 
> >(SELECT DISTINCT COALESCE(NULLIF(pg_tablespace_location(oid),'')||suffix, 
> >'base') a FROM pg_tablespace,x)a) SELECT a, pg_ls_dir(a||'/pgsql_tmp') FROM 
> >y WHERE d='pgsql_tmp';
> >
> >I think changing dotfiles is topic for another patch.
> >That would also affect pg_ls_dir, and everything else that uses the backing
> >function pg_ls_dir_files_recurse.  I'd have to ask why not also show . and 
> >.. ?
> >
> >(In fact, if I were to change anything, I would propose to limit 
> >pg_ls_tmpdir()
> >to files matching PG_TEMP_FILE_PREFIX).
> 
> We seem to be at an impasse on this patch.  What do you think of Justin's
> comments here?

Actually, I found Fabien's comment regarding extensions use of tmp dir to be
convincing.  And I'm willing to update the patch to use a stack to show
arbitrarily-deep files/dirs rather than just one level deep (as used for shared
filesets in core postgres).

But I don't think it makes sense to go through more implementation/review
cycles without some agreement from a larger group regarding the
desired/intended interface.  Should there be a column for "parent dir" ?  Or a
column for "is_dir" ?  Should dirs be shown at all, or only files ?

-- 
Justin




Re: [PATCH v1] pg_ls_tmpdir to show directories

2020-03-03 Thread David Steele

Hi Fabien,

On 1/16/20 9:38 AM, Justin Pryzby wrote:

On Thu, Jan 16, 2020 at 09:34:32AM +0100, Fabien COELHO wrote:

Also, I'm not fully sure why ".*" files should be skipped, maybe it should
be an option? Or the user can filter it with SQL if it does not want them?


I think if someone wants the full generality, they can do this:

postgres=# SELECT name, s.size, s.modification, s.isdir FROM (SELECT 
'base/pgsql_tmp'p)p, pg_ls_dir(p)name, pg_stat_file(p||'/'||name)s;
  name | size |  modification  | isdir
--+--++---
  .foo | 4096 | 2020-01-16 08:57:04-05 | t

In my mind, pg_ls_tmpdir() is for showing tmpfiles, not just a shortcut to
SELECT pg_ls_dir((SELECT 'base/pgsql_tmp'p)); -- or, for all tablespaces:
WITH x AS (SELECT format('/PG_%s_%s', 
split_part(current_setting('server_version'), '.', 1), catalog_version_no) 
suffix FROM pg_control_system()), y AS (SELECT a, pg_ls_dir(a) AS d FROM 
(SELECT DISTINCT COALESCE(NULLIF(pg_tablespace_location(oid),'')||suffix, 
'base') a FROM pg_tablespace,x)a) SELECT a, pg_ls_dir(a||'/pgsql_tmp') FROM y 
WHERE d='pgsql_tmp';

I think changing dotfiles is topic for another patch.
That would also affect pg_ls_dir, and everything else that uses the backing
function pg_ls_dir_files_recurse.  I'd have to ask why not also show . and .. ?

(In fact, if I were to change anything, I would propose to limit pg_ls_tmpdir()
to files matching PG_TEMP_FILE_PREFIX).


We seem to be at an impasse on this patch.  What do you think of 
Justin's comments here?


Do you still believe a different implementation is required?

Regards,
--
-David
da...@pgmasters.net




Re: [PATCH v1] pg_ls_tmpdir to show directories

2020-01-16 Thread Justin Pryzby
On Thu, Jan 16, 2020 at 09:34:32AM +0100, Fabien COELHO wrote:
> Also, I'm not fully sure why ".*" files should be skipped, maybe it should
> be an option? Or the user can filter it with SQL if it does not want them?

I think if someone wants the full generality, they can do this:

postgres=# SELECT name, s.size, s.modification, s.isdir FROM (SELECT 
'base/pgsql_tmp'p)p, pg_ls_dir(p)name, pg_stat_file(p||'/'||name)s;
 name | size |  modification  | isdir 
--+--++---
 .foo | 4096 | 2020-01-16 08:57:04-05 | t

In my mind, pg_ls_tmpdir() is for showing tmpfiles, not just a shortcut to
SELECT pg_ls_dir((SELECT 'base/pgsql_tmp'p)); -- or, for all tablespaces:
WITH x AS (SELECT format('/PG_%s_%s', 
split_part(current_setting('server_version'), '.', 1), catalog_version_no) 
suffix FROM pg_control_system()), y AS (SELECT a, pg_ls_dir(a) AS d FROM 
(SELECT DISTINCT COALESCE(NULLIF(pg_tablespace_location(oid),'')||suffix, 
'base') a FROM pg_tablespace,x)a) SELECT a, pg_ls_dir(a||'/pgsql_tmp') FROM y 
WHERE d='pgsql_tmp';

I think changing dotfiles is topic for another patch.
That would also affect pg_ls_dir, and everything else that uses the backing
function pg_ls_dir_files_recurse.  I'd have to ask why not also show . and .. ?

(In fact, if I were to change anything, I would propose to limit pg_ls_tmpdir()
to files matching PG_TEMP_FILE_PREFIX).

Justin




Re: [PATCH v1] pg_ls_tmpdir to show directories

2020-01-16 Thread Fabien COELHO



Hello Justin,


I'm trying to think about how to get rid of the strange structure and hacks,
and the arbitrary looking size 2 array.

Also the recursion is one step, but I'm not sure why, ISTM it could/should
go on always?


Because tmpfiles only go one level deep.


I'm not sure it is a general rule. ISTM that extensions can use tmp files, 
and we would have no control about what they would do there.



Looking at the code, ISTM that relying on a stack/list would be much cleaner
and easier to understand. The code could look like:


I'm willing to change the implementation, but only after there's an agreement
about the desired behavior (extra column, one level, etc).


For the level, ISTM that the implementation should not make this 
assumption. If in practice there is just one level, then the function will 
not recurse deep, no problem.


For the column, I'm not sure that "isdir" is necessary.

You could put it implicitely in the file name by ending it with "/", 
and/or showing the directory contents is enough a hint that there is a 
directory?


Also, I'm not fully sure why ".*" files should be skipped, maybe it should 
be an option? Or the user can filter it with SQL if it does not want them?


--
Fabien.




Re: [PATCH v1] pg_ls_tmpdir to show directories

2020-01-15 Thread Justin Pryzby
On Wed, Jan 15, 2020 at 11:21:36AM +0100, Fabien COELHO wrote:
> I'm trying to think about how to get rid of the strange structure and hacks,
> and the arbitrary looking size 2 array.
> 
> Also the recursion is one step, but I'm not sure why, ISTM it could/should
> go on always?

Because tmpfiles only go one level deep.

> Looking at the code, ISTM that relying on a stack/list would be much cleaner
> and easier to understand. The code could look like:

I'm willing to change the implementation, but only after there's an agreement
about the desired behavior (extra column, one level, etc).

Justin




Re: [PATCH v1] pg_ls_tmpdir to show directories

2020-01-15 Thread Fabien COELHO



Hello Justin,

About this v4.

It applies cleanly.

I'm trying to think about how to get rid of the strange structure and 
hacks, and the arbitrary looking size 2 array.


Also the recursion is one step, but I'm not sure why, ISTM it could/should 
go on always?


Maybe the recursive implementation was not such a good idea, if I 
suggested it is because I did not noticed the "return next" re-entrant 
stuff, shame on me.


Looking at the code, ISTM that relying on a stack/list would be much 
cleaner and easier to understand. The code could look like:


  ls_func()
if (first_time)
  initialize description
  set next directory to visit
while (1)
   if next directory
 init/push next directory to visit as current
   read current directory
 if emty
   pop/close current directory
 elif is_a_dir and recursion allowed
   set next directory to visit
 else ...
   return next tuple
cleanup

The point is to avoid a hack around the directory_fctx array, to have only 
one place to push/init a new directory (i.e. call AllocateDir and play 
around with the memory context) instead of two, and to allow deeper 
recursion if useful.


Details : snprintf return is not checked. Maybe it should say why an 
overflow cannot occur.


"bool nulls[3] = { 0,};" -> "bool nulls[3} = { false, false, false };"?

--
Fabien.




Re: [PATCH v1] pg_ls_tmpdir to show directories

2019-12-28 Thread Justin Pryzby
Here's a version which uses an array of directory_fctx, rather than of DIR and
location.  That avoids changing the data structure and collatoral implications
to pg_ls_dir().

Currently, this *shows* subdirs of subdirs, but doesn't decend into them.
So I think maybe toplevel subdirs should be shown, too.
And maybe the is_dir flag should be re-introduced (although someone could call
pg_stat_file if needed).
I'm interested to hear feedback on that, although this patch still isn't great.
>From dd3b2779939fc1b396fed1fba2f7cefc9a6b1ad5 Mon Sep 17 00:00:00 2001
From: Justin Pryzby 
Date: Fri, 27 Dec 2019 23:34:14 -0600
Subject: [PATCH v4 1/2] BUG: in errmsg

Note there's two changes here.
Should backpatch to v12, where pg_ls_tmpdir was added.
---
 src/backend/utils/adt/genfile.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/src/backend/utils/adt/genfile.c b/src/backend/utils/adt/genfile.c
index 5d4f26a..c978e15 100644
--- a/src/backend/utils/adt/genfile.c
+++ b/src/backend/utils/adt/genfile.c
@@ -590,7 +590,7 @@ pg_ls_dir_files(FunctionCallInfo fcinfo, const char *dir, bool missing_ok)
 		if (stat(path, ) < 0)
 			ereport(ERROR,
 	(errcode_for_file_access(),
-	 errmsg("could not stat directory \"%s\": %m", dir)));
+	 errmsg("could not stat file \"%s\": %m", path)));
 
 		/* Ignore anything but regular files */
 		if (!S_ISREG(attrib.st_mode))
-- 
2.7.4

>From 30031c790fe5bb358f0bb372cb2d7975e2d688aa Mon Sep 17 00:00:00 2001
From: Justin Pryzby 
Date: Sat, 14 Dec 2019 16:22:15 -0600
Subject: [PATCH v4 2/2] pg_ls_tmpdir to show directories

See also 9cd92d1a33699f86aa53d44ab04cc3eb50c18d11
---
 doc/src/sgml/func.sgml  |  14 +++--
 src/backend/utils/adt/genfile.c | 136 ++--
 2 files changed, 97 insertions(+), 53 deletions(-)

diff --git a/doc/src/sgml/func.sgml b/doc/src/sgml/func.sgml
index 5a98158..8abc643 100644
--- a/doc/src/sgml/func.sgml
+++ b/doc/src/sgml/func.sgml
@@ -21922,12 +21922,14 @@ postgres=# SELECT * FROM pg_walfile_name_offset(pg_stop_backup());

setof record

-List the name, size, and last modification time of files in the
-temporary directory for tablespace.  If
-tablespace is not provided, the
-pg_default tablespace is used.  Access is granted
-to members of the pg_monitor role and may be
-granted to other non-superuser roles.
+For files in the temporary directory for
+tablespace, list the name, size, and last modification time.
+Files beneath a first-level directory are shown, and include a pathname
+component of their parent directory; such files are used by parallel processes.
+If tablespace is not provided, the
+pg_default tablespace is used.  Access is granted to
+members of the pg_monitor role and may be granted to
+other non-superuser roles.

   
   
diff --git a/src/backend/utils/adt/genfile.c b/src/backend/utils/adt/genfile.c
index c978e15..6bfac64 100644
--- a/src/backend/utils/adt/genfile.c
+++ b/src/backend/utils/adt/genfile.c
@@ -522,12 +522,84 @@ pg_ls_dir_1arg(PG_FUNCTION_ARGS)
 	return pg_ls_dir(fcinfo);
 }
 
-/* Generic function to return a directory listing of files */
+/* Recursive helper to handle showing a first level of files beneath a subdir */
 static Datum
-pg_ls_dir_files(FunctionCallInfo fcinfo, const char *dir, bool missing_ok)
+pg_ls_dir_files_recurse(FunctionCallInfo fcinfo, FuncCallContext *funcctx, const char *dir, bool missing_ok, bool dir_ok)
+{
+	bool		nulls[3] = {0,};
+	Datum		values[3];
+
+	directory_fctx	*fctx = (directory_fctx *) funcctx->user_fctx;
+
+	while (1) {
+		struct dirent *de;
+		char *location;
+		DIR *dirdesc;
+
+		location = fctx[1].location ? fctx[1].location : fctx[0].location;
+		dirdesc = fctx[1].dirdesc ? fctx[1].dirdesc : fctx[0].dirdesc;
+
+		while ((de = ReadDir(dirdesc, location)) != NULL)
+		{
+			char		path[MAXPGPATH * 2];
+			HeapTuple	tuple;
+			struct stat	attrib;
+
+			/* Skip hidden files */
+			if (de->d_name[0] == '.')
+continue;
+
+			/* Get the file info */
+			snprintf(path, sizeof(path), "%s/%s", location, de->d_name);
+			if (stat(path, ) < 0)
+ereport(ERROR,
+		(errcode_for_file_access(),
+		 errmsg("could not stat file \"%s\": %m", path)));
+
+			/* Ignore anything but regular files or (if requested) dirs */
+			if (S_ISDIR(attrib.st_mode)) {
+/* Note: decend into dirs, but do not return a tuple for the dir itself */
+/* Do not expect dirs more than one level deep */
+if (dir_ok && !fctx[1].location) {
+	MemoryContext oldcontext;
+	oldcontext = MemoryContextSwitchTo(funcctx->multi_call_memory_ctx);
+	fctx[1].location = pstrdup(path);
+	fctx[1].dirdesc = AllocateDir(path);
+	MemoryContextSwitchTo(oldcontext);
+	return pg_ls_dir_files_recurse(fcinfo, funcctx, path, missing_ok, false);
+}
+/* else: fall through and show the dir itself 

Re: [PATCH v1] pg_ls_tmpdir to show directories

2019-12-28 Thread Fabien COELHO


Hello Justin,


Ok, so this suggests recursing into subdirs, which requires to make a
separate function of the inner loop.


Yea, it suggests that; but, SRF_RETURN_NEXT doesn't make that very easy.
It'd need to accept the fcinfo argument, and pg_ls_dir_files would call it once
for every tuple to be returned.  So it's recursive and saves its state...

The attached is pretty ugly, but I can't see how to do better.


Hmmm… I do agree with pretty ugly:-)

If it really neads to be in the structure, why not save the open directory 
field in the caller and restore it after the recursive call, and pass the 
rest of the structure as a pointer? Something like:


  ... root_function(...)
  {
 struct status_t status = initialization();
 ...
 call rec_function(, path)
 ...
 cleanup();
  }

  ... rec_function(struct *status, path)
  {
 status->dir = opendir(path);
 for (dir contents)
 {
if (it_is_a_dir)
{
 /* some comment about why we do that… */
 dir_t saved = status->dir;
 status->dir = NULL;
 rec_function(status, subpath);
 status->dir = saved;
}
 }
closedir(status->dir), status->dir = NULL;
  }


The alternative seems to be to build up a full list of pathnames in the SRF
initial branch, and stat them all later.  Or stat them all in the INITIAL case,
and keep a list of stat structures to be emited during future calls.


--
Fabien.

Re: [PATCH v1] pg_ls_tmpdir to show directories

2019-12-28 Thread Justin Pryzby
On Sat, Dec 28, 2019 at 07:52:55AM +0100, Fabien COELHO wrote:
> >>Why not simply showing the files underneath their directories?
> >>
> >>  /path/to/tmp/file1
> >>  /path/to/tmp/subdir1/file2
> >>
> >>In which case probably showing the directory itself is not useful,
> >>and the is_dir column could be dropped?
> >
> >The names are expected to look like this:
> >
> >$ sudo find /var/lib/pgsql/12/data/base/pgsql_tmp -ls
> >1429774 drwxr-x---   3 postgres postgres 4096 Dec 27 13:51 
> >/var/lib/pgsql/12/data/base/pgsql_tmp
> >1698684 drwxr-x---   2 postgres postgres 4096 Dec  7 01:35 
> >/var/lib/pgsql/12/data/base/pgsql_tmp/pgsql_tmp11025.0.sharedfileset
> >169347 5492 -rw-r-   1 postgres postgres  5619712 Dec  7 01:35 
> >/var/lib/pgsql/12/data/base/pgsql_tmp/pgsql_tmp11025.0.sharedfileset/0.0
> >169346 5380 -rw-r-   1 postgres postgres  5505024 Dec  7 01:35 
> >/var/lib/pgsql/12/data/base/pgsql_tmp/pgsql_tmp11025.0.sharedfileset/1.0
> >
> >I think we'd have to show sudbdir/file1, subdir/file2, not just file1, file2.
> >It doesn't seem useful or nice to show a bunch of files called 0.0 or 1.0.
> >Actually the results should be unique, either on filename or (dir,file).
> 
> Ok, so this suggests recursing into subdirs, which requires to make a
> separate function of the inner loop.

Yea, it suggests that; but, SRF_RETURN_NEXT doesn't make that very easy.
It'd need to accept the fcinfo argument, and pg_ls_dir_files would call it once
for every tuple to be returned.  So it's recursive and saves its state...

The attached is pretty ugly, but I can't see how to do better.
The alternative seems to be to build up a full list of pathnames in the SRF
initial branch, and stat them all later.  Or stat them all in the INITIAL case,
and keep a list of stat structures to be emited during future calls.

BTW, it seems to me this error message should be changed:

snprintf(path, sizeof(path), "%s/%s", fctx->location, 
de->d_name);
if (stat(path, ) < 0)
ereport(ERROR,
(errcode_for_file_access(),
-errmsg("could not stat directory 
\"%s\": %m", dir)));
+errmsg("could not stat file \"%s\": 
%m", path)));

>From fd88be5f1687354d9990fb1838adc0db36bc6dde Mon Sep 17 00:00:00 2001
From: Justin Pryzby 
Date: Fri, 27 Dec 2019 23:34:14 -0600
Subject: [PATCH v2 1/2] BUG: in errmsg

---
 src/backend/utils/adt/genfile.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/src/backend/utils/adt/genfile.c b/src/backend/utils/adt/genfile.c
index 5d4f26a..c978e15 100644
--- a/src/backend/utils/adt/genfile.c
+++ b/src/backend/utils/adt/genfile.c
@@ -590,7 +590,7 @@ pg_ls_dir_files(FunctionCallInfo fcinfo, const char *dir, bool missing_ok)
 		if (stat(path, ) < 0)
 			ereport(ERROR,
 	(errcode_for_file_access(),
-	 errmsg("could not stat directory \"%s\": %m", dir)));
+	 errmsg("could not stat file \"%s\": %m", path)));
 
 		/* Ignore anything but regular files */
 		if (!S_ISREG(attrib.st_mode))
-- 
2.7.4

>From fff91aec87f635755527e91aebb7554fa6385fec Mon Sep 17 00:00:00 2001
From: Justin Pryzby 
Date: Sat, 14 Dec 2019 16:22:15 -0600
Subject: [PATCH v2 2/2] pg_ls_tmpdir to show directories

See also 9cd92d1a33699f86aa53d44ab04cc3eb50c18d11
---
 doc/src/sgml/func.sgml   |  14 +++--
 src/backend/utils/adt/genfile.c  | 132 ++-
 src/include/catalog/catversion.h |   2 +-
 3 files changed, 96 insertions(+), 52 deletions(-)

diff --git a/doc/src/sgml/func.sgml b/doc/src/sgml/func.sgml
index 5a98158..8abc643 100644
--- a/doc/src/sgml/func.sgml
+++ b/doc/src/sgml/func.sgml
@@ -21922,12 +21922,14 @@ postgres=# SELECT * FROM pg_walfile_name_offset(pg_stop_backup());

setof record

-List the name, size, and last modification time of files in the
-temporary directory for tablespace.  If
-tablespace is not provided, the
-pg_default tablespace is used.  Access is granted
-to members of the pg_monitor role and may be
-granted to other non-superuser roles.
+For files in the temporary directory for
+tablespace, list the name, size, and last modification time.
+Files beneath a first-level directory are shown, and include a pathname
+component of their parent directory; such files are used by parallel processes.
+If tablespace is not provided, the
+pg_default tablespace is used.  Access is granted to
+members of the pg_monitor role and may be granted to
+other non-superuser roles.

   
   
diff --git a/src/backend/utils/adt/genfile.c b/src/backend/utils/adt/genfile.c
index c978e15..2b540e9 100644
--- a/src/backend/utils/adt/genfile.c
+++ b/src/backend/utils/adt/genfile.c
@@ -37,8 +37,12 @@
 
 typedef struct
 {
-	char	   *location;
-	DIR		   

Re: [PATCH v1] pg_ls_tmpdir to show directories

2019-12-27 Thread Fabien COELHO


Hello Justin,


Why not simply showing the files underneath their directories?

  /path/to/tmp/file1
  /path/to/tmp/subdir1/file2

In which case probably showing the directory itself is not useful,
and the is_dir column could be dropped?


The names are expected to look like this:

$ sudo find /var/lib/pgsql/12/data/base/pgsql_tmp -ls
1429774 drwxr-x---   3 postgres postgres 4096 Dec 27 13:51 
/var/lib/pgsql/12/data/base/pgsql_tmp
1698684 drwxr-x---   2 postgres postgres 4096 Dec  7 01:35 
/var/lib/pgsql/12/data/base/pgsql_tmp/pgsql_tmp11025.0.sharedfileset
169347 5492 -rw-r-   1 postgres postgres  5619712 Dec  7 01:35 
/var/lib/pgsql/12/data/base/pgsql_tmp/pgsql_tmp11025.0.sharedfileset/0.0
169346 5380 -rw-r-   1 postgres postgres  5505024 Dec  7 01:35 
/var/lib/pgsql/12/data/base/pgsql_tmp/pgsql_tmp11025.0.sharedfileset/1.0

I think we'd have to show sudbdir/file1, subdir/file2, not just file1, file2.
It doesn't seem useful or nice to show a bunch of files called 0.0 or 1.0.
Actually the results should be unique, either on filename or (dir,file).


Ok, so this suggests recursing into subdirs, which requires to make a 
separate function of the inner loop.



It's worth thinking if subdir should be a separate column.


My 0.02 €: I would rather simply keep the full path and just add subdir 
contents, so that the function output does not change at all.


--
Fabien.

Re: [PATCH v1] pg_ls_tmpdir to show directories

2019-12-27 Thread Justin Pryzby
On Fri, Dec 27, 2019 at 06:50:24PM +0100, Fabien COELHO wrote:
> >On Fri, Dec 27, 2019 at 05:22:47PM +0100, Fabien COELHO wrote:
> >>The implementation simply extends an existing functions with a boolean to
> >>allow for sub-directories. However, the function does not seem to show
> >>subdir contents recursively. Should it be the case?
> >
> >>STM that "//"-comments are not project policy.
> >
> >Sure, but the patch is less important than the design, which needs to be
> >addressed first.  The goal is to somehow show tmpfiles (or at least dirs) 
> >used
> >by parallel workers.  I mentioned a few possible ways, of which this was the
> >simplest to implement.  Showing files beneath the dir is probably good, but
> >need to decide how to present it.  Should there be a column for the dir (null
> >if not a shared filesets)?  Or some other presentation, like a boolean column
> >"is_shared_fileset".
> 
> Why not simply showing the files underneath their directories?
> 
>   /path/to/tmp/file1
>   /path/to/tmp/subdir1/file2
> 
> In which case probably showing the directory itself is not useful,
> and the is_dir column could be dropped?

The names are expected to look like this:

$ sudo find /var/lib/pgsql/12/data/base/pgsql_tmp -ls
1429774 drwxr-x---   3 postgres postgres 4096 Dec 27 13:51 
/var/lib/pgsql/12/data/base/pgsql_tmp
1698684 drwxr-x---   2 postgres postgres 4096 Dec  7 01:35 
/var/lib/pgsql/12/data/base/pgsql_tmp/pgsql_tmp11025.0.sharedfileset
169347 5492 -rw-r-   1 postgres postgres  5619712 Dec  7 01:35 
/var/lib/pgsql/12/data/base/pgsql_tmp/pgsql_tmp11025.0.sharedfileset/0.0
169346 5380 -rw-r-   1 postgres postgres  5505024 Dec  7 01:35 
/var/lib/pgsql/12/data/base/pgsql_tmp/pgsql_tmp11025.0.sharedfileset/1.0

I think we'd have to show sudbdir/file1, subdir/file2, not just file1, file2.
It doesn't seem useful or nice to show a bunch of files called 0.0 or 1.0.
Actually the results should be unique, either on filename or (dir,file).

"ls" wouldn't list same name twice, unless you list multiple dirs, like:
|ls a/b c/d.

It's worth thinking if subdir should be a separate column.

I'm interested to hear back from others.

Justin




Re: [PATCH v1] pg_ls_tmpdir to show directories

2019-12-27 Thread Fabien COELHO




Re-added -hackers.


Indeed, I left it out by accident. I tried to bounce the original mail but 
it did not work.



Thanks for reviewing.

On Fri, Dec 27, 2019 at 05:22:47PM +0100, Fabien COELHO wrote:

The implementation simply extends an existing functions with a boolean to
allow for sub-directories. However, the function does not seem to show
subdir contents recursively. Should it be the case?



STM that "//"-comments are not project policy.


Sure, but the patch is less important than the design, which needs to be
addressed first.  The goal is to somehow show tmpfiles (or at least dirs) used
by parallel workers.  I mentioned a few possible ways, of which this was the
simplest to implement.  Showing files beneath the dir is probably good, but
need to decide how to present it.  Should there be a column for the dir (null
if not a shared filesets)?  Or some other presentation, like a boolean column
"is_shared_fileset".


Why not simply showing the files underneath their directories?

  /path/to/tmp/file1
  /path/to/tmp/subdir1/file2

In which case probably showing the directory itself is not useful,
and the is_dir column could be dropped?


I'm unconvinced by the skipping condition:

  +  if (!S_ISREG(attrib.st_mode) &&
  +  (!dir_ok && S_ISDIR(attrib.st_mode)))
continue;

which is pretty hard to read. ISTM you meant "not A and not (B and C)"
instead?


I can write it as two ifs.


Hmmm. Not sure it would help that much. At least the condition must be 
right. Also, the comment should be updated.



 And, it's probably better to say:
 if (!ISDIR() || !dir_ok)


I cannot say I like it. dir_ok is cheaper to test so could come first.


..which is same as: !(B && C), as you said.


Ok, so you confirm that the condition was wrong.

Catalog update should be a date + number? Maybe this is best left to 
the committer?


Yes, but I preferred to include it, causing a deliberate conflict, to ensure
it's not forgotten.


Ok.

--
Fabien.




Re: [PATCH v1] pg_ls_tmpdir to show directories

2019-12-27 Thread Justin Pryzby
Re-added -hackers.

Thanks for reviewing.

On Fri, Dec 27, 2019 at 05:22:47PM +0100, Fabien COELHO wrote:
> The implementation simply extends an existing functions with a boolean to
> allow for sub-directories. However, the function does not seem to show
> subdir contents recursively. Should it be the case?

> STM that "//"-comments are not project policy.

Sure, but the patch is less important than the design, which needs to be
addressed first.  The goal is to somehow show tmpfiles (or at least dirs) used
by parallel workers.  I mentioned a few possible ways, of which this was the
simplest to implement.  Showing files beneath the dir is probably good, but
need to decide how to present it.  Should there be a column for the dir (null
if not a shared filesets)?  Or some other presentation, like a boolean column
"is_shared_fileset".

> I'm unconvinced by the skipping condition:
> 
>   +  if (!S_ISREG(attrib.st_mode) &&
>   +  (!dir_ok && S_ISDIR(attrib.st_mode)))
> continue;
> 
> which is pretty hard to read. ISTM you meant "not A and not (B and C)"
> instead?

I can write it as two ifs.  And, it's probably better to say:
if (!ISDIR() || !dir_ok)

..which is same as: !(B && C), as you said.

> Catalog update should be a date + number? Maybe this is best left to the
> committer?

Yes, but I preferred to include it, causing a deliberate conflict, to ensure
it's not forgotten.

Thanks,
Justin