Re: hash join error improvement (old)

2020-05-26 Thread Thomas Munro
On Wed, May 27, 2020 at 1:30 PM Tom Lane wrote:
> Alvaro Herrera writes:
> > There are more uses of BufFileRead that don't bother to distinguish
> > these two cases, though -- logtape.c, tuplestore.c,
> > gistbuildbuffers.c all do the same.
>
> Yeah.  I rather suspect that callers of BufFileRead/Write are mostly
> expecting that those functions will throw an ereport() for any interesting
> error condition.  Maybe we should make it so, instead of piecemeal fixing
> the callers?

Yeah.  I proposed that over here:

https://www.postgresql.org/message-id/ca+hukgk0w+gts8advskdvu7cfzse5q+0np_9kpsxg2na1ne...@mail.gmail.com

But I got stuck trying to figure out whether to back-patch (arguably
yes: there are bugs here, but arguably no: the interfaces change).




Re: hash join error improvement (old)

2020-05-26 Thread Tom Lane
Alvaro Herrera writes:
> There are more uses of BufFileRead that don't bother to distinguish
> these two cases, though -- logtape.c, tuplestore.c,
> gistbuildbuffers.c all do the same.

Yeah.  I rather suspect that callers of BufFileRead/Write are mostly
expecting that those functions will throw an ereport() for any interesting
error condition.  Maybe we should make it so, instead of piecemeal fixing
the callers?

regards, tom lane




Re: hash join error improvement (old)

2020-05-26 Thread Alvaro Herrera
On 2020-May-26, Tom Lane wrote:

> Are you sure you correctly identified the source of the bogus error
> report?

This version is better.  It doesn't touch the write side at all.
On the read side, it only reports a short read as such if errno is not set.

This error isn't seen frequently.  The only reference I found is this page
https://blog.csdn.net/pg_hgdb/article/details/106279303
(about a Postgres fork; it blames the error on the temp hash files being
encrypted and suggests increasing temp_buffers).

There are more uses of BufFileRead that don't bother to distinguish
these two cases, though -- logtape.c, tuplestore.c,
gistbuildbuffers.c all do the same.

-- 
Álvaro Herrera                https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
diff --git a/src/backend/executor/nodeHashjoin.c b/src/backend/executor/nodeHashjoin.c
index d13efc4d98..f91e69a09d 100644
--- a/src/backend/executor/nodeHashjoin.c
+++ b/src/backend/executor/nodeHashjoin.c
@@ -930,9 +930,17 @@ ExecHashJoinGetSavedTuple(HashJoinState *hjstate,
 		return NULL;
 	}
 	if (nread != sizeof(header))
-		ereport(ERROR,
-(errcode_for_file_access(),
- errmsg("could not read from hash-join temporary file: %m")));
+	{
+		if (errno == 0)
+			ereport(ERROR,
+	(errcode_for_file_access(),
+	 errmsg("could not read from hash-join temporary file: read only %d of %d bytes",
+			(int) nread, (int) sizeof(header))));
+		else
+			ereport(ERROR,
+	(errcode_for_file_access(),
+	 errmsg("could not read from hash-join temporary file: %m")));
+	}
 	*hashvalue = header[0];
 	tuple = (MinimalTuple) palloc(header[1]);
 	tuple->t_len = header[1];
@@ -940,9 +948,17 @@ ExecHashJoinGetSavedTuple(HashJoinState *hjstate,
 		(void *) ((char *) tuple + sizeof(uint32)),
 		header[1] - sizeof(uint32));
 	if (nread != header[1] - sizeof(uint32))
-		ereport(ERROR,
-(errcode_for_file_access(),
- errmsg("could not read from hash-join temporary file: %m")));
+	{
+		if (errno == 0)
+			ereport(ERROR,
+	(errcode_for_file_access(),
+	 errmsg("could not read from hash-join temporary file: read only %d of %d bytes",
+			(int) nread, (int) (header[1] - sizeof(uint32)))));
+		else
+			ereport(ERROR,
+	(errcode_for_file_access(),
+	 errmsg("could not read from hash-join temporary file: %m")));
+	}
 	return ExecStoreMinimalTuple(tuple, tupleSlot, true);
 }
 


Re: hash join error improvement (old)

2020-05-26 Thread Alvaro Herrera
On 2020-May-26, Tom Lane wrote:

> Digging further down, it looks like BufFileWrite calls BufFileDumpBuffer
> which calls FileWrite which takes pains to set errno correctly after a
> short write --- so other than the lack of commentary about these
> functions' error-reporting API, I don't think there's any actual bug here.

Doh, you're right, this patch is completely broken ... aside from
carelessly writing the wrong "if" test, my unfamiliarity with the stdio
fread/fwrite interface is showing.  I'll look more carefully.

> Are you sure you correctly identified the source of the bogus error
> report?

Nope.  And I wish the bogus error report was all there was to it.  The
actual problem is a server crash.

-- 
Álvaro Herrera                https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services




Re: hash join error improvement (old)

2020-05-26 Thread Tom Lane
Alvaro Herrera writes:
> Hmm, right -- I was extending the partial read case to apply to a
> partial write, and we deal with those very differently.  I changed the
> write case to use our standard approach.

Actually ... looking more closely, this proposed change in
ExecHashJoinSaveTuple flat out doesn't work, because it assumes that
BufFileWrite reports errors the same way as write(), which is not the
case.  In particular, written < 0 can't happen; moreover, you've
removed detection of a short write as opposed to a completely failed
write.

Digging further down, it looks like BufFileWrite calls BufFileDumpBuffer
which calls FileWrite which takes pains to set errno correctly after a
short write --- so other than the lack of commentary about these
functions' error-reporting API, I don't think there's any actual bug here.
Are you sure you correctly identified the source of the bogus error
report?

Similarly, I'm afraid you introduced rather than removed problems
in ExecHashJoinGetSavedTuple.  BufFileRead doesn't use negative
return values either.

regards, tom lane




Re: hash join error improvement (old)

2020-05-26 Thread Alvaro Herrera
Hi Tom, thanks for looking.

On 2020-May-25, Tom Lane wrote:

> I don't mind if you want to extend that paradigm to also use "wrote only
> %d bytes" wording, but the important point is to get the SQLSTATE set on
> the basis of ENOSPC rather than whatever random value errno will have
> otherwise.

Hmm, right -- I was extending the partial read case to apply to a
partial write, and we deal with those very differently.  I changed the
write case to use our standard approach.

-- 
Álvaro Herrera                https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
diff --git a/src/backend/executor/nodeHashjoin.c b/src/backend/executor/nodeHashjoin.c
index d13efc4d98..3817c41225 100644
--- a/src/backend/executor/nodeHashjoin.c
+++ b/src/backend/executor/nodeHashjoin.c
@@ -882,16 +882,26 @@ ExecHashJoinSaveTuple(MinimalTuple tuple, uint32 hashvalue,
 	}
 
 	written = BufFileWrite(file, (void *) &hashvalue, sizeof(uint32));
-	if (written != sizeof(uint32))
+	if (written < 0)
+	{
+		/* if write didn't set errno, assume problem is no disk space */
+		if (errno == 0)
+			errno = ENOSPC;
 		ereport(ERROR,
 (errcode_for_file_access(),
  errmsg("could not write to hash-join temporary file: %m")));
+	}
 
 	written = BufFileWrite(file, (void *) tuple, tuple->t_len);
-	if (written != tuple->t_len)
+	if (written < 0)
+	{
+		/* if write didn't set errno, assume problem is no disk space */
+		if (errno == 0)
+			errno = ENOSPC;
 		ereport(ERROR,
 (errcode_for_file_access(),
  errmsg("could not write to hash-join temporary file: %m")));
+	}
 }
 
 /*
@@ -929,20 +939,30 @@ ExecHashJoinGetSavedTuple(HashJoinState *hjstate,
 		ExecClearTuple(tupleSlot);
 		return NULL;
 	}
-	if (nread != sizeof(header))
+	else if (nread < 0)
 		ereport(ERROR,
 (errcode_for_file_access(),
  errmsg("could not read from hash-join temporary file: %m")));
+	else if (nread != sizeof(header))
+		ereport(ERROR,
+(errcode_for_file_access(),
+ errmsg("could not read from hash-join temporary file: read only %d of %d bytes",
+		(int) nread, (int) sizeof(header))));
 	*hashvalue = header[0];
 	tuple = (MinimalTuple) palloc(header[1]);
 	tuple->t_len = header[1];
 	nread = BufFileRead(file,
 		(void *) ((char *) tuple + sizeof(uint32)),
 		header[1] - sizeof(uint32));
-	if (nread != header[1] - sizeof(uint32))
+	if (nread < 0)
 		ereport(ERROR,
 (errcode_for_file_access(),
  errmsg("could not read from hash-join temporary file: %m")));
+	else if (nread != header[1] - sizeof(uint32))	/* might read zero bytes */
+		ereport(ERROR,
+(errcode_for_file_access(),
+ errmsg("could not read from hash-join temporary file: read only %d of %d bytes",
+		(int) nread, (int) (header[1] - sizeof(uint32)))));
 	return ExecStoreMinimalTuple(tuple, tupleSlot, true);
 }
 


Re: hash join error improvement (old)

2020-05-25 Thread Tom Lane
Alvaro Herrera writes:
> I recently noticed this in a customer log file:
>   ERROR:  could not read from hash-join temporary file: Success
> The problem is we're reporting with %m when the problem is a partial
> read or write.
> I propose the attached patch to solve it: report "wrote only X of Y
> bytes".  This caused a lot of other trouble, the cause of which I've
> been unable to pinpoint as yet.  But in the meantime, this is already a
> small improvement.

+1 for the idea, but there's still one small problem with what you did
here: errcode_for_file_access() looks at errno, which we can presume
will not have a relevant value in the "wrote only %d bytes" paths.

Most places that are taking pains to deal with this scenario
do something like

	errno = 0;
	if (pwrite(fd, data, len, xlrec->offset) != len)
	{
		/* if write didn't set errno, assume problem is no disk space */
		if (errno == 0)
			errno = ENOSPC;
		ereport(ERROR,
				(errcode_for_file_access(),
				 errmsg("could not write to file \"%s\": %m", path)));
	}

I don't mind if you want to extend that paradigm to also use "wrote only
%d bytes" wording, but the important point is to get the SQLSTATE set on
the basis of ENOSPC rather than whatever random value errno will have
otherwise.

regards, tom lane




hash join error improvement (old)

2020-05-25 Thread Alvaro Herrera
I recently noticed this in a customer log file:
  ERROR:  could not read from hash-join temporary file: Success

The problem is we're reporting with %m when the problem is a partial
read or write.

I propose the attached patch to solve it: report "wrote only X of Y
bytes".  This caused a lot of other trouble, the cause of which I've
been unable to pinpoint as yet.  But in the meantime, this is already a
small improvement.

-- 
Álvaro Herrera
diff --git a/src/backend/executor/nodeHashjoin.c b/src/backend/executor/nodeHashjoin.c
index d13efc4d98..377554772b 100644
--- a/src/backend/executor/nodeHashjoin.c
+++ b/src/backend/executor/nodeHashjoin.c
@@ -882,16 +882,26 @@ ExecHashJoinSaveTuple(MinimalTuple tuple, uint32 hashvalue,
 	}
 
 	written = BufFileWrite(file, (void *) &hashvalue, sizeof(uint32));
+	if (written < 0)
+		ereport(ERROR,
+(errcode_for_file_access(),
+ errmsg("could not write to hash-join temporary file: %m")));
 	if (written != sizeof(uint32))
 		ereport(ERROR,
 (errcode_for_file_access(),
- errmsg("could not write to hash-join temporary file: %m")));
+ errmsg("could not write to hash-join temporary file: wrote only %d of %d bytes",
+		(int) written, (int) sizeof(uint32))));
 
 	written = BufFileWrite(file, (void *) tuple, tuple->t_len);
-	if (written != tuple->t_len)
+	if (written < 0)
 		ereport(ERROR,
 (errcode_for_file_access(),
  errmsg("could not write to hash-join temporary file: %m")));
+	if (written != tuple->t_len)
+		ereport(ERROR,
+(errcode_for_file_access(),
+ errmsg("could not write to hash-join temporary file: wrote only %d of %d bytes",
+		(int) written, (int) tuple->t_len)));
 }
 
 /*
@@ -929,20 +939,30 @@ ExecHashJoinGetSavedTuple(HashJoinState *hjstate,
 		ExecClearTuple(tupleSlot);
 		return NULL;
 	}
-	if (nread != sizeof(header))
+	if (nread < 0)
 		ereport(ERROR,
 (errcode_for_file_access(),
  errmsg("could not read from hash-join temporary file: %m")));
+	if (nread != sizeof(header))
+		ereport(ERROR,
+(errcode_for_file_access(),
+ errmsg("could not read from hash-join temporary file: read only %d of %d bytes",
+		(int) nread, (int) sizeof(header))));
 	*hashvalue = header[0];
 	tuple = (MinimalTuple) palloc(header[1]);
 	tuple->t_len = header[1];
 	nread = BufFileRead(file,
 		(void *) ((char *) tuple + sizeof(uint32)),
 		header[1] - sizeof(uint32));
-	if (nread != header[1] - sizeof(uint32))
+	if (nread < 0)
 		ereport(ERROR,
 (errcode_for_file_access(),
  errmsg("could not read from hash-join temporary file: %m")));
+	if (nread != header[1] - sizeof(uint32))
+		ereport(ERROR,
+(errcode_for_file_access(),
+ errmsg("could not read from hash-join temporary file: read only %d of %d bytes",
+		(int) nread, (int) (header[1] - sizeof(uint32)))));
 	return ExecStoreMinimalTuple(tuple, tupleSlot, true);
 }