Re: Permission failures with WAL files in 13~ on Windows

2021-03-21 Thread Michael Paquier
On Thu, Mar 18, 2021 at 12:01:40PM +0900, Michael Paquier wrote:
> Yep, that's actually something I wrote for my own setups, with
> log_checkpoints enabled to catch all concurrent checkpoint activity
> and some LOGs.  Still no luck unfortunately :(

The various reporters had more luck than myself in reproducing the
issue, so I have applied 909b449e to address the issue.  I am pretty
sure that we should review more this business in the future, but I'd
rather not touch the stable branches.
--
Michael


signature.asc
Description: PGP signature


Re: Permission failures with WAL files in 13~ on Windows

2021-03-17 Thread Michael Paquier
On Wed, Mar 17, 2021 at 07:30:04PM -0700, Andres Freund wrote:
> I suspect it might be easier to reproduce the issue with smaller WAL
> segments, a short checkpoint_timeout, and multiple jobs generating WAL
> and then sleeping for random amounts of time. Not sure if that's the
> sole ingredient, but consider what happens there's processes that
> XLogWrite()s some WAL and then sleeps. Typically such a process'
> openLogFile will still point to the WAL segment. And they may still do
> that when the next checkpoint finishes and we recycle the WAL file.

Yep.  That's basically the kind of scenarios I have been testing to
stress the recycling/removing, with pgbench putting some load into the
server.  This has worked for me.  Once.  But I have little idea why it
gets easier to reproduce in the environments of others, so there may
be an OS-version dependency in the equation here.

> I wonder if we actually fail to unlink() the file in
> durable_link_or_rename(), and then end up recycling the same old file
> into multiple "future" positions in the WAL stream.

You actually mean durable_rename_excl() as of 13~, right?  Yeah, this
matches my impression that it is a two-step failure:
- Failure in one of the steps of durable_rename_excl().
- Fallback to segment removal, where we get the complain about
renaming.

> 1) and 2) seems problematic for restore_command use. I wonder if there's
> a chance that some of the reports ended up hitting 3), and that windows
> doesn't handle that well.

Yeap.  I was thinking about 3) being the actual problem while going
through those docs two days ago.

> If you manage to reproduce, could you check what the link count of the
> all the segments is? Apparently sysinternal's findlinks can do that.
> 
> Or perhaps even better, add an error check that the number of links of
> WAL segments is 1 in a bunch of places (recycling, opening them, closing
> them, maybe?).
> 
> Plus error reporting for unlink failures, of course.

Yep, that's actually something I wrote for my own setups, with
log_checkpoints enabled to catch all concurrent checkpoint activity
and some LOGs.  Still no luck unfortunately :(
--
Michael


signature.asc
Description: PGP signature


Re: Permission failures with WAL files in 13~ on Windows

2021-03-17 Thread Andres Freund
Hi,

On 2021-03-18 09:55:46 +0900, Michael Paquier wrote:
> Let's see how it goes from this point, but, FWIW, I have not been able
> to reproduce again my similar problem with the archive command :/ --

I suspect it might be easier to reproduce the issue with smaller WAL
segments, a short checkpoint_timeout, and multiple jobs generating WAL
and then sleeping for random amounts of time. Not sure if that's the
sole ingredient, but consider what happens there's processes that
XLogWrite()s some WAL and then sleeps. Typically such a process'
openLogFile will still point to the WAL segment. And they may still do
that when the next checkpoint finishes and we recycle the WAL file.

I wonder if we actually fail to unlink() the file in
durable_link_or_rename(), and then end up recycling the same old file
into multiple "future" positions in the WAL stream.

There's also these interesting notes at
https://docs.microsoft.com/en-us/windows/win32/api/winbase/nf-winbase-createhardlinka

1)
> The security descriptor belongs to the file to which a hard link
> points. The link itself is only a directory entry, and does not have a
> security descriptor. Therefore, when you change the security
> descriptor of a hard link, you a change the security descriptor of the
> underlying file, and all hard links that point to the file allow the
> newly specified access. You cannot give a file different security
> descriptors on a per-hard-link basis.

2)
> Flags, attributes, access, and sharing that are specified in
> CreateFile operate on a per-file basis. That is, if you open a file
> that does not allow sharing, another application cannot share the file
> by creating a new hard link to the file.

3)
> The maximum number of hard links that can be created with this
> function is 1023 per file. If more than 1023 links are created for a
> file, an error results.


1) and 2) seems problematic for restore_command use. I wonder if there's
a chance that some of the reports ended up hitting 3), and that windows
doesn't handle that well.


If you manage to reproduce, could you check what the link count of the
all the segments is? Apparently sysinternal's findlinks can do that.

Or perhaps even better, add an error check that the number of links of
WAL segments is 1 in a bunch of places (recycling, opening them, closing
them, maybe?).

Plus error reporting for unlink failures, of course.

Greetings,

Andres Freund




Re: Permission failures with WAL files in 13~ on Windows

2021-03-17 Thread Andres Freund
Hi,

On 2021-03-16 16:20:37 +0900, Michael Paquier wrote:
> Fujii-san has mentioned that on twitter, but one area that has changed
> during the v13 cycle is aaa3aed, where the code recycling segments has
> been switched from a pgrename() (with a retry loop) to a
> CreateHardLinkA()+pgunlink() (with a retry loop for the second).  One
> theory that I got in mind here is the case where we create the hard
> link, but fail to finish do the pgunlink() on the xlogtemp.N file,
> though after some testing it did not seem to have any impact.

A related question: What on earth is the point of using the unlink
approach on any operating system. We use the durable_rename_excl() (and
its predecessor, durable_link_or_rename(), and in turn its open coded
predecessors) for things like recycling WAL files at check points.

Now imagine that durable_rename_excl() fails to unlink the old
file. We'll still have the old file, but there's a second link to a new
WAL file, which will be used. No error will be thrown, because we don't
check unlink()'s return code (but if we did, we'd still have similar
issues).

And then imagine that that happens again, during the next checkpoint,
because the permission or whatever issue is not yet resolved. We now
will have the same physical file in two location in the future WAL
stream.

Welcome impossible to debug issues.

And all of this with the sole justification of "paranoidly trying to
avoid overwriting an existing file (there shouldn't be one).". A few
lines after we either unlinked the target filename, or used stat() to
find an unused filename.

Isn't the whole idea of durable_rename_excl() bad? There's not the same
danger of using it when we start from a temp filename, e.g. during
creation of new segments, or timeline files or whatnot. But I still
don't see what the whole thing is supposed to protect us against
realistically.

Greetings,

Andres Freund




Re: Permission failures with WAL files in 13~ on Windows

2021-03-17 Thread Michael Paquier
On Tue, Mar 16, 2021 at 11:40:12AM +0100, Magnus Hagander wrote:
> If we can provide a new .EXE built with exactly the same flags as the
> EDB downloads that they can just drop into a directory, I think it's a
> lot easier to get that done.

Yeah, multiple people have been complaining about that bug, so I have
just produced two builds that people with those sensitive environments
can use, and sent some private links to get the builds.  Let's see how
it goes from this point, but, FWIW, I have not been able to reproduce
again my similar problem with the archive command :/
--
Michael


signature.asc
Description: PGP signature


Re: Permission failures with WAL files in 13~ on Windows

2021-03-16 Thread Ranier Vilela
 >Yeah, it'd definitely be good to figure out exactly what it is that
>triggers the issue.

I think that this issue is the same at 1.
And IMHO the patch solves, but nobody is interested.

I think that "MOVEFILE_COPY_ALLOWED" that's what is missing.
At least on my machine, Postgres can rename statistics files.

regards,
Ranier Vilela

1.
https://www.postgresql.org/message-id/20200616041018.GR20404%40telsasoft.com


fix_windows_rename.patch
Description: Binary data


Re: Permission failures with WAL files in 13~ on Windows

2021-03-16 Thread Magnus Hagander
On Tue, Mar 16, 2021 at 11:22 AM Michael Paquier  wrote:
>
> On Tue, Mar 16, 2021 at 10:02:25AM +0100, Magnus Hagander wrote:
> > If you back out that patch, does the problem you can reproduce with
> > archive_command go away?
>
> That's the first thing I did after seeing the failure, and I saw
> nothing after 2~3 hours of pgbench :)

:) That's at least an "almost".


> The second thing I did was to revert back to HEAD with more logging in
> the area, but I was not able to see my error again.  Perhaps I just
> need to put more load, there are still too many guesses and not enough
> facts.

Agreed.



> > I agree with your analysis in general. It certainly seems to hit right
> > in the center of the problem scope.
> >
> > Maybe hardlinks on Windows has yet another "weird behaviour" vs what
> > we're used to from Unix.
>
> Yeah, I'd like to think that this is a rational explanation, and
> that's why I was just focusing on reproducing this issue rather
> reliably as a first step.

Yeah, it'd definitely be good to figure out exactly what it is that
triggers the issue.


> > It would definitely be more useful if we could figure out *when* this
> > happens. But failing that, I wonder if we could find a way to provide
> > a build with this patch backed out for the bug reporters to test out,
> > given they all seem to have it fairly well reproducible. (But I am
> > assuming are unlikely to be able to create their own builds easily,
> > given the complexity of doing so on Windows). Given that this is a
> > pretty isolated change, it should hopefully be easy enough to back out
> > for testing.
>
> There is a large pool of bug reporters, hopefully one of them may be
> able to help..

I think you're overestimating peoples ability to get our build going
on Windows :)

If we can provide a new .EXE built with exactly the same flags as the
EDB downloads that they can just drop into a directory, I think it's a
lot easier to get that done.

-- 
 Magnus Hagander
 Me: https://www.hagander.net/
 Work: https://www.redpill-linpro.com/




Re: Permission failures with WAL files in 13~ on Windows

2021-03-16 Thread Michael Paquier
On Tue, Mar 16, 2021 at 10:02:25AM +0100, Magnus Hagander wrote:
> If you back out that patch, does the problem you can reproduce with
> archive_command go away?

That's the first thing I did after seeing the failure, and I saw
nothing after 2~3 hours of pgbench :)

The second thing I did was to revert back to HEAD with more logging in
the area, but I was not able to see my error again.  Perhaps I just
need to put more load, there are still too many guesses and not enough
facts.

> I agree with your analysis in general. It certainly seems to hit right
> in the center of the problem scope.
> 
> Maybe hardlinks on Windows has yet another "weird behaviour" vs what
> we're used to from Unix.

Yeah, I'd like to think that this is a rational explanation, and
that's why I was just focusing on reproducing this issue rather
reliably as a first step.

> It would definitely be more useful if we could figure out *when* this
> happens. But failing that, I wonder if we could find a way to provide
> a build with this patch backed out for the bug reporters to test out,
> given they all seem to have it fairly well reproducible. (But I am
> assuming are unlikely to be able to create their own builds easily,
> given the complexity of doing so on Windows). Given that this is a
> pretty isolated change, it should hopefully be easy enough to back out
> for testing.

There is a large pool of bug reporters, hopefully one of them may be
able to help..
--
Michael


signature.asc
Description: PGP signature


Re: Permission failures with WAL files in 13~ on Windows

2021-03-16 Thread Magnus Hagander
On Tue, Mar 16, 2021 at 8:20 AM Michael Paquier  wrote:
>
> Hi all,
>
> There has been for the last couple of weeks a collection of reports
> complaining that the renaming of WAL segments is broken:
> https://www.postgresql.org/message-id/3861ff1e-0923-7838-e826-094cc9bef...@hot.ee
> https://www.postgresql.org/message-id/16874-c3eecd319e36a...@postgresql.org
> https://www.postgresql.org/message-id/095ccf8d-7f58-d928-427c-b17ace23c...@burgess.co.nz
> https://www.postgresql.org/message-id/16927-67c570d968c99567%40postgresql.org
>
> These have happened on a variety of Windows versions, 2019 and 2012 R2
> being mentioned when segments are recycled.
>
> The number of those failures is alarming, and the information gathered
> points at 13.1 and 13.2 as the culprits where those failures are
> happening, so I'd like to believe that there is a regression in 13.

Agreed.


> FWIW, I have also been doing some tests on my side, and while I as not
> able to trigger the reported failure, I have been able to trigger the
> same error with an archive_command doing a simple cp that failed
> continuously on EACCES.
>
> Fujii-san has mentioned that on twitter, but one area that has changed
> during the v13 cycle is aaa3aed, where the code recycling segments has
> been switched from a pgrename() (with a retry loop) to a
> CreateHardLinkA()+pgunlink() (with a retry loop for the second).  One
> theory that I got in mind here is the case where we create the hard
> link, but fail to finish do the pgunlink() on the xlogtemp.N file,
> though after some testing it did not seem to have any impact.

If you back out that patch, does the problem you can reproduce with
archive_command go away?


> I am running more tests with several scenarios (aggressive segment
> recycling or segment rotation) to get more reproducible scenarios,
> but I was wondering if anybody had ideas around that.
>
> So, thoughts?

I agree with your analysis in general. It certainly seems to hit right
in the center of the problem scope.

Maybe hardlinks on Windows has yet another "weird behaviour" vs what
we're used to from Unix.

It would definitely be more useful if we could figure out *when* this
happens. But failing that, I wonder if we could find a way to provide
a build with this patch backed out for the bug reporters to test out,
given they all seem to have it fairly well reproducible. (But I am
assuming are unlikely to be able to create their own builds easily,
given the complexity of doing so on Windows). Given that this is a
pretty isolated change, it should hopefully be easy enough to back out
for testing.

-- 
 Magnus Hagander
 Me: https://www.hagander.net/
 Work: https://www.redpill-linpro.com/




Permission failures with WAL files in 13~ on Windows

2021-03-16 Thread Michael Paquier
Hi all,

There has been for the last couple of weeks a collection of reports
complaining that the renaming of WAL segments is broken:
https://www.postgresql.org/message-id/3861ff1e-0923-7838-e826-094cc9bef...@hot.ee
https://www.postgresql.org/message-id/16874-c3eecd319e36a...@postgresql.org
https://www.postgresql.org/message-id/095ccf8d-7f58-d928-427c-b17ace23c...@burgess.co.nz
https://www.postgresql.org/message-id/16927-67c570d968c99567%40postgresql.org

These have happened on a variety of Windows versions, 2019 and 2012 R2
being mentioned when segments are recycled.

The number of those failures is alarming, and the information gathered
points at 13.1 and 13.2 as the culprits where those failures are
happening, so I'd like to believe that there is a regression in 13.
FWIW, I have also been doing some tests on my side, and while I as not
able to trigger the reported failure, I have been able to trigger the
same error with an archive_command doing a simple cp that failed
continuously on EACCES.

Fujii-san has mentioned that on twitter, but one area that has changed
during the v13 cycle is aaa3aed, where the code recycling segments has
been switched from a pgrename() (with a retry loop) to a
CreateHardLinkA()+pgunlink() (with a retry loop for the second).  One
theory that I got in mind here is the case where we create the hard
link, but fail to finish do the pgunlink() on the xlogtemp.N file,
though after some testing it did not seem to have any impact.

I am running more tests with several scenarios (aggressive segment
recycling or segment rotation) to get more reproducible scenarios,
but I was wondering if anybody had ideas around that.

So, thoughts?
--
Michael


signature.asc
Description: PGP signature