Re: No core file generated after PostgresNode->start
Michael Paquier writes:
> On Tue, May 12, 2020 at 04:15:26PM -0400, Robert Haas wrote:
>> On Mon, May 11, 2020 at 10:48 PM Tom Lane wrote:
>>> I have a standing note to check the permissions on /cores after any macOS
>>> upgrade, because every so often Apple decides that that directory ought to
>>> be read-only.

>> Thanks, that was my problem.

> Was that a recent problem with Catalina and/or Mojave? I have never
> seen an actual problem up to 10.13.

I don't recall exactly when I started seeing this, but it was at least
a couple years back, so maybe Mojave. I think it's related to Apple's
efforts to make the root filesystem read-only. (It's not apparent to
me how come I can write in /cores when "mount" clearly reports

/dev/disk1s1 on / (apfs, local, read-only, journaled)

but nonetheless it works, as long as the directory permissions permit.)

regards, tom lane
Re: No core file generated after PostgresNode->start
On Tue, May 12, 2020 at 04:15:26PM -0400, Robert Haas wrote:
> On Mon, May 11, 2020 at 10:48 PM Tom Lane wrote:
>> I have a standing note to check the permissions on /cores after any macOS
>> upgrade, because every so often Apple decides that that directory ought to
>> be read-only.
>
> Thanks, that was my problem.

Was that a recent problem with Catalina and/or Mojave? I have never
seen an actual problem up to 10.13.

--
Michael
Re: No core file generated after PostgresNode->start
On Mon, May 11, 2020 at 10:48 PM Tom Lane wrote:
> I have a standing note to check the permissions on /cores after any macOS
> upgrade, because every so often Apple decides that that directory ought to
> be read-only.

Thanks, that was my problem.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
Re: No core file generated after PostgresNode->start
On Tue, May 12, 2020 at 3:36 AM Robert Haas wrote:
> On Sun, May 10, 2020 at 11:21 PM Andy Fan wrote:
> > Looks this doesn't mean a crash. If the test case
> > (subscription/t/013_partition.pl) failed, test framework kill some
> > process, which leads the above message. So you can ignore this
> > issue now. Thanks
>
> I think there might be a real issue here someplace, though, because I
> couldn't get a core dump last week when I did have a crash happening
> locally.

I forgot to say that the failure happens on my modified version. I guess
this is what happened in my case (subscription/t/013_partition.pl):

1. It needs to read data from the slave; however, it gets an ERROR
   (elog(ERROR, ..)) rather than a crash.
2. The test framework knows the case failed, so it kills the primary in
   some way.
3. The primary raises the error below.

2020-05-11 09:37:40.778 CST [69541] sub_viaroot WARNING: terminating connection because of crash of another server process
2020-05-11 09:37:40.778 CST [69541] sub_viaroot DETAIL: The postmaster has commanded this server process to roll back the current transaction and exit, because another server process exited abnormally and possibly corrupted shared memory.

Finally I found the root cause by looking into the error log on the slave.
After I fixed my bug, the issue was gone.

Best Regards
Andy Fan
Re: No core file generated after PostgresNode->start
Robert Haas writes:
> On Mon, May 11, 2020 at 4:24 PM Antonin Houska wrote:
>> Could "sysctl kernel.core_pattern" be the problem? I discovered this setting
>> sometime when I also couldn't find the core dump on linux.

> Well, I'm running on macOS and the core files normally show up in
> /cores, but in this case they didn't.

I have a standing note to check the permissions on /cores after any macOS
upgrade, because every so often Apple decides that that directory ought to
be read-only.

regards, tom lane
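The permission check described above could be scripted along these lines. This is only a sketch: the function name core_dir_writable is made up for illustration, and the default path /cores is simply the macOS location mentioned in this thread. It reports whether the current user could create files in that directory, which is a necessary condition for a core file to land there, not a guarantee that one will.

```python
import os


def core_dir_writable(path="/cores"):
    """Return True if `path` is a directory the current user can create files in.

    Creating a file needs both write and search (execute) permission
    on the directory itself.
    """
    return os.path.isdir(path) and os.access(path, os.W_OK | os.X_OK)


# On a machine without /cores (for example, most Linux systems) this prints False.
print("/cores writable:", core_dir_writable())
```

Running it again after an OS upgrade would catch the case where the directory has silently become read-only.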
Re: No core file generated after PostgresNode->start
On Mon, May 11, 2020 at 4:24 PM Antonin Houska wrote:
> Could "sysctl kernel.core_pattern" be the problem? I discovered this setting
> sometime when I also couldn't find the core dump on linux.

Well, I'm running on macOS and the core files normally show up in
/cores, but in this case they didn't.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
Re: No core file generated after PostgresNode->start
Robert Haas wrote:
> On Sun, May 10, 2020 at 11:21 PM Andy Fan wrote:
> > Looks this doesn't mean a crash. If the test case
> > (subscription/t/013_partition.pl) failed, test framework kill some
> > process, which leads the above message. So you can ignore this
> > issue now. Thanks
>
> I think there might be a real issue here someplace, though, because I
> couldn't get a core dump last week when I did have a crash happening
> locally. I didn't poke into it very hard though so I never figured out
> exactly why not, but ulimit -c unlimited didn't help.

Could "sysctl kernel.core_pattern" be the problem? I discovered this setting
sometime when I also couldn't find the core dump on linux.

--
Antonin Houska
Web: https://www.cybertec-postgresql.com
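For the Linux case, a small sketch of how one might inspect that setting without calling sysctl. The function names are illustrative only. The interpretation follows the core(5) man page: a pattern starting with "|" is piped to a helper program, so no core file appears where one might look for it; an absolute pattern names a fixed location; anything else is resolved relative to the crashing process's working directory, which for a PostgreSQL server is its data directory.

```python
from pathlib import Path


def classify_core_pattern(pattern):
    """Describe where the kernel will send a core dump for this pattern."""
    pattern = pattern.strip()
    if pattern.startswith("|"):
        return "piped to a helper program"
    if pattern.startswith("/"):
        return "written to an absolute path"
    return "written relative to the crashing process's working directory"


def read_core_pattern(path="/proc/sys/kernel/core_pattern"):
    """Return the current pattern, or None where the file is missing (e.g. macOS)."""
    try:
        return Path(path).read_text()
    except OSError:
        return None


current = read_core_pattern()
if current is not None:
    print(current.strip(), "->", classify_core_pattern(current))
```

If the result is the relative case, the temporary data directories of the TAP tests are the place to look, assuming they have not been cleaned up yet.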
Re: No core file generated after PostgresNode->start
On Sun, May 10, 2020 at 11:21 PM Andy Fan wrote:
> Looks this doesn't mean a crash. If the test case
> (subscription/t/013_partition.pl) failed, test framework kill some
> process, which leads the above message. So you can ignore this
> issue now. Thanks

I think there might be a real issue here someplace, though, because I
couldn't get a core dump last week when I did have a crash happening
locally. I didn't poke into it very hard though so I never figured out
exactly why not, but ulimit -c unlimited didn't help.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
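To confirm what "ulimit -c" actually resolved to in a given shell, one could read the limit from a child process, since resource limits are inherited across fork and exec. The sketch below uses Python's standard resource module; describe_core_limit and INFINITY are names invented for this example. It only reports the limit and does not explain why a core file might still be missing.

```python
import resource

# Sentinel the resource module uses for "no limit".
INFINITY = resource.RLIM_INFINITY


def describe_core_limit(soft, hard):
    """Explain what a (soft, hard) RLIMIT_CORE pair means for core dumps."""
    if soft == 0:
        return "disabled"
    if soft == INFINITY:
        return "unlimited"
    return f"capped at {soft} bytes"


# Limits of this process, inherited from whatever shell launched it.
soft, hard = resource.getrlimit(resource.RLIMIT_CORE)
print("core file size limit:", describe_core_limit(soft, hard))
```

If this prints "unlimited" in the same shell that runs the tests and there is still no core file, the limit is probably not the culprit and the dump location is the next thing to check.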
Re: No core file generated after PostgresNode->start
On Mon, May 11, 2020 at 9:48 AM Andy Fan wrote:
> Hi:
>
> 2020-05-11 09:37:40.778 CST [69541] sub_viaroot WARNING: terminating
> connection because of crash of another server process

It looks like this doesn't mean a crash. If the test case
(subscription/t/013_partition.pl) fails, the test framework kills some
process, which leads to the above message. So you can ignore this issue
now. Thanks

Best Regards
Andy Fan
No core file generated after PostgresNode->start
Hi:

When I run make -C subscription check, I see the following logs in
./tmp_check/log/013_partition_publisher.log:

2020-05-11 09:37:40.778 CST [69541] sub_viaroot WARNING: terminating connection because of crash of another server process
2020-05-11 09:37:40.778 CST [69541] sub_viaroot DETAIL: The postmaster has commanded this server process to roll back the current transaction and exit, because another server process exited abnormally and possibly corrupted shared memory.

However, no core file is generated. In my other cases (like starting pg
manually with bin/postgres xxx), a core file is generated successfully on
the same machine. What might be the problem in the PostgresNode case?

I tried this modification, but it doesn't help.

--- a/src/test/perl/PostgresNode.pm
+++ b/src/test/perl/PostgresNode.pm
@@ -766,7 +766,7 @@ sub start
        # Note: We set the cluster_name here, not in postgresql.conf (in
        # sub init) so that it does not get copied to standbys.
-       $ret = TestLib::system_log('pg_ctl', '-D', $self->data_dir, '-l',
+       $ret = TestLib::system_log('pg_ctl', "-c", '-D', $self->data_dir, '-l',
                $self->logfile, '-o', "--cluster-name=$name", 'start');
 }

Best Regards
Andy Fan