Re: WARNING: could not flush dirty data: Function not implemented
Hi,

On 2018-09-02 19:29:49 -0400, Tom Lane wrote:
> Ravi Krishna writes:
> > A simple search tells that this is an old problem and my Windows has
> > necessary WSL and other subsystem for this error to not show up.
>
> If this is on Ubuntu, I don't understand why you're talking
> about Windows.

The OP said "Ubuntu 18.04 as Windows bash" - so I assume this is postgres
compiled as a Linux binary, running on MS's new-ish Linux emulation.

Greetings,

Andres Freund
Re: dat names generated by pg_dump
On 09/02/2018 09:26 PM, Tom Lane wrote:
> Ron writes:
> > I can associate these dat names with their source tables through a bunch
> > of bash and vim manual operations, but I was wondering if there's any
> > automated method (maybe some SQL query of some catalog table; pg_class
> > didn't seem to have the relevant data) of making the association.
>
> Those numbers are the "dump object IDs" generated by pg_dump.  They don't
> have any significance on the server side, and typically would vary from
> one pg_dump run to another.  You have to look at the dump TOC (table of
> contents) to figure out what corresponds to what.  For example,
>
> $ pg_dump -Fd -f dumpd regression
> $ ls -1 dumpd
> 6143.dat.gz
> 6144.dat.gz
> 6145.dat.gz
> ...
> blob_3001.dat.gz
> blobs.toc
> toc.dat
> $ pg_restore -l dumpd
> ;
> ; Archive created at 2018-09-02 22:14:48 EDT
> ...
> 6573; 2613 119655 BLOB - 119655 postgres

Thanks.  That's exactly what I needed.

--
Angular momentum makes the world go 'round.
Re: dat names generated by pg_dump
Ron writes:
> I can associate these dat names with their source tables through a bunch of
> bash and vim manual operations, but I was wondering if there's any automated
> method (maybe some SQL query of some catalog table; pg_class didn't seem to
> have the relevant data) of making the association.

Those numbers are the "dump object IDs" generated by pg_dump.  They don't
have any significance on the server side, and typically would vary from
one pg_dump run to another.  You have to look at the dump TOC (table of
contents) to figure out what corresponds to what.  For example,

$ pg_dump -Fd -f dumpd regression
$ ls -1 dumpd
6143.dat.gz
6144.dat.gz
6145.dat.gz
...
blob_3001.dat.gz
blobs.toc
toc.dat
$ pg_restore -l dumpd
;
; Archive created at 2018-09-02 22:14:48 EDT
...
6573; 2613 119655 BLOB - 119655 postgres
6574; 2613 3001 BLOB - 3001 postgres
6603; 0 0 COMMENT - LARGE OBJECT 3001 postgres
6247; 0 100933 TABLE DATA public a postgres
6212; 0 89417 TABLE DATA public a_star postgres
6180; 0 88516 TABLE DATA public abstime_tbl postgres
6218; 0 89444 TABLE DATA public aggtest postgres
6446; 0 121383 TABLE DATA public alter_table_under_transition_tables postgres
...

The numbers before the semicolons are the dump IDs.  In particular

$ pg_restore -l dumpd | grep 6143
6143; 0 88018 TABLE DATA public int4_tbl postgres

so 6143.dat.gz contains the data for table public.int4_tbl.

There will only be separate files in the dump directory for TABLE DATA
and BLOB dump objects ... other stuff is just embedded in the toc.dat file.

regards, tom lane
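Tom's grep trick can be generalized into a small script. This is a hypothetical sketch, not a pg_dump feature: it runs a `pg_restore -l` listing through awk to print an "ID.dat -> schema.table" line for every TABLE DATA entry. The `$listing` variable below holds sample lines copied from the output above so the sketch is self-contained; against a real directory-format dump you would pipe `pg_restore -l dumpd` instead (and the on-disk names may carry a `.gz` suffix when the dump is compressed):

```shell
# Sample pg_restore -l lines; replace with:  pg_restore -l dumpd
listing='6143; 0 88018 TABLE DATA public int4_tbl postgres
6247; 0 100933 TABLE DATA public a postgres'

# For each TABLE DATA entry: strip the trailing ";" from the dump ID
# (field 1) and print it next to the schema (field 6) and table (field 7).
map=$(printf '%s\n' "$listing" |
    awk '/TABLE DATA/ { sub(";", "", $1); print $1 ".dat -> " $6 "." $7 }')
echo "$map"
```

Run against a real dump directory, each output line names the table whose rows live in the corresponding .dat file.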
Re: dat names generated by pg_dump
On 09/02/2018 07:07 PM, Ron wrote:
> On 09/02/2018 08:41 PM, Adrian Klaver wrote:
> > On 09/02/2018 05:40 PM, Ron wrote:
> > > Hi,
> > >
> > > I can associate these dat names with their source tables through a
> > > bunch of bash and vim manual operations, but I was wondering if there's
> > > any automated method (maybe some SQL query of some catalog table;
> > > pg_class didn't seem to have the relevant data) of making the
> > > association.
> >
> > Some background would be helpful:
> >
> > 1) What is producing the *.dat files?
>
> pg_dump, as described in the Subject.

Oops, missed that.

> > 2) What is their structure?
>
> They're pg_dump files.

What is the full pg_dump command used?

> > 3) What do the numbers refer to?
>
> That's what I'm asking the list.
>
> If relevant, the source database is v8.4, but the backup was done by 9.6
> on a separate server.
>
> $ ls -1 CDSLBXW/*dat | head
> CDSLBXW/8412.dat
> CDSLBXW/8414.dat
> CDSLBXW/8416.dat
> CDSLBXW/8418.dat
> CDSLBXW/8420.dat
> CDSLBXW/8422.dat
> CDSLBXW/8423.dat
> CDSLBXW/8425.dat
> CDSLBXW/8427.dat
> CDSLBXW/8428.dat
>
> Thanks

--
Adrian Klaver
adrian.kla...@aklaver.com
Re: dat names generated by pg_dump
On 09/02/2018 08:41 PM, Adrian Klaver wrote:
> On 09/02/2018 05:40 PM, Ron wrote:
> > Hi,
> >
> > I can associate these dat names with their source tables through a bunch
> > of bash and vim manual operations, but I was wondering if there's any
> > automated method (maybe some SQL query of some catalog table; pg_class
> > didn't seem to have the relevant data) of making the association.
>
> Some background would be helpful:
>
> 1) What is producing the *.dat files?

pg_dump, as described in the Subject.

> 2) What is their structure?

They're pg_dump files.

> 3) What do the numbers refer to?

That's what I'm asking the list.

If relevant, the source database is v8.4, but the backup was done by 9.6 on
a separate server.

$ ls -1 CDSLBXW/*dat | head
CDSLBXW/8412.dat
CDSLBXW/8414.dat
CDSLBXW/8416.dat
CDSLBXW/8418.dat
CDSLBXW/8420.dat
CDSLBXW/8422.dat
CDSLBXW/8423.dat
CDSLBXW/8425.dat
CDSLBXW/8427.dat
CDSLBXW/8428.dat

Thanks

--
Angular momentum makes the world go 'round.
Re: WARNING: could not flush dirty data: Function not implemented
> If this is on Ubuntu, I don't understand why you're talking
> about Windows.

Because I am using Ubuntu Bash on Windows, which requires WSL (Windows
Subsystem for Linux). I also have the necessary build version of Windows
that supports Ubuntu Bash.
Re: dat names generated by pg_dump
On 09/02/2018 05:40 PM, Ron wrote:
> Hi,
>
> I can associate these dat names with their source tables through a bunch of
> bash and vim manual operations, but I was wondering if there's any automated
> method (maybe some SQL query of some catalog table; pg_class didn't seem to
> have the relevant data) of making the association.

Some background would be helpful:

1) What is producing the *.dat files?

2) What is their structure?

3) What do the numbers refer to?

> If relevant, the source database is v8.4, but the backup was done by 9.6
> on a separate server.
>
> $ ls -1 CDSLBXW/*dat | head
> CDSLBXW/8412.dat
> CDSLBXW/8414.dat
> CDSLBXW/8416.dat
> CDSLBXW/8418.dat
> CDSLBXW/8420.dat
> CDSLBXW/8422.dat
> CDSLBXW/8423.dat
> CDSLBXW/8425.dat
> CDSLBXW/8427.dat
> CDSLBXW/8428.dat
>
> Thanks

--
Adrian Klaver
adrian.kla...@aklaver.com
dat names generated by pg_dump
Hi,

I can associate these dat names with their source tables through a bunch of
bash and vim manual operations, but I was wondering if there's any automated
method (maybe some SQL query of some catalog table; pg_class didn't seem to
have the relevant data) of making the association.

If relevant, the source database is v8.4, but the backup was done by 9.6 on
a separate server.

$ ls -1 CDSLBXW/*dat | head
CDSLBXW/8412.dat
CDSLBXW/8414.dat
CDSLBXW/8416.dat
CDSLBXW/8418.dat
CDSLBXW/8420.dat
CDSLBXW/8422.dat
CDSLBXW/8423.dat
CDSLBXW/8425.dat
CDSLBXW/8427.dat
CDSLBXW/8428.dat

Thanks

--
Angular momentum makes the world go 'round.
Re: WARNING: could not flush dirty data: Function not implemented
Hi,

On 2018-09-02 22:57:55 +, Ravi Krishna wrote:
> Ubuntu 18.04 as Windows bash
>
> Distributor ID: Ubuntu
> Description:    Ubuntu 18.04.1 LTS
> Release:        18.04
> Codename:       bionic
>
> PG 10.5.1
>
> postgres@ravi-lenovo:~$ psql -d postgres
> psql (10.5 (Ubuntu 10.5-1.pgdg16.04+1))
>
> A CREATE DATABASE statement spewed out
>
> WARNING:  could not flush dirty data: Function not implemented
>
> many times, but the db was created.
>
> What exactly is this?

That means that the Linux emulation by Microsoft isn't good enough. You can
work around it by setting checkpoint_flush_after=0 and
wal_writer_flush_after=0.

> A simple search tells that this is an old problem and my Windows has
> necessary WSL and other subsystem for this error to not show up.

Please note that nobody has verified that postgres works correctly via the
emulation stuff MS is doing. There is a native version of postgres for
Windows however, and that is tested (and exercised by a lot of
installations).

Greetings,

Andres Freund
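As a minimal sketch, the workaround Andres names would look like this in postgresql.conf (the two settings are the ones he gives above; the comments are my paraphrase of the thread, and a reload is needed for them to take effect):

```ini
# postgresql.conf -- disable the flush-hint machinery that depends on
# sync_file_range(), which the Windows Linux emulation does not implement
checkpoint_flush_after = 0
wal_writer_flush_after = 0
```

The same can be done without editing the file, e.g. `ALTER SYSTEM SET checkpoint_flush_after = 0;` followed by `SELECT pg_reload_conf();` in psql.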
Re: WARNING: could not flush dirty data: Function not implemented
Ravi Krishna writes:
> A CREATE DATABASE statement spewed out
> WARNING:  could not flush dirty data: Function not implemented

Hmm, that's probably ENOSYS coming back from sync_file_range().  What
filesystem is this database sitting on?

It's harmless from a correctness standpoint, because we'll fsync the data
later anyway; but it seems bad that we've selected sync_file_range() if it
doesn't actually work on your platform.

> A simple search tells that this is an old problem and my Windows has
> necessary WSL and other subsystem for this error to not show up.

If this is on Ubuntu, I don't understand why you're talking
about Windows.

regards, tom lane
WARNING: could not flush dirty data: Function not implemented
Ubuntu 18.04 as Windows bash

Distributor ID: Ubuntu
Description:    Ubuntu 18.04.1 LTS
Release:        18.04
Codename:       bionic

PG 10.5.1

postgres@ravi-lenovo:~$ psql -d postgres
psql (10.5 (Ubuntu 10.5-1.pgdg16.04+1))

A CREATE DATABASE statement spewed out

WARNING:  could not flush dirty data: Function not implemented

many times, but the db was created.

What exactly is this? A simple search tells that this is an old problem and
my Windows has necessary WSL and other subsystem for this error to not
show up.
Re: pg_basebackup: could not receive data from WAL stream
wal_sender_timeout should be as long as necessary. Each WAL file is 16MB, so
it should be *at least* as long as the time needed to transfer
16MB*wal_keep_segments. Take a look at the size of your pg_xlog folder.

On Sun, Sep 2, 2018 at 3:41 PM Adrian Klaver wrote:
> On 09/01/2018 09:06 PM, greigwise wrote:
> > Hello.
> >
> > On postgresql 10.5, my pg_basebackup is failing with this error:
> >
> > pg_basebackup: could not receive data from WAL stream: server closed the
> > connection unexpectedly
> >         This probably means the server terminated abnormally
> >         before or while processing the request
> >
> > In the postgres log files, I'm seeing:
> >
> > 2018-09-02 00:57:32 UTC bkp_user 5b8b278c.11c3f [unknown] LOG:
> > terminating walsender process due to replication timeout
> >
> > I'm running the following command right on the database server itself:
> >
> > pg_basebackup -U repl -D /var/tmp/pg_basebackup_20180901 -Ft -z
> >
> > It seems to be an intermittent problem.. I've had it fail or succeed
> > about 50/50.  I even bumped up the wal_sender_timeout to 2000.  One
> > notable thing is that I'm running on an ec2 instance on AWS.
>
> The unit for wal_sender_timeout is ms so the above is 2 seconds whereas
> the default value is 60 seconds (60s in postgresql.conf file).
>
> See below for setting units in file:
>
> https://www.postgresql.org/docs/10/static/config-setting.html
>
> Also what is your max_wal_senders setting?
>
> > Any advice would be helpful.
> >
> > Greig Wise
> >
> > --
> > Sent from: http://www.postgresql-archive.org/PostgreSQL-general-f1843780.html
>
> --
> Adrian Klaver
> adrian.kla...@aklaver.com
Re: locate DB corruption
On Sun, Sep 2, 2018 at 4:51 AM Stephen Frost wrote:
> Greetings,
>
> * Dave Peticolas (d...@krondo.com) wrote:
> > On Sat, Sep 1, 2018 at 5:09 PM Adrian Klaver wrote:
> > > On 09/01/2018 04:45 PM, Dave Peticolas wrote:
> > > > Well restoring from a backup of the primary does seem to have fixed
> > > > the issue with the corrupt table.
> > >
> > > Pretty sure it was not that the table was corrupt but that transaction
> > > information was missing from pg_clog.
> > >
> > > In a previous post you mentioned you ran tar to do the snapshot of
> > > $PG_DATA.
> > >
> > > Was there any error when tar ran the backup that caused you problems?
> >
> > Well the interesting thing about that is that although the bad table was
> > originally discovered in a DB restored from a snapshot, I subsequently
> > discovered it in the real-time clone of the primary from which the
> > backups are made. So somehow the clone's table became corrupted. The
> > same table was not corrupt on the primary, but I have discovered an
> > error on the primary -- it's in the thread I posted today. These events
> > seem correlated in time, I'll have to mine the logs some more.
>
> Has this primary been the primary since inception, or was it promoted to
> be one at some point after first being built as a replica..?

It was the primary since inception. All the problems now appear to have
stemmed from the primary due to a bug in 9.6.8 (see other thread). I've
since upgraded to 9.6.10.
Re: pg_basebackup: could not receive data from WAL stream
On 09/01/2018 09:06 PM, greigwise wrote:
> Hello.
>
> On postgresql 10.5, my pg_basebackup is failing with this error:
>
> pg_basebackup: could not receive data from WAL stream: server closed the
> connection unexpectedly
>         This probably means the server terminated abnormally
>         before or while processing the request
>
> In the postgres log files, I'm seeing:
>
> 2018-09-02 00:57:32 UTC bkp_user 5b8b278c.11c3f [unknown] LOG:
> terminating walsender process due to replication timeout
>
> I'm running the following command right on the database server itself:
>
> pg_basebackup -U repl -D /var/tmp/pg_basebackup_20180901 -Ft -z
>
> It seems to be an intermittent problem.. I've had it fail or succeed about
> 50/50.  I even bumped up the wal_sender_timeout to 2000.  One notable
> thing is that I'm running on an ec2 instance on AWS.

The unit for wal_sender_timeout is ms so the above is 2 seconds whereas
the default value is 60 seconds (60s in postgresql.conf file).

See below for setting units in file:

https://www.postgresql.org/docs/10/static/config-setting.html

Also what is your max_wal_senders setting?

> Any advice would be helpful.
>
> Greig Wise
>
> --
> Sent from: http://www.postgresql-archive.org/PostgreSQL-general-f1843780.html

--
Adrian Klaver
adrian.kla...@aklaver.com
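Adrian's point about units, made concrete: a bare number in postgresql.conf is read in the parameter's base unit, which for wal_sender_timeout is milliseconds, so "2000" means 2 seconds, not 2000 seconds. A hedged sketch (only one such line would actually appear in a real file; the 5min value is an illustrative choice for slow backups, not a recommendation from this thread):

```ini
# postgresql.conf -- wal_sender_timeout base unit is milliseconds
wal_sender_timeout = 2000     # bare number: 2000 ms = 2 s (probably not intended)
wal_sender_timeout = 60s      # the default, written with an explicit unit
wal_sender_timeout = 5min     # illustrative longer value; last setting wins
```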
Re: locate DB corruption
Greetings,

* Dave Peticolas (d...@krondo.com) wrote:
> On Sat, Sep 1, 2018 at 5:09 PM Adrian Klaver wrote:
> > On 09/01/2018 04:45 PM, Dave Peticolas wrote:
> > > Well restoring from a backup of the primary does seem to have fixed the
> > > issue with the corrupt table.
> >
> > Pretty sure it was not that the table was corrupt but that transaction
> > information was missing from pg_clog.
> >
> > In a previous post you mentioned you ran tar to do the snapshot of
> > $PG_DATA.
> >
> > Was there any error when tar ran the backup that caused you problems?
>
> Well the interesting thing about that is that although the bad table was
> originally discovered in a DB restored from a snapshot, I subsequently
> discovered it in the real-time clone of the primary from which the backups
> are made. So somehow the clone's table became corrupted. The same table was
> not corrupt on the primary, but I have discovered an error on the primary
> -- it's in the thread I posted today. These events seem correlated in time,
> I'll have to mine the logs some more.

Has this primary been the primary since inception, or was it promoted to
be one at some point after first being built as a replica..?

Thanks!

Stephen