Re: WARNING: could not flush dirty data: Function not implemented

2018-09-02 Thread Andres Freund
Hi,

On 2018-09-02 19:29:49 -0400, Tom Lane wrote:
> Ravi Krishna  writes:
> > A simple search tells that this is an old problem and my Windows has 
> > necessary WSL and other subsystem for this error to not show up.
> 
> If this is on Ubuntu, I don't understand why you're talking
> about Windows.

The OP said "Ubuntu 18.04 as Windows bash" - so I assume this is
postgres compiled as a Linux binary, running on MS's new-ish Linux
emulation.

Greetings,

Andres Freund



Re: dat names generated by pg_dump

2018-09-02 Thread Ron

On 09/02/2018 09:26 PM, Tom Lane wrote:

Ron  writes:

I can associate these dat names with their source tables through a bunch of
bash and vim manual operations, but I was wondering if there's any automated
method (maybe some SQL query of some catalog table; pg_class didn't seem to
have the relevant data) of making the association.

Those numbers are the "dump object IDs" generated by pg_dump.  They don't
have any significance on the server side, and typically would vary from
one pg_dump run to another.  You have to look at the dump TOC (table of
contents) to figure out what corresponds to what.  For example,

$ pg_dump -Fd -f dumpd regression
$ ls -1 dumpd
6143.dat.gz
6144.dat.gz
6145.dat.gz
...
blob_3001.dat.gz
blobs.toc
toc.dat
$ pg_restore -l dumpd
;
; Archive created at 2018-09-02 22:14:48 EDT
...
6573; 2613 119655 BLOB - 119655 postgres


Thanks. That's exactly what I needed.

--
Angular momentum makes the world go 'round.



Re: dat names generated by pg_dump

2018-09-02 Thread Tom Lane
Ron  writes:
> I can associate these dat names with their source tables through a bunch of 
> bash and vim manual operations, but I was wondering if there's any automated 
> method (maybe some SQL query of some catalog table; pg_class didn't seem to 
> have the relevant data) of making the association.

Those numbers are the "dump object IDs" generated by pg_dump.  They don't
have any significance on the server side, and typically would vary from
one pg_dump run to another.  You have to look at the dump TOC (table of
contents) to figure out what corresponds to what.  For example,

$ pg_dump -Fd -f dumpd regression
$ ls -1 dumpd
6143.dat.gz
6144.dat.gz
6145.dat.gz
...
blob_3001.dat.gz
blobs.toc
toc.dat
$ pg_restore -l dumpd
;
; Archive created at 2018-09-02 22:14:48 EDT
...
6573; 2613 119655 BLOB - 119655 postgres
6574; 2613 3001 BLOB - 3001 postgres
6603; 0 0 COMMENT - LARGE OBJECT 3001 postgres
6247; 0 100933 TABLE DATA public a postgres
6212; 0 89417 TABLE DATA public a_star postgres
6180; 0 88516 TABLE DATA public abstime_tbl postgres
6218; 0 89444 TABLE DATA public aggtest postgres
6446; 0 121383 TABLE DATA public alter_table_under_transition_tables postgres
...

The numbers before the semicolons are the dump IDs.  In particular

$ pg_restore -l dumpd | grep 6143
6143; 0 88018 TABLE DATA public int4_tbl postgres

so 6143.dat.gz contains the data for table public.int4_tbl.  There
will only be separate files in the dump directory for TABLE DATA and
BLOB dump objects ... other stuff is just embedded in the toc.dat file.
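
The lookup Tom describes can be scripted. A minimal sketch (my own, not
from the thread), assuming a directory-format dump named "dumpd" and TOC
lines in the shape shown above; note the data files may be named
NNNN.dat.gz rather than NNNN.dat when the dump is compressed:

```shell
# Turn "pg_restore -l" output into "file -> schema.table" pairs.
# Expects TOC lines shaped like:
#   6143; 0 88018 TABLE DATA public int4_tbl postgres
map_dat_to_table() {
  awk -F'; ' '/TABLE DATA/ { split($2, f, " "); print $1 ".dat -> " f[5] "." f[6] }'
}
# Usage against a real dump directory:
#   pg_restore -l dumpd | map_dat_to_table
```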

regards, tom lane



Re: dat names generated by pg_dump

2018-09-02 Thread Adrian Klaver

On 09/02/2018 07:07 PM, Ron wrote:

On 09/02/2018 08:41 PM, Adrian Klaver wrote:

On 09/02/2018 05:40 PM, Ron wrote:

Hi,

I can associate these dat names with their source tables through a 
bunch of bash and vim manual operations, but I was wondering if 
there's any automated method (maybe some SQL query of some catalog 
table; pg_class didn't seem to have the relevant data) of making the 
association.


Some background would be helpful:

1) What is producing the *.dat files?


pg_dump, as described in the Subject.


Oops, missed that.





2) What is their structure?


They're pg_dump files.


What is the full pg_dump command used?





3) What do the numbers refer to?


That's what I'm asking the list.





If relevant, the source database is v8.4, but the backup was done by 
9.6 on a separate server.


$ ls -1 CDSLBXW/*dat | head
CDSLBXW/8412.dat
CDSLBXW/8414.dat
CDSLBXW/8416.dat
CDSLBXW/8418.dat
CDSLBXW/8420.dat
CDSLBXW/8422.dat
CDSLBXW/8423.dat
CDSLBXW/8425.dat
CDSLBXW/8427.dat
CDSLBXW/8428.dat

Thanks
--
Adrian Klaver
adrian.kla...@aklaver.com



Re: dat names generated by pg_dump

2018-09-02 Thread Ron

On 09/02/2018 08:41 PM, Adrian Klaver wrote:

On 09/02/2018 05:40 PM, Ron wrote:

Hi,

I can associate these dat names with their source tables through a bunch 
of bash and vim manual operations, but I was wondering if there's any 
automated method (maybe some SQL query of some catalog table; pg_class 
didn't seem to have the relevant data) of making the association.


Some background would be helpful:

1) What is producing the *.dat files?


pg_dump, as described in the Subject.



2) What is their structure?


They're pg_dump files.



3) What do the numbers refer to?


That's what I'm asking the list.





If relevant, the source database is v8.4, but the backup was done by 9.6 
on a separate server.


$ ls -1 CDSLBXW/*dat | head
CDSLBXW/8412.dat
CDSLBXW/8414.dat
CDSLBXW/8416.dat
CDSLBXW/8418.dat
CDSLBXW/8420.dat
CDSLBXW/8422.dat
CDSLBXW/8423.dat
CDSLBXW/8425.dat
CDSLBXW/8427.dat
CDSLBXW/8428.dat

Thanks
--
Angular momentum makes the world go 'round.



Re: WARNING: could not flush dirty data: Function not implemented

2018-09-02 Thread Ravi Krishna
> 
> If this is on Ubuntu, I don't understand why you're talking
> about Windows.

Because I am using Ubuntu Bash on Windows, which requires WSL (the Windows 
Subsystem for Linux).  I also have the Windows build version needed to 
support Ubuntu Bash.


Re: dat names generated by pg_dump

2018-09-02 Thread Adrian Klaver

On 09/02/2018 05:40 PM, Ron wrote:

Hi,

I can associate these dat names with their source tables through a bunch 
of bash and vim manual operations, but I was wondering if there's any 
automated method (maybe some SQL query of some catalog table; pg_class 
didn't seem to have the relevant data) of making the association.


Some background would be helpful:

1) What is producing the *.dat files?

2) What is their structure?

3) What do the numbers refer to?



If relevant, the source database is v8.4, but the backup was done by 9.6 
on a separate server.


$ ls -1 CDSLBXW/*dat | head
CDSLBXW/8412.dat
CDSLBXW/8414.dat
CDSLBXW/8416.dat
CDSLBXW/8418.dat
CDSLBXW/8420.dat
CDSLBXW/8422.dat
CDSLBXW/8423.dat
CDSLBXW/8425.dat
CDSLBXW/8427.dat
CDSLBXW/8428.dat

Thanks




--
Adrian Klaver
adrian.kla...@aklaver.com



dat names generated by pg_dump

2018-09-02 Thread Ron

Hi,

I can associate these dat names with their source tables through a bunch of 
bash and vim manual operations, but I was wondering if there's any automated 
method (maybe some SQL query of some catalog table; pg_class didn't seem to 
have the relevant data) of making the association.


If relevant, the source database is v8.4, but the backup was done by 9.6 on 
a separate server.


$ ls -1 CDSLBXW/*dat | head
CDSLBXW/8412.dat
CDSLBXW/8414.dat
CDSLBXW/8416.dat
CDSLBXW/8418.dat
CDSLBXW/8420.dat
CDSLBXW/8422.dat
CDSLBXW/8423.dat
CDSLBXW/8425.dat
CDSLBXW/8427.dat
CDSLBXW/8428.dat

Thanks

--
Angular momentum makes the world go 'round.



Re: WARNING: could not flush dirty data: Function not implemented

2018-09-02 Thread Andres Freund
Hi,

On 2018-09-02 22:57:55 +, Ravi Krishna wrote:
> Ubuntu 18.04 as Windows bash
> 
> Distributor ID: Ubuntu
> Description:Ubuntu 18.04.1 LTS
> Release:18.04
> Codename:   bionic
> 
> 
> PG 10.5.1
> 
> postgres@ravi-lenovo:~$ psql -d postgres
> psql (10.5 (Ubuntu 10.5-1.pgdg16.04+1))
> 
> A CREATE DATABASE statement spewed out 
> 
> WARNING:  could not flush dirty data: Function not implemented
> 
> many times, but the db was created.
> 
> What exactly is this?

That means that the Linux emulation by Microsoft isn't good enough.  You
can work around it by setting checkpoint_flush_after=0 and
wal_writer_flush_after=0.
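
In postgresql.conf, that workaround would look roughly like this (a
sketch; the first two settings are the ones named above, the last two are
my assumption, since they use the same flush-hint machinery and are not
mentioned in the thread):

```
checkpoint_flush_after = 0   # stop issuing flush hints during checkpoints
wal_writer_flush_after = 0   # ... and from the WAL writer
bgwriter_flush_after = 0     # assumption: same mechanism, not named above
backend_flush_after = 0      # assumption: same mechanism, not named above
```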


> A simple search tells that this is an old problem and my Windows has
> necessary WSL and other subsystem for this error to not show up.

Please note that nobody has verified that postgres works correctly via
the emulation stuff MS is doing.  There is a native version of postgres
for windows however, and that is tested (and exercised by a lot of
installations).


Greetings,

Andres Freund



Re: WARNING: could not flush dirty data: Function not implemented

2018-09-02 Thread Tom Lane
Ravi Krishna  writes:
> A CREATE DATABASE statement spewed out 
> WARNING:  could not flush dirty data: Function not implemented

Hmm, that's probably ENOSYS coming back from sync_file_range().
What filesystem is this database sitting on?

It's harmless from a correctness standpoint, because we'll fsync
the data later anyway; but it seems bad that we've selected
sync_file_range() if it doesn't actually work on your platform.

> A simple search tells that this is an old problem and my Windows has 
> necessary WSL and other subsystem for this error to not show up.

If this is on Ubuntu, I don't understand why you're talking
about Windows.

regards, tom lane



WARNING: could not flush dirty data: Function not implemented

2018-09-02 Thread Ravi Krishna
Ubuntu 18.04 as Windows bash

Distributor ID: Ubuntu
Description:Ubuntu 18.04.1 LTS
Release:18.04
Codename:   bionic


PG 10.5.1

postgres@ravi-lenovo:~$ psql -d postgres
psql (10.5 (Ubuntu 10.5-1.pgdg16.04+1))

A CREATE DATABASE statement spewed out 

WARNING:  could not flush dirty data: Function not implemented

many times, but the db was created.

What exactly is this?

A simple search shows that this is an old problem, and my Windows has the 
necessary WSL and other subsystem components, so this error should not show up.





Re: pg_basebackup: could not receive data from WAL stream

2018-09-02 Thread Kaixi Luo
wal_sender_timeout should be as long as necessary. Each WAL file is 16 MB,
so it should be *at least* as long as the time needed to transfer
16 MB * wal_keep_segments. Take a look at the size of your pg_xlog folder.
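
As a back-of-envelope sketch of that arithmetic (a hypothetical helper of
my own, with the segment size fixed at the default 16 MB):

```shell
# Floor for wal_sender_timeout, in whole seconds, given how many 16 MB
# WAL segments must be streamed and the available throughput in MB/s.
wal_timeout_floor() {
  awk -v n="$1" -v mbps="$2" 'BEGIN { print int(n * 16 / mbps) + 1 }'
}
# e.g. 64 segments at 20 MB/s:
#   wal_timeout_floor 64 20
```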

On Sun, Sep 2, 2018 at 3:41 PM Adrian Klaver 
wrote:

> On 09/01/2018 09:06 PM, greigwise wrote:
> > Hello.
> >
> > On postgresql 10.5, my pg_basebackup is failing with this error:
> >
> > pg_basebackup: could not receive data from WAL stream: server closed the
> > connection unexpectedly
> > This probably means the server terminated abnormally
> > before or while processing the request
> >
> > In the postgres log files, I'm seeing:
> >
> > 2018-09-02 00:57:32 UTC bkp_user 5b8b278c.11c3f [unknown] LOG:
> terminating
> > walsender process due to replication timeout
> >
> > I'm running the following command right on the database server itself:
> >
> > pg_basebackup -U repl -D /var/tmp/pg_basebackup_20180901 -Ft -z
> >
> > It seems to be an intermittent problem.. I've had it fail or succeed
> about
> > 50/50.  I even bumped up the wal_sender_timeout to 2000.  One notable
> thing
> > is that I'm running on an ec2 instance on AWS.
>
> The unit for wal_sender_timeout is ms, so the above is 2 seconds, whereas
> the default value is 60 seconds (60s in the postgresql.conf file).
>
> See below for setting units in file:
>
> https://www.postgresql.org/docs/10/static/config-setting.html
>
> Also what is your max_wal_senders setting?
>
> >
> > Any advice would be helpful.
> >
> > Greig Wise
> >
> >
> >
> > --
> > Sent from:
> http://www.postgresql-archive.org/PostgreSQL-general-f1843780.html
> >
> >
>
>
> --
> Adrian Klaver
> adrian.kla...@aklaver.com
>
>


Re: locate DB corruption

2018-09-02 Thread Dave Peticolas
On Sun, Sep 2, 2018 at 4:51 AM Stephen Frost  wrote:

> Greetings,
>
> * Dave Peticolas (d...@krondo.com) wrote:
> > On Sat, Sep 1, 2018 at 5:09 PM Adrian Klaver 
> > wrote:
> >
> > > On 09/01/2018 04:45 PM, Dave Peticolas wrote:
> > >
> > > > Well restoring from a backup of the primary does seem to have fixed
> the
> > > > issue with the corrupt table.
> > >
> > > Pretty sure it was not that the table was corrupt but that transaction
> > > information was missing from pg_clog.
> > >
> > > In a previous post you mentioned you ran tar to do the snapshot of
> > > $PG_DATA.
> > >
> > > Was there any error when tar ran the backup that caused you problems?
> >
> > Well the interesting thing about that is that although the bad table was
> > originally discovered in a DB restored from a snapshot, I subsequently
> > discovered it in the real-time clone of the primary from which the
> backups
> > are made. So somehow the clone's table became corrupted. The same table
> was
> > not corrupt on the primary, but I have discovered an error on the primary
> > -- it's in the thread I posted today. These events seem correlated in
> time,
> > I'll have to mine the logs some more.
>
> Has this primary been the primary since inception, or was it promoted to
> be one at some point after first being built as a replica..?


It was the primary since inception. All the problems now appear to have
stemmed from the primary due to a bug in 9.6.8 (see other thread). I've
since upgraded to 9.6.10.


Re: pg_basebackup: could not receive data from WAL stream

2018-09-02 Thread Adrian Klaver

On 09/01/2018 09:06 PM, greigwise wrote:

Hello.

On postgresql 10.5, my pg_basebackup is failing with this error:

pg_basebackup: could not receive data from WAL stream: server closed the
connection unexpectedly
This probably means the server terminated abnormally
before or while processing the request

In the postgres log files, I'm seeing:

2018-09-02 00:57:32 UTC bkp_user 5b8b278c.11c3f [unknown] LOG:  terminating
walsender process due to replication timeout

I'm running the following command right on the database server itself:

pg_basebackup -U repl -D /var/tmp/pg_basebackup_20180901 -Ft -z

It seems to be an intermittent problem.. I've had it fail or succeed about
50/50.  I even bumped up the wal_sender_timeout to 2000.  One notable thing
is that I'm running on an ec2 instance on AWS.


The unit for wal_sender_timeout is ms, so the above is 2 seconds, whereas 
the default value is 60 seconds (60s in the postgresql.conf file).
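
To avoid that unit pitfall, the setting can carry an explicit unit in
postgresql.conf (a sketch; the value shown is illustrative, not a
recommendation):

```
# A bare number here means milliseconds, so "2000" is 2 s, not 2000 s.
# Writing the unit out makes the intent unambiguous:
wal_sender_timeout = 120s
```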


See below for setting units in file:

https://www.postgresql.org/docs/10/static/config-setting.html

Also what is your max_wal_senders setting?



Any advice would be helpful.

Greig Wise



--
Sent from: http://www.postgresql-archive.org/PostgreSQL-general-f1843780.html





--
Adrian Klaver
adrian.kla...@aklaver.com



Re: locate DB corruption

2018-09-02 Thread Stephen Frost
Greetings,

* Dave Peticolas (d...@krondo.com) wrote:
> On Sat, Sep 1, 2018 at 5:09 PM Adrian Klaver 
> wrote:
> 
> > On 09/01/2018 04:45 PM, Dave Peticolas wrote:
> >
> > > Well restoring from a backup of the primary does seem to have fixed the
> > > issue with the corrupt table.
> >
> > Pretty sure it was not that the table was corrupt but that transaction
> > information was missing from pg_clog.
> >
> > In a previous post you mentioned you ran tar to do the snapshot of
> > $PG_DATA.
> >
> > Was there any error when tar ran the backup that caused you problems?
> 
> Well the interesting thing about that is that although the bad table was
> originally discovered in a DB restored from a snapshot, I subsequently
> discovered it in the real-time clone of the primary from which the backups
> are made. So somehow the clone's table became corrupted. The same table was
> not corrupt on the primary, but I have discovered an error on the primary
> -- it's in the thread I posted today. These events seem correlated in time,
> I'll have to mine the logs some more.

Has this primary been the primary since inception, or was it promoted to
be one at some point after first being built as a replica..?

Thanks!

Stephen

