from:"Harry Ambrose"

Re: [GENERAL] ERROR: unexpected chunk number 0 (expected 1) for toast value 76753264 in pg_toast_10920100

2017-08-25 Thread Harry Ambrose

Hi All,

Sorry to open this can of worms again. However, we are still struggling
with this issue across quite a large amount of our estate.

>From doing some further research I stumbled across the following which
seems to sum up what we are seeing quite well...

http://grokbase.com/t/postgresql/pgsql-hackers/1321h6dpv7/getoldestxmin-going-backwards-is-dangerous-after-all

The above thread does not state whether a fix was committed, can anyone
confirm/deny?

Have a great weekend (bank holiday for some!)

Best wishes,
Harry

Re: [GENERAL] ERROR: unexpected chunk number 0 (expected 1) for toast value 76753264 in pg_toast_10920100

2017-08-09 Thread Harry Ambrose

Unfortunately we still see it frequently :(

On 9 August 2017 at 14:29, Achilleas Mantzios 
wrote:

> On 09/08/2017 15:27, ADSJ (Adam Sjøgren) wrote:
>
>> On 2017-06-21 Adam Sjøgren  wrote:
>>
>> Adam Sjøgren  wrote:
>>>
 Meanwhile, I can report that I have upgraded from 9.3.14 to 9.3.17 and
 the errors keep appearing the log.

>>> Just to close this, for the record: We haven't seen the errors since
>> 2017-06-30. We upgraded to 9.3.17 (latest 9.3 point-release at the time
>> of writing) on 2017-06-10.
>>
>> Whether this means that the affected rows gradually got overwritten
>> after switching to .17 and thus got fixed, or if something subtle in our
>> workflow changed, so we aren't hitting this anymore, or something else
>> entirely is the answer, we're not sure.
>>
> Glad you sorted it out! You have been consistent in your effort to chase
> this down, and reverted back with your findings to close the case. Thumbs
> up!
>
>> We didn't get to trying Alvaro Herrera's suggestion of removing
>> 6c243f90ab6904f27fa990f1f3261e1d09a11853 before the errors stopped
>> appearing "by themselves".
>>
>>
>>Best regards,
>>
>>  Adam
>>
>>
> --
> Achilleas Mantzios
> IT DEV Lead
> IT DEPT
> Dynacom Tankers Mgmt
>
>
>
> --
> Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-general
>

Re: [GENERAL] ERROR: unexpected chunk number 0 (expected 1) for toast value 76753264 in pg_toast_10920100

2017-06-23 Thread Harry Ambrose

Hi Everyone,

Still trying to fathom this one. I have added quite a few log lines to a
copy of 9.4.12 and compiled it hoping to find the fault.

Below is from the log (at DEBUG5). Apologies for my name in the log lines,
it was the easiest way to grep them specifically I also apologise that its
a bit messy, i'm not a C dev.

This excerpt is without failure:

127.0.0.1 2017-06-23 09:45:26.083 BST  4061 594ccaa6.fdd postgres [unknown]
 0 DEBUG:  HARRYAMBROSE - UPDATE - Old tuple is 0, new tuple is 0,
table is: 2345873096
127.0.0.1 2017-06-23 09:45:26.083 BST  4061 594ccaa6.fdd postgres [unknown]
 0 STATEMENT:  VACUUM (FULL, ANALYZE) ctab
127.0.0.1 2017-06-23 09:45:26.083 BST  4061 594ccaa6.fdd postgres [unknown]
 0 DEBUG:  HARRYAMBROSE - UPDATE - Old tuple is 0, new tuple is 0,
table is: 2345873096
127.0.0.1 2017-06-23 09:45:26.083 BST  4061 594ccaa6.fdd postgres [unknown]
 0 STATEMENT:  VACUUM (FULL, ANALYZE) ctab
127.0.0.1 2017-06-23 09:45:26.083 BST  4061 594ccaa6.fdd postgres [unknown]
 0 DEBUG:  HARRYAMBROSE - UPDATE - Old tuple is 0, new tuple is 0,
table is: 2345873096
127.0.0.1 2017-06-23 09:45:26.083 BST  4061 594ccaa6.fdd postgres [unknown]
 0 STATEMENT:  VACUUM (FULL, ANALYZE) ctab
127.0.0.1 2017-06-23 09:45:26.083 BST  4061 594ccaa6.fdd postgres [unknown]
 0 DEBUG:  HARRYAMBROSE - index_beginscan_internal scan: 57267920
127.0.0.1 2017-06-23 09:45:26.083 BST  4061 594ccaa6.fdd postgres [unknown]
 0 STATEMENT:  VACUUM (FULL, ANALYZE) ctab
127.0.0.1 2017-06-23 09:45:26.083 BST  4061 594ccaa6.fdd postgres [unknown]
 0 DEBUG:  HARRYAMBROSE - index_beginscan_internal scan: 57267920
127.0.0.1 2017-06-23 09:45:26.083 BST  4061 594ccaa6.fdd postgres [unknown]
 0 STATEMENT:  VACUUM (FULL, ANALYZE) ctab
127.0.0.1 2017-06-23 09:45:26.083 BST  4061 594ccaa6.fdd postgres [unknown]
 0 DEBUG:  HARRYAMBROSE - index_beginscan_internal scan: 57267920
127.0.0.1 2017-06-23 09:45:26.083 BST  4061 594ccaa6.fdd postgres [unknown]
 0 STATEMENT:  VACUUM (FULL, ANALYZE) ctab
127.0.0.1 2017-06-23 09:45:26.084 BST  4061 594ccaa6.fdd postgres [unknown]
 0 DEBUG:  CommitTransaction
127.0.0.1 2017-06-23 09:45:26.084 BST  4061 594ccaa6.fdd postgres [unknown]
 0 STATEMENT:  VACUUM (FULL, ANALYZE) ctab
127.0.0.1 2017-06-23 09:45:26.084 BST  4061 594ccaa6.fdd postgres [unknown]
 0 DEBUG:  name: unnamed; blockState:   STARTED; state: INPROGR,
xid/subid/cid: 73603293/1/0 (used), nestlvl: 1, children:
127.0.0.1 2017-06-23 09:45:26.084 BST  4061 594ccaa6.fdd postgres [unknown]
 0 STATEMENT:  VACUUM (FULL, ANALYZE) ctab
127.0.0.1 2017-06-23 09:45:26.084 BST  4061 594ccaa6.fdd postgres [unknown]
 0 DEBUG:  StartTransaction
127.0.0.1 2017-06-23 09:45:26.084 BST  4061 594ccaa6.fdd postgres [unknown]
 0 STATEMENT:  VACUUM (FULL, ANALYZE) ctab
127.0.0.1 2017-06-23 09:45:26.084 BST  4061 594ccaa6.fdd postgres [unknown]
 0 DEBUG:  name: unnamed; blockState:   DEFAULT; state: INPROGR,
xid/subid/cid: 0/1/0, nestlvl: 1, children:
127.0.0.1 2017-06-23 09:45:26.084 BST  4061 594ccaa6.fdd postgres [unknown]
 0 STATEMENT:  VACUUM (FULL, ANALYZE) ctab

This is with the failure, interesting that there are multiple flush's
nearby, could be total coincidence though?

127.0.0.1 2017-06-23 10:28:25.862 BST  1219 594cd5ba.4c3 postgres [unknown]
 0 STATEMENT:  VACUUM (FULL, ANALYZE) ctab
**.*.**.*** 2017-06-23 10:28:25.996 BST  24765 594bdf98.60bd postgres
walreceiver  0 DEBUG:  write 2F81/E7DA8000 flush 2F81/E7D9 apply
2F81/E7D8FBA0
**.*.**.*** 2017-06-23 10:28:25.996 BST  24765 594bdf98.60bd postgres
walreceiver  0 DEBUG:  write 2F81/E7DA8000 flush 2F81/E7DA8000 apply
2F81/E7D8FBA0

sent 16492 bytes  received 24607 bytes  82198.00 bytes/sec
total size is 16777216  speedup is 408.21
 2017-06-23 10:28:26.014 BST  24752 594bdf95.60b00 DEBUG:  archived
transaction log file "00012F8100E5"
 2017-06-23 10:28:26.018 BST  24752 594bdf95.60b00 DEBUG:
 executing archive command "rsync -e ssh -arv
 /wal/pg_xlog/00012F8100E6 postgres@:/wal/pg_xlog"
127.0.0.1 2017-06-23 10:28:26.101 BST  1219 594cd5ba.4c3 postgres [unknown]
 0 DEBUG:  HARRYAMBROSE - index_beginscan_internal scan: 630163208
127.0.0.1 2017-06-23 10:28:26.101 BST  1219 594cd5ba.4c3 postgres [unknown]
 0 STATEMENT:  VACUUM (FULL, ANALYZE) ctab
sending incremental file list

sent 69 bytes  received 12 bytes  162.00 bytes/sec
total size is 16777216  speedup is 207126.12
 2017-06-23 10:28:26.200 BST  24752 594bdf95.60b00 DEBUG:  archived
transaction log file "00012F8100E6"
 2017-06-23 10:28:26.201 BST  24752 594bdf95.60b00 DEBUG:
 executing archive command "rsync -e ssh -arv
 /wal/pg_xlog/00012F8100E7 postgres@:/wal/pg_xlog"
**.*.**.*** 2017-06-23 10:28:26.203 BST  24765 594bdf98.60bd postgres
walreceiver  0 DEBUG:  write 2F81/E7DC8000 flush 2F81/E7DA8000 apply
2F81/E7DA7FC8
**.*.**.*** 2017-06-23 10:28:26.203 BST  247

Re: [GENERAL] ERROR: unexpected chunk number 0 (expected 1) for toast value 76753264 in pg_toast_10920100

2017-06-13 Thread Harry Ambrose

Hi,

Not sure whether its relevant or not, however upon adding an ANALYSE before
the second vacuum the issue has not presented when testing. I have managed
95 cycles thus far.

BEGIN;
CREATE TABLE x (id BIGSERIAL PRIMARY KEY, payload1 VARCHAR, payload2
VARCHAR, payload3 VARCHAR, payload4 BIGINT, payload5 BIGINT);
/* Repeat until 2,000,000 rows are inserted */
INSERT INTO x (id, payload1, payload2, payload3, payload4, payload5) VALUES
(random values of varying length/size to force random toast usage);
COMMIT;

VACUUM (ANALYZE, FULL);

BEGIN;
/* Repeat until all 2,000,000 rows are updated */
UPDATE x SET payload1 = , payload2 = , payload3 = , payload4 = , payload5 =
... again random values of varying length/size to force random toast usage
COMMIT;

ANALYZE x; -- <== New analyse here.

VACCUM (ANALYZE, FULL);

(...)

Vacuum end. 30 sec sleep

Update selective

Inserting the rows

update 0

update 2

update 4

update 6

update 8

update 10

update 12

update 14

update 16

update 18

Update all

Attempting vacuum

Vacuum completed

dropping the table

=

New attempt - number 96

Inserting the rows

Executing  0

Executing  4

Executing  8

Executing  12

Executing  16

Executing  20

Executing  24

Executing  28

(...)

Many thanks,
Harry

Re: [GENERAL] ERROR: unexpected chunk number 0 (expected 1) for toast value 76753264 in pg_toast_10920100

2017-06-12 Thread Harry Ambrose

Hi,

BTW, how do you get that jar to make the test table on a non-default
> tablespace?  Or are you just putting the whole test DB on a tablespace?
>
> regards, tom lane
>

I have been putting the whole database on a tablespace. It seemed easier
than modifying the jar.

Many thanks,
Harry

Re: [GENERAL] ERROR: unexpected chunk number 0 (expected 1) for toast value 76753264 in pg_toast_10920100

2017-06-12 Thread Harry Ambrose

It seems to be very hit and miss...

The below is from the machine described in this thread running PostgreSQL
9.4.10:

update 10

update 12

update 14

update 16

update 18

Update all

Vacuum

org.postgresql.util.PSQLException: ERROR: unexpected chunk number 2285
(expected 0) for toast value 187504167 in pg_toast_187504156

at
org.postgresql.core.v3.QueryExecutorImpl.receiveErrorResponse(QueryExecutorImpl.java:2455)

at
org.postgresql.core.v3.QueryExecutorImpl.processResults(QueryExecutorImpl.java:2155)

at
org.postgresql.core.v3.QueryExecutorImpl.execute(QueryExecutorImpl.java:288)

at
org.postgresql.jdbc.PgStatement.executeInternal(PgStatement.java:430)

at org.postgresql.jdbc.PgStatement.execute(PgStatement.java:356)

at
org.postgresql.jdbc.PgPreparedStatement.executeWithFlags(PgPreparedStatement.java:168)

at
org.postgresql.jdbc.PgCallableStatement.executeWithFlags(PgCallableStatement.java:78)

at
org.postgresql.jdbc.PgPreparedStatement.execute(PgPreparedStatement.java:157)

at Start.execute(Start.java:118)

at Start.main(Start.java:20)

=

New attempt - number 2

Inserting the rows

Executing  0

Executing  4

Executing  8

Executing  12

Executing  16

Executing  20

Thank you for trying :)

Best wishes,
Harry

Re: [GENERAL] ERROR: unexpected chunk number 0 (expected 1) for toast value 76753264 in pg_toast_10920100

2017-06-12 Thread Harry Ambrose

Hi,

pls tell me, I am currently running 2nd run in my box, (New attempt 2), and
> its in the "Attempting vacuum" phase.
> What is it supposed to do next?
> I got no errors , it has gotten my machine to its knees.
>

The jar has an endless while loop. Thus please kill the PID when you are
done testing. It loosely follows the following:

BEGIN;
CREATE TABLE x (id BIGSERIAL PRIMARY KEY, payload1 VARCHAR, payload2
VARCHAR, payload3 VARCHAR, payload4 BIGINT, payload5 BIGINT);
/* Repeat until 2,000,000 rows are inserted */
INSERT INTO x (id, payload1, payload2, payload3, payload4, payload5) VALUES
(random values of varying length/size to force random toast usage);
COMMIT;

VACUUM (ANALYZE, FULL);

BEGIN;
/* Repeat until all 2,000,000 rows are updated */
UPDATE x SET payload1 = , payload2 = , payload3 = , payload4 = , payload5 =
... again random values of varying length/size to force random toast usage
COMMIT;

VACCUM (ANALYZE, FULL);

If the error is going to occur it will happen during the second "Attempting
vacuum" phase.

1st vacuum:

c.setAutoCommit(true);

System.out.println("Attempting vacuum");

c.prepareCall("VACUUM (FULL, ANALYZE, VERBOSE) x").execute();

System.out.println("Vacuum end. 30 sec sleep");

Thread.sleep(6);

2nd vacuum:

System.out.println("Attempting vacuum");

c.prepareCall("VACUUM (FULL, ANALYZE, VERBOSE) x").execute();

System.out.println("Vacuum completed");

Thread.sleep(3);

System.out.println("Dropping the table");

c.createStatement().execute("drop table if exists x");

Thread.sleep(3);

Hope this helps!

Best wishes,
Harry

Re: [GENERAL] ERROR: unexpected chunk number 0 (expected 1) for toast value 76753264 in pg_toast_10920100

2017-06-12 Thread Harry Ambrose

Hi,

> Their suggestion is to upload to Google Drive. That or use a third party
site, like Dropbox.

I have uploaded the jar to dropbox, link below (please let me know if you
have any issues downloading):

https://www.dropbox.com/s/96vm465i7rwhcf8/toast-corrupter-aio.jar?dl=0

> So I guess you run memtest86+ and it reported that your memory is indeed
ECC and also that it is working properly?

Correct, there are no issues reported. The issue can also be reproduced on
multiple different environments making the likelihood of a bad memory slim.

Best wishes,
Harry

Re: [GENERAL] ERROR: unexpected chunk number 0 (expected 1) for toast value 76753264 in pg_toast_10920100

2017-06-10 Thread Harry Ambrose

Hi,

Please find the jar attached (renamed with a .txt extension as I know some
email services deem jars a security issue).

The jar accepts the following arguments:

$1 = host
$2 = database
$3 = username
$4 = password
$5 = port

It returns its logging to STDOUT. Please let me know if you require further
info.

Best wishes,
Harry

On 7 June 2017 at 17:46, Tom Lane  wrote:

> Harry Ambrose  writes:
> > Tom - I can provide a jar that I have been using to replicate the issue.
> Whats the best transport method to send it over?
>
> If it's not enormous, just send it as an email attachment.
>
> regards, tom lane
>


Unsupported File Types Alert.txt
Description: Unsupported File Types Alert.txt

-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general

Re: [GENERAL] ERROR: unexpected chunk number 0 (expected 1) for toast value 76753264 in pg_toast_10920100

2017-06-10 Thread Harry Ambrose

Hi,

Please find the jar attached (renamed with a .txt extension as I know some
email services deem jars a security issue).

The jar accepts the following arguments:

$1 = host
$2 = database
$3 = username
$4 = password
$5 = port

It returns its logging to STDOUT. Please let me know if you require further
info.

Best wishes,
Harry

On 7 June 2017 at 17:46, Tom Lane  wrote:

> Harry Ambrose  writes:
> > Tom - I can provide a jar that I have been using to replicate the issue.
> Whats the best transport method to send it over?
>
> If it's not enormous, just send it as an email attachment.
>
> regards, tom lane
>


Unsupported File Types Alert.txt
Description: Unsupported File Types Alert.txt

-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general

Re: [GENERAL] ERROR: unexpected chunk number 0 (expected 1) for toast value 76753264 in pg_toast_10920100

2017-06-09 Thread Harry Ambrose

Hi,

Maybe you could give some info on :
> - your ext3 mkfs and mount options (journal, barriers, etc)
>
/etc/fstab details below:
LABEL=/var/lib/pgsql/var/lib/pgsql   ext3defaults
 1 2
LABEL=/tablespace1  /tablespace1ext3defaults
 1 2
LABEL=/tablespace2  /tablespace2ext3defaults
 1 2
LABEL=/tablespace3  /tablespace3ext3defaults
 1 2
pg_default:

Filesystem features:  has_journal ext_attr resize_inode dir_index
filetype needs_recovery sparse_super large_file

Filesystem flags: signed_directory_hash

Default mount options:(none)

Filesystem state: clean

Errors behavior:  Continue

Filesystem OS type:   Linux

Inode count:  36634624

Block count:  146506767

Reserved block count: 7325338

Free blocks:  143785740

Free inodes:  36627866

First block:  0

Block size:   4096

Fragment size:4096

Reserved GDT blocks:  989

Blocks per group: 32768

Fragments per group:  32768

Inodes per group: 8192

Inode blocks per group:   512

RAID stride:  64

RAID stripe width:64

Filesystem created:   Fri Aug  9 16:11:53 2013

Last mount time:  Fri Apr 21 22:37:02 2017

Last write time:  Fri Apr 21 22:37:02 2017

Mount count:  2

Maximum mount count:  100

Last checked: Thu Sep 15 18:52:43 2016

Check interval:   31536000 (12 months, 5 days)

Next check after: Fri Sep 15 18:52:43 2017

Reserved blocks uid:  0 (user root)

Reserved blocks gid:  0 (group root)

First inode:  11

Inode size:   256

Required extra isize: 28

Desired extra isize:  28

Journal inode:8

Default directory hash:   half_md4

Journal backup:   inode blocks
tablespaces

Filesystem features:  has_journal ext_attr resize_inode dir_index
filetype needs_recovery sparse_super large_file

Filesystem flags: signed_directory_hash

Default mount options:(none)

Filesystem state: clean

Errors behavior:  Continue

Filesystem OS type:   Linux

Inode count:  73261056

Block count:  293013543

Reserved block count: 14650677

Free blocks:  286208439

Free inodes:  73174728

First block:  0

Block size:   4096

Fragment size:4096

Reserved GDT blocks:  954

Blocks per group: 32768

Fragments per group:  32768

Inodes per group: 8192

Inode blocks per group:   512

RAID stride:  64

RAID stripe width:128

Filesystem created:   Fri Aug  9 16:11:53 2013

Last mount time:  Fri Apr 21 22:37:02 2017

Last write time:  Fri Apr 21 22:37:02 2017

Mount count:  2

Maximum mount count:  100

Last checked: Thu Sep 15 18:52:43 2016

Check interval:   31536000 (12 months, 5 days)

Next check after: Fri Sep 15 18:52:43 2017

Reserved blocks uid:  0 (user root)

Reserved blocks gid:  0 (group root)

First inode:  11

Inode size:   256

Required extra isize: 28

Desired extra isize:  28

Journal inode:8

Default directory hash:   half_md4

Journal backup:   inode blocks

> - your controller setup (battery should be working good and cache mode set
> to write back)
>
Cache Board Present: True
Cache Status: OK
Cache Ratio: 10% Read / 90% Write
Drive Write Cache: Disabled
Total Cache Size: 2.0 GB
Total Cache Memory Available: 1.8 GB
No-Battery Write Cache: Disabled
SSD Caching RAID5 WriteBack Enabled: False
SSD Caching Version: 1
Cache Backup Power Source: Capacitors
Battery/Capacitor Count: 1
Battery/Capacitor Status: OK

> - your disks setup (write cache should be disabled)
>
Write cache is disabled, see above.

> - you should check your syslogs/messages for any errors related to storage
>
No error messages found.

> - is your RAM ECC? Did you run any memtest?
>
Yes, memory is ECC. No error messages found.

> - is your CPU overheating ?
>
No overheating issues.

> - have you experienced any crashes/freezes ?
>
No crashes/freezes experienced.

Best wishes,
Harry

Re: [GENERAL] ERROR: unexpected chunk number 0 (expected 1) for toast value 76753264 in pg_toast_10920100

2017-06-09 Thread Harry Ambrose

Hi Tom,

Thanks for attempting to replicate the issue.

Anyway, the bad news is I couldn't reproduce the problem then and I can't
> now.  I don't know if it's a timing issue or if there's something critical
> about configuration that I'm not duplicating.  Can you explain what sort
> of platform you're testing on, and what nondefault configuration settings
> you're using?
>

Further details about the environment that I can replicate on below:

- Non default postgresql.conf settings:
checkpoint_segments = 192
checkpoint_completion_target = 0.9
checkpoint_timeout = 5min
wal_keep_segments = 256
wal_writer_delay = 200ms
archive_mode = on
archive_command = 'rsync -e ssh -arv  /wal/pg_xlog/%f postgres@:/wal/pg_xlog'
archive_timeout = 60
syslog_facility = 'LOCAL0'
log_statement = 'mod'
syslog_ident = 'postgres'
log_line_prefix = '%h %m  %p %c %u %a  %e '
log_timezone = 'GB'
track_activities = on
track_counts = on
datestyle = 'iso, mdy'
timezone = 'GB'
default_text_search_config = 'pg_catalog.english'
array_nulls = on
sql_inheritance = on
standard_conforming_strings = on
synchronize_seqscans = on
transform_null_equals = off
- Two node master/slave setup using streaming replication (without slots).
- CentOS 6.9 (2.6.32-696.el6.x86_64).
- PostgreSQL 9.4.10 on x86_64-unknown-linux-gnu, compiled by gcc (GCC)
4.4.7 20120313 (Red Hat 4.4.7-17), 64-bit.
- 64GiB RAM.
- AMD Opteron(TM) Processor 6238.
- pg_default sat on 2 disk RAID-1 conifugration (ext3 filesystem).
- Custom tablespaces (where the errors occur) sat on 4 disk RAID-10 (ext3
filesystem).
- All disks are HP 600G SAS 6.0Gbps with P420(i) controllers and battery
backed cache enabled.

Please let me know if you require further info.

Best wishes,
Harry

Re: [GENERAL] ERROR: unexpected chunk number 0 (expected 1) for toast value 76753264 in pg_toast_10920100

2017-06-07 Thread Harry Ambrose

Hi,

Thanks for the responses.

> "One lesson I learned from the BSD camp when dealing with random freezes and 
> panics : when all else fails to give an answer it is time to start blaming my 
> hardware. Are those tablespaces on any cheap SSD's ?”

The tablespaces are not sat on SSD’s. Something I had also considered.

Tom - I can provide a jar that I have been using to replicate the issue. Whats 
the best transport method to send it over?

Best wishes,
Harry

> On 7 Jun 2017, at 16:27, Tom Lane  wrote:
> 
> Harry Ambrose  writes:
>> I have been following the updates to the 9.4 branch hoping a fix will 
>> appear, but sadly no luck yet. I have manually replicated the issue on 
>> 9.4.4, 9.4.10 and 9.4.12. My replication steps are:
> 
> This is a very interesting report, but you didn't actually provide a
> reproducer, just a handwavy outline.  If you submit a script that
> makes this happen, we will most definitely look into it.  But
> people aren't going to be excited about trying to reverse-engineer
> a test case out of a vague description.
> 
>> I also found the following has been reported:
>> https://www.postgresql.org/message-id/20161201165505.4360.28...@wrigleys.postgresql.org
> 
> That person never came back with a self-contained test case, either.
> 
> https://wiki.postgresql.org/wiki/Guide_to_reporting_problems
> 
>   regards, tom lane



-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general

Re: [GENERAL] ERROR: unexpected chunk number 0 (expected 1) for toast value 76753264 in pg_toast_10920100

2017-06-07 Thread Harry Ambrose

Hi,

I too have been experiencing this with a busy PostgreSQL instance.

I have been following the updates to the 9.4 branch hoping a fix will appear, 
but sadly no luck yet. I have manually replicated the issue on 9.4.4, 9.4.10 
and 9.4.12. My replication steps are:

BEGIN;
CREATE TABLE x (id BIGSERIAL PRIMARY KEY, payload1 VARCHAR, payload2 VARCHAR, 
payload3 VARCHAR, payload4 BIGINT, payload5 BIGINT);
/* Repeat until 2,000,000 rows are inserted */
INSERT INTO x (id, payload1, payload2, payload3, payload4, payload5) VALUES 
(random values of varying length/size to force random toast usage);
COMMIT;

VACUUM (ANALYZE, FULL);

BEGIN;
/* Repeat until all 2,000,000 rows are updated */
UPDATE x SET payload1 = , payload2 = , payload3 = , payload4 = , payload5 = ... 
again random values of varying length/size to force random toast usage
COMMIT;

VACCUM (ANALYZE, FULL);

The second vacuum causes an ERROR identical to that you are reporting below 
(unexpected chunk number n (expected n) for toast value...). However it may 
take up to ten attempts to replicate it.

Out of interest, are you using any tablespaces other than pg_default? I can 
only replicate the issue when using separately mounted tablespaces.

I have been investigating this quite extensively and everything I can find on 
the web suggests data corruption. However running the the following DO reports 
no errors and I can dump the database without issue.

DO $$
DECLARE

curid INT := 0;
vcontent RECORD;
badid BIGINT;

var1_sub VARCHAR;
var2_sub VARCHAR;
var3_sub VARCHAR;
var4_sub VARCHAR;
var5_sub VARCHAR;

BEGIN
FOR badid IN SELECT id FROM x 
LOOP
curid = curid + 1;

IF curid % 10 = 0 
THEN
RAISE NOTICE '% rows inspected', curid;
END IF;

BEGIN
SELECT *
INTO vcontent
FROM x
WHERE rowid = badid;

var1_sub := SUBSTR(vcontent.var1,2000,5000);
var2_sub := SUBSTR(vcontent.var2,2000,5000);   
var3_sub := SUBSTR(vcontent.var3,2000,5000);
var4_sub := SUBSTR(vcontent.var4::VARCHAR,2000,5000);
var5_sub := SUBSTR(vcontent.var5::VARCHAR,2000,5000);

EXCEPTION WHEN OTHERS THEN
RAISE NOTICE 'Data for rowid % is corrupt', badid;
CONTINUE;
END;

END LOOP;
END;
$$;

I also found the following has been reported: 
https://www.postgresql.org/message-id/20161201165505.4360.28...@wrigleys.postgresql.org

Best wishes,
Harry

> On 7 Jun 2017, at 15:22, Achilleas Mantzios  
> wrote:
> 
> On 07/06/2017 16:33, ADSJ (Adam Sjøgren) wrote:
>> Our database has started reporting errors like this:
>> 
>>   2017-05-31 13:48:10 CEST ERROR:  unexpected chunk number 0 (expected 1) 
>> for toast value 14242189 in pg_toast_10919630
>>   ...
>>   2017-06-01 11:06:56 CEST ERROR:  unexpected chunk number 0 (expected 1) 
>> for toast value 19573520 in pg_toast_10919630
>> 
>> (157 times, for different toast values, same pg_toast_nnn). pg_toast_10919630
>> corresponds to a table with around 168 million rows.
>> 
>> These went away, but the next day we got similar errors from another
>> table:
>> 
>>   2017-06-02 05:59:50 CEST ERROR:  unexpected chunk number 0 (expected 1) 
>> for toast value 47060150 in pg_toast_10920100
>>   ...
>>   2017-06-02 06:14:54 CEST ERROR:  unexpected chunk number 0 (expected 1) 
>> for toast value 47226455 in pg_toast_10920100
>> 
>> (Only 4 this time) pg_toast_10920100 corresponds to a table with holds
>> around 320 million rows (these are our two large tables).
>> 
>> The next day we got 6 such errors and the day after 10 such errors. On
>> June 5th we got 94, yesterday we got 111, of which one looked a little
>> different:
>> 
>>   2017-06-06 17:32:21 CEST ERROR:  unexpected chunk size 1996 (expected 
>> 1585) in final chunk 0 for toast value 114925100 in pg_toast_10920100
>> 
>> and today the logs have 65 lines, ending with these:
>> 
>>   2017-06-07 14:49:53 CEST ERROR:  unexpected chunk number 0 (expected 1) 
>> for toast value 131114834 in pg_toast_10920100
>>   2017-06-07 14:53:41 CEST ERROR:  unexpected chunk number 0 (expected 1) 
>> for toast value 131149566 in pg_toast_10920100
> First try to find which tables those toast relations refer to :
> select 10919630::regclass , 10920100::regclass ;
> Are those critical tables? Can you restore them somehow?
> 
> Also you may consider
> REINDEX TABLE pg_toast.pg_toast_10920100;
> REINDEX TABLE pg_toast.pg_toast_10919630;
> REINDEX TABLE ;
> REINDEX TABLE ;
> 
> also VACUUM the above tables.
> 
> You might want to write a function which iterates over the damaged table's 
> rows in order to identify the damaged row(s). And then do some good update to 
> create a new version

Re: [GENERAL] ERROR: unexpected chunk number 0 (expected 1) for toast value 76753264 in pg_toast_10920100

Re: [GENERAL] ERROR: unexpected chunk number 0 (expected 1) for toast value 76753264 in pg_toast_10920100

Re: [GENERAL] ERROR: unexpected chunk number 0 (expected 1) for toast value 76753264 in pg_toast_10920100

Re: [GENERAL] ERROR: unexpected chunk number 0 (expected 1) for toast value 76753264 in pg_toast_10920100

Re: [GENERAL] ERROR: unexpected chunk number 0 (expected 1) for toast value 76753264 in pg_toast_10920100

Re: [GENERAL] ERROR: unexpected chunk number 0 (expected 1) for toast value 76753264 in pg_toast_10920100

Re: [GENERAL] ERROR: unexpected chunk number 0 (expected 1) for toast value 76753264 in pg_toast_10920100

Re: [GENERAL] ERROR: unexpected chunk number 0 (expected 1) for toast value 76753264 in pg_toast_10920100

Re: [GENERAL] ERROR: unexpected chunk number 0 (expected 1) for toast value 76753264 in pg_toast_10920100

Re: [GENERAL] ERROR: unexpected chunk number 0 (expected 1) for toast value 76753264 in pg_toast_10920100

Re: [GENERAL] ERROR: unexpected chunk number 0 (expected 1) for toast value 76753264 in pg_toast_10920100

Re: [GENERAL] ERROR: unexpected chunk number 0 (expected 1) for toast value 76753264 in pg_toast_10920100

Re: [GENERAL] ERROR: unexpected chunk number 0 (expected 1) for toast value 76753264 in pg_toast_10920100

Re: [GENERAL] ERROR: unexpected chunk number 0 (expected 1) for toast value 76753264 in pg_toast_10920100

14 matches

Site Navigation

Mail list logo

Footer information