Bug#1032104: marked as done (linux: ppc64el iouring corrupted read)

2024-01-14 Thread Debian Bug Tracking System
Your message dated Sun, 14 Jan 2024 19:48:12 +
with message-id 
and subject line Bug#1032104: fixed in linux 5.10.205-1
has caused the Debian Bug report #1032104,
regarding linux: ppc64el iouring corrupted read
to be marked as done.

This means that you claim that the problem has been dealt with.
If this is not the case it is now your responsibility to reopen the
Bug report if necessary, and/or fix the problem forthwith.

(NB: If you are a system administrator and have no idea what this
message is talking about, this may indicate a serious mail system
misconfiguration somewhere. Please contact ow...@bugs.debian.org
immediately.)


-- 
1032104: https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=1032104
Debian Bug Tracking System
Contact ow...@bugs.debian.org with problems
--- Begin Message ---
Source: linux
Version: 5.10.0-21-powerpc64le
Severity: grave
Justification: causes non-serious data loss
X-Debbugs-Cc: dan...@mariadb.org

Dear Maintainer,

From https://jira.mariadb.org/browse/MDEV-30728

A number of MariaDB's mtr tests depend on correct kernel operation.

As observed in these tests, there is a ~1/5 chance that the
encryption.innodb_encryption test will read zeros in the latter part of
the 16k pages that InnoDB uses by default.

This affects MariaDB-10.6+ packages where there is a liburing in the
distribution.

This has been observed in the CI of Debian
(https://ci.debian.net/packages/m/mariadb/testing/ppc64el/)
and in upstream's (https://buildbot.mariadb.org/#/builders/318).
The one ppc64le worker that has the Debian 5.10.0-21 kernel,
the same as the Debian CI, has the prefix ppc64le-db-bbw1-*.

Test faults occur on all MariaDB 10.6+ builds in containers on this kernel.
There are no faults on non-ppc64le kernels or on RHEL7/8-based ppc64le kernels.

To reproduce:

apt-get install mariadb-test
cd /usr/share/mysql/mysql-test
./mtr --mysqld=--innodb-flush-method=fsync --mysqld=--innodb-use-native-aio=1 \
    --vardir=/var/lib/mysql --force encryption.innodb_encryption,innodb,undo0 \
    --repeat=12

A test will frequently fail.

2023-02-28  1:41:01 0 [ERROR] InnoDB: Database page corruption on disk or a 
failed read of file './ibdata1' page [page id: space=0, page number=282]. You 
may have to recover from a backup.

(the page number isn't predictable)

The complete mtr error log of mariadb server is $PWD/var/log/mysqld.1.err

I tested on tmpfs. This is a different fault from bug #1020831 as:
* there is no iouring error, just a bunch of zeros where data was
  expected.
* this is ppc64le only.

Note, more serious faults exist on overlayfs (MDEV-28751) and remote
filesystems so sticking to local xfs, ext4, btrfs is recommended.
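
The symptom described above (zeros in the latter part of a 16k page) can be looked for directly in a data file with standard tools. The following is only a rough sketch: the helper name `page_tail_is_zero` and the 4096-byte tail length are made up for illustration and are not part of MariaDB or mtr. A healthy page can legitimately end in zeros, so a match is a hint rather than proof of this bug.

```sh
# Hypothetical helper: exit 0 if the last TAIL_BYTES of page PAGE_NO in FILE
# are all zero, 1 if they are not, 2 if the page is missing or truncated.
# Usage: page_tail_is_zero FILE PAGE_NO [PAGE_SIZE] [TAIL_BYTES]
page_tail_is_zero() {
    file=$1 page_no=$2 page_size=${3:-16384} tail_bytes=${4:-4096}
    page=$(mktemp) || return 2
    # Extract exactly one page from the file into a scratch file.
    dd if="$file" of="$page" bs="$page_size" skip="$page_no" count=1 2>/dev/null
    if [ "$(wc -c < "$page")" -ne "$page_size" ]; then
        rm -f "$page"
        return 2
    fi
    # Count the bytes in the tail that remain after deleting all NUL bytes.
    nonzero=$(tail -c "$tail_bytes" "$page" | tr -d '\000' | wc -c)
    rm -f "$page"
    [ "$nonzero" -eq 0 ]
}
```

For the error quoted above, `page_tail_is_zero ./ibdata1 282` would inspect page 282 of the system tablespace. Note that this reads the file through the normal buffered path, so it only shows what is on disk, not what an io_uring read returned.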

-- System Information:
Debian Release: bullseye
  APT prefers jammy-updates
  APT policy: (500, 'jammy-updates'), (500, 'jammy-security'), (500, 'jammy'), 
(100, 'jammy-backports')
Architecture: ppc64el (ppc64le)

Kernel: Linux 5.10.0-21-powerpc64le (SMP w/128 CPU threads)
Locale: LANG=C.UTF-8, LC_CTYPE=C.UTF-8 (charmap=UTF-8), LANGUAGE not set
Shell: /bin/sh linked to /usr/bin/dash
Init: unable to detect
--- End Message ---
--- Begin Message ---
Source: linux
Source-Version: 5.10.205-1
Done: Salvatore Bonaccorso 

We believe that the bug you reported is fixed in the latest version of
linux, which is due to be installed in the Debian FTP archive.

A summary of the changes between this version and the previous one is
attached.

Thank you for reporting the bug, which will now be closed.  If you
have further comments please address them to 1032...@bugs.debian.org,
and the maintainer will reopen the bug report if appropriate.

Debian distribution maintenance software
pp.
Salvatore Bonaccorso  (supplier of updated linux package)

(This message was generated automatically at their request; if you
believe that there is a problem with it please contact the archive
administrators by mailing ftpmas...@ftp-master.debian.org)


-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA512

Format: 1.8
Date: Sat, 30 Dec 2023 10:41:34 +0100
Source: linux
Architecture: source
Version: 5.10.205-1
Distribution: bullseye-security
Urgency: high
Maintainer: Debian Kernel Team 
Changed-By: Salvatore Bonaccorso 
Closes: 1032104 1035587 1052304
Changes:
 linux (5.10.205-1) bullseye-security; urgency=high
 .
   * New upstream stable update:
 https://www.kernel.org/pub/linux/kernel/v5.x/ChangeLog-5.10.198
 - NFS: Use the correct commit info in nfs_join_page_group()
 - NFS/pNFS: Report EINVAL errors from connect() to the server
 - SUNRPC: Mark the cred for revalidation if the server rejects it
 - tra

Bug#1032104: marked as done (linux: ppc64el iouring corrupted read)

2023-12-09 Thread Debian Bug Tracking System
Your message dated Sat, 09 Dec 2023 17:56:32 +
with message-id 
and subject line Bug#1032104: fixed in linux 6.1.66-1
has caused the Debian Bug report #1032104,
regarding linux: ppc64el iouring corrupted read
to be marked as done.

--- Begin Message ---
Source: linux
Source-Version: 6.1.66-1
Done: Salvatore Bonaccorso 

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA512

Format: 1.8
Date: Sat, 09 Dec 2023 16:48:39 +0100
Source: linux
Architecture: source
Version: 6.1.66-1
Distribution: bookworm
Urgency: medium
Maintainer: Debian Kernel Team 
Changed-By: Salvatore Bonaccorso 
Closes: 1032104 1057790 1057843
Changes:
 linux (6.1.66-1) bookworm; urgency=medium
 .
   * New upstream stable update:
 https://www.kernel.org/pub/linux/kernel/v6.x/ChangeLog-6.1.65
 - afs: Fix afs_server_list to be cleaned up with RCU
 - afs: Make error on cell lookup failure consistent with OpenAFS
 - [arm64,armhf] drm/panel: simple: Fix Innolux G101ICE-L01 bus flags
 - [arm64,armhf] drm/panel: simple: Fix 

Bug#1032104: Fixed in 4.19.301, 5.10.203, 6.1.66

2023-12-08 Thread Salvatore Bonaccorso
So the fix also landed in 5.10.203 and 6.1.66 in particular; I will
add a corresponding closer for this bug with those rebases. This means
the update will be in the next upload rebasing to at least those
versions (it was too late for the next round of point release for
bookworm).
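
The fixed releases named in this thread (4.19.301, 5.10.203, 6.1.66) can be turned into a quick check. This is a minimal sketch, not an official tool: the function name `fix_status` and its output words are invented here, and any series not named in the thread is reported as "unknown" rather than guessed at.

```sh
# Hypothetical helper based only on the versions listed in this thread.
# Usage: fix_status VERSION   (upstream or Debian package version, e.g. 6.1.66-1)
# Prints "has-fix", "predates-fix", or "unknown" (series not named here).
fix_status() {
    IFS=. read -r major minor patch <<EOF
$1
EOF
    patch=${patch%%[!0-9]*}   # drop a Debian revision such as "-1"
    patch=${patch:-0}
    case "$major.$minor" in
        4.19) fixed=301 ;;
        5.10) fixed=203 ;;
        6.1)  fixed=66 ;;
        *)    echo unknown; return 0 ;;
    esac
    if [ "$patch" -ge "$fixed" ]; then
        echo has-fix
    else
        echo predates-fix
    fi
}
```

On Debian, pass the package version (the "Debian 6.1.27-1" part of the kernel banner), not the ABI name such as 6.1.0-9-powerpc64le, which does not carry the upstream patch level.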



Bug#1032104: marked as done (linux: ppc64el iouring corrupted read)

2023-12-03 Thread Debian Bug Tracking System
Your message dated Sun, 03 Dec 2023 20:48:02 +
with message-id 
and subject line Bug#1032104: fixed in linux 6.6.4-1~exp1
has caused the Debian Bug report #1032104,
regarding linux: ppc64el iouring corrupted read
to be marked as done.

--- Begin Message ---
Source: linux
Source-Version: 6.6.4-1~exp1
Done: Bastian Blank 

Debian distribution maintenance software
pp.
Bastian Blank  (supplier of updated linux package)

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA256

Format: 1.8
Date: Sun, 03 Dec 2023 20:57:56 +0100
Source: linux
Architecture: source
Version: 6.6.4-1~exp1
Distribution: experimental
Urgency: medium
Maintainer: Debian Kernel Team 
Changed-By: Bastian Blank 
Closes: 1032104 1037938
Changes:
 linux (6.6.4-1~exp1) experimental; urgency=medium
 .
   * New upstream stable update:
 https://www.kernel.org/pub/linux/kernel/v6.x/ChangeLog-6.6.4
 - nvmet: nul-terminate the NQNs passed in the connect command
   (CVE-2023-6121)
 .
   [ Bastian Blank ]
   * Fix build dependency on rsync.
   * Fix build dependency on kernel-wedge.
   * udeb: Make i2c-hid modules optional.
 .
   [ Timothy 

Bug#1032104:

2023-11-18 Thread Timothy Pearson
Root cause found, merge request here: 
https://salsa.debian.org/kernel-team/linux/-/merge_requests/917



Processed: bug 1032104 is forwarded to https://lore.kernel.org/regressions/19221908.47168775.1699937769845.javamail.zim...@raptorengineeringinc.com/T/#u https://lore.kernel.org/regressions/480932026.4

2023-11-13 Thread Debian Bug Tracking System
Processing commands for cont...@bugs.debian.org:

> forwarded 1032104 
> https://lore.kernel.org/regressions/19221908.47168775.1699937769845.javamail.zim...@raptorengineeringinc.com/T/#u
>  
> https://lore.kernel.org/regressions/480932026.45576726.1699374859845.javamail.zim...@raptorengineeringinc.com/
>  https://lore.kernel.org/all/2b015a34-220e-674e-7301-2cf17ef45...@kernel.dk/
Bug #1032104 [src:linux] linux: ppc64el iouring corrupted read
Changed Bug forwarded-to-address to 
'https://lore.kernel.org/regressions/19221908.47168775.1699937769845.javamail.zim...@raptorengineeringinc.com/T/#u
 
https://lore.kernel.org/regressions/480932026.45576726.1699374859845.javamail.zim...@raptorengineeringinc.com/
 https://lore.kernel.org/all/2b015a34-220e-674e-7301-2cf17ef45...@kernel.dk/' 
from '! 
https://lore.kernel.org/regressions/19221908.47168775.1699937769845.javamail.zim...@raptorengineeringinc.com/T/#u'.
> thanks
Stopping processing here.

Please contact me if you need assistance.
-- 
1032104: https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=1032104
Debian Bug Tracking System
Contact ow...@bugs.debian.org with problems



Bug#1032104: Status update

2023-11-13 Thread Timothy Pearson
I have traced this bug to a missing memory barrier in the powerpc IPI handling 
code.  io_uring uses task_work_add() to schedule I/O worker creation, which in 
turn issues an IPI, and when precise timing conditions are met the inconsistent 
state between the two CPU cores can lead to corruption of userspace data in RAM.

I have sent a patch upstream, and created a merge request for Debian here:

https://salsa.debian.org/kernel-team/linux/-/merge_requests/907



Bug#1032104: Status update

2023-11-07 Thread Timothy Pearson
I have traced this to a regression in the Linux kernel.  The issue appears to 
be a type of data race that is more likely to occur on ppc64el than on other 
architectures, but is also likely to affect other architectures.  The issue 
remains in the latest GIT version of the Linux kernel, and I am working with 
both upstream and our internal resources to try to isolate the root cause and 
generate a fix.

In the interim, disabling the io_uring subsystem will allow mariadb to function 
normally.  Given the nature of the kernel bug, I would recommend disabling 
io_uring entirely in the kernel configuration for affected systems, as other 
applications may also be impacted by the data corruption.
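
Whether a given kernel was built with io_uring can be read from its configuration file. A minimal sketch, assuming the config is available as a plain-text file (Debian normally installs it as /boot/config-$(uname -r)); the function name `io_uring_built_in` is made up for illustration.

```sh
# Hypothetical helper: exit 0 if the kernel config file given as $1
# enables io_uring (CONFIG_IO_URING=y), non-zero otherwise.
io_uring_built_in() {
    grep -q '^CONFIG_IO_URING=y$' "$1"
}
```

For example, `io_uring_built_in "/boot/config-$(uname -r)" && echo "io_uring is enabled"`. On the MariaDB side, the reproduction command in the original report turns native AIO on with --innodb-use-native-aio=1; setting that option to 0 is the application-level counterpart, but it does nothing for other io_uring users, which is why the kernel-level switch is recommended above.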



Bug#1032104: Still present in Bookworm

2023-09-07 Thread Timothy Pearson
We've started hitting this on a busy server after upgrading to Bookworm:

2023-09-07 17:00:31 0 [Warning] You need to use --log-bin to make 
--expire-logs-days or --binlog-expire-logs-seconds work.
2023-09-07 17:00:31 0 [Note] Server socket created on IP: '127.0.0.1'.
2023-09-07 17:00:31 0 [Note] /usr/sbin/mariadbd: ready for connections.
Version: '10.11.3-MariaDB-1'  socket: '/run/mysqld/mysqld.sock'  port: 3306  
Debian 12
2023-09-07 17:00:31 0 [Note] InnoDB: Buffer pool(s) load completed at 230907 
17:00:31
2023-09-07 20:35:06 8630 [ERROR] InnoDB: Database page corruption on disk or a 
failed read of file './database/data_table.ibd' page [page id: space=393, page 
number=1534]. You may have to recover from a backup.
2023-09-07 20:35:06 8630 [Note] InnoDB: Page dump (8192 bytes):
2023-09-07 20:35:06 8630 [Note] InnoDB: 
86c518ee05fe05fd05ff00067ee5a53145bf
2023-09-07 20:35:06 8630 [Note] InnoDB: 
0189000f3eff803d3e030002003a003b
2023-09-07 20:35:06 8630 [Note] InnoDB: 
045a6881
2023-09-07 20:35:06 8630 [Note] InnoDB: 
12d4ae6314accb64e0a8630400b55a4b8f6557796d5eb10dee577503
2023-09-07 20:35:06 8630 [Note] InnoDB: 
c64ee090070c90e9fd7e2014256e140c31c2b84d193051d88fefebbe72b9caaa
2023-09-07 20:35:06 8630 [Note] InnoDB: 
aa36b4985494699461a42852fe41a48ca248f91b5186496a1009f10320bc42d6
2023-09-07 20:35:06 8630 [Note] InnoDB: 
bef79c7b6fd53d7546a73df0b5caf5d53d6b7dafb5f63ed7ba3bdddbd7ceaeb5
2023-09-07 20:35:06 8630 [Note] InnoDB: 
7fbefdb3d5e7b58ff2e2804eeedd3fa674ba789fee9547e9f8f4e4deebc7f4ee
2023-09-07 20:35:06 8630 [Note] InnoDB: 
e2f1bb2fef53393d3a7ef96be5e8f0e5d716f9381d3fb9f7069da6c5c1f26727
2023-09-07 20:35:06 8630 [Note] InnoDB: 
f71eec7ff5de57d2f13bf71e3c3a7aefbdc5e1c3ee95f4b013f28b27ef3f54c6
2023-09-07 20:35:06 8630 [Note] InnoDB: 
a44aac65a0646a544255978c7554b3d05652ff28ff3d12da3fdd5efff9cceaf3
2023-09-07 20:35:06 8630 [Note] InnoDB: 
f9bf6a9f7ff9853ffef2f667ff3bd7b2d4e43c9134ae70a69088490b1b6d0a2a
2023-09-07 20:35:06 8630 [Note] InnoDB: 
7852f8bd97ae751feb1e0c1cfc7c6e0ebe4e3fa483e3270d8074ce09a77dd245
2023-09-07 20:35:06 8630 [Note] InnoDB: 
7b321c284b67524ec12525580ed8b742c631dfbc3f85d9daa0133952d6280ab5
2023-09-07 20:35:06 8630 [Note] InnoDB: 
b028926a9456a6500d8b15e6dbdd7707ccfffbd4f27e1f7fa0c1705e0ac79c95
2023-09-07 20:35:06 8630 [Note] InnoDB: 
cb1c44c93269ca8695676d4b60dec9fa10388efff6fff4b8bfb4fd39e0a72cbc
2023-09-07 20:35:06 8630 [Note] InnoDB: 
2c26daacaab2c20a5d3c809baa6355a29415feadbaffc5dcf8df4a070774daa5
2023-09-07 20:35:06 8630 [Note] InnoDB: 
c3da80d40684a565112db9a26a55896ba254bda7ec8a14b6e4818191d0710eae
2023-09-07 20:35:06 8630 [Note] InnoDB: 
ffd31407316a8a3a3b67854825e69482cd5278958b032f3d077bddf7060e7e39
2023-09-07 20:35:06 8630 [Note] InnoDB: 
3707f7d3c9a3ee0de2c7402254c39102f08aaa751445db2475cd35e8685cd241
2023-09-07 20:35:06 8630 [Note] InnoDB: 
ca5a060a7623c719f8d4c15417184e5c925146061b0c33eace07e3654ac58283
2023-09-07 20:35:06 8630 [Note] InnoDB: 
9e81e7bb3707067e357f172c96c5bc4c3fb28c311738666b63b5a1c860220aa0
2023-09-07 20:35:06 8630 [Note] InnoDB: 
ea628c8f9b06d8c48ca3defbd054de0ba98c660bd12a7c93641f3102285b6f63
2023-09-07 20:35:06 8630 [Note] InnoDB: 
492a8715ea9bdd5b03ea5fcf8e3a1d1086f6c93bb46ce0200b5b4c7b2f7d724e
2023-09-07 20:35:06 8630 [Note] InnoDB: 
3b5652b28dca5a5b83e5b0067e316c1cfbcdff9cca78100163d44aad35a66921
2023-09-07 20:35:06 8630 [Note] InnoDB: 
937c44fb57a525c7487685fd56f79d01fb6fe6c6fe95633a39e9904384371482
2023-09-07 20:35:06 8630 [Note] InnoDB: 
6b4e967c102a455d6d515a4b9b4375687965fd00fe72dc38fadb7f3f95791fdb
2023-09-07 20:35:06 8630 [Note] InnoDB: 
2a11ca05cc561d42162260b21699bc9414d3d9e54df77f73a37f351d3ea4e325
2023-09-07 20:35:06 8630 [Note] InnoDB: 
8050b35656b1e06c6370b93a8995a46b72a8f694f3007c2b641cf38dbda98cd7
2023-09-07 20:35:06 8630 [Note] InnoDB: 
a48553598548c280670cba5c94773561b4b0952bcccf76af0f987ffb34aa7db9
2023-09-07 20:35:06 8630 [Note] InnoDB: 
da83ab46d59443abbb1c83748c1fa4429cbc4a7abbcc2726da07a6d092d1de60
2023-09-07 20:35:06 8630 [Note] InnoDB: 
60586da9266b52a02a5bad83578c939dbdb63f44ce57df8b9372f4f8f0b47bb3
2023-09-07 20:35:06 8630 [Note] InnoDB: 
89b2e5682e3a20d55201371592229a9a527602722e665a36ddaac47743c739d8
2023-09-07 20:35:06 8630 [Note] InnoDB: 
fbc7290ed890434547634df499547186a3935240d529cb7dc637da66ff037373
2023-09-07 20:35:06 8630 [Note] InnoDB: 
d06fa7878b93533a6e304cb04a64af183c44e858283cfcc0d79065117993fa9d
2023-09-07 20:35:06 8630 [Note] InnoDB: 
c071fc2ffc5d8f7bb4cb33a5040e72c56c0f6d7114c8298af83236d5bb7ebedf
2023-09-07 20:35:06 8630 [Note] InnoDB: 
e886dcef7f706efc5f5df069f7e034adbad689e8bc434f1379a36cc88692a4a2
2023-09-07 20:35:06 8630 [Note] InnoDB: 
30dcbca8d10de02f468d237ff1afa7328f0de25451b10a7210b34d356b2f8268
2023-09-07 20:35:06 8630 [Note] InnoDB: 
d97769691c2e56ff879e6ef52f2d092bacf0a86a4c55845ca3774a90909804d9
2023-09-07 20:35:06 8630 [Note] InnoDB: 
5be9cd78f54fb899bbff35957dd02c3

Bug#1032104: linux: ppc64el iouring corrupted read

2023-05-24 Thread Otto Kekäläinen
Hello!

This is not fixed.

I sampled failing autopkgtests for MariaDB at
https://ci.debian.net/packages/m/mariadb/testing/ppc64el/ between May
7th and 22nd. They still have crashes that include the error message
'Database page corruption on disk'. Both failing and passing ones were
running kernel: Linux 6.1.0-9-powerpc64le #1 SMP Debian 6.1.27-1
(2023-05-08). Most passing ones were on ci-worker-ppc64el-03, but also
on -02. The failing ones were on workers -01 and -02. Since -02 had
both failing and passing runs, it indicates that this is not a hardware
issue.

The overall symptoms indicate that this is a software issue that
started on Feb 6th (kernel: Linux 5.10.0-21-powerpc64le) and it
happens sporadically, not on every run, and continues to happen.

- Otto



Bug#1032104: linux: ppc64el iouring corrupted read

2023-05-24 Thread Salvatore Bonaccorso
Hi Otto,

On Sun, Apr 09, 2023 at 03:30:35PM -0700, Otto Kekäläinen wrote:
> > > > Paul Gevers asked if the issues are gone as well with 6.1.12-1
> > > > (or later 6.1.y series versions, which will land in bookworm). That
> > > > would be valuable information to know as well to exclude we do not
> > > > have the issue as well in bookworm.
> > >
> > > Were you able to verify this?
> 
> Yes, and the new kernel did not fix it.
> 
> I have now reviewed all ppc64el autopkgtest runs of src:mariadb at
> https://ci.debian.net/packages/m/mariadb/testing/ppc64el/
> 
> This is still happening on the latest kernel and the latest src:mariadb in
> bookworm. The failing test varies, but they all have in common that
> they error on 'Database page corruption on disk'.
> 
> autopkgtest [20:11:55]: starting date and time: 2023-04-08 20:11:55+
> autopkgtest [20:12:17]: testbed running kernel: Linux
> 6.1.0-7-powerpc64le #1 SMP Debian 6.1.20-1 (2023-03-19)
> autopkgtest [20:12:39]: testing package mariadb version 1:10.11.2-1
> Completed: Failed 6/1021 tests, 99.41% were successful.
> Failing test(s): main.innodb_ext_key main.statistics_upgrade_not_done
> 
> Attached summary of downloading all recent logs and running:
> $ zgrep -e 'starting date' -e 'running kernel' -e 'testing package
> mariadb version' -e 'Completed: ' -e 'Failing test(s)' *.gz | tee
> mariadb-autopkgtest-ppc64el-summary.txt

Are those issues still present with recent kernels? There have again been
enough io_uring-related changes to make it worth repeating the check on
those.

Regards,
Salvatore



Bug#1032104: linux: ppc64el iouring corrupted read

2023-04-09 Thread Otto Kekäläinen
> > > Paul Gevers asked if the issues are gone as well with 6.1.12-1
> > > (or later 6.1.y series versions, which will land in bookworm). That
> > > would be valuable information to know as well to exclude we do not
> > > have the issue as well in bookworm.
> >
> > Were you able to verify this?

Yes, and the new kernel did not fix it.

I have now reviewed all ppc64el autopkgtest runs of src:mariadb at
https://ci.debian.net/packages/m/mariadb/testing/ppc64el/

This is still happening on the latest kernel and the latest src:mariadb in
bookworm. The failing test varies, but they all have in common that
they error on 'Database page corruption on disk'.

autopkgtest [20:11:55]: starting date and time: 2023-04-08 20:11:55+
autopkgtest [20:12:17]: testbed running kernel: Linux
6.1.0-7-powerpc64le #1 SMP Debian 6.1.20-1 (2023-03-19)
autopkgtest [20:12:39]: testing package mariadb version 1:10.11.2-1
Completed: Failed 6/1021 tests, 99.41% were successful.
Failing test(s): main.innodb_ext_key main.statistics_upgrade_not_done

Attached summary of downloading all recent logs and running:
$ zgrep -e 'starting date' -e 'running kernel' -e 'testing package
mariadb version' -e 'Completed: ' -e 'Failing test(s)' *.gz | tee
mariadb-autopkgtest-ppc64el-summary.txt
30542346.log.gz:autopkgtest [16:38:18]: starting date and time: 2023-01-20 
16:38:18+
30542346.log.gz:autopkgtest [16:39:14]: testbed running kernel: Linux 
5.10.0-20-powerpc64le #1 SMP Debian 5.10.158-2 (2022-12-13)
30542346.log.gz:autopkgtest [16:39:30]: testing package mariadb version 
1:10.11.1-1
30542346.log.gz:Completed: All 1016 tests were successful.

31013059.log.gz:autopkgtest [23:16:23]: starting date and time: 2023-02-03 
23:16:23+
31013059.log.gz:autopkgtest [23:16:53]: testbed running kernel: Linux 
5.10.0-20-powerpc64le #1 SMP Debian 5.10.158-2 (2022-12-13)
31013059.log.gz:autopkgtest [23:17:06]: testing package mariadb version 
1:10.11.1-2
31013059.log.gz:Completed: All 1016 tests were successful.

31114152.log.gz:autopkgtest [10:00:31]: starting date and time: 2023-02-06 
10:00:31+
31114152.log.gz:autopkgtest [10:00:57]: testbed running kernel: Linux 
5.10.0-21-powerpc64le #1 SMP Debian 5.10.162-1 (2023-01-21)
31114152.log.gz:autopkgtest [10:01:09]: testing package mariadb version 
1:10.11.1-3
31114152.log.gz:Completed: Failed 2/1016 tests, 99.80% were successful.
31114152.log.gz:Failing test(s): main.xa_prepared_binlog_off 
main.update_use_source

31138628.log.gz:autopkgtest [06:52:36]: starting date and time: 2023-02-07 
06:52:36+
31138628.log.gz:autopkgtest [06:53:04]: testbed running kernel: Linux 
5.10.0-21-powerpc64le #1 SMP Debian 5.10.162-1 (2023-01-21)
31138628.log.gz:autopkgtest [06:53:17]: testing package mariadb version 
1:10.11.1-3
31138628.log.gz:Completed: All 1016 tests were successful.

31204767.log.gz:autopkgtest [12:32:51]: starting date and time: 2023-02-10 
12:32:51+
31204767.log.gz:autopkgtest [12:33:23]: testbed running kernel: Linux 
5.10.0-21-powerpc64le #1 SMP Debian 5.10.162-1 (2023-01-21)
31204767.log.gz:autopkgtest [12:33:46]: testing package mariadb version 
1:10.11.1-4
31204767.log.gz:Completed: Failed 2/1016 tests, 99.80% were successful.
31204767.log.gz:Failing test(s): main.innodb_ext_key main.order_by_innodb

31253808.log.gz:autopkgtest [19:05:34]: starting date and time: 2023-02-11 
19:05:34+
31253808.log.gz:autopkgtest [19:06:15]: testbed running kernel: Linux 
5.10.0-21-powerpc64le #1 SMP Debian 5.10.162-1 (2023-01-21)
31253808.log.gz:autopkgtest [19:06:25]: testing package mariadb version 
1:10.11.1-4
31253808.log.gz:Completed: All 1016 tests were successful.

31452860.log.gz:autopkgtest [09:50:34]: starting date and time: 2023-02-17 
09:50:34+
31452860.log.gz:autopkgtest [09:51:00]: testbed running kernel: Linux 
5.10.0-21-powerpc64le #1 SMP Debian 5.10.162-1 (2023-01-21)
31452860.log.gz:autopkgtest [09:51:21]: testing package mariadb version 
1:10.11.1-5
31452860.log.gz:Completed: Failed 6/1020 tests, 99.41% were successful.
31452860.log.gz:Failing test(s): main.ctype_utf8mb4_innodb 
main.index_merge_innodb

31480673.log.gz:autopkgtest [01:00:30]: starting date and time: 2023-02-18 
01:00:30+
31480673.log.gz:autopkgtest [01:01:00]: testbed running kernel: Linux 
5.10.0-21-powerpc64le #1 SMP Debian 5.10.162-1 (2023-01-21)
31480673.log.gz:autopkgtest [01:01:17]: testing package mariadb version 
1:10.11.2-1
31480673.log.gz:Completed: Failed 6/1021 tests, 99.41% were successful.
31480673.log.gz:Failing test(s): main.xa_prepared_binlog_off main.range_mrr_icp

31509348.log.gz:autopkgtest [05:09:32]: starting date and time: 2023-02-19 
05:09:32+
31509348.log.gz:autopkgtest [05:10:50]: testbed running kernel: Linux 
5.10.0-21-powerpc64le #1 SMP Debian 5.10.162-1 (2023-01-21)
31509348.log.gz:autopkgtest [05:11:06]: testing package mariadb version 
1:10.11.2-1
31509348.log.gz:Completed: Failed 3/1019 tests, 99.71% were successful.
31509348.log.gz:Failing test(s): main.ctype_utf8mb4_innodb

323410
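
The summary above can be tallied per kernel to make the pattern easier to see. A rough sketch, assuming the summary file holds one record per line (the wrapping shown in this archive is from the mail, not the file) and that zgrep was run over several logs so that each line starts with "<log>.gz:". The function name `summarise_by_kernel` is invented for this example.

```sh
# Hypothetical helper: count passing and failing autopkgtest runs per kernel.
# Input: zgrep summary file(s), one record per line, prefixed "<log>.gz:".
summarise_by_kernel() {
    awk '
        # The log file name is everything before the first colon.
        { split($0, parts, ":"); file = parts[1] }

        # Remember which kernel each log file ran on.
        /testbed running kernel: Linux / {
            line = $0
            sub(/.*testbed running kernel: Linux /, "", line)
            split(line, words, " ")
            kernel[file] = words[1]
        }

        # Classify the run by its "Completed:" line.
        /:Completed: / {
            k = (file in kernel) ? kernel[file] : "unknown"
            seen[k] = 1
            if ($0 ~ /All [0-9]+ tests were successful/) { pass[k]++ } else { fail[k]++ }
        }

        END {
            for (k in seen)
                printf "%s pass=%d fail=%d\n", k, pass[k], fail[k]
        }
    ' "$@" | sort
}
```

Run as `summarise_by_kernel mariadb-autopkgtest-ppc64el-summary.txt`. For the nine complete records listed above, this gives 2 passes and 0 failures on 5.10.0-20-powerpc64le, against 2 passes and 5 failures on 5.10.0-21-powerpc64le.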

Bug#1032104: linux: ppc64el iouring corrupted read

2023-04-09 Thread Paul Gevers

Hi Otto,

On 09-04-2023 03:54, Otto Kekäläinen wrote:
> > > Paul Gevers asked if the issues are gone as well with 6.1.12-1
> > > (or later 6.1.y series versions, which will land in bookworm). That
> > > would be valuable information to know as well to exclude we do not
> > > have the issue as well in bookworm.
> >
> > Were you able to verify this?
>
> No, not yet.
>
> I have not done new uploads to experimental after the one I mentioned
> and linked above from March 18th.

I don't understand this point, so I wonder if you understood my
question. Maybe you did, but in my view no new uploads are needed to
answer the bookworm question.

> The builds for unstable are passing because I forced the tests to run
> with regular fsync instead of native I/O in
> https://salsa.debian.org/mariadb-team/mariadb-server/-/commit/fc1358087b39ac6520420c7bbae2e536bc86748d.
> I will test this again later but right now I don't want to do any
> extra uploads as the package is pending unblock and inclusion in
> Bullseye (Bug#1033811) and I don't want one single minor issue to
> jeopardize getting fixes for multiple major issues forward.


My point was that I upgraded the ppc64el hosts where ci.debian.net runs 
the autopkgtests (so *not* the Debian build infrastructure). Since that 
upgrade, all tests on ci.debian.net *in every suite* have been using the 
bookworm (6.1.y) kernel.


E.g. in unstable, MariaDB 1:10.11.2-1 (so before the "Prevent 
mariadb-test-run from using native I/O on ppc64el and s390x due to Linux 
kernel bug" change) passed on 2023-03-26 10:39 but failed on the same 
day at 14:40. Are any of the failures on ppc64el before 1:10.11.2-2 and 
after 2023-03-09 from the same kernel issue we're discussing here (in 
which case the kernel still needs fixing in bookworm)? Or are all the 
failures in that time span from something else, so that we can conclude 
that the kernel *probably* (no proof of course) got fixed between the 
version in bullseye and the version in bookworm?


Paul




Bug#1032104: linux: ppc64el iouring corrupted read

2023-04-08 Thread Otto Kekäläinen
> > On Sat, Mar 18, 2023 at 11:19:29PM -0700, Otto Kekäläinen wrote:
> > > Any updates on this one?
> > >
> > > I am still seeing the main.index_merge_innodb failure in
> > > https://buildd.debian.org/status/fetch.php?pkg=mariadb&arch=ppc64el&ver=1%3A10.11.2-2%7Eexp1&stamp=1678728871&raw=0
> > > and rebuild 
> > > https://buildd.debian.org/status/fetch.php?pkg=mariadb&arch=ppc64el&ver=1%3A10.11.2-2%7Eexp1&stamp=1679174850&raw=0.
> > >
> > > Logs show: Kernel: Linux 5.10.0-21-powerpc64le #1 SMP Debian
> > > 5.10.162-1 (2023-01-21) ppc64el (ppc64le)
> >
> > Remember that with the 5.10.162 upstream version the io_uring code was
> > rebased to the 5.15-stable one. So it is likely, and it matches the
> > version ranges, that the regression was introduced with these
> > particular changes. Ideally someone with access to the given
> > architecture, can verify that the issue is gone with the current
> > 5.10.175 upstream (where there were several followup fixes, in
> > particular e.g. a similar one for s390x), and if not, reports the
> > problem to upstream.
> >
> > Paul Gevers asked if the issues are gone as well with 6.1.12-1
> > (or later 6.1.y series versions, which will land in bookworm). That
> > would be valuable information to know as well to exclude we do not
> > have the issue as well in bookworm.
>
> Were you able to verify this?

No, not yet.

I have not done new uploads to experimental after the one I mentioned
and linked above from March 18th.

The builds for unstable are passing because I forced the tests to run
with regular fsync instead of native I/O in
https://salsa.debian.org/mariadb-team/mariadb-server/-/commit/fc1358087b39ac6520420c7bbae2e536bc86748d.
I will test this again later but right now I don't want to do any
extra uploads as the package is pending unblock and inclusion in
Bullseye (Bug#1033811) and I don't want one single minor issue to
jeopardize getting fixes for multiple major issues forward.



Processed: Re: Bug#1032104: linux: ppc64el iouring corrupted read

2023-04-06 Thread Debian Bug Tracking System
Processing control commands:

> tags -1 + moreinfo
Bug #1032104 [src:linux] linux: ppc64el iouring corrupted read
Added tag(s) moreinfo.

-- 
1032104: https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=1032104
Debian Bug Tracking System
Contact ow...@bugs.debian.org with problems



Bug#1032104: linux: ppc64el iouring corrupted read

2023-04-06 Thread Salvatore Bonaccorso
Control: tags -1 + moreinfo

Hi
On Sun, Mar 19, 2023 at 05:02:19PM +0100, Salvatore Bonaccorso wrote:
> Hi,
> 
> On Sat, Mar 18, 2023 at 11:19:29PM -0700, Otto Kekäläinen wrote:
> > Any updates on this one?
> > 
> > I am still seeing the main.index_merge_innodb failure in
> > https://buildd.debian.org/status/fetch.php?pkg=mariadb&arch=ppc64el&ver=1%3A10.11.2-2%7Eexp1&stamp=1678728871&raw=0
> > and rebuild 
> > https://buildd.debian.org/status/fetch.php?pkg=mariadb&arch=ppc64el&ver=1%3A10.11.2-2%7Eexp1&stamp=1679174850&raw=0.
> > 
> > Logs show: Kernel: Linux 5.10.0-21-powerpc64le #1 SMP Debian
> > 5.10.162-1 (2023-01-21) ppc64el (ppc64le)
> 
> Remember that with the 5.10.162 upstream version the io_uring code was
> rebased to the 5.15-stable one. So it is likely, and it matches the
> version ranges, that the regression was introduced with these
> particular changes. Ideally someone with access to the given
> architecture, can verify that the issue is gone with the current
> 5.10.175 upstream (where there were several followup fixes, in
> particular e.g. a similar one for s390x), and if not, reports the
> problem to upstream.
> 
> Paul Gevers asked if the issues are gone as well with 6.1.12-1
> (or later 6.1.y series versions, which will land in bookworm). That
> would be valuable information to know as well to exclude we do not
> have the issue as well in bookworm.

Were you able to verify this?

Regards,
Salvatore



Bug#1032104: linux: ppc64el iouring corrupted read

2023-03-19 Thread Salvatore Bonaccorso
Hi,

On Sat, Mar 18, 2023 at 11:19:29PM -0700, Otto Kekäläinen wrote:
> Any updates on this one?
> 
> I am still seeing the main.index_merge_innodb failure in
> https://buildd.debian.org/status/fetch.php?pkg=mariadb&arch=ppc64el&ver=1%3A10.11.2-2%7Eexp1&stamp=1678728871&raw=0
> and rebuild 
> https://buildd.debian.org/status/fetch.php?pkg=mariadb&arch=ppc64el&ver=1%3A10.11.2-2%7Eexp1&stamp=1679174850&raw=0.
> 
> Logs show: Kernel: Linux 5.10.0-21-powerpc64le #1 SMP Debian
> 5.10.162-1 (2023-01-21) ppc64el (ppc64le)

Remember that with the 5.10.162 upstream version the io_uring code was
rebased to the 5.15-stable one. So it is likely, and it matches the
version ranges, that the regression was introduced with these
particular changes. Ideally someone with access to the given
architecture, can verify that the issue is gone with the current
5.10.175 upstream (where there were several followup fixes, in
particular e.g. a similar one for s390x), and if not, reports the
problem to upstream.

Paul Gevers asked if the issues are gone as well with 6.1.12-1
(or later 6.1.y series versions, which will land in bookworm). That
would be valuable information to know as well to exclude we do not
have the issue as well in bookworm.
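
The version reasoning above can be written down as a small check. This is a minimal Python sketch, not an authoritative test: the lower bound 5.10.162 is the io_uring rebase mentioned above, and the upper bound 5.10.205 is only the Debian upload that closed this bug (5.10.205-1), not a confirmed upstream fix point. The function names are hypothetical.

```python
import re

# Assumptions taken from this bug's history, not from upstream changelogs:
# 5.10.162 carries the io_uring rebase; 5.10.205-1 closed the Debian bug.
REBASED_IN = (5, 10, 162)
FIXED_IN = (5, 10, 205)

def upstream_version(uname_v):
    """Extract (major, minor, patch) from a `uname -v` string such as
    '#1 SMP Debian 5.10.162-1 (2023-01-21)'; return None if absent."""
    match = re.search(r"Debian (\d+)\.(\d+)\.(\d+)", uname_v)
    return tuple(int(part) for part in match.groups()) if match else None

def in_suspect_range(uname_v):
    """True only for 5.10.y kernels between the rebase and the fix."""
    version = upstream_version(uname_v)
    if version is None or version[:2] != (5, 10):
        # Other series are simply outside what this sketch covers;
        # False here does not prove they are unaffected.
        return False
    return REBASED_IN <= version < FIXED_IN

print(in_suspect_range("#1 SMP Debian 5.10.162-1 (2023-01-21)"))
```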

Regards,
Salvatore



Bug#1032104: linux: ppc64el iouring corrupted read

2023-03-18 Thread Otto Kekäläinen
Any updates on this one?

I am still seeing the main.index_merge_innodb failure in
https://buildd.debian.org/status/fetch.php?pkg=mariadb&arch=ppc64el&ver=1%3A10.11.2-2%7Eexp1&stamp=1678728871&raw=0
and rebuild 
https://buildd.debian.org/status/fetch.php?pkg=mariadb&arch=ppc64el&ver=1%3A10.11.2-2%7Eexp1&stamp=1679174850&raw=0.

Logs show: Kernel: Linux 5.10.0-21-powerpc64le #1 SMP Debian
5.10.162-1 (2023-01-21) ppc64el (ppc64le)



Bug#1032104: linux: ppc64el iouring corrupted read

2023-03-18 Thread Paul Gevers

On Mon, 6 Mar 2023 13:25:36 +1100 Daniel Black  wrote:

> Since reverting to linux-image-5.10.0-20 we've been free of the same errors.


On ci.debian.net I upgraded all ppc64el hosts to bookworm on 2023-03-09.

debian@ci-worker-ppc64el-04:~$ uname -a
Linux ci-worker-ppc64el-04 6.1.0-5-powerpc64le #1 SMP Debian 6.1.12-1 (2023-02-15) ppc64le GNU/Linux


Can you check whether the errors are still the same? (Yes, there are 
still intermittent failures.)


Paul




Bug#1032104: linux: ppc64el iouring corrupted read

2023-03-05 Thread Daniel Black
Since reverting to linux-image-5.10.0-20 we've been free of the same errors.



Bug#1032104: linux: ppc64el iouring corrupted read

2023-02-27 Thread Daniel Black
On Tue, Feb 28, 2023 at 5:24 PM Diederik de Haas  wrote:
>
> On Tuesday, 28 February 2023 04:13:18 CET Daniel Black wrote:
> > Source: linux
> > Version: 5.10.0-21-powerpc64le
> > Severity: grave
> > Justification: causes non-serious data loss
> > X-Debbugs-Cc: dan...@mariadb.org
> >
> > From https://jira.mariadb.org/browse/MDEV-30728
> >
> > A number of MariaDB's mtr tests depend on correct kernel operation.
> >
> > As observed in these tests, there is a ~1/5 chance the
> > encryption.innodb_encryption test will read zeros on the latter part of
> > the 16k pages that InnoDB uses by default.
> >
> > This affects MariaDB-10.6+ packages where there is a liburing in the
> > distribution.
> >
> > I tested on tmpfs. This is a different fault from bug #1020831 as:
> > * there is no iouring error, just a bunch of zeros where data was
> >   expected.
> > * this is ppc64le only.
>
> What was the last kernel where this problem did NOT occur?

2022-12-19 03:55:34 install linux-image-5.10.0-20-powerpc64le:ppc64el 5.10.158-2

no similar errors between ^ and ..

2023-01-24 03:19:59 install linux-image-5.10.0-21-powerpc64le:ppc64el 5.10.162-1
(no other linux image installs in between these two)

The first failure was found ~ Feb 4 2023. I'm unsure when the machine was
rebooted into this kernel, but it does appear to be the latest revision.
https://buildbot.mariadb.org/#/builders/318/builds/10008

log example https://ci.mariadb.org/32263/logs/ppc64le-debian-11/mysqld.1.err.7
(search for CURRENT_TEST: encryption.innodb_encryption) - contains hex
dump of page
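
The install entries above come from the dpkg log, and the same good/bad window can be pulled out mechanically. This is a minimal Python sketch, assuming log lines laid out as date, time, action, package and versions, with the new version as the last field. The function name kernel_installs is hypothetical.

```python
def kernel_installs(lines):
    """Return sorted (timestamp, package, version) tuples for
    kernel image installs found in dpkg-log-style lines."""
    installs = []
    for line in lines:
        fields = line.split()
        # Expected: date, time, action, package[:arch], ..., new version
        if len(fields) < 5 or fields[2] != "install":
            continue
        if not fields[3].startswith("linux-image-"):
            continue
        installs.append((f"{fields[0]} {fields[1]}", fields[3], fields[-1]))
    return sorted(installs)

sample = [
    "2023-01-24 03:19:59 install linux-image-5.10.0-21-powerpc64le:ppc64el 5.10.162-1",
    "2022-12-19 03:55:34 install linux-image-5.10.0-20-powerpc64le:ppc64el 5.10.158-2",
]
for stamp, package, version in kernel_installs(sample):
    print(stamp, package, version)
```

Note that an install date is not a boot date, so the running kernel still has to be confirmed separately, e.g. with uname.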

> It's probably needed to pinpoint the (upstream) commit that caused this error/
> issue and the best start is normally finding the closest range with Debian
> kernel releases where it did not and did occur.
>
> > -- System Information:
> > Debian Release: bullseye
> >   APT prefers jammy-updates
> >   APT policy: (500, 'jammy-updates'), (500, 'jammy-security'), (500,
> > 'jammy'), (100, 'jammy-backports')
> > Architecture: ppc64el (ppc64le)
> >
> > Kernel: Linux 5.10.0-21-powerpc64le (SMP w/128 CPU threads)
> > Init: unable to detect
>
> Why is there no 'bullseye' in APT policy's output?
> Mixing distributions (aka FrankenDebian) isn't recommended, but seeing no
> bullseye in there is odd, especially since the kernel version very much does
> look like Debian.

Apologies for the FrankenDebian look. This report was filed from a jammy
container using jammy's reportbug, with bullseye edited (badly) into the
system info.



Bug#1032104: linux: ppc64el iouring corrupted read

2023-02-27 Thread Diederik de Haas
On Tuesday, 28 February 2023 04:13:18 CET Daniel Black wrote:
> Source: linux
> Version: 5.10.0-21-powerpc64le
> Severity: grave
> Justification: causes non-serious data loss
> X-Debbugs-Cc: dan...@mariadb.org
> 
> From https://jira.mariadb.org/browse/MDEV-30728
> 
> A number of MariaDB's mtr tests depend on correct kernel operation.
> 
> As observed in these tests, there is a ~1/5 chance the
> encryption.innodb_encryption test will read zeros on the latter part of
> the 16k pages that InnoDB uses by default.
> 
> This affects MariaDB-10.6+ packages where there is a liburing in the
> distribution.
> 
> I tested on tmpfs. This is a different fault from bug #1020831 as:
> * there is no iouring error, just a bunch of zeros where data was
>   expected.
> * this is ppc64le only.

What was the last kernel where this problem did NOT occur?
It's probably needed to pinpoint the (upstream) commit that caused this error/
issue and the best start is normally finding the closest range with Debian 
kernel releases where it did not and did occur.

> -- System Information:
> Debian Release: bullseye
>   APT prefers jammy-updates
>   APT policy: (500, 'jammy-updates'), (500, 'jammy-security'), (500,
> 'jammy'), (100, 'jammy-backports')
> Architecture: ppc64el (ppc64le)
> 
> Kernel: Linux 5.10.0-21-powerpc64le (SMP w/128 CPU threads)
> Init: unable to detect

Why is there no 'bullseye' in APT policy's output?
Mixing distributions (aka FrankenDebian) isn't recommended, but seeing no 
bullseye in there is odd, especially since the kernel version very much does 
look like Debian.



Bug#1032104: linux: ppc64el iouring corrupted read

2023-02-27 Thread Daniel Black
Source: linux
Version: 5.10.0-21-powerpc64le
Severity: grave
Justification: causes non-serious data loss
X-Debbugs-Cc: dan...@mariadb.org

Dear Maintainer,

From https://jira.mariadb.org/browse/MDEV-30728

A number of MariaDB's mtr tests depend on correct kernel operation.

As observed in these tests, there is a ~1/5 chance the
encryption.innodb_encryption test will read zeros on the latter part of
the 16k pages that InnoDB uses by default.

This affects MariaDB-10.6+ packages where there is a liburing in the
distribution.

This has been observed in the CI of Debian
(https://ci.debian.net/packages/m/mariadb/testing/ppc64el/)
and upstreams https://buildbot.mariadb.org/#/builders/318.
The one ppc64le worker that has the Debian 5.10.0-21 kernel,
the same as the Debian CI, has the prefix ppc64le-db-bbw1-*.

Test faults occur on all MariaDB 10.6+ builds in containers on this kernel.
There are no faults on non-ppc64le or RHEL7/8 based ppc64le kernels.

To reproduce:

apt-get install mariadb-test
cd /usr/share/mysql/mysql-test
./mtr --mysqld=--innodb-flush-method=fsync --mysqld=--innodb-use-native-aio=1 \
  --vardir=/var/lib/mysql --force encryption.innodb_encryption,innodb,undo0 \
  --repeat=12

A test will frequently fail.

2023-02-28  1:41:01 0 [ERROR] InnoDB: Database page corruption on disk or a 
failed read of file './ibdata1' page [page id: space=0, page number=282]. You 
may have to recover from a backup.

(the page number isn't predictable)

The complete mtr error log of mariadb server is $PWD/var/log/mysqld.1.err

I tested on tmpfs. This is a different fault from bug #1020831 as:
* there is no iouring error, just a bunch of zeros where data was
  expected.
* this is ppc64le only.
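
To check whether the zeros are in the file itself or only in what the read returned, the data file can be scanned through the ordinary read path. This is a minimal Python sketch and only a heuristic: it assumes 16 KiB pages as described above, the 4096-byte tail length is an arbitrary choice, and it does not replace InnoDB's own page checksums. The function name zero_tail_pages is hypothetical.

```python
PAGE_SIZE = 16 * 1024  # default InnoDB page size, as stated in the report

def zero_tail_pages(path, page_size=PAGE_SIZE, tail=4096):
    """Yield page numbers whose last `tail` bytes are all zero
    while the earlier part of the page contains some data."""
    zeros = bytes(tail)
    with open(path, "rb") as handle:
        page_number = 0
        while True:
            page = handle.read(page_size)
            if len(page) < page_size:
                break  # end of file or trailing partial page
            head, end = page[:-tail], page[-tail:]
            # Fully zero pages are skipped: they may simply be unused.
            if end == zeros and any(head):
                yield page_number
            page_number += 1
```

If a page reported as corrupt by the server looks intact when read this way, that points at the io_uring read path rather than at the data in the file.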

Note, more serious faults exist on overlayfs (MDEV-28751) and remote
filesystems so sticking to local xfs, ext4, btrfs is recommended.

-- System Information:
Debian Release: bullseye
  APT prefers jammy-updates
  APT policy: (500, 'jammy-updates'), (500, 'jammy-security'), (500, 'jammy'), 
(100, 'jammy-backports')
Architecture: ppc64el (ppc64le)

Kernel: Linux 5.10.0-21-powerpc64le (SMP w/128 CPU threads)
Locale: LANG=C.UTF-8, LC_CTYPE=C.UTF-8 (charmap=UTF-8), LANGUAGE not set
Shell: /bin/sh linked to /usr/bin/dash
Init: unable to detect