Bug#1040416: linux-image-6.1.0-9-amd64: Under heavy load Debian V12 and V11 causes data corruption on XFS filesystems.
Hi

On Tue, Nov 07, 2023 at 08:33:58PM +0100, Diederik de Haas wrote:
> Control: found -1 6.1~rc3-1~exp1
> Control: found -1 6.1.55-1
>
> On Saturday, 4 November 2023 20:35:43 CET Jose M Calhariz wrote:
> > > Ok. Please test (when you have time) 6.1.55-1.
> >
> > Fail : Linux afs31 6.1.0-0-amd64 #1 SMP PREEMPT_DYNAMIC Debian
> > 6.1~rc3-1~exp1 (2022-11-02) x86_64 GNU/Linux
> >
> > Fail : Linux afs31 6.1.0-13-amd64 #1 SMP PREEMPT_DYNAMIC Debian 6.1.55-1
> > (2023-09-29) x86_64 GNU/Linux
> >
> > Done. I tested even the first 6.1 on Debian. Both of them failed.
>
> Thanks, updated metadata accordingly.
> So now we know it's indeed present in the whole 6.1 series.
>
> > > Unfortunately there isn't a 6.2 kernel uploaded to the Debian archive,
> > > so none is available on snapshot.d.o, but testing 6.3.1-1~exp1 should
> > > be useful.
>
> Please test with 6.3.1-1~exp1 to make sure it was fixed then (too).
>
> Unfortunately, the commit list between 6.1 and 6.3.1 is quite large:
> me@pc:~/dev/kernel.org/linux$ git log --oneline v6.1..v6.3.1 -- fs/xfs | wc -l
> 159
>
> If that list were small, I could have suggested trying to backport a
> couple of patches, but that avenue seems rather pointless in this case.
>
> It's probably also useful to verify whether it's present in the whole
> 5.10 series, which should give (even) more data points.
>
> I think the next step should be to forward this bug report to the upstream
> mailing list at linux-...@vger.kernel.org

I do not follow the linux-xfs mailing list closely, but I think other people
have already reported problems with 6.1 and are working on identifying the
patch and testing a backport to 6.1.

Kind regards
Jose M Calhariz

--
--
Egotist, n. A fellow more interested in himself than in me.
  -- Ambrose Bierce
Processed: Re: Bug#1040416: linux-image-6.1.0-9-amd64: Under heavy load Debian V12 and V11 causes data corruption on XFS filesystems.
Processing control commands:

> found -1 6.1~rc3-1~exp1
Bug #1040416 [src:linux] linux-image-6.1.0-9-amd64: Under heavy load Debian V12 and V11 causes data corruption on XFS filesystems.
Marked as found in versions linux/6.1~rc3-1~exp1.
> found -1 6.1.55-1
Bug #1040416 [src:linux] linux-image-6.1.0-9-amd64: Under heavy load Debian V12 and V11 causes data corruption on XFS filesystems.
Marked as found in versions linux/6.1.55-1.

--
1040416: https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=1040416
Debian Bug Tracking System
Contact ow...@bugs.debian.org with problems
Bug#1040416: linux-image-6.1.0-9-amd64: Under heavy load Debian V12 and V11 causes data corruption on XFS filesystems.
Control: found -1 6.1~rc3-1~exp1
Control: found -1 6.1.55-1

On Saturday, 4 November 2023 20:35:43 CET Jose M Calhariz wrote:
> > Ok. Please test (when you have time) 6.1.55-1.
>
> Fail : Linux afs31 6.1.0-0-amd64 #1 SMP PREEMPT_DYNAMIC Debian
> 6.1~rc3-1~exp1 (2022-11-02) x86_64 GNU/Linux
>
> Fail : Linux afs31 6.1.0-13-amd64 #1 SMP PREEMPT_DYNAMIC Debian 6.1.55-1
> (2023-09-29) x86_64 GNU/Linux
>
> Done. I tested even the first 6.1 on Debian. Both of them failed.

Thanks, updated metadata accordingly.
So now we know it's indeed present in the whole 6.1 series.

> > Unfortunately there isn't a 6.2 kernel uploaded to the Debian archive,
> > so none is available on snapshot.d.o, but testing 6.3.1-1~exp1 should
> > be useful.

Please test with 6.3.1-1~exp1 to make sure it was fixed then (too).

Unfortunately, the commit list between 6.1 and 6.3.1 is quite large:
me@pc:~/dev/kernel.org/linux$ git log --oneline v6.1..v6.3.1 -- fs/xfs | wc -l
159

If that list were small, I could have suggested trying to backport a couple
of patches, but that avenue seems rather pointless in this case.

It's probably also useful to verify whether it's present in the whole
5.10 series, which should give (even) more data points.

I think the next step should be to forward this bug report to the upstream
mailing list at linux-...@vger.kernel.org
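As a rough aside on the 159 commits counted above: a `git bisect` halves the candidate range at every step, so the number of kernels to build and test grows only logarithmically with the commit count. The sketch below works through that arithmetic. It assumes an idealized linear history (a real bisect over merge-heavy history can need a step or two more), and `bisect_steps` is a hypothetical helper name, not a git command.

```python
def bisect_steps(commit_count: int) -> int:
    """Approximate number of build-and-test rounds a binary search needs.

    Assumes an idealized linear history of `commit_count` candidate commits.
    Uses integer arithmetic: (n - 1).bit_length() equals ceil(log2(n)).
    """
    if commit_count < 1:
        raise ValueError("commit_count must be at least 1")
    return (commit_count - 1).bit_length()


# 159 is the fs/xfs commit count quoted in the message above.
print(bisect_steps(159))  # prints 8
```

So even though 159 patches are too many to backport by hand, a bisect limited to those commits would need on the order of eight test kernels.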
Bug#1040416: linux-image-6.1.0-9-amd64: Under heavy load Debian V12 and V11 causes data corruption on XFS filesystems.
Hi

On Thu, Nov 02, 2023 at 07:40:38PM +0100, Diederik de Haas wrote:
> On Thursday, 2 November 2023 18:03:25 CET Jose M Calhariz wrote:
> > On Thu, Nov 02, 2023 at 03:37:39PM +0100, Diederik de Haas wrote:
> > > On Wednesday, 5 July 2023 19:07:15 CET Jose M Calhariz wrote:
> > > > Package: src:linux
> > > > Version: 6.1.27-1
> > >
> > > Can you try with the latest version in the 6.1.x series to see if the
> > > problem is still there?
> >
> > As I need to set up the production servers ASAP, I do not know if I
> > will have time in the next few days. It works with backports kernels.
>
> No problem.
>
> > The latest kernels I tested were:
> > Fail : Linux afs31 6.1.0-10-amd64 #1 SMP PREEMPT_DYNAMIC Debian 6.1.37-1
> > (2023-07-03) x86_64 GNU/Linux
>
> Ok. Please test (when you have time) 6.1.55-1.

Fail : Linux afs31 6.1.0-0-amd64 #1 SMP PREEMPT_DYNAMIC Debian
6.1~rc3-1~exp1 (2022-11-02) x86_64 GNU/Linux

Fail : Linux afs31 6.1.0-13-amd64 #1 SMP PREEMPT_DYNAMIC Debian 6.1.55-1
(2023-09-29) x86_64 GNU/Linux

Done. I tested even the first 6.1 on Debian. Both of them failed.

> Also verify whether it's present in 6.1~rc3-1~exp1 to make sure it's
> present in the whole 6.1 series.
> Use https://snapshot.debian.org/binary/linux-image-amd64/ to get it/them.
>
> If the bug is NOT present in either the latest or the first, then try other
> versions till you find the last one that works and the first one that fails.
>
> > OK : Linux afs31 6.4.0-0.deb12.2-amd64 #1 SMP PREEMPT_DYNAMIC Debian
> > 6.4.4-3~bpo12+1 (2023-08-08) x86_64 GNU/Linux
>
> It was fixed in 6.3.7-1, so it was expected that later versions also work.
> But let's ignore bpo as it likely won't provide useful data points.
>
> Unfortunately there isn't a 6.2 kernel uploaded to the Debian archive, so
> none is available on snapshot.d.o, but testing 6.3.1-1~exp1 should be useful.
>
> > The bug is present on Debian V11 too. So it is an old bug, with fixes in
> > kernel 6.2 rc something.
>
> I'd recommend focusing on the 6.1 series for now.
> If testing with 5.10 turns out to be useful at a later point, we can do
> that then.

Kind regards
Jose M Calhariz

--
--
The happy life, my God, consists in rejoicing in You, of You and for You
Processed: Re: Bug#1040416: linux-image-6.1.0-9-amd64: Under heavy load Debian V12 and V11 causes data corruption on XFS filesystems.
Processing control commands:

> found -1 6.1.37-1
Bug #1040416 [src:linux] linux-image-6.1.0-9-amd64: Under heavy load Debian V12 and V11 causes data corruption on XFS filesystems.
Marked as found in versions linux/6.1.37-1.

--
1040416: https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=1040416
Debian Bug Tracking System
Contact ow...@bugs.debian.org with problems
Bug#1040416: linux-image-6.1.0-9-amd64: Under heavy load Debian V12 and V11 causes data corruption on XFS filesystems.
Control: found -1 6.1.37-1

On Thursday, 2 November 2023 18:03:25 CET Jose M Calhariz wrote:
> On Thu, Nov 02, 2023 at 03:37:39PM +0100, Diederik de Haas wrote:
> > On Wednesday, 5 July 2023 19:07:15 CET Jose M Calhariz wrote:
> > > Package: src:linux
> > > Version: 6.1.27-1
> >
> > Can you try with the latest version in the 6.1.x series to see if the
> > problem is still there?
>
> As I need to set up the production servers ASAP, I do not know if I
> will have time in the next few days. It works with backports kernels.

No problem.

> The latest kernels I tested were:
> Fail : Linux afs31 6.1.0-10-amd64 #1 SMP PREEMPT_DYNAMIC Debian 6.1.37-1
> (2023-07-03) x86_64 GNU/Linux

Ok. Please test (when you have time) 6.1.55-1.
Also verify whether it's present in 6.1~rc3-1~exp1 to make sure it's present
in the whole 6.1 series.
Use https://snapshot.debian.org/binary/linux-image-amd64/ to get it/them.

If the bug is NOT present in either the latest or the first, then try other
versions till you find the last one that works and the first one that fails.

> OK : Linux afs31 6.4.0-0.deb12.2-amd64 #1 SMP PREEMPT_DYNAMIC Debian
> 6.4.4-3~bpo12+1 (2023-08-08) x86_64 GNU/Linux

It was fixed in 6.3.7-1, so it was expected that later versions also work.
But let's ignore bpo as it likely won't provide useful data points.

Unfortunately there isn't a 6.2 kernel uploaded to the Debian archive, so
none is available on snapshot.d.o, but testing 6.3.1-1~exp1 should be useful.

> The bug is present on Debian V11 too. So it is an old bug, with fixes in
> kernel 6.2 rc something.

I'd recommend focusing on the 6.1 series for now.
If testing with 5.10 turns out to be useful at a later point, we can do
that then.
Bug#1040416: linux-image-6.1.0-9-amd64: Under heavy load Debian V12 and V11 causes data corruption on XFS filesystems.
On Thu, Nov 02, 2023 at 03:37:39PM +0100, Diederik de Haas wrote:
> Control: tag -1 moreinfo
>
> On Wednesday, 5 July 2023 19:07:15 CET Jose M Calhariz wrote:
> > Package: src:linux
> > Version: 6.1.27-1
>
> Can you try with the latest version in the 6.1.x series to see if the
> problem is still there?

As I need to set up the production servers ASAP, I do not know if I
will have time in the next few days. It works with backports kernels.

The latest kernels I tested were:

Fail : Linux afs31 6.1.0-10-amd64 #1 SMP PREEMPT_DYNAMIC Debian 6.1.37-1
(2023-07-03) x86_64 GNU/Linux

OK : Linux afs31 6.4.0-0.deb12.2-amd64 #1 SMP PREEMPT_DYNAMIC Debian
6.4.4-3~bpo12+1 (2023-08-08) x86_64 GNU/Linux

> > On this hardware I have been chasing data corruption for several months
> > on Debian V11 and Debian V12. Now that it was pointed out to me that the
> > Linux kernel had some XFS problems that were solved in a later 6.3
> > kernel, I can reproduce the problem.
> >
> > It seems the problem went away with the current Debian testing kernel:
> >
> > ii  linux-image-6.3.0-1-amd64  6.3.7-1  amd64  Linux 6.3 for 64-bit PCs (signed)
> >
> > Is there anyone willing to backport the XFS fixes into
> > linux-image-6.1.0 and linux-image-5.10.0?
>
> If the problem is still present in the latest 6.1 kernel, then you need to
> find out which patch(es) actually fix the problem.
> The easiest way to start with that is to find the last kernel which exhibits
> the issue and then the first one where it is fixed.
> https://snapshot.debian.org/binary/linux-image-amd64/ should help
> with that.

The bug is present on Debian V11 too. So it is an old bug, with fixes in
kernel 6.2 rc something.

> When the range has been narrowed, a `git bisect` should identify the specific
> commit(s) which fix the issue.
> https://wiki.debian.org/DebianKernel/GitBisect should help with that.
>
> When that/those have been identified, it should be reported to the upstream
> kernel so that they can incorporate those fixes in their LTS kernel(s),
> which Debian will then pick up automatically.
>
> HTH

--
--
The happy life, my God, consists in rejoicing in You, of You and for You
Processed: Re: Bug#1040416: linux-image-6.1.0-9-amd64: Under heavy load Debian V12 and V11 causes data corruption on XFS filesystems.
Processing control commands:

> tag -1 moreinfo
Bug #1040416 [src:linux] linux-image-6.1.0-9-amd64: Under heavy load Debian V12 and V11 causes data corruption on XFS filesystems.
Added tag(s) moreinfo.

--
1040416: https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=1040416
Debian Bug Tracking System
Contact ow...@bugs.debian.org with problems
Bug#1040416: linux-image-6.1.0-9-amd64: Under heavy load Debian V12 and V11 causes data corruption on XFS filesystems.
Control: tag -1 moreinfo

On Wednesday, 5 July 2023 19:07:15 CET Jose M Calhariz wrote:
> Package: src:linux
> Version: 6.1.27-1

Can you try with the latest version in the 6.1.x series to see if the problem
is still there?

> On this hardware I have been chasing data corruption for several months
> on Debian V11 and Debian V12. Now that it was pointed out to me that the
> Linux kernel had some XFS problems that were solved in a later 6.3
> kernel, I can reproduce the problem.
>
> It seems the problem went away with the current Debian testing kernel:
>
> ii  linux-image-6.3.0-1-amd64  6.3.7-1  amd64  Linux 6.3 for 64-bit PCs (signed)
>
> Is there anyone willing to backport the XFS fixes into
> linux-image-6.1.0 and linux-image-5.10.0?

If the problem is still present in the latest 6.1 kernel, then you need to
find out which patch(es) actually fix the problem.
The easiest way to start with that is to find the last kernel which exhibits
the issue and then the first one where it is fixed.
https://snapshot.debian.org/binary/linux-image-amd64/ should help with that.

When the range has been narrowed, a `git bisect` should identify the specific
commit(s) which fix the issue.
https://wiki.debian.org/DebianKernel/GitBisect should help with that.

When that/those have been identified, it should be reported to the upstream
kernel so that they can incorporate those fixes in their LTS kernel(s), which
Debian will then pick up automatically.

HTH
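The narrowing step described above (find the last kernel that exhibits the issue and the first one where it is fixed) can be sketched as a small script over the test results reported in this thread. This is only an illustration under stated assumptions: the version list is ordered by hand from oldest to newest rather than with Debian's version-comparison rules, and `narrow_range` is a hypothetical helper name.

```python
# Results reported in this thread, written out oldest to newest.
# The ordering is done by hand; it is an assumption of this sketch,
# not computed with Debian's version-comparison rules.
RESULTS = [
    ("6.1~rc3-1~exp1", "fail"),
    ("6.1.27-1", "fail"),
    ("6.1.37-1", "fail"),
    ("6.1.55-1", "fail"),
    ("6.3.7-1", "ok"),
    ("6.4.4-3~bpo12+1", "ok"),
]


def narrow_range(results):
    """Return (last_failing, first_ok) for results sorted oldest to newest.

    Assumes a single fail-to-ok transition; a failure seen after a passing
    version breaks that assumption and raises ValueError.
    """
    last_failing = None
    first_ok = None
    for version, status in results:
        if status == "fail":
            if first_ok is not None:
                raise ValueError(f"{version} fails after {first_ok} passed")
            last_failing = version
        elif status == "ok" and first_ok is None:
            first_ok = version
    return last_failing, first_ok


print(narrow_range(RESULTS))  # prints ('6.1.55-1', '6.3.7-1')
```

Each newly tested snapshot kernel (for example 6.3.1-1~exp1) would be added to the list in its place, tightening the range that a later `git bisect` has to cover.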