Re: hunt for 2.6.37 dm-crypt+ext4 corruption?

2011-01-07 Thread Matt
On Thu, Jan 6, 2011 at 4:56 PM, Heinz Diehl h...@fancy-poultry.org wrote: On 05.12.2010, Milan Broz wrote: It still seems like dmcrypt with its parallel processing is just a trigger for another bug in 37-rc. To come back to this: my 3 systems (XFS filesystem) running the latest

Re: hunt for 2.6.37 dm-crypt+ext4 corruption?

2011-01-06 Thread Heinz Diehl
On 05.12.2010, Milan Broz wrote: It still seems like dmcrypt with its parallel processing is just a trigger for another bug in 37-rc. To come back to this: my 3 systems (XFS filesystem) running the latest dm-crypt-scale-to-multiple-cpus patch from Andi Kleen/Milan Broz have not shown a

Re: hunt for 2.6.37 dm-crypt+ext4 corruption? (was: Re: dm-crypt barrier support is effective)

2010-12-16 Thread Chris Mason
Excerpts from Dave Chinner's message of 2010-12-15 22:37:18 -0500: On Wed, Dec 08, 2010 at 07:20:24AM -0500, Chris Mason wrote: Usually the trick to reproducing filesystem corruptions is adding memory pressure. The corruption is probably a bad interaction between reads and writes, and

Re: hunt for 2.6.37 dm-crypt+ext4 corruption? (was: Re: dm-crypt barrier support is effective)

2010-12-15 Thread Matt
On Mon, Dec 13, 2010 at 7:56 PM, Jon Nelson jnel...@jamponi.net wrote: On Sun, Dec 12, 2010 at 8:06 PM, Ted Ts'o ty...@mit.edu wrote: On Sun, Dec 12, 2010 at 07:11:28AM -0600, Jon Nelson wrote: I'm glad you've been able to reproduce the problem! If you should need any further assistance,

Re: hunt for 2.6.37 dm-crypt+ext4 corruption? (was: Re: dm-crypt barrier support is effective)

2010-12-15 Thread Andi Kleen
I have a question though: the deactivation of multiple page-io submission support most likely only would affect bigger systems or also desktop systems (like mine) ? I think this is not a final fix, just a workaround. The problem with the other path still really needs to be tracked down. -Andi

Re: hunt for 2.6.37 dm-crypt+ext4 corruption? (was: Re: dm-crypt barrier support is effective)

2010-12-15 Thread Matt
On Wed, Dec 15, 2010 at 8:16 PM, Andi Kleen a...@firstfloor.org wrote: I have a question though: the deactivation of multiple page-io submission support most likely only would affect bigger systems or also desktop systems (like mine) ? I think this is not a final fix, just a workaround. The

Re: hunt for 2.6.37 dm-crypt+ext4 corruption? (was: Re: dm-crypt barrier support is effective)

2010-12-15 Thread Matt
On Wed, Dec 15, 2010 at 8:25 PM, Matt jackdac...@gmail.com wrote: On Wed, Dec 15, 2010 at 8:16 PM, Andi Kleen a...@firstfloor.org wrote: I have a question though: the deactivation of multiple page-io submission support most likely only would affect bigger systems or also desktop systems (like

Re: hunt for 2.6.37 dm-crypt+ext4 corruption? (was: Re: dm-crypt barrier support is effective)

2010-12-13 Thread Jon Nelson
On Sun, Dec 12, 2010 at 8:06 PM, Ted Ts'o ty...@mit.edu wrote: On Sun, Dec 12, 2010 at 07:11:28AM -0600, Jon Nelson wrote: I'm glad you've been able to reproduce the problem! If you should need any further assistance, please do not hesitate to ask. This patch seems to fix the problem for me.

Re: hunt for 2.6.37 dm-crypt+ext4 corruption? (was: Re: dm-crypt barrier support is effective)

2010-12-12 Thread Jon Nelson
On Sat, Dec 11, 2010 at 9:16 PM, Jon Nelson jnel...@jamponi.net wrote: On Sat, Dec 11, 2010 at 7:40 PM, Ted Ts'o ty...@mit.edu wrote: Yes, indeed. Is this in the virtualized environment or on real hardware at this point? And how many CPU's do you have configured in your virtualized

Re: hunt for 2.6.37 dm-crypt+ext4 corruption? (was: Re: dm-crypt barrier support is effective)

2010-12-12 Thread Ted Ts'o
On Sun, Dec 12, 2010 at 04:18:29AM -0600, Jon Nelson wrote: I have one CPU configured in the environment, 512MB of memory. I have not done any memory-constriction tests whatsoever. I've finally been able to reproduce it myself, on real hardware. SMP is not necessary to reproduce it, although

Re: hunt for 2.6.37 dm-crypt+ext4 corruption? (was: Re: dm-crypt barrier support is effective)

2010-12-12 Thread Jon Nelson
On Sun, Dec 12, 2010 at 6:43 AM, Ted Ts'o ty...@mit.edu wrote: On Sun, Dec 12, 2010 at 04:18:29AM -0600, Jon Nelson wrote: I have one CPU configured in the environment, 512MB of memory. I have not done any memory-constriction tests whatsoever. I've finally been able to reproduce it myself,

Re: hunt for 2.6.37 dm-crypt+ext4 corruption? (was: Re: dm-crypt barrier support is effective)

2010-12-11 Thread Ted Ts'o
On Fri, Dec 10, 2010 at 08:14:56PM -0600, Jon Nelson wrote: Barring false negatives, bd2d0210cf22f2bd0cef72eb97cf94fc7d31d8cc appears to be the culprit (according to git bisect). I will test bd2d0210cf22f2bd0cef72eb97cf94fc7d31d8cc again, confirm the behavior, and work backwards to try to

Re: hunt for 2.6.37 dm-crypt+ext4 corruption? (was: Re: dm-crypt barrier support is effective)

2010-12-11 Thread Ted Ts'o
One experiment --- can you try this with the file system mounted with data=writeback, and see if the problem reproduces in that journalling mode? I want to rule out (if possible) journal_submit_inode_data_buffers() racing with mpage_da_submit_io(). I don't think that's the issue, but I'd prefer

Re: hunt for 2.6.37 dm-crypt+ext4 corruption? (was: Re: dm-crypt barrier support is effective)

2010-12-11 Thread Jon Nelson
On Sat, Dec 11, 2010 at 7:40 PM, Ted Ts'o ty...@mit.edu wrote: Yes, indeed. Is this in the virtualized environment or on real hardware at this point? And how many CPU's do you have configured in your virtualized environment, and how much memory? Is having a certain number of CPU's

Re: hunt for 2.6.37 dm-crypt+ext4 corruption? (was: Re: dm-crypt barrier support is effective)

2010-12-10 Thread Jon Nelson
On Fri, Dec 10, 2010 at 12:52 AM, Jon Nelson jnel...@jamponi.net wrote: On Thu, Dec 9, 2010 at 8:38 PM, Ted Ts'o ty...@mit.edu wrote: On Fri, Dec 10, 2010 at 02:53:30AM +0100, Matt wrote: Try a kernel before 5a87b7a5da250c9be6d757758425dfeaf8ed3179; from the tests I've done, that one showed

Re: hunt for 2.6.37 dm-crypt+ext4 corruption? (was: Re: dm-crypt barrier support is effective)

2010-12-10 Thread Jon Nelson
On Fri, Dec 10, 2010 at 8:58 AM, Jon Nelson jnel...@jamponi.net wrote: On Fri, Dec 10, 2010 at 12:52 AM, Jon Nelson jnel...@jamponi.net wrote: On Thu, Dec 9, 2010 at 8:38 PM, Ted Ts'o ty...@mit.edu wrote: On Fri, Dec 10, 2010 at 02:53:30AM +0100, Matt wrote: Try a kernel before

Re: hunt for 2.6.37 dm-crypt+ext4 corruption? (was: Re: dm-crypt barrier support is effective)

2010-12-10 Thread Jon Nelson
On Fri, Dec 10, 2010 at 10:54 AM, Jon Nelson jnel...@jamponi.net wrote: On Fri, Dec 10, 2010 at 8:58 AM, Jon Nelson jnel...@jamponi.net wrote: On Fri, Dec 10, 2010 at 12:52 AM, Jon Nelson jnel...@jamponi.net wrote: On Thu, Dec 9, 2010 at 8:38 PM, Ted Ts'o ty...@mit.edu wrote: On Fri, Dec 10,

Re: hunt for 2.6.37 dm-crypt+ext4 corruption? (was: Re: dm-crypt barrier support is effective)

2010-12-09 Thread Ted Ts'o
On Tue, Dec 07, 2010 at 09:37:20PM -0600, Jon Nelson wrote: One difference is the location of the transaction logs (pg_xlog). In my case, /var/lib/pgsql/data *is* the mountpoint for the test volume (actually, it's a symlink to the mount point). In your case, that is not so. Perhaps that makes a

Re: hunt for 2.6.37 dm-crypt+ext4 corruption? (was: Re: dm-crypt barrier support is effective)

2010-12-09 Thread Ted Ts'o
On Thu, Dec 09, 2010 at 12:10:58PM -0600, Jon Nelson wrote: You should be OK, there. Are you using encryption or no? I had difficulty replicating the issue without encryption. Yes, I'm using encryption. LUKS with aes-xts-plain-sha256, and then LVM on top of LUKS. If you can point out how

Re: hunt for 2.6.37 dm-crypt+ext4 corruption? (was: Re: dm-crypt barrier support is effective)

2010-12-09 Thread Jon Nelson
On Thu, Dec 9, 2010 at 2:13 PM, Ted Ts'o ty...@mit.edu wrote: On Thu, Dec 09, 2010 at 12:10:58PM -0600, Jon Nelson wrote: You should be OK, there. Are you using encryption or no? I had difficulty replicating the issue without encryption. Yes, I'm using encryption. LUKS with

Re: hunt for 2.6.37 dm-crypt+ext4 corruption? (was: Re: dm-crypt barrier support is effective)

2010-12-09 Thread Andi Kleen
512MB. 'free' reports 75MB, 419MB free. I originally noticed the problem on really real hardware (thinkpad T61p), however. If you can easily reproduce it could you try a git bisect? -Andi -- a...@linux.intel.com -- Speaking for myself only.

Re: hunt for 2.6.37 dm-crypt+ext4 corruption? (was: Re: dm-crypt barrier support is effective)

2010-12-09 Thread Chris Mason
Excerpts from Andi Kleen's message of 2010-12-09 18:16:16 -0500: 512MB. 'free' reports 75MB, 419MB free. I originally noticed the problem on really real hardware (thinkpad T61p), however. If you can easily reproduce it could you try a git bisect? Do we have a known good kernel?
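
The question of a known good kernel matters because a bisect needs one good and one bad endpoint to search between. As a rough illustration only (not a replacement for running `git bisect` on a real kernel tree), the search can be sketched as a binary search over an ordered list of commits. The commit names, the `culprit`, and the `reproduces_corruption` check below are hypothetical placeholders; in practice each check would mean building and booting a kernel and running the reproducer, and an intermittent reproducer can give false "good" results.

```python
from typing import Callable, Sequence


def first_bad_commit(commits: Sequence[str], is_bad: Callable[[str], bool]) -> str:
    """Return the earliest commit for which is_bad() is true.

    Assumes commits[0] is known good, commits[-1] is known bad, and that
    every commit after the first bad one is also bad (the usual bisect
    assumption; a flaky reproducer violates it).
    """
    good, bad = 0, len(commits) - 1
    while bad - good > 1:
        mid = (good + bad) // 2
        if is_bad(commits[mid]):
            bad = mid      # the regression is at mid or earlier
        else:
            good = mid     # the regression is after mid
    return commits[bad]


# Hypothetical history: placeholder names, not real kernel commits.
history = [f"commit-{i:02d}" for i in range(16)]
culprit = "commit-11"
tested = []


def reproduces_corruption(commit: str) -> bool:
    # Stand-in for "build this kernel, boot it, run the reproducer".
    tested.append(commit)
    return history.index(commit) >= history.index(culprit)


found = first_bad_commit(history, reproduces_corruption)
print(found, "after", len(tested), "test builds")
```

With 16 candidate commits the sketch needs only four test builds, which is why a bisect is attractive even when each test run is slow.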

Re: hunt for 2.6.37 dm-crypt+ext4 corruption? (was: Re: dm-crypt barrier support is effective)

2010-12-09 Thread Matt
On Fri, Dec 10, 2010 at 2:38 AM, Chris Mason chris.ma...@oracle.com wrote: Excerpts from Andi Kleen's message of 2010-12-09 18:16:16 -0500: 512MB. 'free' reports 75MB, 419MB free. I originally noticed the problem on really real hardware (thinkpad T61p), however. If you can easily

Re: hunt for 2.6.37 dm-crypt+ext4 corruption? (was: Re: dm-crypt barrier support is effective)

2010-12-09 Thread Mike Fedyk
On Thu, Dec 9, 2010 at 5:38 PM, Chris Mason chris.ma...@oracle.com wrote: Excerpts from Andi Kleen's message of 2010-12-09 18:16:16 -0500: 512MB. 'free' reports 75MB, 419MB free. I originally noticed the problem on really real hardware (thinkpad T61p), however. If you can easily

Re: hunt for 2.6.37 dm-crypt+ext4 corruption? (was: Re: dm-crypt barrier support is effective)

2010-12-09 Thread Chris Mason
Excerpts from Mike Fedyk's message of 2010-12-09 20:58:40 -0500: On Thu, Dec 9, 2010 at 5:38 PM, Chris Mason chris.ma...@oracle.com wrote: Excerpts from Andi Kleen's message of 2010-12-09 18:16:16 -0500: 512MB. 'free' reports 75MB, 419MB free. I originally noticed the problem on

Re: hunt for 2.6.37 dm-crypt+ext4 corruption? (was: Re: dm-crypt barrier support is effective)

2010-12-09 Thread Ted Ts'o
On Fri, Dec 10, 2010 at 02:53:30AM +0100, Matt wrote: Try a kernel before 5a87b7a5da250c9be6d757758425dfeaf8ed3179; from the tests I've done, that one showed the least or no corruption, if you count the empty /etc/env.d/03opengl as an artefact. Yes, that's a good test. Also try commit

Re: hunt for 2.6.37 dm-crypt+ext4 corruption? (was: Re: dm-crypt barrier support is effective)

2010-12-09 Thread Jon Nelson
On Thu, Dec 9, 2010 at 8:38 PM, Ted Ts'o ty...@mit.edu wrote: On Fri, Dec 10, 2010 at 02:53:30AM +0100, Matt wrote: Try a kernel before 5a87b7a5da250c9be6d757758425dfeaf8ed3179; from the tests I've done, that one showed the least or no corruption, if you count the empty /etc/env.d/03opengl as

Re: hunt for 2.6.37 dm-crypt+ext4 corruption?

2010-12-08 Thread Milan Broz
On 12/08/2010 04:29 AM, Jon Nelson wrote: Maybe not so fantastic. I kept testing and had no more failures. At all. After 40+ iterations I gave up. I went back to trying ext4 on a LUKS volume. The 'hit' ratio went to something like 1 in 3, or better. Encryption usually propagates bit

Re: hunt for 2.6.37 dm-crypt+ext4 corruption? (was: Re: dm-crypt barrier support is effective)

2010-12-08 Thread Chris Mason
Excerpts from Jon Nelson's message of 2010-12-07 22:29:26 -0500: On Tue, Dec 7, 2010 at 3:02 PM, Chris Mason chris.ma...@oracle.com wrote: Excerpts from Jon Nelson's message of 2010-12-07 15:48:58 -0500: On Tue, Dec 7, 2010 at 2:41 PM, Chris Mason chris.ma...@oracle.com wrote: Excerpts

Re: hunt for 2.6.37 dm-crypt+ext4 corruption? (was: Re: dm-crypt barrier support is effective)

2010-12-08 Thread Jon Nelson
On Tue, Dec 7, 2010 at 9:37 PM, Jon Nelson jnel...@jamponi.net wrote: On Tue, Dec 7, 2010 at 1:35 PM, Ted Ts'o ty...@mit.edu wrote: On Tue, Dec 07, 2010 at 01:22:43PM -0500, Mike Snitzer wrote: 1. create a database (from bash): createdb test 2. place the following contents in a file (I

Re: hunt for 2.6.37 dm-crypt+ext4 corruption? (was: Re: dm-crypt barrier support is effective)

2010-12-07 Thread Jon Nelson
I finally found some time to test this out. With 2.6.37-rc4 (openSUSE KOTD kernel) I easily encounter the issue. Using a virtual machine, I created a stock, minimal openSUSE 11.3 x86_64 install, installed all updates, installed postgresql and the 'KOTD' (Kernel of the Day) kernel, and ran the

Re: hunt for 2.6.37 dm-crypt+ext4 corruption? (was: Re: dm-crypt barrier support is effective)

2010-12-07 Thread Mike Snitzer
On Tue, Dec 07 2010 at 1:10pm -0500, Jon Nelson jnel...@jamponi.net wrote: I finally found some time to test this out. With 2.6.37-rc4 (openSUSE KOTD kernel) I easily encounter the issue. Using a virtual machine, I created a stock, minimal openSUSE 11.3 x86_64 install, installed all

Re: hunt for 2.6.37 dm-crypt+ext4 corruption? (was: Re: dm-crypt barrier support is effective)

2010-12-07 Thread Jon Nelson
On Tue, Dec 7, 2010 at 12:22 PM, Mike Snitzer snit...@redhat.com wrote: On Tue, Dec 07 2010 at 1:10pm -0500, Jon Nelson jnel...@jamponi.net wrote: I finally found some time to test this out. With 2.6.37-rc4 (openSUSE KOTD kernel) I easily encounter the issue. Using a virtual machine, I

Re: hunt for 2.6.37 dm-crypt+ext4 corruption? (was: Re: dm-crypt barrier support is effective)

2010-12-07 Thread Chris Mason
Excerpts from Jon Nelson's message of 2010-12-07 13:45:14 -0500: On Tue, Dec 7, 2010 at 12:22 PM, Mike Snitzer snit...@redhat.com wrote: On Tue, Dec 07 2010 at 1:10pm -0500, Jon Nelson jnel...@jamponi.net wrote: I finally found some time to test this out. With 2.6.37-rc4 (openSUSE KOTD

Re: hunt for 2.6.37 dm-crypt+ext4 corruption? (was: Re: dm-crypt barrier support is effective)

2010-12-07 Thread Jon Nelson
On Tue, Dec 7, 2010 at 12:52 PM, Chris Mason chris.ma...@oracle.com wrote: Excerpts from Jon Nelson's message of 2010-12-07 13:45:14 -0500: On Tue, Dec 7, 2010 at 12:22 PM, Mike Snitzer snit...@redhat.com wrote: On Tue, Dec 07 2010 at 1:10pm -0500, Jon Nelson jnel...@jamponi.net wrote:

Re: hunt for 2.6.37 dm-crypt+ext4 corruption? (was: Re: dm-crypt barrier support is effective)

2010-12-07 Thread Ted Ts'o
On Tue, Dec 07, 2010 at 01:22:43PM -0500, Mike Snitzer wrote: 1. create a database (from bash): createdb test 2. place the following contents in a file (I used 't.sql'): begin; create temporary table foo as select x as a, ARRAY[x] as b FROM generate_series(1, 1000 ) AS x;

Re: hunt for 2.6.37 dm-crypt+ext4 corruption? (was: Re: dm-crypt barrier support is effective)

2010-12-07 Thread Chris Mason
Excerpts from Jon Nelson's message of 2010-12-07 14:34:40 -0500: On Tue, Dec 7, 2010 at 12:52 PM, Chris Mason chris.ma...@oracle.com wrote: postgresql errors. Typically, header corruption but from the limited visibility I've had into this via strace, what I see is zeroed pages where there

Re: hunt for 2.6.37 dm-crypt+ext4 corruption? (was: Re: dm-crypt barrier support is effective)

2010-12-07 Thread Jon Nelson
On Tue, Dec 7, 2010 at 2:02 PM, Chris Mason chris.ma...@oracle.com wrote: Excerpts from Jon Nelson's message of 2010-12-07 14:34:40 -0500: On Tue, Dec 7, 2010 at 12:52 PM, Chris Mason chris.ma...@oracle.com wrote: postgresql errors. Typically, header corruption but from the limited

Re: hunt for 2.6.37 dm-crypt+ext4 corruption? (was: Re: dm-crypt barrier support is effective)

2010-12-07 Thread Chris Mason
Excerpts from Jon Nelson's message of 2010-12-07 15:25:47 -0500: On Tue, Dec 7, 2010 at 2:02 PM, Chris Mason chris.ma...@oracle.com wrote: Excerpts from Jon Nelson's message of 2010-12-07 14:34:40 -0500: On Tue, Dec 7, 2010 at 12:52 PM, Chris Mason chris.ma...@oracle.com wrote:

Re: hunt for 2.6.37 dm-crypt+ext4 corruption? (was: Re: dm-crypt barrier support is effective)

2010-12-07 Thread Jon Nelson
On Tue, Dec 7, 2010 at 2:33 PM, Chris Mason chris.ma...@oracle.com wrote: Excerpts from Jon Nelson's message of 2010-12-07 15:25:47 -0500: On Tue, Dec 7, 2010 at 2:02 PM, Chris Mason chris.ma...@oracle.com wrote: Excerpts from Jon Nelson's message of 2010-12-07 14:34:40 -0500: On Tue, Dec 7,

Re: hunt for 2.6.37 dm-crypt+ext4 corruption? (was: Re: dm-crypt barrier support is effective)

2010-12-07 Thread Chris Mason
Excerpts from Jon Nelson's message of 2010-12-07 15:25:47 -0500: On Tue, Dec 7, 2010 at 2:02 PM, Chris Mason chris.ma...@oracle.com wrote: Excerpts from Jon Nelson's message of 2010-12-07 14:34:40 -0500: On Tue, Dec 7, 2010 at 12:52 PM, Chris Mason chris.ma...@oracle.com wrote:

Re: hunt for 2.6.37 dm-crypt+ext4 corruption? (was: Re: dm-crypt barrier support is effective)

2010-12-07 Thread Jon Nelson
On Tue, Dec 7, 2010 at 2:41 PM, Chris Mason chris.ma...@oracle.com wrote: Excerpts from Jon Nelson's message of 2010-12-07 15:25:47 -0500: On Tue, Dec 7, 2010 at 2:02 PM, Chris Mason chris.ma...@oracle.com wrote: Excerpts from Jon Nelson's message of 2010-12-07 14:34:40 -0500: On Tue, Dec 7,

Re: hunt for 2.6.37 dm-crypt+ext4 corruption? (was: Re: dm-crypt barrier support is effective)

2010-12-07 Thread Jon Nelson
On Tue, Dec 7, 2010 at 1:35 PM, Ted Ts'o ty...@mit.edu wrote: On Tue, Dec 07, 2010 at 01:22:43PM -0500, Mike Snitzer wrote: 1. create a database (from bash): createdb test 2. place the following contents in a file (I used 't.sql'): begin; create temporary table foo as select x as

Re: hunt for 2.6.37 dm-crypt+ext4 corruption? (was: Re: dm-crypt barrier support is effective)

2010-12-07 Thread Chris Mason
Excerpts from Jon Nelson's message of 2010-12-07 15:48:58 -0500: On Tue, Dec 7, 2010 at 2:41 PM, Chris Mason chris.ma...@oracle.com wrote: Excerpts from Jon Nelson's message of 2010-12-07 15:25:47 -0500: On Tue, Dec 7, 2010 at 2:02 PM, Chris Mason chris.ma...@oracle.com wrote: Excerpts

Re: hunt for 2.6.37 dm-crypt+ext4 corruption? (was: Re: dm-crypt barrier support is effective)

2010-12-07 Thread Jon Nelson
On Tue, Dec 7, 2010 at 3:02 PM, Chris Mason chris.ma...@oracle.com wrote: Excerpts from Jon Nelson's message of 2010-12-07 15:48:58 -0500: On Tue, Dec 7, 2010 at 2:41 PM, Chris Mason chris.ma...@oracle.com wrote: Excerpts from Jon Nelson's message of 2010-12-07 15:25:47 -0500: On Tue, Dec 7,

Re: hunt for 2.6.37 dm-crypt+ext4 corruption? (was: Re: dm-crypt barrier support is effective)

2010-12-07 Thread Jon Nelson
On Tue, Dec 7, 2010 at 1:35 PM, Ted Ts'o ty...@mit.edu wrote: On Tue, Dec 07, 2010 at 01:22:43PM -0500, Mike Snitzer wrote: 1. create a database (from bash): createdb test 2. place the following contents in a file (I used 't.sql'): begin; create temporary table foo as select x as

Re: hunt for 2.6.37 dm-crypt+ext4 corruption? (was: Re: dm-crypt barrier support is effective)

2010-12-07 Thread Jon Nelson
On Tue, Dec 7, 2010 at 2:41 PM, Chris Mason chris.ma...@oracle.com wrote: Excerpts from Jon Nelson's message of 2010-12-07 15:25:47 -0500: On Tue, Dec 7, 2010 at 2:02 PM, Chris Mason chris.ma...@oracle.com wrote: Excerpts from Jon Nelson's message of 2010-12-07 14:34:40 -0500: On Tue, Dec 7,

Re: hunt for 2.6.37 dm-crypt+ext4 corruption? (was: Re: dm-crypt barrier support is effective)

2010-12-05 Thread Heinz Diehl
On 05.12.2010, Matt wrote: I should have made it clear that the results I get are observed when using the kernels/checkouts *with* the dm-crypt multi-cpu patch, without the patch I didn't see that kind of problems (hardlocks, files missing, etc.) I have to take back my other two emails,

Re: hunt for 2.6.37 dm-crypt+ext4 corruption?

2010-12-05 Thread Milan Broz
On 12/05/2010 11:09 AM, Heinz Diehl wrote: On 05.12.2010, Matt wrote: I have to take back my other two emails, stating that no corruption happened with the dm-crypt multi-cpu patch. Today, I encountered filesystem corruption on one, and a complete hardlock on another machine. No logfile

Re: hunt for 2.6.37 dm-crypt+ext4 corruption?

2010-12-05 Thread Heinz Diehl
On 05.12.2010, Milan Broz wrote: Which kernel? 2.6.37-rc? 2.6.37-rc4 on one and 2.6.37-rc3-git2 on the other machine. Anyone seen this with 2.6.36 and the same dmcrypt patch? (All info I had is that it is stable here.) Both 2.6.36 and 2.6.36.1 with your patch have been running

Re: [dm-devel] hunt for 2.6.37 dm-crypt+ext4 corruption?

2010-12-05 Thread Theodore Tso
On Dec 5, 2010, at 5:21 AM, Milan Broz wrote: Which kernel? 2.6.37-rc? Anyone seen this with 2.6.36 and the same dmcrypt patch? (All info I had is that it is stable here.) It still seems like dmcrypt with its parallel processing is just a trigger for another bug in 37-rc. I've

Re: [dm-devel] hunt for 2.6.37 dm-crypt+ext4 corruption?

2010-12-05 Thread Matt
On Sun, Dec 5, 2010 at 2:24 PM, Theodore Tso ty...@mit.edu wrote: On Dec 5, 2010, at 5:21 AM, Milan Broz wrote: Which kernel? 2.6.37-rc? Anyone seen this with 2.6.36 and the same dmcrypt patch? (All info I had is that it is stable here.) It still seems like dmcrypt with its

Re: [dm-devel] hunt for 2.6.37 dm-crypt+ext4 corruption?

2010-12-05 Thread Ted Ts'o
On Sun, Dec 05, 2010 at 02:44:14PM +0100, Matt wrote: gcc version 4.5.1 (Gentoo Hardened 4.5.1-r1 p1.4, pie-0.4.5) This is probably just me being paranoid, but it might be worth trying using a gcc 4.4.x compiler and see if that makes any difference. There have been some other gcc 4.5-caused

Re: [dm-devel] hunt for 2.6.37 dm-crypt+ext4 corruption?

2010-12-05 Thread Heinz Diehl
On 05.12.2010, Theodore Tso wrote: As another thought, what version of GCC are people using who are having difficulty? Could this perhaps be a compiler-related issue? h...@liesel:~ gcc -v Using built-in specs. Target: x86_64-suse-linux Configured with: ../configure --prefix=/usr

Re: [dm-devel] hunt for 2.6.37 dm-crypt+ext4 corruption?

2010-12-05 Thread Daniel J Blueman
Hi Heinz, On 5 December 2010 14:33, Heinz Diehl h...@fritha.org wrote: On 05.12.2010, Theodore Tso wrote: As another thought, what version of GCC are people using who are having difficulty? Could this perhaps be a compiler-related issue? h...@liesel:~ gcc -v Using built-in specs. Target:

Re: [dm-devel] hunt for 2.6.37 dm-crypt+ext4 corruption?

2010-12-05 Thread Andi Kleen
I've been using a kernel which is between 2.6.37-rc2 and -rc3 with a LUKS / dm-crypt / LVM / ext4 setup for my primary file systems, and I haven't observed any corruption for the last two weeks or so. It's on my todo list to upgrade to top of Linus's tree, but perhaps this is a useful

Re: hunt for 2.6.37 dm-crypt+ext4 corruption?

2010-12-05 Thread Mike Snitzer
On Sun, Dec 05 2010 at 3:28pm -0500, Andi Kleen a...@firstfloor.org wrote: As another thought, what version of GCC are people using who are having difficulty? Could this perhaps be a compiler-related issue? A compiler problem seems very unlikely here. What may be a useful

Re: [dm-devel] hunt for 2.6.37 dm-crypt+ext4 corruption?

2010-12-05 Thread Valdis . Kletnieks
On Sun, 05 Dec 2010 08:24:32 EST, Theodore Tso said: I've been using a kernel which is between 2.6.37-rc2 and -rc3 with a LUKS / dm-crypt / LVM / ext4 setup for my primary file systems, and I haven't observed any corruption for the last two weeks or so. Pretty much exactly the same setup

Re: [dm-devel] hunt for 2.6.37 dm-crypt+ext4 corruption?

2010-12-05 Thread Heinz Diehl
On 06.12.2010, Daniel J Blueman wrote: A bit late to the party, but does memtest86 pass over multiple iterations? Yes, it does. This machine had not a single fault in several years, it's absolutely rock-stable. These freezes/corruptions are the first ones ever, and they vanish when I go down

Re: hunt for 2.6.37 dm-crypt+ext4 corruption? (was: Re: dm-crypt barrier support is effective)

2010-12-04 Thread Matt
On Wed, Dec 1, 2010 at 10:23 PM, Mike Snitzer snit...@redhat.com wrote: On Wed, Dec 01 2010 at 3:45pm -0500, Milan Broz mb...@redhat.com wrote: On 12/01/2010 08:34 PM, Jon Nelson wrote: Perhaps this is useful: for myself, I found that when I started using 2.6.37rc3 that postgresql

Re: hunt for 2.6.37 dm-crypt+ext4 corruption? (was: Re: dm-crypt barrier support is effective)

2010-12-04 Thread Mike Snitzer
On Sat, Dec 04 2010 at 2:18pm -0500, Matt jackdac...@gmail.com wrote: On Wed, Dec 1, 2010 at 10:23 PM, Mike Snitzer snit...@redhat.com wrote: Matt and Jon, If you'd be up to it: could you try testing your dm-crypt+ext4 corruption reproducers against the following two 2.6.37-rc commits:

Re: hunt for 2.6.37 dm-crypt+ext4 corruption? (was: Re: dm-crypt barrier support is effective)

2010-12-04 Thread Heinz Diehl
On 04.12.2010, Matt wrote: I'm not sure if it's even a problem with ext4 - I haven't had the time to test with XFS yet I can and have run both -rc3 and -rc4 with Milans patch v6, without any problems at all, under heavy load and disk I/O. I'm using XFS exclusively. The system is a testing

Re: hunt for 2.6.37 dm-crypt+ext4 corruption? (was: Re: dm-crypt barrier support is effective)

2010-12-04 Thread Matt
On Sat, Dec 4, 2010 at 8:38 PM, Mike Snitzer snit...@redhat.com wrote: On Sat, Dec 04 2010 at 2:18pm -0500, Matt jackdac...@gmail.com wrote: On Wed, Dec 1, 2010 at 10:23 PM, Mike Snitzer snit...@redhat.com wrote: Matt and Jon, If you'd be up to it: could you try testing your

Re: hunt for 2.6.37 dm-crypt+ext4 corruption? (was: Re: dm-crypt barrier support is effective)

2010-12-04 Thread Matt
On Sat, Dec 4, 2010 at 8:38 PM, Mike Snitzer snit...@redhat.com wrote: On Sat, Dec 04 2010 at 2:18pm -0500, Matt jackdac...@gmail.com wrote: On Wed, Dec 1, 2010 at 10:23 PM, Mike Snitzer snit...@redhat.com wrote: Matt and Jon, If you'd be up to it: could you try testing your

Re: hunt for 2.6.37 dm-crypt+ext4 corruption? (was: Re: dm-crypt barrier support is effective)

2010-12-02 Thread Matt
On Wed, Dec 1, 2010 at 10:23 PM, Mike Snitzer snit...@redhat.com wrote: On Wed, Dec 01 2010 at 3:45pm -0500, Milan Broz mb...@redhat.com wrote: On 12/01/2010 08:34 PM, Jon Nelson wrote: Perhaps this is useful: for myself, I found that when I started using 2.6.37rc3 that postgresql

hunt for 2.6.37 dm-crypt+ext4 corruption? (was: Re: dm-crypt barrier support is effective)

2010-12-01 Thread Mike Snitzer
On Wed, Dec 01 2010 at 3:45pm -0500, Milan Broz mb...@redhat.com wrote: On 12/01/2010 08:34 PM, Jon Nelson wrote: Perhaps this is useful: for myself, I found that when I started using 2.6.37rc3 that postgresql started having a *lot* of problems with corruption. Specifically, I noted