Re: hunt for 2.6.37 dm-crypt+ext4 corruption? (was: Re: dm-crypt barrier support is effective)

2010-12-16 Thread Chris Mason
Excerpts from Dave Chinner's message of 2010-12-15 22:37:18 -0500: > On Wed, Dec 08, 2010 at 07:20:24AM -0500, Chris Mason wrote: > > > > Usually the trick to reproducing filesystem corruptions is adding memory > > pressure. The corruption is probably a bad interaction between reads > > and write

Re: hunt for 2.6.37 dm-crypt+ext4 corruption? (was: Re: dm-crypt barrier support is effective)

2010-12-15 Thread Dave Chinner
On Wed, Dec 08, 2010 at 07:20:24AM -0500, Chris Mason wrote: > Excerpts from Jon Nelson's message of 2010-12-07 22:29:26 -0500: > > On Tue, Dec 7, 2010 at 3:02 PM, Chris Mason wrote: > > > Excerpts from Jon Nelson's message of 2010-12-07 15:48:58 -0500: > > >> On Tue, Dec 7, 2010 at 2:41 PM, Chris

Re: hunt for 2.6.37 dm-crypt+ext4 corruption? (was: Re: dm-crypt barrier support is effective)

2010-12-15 Thread Matt
On Wed, Dec 15, 2010 at 8:25 PM, Matt wrote: > On Wed, Dec 15, 2010 at 8:16 PM, Andi Kleen wrote: >>> I have a question though: the deactivation of multiple page-io >>> submission support most likely only would affect bigger systems or >>> also desktop systems (like mine) ? >> >> I think this is

Re: hunt for 2.6.37 dm-crypt+ext4 corruption? (was: Re: dm-crypt barrier support is effective)

2010-12-15 Thread Matt
On Wed, Dec 15, 2010 at 8:16 PM, Andi Kleen wrote: >> I have a question though: the deactivation of multiple page-io >> submission support most likely only would affect bigger systems or >> also desktop systems (like mine) ? > > I think this is not a final fix, just a workaround. > The problem wit

Re: hunt for 2.6.37 dm-crypt+ext4 corruption? (was: Re: dm-crypt barrier support is effective)

2010-12-15 Thread Andi Kleen
> I have a question though: the deactivation of multiple page-io > submission support most likely only would affect bigger systems or > also desktop systems (like mine) ? I think this is not a final fix, just a workaround. The problem with the other path still really needs to be tracked down. -An

Re: hunt for 2.6.37 dm-crypt+ext4 corruption? (was: Re: dm-crypt barrier support is effective)

2010-12-15 Thread Matt
On Mon, Dec 13, 2010 at 7:56 PM, Jon Nelson wrote: > On Sun, Dec 12, 2010 at 8:06 PM, Ted Ts'o wrote: >> On Sun, Dec 12, 2010 at 07:11:28AM -0600, Jon Nelson wrote: >>> I'm glad you've been able to reproduce the problem! If you should need >>> any further assistance, please do not hesitate to ask

Re: hunt for 2.6.37 dm-crypt+ext4 corruption? (was: Re: dm-crypt barrier support is effective)

2010-12-13 Thread Jon Nelson
On Sun, Dec 12, 2010 at 8:06 PM, Ted Ts'o wrote: > On Sun, Dec 12, 2010 at 07:11:28AM -0600, Jon Nelson wrote: >> I'm glad you've been able to reproduce the problem! If you should need >> any further assistance, please do not hesitate to ask. > > This patch seems to fix the problem for me.  (Unles

Re: hunt for 2.6.37 dm-crypt+ext4 corruption? (was: Re: dm-crypt barrier support is effective)

2010-12-12 Thread Ted Ts'o
On Sun, Dec 12, 2010 at 07:11:28AM -0600, Jon Nelson wrote: > I'm glad you've been able to reproduce the problem! If you should need > any further assistance, please do not hesitate to ask. This patch seems to fix the problem for me. (Unless the partition is mounted with mblk_io_submit.) Could y

Re: hunt for 2.6.37 dm-crypt+ext4 corruption? (was: Re: dm-crypt barrier support is effective)

2010-12-12 Thread Jon Nelson
On Sun, Dec 12, 2010 at 6:43 AM, Ted Ts'o wrote: > On Sun, Dec 12, 2010 at 04:18:29AM -0600, Jon Nelson wrote: >> > I have one CPU configured in the environment, 512MB of memory. >> > I have not done any memory-constriction tests whatsoever. > > I've finally been able to reproduce it myself, on re

Re: hunt for 2.6.37 dm-crypt+ext4 corruption? (was: Re: dm-crypt barrier support is effective)

2010-12-12 Thread Ted Ts'o
On Sun, Dec 12, 2010 at 04:18:29AM -0600, Jon Nelson wrote: > > I have one CPU configured in the environment, 512MB of memory. > > I have not done any memory-constriction tests whatsoever. I've finally been able to reproduce it myself, on real hardware. SMP is not necessary to reproduce it, altho

Re: hunt for 2.6.37 dm-crypt+ext4 corruption? (was: Re: dm-crypt barrier support is effective)

2010-12-12 Thread Jon Nelson
On Sat, Dec 11, 2010 at 9:16 PM, Jon Nelson wrote: > On Sat, Dec 11, 2010 at 7:40 PM, Ted Ts'o wrote: >> Yes, indeed.  Is this in the virtualized environment or on real >> hardware at this point?  And how many CPU's do you have configured in >> your virtualized environment, and how much memory?

Re: hunt for 2.6.37 dm-crypt+ext4 corruption? (was: Re: dm-crypt barrier support is effective)

2010-12-11 Thread Jon Nelson
On Sat, Dec 11, 2010 at 7:40 PM, Ted Ts'o wrote: > Yes, indeed. Is this in the virtualized environment or on real > hardware at this point? And how many CPU's do you have configured in > your virtualized environment, and how much memory? Is having a > certain number of CPU's critical for repr

Re: hunt for 2.6.37 dm-crypt+ext4 corruption? (was: Re: dm-crypt barrier support is effective)

2010-12-11 Thread Ted Ts'o
One experiment --- can you try this with the file system mounted with data=writeback, and see if the problem reproduces in that journalling mode? I want to rule out (if possible) journal_submit_inode_data_buffers() racing with mpage_da_submit_io(). I don't think that's the issue, but I'd prefer t

Re: hunt for 2.6.37 dm-crypt+ext4 corruption? (was: Re: dm-crypt barrier support is effective)

2010-12-11 Thread Ted Ts'o
On Fri, Dec 10, 2010 at 08:14:56PM -0600, Jon Nelson wrote: > > Barring false negatives, bd2d0210cf22f2bd0cef72eb97cf94fc7d31d8cc > > appears to be the culprit (according to git bisect). > > I will test bd2d0210cf22f2bd0cef72eb97cf94fc7d31d8cc again, confirm > > the behavior, and work backwards to

Re: hunt for 2.6.37 dm-crypt+ext4 corruption? (was: Re: dm-crypt barrier support is effective)

2010-12-10 Thread Jon Nelson
On Fri, Dec 10, 2010 at 10:54 AM, Jon Nelson wrote: > On Fri, Dec 10, 2010 at 8:58 AM, Jon Nelson wrote: >> On Fri, Dec 10, 2010 at 12:52 AM, Jon Nelson wrote: >>> On Thu, Dec 9, 2010 at 8:38 PM, Ted Ts'o wrote: On Fri, Dec 10, 2010 at 02:53:30AM +0100, Matt wrote: > > Try a kernel

Re: hunt for 2.6.37 dm-crypt+ext4 corruption? (was: Re: dm-crypt barrier support is effective)

2010-12-10 Thread Jon Nelson
On Fri, Dec 10, 2010 at 8:58 AM, Jon Nelson wrote: > On Fri, Dec 10, 2010 at 12:52 AM, Jon Nelson wrote: >> On Thu, Dec 9, 2010 at 8:38 PM, Ted Ts'o wrote: >>> On Fri, Dec 10, 2010 at 02:53:30AM +0100, Matt wrote: Try a kernel before 5a87b7a5da250c9be6d757758425dfeaf8ed3179 f

Re: hunt for 2.6.37 dm-crypt+ext4 corruption? (was: Re: dm-crypt barrier support is effective)

2010-12-10 Thread Jon Nelson
On Fri, Dec 10, 2010 at 12:52 AM, Jon Nelson wrote: > On Thu, Dec 9, 2010 at 8:38 PM, Ted Ts'o wrote: >> On Fri, Dec 10, 2010 at 02:53:30AM +0100, Matt wrote: >>> >>> Try a kernel before 5a87b7a5da250c9be6d757758425dfeaf8ed3179 >>> >>> from the tests I've done that one showed the least or no corr

Re: hunt for 2.6.37 dm-crypt+ext4 corruption? (was: Re: dm-crypt barrier support is effective)

2010-12-09 Thread Jon Nelson
On Thu, Dec 9, 2010 at 8:38 PM, Ted Ts'o wrote: > On Fri, Dec 10, 2010 at 02:53:30AM +0100, Matt wrote: >> >> Try a kernel before 5a87b7a5da250c9be6d757758425dfeaf8ed3179 >> >> from the tests I've done that one showed the least or no corruption if >> you count the empty /etc/env.d/03opengl as an a

Re: hunt for 2.6.37 dm-crypt+ext4 corruption? (was: Re: dm-crypt barrier support is effective)

2010-12-09 Thread Ted Ts'o
On Fri, Dec 10, 2010 at 02:53:30AM +0100, Matt wrote: > > Try a kernel before 5a87b7a5da250c9be6d757758425dfeaf8ed3179 > > from the tests I've done that one showed the least or no corruption if > you count the empty /etc/env.d/03opengl as an artefact Yes, that's a good test. Also try commit bd2

Re: hunt for 2.6.37 dm-crypt+ext4 corruption? (was: Re: dm-crypt barrier support is effective)

2010-12-09 Thread Jon Nelson
On Thu, Dec 9, 2010 at 8:00 PM, Chris Mason wrote: > Excerpts from Mike Fedyk's message of 2010-12-09 20:58:40 -0500: >> On Thu, Dec 9, 2010 at 5:38 PM, Chris Mason wrote: >> > Excerpts from Andi Kleen's message of 2010-12-09 18:16:16 -0500: >> >> > 512MB. >> >> > >> >> > 'free' reports 75MB, 419

Re: hunt for 2.6.37 dm-crypt+ext4 corruption? (was: Re: dm-crypt barrier support is effective)

2010-12-09 Thread Chris Mason
Excerpts from Mike Fedyk's message of 2010-12-09 20:58:40 -0500: > On Thu, Dec 9, 2010 at 5:38 PM, Chris Mason wrote: > > Excerpts from Andi Kleen's message of 2010-12-09 18:16:16 -0500: > >> > 512MB. > >> > > >> > 'free' reports 75MB, 419MB free. > >> > > >> > I originally noticed the problem on

Re: hunt for 2.6.37 dm-crypt+ext4 corruption? (was: Re: dm-crypt barrier support is effective)

2010-12-09 Thread Mike Fedyk
On Thu, Dec 9, 2010 at 5:38 PM, Chris Mason wrote: > Excerpts from Andi Kleen's message of 2010-12-09 18:16:16 -0500: >> > 512MB. >> > >> > 'free' reports 75MB, 419MB free. >> > >> > I originally noticed the problem on really real hardware (thinkpad >> > T61p), however. >> >> If you can easily rep

Re: hunt for 2.6.37 dm-crypt+ext4 corruption? (was: Re: dm-crypt barrier support is effective)

2010-12-09 Thread Matt
On Fri, Dec 10, 2010 at 2:38 AM, Chris Mason wrote: > Excerpts from Andi Kleen's message of 2010-12-09 18:16:16 -0500: >> > 512MB. >> > >> > 'free' reports 75MB, 419MB free. >> > >> > I originally noticed the problem on really real hardware (thinkpad >> > T61p), however. >> >> If you can easily re

Re: hunt for 2.6.37 dm-crypt+ext4 corruption? (was: Re: dm-crypt barrier support is effective)

2010-12-09 Thread Chris Mason
Excerpts from Andi Kleen's message of 2010-12-09 18:16:16 -0500: > > 512MB. > > > > 'free' reports 75MB, 419MB free. > > > > I originally noticed the problem on really real hardware (thinkpad > > T61p), however. > > If you can easily reproduce it could you try a git bisect? Do we have a known g

Re: hunt for 2.6.37 dm-crypt+ext4 corruption? (was: Re: dm-crypt barrier support is effective)

2010-12-09 Thread Andi Kleen
> 512MB. > > 'free' reports 75MB, 419MB free. > > I originally noticed the problem on really real hardware (thinkpad > T61p), however. If you can easily reproduce it could you try a git bisect? -Andi -- a...@linux.intel.com -- Speaking for myself only.

Re: hunt for 2.6.37 dm-crypt+ext4 corruption? (was: Re: dm-crypt barrier support is effective)

2010-12-09 Thread Jon Nelson
On Thu, Dec 9, 2010 at 2:13 PM, Ted Ts'o wrote: > On Thu, Dec 09, 2010 at 12:10:58PM -0600, Jon Nelson wrote: >> >> You should be OK, there. Are you using encryption or no? >> I had difficulty replicating the issue without encryption. > > Yes, I'm using encryption.  LUKS with aes-xts-plain-sha256,

Re: hunt for 2.6.37 dm-crypt+ext4 corruption? (was: Re: dm-crypt barrier support is effective)

2010-12-09 Thread Ted Ts'o
On Thu, Dec 09, 2010 at 12:10:58PM -0600, Jon Nelson wrote: > > You should be OK, there. Are you using encryption or no? > I had difficulty replicating the issue without encryption. Yes, I'm using encryption. LUKS with aes-xts-plain-sha256, and then LVM on top of LUKS. > > If you can point out

Re: hunt for 2.6.37 dm-crypt+ext4 corruption? (was: Re: dm-crypt barrier support is effective)

2010-12-09 Thread Jon Nelson
On Thu, Dec 9, 2010 at 12:01 PM, Ted Ts'o wrote: > On Tue, Dec 07, 2010 at 09:37:20PM -0600, Jon Nelson wrote: >> One difference is the location of the transaction logs (pg_xlog). In >> my case, /var/lib/pgsql/data *is* mountpoint for the test volume >> (actually, it's a symlink to the mount point

Re: hunt for 2.6.37 dm-crypt+ext4 corruption? (was: Re: dm-crypt barrier support is effective)

2010-12-09 Thread Ted Ts'o
On Tue, Dec 07, 2010 at 09:37:20PM -0600, Jon Nelson wrote: > One difference is the location of the transaction logs (pg_xlog). In > my case, /var/lib/pgsql/data *is* mountpoint for the test volume > (actually, it's a symlink to the mount point). In your case, that is > not so. Perhaps that makes a

Re: hunt for 2.6.37 dm-crypt+ext4 corruption? (was: Re: dm-crypt barrier support is effective)

2010-12-08 Thread Jon Nelson
On Tue, Dec 7, 2010 at 9:37 PM, Jon Nelson wrote: > On Tue, Dec 7, 2010 at 1:35 PM, Ted Ts'o wrote: >> On Tue, Dec 07, 2010 at 01:22:43PM -0500, Mike Snitzer wrote: >>> > 1. create a database (from bash): >>> > >>> > createdb test >>> > >>> > 2. place the following contents in a file (I used 't.s

Re: hunt for 2.6.37 dm-crypt+ext4 corruption? (was: Re: dm-crypt barrier support is effective)

2010-12-08 Thread Chris Mason
Excerpts from Jon Nelson's message of 2010-12-07 22:29:26 -0500: > On Tue, Dec 7, 2010 at 3:02 PM, Chris Mason wrote: > > Excerpts from Jon Nelson's message of 2010-12-07 15:48:58 -0500: > >> On Tue, Dec 7, 2010 at 2:41 PM, Chris Mason wrote: > >> > Excerpts from Jon Nelson's message of 2010-12-0

Re: hunt for 2.6.37 dm-crypt+ext4 corruption? (was: Re: dm-crypt barrier support is effective)

2010-12-07 Thread Jon Nelson
On Tue, Dec 7, 2010 at 2:41 PM, Chris Mason wrote: > Excerpts from Jon Nelson's message of 2010-12-07 15:25:47 -0500: >> On Tue, Dec 7, 2010 at 2:02 PM, Chris Mason wrote: >> > Excerpts from Jon Nelson's message of 2010-12-07 14:34:40 -0500: >> >> On Tue, Dec 7, 2010 at 12:52 PM, Chris Mason >>

Re: hunt for 2.6.37 dm-crypt+ext4 corruption? (was: Re: dm-crypt barrier support is effective)

2010-12-07 Thread Jon Nelson
On Tue, Dec 7, 2010 at 1:35 PM, Ted Ts'o wrote: > On Tue, Dec 07, 2010 at 01:22:43PM -0500, Mike Snitzer wrote: >> > 1. create a database (from bash): >> > >> > createdb test >> > >> > 2. place the following contents in a file (I used 't.sql'): >> > >> > begin; >> > create temporary table foo as s

Re: hunt for 2.6.37 dm-crypt+ext4 corruption? (was: Re: dm-crypt barrier support is effective)

2010-12-07 Thread Jon Nelson
On Tue, Dec 7, 2010 at 3:02 PM, Chris Mason wrote: > Excerpts from Jon Nelson's message of 2010-12-07 15:48:58 -0500: >> On Tue, Dec 7, 2010 at 2:41 PM, Chris Mason wrote: >> > Excerpts from Jon Nelson's message of 2010-12-07 15:25:47 -0500: >> >> On Tue, Dec 7, 2010 at 2:02 PM, Chris Mason >>

Re: hunt for 2.6.37 dm-crypt+ext4 corruption? (was: Re: dm-crypt barrier support is effective)

2010-12-07 Thread Chris Mason
Excerpts from Jon Nelson's message of 2010-12-07 15:48:58 -0500: > On Tue, Dec 7, 2010 at 2:41 PM, Chris Mason wrote: > > Excerpts from Jon Nelson's message of 2010-12-07 15:25:47 -0500: > >> On Tue, Dec 7, 2010 at 2:02 PM, Chris Mason wrote: > >> > Excerpts from Jon Nelson's message of 2010-12-0

Re: hunt for 2.6.37 dm-crypt+ext4 corruption? (was: Re: dm-crypt barrier support is effective)

2010-12-07 Thread Jon Nelson
On Tue, Dec 7, 2010 at 1:35 PM, Ted Ts'o wrote: > On Tue, Dec 07, 2010 at 01:22:43PM -0500, Mike Snitzer wrote: >> > 1. create a database (from bash): >> > >> > createdb test >> > >> > 2. place the following contents in a file (I used 't.sql'): >> > >> > begin; >> > create temporary table foo as s

Re: hunt for 2.6.37 dm-crypt+ext4 corruption? (was: Re: dm-crypt barrier support is effective)

2010-12-07 Thread Jon Nelson
On Tue, Dec 7, 2010 at 2:41 PM, Chris Mason wrote: > Excerpts from Jon Nelson's message of 2010-12-07 15:25:47 -0500: >> On Tue, Dec 7, 2010 at 2:02 PM, Chris Mason wrote: >> > Excerpts from Jon Nelson's message of 2010-12-07 14:34:40 -0500: >> >> On Tue, Dec 7, 2010 at 12:52 PM, Chris Mason >>

Re: hunt for 2.6.37 dm-crypt+ext4 corruption? (was: Re: dm-crypt barrier support is effective)

2010-12-07 Thread Chris Mason
Excerpts from Jon Nelson's message of 2010-12-07 15:25:47 -0500: > On Tue, Dec 7, 2010 at 2:02 PM, Chris Mason wrote: > > Excerpts from Jon Nelson's message of 2010-12-07 14:34:40 -0500: > >> On Tue, Dec 7, 2010 at 12:52 PM, Chris Mason > >> wrote: > >> >> postgresql errors. Typically, header co

Re: hunt for 2.6.37 dm-crypt+ext4 corruption? (was: Re: dm-crypt barrier support is effective)

2010-12-07 Thread Jon Nelson
On Tue, Dec 7, 2010 at 2:33 PM, Chris Mason wrote: > Excerpts from Jon Nelson's message of 2010-12-07 15:25:47 -0500: >> On Tue, Dec 7, 2010 at 2:02 PM, Chris Mason wrote: >> > Excerpts from Jon Nelson's message of 2010-12-07 14:34:40 -0500: >> >> On Tue, Dec 7, 2010 at 12:52 PM, Chris Mason >>

Re: hunt for 2.6.37 dm-crypt+ext4 corruption? (was: Re: dm-crypt barrier support is effective)

2010-12-07 Thread Chris Mason
Excerpts from Jon Nelson's message of 2010-12-07 15:25:47 -0500: > On Tue, Dec 7, 2010 at 2:02 PM, Chris Mason wrote: > > Excerpts from Jon Nelson's message of 2010-12-07 14:34:40 -0500: > >> On Tue, Dec 7, 2010 at 12:52 PM, Chris Mason > >> wrote: > >> >> postgresql errors. Typically, header co

Re: hunt for 2.6.37 dm-crypt+ext4 corruption? (was: Re: dm-crypt barrier support is effective)

2010-12-07 Thread Jon Nelson
On Tue, Dec 7, 2010 at 2:02 PM, Chris Mason wrote: > Excerpts from Jon Nelson's message of 2010-12-07 14:34:40 -0500: >> On Tue, Dec 7, 2010 at 12:52 PM, Chris Mason wrote: >> >> postgresql errors. Typically, header corruption but from the limited >> >> visibility I've had into this via strace, w

Re: hunt for 2.6.37 dm-crypt+ext4 corruption? (was: Re: dm-crypt barrier support is effective)

2010-12-07 Thread Chris Mason
Excerpts from Jon Nelson's message of 2010-12-07 14:34:40 -0500: > On Tue, Dec 7, 2010 at 12:52 PM, Chris Mason wrote: > >> postgresql errors. Typically, header corruption but from the limited > >> visibility I've had into this via strace, what I see is zeroed pages > >> where there shouldn't be.

Re: hunt for 2.6.37 dm-crypt+ext4 corruption? (was: Re: dm-crypt barrier support is effective)

2010-12-07 Thread Ted Ts'o
On Tue, Dec 07, 2010 at 01:22:43PM -0500, Mike Snitzer wrote: > > 1. create a database (from bash): > > > > createdb test > > > > 2. place the following contents in a file (I used 't.sql'): > > > > begin; > > create temporary table foo as select x as a, ARRAY[x] as b FROM > > generate_series(1,

Re: hunt for 2.6.37 dm-crypt+ext4 corruption? (was: Re: dm-crypt barrier support is effective)

2010-12-07 Thread Jon Nelson
On Tue, Dec 7, 2010 at 12:52 PM, Chris Mason wrote: > Excerpts from Jon Nelson's message of 2010-12-07 13:45:14 -0500: >> On Tue, Dec 7, 2010 at 12:22 PM, Mike Snitzer wrote: >> > On Tue, Dec 07 2010 at  1:10pm -0500, >> > Jon Nelson wrote: >> > >> >> I finally found some time to test this out.

Re: hunt for 2.6.37 dm-crypt+ext4 corruption? (was: Re: dm-crypt barrier support is effective)

2010-12-07 Thread Chris Mason
Excerpts from Jon Nelson's message of 2010-12-07 13:45:14 -0500: > On Tue, Dec 7, 2010 at 12:22 PM, Mike Snitzer wrote: > > On Tue, Dec 07 2010 at  1:10pm -0500, > > Jon Nelson wrote: > > > >> I finally found some time to test this out. With 2.6.37-rc4 (openSUSE > >> KOTD kernel) I easily encount

Re: hunt for 2.6.37 dm-crypt+ext4 corruption? (was: Re: dm-crypt barrier support is effective)

2010-12-07 Thread Jon Nelson
On Tue, Dec 7, 2010 at 12:22 PM, Mike Snitzer wrote: > On Tue, Dec 07 2010 at  1:10pm -0500, > Jon Nelson wrote: > >> I finally found some time to test this out. With 2.6.37-rc4 (openSUSE >> KOTD kernel) I easily encounter the issue. >> >> Using a virtual machine, I created a stock, minimal openS

Re: hunt for 2.6.37 dm-crypt+ext4 corruption? (was: Re: dm-crypt barrier support is effective)

2010-12-07 Thread Mike Snitzer
On Tue, Dec 07 2010 at 1:10pm -0500, Jon Nelson wrote: > I finally found some time to test this out. With 2.6.37-rc4 (openSUSE > KOTD kernel) I easily encounter the issue. > > Using a virtual machine, I created a stock, minimal openSUSE 11.3 x86_64 > install, installed all updates, installed po

Re: hunt for 2.6.37 dm-crypt+ext4 corruption? (was: Re: dm-crypt barrier support is effective)

2010-12-07 Thread Chris Mason
Excerpts from Jon Nelson's message of 2010-12-07 13:10:49 -0500: > I finally found some time to test this out. With 2.6.37-rc4 (openSUSE > KOTD kernel) I easily encounter the issue. > > Using a virtual machine, I created a stock, minimal openSUSE 11.3 x86_64 > install, installed all updates, insta

Re: hunt for 2.6.37 dm-crypt+ext4 corruption? (was: Re: dm-crypt barrier support is effective)

2010-12-07 Thread Jon Nelson
I finally found some time to test this out. With 2.6.37-rc4 (openSUSE KOTD kernel) I easily encounter the issue. Using a virtual machine, I created a stock, minimal openSUSE 11.3 x86_64 install, installed all updates, installed postgresql and the 'KOTD' (Kernel of the Day) kernel, and ran the foll

Re: hunt for 2.6.37 dm-crypt+ext4 corruption? (was: Re: dm-crypt barrier support is effective)

2010-12-07 Thread Chris Mason
On Sun, Dec 05, 2010 at 12:47:11AM +0100, Matt wrote: > > OK. > > meanwhile I think I got some interesting news: > > after some time of running (around 1 to 1.5 hours) I noticed the > following BUG with ext4: > > [ 4421.503477] ------------[ cut here ]------------ > [ 4421.503482] kernel BUG at

Re: hunt for 2.6.37 dm-crypt+ext4 corruption? (was: Re: dm-crypt barrier support is effective)

2010-12-05 Thread Heinz Diehl
On 05.12.2010, Matt wrote: > I should have made it clear that the results I get are observed when > using the kernels/checkouts *with* the dm-crypt multi-cpu patch, > without the patch I didn't see that kind of problems (hardlocks, files > missing, etc.) I have to take back my other two emails,

Re: hunt for 2.6.37 dm-crypt+ext4 corruption? (was: Re: dm-crypt barrier support is effective)

2010-12-04 Thread Matt
On Sat, Dec 4, 2010 at 8:38 PM, Mike Snitzer wrote: > On Sat, Dec 04 2010 at  2:18pm -0500, > Matt wrote: > >> On Wed, Dec 1, 2010 at 10:23 PM, Mike Snitzer wrote: >> > Matt and Jon, >> > >> > If you'd be up to it: could you try testing your dm-crypt+ext4 >> > corruption reproducers against the

Re: hunt for 2.6.37 dm-crypt+ext4 corruption? (was: Re: dm-crypt barrier support is effective)

2010-12-04 Thread Matt
On Sat, Dec 4, 2010 at 8:38 PM, Mike Snitzer wrote: > On Sat, Dec 04 2010 at  2:18pm -0500, > Matt wrote: > >> On Wed, Dec 1, 2010 at 10:23 PM, Mike Snitzer wrote: >> > Matt and Jon, >> > >> > If you'd be up to it: could you try testing your dm-crypt+ext4 >> > corruption reproducers against the

Re: hunt for 2.6.37 dm-crypt+ext4 corruption? (was: Re: dm-crypt barrier support is effective)

2010-12-04 Thread Heinz Diehl
On 04.12.2010, Matt wrote: > I'm not sure if it's even a problem with ext4 - I haven't had the time > to test with XFS yet I can and have run both -rc3 and -rc4 with Milan's patch v6, without any problems at all, under heavy load and disk I/O. I'm using XFS exclusively. The system is a testing s

Re: hunt for 2.6.37 dm-crypt+ext4 corruption? (was: Re: dm-crypt barrier support is effective)

2010-12-04 Thread Mike Snitzer
On Sat, Dec 04 2010 at 2:18pm -0500, Matt wrote: > On Wed, Dec 1, 2010 at 10:23 PM, Mike Snitzer wrote: > > Matt and Jon, > > > > If you'd be up to it: could you try testing your dm-crypt+ext4 > > corruption reproducers against the following two 2.6.37-rc commits: > > > > 1) 1de3e3df917459422cb

Re: hunt for 2.6.37 dm-crypt+ext4 corruption? (was: Re: dm-crypt barrier support is effective)

2010-12-04 Thread Matt
On Wed, Dec 1, 2010 at 10:23 PM, Mike Snitzer wrote: > On Wed, Dec 01 2010 at  3:45pm -0500, > Milan Broz wrote: > >> >> On 12/01/2010 08:34 PM, Jon Nelson wrote: >> > Perhaps this is useful: for myself, I found that when I started using >> > 2.6.37rc3 that postgresql started having a *lot* of p

Re: hunt for 2.6.37 dm-crypt+ext4 corruption? (was: Re: dm-crypt barrier support is effective)

2010-12-02 Thread Matt
On Wed, Dec 1, 2010 at 10:23 PM, Mike Snitzer wrote: > On Wed, Dec 01 2010 at  3:45pm -0500, > Milan Broz wrote: > >> >> On 12/01/2010 08:34 PM, Jon Nelson wrote: >> > Perhaps this is useful: for myself, I found that when I started using >> > 2.6.37rc3 that postgresql started having a *lot* of p

hunt for 2.6.37 dm-crypt+ext4 corruption? (was: Re: dm-crypt barrier support is effective)

2010-12-01 Thread Mike Snitzer
On Wed, Dec 01 2010 at 3:45pm -0500, Milan Broz wrote: > > On 12/01/2010 08:34 PM, Jon Nelson wrote: > > Perhaps this is useful: for myself, I found that when I started using > > 2.6.37rc3 that postgresql started having a *lot* of problems with > > corruption. Specifically, I noted zeroed page

Re: dm-crypt barrier support is effective

2010-12-01 Thread Milan Broz
On 12/01/2010 08:34 PM, Jon Nelson wrote: > Perhaps this is useful: for myself, I found that when I started using > 2.6.37rc3 that postgresql started having a *lot* of problems with > corruption. Specifically, I noted zeroed pages, corruption in headers, > all sorts of stuff on /newly created/ ta

Re: dm-crypt barrier support is effective

2010-12-01 Thread Heinz Diehl
On 01.12.2010, Milan Broz wrote: > Anyway, I run several tests on 2.6.37-rc3+ and see no integrity > problems (using xfs,ext3 and ext4 over dmcrypt). Not that this might help, but just for testing purposes, I have run all the -rcX from 2.6.36 on with Milan's patch (XFS filesystem) under heavy

Re: dm-crypt barrier support is effective

2010-12-01 Thread Jon Nelson
On Wed, Dec 1, 2010 at 12:24 PM, Milan Broz wrote: > On 12/01/2010 06:35 PM, Matt wrote: >> Thanks for pointing to v6 ! I hadn't noticed that there was a new one :) >> >> Well, so I'll restore my box to a working/productive state and will >> try out v6 (I'm pretty confident that it'll work without

Re: dm-crypt barrier support is effective

2010-12-01 Thread Milan Broz
On 12/01/2010 06:35 PM, Matt wrote: > Thanks for pointing to v6 ! I hadn't noticed that there was a new one :) > > Well, so I'll restore my box to a working/productive state and will > try out v6 (I'm pretty confident that it'll work without problems). It's the same as previous, just with fixed h

Re: dm-crypt barrier support is effective

2010-12-01 Thread Matt
On Wed, Dec 1, 2010 at 5:52 PM, Mike Snitzer wrote: > On Wed, Dec 01 2010 at 11:05am -0500, > Matt wrote: > >> On Mon, Nov 15, 2010 at 12:24 AM, Matt wrote: >> > On Sun, Nov 14, 2010 at 10:54 PM, Milan Broz wrote: >> >> On 11/14/2010 10:49 PM, Matt wrote: >> >>> only with the dm-crypt scaling p

Re: dm-crypt barrier support is effective

2010-12-01 Thread Mike Snitzer
On Wed, Dec 01 2010 at 11:05am -0500, Matt wrote: > On Mon, Nov 15, 2010 at 12:24 AM, Matt wrote: > > On Sun, Nov 14, 2010 at 10:54 PM, Milan Broz wrote: > >> On 11/14/2010 10:49 PM, Matt wrote: > >>> only with the dm-crypt scaling patch I could observe the data-corruption > >> > >> even with v

Re: dm-crypt barrier support is effective

2010-12-01 Thread Matt
On Mon, Nov 15, 2010 at 12:24 AM, Matt wrote: > On Sun, Nov 14, 2010 at 10:54 PM, Milan Broz wrote: >> On 11/14/2010 10:49 PM, Matt wrote: >>> only with the dm-crypt scaling patch I could observe the data-corruption >> >> even with v5 I sent on Friday? >> >> Are you sure that it is not related to

Re: dm-crypt barrier support is effective

2010-11-15 Thread Milan Broz
On 11/15/2010 08:25 AM, Heinz Diehl wrote: > On 15.11.2010, Milan Broz wrote: > > drivers/md/dm-crypt.c: In function `crypt_ctr': > drivers/md/dm-crypt.c:1408: error: `WQ_MEM_RECLAIM' undeclared (first use in > this function) > drivers/md/dm-crypt.c:1408: error: (Each undeclared identifier is repo

Re: dm-crypt barrier support is effective

2010-11-14 Thread Heinz Diehl
On 15.11.2010, Milan Broz wrote: > even with v5 I sent on Friday? Your v5 patch applies cleanly to 2.6.36, but fails to build on my system: [] LD fs/xfs/xfs.o LD fs/xfs/built-in.o CC fs/compat_ioctl.o drivers/md/dm-crypt.c: In function `crypt_ctr': drivers/md/dm-crypt.c:1408:

Re: dm-crypt barrier support is effective

2010-11-14 Thread Matt
On Sun, Nov 14, 2010 at 10:54 PM, Milan Broz wrote: > On 11/14/2010 10:49 PM, Matt wrote: >> only with the dm-crypt scaling patch I could observe the data-corruption > > even with v5 I sent on Friday? > > Are you sure that it is not related to some fs problem in 2.6.37-rc1? > > If it works on 2.6.

Re: dm-crypt barrier support is effective

2010-11-14 Thread Milan Broz
On 11/14/2010 10:49 PM, Matt wrote: > only with the dm-crypt scaling patch I could observe the data-corruption even with v5 I sent on Friday? Are you sure that it is not related to some fs problem in 2.6.37-rc1? If it works on 2.6.36 without problems, it is probably problems somewhere else (flus

Re: dm-crypt barrier support is effective (was: Re: DM-CRYPT: Scale to multiple CPUs v3 on 2.6.37-rc* ?)

2010-11-14 Thread Matt
On Sun, Nov 14, 2010 at 9:59 PM, Mike Snitzer wrote: > On Mon, Nov 08 2010 at 12:59pm -0500, > Chris Mason wrote: > >> Excerpts from Mike Snitzer's message of 2010-11-08 09:58:09 -0500: >> > On Sun, Nov 07 2010 at  6:05pm -0500, >> > Andi Kleen wrote: >> > >> > > On Sun, Nov 07, 2010 at 10:39:23

dm-crypt barrier support is effective (was: Re: DM-CRYPT: Scale to multiple CPUs v3 on 2.6.37-rc* ?)

2010-11-14 Thread Mike Snitzer
On Mon, Nov 08 2010 at 12:59pm -0500, Chris Mason wrote: > Excerpts from Mike Snitzer's message of 2010-11-08 09:58:09 -0500: > > On Sun, Nov 07 2010 at 6:05pm -0500, > > Andi Kleen wrote: > > > > > On Sun, Nov 07, 2010 at 10:39:23PM +0100, Milan Broz wrote: > > > > On 11/07/2010 08:45 PM, And