On Thu, Jan 6, 2011 at 4:56 PM, Heinz Diehl h...@fancy-poultry.org wrote:
On 05.12.2010, Milan Broz wrote:
It still seems like dmcrypt with its parallel processing is just
a trigger for another bug in 37-rc.
To come back to this: my 3 systems (XFS filesystem) running the latest
On 05.12.2010, Milan Broz wrote:
It still seems like dmcrypt with its parallel processing is just
a trigger for another bug in 37-rc.
To come back to this: my 3 systems (XFS filesystem) running the latest
dm-crypt-scale-to-multiple-cpus patch from Andi Kleen/Milan Broz have
not shown a
Excerpts from Dave Chinner's message of 2010-12-15 22:37:18 -0500:
On Wed, Dec 08, 2010 at 07:20:24AM -0500, Chris Mason wrote:
Usually the trick to reproducing filesystem corruptions is adding memory
pressure. The corruption is probably a bad interaction between reads
and writes, and
On Mon, Dec 13, 2010 at 7:56 PM, Jon Nelson jnel...@jamponi.net wrote:
On Sun, Dec 12, 2010 at 8:06 PM, Ted Ts'o ty...@mit.edu wrote:
On Sun, Dec 12, 2010 at 07:11:28AM -0600, Jon Nelson wrote:
I'm glad you've been able to reproduce the problem! If you should need
any further assistance,
I have a question though: would the deactivation of multiple page-io
submission support most likely only affect bigger systems, or
also desktop systems (like mine)?
I think this is not a final fix, just a workaround.
The problem with the other path still really needs to be tracked down.
-Andi
On Wed, Dec 15, 2010 at 8:16 PM, Andi Kleen a...@firstfloor.org wrote:
I have a question though: would the deactivation of multiple page-io
submission support most likely only affect bigger systems, or
also desktop systems (like mine)?
I think this is not a final fix, just a workaround.
The
On Wed, Dec 15, 2010 at 8:25 PM, Matt jackdac...@gmail.com wrote:
On Wed, Dec 15, 2010 at 8:16 PM, Andi Kleen a...@firstfloor.org wrote:
I have a question though: would the deactivation of multiple page-io
submission support most likely only affect bigger systems, or
also desktop systems (like
On Sun, Dec 12, 2010 at 8:06 PM, Ted Ts'o ty...@mit.edu wrote:
On Sun, Dec 12, 2010 at 07:11:28AM -0600, Jon Nelson wrote:
I'm glad you've been able to reproduce the problem! If you should need
any further assistance, please do not hesitate to ask.
This patch seems to fix the problem for me.
On Sat, Dec 11, 2010 at 9:16 PM, Jon Nelson jnel...@jamponi.net wrote:
On Sat, Dec 11, 2010 at 7:40 PM, Ted Ts'o ty...@mit.edu wrote:
Yes, indeed. Is this in the virtualized environment or on real
hardware at this point? And how many CPUs do you have configured in
your virtualized
On Sun, Dec 12, 2010 at 04:18:29AM -0600, Jon Nelson wrote:
I have one CPU configured in the environment, 512MB of memory.
I have not done any memory-constriction tests whatsoever.
I've finally been able to reproduce it myself, on real hardware. SMP
is not necessary to reproduce it, although
On Sun, Dec 12, 2010 at 6:43 AM, Ted Ts'o ty...@mit.edu wrote:
On Sun, Dec 12, 2010 at 04:18:29AM -0600, Jon Nelson wrote:
I have one CPU configured in the environment, 512MB of memory.
I have not done any memory-constriction tests whatsoever.
I've finally been able to reproduce it myself,
On Fri, Dec 10, 2010 at 08:14:56PM -0600, Jon Nelson wrote:
Barring false negatives, bd2d0210cf22f2bd0cef72eb97cf94fc7d31d8cc
appears to be the culprit (according to git bisect).
I will test bd2d0210cf22f2bd0cef72eb97cf94fc7d31d8cc again, confirm
the behavior, and work backwards to try to
One experiment --- can you try this with the file system mounted with
data=writeback, and see if the problem reproduces in that journalling
mode?
I want to rule out (if possible) journal_submit_inode_data_buffers()
racing with mpage_da_submit_io(). I don't think that's the issue, but
I'd prefer
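Before running the data=writeback experiment it can help to confirm which journalling mode each ext4 volume is actually mounted with. The following is a minimal sketch, not part of the thread: it parses text in the /proc/mounts layout (device, mount point, filesystem type, options) and reports the data= option. The function names are hypothetical, and the fallback to "ordered" assumes the ext4 default applies when no data= option is listed.

```python
from typing import Dict


def data_mode(options: str, default: str = "ordered") -> str:
    """Return the value of the data= mount option.

    Assumption: when no data= option is listed, the ext4 default
    (data=ordered) is in effect.
    """
    for opt in options.split(","):
        if opt.startswith("data="):
            return opt.split("=", 1)[1]
    return default


def ext4_data_modes(mounts_text: str) -> Dict[str, str]:
    """Map each ext4 mount point to its journalling mode.

    mounts_text is expected to look like /proc/mounts:
    device, mount point, fs type, options, dump, pass.
    """
    modes = {}
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) >= 4 and fields[2] == "ext4":
            modes[fields[1]] = data_mode(fields[3])
    return modes
```

On a live system the input could be obtained with `open("/proc/mounts").read()`; a volume remounted for the experiment should then report `writeback`.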
On Sat, Dec 11, 2010 at 7:40 PM, Ted Ts'o ty...@mit.edu wrote:
Yes, indeed. Is this in the virtualized environment or on real
hardware at this point? And how many CPUs do you have configured in
your virtualized environment, and how much memory? Is having a
certain number of CPUs
On Fri, Dec 10, 2010 at 12:52 AM, Jon Nelson jnel...@jamponi.net wrote:
On Thu, Dec 9, 2010 at 8:38 PM, Ted Ts'o ty...@mit.edu wrote:
On Fri, Dec 10, 2010 at 02:53:30AM +0100, Matt wrote:
Try a kernel before 5a87b7a5da250c9be6d757758425dfeaf8ed3179
from the tests I've done that one showed
On Fri, Dec 10, 2010 at 8:58 AM, Jon Nelson jnel...@jamponi.net wrote:
On Fri, Dec 10, 2010 at 12:52 AM, Jon Nelson jnel...@jamponi.net wrote:
On Thu, Dec 9, 2010 at 8:38 PM, Ted Ts'o ty...@mit.edu wrote:
On Fri, Dec 10, 2010 at 02:53:30AM +0100, Matt wrote:
Try a kernel before
On Fri, Dec 10, 2010 at 10:54 AM, Jon Nelson jnel...@jamponi.net wrote:
On Fri, Dec 10, 2010 at 8:58 AM, Jon Nelson jnel...@jamponi.net wrote:
On Fri, Dec 10, 2010 at 12:52 AM, Jon Nelson jnel...@jamponi.net wrote:
On Thu, Dec 9, 2010 at 8:38 PM, Ted Ts'o ty...@mit.edu wrote:
On Fri, Dec 10,
On Tue, Dec 07, 2010 at 09:37:20PM -0600, Jon Nelson wrote:
One difference is the location of the transaction logs (pg_xlog). In
my case, /var/lib/pgsql/data *is* the mountpoint for the test volume
(actually, it's a symlink to the mount point). In your case, that is
not so. Perhaps that makes a
On Thu, Dec 09, 2010 at 12:10:58PM -0600, Jon Nelson wrote:
You should be OK, there. Are you using encryption or no?
I had difficulty replicating the issue without encryption.
Yes, I'm using encryption. LUKS with aes-xts-plain-sha256, and then
LVM on top of LUKS.
If you can point out how
On Thu, Dec 9, 2010 at 2:13 PM, Ted Ts'o ty...@mit.edu wrote:
On Thu, Dec 09, 2010 at 12:10:58PM -0600, Jon Nelson wrote:
You should be OK, there. Are you using encryption or no?
I had difficulty replicating the issue without encryption.
Yes, I'm using encryption. LUKS with
512MB.
'free' reports 75MB, 419MB free.
I originally noticed the problem on really real hardware (thinkpad
T61p), however.
If you can easily reproduce it could you try a git bisect?
-Andi
--
a...@linux.intel.com -- Speaking for myself only.
Excerpts from Andi Kleen's message of 2010-12-09 18:16:16 -0500:
512MB.
'free' reports 75MB, 419MB free.
I originally noticed the problem on really real hardware (thinkpad
T61p), however.
If you can easily reproduce it could you try a git bisect?
Do we have a known good kernel?
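The question of a known good kernel matters because a bisect is, in effect, a binary search between one known good and one known bad revision. The sketch below illustrates that idea only; it is not the git implementation. The commit names and the `run_once` test are hypothetical, and the retry wrapper is an assumption added because the corruption is reported elsewhere in this thread as intermittent, so a single clean run could be a false negative.

```python
from typing import Callable, Sequence


def first_bad(commits: Sequence[str], is_bad: Callable[[str], bool]) -> str:
    """Binary search for the first bad commit.

    commits is ordered oldest to newest; commits[0] is assumed to be
    known good and commits[-1] known bad.
    """
    lo, hi = 0, len(commits) - 1  # lo: known good, hi: known bad
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if is_bad(commits[mid]):
            hi = mid
        else:
            lo = mid
    return commits[hi]


def with_retries(run_once: Callable[[str], bool],
                 attempts: int) -> Callable[[str], bool]:
    """Treat a commit as bad if any of several test runs shows the problem."""
    def is_bad(commit: str) -> bool:
        return any(run_once(commit) for _ in range(attempts))
    return is_bad
```

The more intermittent the failure, the more attempts are needed per step before a commit can reasonably be marked good.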
On Fri, Dec 10, 2010 at 2:38 AM, Chris Mason chris.ma...@oracle.com wrote:
Excerpts from Andi Kleen's message of 2010-12-09 18:16:16 -0500:
512MB.
'free' reports 75MB, 419MB free.
I originally noticed the problem on really real hardware (thinkpad
T61p), however.
If you can easily
On Thu, Dec 9, 2010 at 5:38 PM, Chris Mason chris.ma...@oracle.com wrote:
Excerpts from Andi Kleen's message of 2010-12-09 18:16:16 -0500:
512MB.
'free' reports 75MB, 419MB free.
I originally noticed the problem on really real hardware (thinkpad
T61p), however.
If you can easily
Excerpts from Mike Fedyk's message of 2010-12-09 20:58:40 -0500:
On Thu, Dec 9, 2010 at 5:38 PM, Chris Mason chris.ma...@oracle.com wrote:
Excerpts from Andi Kleen's message of 2010-12-09 18:16:16 -0500:
512MB.
'free' reports 75MB, 419MB free.
I originally noticed the problem on
On Fri, Dec 10, 2010 at 02:53:30AM +0100, Matt wrote:
Try a kernel before 5a87b7a5da250c9be6d757758425dfeaf8ed3179
from the tests I've done that one showed the least or no corruption if
you count the empty /etc/env.d/03opengl as an artefact
Yes, that's a good test. Also try commit
On Thu, Dec 9, 2010 at 8:38 PM, Ted Ts'o ty...@mit.edu wrote:
On Fri, Dec 10, 2010 at 02:53:30AM +0100, Matt wrote:
Try a kernel before 5a87b7a5da250c9be6d757758425dfeaf8ed3179
from the tests I've done that one showed the least or no corruption if
you count the empty /etc/env.d/03opengl as
On 12/08/2010 04:29 AM, Jon Nelson wrote:
Maybe not so fantastic. I kept testing and had no more failures. At
all. After 40+ iterations I gave up.
I went back to trying ext4 on a LUKS volume. The 'hit' ratio went to
something like 1 in 3, or better.
Encryption usually propagates bit
Excerpts from Jon Nelson's message of 2010-12-07 22:29:26 -0500:
On Tue, Dec 7, 2010 at 3:02 PM, Chris Mason chris.ma...@oracle.com wrote:
Excerpts from Jon Nelson's message of 2010-12-07 15:48:58 -0500:
On Tue, Dec 7, 2010 at 2:41 PM, Chris Mason chris.ma...@oracle.com wrote:
Excerpts
On Tue, Dec 7, 2010 at 9:37 PM, Jon Nelson jnel...@jamponi.net wrote:
On Tue, Dec 7, 2010 at 1:35 PM, Ted Ts'o ty...@mit.edu wrote:
On Tue, Dec 07, 2010 at 01:22:43PM -0500, Mike Snitzer wrote:
1. create a database (from bash):
createdb test
2. place the following contents in a file (I
I finally found some time to test this out. With 2.6.37-rc4 (openSUSE
KOTD kernel) I easily encounter the issue.
Using a virtual machine, I created a stock, minimal openSUSE 11.3 x86_64
install, installed all updates, installed postgresql and the 'KOTD'
(Kernel of the Day) kernel, and ran the
On Tue, Dec 07 2010 at 1:10pm -0500,
Jon Nelson jnel...@jamponi.net wrote:
I finally found some time to test this out. With 2.6.37-rc4 (openSUSE
KOTD kernel) I easily encounter the issue.
Using a virtual machine, I created a stock, minimal openSUSE 11.3 x86_64
install, installed all
On Tue, Dec 7, 2010 at 12:22 PM, Mike Snitzer snit...@redhat.com wrote:
On Tue, Dec 07 2010 at 1:10pm -0500,
Jon Nelson jnel...@jamponi.net wrote:
I finally found some time to test this out. With 2.6.37-rc4 (openSUSE
KOTD kernel) I easily encounter the issue.
Using a virtual machine, I
Excerpts from Jon Nelson's message of 2010-12-07 13:45:14 -0500:
On Tue, Dec 7, 2010 at 12:22 PM, Mike Snitzer snit...@redhat.com wrote:
On Tue, Dec 07 2010 at 1:10pm -0500,
Jon Nelson jnel...@jamponi.net wrote:
I finally found some time to test this out. With 2.6.37-rc4 (openSUSE
KOTD
On Tue, Dec 7, 2010 at 12:52 PM, Chris Mason chris.ma...@oracle.com wrote:
Excerpts from Jon Nelson's message of 2010-12-07 13:45:14 -0500:
On Tue, Dec 7, 2010 at 12:22 PM, Mike Snitzer snit...@redhat.com wrote:
On Tue, Dec 07 2010 at 1:10pm -0500,
Jon Nelson jnel...@jamponi.net wrote:
On Tue, Dec 07, 2010 at 01:22:43PM -0500, Mike Snitzer wrote:
1. create a database (from bash):
createdb test
2. place the following contents in a file (I used 't.sql'):
begin;
create temporary table foo as select x as a, ARRAY[x] as b FROM
generate_series(1, 1000) AS x;
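The steps quoted above could be scripted roughly as follows. This is a sketch under stated assumptions: only the SQL statements visible in the excerpt are included (the rest of the original script is not shown and is not guessed at), the helper names are hypothetical, and `createdb` and `psql` are assumed to be on the PATH with access to a running PostgreSQL server.

```python
import subprocess
from pathlib import Path
from typing import List

# Only the statements visible in the quoted message. The remainder of
# the original t.sql is not shown in the excerpt and is left out here.
VISIBLE_SQL = """\
begin;
create temporary table foo as select x as a, ARRAY[x] as b FROM
generate_series(1, 1000) AS x;
"""


def write_sql(workdir: Path, name: str = "t.sql") -> Path:
    """Write the visible part of the script to workdir/name."""
    path = workdir / name
    path.write_text(VISIBLE_SQL)
    return path


def repro_commands(sql_file: Path, dbname: str = "test") -> List[List[str]]:
    """Commands for step 1 (create the database) and for running the script."""
    return [
        ["createdb", dbname],
        ["psql", "-v", "ON_ERROR_STOP=1", "-d", dbname, "-f", str(sql_file)],
    ]


def run_repro(workdir: Path, iterations: int = 1) -> int:
    """Run the script repeatedly and return how many runs failed."""
    create, run_script = repro_commands(write_sql(workdir))
    subprocess.run(create, check=False)  # the database may already exist
    failures = 0
    for _ in range(iterations):
        if subprocess.run(run_script).returncode != 0:
            failures += 1
    return failures
```

Calling `run_repro(Path("."), iterations=10)` would repeat the script ten times; because the problem is described as intermittent, several iterations are likely needed before drawing any conclusion.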
Excerpts from Jon Nelson's message of 2010-12-07 14:34:40 -0500:
On Tue, Dec 7, 2010 at 12:52 PM, Chris Mason chris.ma...@oracle.com wrote:
postgresql errors. Typically, header corruption but from the limited
visibility I've had into this via strace, what I see is zeroed pages
where there
On Tue, Dec 7, 2010 at 2:02 PM, Chris Mason chris.ma...@oracle.com wrote:
Excerpts from Jon Nelson's message of 2010-12-07 14:34:40 -0500:
On Tue, Dec 7, 2010 at 12:52 PM, Chris Mason chris.ma...@oracle.com wrote:
postgresql errors. Typically, header corruption but from the limited
Excerpts from Jon Nelson's message of 2010-12-07 15:25:47 -0500:
On Tue, Dec 7, 2010 at 2:02 PM, Chris Mason chris.ma...@oracle.com wrote:
Excerpts from Jon Nelson's message of 2010-12-07 14:34:40 -0500:
On Tue, Dec 7, 2010 at 12:52 PM, Chris Mason chris.ma...@oracle.com wrote:
On Tue, Dec 7, 2010 at 2:33 PM, Chris Mason chris.ma...@oracle.com wrote:
Excerpts from Jon Nelson's message of 2010-12-07 15:25:47 -0500:
On Tue, Dec 7, 2010 at 2:02 PM, Chris Mason chris.ma...@oracle.com wrote:
Excerpts from Jon Nelson's message of 2010-12-07 14:34:40 -0500:
On Tue, Dec 7,
Excerpts from Jon Nelson's message of 2010-12-07 15:25:47 -0500:
On Tue, Dec 7, 2010 at 2:02 PM, Chris Mason chris.ma...@oracle.com wrote:
Excerpts from Jon Nelson's message of 2010-12-07 14:34:40 -0500:
On Tue, Dec 7, 2010 at 12:52 PM, Chris Mason chris.ma...@oracle.com wrote:
On Tue, Dec 7, 2010 at 2:41 PM, Chris Mason chris.ma...@oracle.com wrote:
Excerpts from Jon Nelson's message of 2010-12-07 15:25:47 -0500:
On Tue, Dec 7, 2010 at 2:02 PM, Chris Mason chris.ma...@oracle.com wrote:
Excerpts from Jon Nelson's message of 2010-12-07 14:34:40 -0500:
On Tue, Dec 7,
On Tue, Dec 7, 2010 at 1:35 PM, Ted Ts'o ty...@mit.edu wrote:
On Tue, Dec 07, 2010 at 01:22:43PM -0500, Mike Snitzer wrote:
1. create a database (from bash):
createdb test
2. place the following contents in a file (I used 't.sql'):
begin;
create temporary table foo as select x as
Excerpts from Jon Nelson's message of 2010-12-07 15:48:58 -0500:
On Tue, Dec 7, 2010 at 2:41 PM, Chris Mason chris.ma...@oracle.com wrote:
Excerpts from Jon Nelson's message of 2010-12-07 15:25:47 -0500:
On Tue, Dec 7, 2010 at 2:02 PM, Chris Mason chris.ma...@oracle.com wrote:
Excerpts
On Tue, Dec 7, 2010 at 3:02 PM, Chris Mason chris.ma...@oracle.com wrote:
Excerpts from Jon Nelson's message of 2010-12-07 15:48:58 -0500:
On Tue, Dec 7, 2010 at 2:41 PM, Chris Mason chris.ma...@oracle.com wrote:
Excerpts from Jon Nelson's message of 2010-12-07 15:25:47 -0500:
On Tue, Dec 7,
On Tue, Dec 7, 2010 at 1:35 PM, Ted Ts'o ty...@mit.edu wrote:
On Tue, Dec 07, 2010 at 01:22:43PM -0500, Mike Snitzer wrote:
1. create a database (from bash):
createdb test
2. place the following contents in a file (I used 't.sql'):
begin;
create temporary table foo as select x as
On Tue, Dec 7, 2010 at 2:41 PM, Chris Mason chris.ma...@oracle.com wrote:
Excerpts from Jon Nelson's message of 2010-12-07 15:25:47 -0500:
On Tue, Dec 7, 2010 at 2:02 PM, Chris Mason chris.ma...@oracle.com wrote:
Excerpts from Jon Nelson's message of 2010-12-07 14:34:40 -0500:
On Tue, Dec 7,
On 05.12.2010, Matt wrote:
I should have made it clear that the results I get are observed when
using the kernels/checkouts *with* the dm-crypt multi-cpu patch,
without the patch I didn't see those kinds of problems (hardlocks, files
missing, etc.)
I have to take back my other two emails,
On 12/05/2010 11:09 AM, Heinz Diehl wrote:
On 05.12.2010, Matt wrote:
I have to take back my other two emails, stating that no corruption
happened with the dm-crypt multi-cpu patch. Today, I encountered
filesystem corruption on one, and a complete hardlock on another machine.
No logfile
On 05.12.2010, Milan Broz wrote:
Which kernel? 2.6.37-rc?
2.6.37-rc4 on one and 2.6.37-rc3-git2 on the other machine.
Anyone seen this with 2.6.36 and the same dmcrypt patch?
(All info I had is that it is stable here.)
Both 2.6.36 and 2.6.36.1 with your patch have been running
On Dec 5, 2010, at 5:21 AM, Milan Broz wrote:
Which kernel? 2.6.37-rc?
Anyone seen this with 2.6.36 and the same dmcrypt patch?
(All info I had is that it is stable here.)
It still seems like dmcrypt with its parallel processing is just
a trigger for another bug in 37-rc.
I've
On Sun, Dec 5, 2010 at 2:24 PM, Theodore Tso ty...@mit.edu wrote:
On Dec 5, 2010, at 5:21 AM, Milan Broz wrote:
Which kernel? 2.6.37-rc?
Anyone seen this with 2.6.36 and the same dmcrypt patch?
(All info I had is that it is stable here.)
It still seems like dmcrypt with its
On Sun, Dec 05, 2010 at 02:44:14PM +0100, Matt wrote:
gcc version 4.5.1 (Gentoo Hardened 4.5.1-r1 p1.4, pie-0.4.5)
This is probably just me being paranoid, but it might be worth trying
using a gcc 4.4.x compiler and see if that makes any difference.
There have been some other gcc 4.5-caused
On 05.12.2010, Theodore Tso wrote:
As another thought, what version of GCC are people using who
are having difficulty? Could this perhaps be a compiler-related issue?
h...@liesel:~ gcc -v
Using built-in specs.
Target: x86_64-suse-linux
Configured with: ../configure --prefix=/usr
Hi Heinz,
On 5 December 2010 14:33, Heinz Diehl h...@fritha.org wrote:
On 05.12.2010, Theodore Tso wrote:
As another thought, what version of GCC are people using who
are having difficulty? Could this perhaps be a compiler-related issue?
h...@liesel:~ gcc -v
Using built-in specs.
Target:
I've been using a kernel which is between 2.6.37-rc2 and -rc3 with a LUKS /
dm-crypt / LVM / ext4 setup for my primary file systems, and I haven't
observed any corruption for the last two weeks or so. It's on my todo list
to upgrade to top of Linus's tree, but perhaps this is a useful
On Sun, Dec 05 2010 at 3:28pm -0500,
Andi Kleen a...@firstfloor.org wrote:
As another thought, what version of GCC are people using who are having
difficulty? Could this perhaps be a compiler-related issue?
A compiler problem seems very unlikely here.
What may be a useful
On Sun, 05 Dec 2010 08:24:32 EST, Theodore Tso said:
I've been using a kernel which is between 2.6.37-rc2 and -rc3 with a LUKS /
dm-crypt / LVM / ext4 setup for my primary file systems, and I haven't
observed any corruption for the last two weeks or so.
Pretty much exactly the same setup
On 06.12.2010, Daniel J Blueman wrote:
A bit late to the party, but does memtest86 pass over multiple iterations?
Yes, it does. This machine has not had a single fault in several years; it's
absolutely rock-stable. These freezes/corruptions are the first ones ever,
and they vanish when I go down
On Wed, Dec 1, 2010 at 10:23 PM, Mike Snitzer snit...@redhat.com wrote:
On Wed, Dec 01 2010 at 3:45pm -0500,
Milan Broz mb...@redhat.com wrote:
On 12/01/2010 08:34 PM, Jon Nelson wrote:
Perhaps this is useful: for myself, I found that when I started using
2.6.37rc3 that postgresql
On Sat, Dec 04 2010 at 2:18pm -0500,
Matt jackdac...@gmail.com wrote:
On Wed, Dec 1, 2010 at 10:23 PM, Mike Snitzer snit...@redhat.com wrote:
Matt and Jon,
If you'd be up to it: could you try testing your dm-crypt+ext4
corruption reproducers against the following two 2.6.37-rc commits:
On 04.12.2010, Matt wrote:
I'm not sure if it's even a problem with ext4 - I haven't had the time
to test with XFS yet
I can and have run both -rc3 and -rc4 with Milan's patch v6, without any
problems at all, under heavy load and disk I/O. I'm using XFS exclusively. The system
is a testing
On Sat, Dec 4, 2010 at 8:38 PM, Mike Snitzer snit...@redhat.com wrote:
On Sat, Dec 04 2010 at 2:18pm -0500,
Matt jackdac...@gmail.com wrote:
On Wed, Dec 1, 2010 at 10:23 PM, Mike Snitzer snit...@redhat.com wrote:
Matt and Jon,
If you'd be up to it: could you try testing your
On Sat, Dec 4, 2010 at 8:38 PM, Mike Snitzer snit...@redhat.com wrote:
On Sat, Dec 04 2010 at 2:18pm -0500,
Matt jackdac...@gmail.com wrote:
On Wed, Dec 1, 2010 at 10:23 PM, Mike Snitzer snit...@redhat.com wrote:
Matt and Jon,
If you'd be up to it: could you try testing your
On Wed, Dec 1, 2010 at 10:23 PM, Mike Snitzer snit...@redhat.com wrote:
On Wed, Dec 01 2010 at 3:45pm -0500,
Milan Broz mb...@redhat.com wrote:
On 12/01/2010 08:34 PM, Jon Nelson wrote:
Perhaps this is useful: for myself, I found that when I started using
2.6.37rc3 that postgresql
On Wed, Dec 01 2010 at 3:45pm -0500,
Milan Broz mb...@redhat.com wrote:
On 12/01/2010 08:34 PM, Jon Nelson wrote:
Perhaps this is useful: for myself, I found that when I started using
2.6.37rc3 that postgresql started having a *lot* of problems with
corruption. Specifically, I noted
66 matches