Re: data corruption with nvidia chipsets and IDE/SATA drives // memory hole mapping related bug?!

2007-03-22 Thread Dan Halbert
I'd like to mention what might be a new twist on this problem. We are seeing the same kind of 4k-block data corruption on multiple Tyan dual-Opteron boards (S3870) with a ServerWorks chipset, not Nvidia. I wonder if it is really an Nvidia-specific issue. The Nvidia boards are a lot more popular,

Re: data corruption with nvidia chipsets and IDE/SATA drives // memory hole mapping related bug?!

2007-03-22 Thread Christoph Anton Mitterer
Hi folks. 1) Are there any new developments in this issue? Does someone know if AMD and Nvidia are still investigating? 2) Steve Langasek from Debian sent me a patch that disables the hw-iommu by default on Nvidia boards. I've attached it in the kernel bugzilla and asked for inclusion in the

Re: data corruption with nvidia chipsets and IDE/SATA drives // memory hole mapping related bug?!

2007-01-16 Thread Robert Hancock
Christoph Anton Mitterer wrote: Ok,.. that sounds reasonable,.. so the whole thing might (!) actually be a hardware design error,... but we just don't use that hardware any longer when accessing devices via sata_nv. So this doesn't solve our problem with PATA drives or other devices (although

Re: data corruption with nvidia chipsets and IDE/SATA drives // memory hole mapping related bug?!

2007-01-16 Thread Christoph Anton Mitterer
Robert Hancock wrote: >> What is that GART thing exactly? Is this the hardware IOMMU? I've always >> thought GART was something graphics card related,.. but if so,.. how >> could this solve our problem (that seems to occur mainly on harddisks)? >> > The GART built into the Athlon 64/Opteron

Re: data corruption with nvidia chipsets and IDE/SATA drives // memory hole mapping related bug?!

2007-01-15 Thread Robert Hancock
Christoph Anton Mitterer wrote: Sorry, as always I've forgotten some things... *g* Robert Hancock wrote: If this is related to some problem with using the GART IOMMU with memory hole remapping enabled What is that GART thing exactly? Is this the hardware IOMMU? I've always thought GART was

Re: data corruption with nvidia chipsets and IDE/SATA drives // memory hole mapping related bug?!

2007-01-15 Thread Christoph Anton Mitterer
Sorry, as always I've forgotten some things... *g* Robert Hancock wrote: > If this is related to some problem with using the GART IOMMU with memory > hole remapping enabled What is that GART thing exactly? Is this the hardware IOMMU? I've always thought GART was something graphics card related,..

Re: data corruption with nvidia chipsets and IDE/SATA drives // memory hole mapping related bug?!

2007-01-15 Thread Christoph Anton Mitterer
Hi everybody. Sorry again for my late reply... Robert gave us the following interesting information some days ago: Robert Hancock wrote: > If this is related to some problem with using the GART IOMMU with memory > hole remapping enabled, then 2.6.20-rc kernels may avoid this problem on >

Re: data corruption with nvidia chipsets and IDE/SATA drives // memory hole mapping related bug?!

2007-01-15 Thread Christoph Anton Mitterer
Hi. Some days ago I received the following message from "Sunny Days". I think he did not send it to lkml, so I am forwarding it now: Sunny Days wrote: > hello, > > i have done some extensive testing on this. > > various opterons, always single socket > various dimms 1 and 2gb modules > and hitachi+seagate

Re: data corruption with nvidia chipsets and IDE/SATA drives // memory hole mapping related bug?!

2007-01-04 Thread Christoph Anton Mitterer
Hi. Just for your information: I've put the issue into the kernel.org bugzilla. http://bugzilla.kernel.org/show_bug.cgi?id=7768 Chris.

Re: data corruption with nvidia chipsets and IDE/SATA drives // memory hole mapping related bug?!

2007-01-03 Thread Robert Hancock
Christoph Anton Mitterer wrote: Hi. Perhaps some of you have read my older two threads: http://marc.theaimsgroup.com/?t=11631244001=1=2 and the even older http://marc.theaimsgroup.com/?t=11629131451=1=2 The issue was basically the following: I found a severe bug mainly by fortune

Re: data corruption with nvidia chipsets and IDE/SATA drives // memory hole mapping related bug?!

2007-01-03 Thread Christoph Anton Mitterer
Hi everybody. After my last mails on this issue (btw: anything new in the meantime? I received no replies..) I wrote again to nvidia and AMD... This time with some more success. Below is the answer from Mr. Friedman to my mail. He says that he wasn't able to reproduce the problem and asks for a

Re: data corruption with nvidia chipsets and IDE/SATA drives // memory hole mapping related bug?!

2006-12-22 Thread Christoph Anton Mitterer
John A Chaves wrote: > I didn't need to run a specific test for this. The normal workload of the > machine approximates a continuous selftest for almost the last year. > > Large files (4-12GB is typical) are being continuously packed and unpacked > with gzip and bzip2. Statistical analysis of
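
For anyone who wants a quick approximation of the "continuous selftest" workload described above, here is a minimal sketch. It is not John's actual setup: the function name and file names are made up for illustration, and real tests in this thread used multi-gigabyte files, since small files may be served back from the page cache without ever touching the disk again.

```python
import bz2
import gzip
import hashlib
import os


def roundtrip_via_disk(data, directory):
    """Write data through gzip and bzip2 to files, read it back, compare SHA-256.

    Returns a dict mapping the compressor name to True when the restored
    bytes match the original. Hypothetical helper, not from the thread.
    """
    expected = hashlib.sha256(data).hexdigest()
    results = {}
    for name, opener in (("gzip", gzip.open), ("bzip2", bz2.open)):
        path = os.path.join(directory, "selftest." + name)
        with opener(path, "wb") as f:
            f.write(data)
        with opener(path, "rb") as f:
            restored = f.read()
        results[name] = hashlib.sha256(restored).hexdigest() == expected
    return results
```

Calling roundtrip_via_disk(os.urandom(1 << 20), "/tmp") in a loop gives a crude stress test; any False entry would mean the data changed somewhere between write and read.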

Re: data corruption with nvidia chipsets and IDE/SATA drives // memory hole mapping related bug?!

2006-12-22 Thread John A Chaves
On Friday 22 December 2006 20:04, Christoph Anton Mitterer wrote: > This brings me to: > Chris Wedgwood wrote: > > Does anyone have an amd64 with an nforce4 chipset and >4GB that does > > NOT have this problem? If so it might be worth chasing the BIOS > > vendors to see what errata they are

Re: data corruption with nvidia chipsets and IDE/SATA drives // memory hole mapping related bug?!

2006-12-22 Thread Christoph Anton Mitterer
Hi my friends. It has become a little quiet around this issue... any new ideas or results? Karsten Weiss wrote: > BTW: Did someone already open an official bug at > http://bugzilla.kernel.org ? Karsten, did you already file a bug? I described the whole issue to the Debian people who are

Re: data corruption with nvidia chipsets and IDE/SATA drives // memory hole mapping related bug?!

2006-12-15 Thread Paul Slootman
[EMAIL PROTECTED] wrote: >On Wed, Dec 13, 2006 at 09:11:29PM +0100, Christoph Anton Mitterer wrote: > >> - error in the Opteron (memory controller) >> - error in the Nvidia chipsets >> - error in the kernel > >My guess without further information would be that some, but not all >BIOSes are doing

Re: data corruption with nvidia chipsets and IDE/SATA drives // memory hole mapping related bug?!

2006-12-14 Thread Dax Kelson
On Sat, 2006-12-02 at 01:56 +0100, Christoph Anton Mitterer wrote: > Hi. > > Perhaps some of you have read my older two threads: > http://marc.theaimsgroup.com/?t=11631244001=1=2 and the even > older http://marc.theaimsgroup.com/?t=11629131451=1=2 > > The issue was basically the

Re: data corruption with nvidia chipsets and IDE/SATA drives // memory hole mapping related bug?!

2006-12-14 Thread Christoph Anton Mitterer
Muli Ben-Yehuda wrote: >> 4) >> And does someone know if the nforce/opteron iommu requires IBM Calgary >> IOMMU support? >> > It doesn't, Calgary isn't found in machines with Opteron CPUs or NForce > chipsets (AFAIK). However, compiling Calgary in should make no > difference, as we detect in

Re: [PATCH 2nd try] Re: data corruption with nvidia chipsets and IDE/SATA drives // memory hole mapping related bug?!

2006-12-14 Thread Muli Ben-Yehuda
On Thu, Dec 14, 2006 at 02:16:31PM +0100, Karsten Weiss wrote: > On Thu, 14 Dec 2006, Muli Ben-Yehuda wrote: > > > The rest looks good. Please resend and I'll add my Acked-by. > > Thanks a lot for your comments and suggestions. Here's my 2nd try: > > === > > From: Karsten Weiss <[EMAIL

[PATCH 2nd try] Re: data corruption with nvidia chipsets and IDE/SATA drives // memory hole mapping related bug?!

2006-12-14 Thread Karsten Weiss
On Thu, 14 Dec 2006, Muli Ben-Yehuda wrote: > The rest looks good. Please resend and I'll add my Acked-by. Thanks a lot for your comments and suggestions. Here's my 2nd try: === From: Karsten Weiss <[EMAIL PROTECTED]> $ diffstat ~/iommu-patch_v2.patch Documentation/kernel-parameters.txt |

Re: [PATCH] Re: data corruption with nvidia chipsets and IDE/SATA drives // memory hole mapping related bug?!

2006-12-14 Thread Muli Ben-Yehuda
On Thu, Dec 14, 2006 at 12:38:08PM +0100, Karsten Weiss wrote: > On Thu, 14 Dec 2006, Muli Ben-Yehuda wrote: > > > On Wed, Dec 13, 2006 at 09:34:16PM +0100, Karsten Weiss wrote: > > > > > BTW: It would be really great if this area of the kernel would get some > > > more and better

[PATCH] Re: data corruption with nvidia chipsets and IDE/SATA drives // memory hole mapping related bug?!

2006-12-14 Thread Karsten Weiss
On Thu, 14 Dec 2006, Muli Ben-Yehuda wrote: > On Wed, Dec 13, 2006 at 09:34:16PM +0100, Karsten Weiss wrote: > > > BTW: It would be really great if this area of the kernel would get some > > more and better documentation. The information at > > linux-2.6/Documentation/x86_64/boot_options.txt

Re: data corruption with nvidia chipsets and IDE/SATA drives // memory hole mapping related bug?!

2006-12-14 Thread Muli Ben-Yehuda
On Thu, Dec 14, 2006 at 02:52:35AM -0700, Erik Andersen wrote: > On Thu Dec 14, 2006 at 11:23:11AM +0200, Muli Ben-Yehuda wrote: > > > I just realized that booting with "iommu=soft" makes my pcHDTV > > > HD5500 DVB cards not work. Time to go back to disabling the > > > memhole and losing 1 GB.

Re: data corruption with nvidia chipsets and IDE/SATA drives // memory hole mapping related bug?!

2006-12-14 Thread Erik Andersen
On Thu Dec 14, 2006 at 11:23:11AM +0200, Muli Ben-Yehuda wrote: > > I just realized that booting with "iommu=soft" makes my pcHDTV > > HD5500 DVB cards not work. Time to go back to disabling the > > memhole and losing 1 GB. :-( > > That points to a bug in the driver (likely) or swiotlb

Re: data corruption with nvidia chipsets and IDE/SATA drives // memory hole mapping related bug?!

2006-12-14 Thread Chris Wedgwood
On Wed, Dec 13, 2006 at 09:11:29PM +0100, Christoph Anton Mitterer wrote: > - error in the Opteron (memory controller) > - error in the Nvidia chipsets > - error in the kernel My guess without further information would be that some, but not all BIOSes are doing some work to avoid this. Does

Re: data corruption with nvidia chipsets and IDE/SATA drives // memory hole mapping related bug?!

2006-12-14 Thread Muli Ben-Yehuda
On Thu, Dec 14, 2006 at 12:33:23AM +0100, Christoph Anton Mitterer wrote: > 4) > And does someone know if the nforce/opteron iommu requires IBM Calgary > IOMMU support? It doesn't, Calgary isn't found in machines with Opteron CPUs or NForce chipsets (AFAIK). However, compiling Calgary in should

Re: data corruption with nvidia chipsets and IDE/SATA drives // memory hole mapping related bug?!

2006-12-14 Thread Muli Ben-Yehuda
On Wed, Dec 13, 2006 at 01:29:25PM -0700, Erik Andersen wrote: > On Mon Dec 11, 2006 at 10:24:02AM +0100, Karsten Weiss wrote: > > We could not reproduce the data corruption anymore if we boot > > the machines with the kernel parameter "iommu=soft" i.e. if we > > use software bounce buffering

Re: data corruption with nvidia chipsets and IDE/SATA drives // memory hole mapping related bug?!

2006-12-14 Thread Muli Ben-Yehuda
On Wed, Dec 13, 2006 at 09:34:16PM +0100, Karsten Weiss wrote: > FWIW: As far as I understand the linux kernel code (I am no kernel > developer so please correct me if I am wrong) the PCI dma mapping code is > abstracted by struct dma_mapping_ops. I.e. there are currently four > possible

Re: data corruption with nvidia chipsets and IDE/SATA drives // memory hole mapping related bug?!

2006-12-13 Thread Christoph Anton Mitterer
Hi. I've just looked for some kernel config options that might relate to our issue: 1) Old style AMD Opteron NUMA detection (CONFIG_K8_NUMA) Enable K8 NUMA node topology detection. You should say Y here if you have a multi processor AMD K8 system. This uses an old method to read the NUMA

Re: data corruption with nvidia chipsets and IDE/SATA drives // memory hole mapping related bug?!

2006-12-13 Thread Lennart Sorensen
On Wed, Dec 13, 2006 at 08:57:23PM +0100, Christoph Anton Mitterer wrote: > Don't get me wrong,.. I don't use Windows (except for upgrading > my Plextor firmware and EAC ;) )... but I ask because the more > information we get (even if it's not Linux specific) the more steps we > can take ;)

Re: data corruption with nvidia chipsets and IDE/SATA drives // memory hole mapping related bug?!

2006-12-13 Thread Christoph Anton Mitterer
Lennart Sorensen wrote: > I upgrade my plextor firmware using linux. pxupdate for most devices, > and pxfw for new drives (like the PX760). Works perfectly for me. It > is one of the reasons I buy plextors. Yes I know about it,.. although never tested it,... anyway the main reason for Windows

Re: data corruption with nvidia chipsets and IDE/SATA drives // memory hole mapping related bug?!

2006-12-13 Thread Karsten Weiss
On Wed, 13 Dec 2006, Chris Wedgwood wrote: > > Any ideas why iommu=disabled in the bios does not solve the issue? > > The kernel will still use the IOMMU if the BIOS doesn't set it up if > it can, check your dmesg for IOMMU strings, there might be something > printed to this effect. FWIW: As

Re: data corruption with nvidia chipsets and IDE/SATA drives // memory hole mapping related bug?!

2006-12-13 Thread Christoph Anton Mitterer
Erik Andersen wrote: > I just realized that booting with "iommu=soft" makes my pcHDTV > HD5500 DVB cards not work. Time to go back to disabling the > memhole and losing 1 GB. :-( Crazy,... I have a Hauppauge Nova-T 500 DualDVB-T card,... I'll check it later if I have the same problem and will

Re: data corruption with nvidia chipsets and IDE/SATA drives // memory hole mapping related bug?!

2006-12-13 Thread Erik Andersen
On Mon Dec 11, 2006 at 10:24:02AM +0100, Karsten Weiss wrote: > We could not reproduce the data corruption anymore if we boot > the machines with the kernel parameter "iommu=soft" i.e. if we > use software bounce buffering instead of the hw-iommu. I just realized that booting with "iommu=soft"

Re: data corruption with nvidia chipsets and IDE/SATA drives // memory hole mapping related bug?!

2006-12-13 Thread Karsten Weiss
On Wed, 13 Dec 2006, Erik Andersen wrote: > On Mon Dec 11, 2006 at 10:24:02AM +0100, Karsten Weiss wrote: > > Last week we did some more testing with the following result: > > > > We could not reproduce the data corruption anymore if we boot the machines > > with the kernel parameter

Re: data corruption with nvidia chipsets and IDE/SATA drives // memory hole mapping related bug?!

2006-12-13 Thread Erik Andersen
On Mon Dec 11, 2006 at 10:24:02AM +0100, Karsten Weiss wrote: > Last week we did some more testing with the following result: > > We could not reproduce the data corruption anymore if we boot the machines > with the kernel parameter "iommu=soft" i.e. if we use software bounce > buffering

Re: data corruption with nvidia chipsets and IDE/SATA drives // memory hole mapping related bug?!

2006-12-13 Thread Christoph Anton Mitterer
Chris Wedgwood wrote: >> Did anyone run any tests under Windows? I cannot set >> iommu=soft there, can I? >> > Windows never uses the hardware iommu, so it's always doing the > equivalent of iommu=soft > That would mean that I'm not able to reproduce the issue under Windows, right? Does

Re: data corruption with nvidia chipsets and IDE/SATA drives // memory hole mapping related bug?!

2006-12-13 Thread Christoph Anton Mitterer
Karsten Weiss wrote: > "Memory hole mapping" was set to "hardware". With "disabled" we only > see 3 of our 4 GB memory. > That sounds reasonable,... I only see 2.5 GB,.. as my memhole takes 1536 MB (don't ask me which PCI device needs that much address space ;) )

Re: data corruption with nvidia chipsets and IDE/SATA drives // memory hole mapping related bug?!

2006-12-13 Thread Christoph Anton Mitterer
Karsten Weiss wrote: > Of course, the big question "Why does the hardware iommu *not* > work on those machines?" still remains. > I'm going to check AMD's errata docs these days,.. perhaps I find something that relates. But I'd ask you to do the same as I don't consider myself an expert in

Re: data corruption with nvidia chipsets and IDE/SATA drives // memory hole mapping related bug?!

2006-12-13 Thread Karsten Weiss
On Wed, 13 Dec 2006, Christoph Anton Mitterer wrote: Christoph, I will carefully re-read your entire posting and the included links on Monday and will also try the memory hole setting. And did you get out anything new? As I already mentioned the kernel parameter "iommu=soft" fixes the data

Re: data corruption with nvidia chipsets and IDE/SATA drives // memory hole mapping related bug?!

2006-12-13 Thread Chris Wedgwood
On Wed, Dec 13, 2006 at 08:18:21PM +0100, Christoph Anton Mitterer wrote: > booting with iommu=soft => works fine > booting with iommu=noagp => DOESN'T solve the error > booting with iommu=off => the system doesn't even boot and panics > When I set IOMMU to disabled in the BIOS the error is not
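
When comparing results like the ones quoted above, it helps to confirm which iommu= setting the running kernel was actually booted with. A minimal sketch, assuming a Linux system where /proc/cmdline is readable; the helper names are made up for illustration and the comma-splitting follows the usual iommu=opt1,opt2 form of the x86_64 boot option:

```python
def iommu_options(cmdline):
    """Return the values of every iommu= argument found on a kernel command line.

    Hypothetical helper: "iommu=noagp,soft" yields ["noagp", "soft"].
    """
    options = []
    for arg in cmdline.split():
        if arg.startswith("iommu="):
            value = arg[len("iommu="):]
            options.extend(v for v in value.split(",") if v)
    return options


def read_cmdline(path="/proc/cmdline"):
    """Read the command line the running kernel was booted with."""
    with open(path) as f:
        return f.read().strip()
```

On a test machine, print(iommu_options(read_cmdline())) shows whether "soft" is really in effect before a long corruption test is started. An empty list only means no iommu= argument was passed; it says nothing about what the kernel chose by default.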

Re: data corruption with nvidia chipsets and IDE/SATA drives // memory hole mapping related bug?!

2006-12-13 Thread Chris Wedgwood
On Wed, Dec 13, 2006 at 08:20:59PM +0100, Christoph Anton Mitterer wrote: > Did anyone run any tests under Windows? I cannot set > iommu=soft there, can I? Windows never uses the hardware iommu, so it's always doing the equivalent of iommu=soft

Re: data corruption with nvidia chipsets and IDE/SATA drives // memory hole mapping related bug?!

2006-12-13 Thread Christoph Anton Mitterer
Ah and I forgot,... Did anyone run any tests under Windows? I cannot set iommu=soft there, can I? Chris.

Re: data corruption with nvidia chipsets and IDE/SATA drives // memory hole mapping related bug?!

2006-12-13 Thread Christoph Anton Mitterer
Karsten Weiss wrote: > Last week we did some more testing with the following result: > > We could not reproduce the data corruption anymore if we boot the machines > with the kernel parameter "iommu=soft" i.e. if we use software bounce > buffering instead of the hw-iommu. (As mentioned before,

Re: data corruption with nvidia chipsets and IDE/SATA drives // memory hole mapping related bug?!

2006-12-13 Thread Christoph Anton Mitterer
Karsten Weiss wrote: > Here's a diff of a corrupted and a good file written during our > testcase: > > ("-" == corrupted file, "+" == good file) > ... > 009f2ff0 67 2a 4c c4 6d 9d 34 44 ad e6 3c 45 05 9a 4d c4 |g*L.m.4D..<E..M.| > -009f3000 39 60 e6 44 20 ab 46 44 56 aa 46 44 c2 35 e6 44 |9.D
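
Note that the first corrupted line in the diff starts at offset 009f3000, which is a multiple of 4096. A comparison like this can be automated; the following is only a sketch, assuming two copies of the same file (one known good) and a 4 KiB block size as reported in this thread. The function name is hypothetical and this is not the tool Karsten used.

```python
BLOCK_SIZE = 4096  # 4 KiB, the corruption granularity reported in the thread


def differing_blocks(path_a, path_b, block_size=BLOCK_SIZE):
    """Return the byte offsets of all blocks that differ between two files."""
    offsets = []
    with open(path_a, "rb") as fa, open(path_b, "rb") as fb:
        offset = 0
        while True:
            block_a = fa.read(block_size)
            block_b = fb.read(block_size)
            if not block_a and not block_b:
                break  # both files exhausted
            if block_a != block_b:
                offsets.append(offset)
            offset += block_size
    return offsets
```

Printing each result with format(offset, "08x") gives offsets in the same form as the hexdump above, so a hit at 009f3000 would show up directly.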

Re: data corruption with nvidia chipsets and IDE/SATA drives // memory hole mapping related bug?!

2006-12-13 Thread Christoph Anton Mitterer
Karsten Weiss wrote: Here's a diff of a corrupted and a good file written during our testcase: (- == corrupted file, + == good file) ... 009f2ff0 67 2a 4c c4 6d 9d 34 44 ad e6 3c 45 05 9a 4d c4 |g*L.m.4D..E..M.| -009f3000 39 60 e6 44 20 ab 46 44 56 aa 46 44 c2 35 e6 44 |9.D

Re: data corruption with nvidia chipsets and IDE/SATA drives // memory hole mapping related bug?!

2006-12-13 Thread Christoph Anton Mitterer
Karsten Weiss wrote: Last week we did some more testing with the following result: We could not reproduce the data corruption anymore if we boot the machines with the kernel parameter iommu=soft i.e. if we use software bounce buffering instead of the hw-iommu. (As mentioned before, booting

Re: data corruption with nvidia chipsets and IDE/SATA drives // memory hole mapping related bug?!

2006-12-13 Thread Christoph Anton Mitterer
Ah and I forgot,... Did anyone made any test under Windows? I cannot set there iommu=soft, can I? Chris. begin:vcard fn:Mitterer, Christoph Anton n:Mitterer;Christoph Anton email;internet:[EMAIL PROTECTED] x-mozilla-html:TRUE version:2.1 end:vcard

Re: data corruption with nvidia chipsets and IDE/SATA drives // memory hole mapping related bug?!

2006-12-13 Thread Chris Wedgwood
On Wed, Dec 13, 2006 at 08:20:59PM +0100, Christoph Anton Mitterer wrote: Did anyone made any test under Windows? I cannot set there iommu=soft, can I? Windows never uses the hardware iommu, so it's always doing the equivalent on iommu=soft - To unsubscribe from this list: send the line

Re: data corruption with nvidia chipsets and IDE/SATA drives // memory hole mapping related bug?!

2006-12-13 Thread Chris Wedgwood
On Wed, Dec 13, 2006 at 08:18:21PM +0100, Christoph Anton Mitterer wrote: booting with iommu=soft: works fine; booting with iommu=noagp: DOESN'T solve the error; booting with iommu=off: the system doesn't even boot and panics. When I set IOMMU to disabled in the BIOS the error is not solved-

Re: data corruption with nvidia chipsets and IDE/SATA drives // memory hole mapping related bug?!

2006-12-13 Thread Christoph Anton Mitterer
Karsten Weiss wrote: Of course, the big question "Why does the hardware iommu *not* work on those machines?" still remains. I'm going to check AMD's errata docs over the next few days; perhaps I'll find something related. But I'd ask you to do the same, as I don't consider myself an expert in these

Re: data corruption with nvidia chipsets and IDE/SATA drives // memory hole mapping related bug?!

2006-12-13 Thread Karsten Weiss
On Wed, 13 Dec 2006, Christoph Anton Mitterer wrote: Christoph, I will carefully re-read your entire posting and the included links on Monday and will also try the memory hole setting. And did you find out anything new? As I already mentioned the kernel parameter iommu=soft fixes the data

Re: data corruption with nvidia chipsets and IDE/SATA drives // memory hole mapping related bug?!

2006-12-13 Thread Christoph Anton Mitterer
Chris Wedgwood wrote: Did anyone run any tests under Windows? I cannot set iommu=soft there, can I? Windows never uses the hardware iommu, so it's always doing the equivalent of iommu=soft That would mean that I'm not able to reproduce the issue under Windows, right? Does that apply

Re: data corruption with nvidia chipsets and IDE/SATA drives // memory hole mapping related bug?!

2006-12-13 Thread Erik Andersen
On Mon Dec 11, 2006 at 10:24:02AM +0100, Karsten Weiss wrote: Last week we did some more testing with the following result: We could not reproduce the data corruption anymore if we boot the machines with the kernel parameter iommu=soft i.e. if we use software bounce buffering instead of

Re: data corruption with nvidia chipsets and IDE/SATA drives // memory hole mapping related bug?!

2006-12-13 Thread Karsten Weiss
On Wed, 13 Dec 2006, Erik Andersen wrote: On Mon Dec 11, 2006 at 10:24:02AM +0100, Karsten Weiss wrote: Last week we did some more testing with the following result: We could not reproduce the data corruption anymore if we boot the machines with the kernel parameter iommu=soft i.e. if

Re: data corruption with nvidia chipsets and IDE/SATA drives // memory hole mapping related bug?!

2006-12-13 Thread Erik Andersen
On Mon Dec 11, 2006 at 10:24:02AM +0100, Karsten Weiss wrote: We could not reproduce the data corruption anymore if we boot the machines with the kernel parameter iommu=soft i.e. if we use software bounce buffering instead of the hw-iommu. I just realized that booting with iommu=soft makes my

Re: data corruption with nvidia chipsets and IDE/SATA drives // memory hole mapping related bug?!

2006-12-13 Thread Christoph Anton Mitterer
Erik Andersen wrote: I just realized that booting with iommu=soft makes my pcHDTV HD5500 DVB cards not work. Time to go back to disabling the memhole and losing 1 GB. :-( Crazy. I have a Hauppauge Nova-T 500 Dual DVB-T card. I'll check later whether I have the same problem and will inform

Re: data corruption with nvidia chipsets and IDE/SATA drives // memory hole mapping related bug?!

2006-12-13 Thread Karsten Weiss
On Wed, 13 Dec 2006, Chris Wedgwood wrote: Any ideas why iommu=disabled in the BIOS does not solve the issue? If it can, the kernel will still use the IOMMU even if the BIOS doesn't set it up; check your dmesg for IOMMU strings, there might be something printed to this effect. FWIW: As far as I
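The suggestion to check dmesg for IOMMU strings can be scripted. This is a rough sketch, assuming the kernel log has already been captured as text; the helper name is made up for illustration, and the exact wording of the kernel's IOMMU messages is deliberately not assumed, so the match is a plain case-insensitive substring search.

```python
from typing import List


def iommu_lines(log_text: str) -> List[str]:
    """Return every log line that mentions the IOMMU, in original order."""
    return [line for line in log_text.splitlines() if "iommu" in line.lower()]


if __name__ == "__main__":
    import subprocess

    # Reading the kernel ring buffer may require root on some systems.
    log = subprocess.run(["dmesg"], capture_output=True, text=True).stdout
    for line in iommu_lines(log):
        print(line)
```

If nothing is printed, the log may simply have rotated past the boot messages, so an empty result does not prove the IOMMU is unused.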

Re: data corruption with nvidia chipsets and IDE/SATA drives // memory hole mapping related bug?!

2006-12-13 Thread Christoph Anton Mitterer
Lennart Sorensen wrote: I upgrade my Plextor firmware using Linux: pxupdate for most devices, and pxfw for new drives (like the PX760). Works perfectly for me. It is one of the reasons I buy Plextors. Yes, I know about it, although I never tested it. Anyway, the main reason for Windows is

Re: data corruption with nvidia chipsets and IDE/SATA drives // memory hole mapping related bug?!

2006-12-13 Thread Lennart Sorensen
On Wed, Dec 13, 2006 at 08:57:23PM +0100, Christoph Anton Mitterer wrote: Don't get me wrong, I don't use Windows (except for upgrading my Plextor firmware and EAC ;) ), but I ask because the more information we get (even if it's not Linux-specific) the more steps we can take ;) I

Re: data corruption with nvidia chipsets and IDE/SATA drives // memory hole mapping related bug?!

2006-12-13 Thread Christoph Anton Mitterer
Hi. I've just looked for some kernel config options that might relate to our issue: 1) Old style AMD Opteron NUMA detection (CONFIG_K8_NUMA) Enable K8 NUMA node topology detection. You should say Y here if you have a multi processor AMD K8 system. This uses an old method to read the NUMA

amd64 iommu causing corruption? (was Re: data corruption with nvidia chipsets and IDE/SATA drives // memory hole mapping related bug?!)

2006-12-11 Thread Chris Wedgwood
On Mon, Dec 11, 2006 at 10:24:02AM +0100, Karsten Weiss wrote: > We could not reproduce the data corruption anymore if we boot the > machines with the kernel parameter "iommu=soft" i.e. if we use > software bounce buffering instead of the hw-iommu. (As mentioned > before, booting with mem=2g

Re: data corruption with nvidia chipsets and IDE/SATA drives // memory hole mapping related bug?!

2006-12-11 Thread Karsten Weiss
On Sat, 2 Dec 2006, Karsten Weiss wrote: > On Sat, 2 Dec 2006, Christoph Anton Mitterer wrote: > > > I found a severe bug mainly by luck because it occurs very rarely. > > My test looks like the following: I have about 30GB of testing data on > > This sounds very familiar! One of the Linux

Re: data corruption with nvidia chipsets and IDE/SATA drives // memory hole mapping related bug?!

2006-12-07 Thread Christoph Anton Mitterer
Ville Herva wrote: > I saw something very similar with Via KT133 years ago. Then the culprit was > a botched PCI implementation that sometimes corrupted PCI transfers when there > was heavy PCI I/O going on. Usually that meant running two disk transfers at > the same time. Doing heavy network I/O at

Re: data corruption with nvidia chipsets and IDE/SATA drives // memory hole mapping related bug?!

2006-12-02 Thread Christoph Anton Mitterer
Chris Wedgwood wrote: > Heh, I see this also with a Tyan S2866 (nforce4 chipset). I've been > aware something is amiss for a while because if I transfer about 40GB > of data from one machine to another there are checksum mismatches and > some files have to be transferred again. > It seems
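Checksum mismatches after a large transfer, as described above, can be detected by hashing every file on both sides and comparing the results. The following is a minimal sketch, not the procedure used in the thread: the function names are hypothetical and SHA-256 is an arbitrary choice of digest.

```python
import hashlib
from pathlib import Path
from typing import Dict, List


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Hash a file in 1 MiB chunks so large files never sit in memory whole."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def manifest(root: Path) -> Dict[str, str]:
    """Map each file's path (relative to root) to its SHA-256 digest."""
    return {
        str(p.relative_to(root)): sha256_of(p)
        for p in sorted(root.rglob("*"))
        if p.is_file()
    }


def mismatches(source: Dict[str, str], copy: Dict[str, str]) -> List[str]:
    """List files that are missing on one side or whose digests differ."""
    names = set(source) | set(copy)
    return sorted(n for n in names if source.get(n) != copy.get(n))
```

Running `manifest()` on the source and on the copy, then passing both results to `mismatches()`, yields the files that would need to be transferred again. Because the corruption reported here is rare, repeating the comparison several times gives a better chance of catching it.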

Re: data corruption with nvidia chipsets and IDE/SATA drives // memory hole mapping related bug?!

2006-12-02 Thread Christoph Anton Mitterer
Alan wrote: > See the thread http://lkml.org/lkml/2006/8/16/305 > Hi Alan. Thanks for your reply. I already read this thread some weeks ago, but from my limited knowledge I understood that this was an issue related to a SCSI adapter or so. Or did I understand this wrong? And as soon as

Re: data corruption with nvidia chipsets and IDE/SATA drives // memory hole mapping related bug?!

2006-12-02 Thread Alan
On Sat, 2 Dec 2006 12:00:36 +0100 (CET) Karsten Weiss <[EMAIL PROTECTED]> wrote: > Hello Christoph! > > On Sat, 2 Dec 2006, Christoph Anton Mitterer wrote: > > > I found a severe bug mainly by luck because it occurs very rarely. > > My test looks like the following: I have about 30GB of

Re: data corruption with nvidia chipsets and IDE/SATA drives // memory hole mapping related bug?!

2006-12-02 Thread Karsten Weiss
Hello Christoph! On Sat, 2 Dec 2006, Christoph Anton Mitterer wrote: I found a severe bug mainly by luck because it occurs very rarely. My test looks like the following: I have about 30GB of testing data on This sounds very familiar! One of the Linux compute clusters I administer at work
