Re: NFS client bug in 2.6.8-2.6.11
Hi Bernardo (et al).

Apologies - I've not been reading my account for a wee while. Then again, I probably don't have much useful to add to the debate right now ;-)

--- Bernardo Innocenti <[EMAIL PROTECTED]> wrote:
> Anders Saaby wrote:
> > Anyways if your server has only run with 2.6.10 - try 2.6.11.
>
> Thank you, I've finally nailed it down by upgrading the
> *server* kernel from 2.6.10-1.770_FC3 to 2.6.10-1.770_FC3.

Hmm, I will infer from a previous email you sent that you mean 766_FC3 for the "from" kernel.

> The latter is basically 2.6.10-ac12 plus a bunch of vendor
> specific patches.

766 -> 770 sounds like a "small" (ish) number of patches to check, if we're lucky. Did you wade through 'em all yet? Any smoking guns?

Regards,
Neil

PS: oh bugger, just remembered that I also reproduced my bug with a 2.6.8 kernel on the server; admittedly though it was an FC2 kernel so who knows what extra patches it had.

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED]
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
Re: NFS (ext3/VFS?) bug in 2.6.8/10
--- Markus Plail <[EMAIL PROTECTED]> wrote:
> I can't help you, but just want to say that I also see those errors
> on a local xfs file system, so it doesn't seem to be a NFS problem
> I was first seeing this with 2.6.11-rc3-mm1 on a directory with 8k.

Hmm, I played about a bit with XFS, but couldn't get my particular recipe to generate any errors whatsoever (either local or with NFS). Can you elaborate? Do you have a simple method which reproduces it?

Regards,
Neil

PS: I have half an idea that the problem might be to do with the dcache. This is probably a wild shot in the dark though as it would be difficult to find someone who knows less than me about the whole VFS ;-))

PPS: sorry for lack of direct reply - had to manually cut and paste your message in from an archive, as I forgot to mention in my original post that I'm off-list at the moment.
Re: NFS (ext3/VFS?) bug in 2.6.8/10
--- Neil Conway <[EMAIL PROTECTED]> wrote:
> works even on machines with 256MB of RAM. The odd thing I haven't
> figured out yet is that the fuslwr machine mentioned above has 2GB of
> RAM, and ALL of it is HIGHMEM. Must be a kernel CONFIG option I
> guess.
> (Rant: what replaces Configure.help???)

D'oh!! Brain fade. I meant to type, "2GB of RAM, and ALL of it is LOWMEM". Sigh...

Neil
NFS (ext3/VFS?) bug in 2.6.8/10
these kernels are non-vanilla. If people really want to see it tried on a vanilla kernel I can do that too.

I have tried to reproduce this recipe on a 2.4.18 machine (2.4.18-27.7.xsmp RH7.3) and failed. I couldn't get nr_inodes to reduce (the kernel docs, while ambiguous, suggest this just doesn't happen under 2.4). Possibly, 2.4 simply doesn't have this bug.

I don't actually know whether the "bug" is in the NFS client or server code, or ext3 code (unlikely?) or indeed vfs/mm layers. I can't explain it to myself with anything other than a kernel bug somewhere though. I've browsed the nfs source code a bit looking for page size related stuff, but I'm not familiar with it and predictably I got nowhere. I'm happy to try out patches and suggestions though.

Apologies for the length! Any suggestions for what to try next?

Many thanks,
Neil Conway

PS: originally, my test machine was failing REALLY fast. I noticed that it had about 896MB of LOWMEM, and the rest of the 2GB was HIGHMEM. It failed every time LowFree blipped down to zero while NFS transfers were underway. By booting with mem=800M, most of the symptoms went away but not all, and I ended up arriving at the recipe above, which works even on machines with 256MB of RAM. The odd thing I haven't figured out yet is that the fuslwr machine mentioned above has 2GB of RAM, and ALL of it is HIGHMEM. Must be a kernel CONFIG option I guess.
(Rant: what replaces Configure.help???)
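For anyone wanting to watch the two counters mentioned above while running the recipe, here is a minimal Python sketch. It assumes LowFree is read from /proc/meminfo (the field is only present on kernels built with highmem support) and that nr_inodes is the first number in /proc/sys/fs/inode-nr. The function names are made up for illustration and are not part of any existing tool.

```python
def parse_meminfo(text):
    """Return {field: value} for lines shaped like 'LowFree:   1234 kB'."""
    fields = {}
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        parts = rest.split()
        if sep and parts and parts[0].isdigit():
            fields[name.strip()] = int(parts[0])
    return fields


def parse_inode_nr(text):
    """inode-nr holds two integers: nr_inodes, then nr_free_inodes."""
    nr_inodes, nr_free_inodes = (int(x) for x in text.split()[:2])
    return nr_inodes, nr_free_inodes


def snapshot(meminfo_text, inode_nr_text):
    """Combine both sources; LowFree_kB is None if the kernel lacks the field."""
    mem = parse_meminfo(meminfo_text)
    nr_inodes, nr_free_inodes = parse_inode_nr(inode_nr_text)
    return {
        "LowFree_kB": mem.get("LowFree"),
        "nr_inodes": nr_inodes,
        "nr_free_inodes": nr_free_inodes,
    }


def read_snapshot():
    """Read the live values (Linux only; paths as assumed in the lead-in)."""
    with open("/proc/meminfo") as mem, open("/proc/sys/fs/inode-nr") as ino:
        return snapshot(mem.read(), ino.read())
```

Calling read_snapshot() once a second and logging the result during an NFS transfer would show whether failures line up with LowFree reaching zero or with nr_inodes dropping.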
Re: 3TB disk hassles
Argh...

--- Neil Conway <[EMAIL PROTECTED]> wrote:
> Hi...
>
> --- Bodo Eggert <[EMAIL PROTECTED]> wrote:
> > No common x86 BIOS can understand any partition table. Booting is done by
> > loading the first sector of the boot device and executing it. The common
>
> D'oh!! Red-face here. Can't believe my brainlessness.
> Thanks for putting me straight - that explains a lot. Now to try it ;-)

Ah, if only it was that simple.

Since writing the above, I've been searching for more info. I downloaded four different versions of grub (GNU Grub Legacy, GNU Grub2, gentoo and Fedora Core 3). NONE of these showed any evidence of GPT support (I was in a hurry, so I searched for strings EFI, GUID, GPT, TB).

Mucho confused puppy here. I fail to see how grub can work on a GPT boot device if it can't parse the partition table. I conclude that I'm still missing something. Perhaps a layer before grub is supposed to parse the GPT instead? If so, isn't that getting us straight back to a GPT-aware BIOS?

Tell me if this logic is broken: even if a special boot sector is used, which IS GPT-aware (though fitting that into the boot sector would be a challenge ;-)), once grub loads, it's still going to have to figure out how to find the root(hdX,Y) partition from which to load the kernel image. This surely means it has to have either a GPT-parser internally, or rely on a pre-parsed list. No?

Perhaps one of the other several distros (that I didn't check) has a GPT-aware grub. But Tomas Carnecky said early in this thread that gentoo had allowed him to set up a GPT-booting system on x86. I guess it's possible that a cheat was used - maybe an old-style partition table in the MBR was used to define the first (boot) partition, but surely that's forbidden by the whole EFI spec anyway?

Andries Brouwer kindly wrote a patch which I haven't had time to test yet (see earlier in thread). While it would be nice to find a way around the problem which didn't require deviations from vanilla distros, I think Andries' patch is looking like the only sane fix right now.

Anyone with a definitive answer to the question "can I use GPT on a vanilla x86 mobo", do speak up :-)

Regards,
Neil

PS: I really didn't think that >2TiB disks were quite so far out on the bleeding edge :-/
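On the question of what a GPT disk looks like to code that only reads raw sectors, here is a minimal Python sketch. It relies on two layout facts from the GPT definition. Sector 0 still carries an MBR-style table, the "protective MBR", whose entry has type 0xEE. Sector 1 starts with the ASCII signature "EFI PART". The sketch assumes 512-byte sectors and does not validate the header CRC. The function names are hypothetical. It says nothing about whether a given BIOS or grub build can boot from such a disk.

```python
SECTOR = 512                 # assumed logical sector size
GPT_SIGNATURE = b"EFI PART"  # first 8 bytes of the GPT header in sector 1


def mbr_partition_types(sector0):
    """Return the type byte of each of the four MBR partition entries."""
    if len(sector0) < SECTOR or sector0[510:512] != b"\x55\xaa":
        raise ValueError("no 0x55AA boot signature in sector 0")
    # The table starts at byte 446; entries are 16 bytes, type at offset 4.
    return [sector0[446 + 16 * i + 4] for i in range(4)]


def classify(first_two_sectors):
    """Label a disk 'gpt', 'msdos' or 'unknown' from its first 1024 bytes."""
    sector0 = first_two_sectors[:SECTOR]
    sector1 = first_two_sectors[SECTOR:2 * SECTOR]
    try:
        types = mbr_partition_types(sector0)
    except ValueError:
        return "unknown"
    if 0xEE in types and sector1[:8] == GPT_SIGNATURE:
        return "gpt"
    return "msdos"
```

To try it on a real disk, read the first 1024 bytes of the device node (for example /dev/sda, which normally needs root) in binary mode and pass them to classify().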
Re: 3TB disk hassles
Hi...

--- Bodo Eggert <[EMAIL PROTECTED]> wrote:
> No common x86 BIOS can understand any partition table. Booting is done by
> loading the first sector of the boot device and executing it. The common

D'oh!! Red-face here. Can't believe my brainlessness.
Thanks for putting me straight - that explains a lot. Now to try it ;-)

Neil

PS: I should go back to sleep now, clearly not awake for the last month.
Re: 3TB disk hassles
Howdy...

--- "Pedro Venda (SYSADM)" <[EMAIL PROTECTED]> wrote:
> Neil Conway wrote:
> > Howdy...
> > After much banging of heads on walls, I am throwing in the towel and
> > asking the experts ;-) ... To cut a long story short:
> > Is it possible to make a 3TB disk work properly in Linux?
> > Our "disk" is 12x300GB in RAID5 (with 1 hot-spare) on a 3ware 9500-S12,
> > so it's actually 2.7TiB ish. It's also /dev/sda - i.e., the one and
> > only disk in the system.
>
> not meaning to criticise... but isn't it a good idea to have a
> separate raid1 volume to boot the system?

Well, yes, and we would if we could. Sadly, the chassis we got from our vendor only has space for the 12 hot-swap disks, and we need the capacity too badly to lose 2 slots for a boot volume.

If only we could take a sliver of each of the 12 disks to make a tiny RAID5 boot volume...

Regards,
Neil
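The "2.7TiB ish" figure, and why it is a problem for an old-style partition table, can be checked with a little arithmetic. The Python sketch below assumes "300GB" means 300 x 10^9 bytes per disk and 512-byte sectors. Real drives and the 3ware firmware may reserve a little space, so treat the result as approximate. The 32-bit sector fields of MS-DOS partition entries are what cap that table format at 2 TiB.

```python
SECTOR_BYTES = 512   # assumed logical sector size
MBR_LBA_BITS = 32    # MS-DOS partition entries store 32-bit sector numbers
TIB = 2 ** 40


def raid5_usable_bytes(total_disks, hot_spares, disk_bytes):
    """RAID5 spends one member's worth on parity; hot spares hold no data."""
    members = total_disks - hot_spares
    return (members - 1) * disk_bytes


usable = raid5_usable_bytes(total_disks=12, hot_spares=1,
                            disk_bytes=300 * 10**9)
msdos_limit = (2 ** MBR_LBA_BITS) * SECTOR_BYTES

print(f"usable array size : {usable / TIB:.2f} TiB")
print(f"msdos table limit : {msdos_limit / TIB:.2f} TiB")
print(f"beyond the limit  : {(usable - msdos_limit) / TIB:.2f} TiB")
```

Under those assumptions the array comes to about 2.73 TiB against a 2 TiB ceiling, so roughly a quarter of the capacity cannot be addressed by an MS-DOS partition table.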
Re: 3TB disk hassles
Howdy...

Apologies for the somewhat tardy reply; I've been concentrating on getting the hardware to play nice recently and not worrying so much about the software.

--- Tomas Carnecky <[EMAIL PROTECTED]> wrote:
> It was gentoo, and I even think I installed it right onto the GPT disk,
> so no migration. But I'm not sure. You just have to check that your
> kernel supports GPT. I don't know if the kernel from the gentoo livecd
> supports GPT.
>
> Also have a look here how to create GPT partitions:
> http://www.google.ch/search?q=site%3Ausefulthings.org.uk+gpt
> I think I did it like it's shown there, mklabel, mkpart and mount them.
> I don't think I migrated from MSDOS to GPT, because I don't even know
> how it is possible if you have only one disk with the system on it.

Bizarre... I will give this a try on a spare system as soon as I can. I thought sure I had read somewhere that typical x86 PC BIOSes just didn't understand the GPT ptbl, and thus couldn't boot from a GPT'ed disk.

thanks,
Neil
Re: Problem with Via VT82C686A
Hi...

mythos wrote:
>
> I have installed a second hard drive in my system in the second
> channel of my controller. But when I try to enable DMA I get:
> hdc: DMA disabled
> hdc: timeout waiting for DMA
> ide_dmaproc: chipset supported ide_dma_timeout func only: 14
> hdc: irq timeout: status=0x58 { DriveReady SeekComplete DataRequest }
> [snip]
> I thought that there were problems only with Via VT82C686B.
> Can anyone please help me?
> My motherboard is an ASUS K7V with KX133 chipset.

Well, I posted a message about three weeks ago, "IDE corruption, 2.2, VIA chipset in PIO mode", in which I described problems with a VIA 686A + IBM 75GXP, which occurred in both UDMA and PIO modes.

I thought Alan Cox's suggestion of cable problems seemed believable at the time (and duly performed the brown paper bag). But, I've since removed disk+cable and transferred them to a Dell system (810 chipset) and really hammered the disk. I deliberately didn't install the (newly arrived) 80-core cables because I wanted to try and exonerate the VIA mobo by reproducing the errors on the Dell.

From the buildup, you've probably guessed that I have failed to reproduce the error... Despite a serious workover, no errors whatsoever. (Of course, it might still not be the VIA chipset's fault.)

Due to limited time and the fact that the VIA machines are more or less 24-hour production boxes, I may not be able to retry the disk+cable on the 686A chipset, but the whole experience has soured me on VIA. Web searches with "linux via ide corruption" show scary hit rates. Also, the VIA web site has what it laughably bills as fixes for the "alleged" 686b problems, but (to paraphrase) it basically says "try this, and it should work, but if not then try this, this, this, this, and then some unsupported stuff contributed by users...". Call me a pedant if you like, but that isn't language to convince me that they (A) understand the problem, (B) have fixed the problem.

Add to that (see Alan's remarks over an extended period) the fact that they really don't appear Linux-friendly with regard to providing information... I won't be allowing any more VIA boards on-site.

Neil
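The status=0x58 value in the quoted log can be unpacked by hand. The Python sketch below maps the eight bits of the ATA status register to the names the kernel's IDE driver prints in messages like the one above. The bit values are the standard ATA assignments; the function name is made up for illustration. As far as I recall the real driver prints only "Busy" when the busy bit is set, which this sketch does not imitate.

```python
# ATA status register bits, most significant first, with the names that
# appear in kernel log lines such as "{ DriveReady SeekComplete DataRequest }".
STATUS_BITS = [
    (0x80, "Busy"),
    (0x40, "DriveReady"),
    (0x20, "DeviceFault"),
    (0x10, "SeekComplete"),
    (0x08, "DataRequest"),
    (0x04, "CorrectedError"),
    (0x02, "Index"),
    (0x01, "Error"),
]


def decode_status(status):
    """Return the names of all bits set in an 8-bit ATA status value."""
    return [name for mask, name in STATUS_BITS if status & mask]


print(" ".join(decode_status(0x58)))
```

For 0x58 this gives the same three flags as the log line, and the Error bit is clear. That suggests the drive was sitting with a data request pending rather than reporting a fault of its own, though it does not tell you which side of the cable stalled.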
Re: IDE corruption, 2.2, VIA chipset in PIO mode
Alan Cox wrote:
> > Sigh. Ah, I think I see a nice brown bag, in a nice deep hole.
>
> It's only a pointer. PIO speed cable errors tend to imply a bad cable problem
> (eg not properly connected ribbon). So it could still be that the problem is
> elsewhere

Ah OK. Though a cable fault does seem consistent with the evidence... (I swear I read the FAQ before posting!) In practice, does a BadCRC error EVER imply a crap/buggy chipset?

On the flip side, the cable isn't too long, isn't damaged, and was very definitely seated properly (I did it personally and took some care over that). On the third hand, I don't know where it came from, and somebody had spilled coffee on it in a previous life :-) (not the connectors!).

To approach the question from a different angle completely: DARE I use the VIA 686A in UDMA-33/66[/100 if capable?] mode, or is it not really up to the job? I've seen so many posts on a search for "linux via ide corruption" that I'm uneasy about repeating the experiments on what is a production box...

thanks,
Neil

PS: 80pin cables on the way :-)
IDE corruption, 2.2, VIA chipset in PIO mode
Summary: we got IDE trashage in PIO mode with a VIA 686A IDE chipset, using 2.2.12-20smp (RH6.1 stock). Disk is an IBM 75GXP 75GB, mobo is Gigabyte GA-6VXDC7 (IIRC).

Story: had the system hooked up with SCSI disk, needed more disk space, had IBM EIDE handy, stuck it in, no UDMA cable handy so used 40pin cable. Verified with hdparm that disk was using UDMA mode 2 (rather than 3 or 4 which would have needed the 80pin cable). Because it was an IDE disk in a box with a SCSI system disk, I made a little /ideboot partition (50 megs or so) and avoided LILO hassles by parking the kernel+initrd on that. Rest of disk was just /data partition.

Used for a while, copied about 7gigs onto it. Then got lots of BadCRC errors when reading from disk (from dma_intr). Decided to disable DMA as a result of this...

Sometime later tried to reboot, couldn't. Closer examination showed the /ideboot partition was hosed. No worries, we thought, it's just been screwed by the DMA being on earlier. So, I just rebuilt /ideboot (a little optimistic of me) and got it booting again, and then compared files on /data with the original data files. When they failed to match, I decided to blitz the whole lot, repartition both partitions and remake the fs's (still thinking at this point that the main cause was the original 7 gigs of copying while DMA was enabled). All of this rebuilding was done in PIO mode.

So, having recopied the data onto the disk again, sometime later one of us reboots it, and hey presto, it doesn't reboot, and yes, it's due to the little partition /ideboot being hosed again (illegal triply indirect blocks, bad inodes etc...). So, I'm now left thinking that this final failure (and thus by inference maybe the others too) really can't have been caused by DMA problems... (The only little caveat is that when I blitzed the lot, rebuilding the partition table and both filesystems, I didn't wipe out the entire boot sector/cylinder, so in principle some tiny vestigial memories of the corruption might have persisted??)

I've searched the web, and found plenty of people suffering from broken DMA on the VIA chipsets, but no clear reports of PIO breakage. It does seem incredible that a chipset could fail to work reliably in PIO mode, but it's either that, or the 2.2.12-20smp kernel, or a broken disk or motherboard. Given VIA's apparent flakiness, the chipset seems like a good candidate for suspicion...

Anyone out there got the answers?

Neil

PS: 2.2.12-20smp - argh puke, but not my machine so not my kernel choice...
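The "compared files on /data with the original data files" step can be made repeatable with a checksum walk. This is a minimal Python sketch of that idea, not the method actually used at the time. The function names are hypothetical and SHA-256 is an arbitrary choice of digest.

```python
import hashlib
import os


def file_digest(path, chunk_size=1 << 20):
    """Hash a file in chunks so large data files need not fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def compare_trees(original_root, copy_root):
    """Return (relative path, problem) pairs for files missing or different
    under copy_root, taking original_root as the reference."""
    problems = []
    for dirpath, _dirnames, filenames in os.walk(original_root):
        for name in filenames:
            source = os.path.join(dirpath, name)
            relative = os.path.relpath(source, original_root)
            target = os.path.join(copy_root, relative)
            if not os.path.isfile(target):
                problems.append((relative, "missing"))
            elif file_digest(source) != file_digest(target):
                problems.append((relative, "differs"))
    return sorted(problems)
```

One caution when using something like this to test a disk: a file read back straight after being written may be served from the page cache rather than the platters, so unmounting and remounting the suspect filesystem before comparing gives a more honest result.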
Re: [PATCH] SMP race in ext2 - metadata corruption.
Hiya.

Linus Torvalds wrote:
> So anybody who depends on "dump" getting backups right is already playing
> russian roulette with their backups. It's not at all guaranteed to get the
> right results - you may end up having stale data in the buffer cache that
> ends up being "backed up".
>
> Dump was a stupid program in the first place. Leave it behind.

Ouch. I just re-read the man page and it doesn't caution (*) against using it on mounted filesystems. That probably means that there are thousands of other losers like me using it on production machines.

Volunteers to (a) change the man page, (b) talk to the distros about dumping "dump"?

> However, it may be that in the long run it would be advantageous to have a
> "filesystem maintenance interface" for doing things like backups and
> defragmentation..

Yup, sounds good.

Neil

(*) The KNOWNBUGS file mentions "possible" problems while dumping active mounted filesystems, but I've (elsewhere) seen these characterised as no real problem; also, this falls a long way short of discouraging use in this fashion.