RE: "big" machines running Debian?
Here at Vulcan we run some 2x4-processor machines with 32GB RAM and 2TB arrays attached via Fibre Channel, without a problem, on Debian Lenny.

Peter Yorke

From: Igor Támara [i...@tamarapatino.org]
Sent: Saturday, February 21, 2009 5:00 AM
To: debian-amd64@lists.debian.org
Subject: "big" machines running Debian?

Hi, at a datacenter here in my country they only want machines to be installed with RHEL or SUSE; the more I dig into those distros, the more I fall in love with Debian. This is why I'm asking about machines that have many cores, lots of RAM and plenty of disk. Here (in my country) big means more than 4x4 cores, more than 16GB of RAM and more than 1TB of disk, excluding clusters; SAN setups are also good to know about.

Is there a place where one can post such machines, to give others some feeling of trust?

I have been using Debian since about 2000 and have had the opportunity to use the sparc, powerpc, x86 and AMD64 ports, and never had to go back to another distro. I'm really happy with Debian, so I want to use it in as many places as possible.

Thanks in advance for any information.

--
I recommend images from OpenClipart http://www.openclipart.org

--
To UNSUBSCRIBE, email to debian-amd64-requ...@lists.debian.org with a subject of "unsubscribe". Trouble? Contact listmas...@lists.debian.org
RE: 3w_9xxx drive/lenny kernel
My current firmware is 3.08.02.005 and the kernel is 2.6.24-7.

-----Original Message-----
From: nicholas materer [mailto:[EMAIL PROTECTED]
Sent: Wednesday, July 02, 2008 8:10 AM
To: Peter Yorke; debian-amd64@lists.debian.org
Subject: RE: 3w_9xxx drive/lenny kernel

Markus, I am using the standard Debian kernel. I just checked to see that it was installed correctly.

Peter, you reminded me that some time ago I had to upgrade the firmware to get the system to work. My card is a model 9500S-4LP with the following versions (from "show ver" with the tw_ctl tool):

CLI Version = 2.00.07.003
API Version = 2.04.00.003
CLI Compatible Range = [2.00.00.001 to 2.00.07.003]

It looks like there is a new version of the firmware. I will try an upgrade tonight.

Thanks
Nick

-----Original Message-----
From: Peter Yorke [mailto:[EMAIL PROTECTED]
Sent: Wednesday, July 02, 2008 8:38 AM
To: [EMAIL PROTECTED]; debian-amd64@lists.debian.org
Subject: Re: 3w_9xxx drive/lenny kernel

I have been running 3ware 9550s on lenny for over six months without issue with the 2.6.22 and 2.6.24 kernels. What model card are you running? What version of firmware is on the card?

Peter
Typed on my BlackBerry device

----- Original Message -----
From: nicholas materer <[EMAIL PROTECTED]>
To: debian-amd64@lists.debian.org
Sent: Mon Jun 30 20:51:57 2008
Subject: 3w_9xxx drive/lenny kernel

I just updated a system from etch to lenny. With either the linux-image-2.6.24-1-amd64 or the linux-image-2.6.22-3-amd64 kernel, my 3w_9 RAID controller did not function. On the console, lines and lines of PCI parity errors are reported. The 2.6.18-6-amd64 kernel from etch works fine, even after the system is updated. Unfortunately, the system boots off the RAID (not the best idea), so I need to make more effort to capture the exact errors. I did not learn much from Googling. Has anyone had similar experiences with the 3w_9xxx card?

Sort lspci output:

00:00.0 Memory controller: nVidia Corporation CK804 Memory Controller (rev a3)
00:01.0 ISA bridge: nVidia Corporation CK804 ISA Bridge (rev a3)
00:01.1 SMBus: nVidia Corporation CK804 SMBus (rev a2)
00:02.0 USB Controller: nVidia Corporation CK804 USB Controller (rev a2)
00:02.1 USB Controller: nVidia Corporation CK804 USB Controller (rev a3)
00:04.0 Multimedia audio controller: nVidia Corporation CK804 AC'97 Audio Controller (rev a2)
00:06.0 IDE interface: nVidia Corporation CK804 IDE (rev a2)
00:07.0 IDE interface: nVidia Corporation CK804 Serial ATA Controller (rev a3)
00:08.0 IDE interface: nVidia Corporation CK804 Serial ATA Controller (rev a3)
00:09.0 PCI bridge: nVidia Corporation CK804 PCI Bridge (rev a2)
00:0a.0 Bridge: nVidia Corporation CK804 Ethernet Controller (rev a3)
00:0e.0 PCI bridge: nVidia Corporation CK804 PCIE Bridge (rev a3)
00:18.0 Host bridge: Advanced Micro Devices [AMD] K8 [Athlon64/Opteron] HyperTransport Technology Configuration
00:18.1 Host bridge: Advanced Micro Devices [AMD] K8 [Athlon64/Opteron] Address Map
00:18.2 Host bridge: Advanced Micro Devices [AMD] K8 [Athlon64/Opteron] DRAM Controller
00:18.3 Host bridge: Advanced Micro Devices [AMD] K8 [Athlon64/Opteron] Miscellaneous Control
00:19.0 Host bridge: Advanced Micro Devices [AMD] K8 [Athlon64/Opteron] HyperTransport Technology Configuration
00:19.1 Host bridge: Advanced Micro Devices [AMD] K8 [Athlon64/Opteron] Address Map
00:19.2 Host bridge: Advanced Micro Devices [AMD] K8 [Athlon64/Opteron] DRAM Controller
00:19.3 Host bridge: Advanced Micro Devices [AMD] K8 [Athlon64/Opteron] Miscellaneous Control
01:05.0 FireWire (IEEE 1394): Texas Instruments TSB43AB22/A IEEE-1394a-2000 Controller (PHY/Link)
02:00.0 VGA compatible controller: nVidia Corporation NV43 [GeForce 6600] (rev a2)
08:0a.0 PCI bridge: Advanced Micro Devices [AMD] AMD-8131 PCI-X Bridge (rev 12)
08:0a.1 PIC: Advanced Micro Devices [AMD] AMD-8131 PCI-X IOAPIC (rev 01)
08:0b.0 PCI bridge: Advanced Micro Devices [AMD] AMD-8131 PCI-X Bridge (rev 12)
08:0b.1 PIC: Advanced Micro Devices [AMD] AMD-8131 PCI-X IOAPIC (rev 01)
0a:04.0 RAID bus controller: 3ware Inc 9xxx-series SATA-RAID
80:00.0 Memory controller: nVidia Corporation CK804 Memory Controller (rev a3)
80:01.0 Memory controller: nVidia Corporation CK804 Memory Controller (rev a3)
80:0a.0 Bridge: nVidia Corporation CK804 Ethernet Controller (rev a3)
80:0e.0 PCI bridge: nVidia Corporation CK804 PCIE Bridge (rev a3)

Nick Materer
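
When comparing setups as in this thread, it can help to pull just the RAID controller line out of a long lspci listing. The following is only a minimal sketch: the function name find_raid_controllers is made up for illustration, and it assumes the controller is reported with the "RAID bus controller" class string seen in the listing above.

```shell
#!/bin/sh
# Print only the RAID controller lines from lspci-style text on stdin.
# find_raid_controllers is a hypothetical helper name; it relies only on grep.
find_raid_controllers() {
    grep -i 'RAID bus controller'
}

# Example using two lines from the listing above (no hardware needed).
printf '%s\n' \
    '08:0b.1 PIC: Advanced Micro Devices [AMD] AMD-8131 PCI-X IOAPIC (rev 01)' \
    '0a:04.0 RAID bus controller: 3ware Inc 9xxx-series SATA-RAID' \
    | find_raid_controllers
```

On a live system the same filter could be fed from lspci directly (lspci | find_raid_controllers), with uname -r giving the running kernel version to report alongside it.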
RE: 3w_9xxx drive/lenny kernel
Here is the relevant article from 3Ware: http://www.3ware.com/kb/article.aspx?id=15012

Peter

-----Original Message-----
From: Peter Yorke [mailto:[EMAIL PROTECTED]
Sent: Wednesday, July 02, 2008 6:38 AM
To: [EMAIL PROTECTED]; debian-amd64@lists.debian.org
Subject: Re: 3w_9xxx drive/lenny kernel

I have been running 3ware 9550s on lenny for over six months without issue with the 2.6.22 and 2.6.24 kernels. What model card are you running? What version of firmware is on the card?

Peter
Typed on my BlackBerry device

----- Original Message -----
From: nicholas materer <[EMAIL PROTECTED]>
To: debian-amd64@lists.debian.org
Sent: Mon Jun 30 20:51:57 2008
Subject: 3w_9xxx drive/lenny kernel

I just updated a system from etch to lenny. With either the linux-image-2.6.24-1-amd64 or the linux-image-2.6.22-3-amd64 kernel, my 3w_9 RAID controller did not function. On the console, lines and lines of PCI parity errors are reported. The 2.6.18-6-amd64 kernel from etch works fine, even after the system is updated. Unfortunately, the system boots off the RAID (not the best idea), so I need to make more effort to capture the exact errors. I did not learn much from Googling. Has anyone had similar experiences with the 3w_9xxx card?

Nick Materer
Re: 3w_9xxx drive/lenny kernel
I have been running 3ware 9550s on lenny for over six months without issue with the 2.6.22 and 2.6.24 kernels. What model card are you running? What version of firmware is on the card?

Peter
Typed on my BlackBerry device

----- Original Message -----
From: nicholas materer <[EMAIL PROTECTED]>
To: debian-amd64@lists.debian.org
Sent: Mon Jun 30 20:51:57 2008
Subject: 3w_9xxx drive/lenny kernel

I just updated a system from etch to lenny. With either the linux-image-2.6.24-1-amd64 or the linux-image-2.6.22-3-amd64 kernel, my 3w_9 RAID controller did not function. On the console, lines and lines of PCI parity errors are reported. The 2.6.18-6-amd64 kernel from etch works fine, even after the system is updated. Unfortunately, the system boots off the RAID (not the best idea), so I need to make more effort to capture the exact errors. I did not learn much from Googling. Has anyone had similar experiences with the 3w_9xxx card?

Nick Materer
RE: Any thoughts on ext4? [Was: Reiser4 patches]
Yes, you can mount an ext3 file system as ext4:

1. Use 'mkfs.ext3 /dev/DEVICE' to create the file system.
2. To mount the partition as ext4:
   mount -t ext4dev /dev/DEVICE /wherever

To enable extents, use:
   mount -t ext4dev -o extents /dev/DEVICE /wherever

Once mounted with -o extents, the partition cannot be mounted with -t ext3 anymore!

I just wanted to know if anyone had any experience with the new file system on a non-production system.

-----Original Message-----
From: Lennart Sorensen [mailto:[EMAIL PROTECTED]
Sent: Thursday, April 03, 2008 8:27 AM
To: Peter Yorke
Cc: Chris Wakefield; debian-amd64@lists.debian.org
Subject: Re: Any thoughts on ext4? [Was: Reiser4 patches]

On Thu, Apr 03, 2008 at 08:15:20AM -0700, Peter Yorke wrote:
> Curious if anyone has had some experience with ext4?

Not me. I will wait for them to finish writing it first. :) Is it even at a state where one can try it out?

--
Len Sorensen
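
The mkfs and mount steps at the top of this message can be collected into a small dry-run helper. This is only a sketch: the function name ext4dev_commands and its yes/no third argument are invented for illustration, /dev/DEVICE and /wherever are the same placeholders used in the message, and the helper only prints the commands so nothing is formatted or mounted by accident.

```shell
#!/bin/sh
# Dry-run sketch: print the commands from the steps above instead of running them.
# ext4dev_commands is a hypothetical helper; DEVICE and MOUNTPOINT are placeholders.
ext4dev_commands() {
    device=$1
    mountpoint=$2
    use_extents=${3:-no}

    # Step 1: the file system is created as plain ext3.
    echo "mkfs.ext3 $device"

    # Step 2: mount it with the ext4dev driver, optionally enabling extents.
    if [ "$use_extents" = "yes" ]; then
        # One-way step: per the message above, the partition can no longer
        # be mounted with -t ext3 after this.
        echo "mount -t ext4dev -o extents $device $mountpoint"
    else
        echo "mount -t ext4dev $device $mountpoint"
    fi
}

ext4dev_commands /dev/DEVICE /wherever
ext4dev_commands /dev/DEVICE /wherever yes
```

Printing rather than executing keeps the irreversible extents step a deliberate, manual decision.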
Any thoughts on ext4? [Was: Reiser4 patches]
Curious if anyone has had some experience with ext4?

Peter Yorke

-----Original Message-----
From: Lennart Sorensen [mailto:[EMAIL PROTECTED]
Sent: Wednesday, April 02, 2008 3:36 PM
To: Chris Wakefield
Cc: debian-amd64@lists.debian.org
Subject: Re: Reiser4 patches.

On Wed, Apr 02, 2008 at 03:17:24PM -0700, Chris Wakefield wrote:
> Greetings:
>
> I've been looking for an up-to-date URL regarding Reiser4 filesystem;
> namesys.com is down. Does anyone know where I can find Reiser4 stuff?
>
> Although I'm a big Debian fan and fairly experienced, I don't know if it's
> possible to patch Debian sources with Reiser4 from the repos on the fly with
> dpkg or...?
>
> I have managed to chroot into my latest deb install on Reiser4, but I was
> hoping to BOOT directly to it
>
> Grub2 is not patched yet for Reiser4; There are some patches for legacy Grub,
> but it's not compiling. I think it's the fact that the latest dev files are:
> 1.0.6-1 and the legacy grub patch is: 1.0.5.
>
> Any ideas would be appreciated.
>
> By-the-way, I noticed quite a difference of performance with Reiser4.

Personally I prefer my data intact. I had enough bad experience with reiserfs3 that version 4 holds no interest whatsoever. The fact that it isn't part of the kernel also indicates to me that it is nowhere near ready for prime use.

--
Len Sorensen
Re: Opinions on ext3 vs XFS vs reiserfs for LAMP server
One exception: with a RAID controller with battery backup, I can have power outages on XFS mounts without loss or corruption of data.

Peter

----- Original Message -----
From: [EMAIL PROTECTED] <[EMAIL PROTECTED]>
To: debian-amd64@lists.debian.org
Sent: Thu Aug 23 16:07:46 2007
Subject: Re: Opinions on ext3 vs XFS vs reiserfs for LAMP server

Quoting Jim Crilly <[EMAIL PROTECTED]>:

> On 08/23/07 10:03:24AM -0700, [EMAIL PROTECTED] wrote:
>>
>> The problem of zeroing files of XFS still exists, however it's not some
>> mythical type of corruption. You'll only see it on files recently
>> written to within seconds (say approx 60 secs) of a hard power off. If
>> you can't risk it, or think you may encounter the odd hard reset,
>> ext3 might be a better choice.
>>
>
> Actually it's been fixed as of 2.6.22:
> http://oss.sgi.com/projects/xfs/faq.html#nulls
>
> Of course that doesn't help you if you're sticking with the kernel
> shipped with etch.
>

I'm not so sure it's fixed. I just tested with a sid samba box running a 2.6.22 kernel and an XFS filesystem. I connected to it from a WinXP box and copied a Word doc file to it, then soft rebooted the samba box to make sure the file was synced to the hard drive. I re-connected to the samba share, opened the Word document, added some text lines to it, saved and quit Word, then yanked the power out. After rebooting and re-connecting to the samba share, I found the file full of squares. Ext3 would have at least retained the original contents of the file.

I tested the exact same thing again but waited 60 seconds after saving the file, and then yanked the power out. On boot, the file was intact and the save had worked. So there is still a window of about 60 seconds in which a power loss can corrupt newly written files, unless the program can sync them to disk before that.

Cheers,
Mike
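
Mike's closing point, that a program can narrow the window by syncing to disk itself, can be sketched as a small shell helper. This is an illustration under assumptions, not a fix for the XFS behaviour discussed above: save_and_sync is a made-up name, plain sync flushes every mounted filesystem and may be slow, and a drive with a volatile write cache can still lose data on power loss.

```shell
#!/bin/sh
# Write stdin to a temporary file, flush it, then rename it over the target.
# save_and_sync is a hypothetical helper name used only for this sketch.
save_and_sync() {
    target=$1
    tmp="$target.tmp.$$"

    cat > "$tmp" || return 1           # write the new contents beside the target
    sync                               # ask the kernel to flush dirty data to disk
    mv "$tmp" "$target" || return 1    # rename within one filesystem is atomic
    sync                               # flush the rename as well
}

# Demonstration in a throwaway directory.
demo=$(mktemp -d)
echo 'added some text lines' | save_and_sync "$demo/report.txt"
cat "$demo/report.txt"
rm -r "$demo"
```

Writing to a temporary name first means the old contents stay in place until the new data has been flushed, which is the property Mike notes ext3 gave him by default.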
RE: ext3 vs reiserfs 3.6
My biggest problem with reiser wasn't crashing, but data/disk corruption with sleepycat databases. But I agree, memory testing is critical before putting a system into service. Since moving to XFS, rock solid! Even during power outages.

Peter Yorke
Sr. Linux Server Engineer
Vulcan, Inc.

-----Original Message-----
From: Goswin von Brederlow [mailto:[EMAIL PROTECTED]
Sent: Friday, July 28, 2006 4:32 AM
To: Giacomo Mulas
Cc: debian-amd64@lists.debian.org
Subject: Re: ext3 vs reiserfs 3.6

Giacomo Mulas <[EMAIL PROTECTED]> writes:

> On Fri, 28 Jul 2006, Francesco Pietra wrote:
>
>> Since I changed from reiserfs 3.6 to ext3 with debian etch amd64, the system
>> no more suffered any crash, after days of running a very heavy computation
>> with mpqc 2.3.1 with thread command for two dual opterons and 8 GB ram. The
>> computation has now ended to full convergence. That was the most stressing
>> action of memory I could conceive.
>
> Would you be able to put together and make available a simple script with
> such a stress test for other people to try? Especially if you know that it
> consistently crashes your previous setup. I, for one, am curious, since I
> have 7 rock stable amd64 machines happily crunching numbers with (other)
> quantum chemistry applications and using reiserfs. None of them is SMP
> though.
>
> Bye
> Giacomo

We (at my workplace) have lots of them, SMP and not, with reiserfs, and they don't usually crash. They do crash a lot when we get new ones, until we weed out all the bad RAM and such, but after that the majority run stable. For the rest we swap the CPU or the mainboard until they work.

We still do have problems with reiserfs every now and then though. Having power cut from nodes without a proper shutdown seems to be a problem for reiserfs. On reboot the syslogd hangs for ages unless /var is reformatted.

Recently I convinced my boss to switch to another filesystem, but we still have to crash test (e.g. pull the power every 5 minutes) the different filesystems a lot to see which is most robust.

Personally I use ext3 and never had problems on amd64.

MfG
Goswin
RE: reiserfs/md1/failure/threads
Sigh. It's always something.

-----Original Message-----
From: Gabor Gombas [mailto:[EMAIL PROTECTED]
Sent: Thursday, July 20, 2006 2:49 AM
To: Peter Yorke
Cc: [EMAIL PROTECTED]; [EMAIL PROTECTED]; debian-amd64@lists.debian.org
Subject: Re: reiserfs/md1/failure/threads

On Wed, Jul 19, 2006 at 06:07:35AM -0700, Peter Yorke wrote:
> For disk intensive applications like databases and those that stream
> data, XFS is a better choice due to the inherent performance
> capabilities and its mature 64bit legacy in the SGI OS

http://oss.sgi.com/projects/xfs/faq.html#dir2

Nothing is perfect.

Gabor

--
MTA SZTAKI Computer and Automation Research Institute
Hungarian Academy of Sciences
Re: reiserfs/md1/failure/threads
While ext3 is a very stable file system in the amd64 Debian distro, for disk-intensive applications like databases and those that stream data, XFS is a better choice due to its inherent performance capabilities and its mature 64-bit legacy in the SGI OS.

Peter

Peter Yorke
Sr. Linux Server Engineer
--
Thumb typed from a tiny keyboard.

----- Original Message -----
From: Francesco Pietra <[EMAIL PROTECTED]>
To: Lennart Sorensen <[EMAIL PROTECTED]>
Cc: debian-amd64@lists.debian.org
Sent: Wed Jul 19 01:14:31 2006
Subject: Re: reiserfs/md1/failure/threads

Thank you for the most detailed instructions. On balance, I decided to carry out a fresh install of amd64 to have ext3 as the file system. Your (and the general) strong advice to change to ext3 cannot be ignored. I am just downloading the daily-built amd64 netinst CD, so that I can also help with the preparation of the beta3 release of the net install.

This does not mean that I can be sure my hardware is sound, but your examination of the signals I have produced suggests that it is. I have postponed the examination of the hardware because the disks are OK and the memory could be changed should it prove faulty. It seems the contrary of what one normally does (hardware before software), but I am not sure I can arrive at a conclusive test of my hardware.

I do not have much to install besides the base OS and a few tasks: jwd window manager, sensors, compilers if needed, your compilation of mpqc, and my re-compilation of molecular mechanics (to be carried out in any case because of improvements to the code). That's about all. I can anticipate that mpqc 2.3.1 proves great.

Thanks again
francesco

On Tuesday 18 July 2006 19:44, Lennart Sorensen wrote:
> On Tue, Jul 18, 2006 at 05:33:51PM +0200, Francesco Pietra wrote:
> > Not to insist any further on the relative merits of the various
> > filesystems, but in the general interest of maintaining amd64 (and
> > therefore of examining parameters one at a time, without mixing
> > problems), did you notice my e-mail of today emphasizing that after the
> > crash my data are intact? I wonder whether your suspicion about memory or
> > cpu may be the point. How to carry out a thorough memory test and
> > identify which slot is defective, if any? Although Kingston ECC, one
> > of the eight slots (1GB each) might be defective.
>
> Well I have certainly seen a number of messages from people with
> opterons having memory problems over the last few months. The opteron
> seems to be very picky about memory quality, which makes some sense
> given how efficiently it uses it. It drives the memory quite hard.
>
> Simplest way I know of to test memory and cpu, is to run a lot of large
> kernel compiles. Often a memory problem will cause that to segfault.
> Anything that uses lots of cpu and lots of memory is usually a good
> test, at least if it fails spectacularly on an error, like gcc tends to
> do.
>
> To test the memory, remove half of it, and try the test. If it fails,
> replace one stick of memory with one of the other ones, until you can
> run the test without a problem. You could probably even run the test
> with 1 or 2 sticks of memory. A number of people have managed to find
> faulty memory on an opteron this way. Some people have come back going
> "I found a faulty stick of memory" after swearing that memtest86 had
> said all their ram was fine and they were sure their name brand ram
> wasn't faulty. :) memtest86 doesn't catch all errors. Of course with
> ECC memory I would have expected to see a machine check exception (MCE)
> if there were any single bit errors in the memory. I am still most
> inclined to blame reiserfs or perhaps the cpu. Of course since it was
> multiple errors all coming from reiserfs, with apparently nothing else
> seeing a problem, I really think it may simply be a reiserfs bug. I was
> using XFS before on early 2.6 kernels on i386, and eventually had to
> give up and move to ext3 since it just wasn't reliable on top of LVM on
> top of MD raid. The filesystem had some bad interaction with the LVM
> and MD raid that made it not work. It probably got fixed since, but I
> needed something that worked then, and ext3 worked.
>
> > What about checking the cpu? I can simply tell that I monitored the
> > temperature during the long calculation, with the machine in a strongly
> > ventilated area. Starting from 36C, the temp rose to 44C at maximum. I
> > don't know the correspondence with real temp ($sensors) but the
> > difference should tell. AMD for my 265 dual opterons indicates case
> > temperature 49-67C (is what I measured just case temp?). AMD also
> > indicate as temp limits 10-35, but I guess this sho
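
Lennart's suggestion of running a lot of large kernel compiles and watching for a failure can be wrapped in a small loop. This is only a sketch: the helper name repeat_until_failure is invented here, and the make invocation in the usage note is an assumed example of a heavy build, not something taken from the thread.

```shell
#!/bin/sh
# Run a command repeatedly and stop at the first failing run.
# repeat_until_failure is a hypothetical helper for this sketch.
repeat_until_failure() {
    runs=$1
    shift
    i=1
    while [ "$i" -le "$runs" ]; do
        if ! "$@"; then
            # A segfaulting compiler exits non-zero, which ends up here.
            echo "run $i failed"
            return 1
        fi
        i=$((i + 1))
    done
    echo "all $runs runs passed"
}

# Harmless demonstration; prints "all 3 runs passed".
repeat_until_failure 3 true
```

In a kernel source tree one might run something like repeat_until_failure 20 sh -c 'make clean && make -j4', then repeat with half the memory sticks removed, as described in the quoted message, to narrow down a faulty one.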
Re: reiserfs/md1/failure/threads
A few other solutions I had with reiser:

I was seeing consistent sleepycat database corruption, together with the type of log messages you described, on MD softraids about 6 months ago. I ended up dumping reiser for XFS and upgrading the kernel from 2.6.8 to 2.6.12.

For reasons of performance/stability/reliability, some of these systems needed 3ware RAID controllers, which required kernel upgrades to 2.6.16.

I would have liked to have stayed with reiser, but it just didn't seem to behave well in the Debian amd64 OS.

Today I have over 100 happy Debian amd64 servers running 2.6.12/16 with XFS on soft and hard RAID configurations.

Peter

Peter Yorke
Sr. Linux Server Engineer
--
Thumb typed from a tiny keyboard.

----- Original Message -----
From: Jo Shields <[EMAIL PROTECTED]>
To: Erik Mouw <[EMAIL PROTECTED]>
Cc: Francesco Pietra <[EMAIL PROTECTED]>; debian-amd64@lists.debian.org; [EMAIL PROTECTED] <[EMAIL PROTECTED]>
Sent: Tue Jul 18 05:37:20 2006
Subject: Re: reiserfs/md1/failure/threads

Erik Mouw wrote:
> On Tue, Jul 18, 2006 at 12:01:31PM +0100, Jo Shields wrote:
>
>> Mickael Marchand wrote:
>>
>>> check your memory (yes it's going to be long, but that's almost always
>>> the reason of reiserfs failures)
>>> I am stressing hard reiserfs on various amd64/em64t boxes, no pbl so
>>> far.
>>> every box I found corrupting filesystems were having :
>>> 1 - bad hard drives that a low scan confirmed
>>> or 2 - bad memory that a real long memtest could detect
>>>
>>> Cheers,
>>> Mik
>>>
>>
>> I'll add to this - I've seen corruption with all filesystems on my
>> office desktop (which has screwed memory, but they refuse to fix it).
>> EXT3 gave up on journalling & just started writing junk, costing me my
>> /home.
>>
>
> Ext2/ext3 complains about errors, but you normally don't see that
> because it's hidden in the system log files. It's a good thing to mount
> partitions with the "errors=remount-ro" option. If anything goes wrong,
> the kernel will mount the partition read-only. Reboot+fsck will save
> your data.
>
>> Reiser is lasting up better, but reiserfsck segfaults when it
>> sees /home
>>
>
> That means that the filesystem has errors. Reiserfsck is able to detect
> them, but because nobody has seen those errors before it will segfault
> on them. That also means that the reiserfs filesystem driver in the
> kernel will happily screw the filesystem further up without notice.
> Back up your data *NOW* before it's too late.
>

Meh. I stopped using my desktop for real work a while back, since it locks up under load. If I ask nicely, they take my machine away for a week, sit on it, then give it back to me.
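
Erik's advice about the "errors=remount-ro" option is easy to audit. The sketch below lists ext2/ext3 entries in an fstab-style file that lack the option; the helper name missing_remount_ro and the sample device names are made up for illustration, and only the usual six-column fstab layout is assumed.

```shell
#!/bin/sh
# List mount points of ext2/ext3 entries that lack errors=remount-ro.
# Input: an fstab-style file (device, mount point, type, options, dump, pass).
# missing_remount_ro is a hypothetical helper name for this sketch.
missing_remount_ro() {
    awk '$1 !~ /^#/ && ($3 == "ext2" || $3 == "ext3") &&
         $4 !~ /(^|,)errors=remount-ro(,|$)/ { print $2 }' "$1"
}

# Demonstration with a throwaway file; the devices are invented examples.
demo=$(mktemp)
cat > "$demo" <<'EOF'
/dev/md0 /     ext3 defaults,errors=remount-ro 0 1
/dev/md1 /home ext3 defaults                   0 2
/dev/md2 /srv  xfs  defaults                   0 2
EOF
missing_remount_ro "$demo"   # prints /home
rm "$demo"
```

Running it against /etc/fstab would show which ext2/ext3 file systems would keep accepting writes after the kernel detects an error.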
RE: Adding tg3 into a 2.6.12 kernel
That worked fine, thanks.

Peter

-----Original Message-----
From: Johan Groth [mailto:[EMAIL PROTECTED]
Sent: Friday, July 22, 2005 9:39 AM
To: Peter Yorke
Cc: debian-amd64@lists.debian.org
Subject: Re: Adding tg3 into a 2.6.12 kernel

Peter Yorke wrote:
> How does one add the tg3 into a 2.6.12 kernel or will be included at
> some point?

When I installed linux-image-2.6.12-1-amd64-k8-smp, it was already there :), so you don't have to do anything. So both my big concerns were solved in one go, the first one being support for LSI MegaRaid SATA, which requires megaraid_mbox.

/Johan
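
Whether a given kernel package already ships tg3, as Johan found, can be checked from its build configuration. A minimal sketch, assuming the config file uses the mainline Kconfig symbol CONFIG_TIGON3 for the tg3 driver; the helper name tg3_status and the sample file are made up for illustration.

```shell
#!/bin/sh
# Report how the tg3 (Broadcom Tigon3) driver is configured in a kernel config file.
# tg3_status is a hypothetical helper name for this sketch.
tg3_status() {
    case $(grep '^CONFIG_TIGON3=' "$1" 2>/dev/null) in
        CONFIG_TIGON3=y) echo "built-in" ;;
        CONFIG_TIGON3=m) echo "module" ;;
        *)               echo "not enabled" ;;
    esac
}

# Demonstration with a throwaway config fragment.
demo=$(mktemp)
printf '%s\n' 'CONFIG_NET=y' 'CONFIG_TIGON3=m' > "$demo"
tg3_status "$demo"   # prints module
rm "$demo"
```

On a Debian system the installed kernel's configuration is normally available as /boot/config-$(uname -r), so tg3_status "/boot/config-$(uname -r)" would answer the question for the running kernel.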
Adding tg3 into a 2.6.12 kernel
How does one add tg3 to a 2.6.12 kernel, or will it be included at some point?

Peter

>> Must remember to add back tg3 I guess which disappeared after 2.6.8
>> sometime. Unless it didn't make it in I guess. :)