Re: raid1 issue, somewhat related to recent "debian on big machines"
> You create the partitions in the installer, setting the partition use to
> 'raid'. You then select configure raid, and set up all your raid devices.
> Then when you return to the partition tool you will see the new raid
> md devices, which you then set to 'use as LVM volume'. Then you go to
> configure logical volume management, and set up your LVM volumes there
> for your actual file systems. When you come back to the partition tool
> again you will see the LVM volumes listed, and you can select them and
> pick the filesystem type and mountpoint for each LVM volume.

This might be handy:
http://www.howtoforge.com/set-up-a-fully-encrypted-raid1-lvm-system

> --
> Len Sorensen

--
Best regards,
Dennis Johansen
Re: raid1 issue, somewhat related to recent "debian on big machines"
On Wed, Mar 11, 2009 at 09:53:14AM +0100, Francesco Pietra wrote:
> To my dismay, I tried (repeatedly) unsuccessfully to implement the
> scheme below on the old Tyan S2895 with two dual-core Opterons and two
> new Maxtor 250GB disks, before moving to the new machine. With the
> recent amd64 installer, I tried to set up (manually) the two partitions
> on both disks to set up raid1.
>
> First, I tried with a 0.2GB partition for boot, but I found no way to
> have LVM for the other partition, nor anywhere to set the root file
> system.
>
> Then, I tried with a 1GB partition, but found no way to have it serve
> both boot and root.
>
> In both cases, the installer complained about the root file system.
>
> What I need for compiling applications is /home, /usr, /opt, /var and
> swap. The bad way I used previously was to start from these partitions
> and put each on its own raid. So I finished with many raid devices.

You create the partitions in the installer, setting the partition use to
'raid'. You then select configure raid, and set up all your raid devices.
Then when you return to the partition tool you will see the new raid
md devices, which you then set to 'use as LVM volume'. Then you go to
configure logical volume management, and set up your LVM volumes there
for your actual file systems. When you come back to the partition tool
again you will see the LVM volumes listed, and you can select them and
pick the filesystem type and mountpoint for each LVM volume.

--
Len Sorensen
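The LVM step above is where /home, /usr, /opt, /var and swap get carved
out. For reference, roughly the same thing done by hand from a shell -- a
sketch only; the volume group name "vg0", the sizes, and the assumption
that the LVM-tagged raid1 device ended up as /dev/md2 are illustrative,
not taken from this thread's actual setup:

  pvcreate /dev/md2                  # the raid1 array that holds LVM
  vgcreate vg0 /dev/md2
  lvcreate -L 10G -n usr vg0
  lvcreate -L 10G -n var vg0
  lvcreate -L 10G -n opt vg0
  lvcreate -L 8G -n swap vg0
  lvcreate -l 100%FREE -n home vg0   # everything left goes to /home
  mkfs.ext3 /dev/vg0/usr             # likewise for var, opt, home
  mkswap /dev/vg0/swap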
Re: raid1 issue, somewhat related to recent "debian on big machines"
On Fri, Mar 13, 2009 at 11:45:37AM +0100, Goswin von Brederlow wrote:
> Alex Samad writes:
>
>> On Wed, Mar 11, 2009 at 11:02:31AM +0100, Goswin von Brederlow wrote:
>>> Alex Samad writes:
>>>
>>>> On Wed, Mar 11, 2009 at 09:53:14AM +0100, Francesco Pietra wrote:

[snip]

>> I would still argue for a separate /boot - plain old ext2, mount it ro
>> until a kernel upgrade, maybe store a rescue image on there; with the
>> size of disks nowadays, what's 500M or even 10G?
>
> Have a 1GB / and mount it read-only.
>
> But if you do want a separate /boot, then put / on LVM too, and move
> /etc/lvm to /boot/lvm and symlink it. There is no reason to have
> another partition and raid, and the benefits of LVM are there too.

Yes I could, but I wouldn't, just for peace of mind. If I waste a
partition then I waste one; it's another layer of "just in case". I can
easily load up one disk from a raid1, but trying to work out where the
data is on an LVM-on-raid is, well...

> Regards,
> Goswin
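For what it's worth, the "load up one disk from a raid1" case mentioned
above looks roughly like this -- a sketch, assuming the surviving member
is /dev/sdb2 and the array uses the old 0.90 metadata (superblock at the
end of the partition), which was typical of installs from that era:

  # Assemble and start a degraded raid1 from the single surviving member:
  mdadm --assemble --run /dev/md1 /dev/sdb2
  mount -o ro /dev/md1 /mnt

  # With 0.90 metadata the superblock sits at the end of the partition,
  # so in a pinch the bare member can even be mounted read-only:
  mount -o ro /dev/sdb2 /mnt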
Re: raid1 issue, somewhat related to recent "debian on big machines"
Alex Samad writes:

> On Wed, Mar 11, 2009 at 11:02:31AM +0100, Goswin von Brederlow wrote:
>> Alex Samad writes:
>>
>>> On Wed, Mar 11, 2009 at 09:53:14AM +0100, Francesco Pietra wrote:
>>>> To my dismay, I tried (repeatedly) unsuccessfully to implement the
>>>> scheme below on the old Tyan S2895 with two dual-core Opterons and
>>>> two new Maxtor 250GB disks, before moving to the new machine. With
>>>> the recent amd64 installer, I tried to set up (manually) the two
>>>> partitions on both disks to set up raid1.
>>>>
>>>> First, I tried with a 0.2GB partition for boot, but I found no way
>>>> to have LVM for the other partition, nor anywhere to set the root
>>>> file system.
>>>>
>>>> Then, I tried with a 1GB partition, but found no way to have it
>>>> serve both boot and root.
>>>
>>> From memory, but this is the outline of how I install:
>>>
>>> Create 3 partitions (1, 2, 3) on sda and sdb, of 500M, 10G, and the
>>> rest of the hard drive (all of this is going to be raid1).
>>>
>>> Select all the partitions to be used as raid devices.
>>>
>>> Configure raid:
>>>   md0 = sda1 sdb1
>>>   md1 = sda2 sdb2
>>>   md2 = sda3 sdb3
>>>
>>> Select md0 as type ext2, mount /boot.
>>> Select md1 as type ext3, mount /.
>>> Select md2 as an LVM device.
>>
>> If you have a separate / then you don't need /boot, and 10G for /
>> without /home, /usr, /var (see below) is way too big.
>
> I had forgotten about /var; I usually only place /var/log on a
> separate LVM volume.
>
> I would still argue for a separate /boot - plain old ext2, mount it ro
> until a kernel upgrade, maybe store a rescue image on there; with the
> size of disks nowadays, what's 500M or even 10G?

Have a 1GB / and mount it read-only.

But if you do want a separate /boot, then put / on LVM too, and move
/etc/lvm to /boot/lvm and symlink it. There is no reason to have another
partition and raid, and the benefits of LVM are there too.

Regards,
Goswin
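A minimal sketch of that /etc/lvm move -- assuming /boot is mounted and
writable; the point is that the LVM metadata backups then live on a
plain non-LVM filesystem:

  cp -a /etc/lvm /boot/lvm
  rm -rf /etc/lvm
  ln -s /boot/lvm /etc/lvm   # vgcfgbackup archives now land in /boot/lvm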
Re: raid1 issue, somewhat related to recent "debian on big machines"
On Wed, Mar 11, 2009 at 11:02:31AM +0100, Goswin von Brederlow wrote:
> Alex Samad writes:
>
>> On Wed, Mar 11, 2009 at 09:53:14AM +0100, Francesco Pietra wrote:
>>> To my dismay, I tried (repeatedly) unsuccessfully to implement the
>>> scheme below on the old Tyan S2895 with two dual-core Opterons and
>>> two new Maxtor 250GB disks, before moving to the new machine. With
>>> the recent amd64 installer, I tried to set up (manually) the two
>>> partitions on both disks to set up raid1.
>>>
>>> First, I tried with a 0.2GB partition for boot, but I found no way
>>> to have LVM for the other partition, nor anywhere to set the root
>>> file system.
>>>
>>> Then, I tried with a 1GB partition, but found no way to have it
>>> serve both boot and root.
>>
>> From memory, but this is the outline of how I install:
>>
>> Create 3 partitions (1, 2, 3) on sda and sdb, of 500M, 10G, and the
>> rest of the hard drive (all of this is going to be raid1).
>>
>> Select all the partitions to be used as raid devices.
>>
>> Configure raid:
>>   md0 = sda1 sdb1
>>   md1 = sda2 sdb2
>>   md2 = sda3 sdb3
>>
>> Select md0 as type ext2, mount /boot.
>> Select md1 as type ext3, mount /.
>> Select md2 as an LVM device.
>
> If you have a separate / then you don't need /boot, and 10G for /
> without /home, /usr, /var (see below) is way too big.

I had forgotten about /var; I usually only place /var/log on a separate
LVM volume.

I would still argue for a separate /boot - plain old ext2, mount it ro
until a kernel upgrade, maybe store a rescue image on there; with the
size of disks nowadays, what's 500M or even 10G?

>> Configure LVM:
>>   ... create your LVM volumes
>>   Select each one and specify fs type and mount point.
>>
>> Then proceed.
>
> The tricky part, I think, is that you have to configure the partitions
> to be used for raid before you can actually create a raid. Then you
> have to configure the raid devices to be used for LVM before you can
> actually create the LVM stuff. It makes raid/lvm kind of hidden.

Yep, if you follow the steps above that should cover it; you have to
build the building blocks first.

> Regards,
> Goswin
Re: raid1 issue, somewhat related to recent "debian on big machines"
Alex Samad writes:

> On Wed, Mar 11, 2009 at 09:53:14AM +0100, Francesco Pietra wrote:
>> To my dismay, I tried (repeatedly) unsuccessfully to implement the
>> scheme below on the old Tyan S2895 with two dual-core Opterons and
>> two new Maxtor 250GB disks, before moving to the new machine. With
>> the recent amd64 installer, I tried to set up (manually) the two
>> partitions on both disks to set up raid1.
>>
>> First, I tried with a 0.2GB partition for boot, but I found no way to
>> have LVM for the other partition, nor anywhere to set the root file
>> system.
>>
>> Then, I tried with a 1GB partition, but found no way to have it serve
>> both boot and root.
>
> From memory, but this is the outline of how I install:
>
> Create 3 partitions (1, 2, 3) on sda and sdb, of 500M, 10G, and the
> rest of the hard drive (all of this is going to be raid1).
>
> Select all the partitions to be used as raid devices.
>
> Configure raid:
>   md0 = sda1 sdb1
>   md1 = sda2 sdb2
>   md2 = sda3 sdb3
>
> Select md0 as type ext2, mount /boot.
> Select md1 as type ext3, mount /.
> Select md2 as an LVM device.

If you have a separate / then you don't need /boot, and 10G for /
without /home, /usr, /var (see below) is way too big.

> Configure LVM:
>   ... create your LVM volumes
>   Select each one and specify fs type and mount point.
>
> Then proceed.

The tricky part, I think, is that you have to configure the partitions
to be used for raid before you can actually create a raid. Then you have
to configure the raid devices to be used for LVM before you can actually
create the LVM stuff. It makes raid/lvm kind of hidden.

Regards,
Goswin
Re: raid1 issue, somewhat related to recent "debian on big machines"
On Wed, Mar 11, 2009 at 09:53:14AM +0100, Francesco Pietra wrote:
> To my dismay, I tried (repeatedly) unsuccessfully to implement the
> scheme below on the old Tyan S2895 with two dual-core Opterons and two
> new Maxtor 250GB disks, before moving to the new machine. With the
> recent amd64 installer, I tried to set up (manually) the two partitions
> on both disks to set up raid1.
>
> First, I tried with a 0.2GB partition for boot, but I found no way to
> have LVM for the other partition, nor anywhere to set the root file
> system.
>
> Then, I tried with a 1GB partition, but found no way to have it serve
> both boot and root.

From memory, but this is the outline of how I install:

Create 3 partitions (1, 2, 3) on sda and sdb, of 500M, 10G, and the rest
of the hard drive (all of this is going to be raid1).

Select all the partitions to be used as raid devices.

Configure raid:
  md0 = sda1 sdb1
  md1 = sda2 sdb2
  md2 = sda3 sdb3

Select md0 as type ext2, mount /boot.
Select md1 as type ext3, mount /.
Select md2 as an LVM device.

Configure LVM:
  ... create your LVM volumes
  Select each one and specify fs type and mount point.

Then proceed.

> In both cases, the installer complained about the root file system.
>
> What I need for compiling applications is /home, /usr, /opt, /var and
> swap. The bad way I used previously was to start from these partitions
> and put each on its own raid. So I finished with many raid devices.
>
> thanks
> francesco

[snip]
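For the record, building the same layout by hand from a shell looks
roughly like this -- a sketch; the disk names and sizes are the ones
assumed in the outline above, and "vg0" is an illustrative name:

  # After creating sd{a,b}1 (500M), sd{a,b}2 (10G) and sd{a,b}3 (rest),
  # each with partition type fd (Linux raid autodetect):
  mdadm --create /dev/md0 --level=1 --raid-devices=2 /dev/sda1 /dev/sdb1
  mdadm --create /dev/md1 --level=1 --raid-devices=2 /dev/sda2 /dev/sdb2
  mdadm --create /dev/md2 --level=1 --raid-devices=2 /dev/sda3 /dev/sdb3

  mkfs.ext2 /dev/md0        # /boot
  mkfs.ext3 /dev/md1        # /
  pvcreate /dev/md2         # the LVM device
  vgcreate vg0 /dev/md2     # then lvcreate the individual volumes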
Re: raid1 issue, somewhat related to recent "debian on big machines"
To my dismay, I tried (repeatedly) unsuccessfully to implement the
scheme below on the old Tyan S2895 with two dual-core Opterons and two
new Maxtor 250GB disks, before moving to the new machine. With the
recent amd64 installer, I tried to set up (manually) the two partitions
on both disks to set up raid1.

First, I tried with a 0.2GB partition for boot, but I found no way to
have LVM for the other partition, nor anywhere to set the root file
system.

Then, I tried with a 1GB partition, but found no way to have it serve
both boot and root.

In both cases, the installer complained about the root file system.

What I need for compiling applications is /home, /usr, /opt, /var and
swap. The bad way I used previously was to start from these partitions
and put each on its own raid. So I finished with many raid devices.

thanks
francesco


On Tue, Mar 3, 2009 at 9:52 PM, Alex Samad wrote:
> On Tue, Mar 03, 2009 at 12:26:27PM +0100, Goswin von Brederlow wrote:
>> Francesco Pietra writes:
>
> [snip]
>
>> That is a lot of raids. Have you ever thought about using LVM? The
>> different raid1 arrays will mess up each other's assumptions about
>> the head positioning of the component devices. On reads the Linux
>> kernel tries to use the disk with the shorter seek, and assumes the
>> head is where it left it on the last access. But if one of the other
>> raids used that disk, the head will be way off.
>>
>> I would suggest the following scheme:
>
> this is what I would recommend as well
>
>> sda1 / sdb1 : 100MB raid1 for /boot (or 1GB for / + /boot)
>> sda2 / sdb2 : rest raid1 with LVM
>>
>> Regards,
>> Goswin
Re: raid1 issue, somewhat related to recent "debian on big machines"
lsore...@csclub.uwaterloo.ca (Lennart Sorensen) writes:

> On Tue, Mar 03, 2009 at 10:59:50PM +0100, Francesco Pietra wrote:
>> I understand that the doubly-recommended scheme is fine. However, I
>> am pressed to answer the referees of a submitted paper, as they
>> requested additional computations. That was going on until the host
>> suspended access to sda. As I find it risky to go on with one disk
>> only (for a computation of many days), could you please explain how
>> to reactivate the removed sda, or how to format it to see if it
>> recovers? I made some proposals in a previous post. Or indicate
>> whether the best course is to replace the disk with a new one.
>
> You can simply ask mdadm to re-add it and let it rebuild, but likely
> the error will happen again and you will need to replace the disk.
>
> If you replace the disk (with a disk at least as big as the old one),

If you do have hotplug support, then don't forget to mdadm --remove the
sda partitions from all raids before pulling the disk.

> then copy the partition table from the working drive and reread the
> partition table (hdparm -z /dev/sda can do that for you), and finally
> re-add the partitions to the various raids. You do not need to create
> a filesystem, since the filesystem runs on the raid, not the
> individual partitions, and hence you already have your filesystems
> made.

Regards,
Goswin
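Spelled out, that removal step might look like this -- a sketch; the
array/partition pairs are the ones from Francesco's mdstat output
elsewhere in this thread, with sda as the failing disk:

  # Mark each sda member failed (the kernel already did here) and remove it:
  mdadm /dev/md0 --fail /dev/sda1 --remove /dev/sda1
  mdadm /dev/md1 --fail /dev/sda2 --remove /dev/sda2
  mdadm /dev/md2 --fail /dev/sda3 --remove /dev/sda3
  # ...and likewise for md3-md6 (sda5-sda8), then:
  cat /proc/mdstat          # confirm no array still lists an sda member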
Re: raid1 issue, somewhat related to recent "debian on big machines"
On Tue, Mar 03, 2009 at 10:59:50PM +0100, Francesco Pietra wrote:
> I understand that the doubly-recommended scheme is fine. However, I am
> pressed to answer the referees of a submitted paper, as they requested
> additional computations. That was going on until the host suspended
> access to sda. As I find it risky to go on with one disk only (for a
> computation of many days), could you please explain how to reactivate
> the removed sda, or how to format it to see if it recovers? I made
> some proposals in a previous post. Or indicate whether the best course
> is to replace the disk with a new one.

You can simply ask mdadm to re-add it and let it rebuild, but likely the
error will happen again and you will need to replace the disk.

If you replace the disk (with a disk at least as big as the old one),
then copy the partition table from the working drive and reread the
partition table (hdparm -z /dev/sda can do that for you), and finally
re-add the partitions to the various raids. You do not need to create a
filesystem, since the filesystem runs on the raid, not the individual
partitions, and hence you already have your filesystems made.

--
Len Sorensen
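Concretely, the whole replacement might run like this -- a sketch; using
sfdisk to copy the partition table is an assumption on my part (any tool
that clones a DOS partition table will do), the rest follows the steps
above:

  # New blank disk is /dev/sda, good disk is /dev/sdb:
  sfdisk -d /dev/sdb | sfdisk /dev/sda   # clone the partition table
  hdparm -z /dev/sda                     # make the kernel reread it

  # Re-add each partition to its raid; the arrays then resync:
  mdadm /dev/md0 --add /dev/sda1
  mdadm /dev/md1 --add /dev/sda2
  # ...and so on for the remaining arrays
  watch cat /proc/mdstat                 # follow the rebuild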
Re: raid1 issue, somewhat related to recent "debian on big machines"
On Tue, Mar 3, 2009 at 9:52 PM, Alex Samad wrote:
> On Tue, Mar 03, 2009 at 12:26:27PM +0100, Goswin von Brederlow wrote:
>> Francesco Pietra writes:
>
> [snip]
>
>> That is a lot of raids. Have you ever thought about using LVM? The
>> different raid1 arrays will mess up each other's assumptions about
>> the head positioning of the component devices. On reads the Linux
>> kernel tries to use the disk with the shorter seek, and assumes the
>> head is where it left it on the last access. But if one of the other
>> raids used that disk, the head will be way off.
>>
>> I would suggest the following scheme:
>
> this is what I would recommend as well

I understand that the doubly-recommended scheme is fine. However, I am
pressed to answer the referees of a submitted paper, as they requested
additional computations. That was going on until the host suspended
access to sda. As I find it risky to go on with one disk only (for a
computation of many days), could you please explain how to reactivate
the removed sda, or how to format it to see if it recovers? I made some
proposals in a previous post. Or indicate whether the best course is to
replace the disk with a new one.

Thanks
francesco

>> sda1 / sdb1 : 100MB raid1 for /boot (or 1GB for / + /boot)
>> sda2 / sdb2 : rest raid1 with LVM
>>
>> Regards,
>> Goswin
Re: raid1 issue, somewhat related to recent "debian on big machines"
On Tue, Mar 03, 2009 at 12:26:27PM +0100, Goswin von Brederlow wrote:
> Francesco Pietra writes:

[snip]

> That is a lot of raids. Have you ever thought about using LVM? The
> different raid1 arrays will mess up each other's assumptions about the
> head positioning of the component devices. On reads the Linux kernel
> tries to use the disk with the shorter seek, and assumes the head is
> where it left it on the last access. But if one of the other raids
> used that disk, the head will be way off.
>
> I would suggest the following scheme:

This is what I would recommend as well.

> sda1 / sdb1 : 100MB raid1 for /boot (or 1GB for / + /boot)
> sda2 / sdb2 : rest raid1 with LVM
>
> Regards,
> Goswin
Re: raid1 issue, somewhat related to recent "debian on big machines"
> [snip]
>   c8 00 08 47 9d 9b 4c 00      00:52:46.400  READ DMA
>
> Error 10 occurred at disk power-on lifetime: 1940 hours (80 days + 20 hours)
>   When the command that caused the error occurred, the device was
>   active or idle.
>
>   After command completion occurred, registers were:
>   ER ST SC SN CL CH DH
>   -- -- -- -- -- -- --
>   40 51 08 4a 9d 9b ec  Error: UNC 8 sectors at LBA = 0x0c9b9d4a = 211524938
>
>   Commands leading to the command that caused the error were:
>   CR FR SC SN CL CH DH DC   Powered_Up_Time  Command/Feature_Name
>   -- -- -- -- -- -- -- --   ---------------  --------------------
>   c8 00 08 47 9d 9b 4c 00      00:52:42.950  READ DMA
>   ec 00 08 4a 9d 9b 00 00      00:52:42.950  IDENTIFY DEVICE
>   c8 00 08 47 9d 9b 4c 00      00:52:42.950  READ DMA
>   ec 00 08 4a 9d 9b 00 00      00:52:42.950  IDENTIFY DEVICE
>   c8 00 08 47 9d 9b 4c 00      00:52:42.950  READ DMA
>
> Error 9 occurred at disk power-on lifetime: 1940 hours (80 days + 20 hours)
>   When the command that caused the error occurred, the device was
>   active or idle.
>
>   After command completion occurred, registers were:
>   ER ST SC SN CL CH DH
>   -- -- -- -- -- -- --
>   40 51 08 4a 9d 9b ec  Error: UNC 8 sectors at LBA = 0x0c9b9d4a = 211524938
>
>   Commands leading to the command that caused the error were:
>   CR FR SC SN CL CH DH DC   Powered_Up_Time  Command/Feature_Name
>   -- -- -- -- -- -- -- --   ---------------  --------------------
>   c8 00 08 47 9d 9b 4c 00      00:52:39.800  READ DMA
>   ec 00 08 4a 9d 9b 00 00      00:52:39.800  IDENTIFY DEVICE
>   c8 00 08 47 9d 9b 4c 00      00:52:39.800  READ DMA
>   ec 00 08 4a 9d 9b 00 00      00:52:39.800  IDENTIFY DEVICE
>   c8 00 08 47 9d 9b 4c 00      00:52:39.800  READ DMA
>
> Error 8 occurred at disk power-on lifetime: 1940 hours (80 days + 20 hours)
>   When the command that caused the error occurred, the device was
>   active or idle.
>
>   After command completion occurred, registers were:
>   ER ST SC SN CL CH DH
>   -- -- -- -- -- -- --
>   40 51 08 4a 9d 9b ec  Error: UNC 8 sectors at LBA = 0x0c9b9d4a = 211524938
>
>   Commands leading to the command that caused the error were:
>   CR FR SC SN CL CH DH DC   Powered_Up_Time  Command/Feature_Name
>   -- -- -- -- -- -- -- --   ---------------  --------------------
>   c8 00 08 47 9d 9b 4c 00      00:52:36.600  READ DMA
>   ec 00 08 4a 9d 9b 00 00      00:52:36.600  IDENTIFY DEVICE
>   c8 00 08 47 9d 9b 4c 00      00:52:36.600  READ DMA
>   c8 00 08 3f 9d 9b 4c 00      00:52:36.600  READ DMA
>   c8 00 08 37 9d 9b 4c 00      00:52:36.600  READ DMA
>
> SMART Self-test log structure revision number 1
> No self-tests have been logged.  [To run self-tests, use: smartctl -t]
> ===
>
> I.e., I am still uncertain whether sda has to be replaced with a new
> disk, or whether the errors reported were temporary and have been
> handled by raid1.
>
> thanks
> francesco
>
> ---------- Forwarded message ----------
> From: Francesco Pietra
> Date: Tue, Mar 3, 2009 at 11:21 AM
> Subject: Re: raid1 issue, somewhat related to recent "debian on big machines"
> To: Ron Johnson
>
> On Tue, Mar 3, 2009 at 10:08 AM, Ron Johnson wrote:
>> On 03/03/2009 02:53 AM, Francesco Pietra wrote:
>>> lupus in fabula, as a follow-up of my short intervention on raid1
>>> with my machine to the thread "Debian on big systems".
>>>
>>> System: Supermicro H8QC8 motherboard, two WD Raptor SATA 150GB,
>>> Debian amd64 lenny, raid1
>>>
>>> While running an electronic molecular calculation - estimated to
>>> take four days - I noticed by chance on the screen (this is not in
>>> the output file of the calculation) that there was a disk problem.
>>> I took some scattered notes from the screen:
>>>
>>> RAID1 conf printout
>>> wd:1 rd:2
>>
>> [snip]
>>
>> What you are looking for should be in syslog, not your application's
>> log.
>
> OK, but /var/log/syslog tells nothing more than what I noted from the
> screen: sda sector 0 problematic, disk failure, continuing on one
> disk. My question is: what does the lshw -class disk output mean (SCSI
> vs SATA, as I have shown), and does one disk have to be replaced with
> a new one? If so, to identify which disk is which, can I detach the
> SATA connection to the disks and see which one works?
>
> thanks
> francesco
Fwd: raid1 issue, somewhat related to recent "debian on big machines"
SMART Self-test log structure revision number 1
No self-tests have been logged.  [To run self-tests, use: smartctl -t]
===

I.e., I am still uncertain whether sda has to be replaced with a new
disk, or whether the errors reported were temporary and have been
handled by raid1.

thanks
francesco


---------- Forwarded message ----------
From: Francesco Pietra
Date: Tue, Mar 3, 2009 at 11:21 AM
Subject: Re: raid1 issue, somewhat related to recent "debian on big machines"
To: Ron Johnson


On Tue, Mar 3, 2009 at 10:08 AM, Ron Johnson wrote:
> On 03/03/2009 02:53 AM, Francesco Pietra wrote:
>> lupus in fabula, as a follow-up of my short intervention on raid1
>> with my machine to the thread "Debian on big systems".
>>
>> System: Supermicro H8QC8 motherboard, two WD Raptor SATA 150GB,
>> Debian amd64 lenny, raid1
>>
>> While running an electronic molecular calculation - estimated to take
>> four days - I noticed by chance on the screen (this is not in the
>> output file of the calculation) that there was a disk problem. I took
>> some scattered notes from the screen:
>>
>> RAID1 conf printout
>> wd:1 rd:2
>
> [snip]
>
> What you are looking for should be in syslog, not your application's
> log.

OK, but /var/log/syslog tells nothing more than what I noted from the
screen: sda sector 0 problematic, disk failure, continuing on one disk.
My question is: what does the lshw -class disk output mean (SCSI vs
SATA, as I have shown), and does one disk have to be replaced with a new
one? If so, to identify which disk is which, can I detach the SATA
connection to the disks and see which one works?

thanks
francesco
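On the "which physical disk is which" question: rather than unplugging
SATA cables, the serial number usually settles it -- a sketch, assuming
the smartmontools and hdparm packages are available (note that lshw
already printed sdb's serial, WD-WMAP41173675, elsewhere in this
thread):

  # Print each device's serial and match it to the label on the drive:
  smartctl -i /dev/sda | grep -i serial
  smartctl -i /dev/sdb | grep -i serial
  # hdparm -I /dev/sda | grep -i Serial works as well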
Re: raid1 issue, somewhat related to recent "debian on big machines"
Francesco Pietra writes:

The important bit is:

> Disk failure on sda1, disabling device

So one of your disks failed. The kernel marked it as such and continues
on the other disk alone. This is invisible to the application (apart
from the short hiccup) and no data is corrupted or lost. That is why you
use raid1 after all.

> Personalities : [raid1]
> md6 : active raid1 sda8[2](F) sdb8[1]
>       102341952 blocks [2/1] [_U]
>
> md5 : active raid1 sda7[2](F) sdb7[1]
>       1951744 blocks [2/1] [_U]
>
> md4 : active raid1 sda6[2](F) sdb6[1]
>       2931712 blocks [2/1] [_U]
>
> md3 : active raid1 sda5[2](F) sdb5[1]
>       14651136 blocks [2/1] [_U]
>
> md1 : active raid1 sda2[2](F) sdb2[1]
>       6835584 blocks [2/1] [_U]
>
> md0 : active raid1 sda1[2](F) sdb1[1]
>       2931712 blocks [2/1] [_U]
>
> md2 : active raid1 sda3[2](F) sdb3[1]
>       14651200 blocks [2/1] [_U]

That is a lot of raids. Have you ever thought about using LVM? The
different raid1 arrays will mess up each other's assumptions about the
head positioning of the component devices. On reads the Linux kernel
tries to use the disk with the shorter seek, and assumes the head is
where it left it on the last access. But if one of the other raids used
that disk, the head will be way off.

I would suggest the following scheme:

sda1 / sdb1 : 100MB raid1 for /boot (or 1GB for / + /boot)
sda2 / sdb2 : rest raid1 with LVM

Regards,
Goswin
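To double-check what /proc/mdstat shows, mdadm can report per-array
state -- a sketch; md6 here is just one of the arrays from the listing
above:

  mdadm --detail /dev/md6     # reports the degraded state and marks the
                              # faulty member (sda8 in this case)
  mdadm --examine /dev/sdb8   # the same story from the member's superblock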
Re: raid1 issue, somewhat related to recent "debian on big machines"
On 03/03/2009 02:53 AM, Francesco Pietra wrote:
> lupus in fabula, as a follow-up of my short intervention on raid1 with
> my machine to the thread "Debian on big systems".
>
> System: Supermicro H8QC8 motherboard, two WD Raptor SATA 150GB, Debian
> amd64 lenny, raid1
>
> While running an electronic molecular calculation - estimated to take
> four days - I noticed by chance on the screen (this is not in the
> output file of the calculation) that there was a disk problem. I took
> some scattered notes from the screen:
>
> RAID1 conf printout
> wd:1 rd:2

[snip]

What you are looking for should be in syslog, not your application's
log.

--
Ron Johnson, Jr.
Jefferson LA USA
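For example -- a sketch; the exact wording of the kernel messages varies
by version, so these patterns are just a starting point:

  # Fish the md/raid and disk errors out of the logs:
  grep -iE 'raid1|md[0-9]|sda' /var/log/syslog
  dmesg | grep -iE 'ata|sda|raid1'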
raid1 issue, somewhat related to recent "debian on big machines"
lupus in fabula, as a follow-up of my short intervention on raid1 with
my machine to the thread "Debian on big systems".

System: Supermicro H8QC8 motherboard, two WD Raptor SATA 150GB, Debian
amd64 lenny, raid1

While running an electronic molecular calculation - estimated to take
four days - I noticed by chance on the screen (this is not in the output
file of the calculation) that there was a disk problem. I took some
scattered notes from the screen:

RAID1 conf printout
 wd:1 rd:2
 disk0 wd:1 o:0 dev:sda6
 disk0 wd:1 o:0 dev:sdb6
md: recovery of raid array md4
minimum guaranteed speed 1000 kB/sec/disk, using max available idle
I/O bandwidth but no more than 20...
..
Disk failure on sda1, disabling device
Operation continues on 1 devices.
raid sdb1: redirecting sector 262176 to another mirror
RAID1 conf printout
 wd:1 rd:2
..
 disk1 wd:0 o:1 dev:sdb7
===

Then the electronic molecular calculation resumed - with all CPUs at
work, as indicated by top - and in its output file there was no trace of
the above problems.

The command lshw -class disk reported:

*-cdrom
     description: DVD writer
     product: PIONEER DVD-RW DVR-111D
     vendor: Pioneer
     physical id: 0
     bus info: ide@0.0
     logical name: /dev/hda
     version: 1.02
     capabilities: packet atapi cdrom removable nonmagnetic dma lba iordy pm audio cd-r cd-rw dvd dvd-r
     configuration: mode=udma4 status=nodisc
*-disk:0
     description: SCSI Disk
     physical id: 0
     bus info: scsi@0:0.0.0
     logical name: /dev/sda
     size: 139GiB (150GB)
*-disk:1
     description: ATA Disk
     product: WDC WD1500ADFD-0
     vendor: Western Digital
     physical id: 1
     bus info: scsi@1:0.0.0
     logical name: /dev/sdb
     version: 20.0
     serial: WD-WMAP41173675
     size: 139GiB (150GB)
     capabilities: partitioned partitioned:dos
     configuration: ansiversion=5 signature=000b05ba

The description of disk 0 was cryptic to me.

As there have been RAM problems, I also ran lshw -class memory; all
DIMMs are correctly reported. No memory problem.

===

Then I ran cat /proc/mdstat; the output was:

Personalities : [raid1]
md6 : active raid1 sda8[2](F) sdb8[1]
      102341952 blocks [2/1] [_U]

md5 : active raid1 sda7[2](F) sdb7[1]
      1951744 blocks [2/1] [_U]

md4 : active raid1 sda6[2](F) sdb6[1]
      2931712 blocks [2/1] [_U]

md3 : active raid1 sda5[2](F) sdb5[1]
      14651136 blocks [2/1] [_U]

md1 : active raid1 sda2[2](F) sdb2[1]
      6835584 blocks [2/1] [_U]

md0 : active raid1 sda1[2](F) sdb1[1]
      2931712 blocks [2/1] [_U]

md2 : active raid1 sda3[2](F) sdb3[1]
      14651200 blocks [2/1] [_U]

unused devices: <none>

===

I would appreciate advice.

thanks
francesco pietra
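One way to get harder evidence on whether sda itself is dying is a SMART
self-test -- a sketch, assuming the smartmontools package is installed
(the smartctl output quoted elsewhere in this thread suggests it was):

  smartctl -a /dev/sda           # health summary, attributes, error log
  smartctl -t long /dev/sda      # kick off a long offline self-test
  # ...wait roughly the time smartctl announces, then:
  smartctl -l selftest /dev/sda  # read back the self-test results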