Re: IBM xSeries stop responding during RAID1 reconstruction

2006-06-17 Thread Niccolo Rigacci
On Thursday 15 June 2006 12:13, you wrote:
 If this is causing a lockup, then there is something else wrong, just
 as any single process should not - by writing constantly to disks - be
 able to clog up the whole system.

 Maybe if you could get the result of
   alt-sysrq-P

I tried some kernel changes, enabling HyperThreading on the (single) P4
processor and setting CONFIG_PREEMPT_VOLUNTARY=y, but with no success.

During the lockup Alt-SysRq-P constantly reports:

  EIP is at mwait_idle+0x1a/0x2e

While Alt-SysRq-T shows - among other processes - the MD resync and the
locked-up bash; these are the hand-copied call traces:

md3_resync
  device_barrier
  default_wake_function
  sync_request
  __generic_unplug_device
  md_do_sync
  schedule
  md_thread
  md_thread
  kthread
  kthread
  kernel_thread_helper

bash
  io_schedule
  sync_buffer
  sync_buffer
  __wait_on_bit_lock
  sync_buffer
  out_of_line_wait_on_bit_lock
  wake_bit_function
  __lock_buffer
  do_get_write_access
  __ext3_get_inode_loc
  journal_get_write_access
  ext3_reserve_inode_write
  ext3_mark_inode_dirty
  ext3_dirty_inode
  __mark_inode_dirty
  update_atime
  vfs_readdir
  sys_getdents64
  filldir64
  syscall_call


This is also the output of top, which keeps running regularly during the lockup:

top - 11:40:41 up 7 min,  2 users,  load average: 8.70, 4.92, 2.04
Tasks:  70 total,   1 running,  69 sleeping,   0 stopped,   0 zombie
Cpu(s):  0.2% us,  0.7% sy,  0.0% ni, 98.7% id,  0.0% wa,  0.0% hi,  0.5% si
Mem:    906212k total,    58620k used,   847592k free,     3420k buffers
Swap:  1951736k total,        0k used,  1951736k free,    23848k cached

  PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND
  829 root      10  -5     0    0    0 S    1  0.0   0:01.70 md3_raid1
 2823 root      10  -5     0    0    0 D    1  0.0   0:01.62 md3_resync
    1 root      16   0  1956  656  560 S    0  0.1   0:00.52 init
    2 root      RT   0     0    0    0 S    0  0.0   0:00.00 migration/0
    3 root      34  19     0    0    0 S    0  0.0   0:00.00 ksoftirqd/0
    4 root      RT   0     0    0    0 S    0  0.0   0:00.00 watchdog/0
    5 root      RT   0     0    0    0 S    0  0.0   0:00.00 migration/1
    6 root      34  19     0    0    0 S    0  0.0   0:00.00 ksoftirqd/1
    7 root      RT   0     0    0    0 S    0  0.0   0:00.00 watchdog/1
    8 root      10  -5     0    0    0 S    0  0.0   0:00.01 events/0
    9 root      10  -5     0    0    0 S    0  0.0   0:00.01 events/1
   10 root      10  -5     0    0    0 S    0  0.0   0:00.00 khelper
   11 root      10  -5     0    0    0 S    0  0.0   0:00.00 kthread
   14 root      10  -5     0    0    0 S    0  0.0   0:00.00 kblockd/0
   15 root      10  -5     0    0    0 S    0  0.0   0:00.00 kblockd/1
   16 root      11  -5     0    0    0 S    0  0.0   0:00.00 kacpid
  152 root      20   0     0    0    0 S    0  0.0   0:00.00 pdflush
  153 root      15   0     0    0    0 D    0  0.0   0:00.00 pdflush
  154 root      17   0     0    0    0 S    0  0.0   0:00.00 kswapd0
  155 root      11  -5     0    0    0 S    0  0.0   0:00.00 aio/0
  156 root      11  -5     0    0    0 S    0  0.0   0:00.00 aio/1
  755 root      10  -5     0    0    0 S    0  0.0   0:00.00 kseriod
  796 root      10  -5     0    0    0 S    0  0.0   0:00.00 ata/0
  797 root      11  -5     0    0    0 S    0  0.0   0:00.00 ata/1
  799 root      11  -5     0    0    0 S    0  0.0   0:00.00 scsi_eh_0
  800 root      11  -5     0    0    0 S    0  0.0   0:00.00 scsi_eh_1
  825 root      15   0     0    0    0 S    0  0.0   0:00.00 kirqd
  831 root      10  -5     0    0    0 D    0  0.0   0:00.00 md2_raid1
  833 root      10  -5     0    0    0 S    0  0.0   0:00.00 md1_raid1
  834 root      10  -5     0    0    0 D    0  0.0   0:00.00 md0_raid1
  835 root      15   0     0    0    0 D    0  0.0   0:00.00 kjournald
  932 root      18  -4  2192  584  368 S    0  0.1   0:00.19 udevd
 1698 root      10  -5     0    0    0 S    0  0.0   0:00.00 khubd
 2031 root      22   0     0    0    0 S    0  0.0   0:00.00 kjournald
 2032 root      15   0     0    0    0 D    0  0.0   0:00.00 kjournald
 2142 daemon    16   0  1708  364  272 S    0  0.0   0:00.00 portmap
 2464 root      16   0  2588  932  796 S    0  0.1   0:00.01 syslogd

-- 
Niccolo Rigacci
Firenze - Italy

War against Iraq? Not in my name!


Re: Raid5 software problems after losing 4 disks for 48 hours

2006-06-17 Thread Wilson Wilson

Neil, great stuff, it's online now!!!

I followed your 2nd suggestion and ran mdadm --create /dev/md0 -f -l5
-n15 -c32 /dev/sd[bcdefghijklmnop]1; after 8 hours we reached 99.9%
and some errors appeared on sdh1, which was then kicked from the array,
but the array was fully online.

I'll see if any further errors are reported on sdh, but in the meantime
I hot-added it back into the array, which was successful.

To my surprise a full fsck reported a clean volume.

I am still unsure how this raid5 volume was partially readable with 4
disks missing.  My understanding is that each file is written across all
disks apart from one, which is used for CRC.  So if 2 disks are offline
the whole thing should be unreadable.

Once again thanks for your help


On 6/16/06, Neil Brown [EMAIL PROTECTED] wrote:

On Friday June 16, [EMAIL PROTECTED] wrote:

 And is there a way, if more than 1 disk goes offline, for the whole
 array to be taken offline? My understanding of raid5 is: lose 1+ disks
 and nothing on the raid would be readable. This is not the case here.


Nothing will be writable, but some blocks might be readable.


 All the disks are online now, what do I need to do to rebuild the array?

Have you tried
 mdadm --assemble --force /dev/md0 /dev/sd[bcdefghijklmnop]1
??
Actually, it occurs to me that that might not do the best thing if 4
drives disappeared at exactly the same time (though it is unlikely
that you would notice).
You should probably use
 mdadm --create /dev/md0 -f -l5 -n15 -c32  /dev/sd[bcdefghijklmnop]1
This is assuming that  e,f,g,h were in that order in the array before
they died.
The '-f' is quite important - it tells mdadm not to recover a spare, but
to resync the parity blocks.
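
One way to double-check that order before re-creating (a sketch, assuming
the old 0.90 superblocks are still readable) is to ask each device which
slot it thinks it occupied:

  for d in /dev/sd[bcdefghijklmnop]1; do
      echo == $d
      mdadm --examine $d | grep -E 'UUID|this'    # the 'this' line shows the old slot
  done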

NeilBrown




Re: Raid5 reshape

2006-06-17 Thread Nigel J. Terry

Neil Brown wrote:

On Friday June 16, [EMAIL PROTECTED] wrote:
  
Thanks for all the advice. One final question, what kernel and mdadm 
versions do I need?



For resizing raid5:

mdadm-2.4 or later
linux-2.6.17-rc2 or later
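
A minimal sketch of the reshape itself, assuming the new disk is /dev/hdc1
and the array is growing from 3 to 4 devices:

  mdadm /dev/md0 --add /dev/hdc1
  mdadm --grow /dev/md0 --raid-devices=4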

NeilBrown

  

Ok, I tried and screwed up!

I upgraded my kernel and mdadm.
I set the grow going and all looked well, so as it said it was going to 
take 430 minutes, I went to Starbucks. When I came home there had been a 
power cut, but my UPS had shut the system down. When power returned I 
rebooted. Now I think I had failed to set the new partition on /dev/hdc1 
to Raid Autodetect, so it didn't find it at reboot. I tried to hot add 
it, but now I seem to have a deadlock situation. Although --detail shows 
that it is degraded and recovering, /proc/mdstat shows it is reshaping. 
In truth there is no disk activity and the count in /proc/mdstat is not 
changing. I guess the only good news is that I can still mount the 
device and my data is fine. Please see below...
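
(For reference, the partition-type change I think I missed, as a sketch
assuming /dev/hdc1 is the new partition:

  fdisk /dev/hdc    # then: t, select partition 1, type fd = Linux raid autodetect, w
)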


Any ideas what I should do next? Thanks

Nigel

[EMAIL PROTECTED] ~]# uname -a
Linux homepc.nigelterry.net 2.6.17-rc6 #1 SMP Sat Jun 17 11:05:52 EDT 
2006 x86_64 x86_64 x86_64 GNU/Linux

[EMAIL PROTECTED] ~]# mdadm --version
mdadm - v2.5.1 -  16 June 2006
[EMAIL PROTECTED] ~]# mdadm --detail /dev/md0
/dev/md0:
        Version : 00.91.03
  Creation Time : Tue Apr 18 17:44:34 2006
     Raid Level : raid5
     Array Size : 490223104 (467.51 GiB 501.99 GB)
    Device Size : 245111552 (233.76 GiB 250.99 GB)
   Raid Devices : 4
  Total Devices : 4
Preferred Minor : 0
    Persistence : Superblock is persistent

    Update Time : Sat Jun 17 15:15:05 2006
          State : clean, degraded, recovering
 Active Devices : 3
Working Devices : 4
 Failed Devices : 0
  Spare Devices : 1

         Layout : left-symmetric
     Chunk Size : 128K

 Reshape Status : 6% complete
  Delta Devices : 1, (3->4)

           UUID : 50e3173e:b5d2bdb6:7db3576b:644409bb
         Events : 0.3211829

    Number   Major   Minor   RaidDevice State
       0       8        1        0      active sync   /dev/sda1
       1       8       17        1      active sync   /dev/sdb1
       2       3       65        2      active sync   /dev/hdb1
       3       0        0        3      removed

       4      22        1        -      spare   /dev/hdc1
[EMAIL PROTECTED] ~]# cat /proc/mdstat
Personalities : [raid5] [raid4]
md0 : active raid5 sdb1[1] sda1[0] hdc1[4](S) hdb1[2]
 490223104 blocks super 0.91 level 5, 128k chunk, algorithm 2 [4/3] 
[UUU_]
 [=...]  reshape =  6.9% (17073280/245111552) 
finish=86.3min speed=44003K/sec


unused devices: <none>
[EMAIL PROTECTED] ~]#



Re: Raid5 software problems after losing 4 disks for 48 hours

2006-06-17 Thread David Greaves
Wilson Wilson wrote:
 Neil, great stuff, it's online now!!!
Congratulations :)

 I am still unsure how this raid5 volume was partially readable with 4
 disks missing.  My understanding is that each file is written across all
 disks apart from one, which is used for CRC.  So if 2 disks are offline
 the whole thing should be unreadable.
I'll try :)

md doesn't operate at the file level; it operates on chunks. A chunk
could be, say, 64KB in size.

For raid5 each stripe holds n-1 data chunks plus one parity chunk (raid6 would be n-2).
When a stripe is read, if your file is in one of the chunks that's still
there then you're in luck.

I guess md knows it's degraded and gives as much data back as possible.

This means that you have a certain probability of accessing a given file
depending on its size, the filesystem and the degree to which the array
is degraded.
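
(Illustrative numbers only: with 15 disks, a 32k chunk and 4 disks missing,
any single chunk has roughly an 11/15 chance of sitting on a surviving disk,
so a small file that fits in one chunk is often readable, while a big file
spanning many chunks will almost certainly hit a hole.)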

FWIW I'd *never* try a r/w operation on such a degraded array.

Speculation:
I'm surprised you could mount such a 'sparse' array though. I wonder if
some filesystems (like xfs) would just barf as they mounted because they
have more distributed mount-time data structures and would spot the
missing chunks.  Others (ext3?) may just mount and try to read blocks on
demand.

David


Re: Raid5 reshape

2006-06-17 Thread Neil Brown
On Saturday June 17, [EMAIL PROTECTED] wrote:
 
 Any ideas what I should do next? Thanks
 

Looks like you've probably hit a bug.  I'll need a bit more info
though.

First:

 [EMAIL PROTECTED] ~]# cat /proc/mdstat
 Personalities : [raid5] [raid4]
 md0 : active raid5 sdb1[1] sda1[0] hdc1[4](S) hdb1[2]
   490223104 blocks super 0.91 level 5, 128k chunk, algorithm 2 [4/3] 
 [UUU_]
   [=...]  reshape =  6.9% (17073280/245111552) 
 finish=86.3min speed=44003K/sec
 
 unused devices: <none>

This really makes it look like the reshape is progressing.  How
long after the reboot was this taken?  How long after hdc1 was
hot-added (roughly)?  What does it show now?

What happens if you remove hdc1 again?  Does the reshape keep going?

What I would expect to happen in this case is that the array reshapes
into a degraded array, then the missing disk is recovered onto hdc1.

NeilBrown


Re: Raid5 reshape

2006-06-17 Thread Nigel J. Terry

Nigel J. Terry wrote:


Neil Brown wrote:

On Saturday June 17, [EMAIL PROTECTED] wrote:
 

Any ideas what I should do next? Thanks




Looks like you've probably hit a bug.  I'll need a bit more info
though.

First:

 

[EMAIL PROTECTED] ~]# cat /proc/mdstat
Personalities : [raid5] [raid4]
md0 : active raid5 sdb1[1] sda1[0] hdc1[4](S) hdb1[2]
  490223104 blocks super 0.91 level 5, 128k chunk, algorithm 2 
[4/3] [UUU_]
  [=...]  reshape =  6.9% (17073280/245111552) 
finish=86.3min speed=44003K/sec


unused devices: <none>



This really makes it look like the reshape is progressing.  How
long after the reboot was this taken?  How long after hdc1 was
hot-added (roughly)?  What does it show now?

What happens if you remove hdc1 again?  Does the reshape keep going?

What I would expect to happen in this case is that the array reshapes
into a degraded array, then the missing disk is recovered onto hdc1.

NeilBrown

  
I don't know how long the system was reshaping before the power went 
off, and then I had to restart when the power came back. It claimed it 
was going to take 430 minutes, so 6% would be about 25 minutes, which 
could make good sense; certainly it looked like it was working fine 
when I went out.


Now nothing is happening, it shows:

[EMAIL PROTECTED] ~]# cat /proc/mdstat
Personalities : [raid5] [raid4]
md0 : active raid5 sdb1[1] sda1[0] hdc1[4](S) hdb1[2]
 490223104 blocks super 0.91 level 5, 128k chunk, algorithm 2 
[4/3] [UUU_]
 [=...]  reshape =  6.9% (17073280/245111552) 
finish=2281.2min speed=1665K/sec


unused devices: <none>
[EMAIL PROTECTED] ~]#

so the only thing changing is the time till finish.
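
One more data point, in case it helps (a sketch; assuming this kernel
exposes the md sysfs attributes):

  cat /sys/block/md0/md/sync_action    # what md thinks it is doing (reshape?)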

I'll try removing and adding /dev/hdc1 again. Will it make any 
difference if the device is mounted or not?


Nigel

Tried remove and add, made no difference:
[EMAIL PROTECTED] ~]# mdadm /dev/md0 --remove /dev/hdc1
mdadm: hot removed /dev/hdc1
[EMAIL PROTECTED] ~]# cat /proc/mdstat
Personalities : [raid5] [raid4]
md0 : active raid5 sdb1[1] sda1[0] hdb1[2]
 490223104 blocks super 0.91 level 5, 128k chunk, algorithm 2 [4/3] 
[UUU_]
 [=...]  reshape =  6.9% (17073280/245111552) 
finish=2321.5min speed=1636K/sec


unused devices: <none>
[EMAIL PROTECTED] ~]# mdadm /dev/md0 --add /dev/hdc1
mdadm: re-added /dev/hdc1
[EMAIL PROTECTED] ~]# cat /proc/mdstat
Personalities : [raid5] [raid4]
md0 : active raid5 hdc1[4](S) sdb1[1] sda1[0] hdb1[2]
 490223104 blocks super 0.91 level 5, 128k chunk, algorithm 2 [4/3] 
[UUU_]
 [=...]  reshape =  6.9% (17073280/245111552) 
finish=2329.3min speed=1630K/sec


unused devices: <none>
[EMAIL PROTECTED] ~]#



Re: Raid5 reshape

2006-06-17 Thread Neil Brown

OK, thanks for the extra details.  I'll have a look and see what I can
find, but it'll probably be a couple of days before I have anything
useful for you.

NeilBrown


Re: Raid5 reshape

2006-06-17 Thread Nigel J. Terry

Neil Brown wrote:

OK, thanks for the extra details.  I'll have a look and see what I can
find, but it'll probably be a couple of days before I have anything
useful for you.

NeilBrown

  

OK, I'll try and be patient :-) At least everything else is working.

Let me know if you need to ssh to my machine.

Nigel