Small mdadm 1.5 bug

2006-08-21 Thread jwillem



We have had a RAID system running for several years,
created by the Fedora Core 2 install script.

It is a fairly big RAID system of more than 11 TB.
It was installed on 32-bit Fedora Core 2, Linux 2.6.10, with mdadm 1.5.0.

After a failure we had one disk that was out of sync.
We couldn't assemble the RAID set, even with --force: "Cannot write superblock".

After looking at the source code in util.c from mdadm-1.5.0, we
found that the variable 'size' had gone negative.
Changing it from 'long size' to 'unsigned long size' fixed the problem.
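
For illustration, here is a minimal sketch of that class of bug. The
variable name 'size' follows the report above, but the surrounding code
is a reconstruction, not mdadm's actual util.c:

    /* On a 32-bit build, long is 32 bits wide.  A device size in
     * 512-byte sectors above 2^31 - 1 (i.e. a device over 1 TiB)
     * wraps negative when stored in a signed long. */
    #include <stdio.h>

    int main(void)
    {
        unsigned long long sectors = 3ULL * 1024 * 1024 * 1024; /* ~1.5 TiB worth of sectors */

        long size = (long)sectors;                         /* buggy: negative on 32-bit */
        unsigned long fixed_size = (unsigned long)sectors; /* the reported fix */

        printf("signed:   %ld\n", size);
        printf("unsigned: %lu\n", fixed_size);
        return 0;
    }

On a 32-bit build the first printf prints a negative number; any
superblock-offset arithmetic done with that value then goes wrong.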

It seems that mdadm 1.9 had already fixed the problem, perhaps even earlier.
But note also this entry from the changelog:
 Changes Prior to 1.5.0 release
 -   Fix compiling error with BLKGETSIZE64 and some signed/unsigned
     comparison warnings.

This was the version we were working with, so something went wrong there.

Greetings, Jan-Willem Michels. (Boy Kentrop, working at our company,
fixed our problem.)


Re: [stable] [PATCH] Fix a potential NULL dereference in md/raid1

2006-08-21 Thread Greg KH
On Mon, Aug 21, 2006 at 10:05:26AM +1000, NeilBrown wrote:
 Patch for the 2.6.17 stable series.
 Thanks,
 NeilBrown
 ### Comments for Changeset
 
 At the point where this 'atomic_add' is, rdev could be NULL,
 as seen by the fact that we test for this in the very next
 statement.
 Further, it is really the wrong place for the add.
 We should add to the count of corrected errors
 once we are sure it was corrected, not before
 trying to correct it.
 
 Signed-off-by: Neil Brown [EMAIL PROTECTED]

Queued for -stable, thanks.

greg k-h
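
For context, here is a minimal sketch of the ordering problem the patch
describes. The function and helper names are illustrative, not the
verbatim md/raid1 code:

    #include <stddef.h>

    struct rdev { long corrected_errors; };

    /* Hypothetical stand-in for whatever actually rewrites the bad
     * sectors; returns nonzero on success. */
    static int try_correct(struct rdev *rdev, int sectors)
    {
        (void)rdev; (void)sectors;
        return 1;
    }

    void fix_read_error(struct rdev *rdev, int sectors)
    {
        /* Buggy order (as described above): the counter was bumped
         * while rdev could still be NULL, and before the correction
         * was known to succeed:
         *
         *     atomic_add(sectors, &rdev->corrected_errors);
         *     if (!rdev) return;
         */
        if (rdev == NULL)
            return;                            /* test first */
        if (try_correct(rdev, sectors))
            rdev->corrected_errors += sectors; /* count only on success */
    }

    int main(void)
    {
        struct rdev r = { 0 };
        fix_read_error(NULL, 4);  /* safe: NULL is rejected up front */
        fix_read_error(&r, 4);    /* counted after the correction */
        return r.corrected_errors == 4 ? 0 : 1;
    }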


Re: invalid superblock - *again*

2006-08-21 Thread Dexter Filmore
On Monday, 21 August 2006 13:04, Dexter Filmore wrote:
I seriously don't know what's going on here.
I upgraded packages and rebooted the machine to find that now disk 4 of 4 is 
not assembled.

Here's dmesg and mdadm -E 

* dmesg **
[   38.439644] md: md0 stopped.
[   38.536089] md: bind<sdb1>
[   38.536301] md: bind<sdc1>
[   38.536501] md: bind<sdd1>
[   38.536702] md: bind<sda1>
[   38.536733] md: kicking non-fresh sdd1 from array!
[   38.536751] md: unbind<sdd1>
[   38.536765] md: export_rdev(sdd1)
[   38.536794] raid5: device sda1 operational as raid disk 0
[   38.536812] raid5: device sdc1 operational as raid disk 2
[   38.536831] raid5: device sdb1 operational as raid disk 1
[   38.537453] raid5: allocated 4195kB for md0
[   38.537471] raid5: raid level 5 set md0 active with 3 out of 4 devices, algorithm 2
[   38.537499] RAID5 conf printout:
[   38.537513]  --- rd:4 wd:3 fd:1
[   38.537528]  disk 0, o:1, dev:sda1
[   38.537543]  disk 1, o:1, dev:sdb1
[   38.537558]  disk 2, o:1, dev:sdc1
*

Most notable: [   38.536733] md: kicking non-fresh sdd1 from array!
What does this mean?

* mdadm -E /dev/sdd1 
/dev/sdd1:
          Magic : a92b4efc
        Version : 00.90.02
           UUID : 7f103422:7be2c2ce:e67a70be:112a2914
  Creation Time : Tue May  9 01:11:41 2006
     Raid Level : raid5
    Device Size : 244187904 (232.88 GiB 250.05 GB)
     Array Size : 732563712 (698.63 GiB 750.15 GB)
   Raid Devices : 4
  Total Devices : 4
Preferred Minor : 0

    Update Time : Tue Aug 22 01:42:36 2006
          State : clean
 Active Devices : 4
Working Devices : 4
 Failed Devices : 0
  Spare Devices : 0
       Checksum : 33b2d59b - correct
         Events : 0.765488

         Layout : left-symmetric
     Chunk Size : 32K

      Number   Major   Minor   RaidDevice State
this     3       8       49        3      active sync   /dev/sdd1

   0     0       8        1        0      active sync   /dev/sda1
   1     1       8       17        1      active sync   /dev/sdb1
   2     2       8       33        2      active sync   /dev/sdc1
   3     3       8       49        3      active sync   /dev/sdd1

*

What's happening here? What can I do? Do I have to re-add sdd and resync?
Or is there an easier way out? What causes these issues?




Re: invalid superblock - *again*

2006-08-21 Thread Neil Brown
On Tuesday August 22, [EMAIL PROTECTED] wrote:
 On Monday, 21 August 2006 13:04, Dexter Filmore wrote:
 I seriously don't know what's going on here.
 I upgraded packages and rebooted the machine to find that now disk 4 of 4 is 
 not assembled.
 
 Here's dmesg and mdadm -E 
 
 * dmesg **
 [   38.439644] md: md0 stopped.
 [   38.536089] md: bind<sdb1>
 [   38.536301] md: bind<sdc1>
 [   38.536501] md: bind<sdd1>
 [   38.536702] md: bind<sda1>
 [   38.536733] md: kicking non-fresh sdd1 from array!
 [   38.536751] md: unbind<sdd1>
 [   38.536765] md: export_rdev(sdd1)
 [   38.536794] raid5: device sda1 operational as raid disk 0
 [   38.536812] raid5: device sdc1 operational as raid disk 2
 [   38.536831] raid5: device sdb1 operational as raid disk 1
 [   38.537453] raid5: allocated 4195kB for md0
 [   38.537471] raid5: raid level 5 set md0 active with 3 out of 4 devices, algorithm 2
 [   38.537499] RAID5 conf printout:
 [   38.537513]  --- rd:4 wd:3 fd:1
 [   38.537528]  disk 0, o:1, dev:sda1
 [   38.537543]  disk 1, o:1, dev:sdb1
 [   38.537558]  disk 2, o:1, dev:sdc1
 *
 
 Most notable: [   38.536733] md: kicking non-fresh sdd1 from array!
 What does this mean?

It means that the 'event' count on sdd1 is old compared to that on
the other partitions.  The most likely explanation is that when the
array was last running, sdd1 was not part of it.
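
A quick way to see this is to compare the Events line across the member
devices; an illustrative check, using the device names from this thread:

    mdadm -E /dev/sda1 | grep Events
    mdadm -E /dev/sdd1 | grep Events

The member whose event count lags the others is the one md kicks as
'non-fresh' at assembly time.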

 
 What's happening here? What can I do? Do I have to re-add sdd and resync?
 Or is there an easier way out? What causes these issues?
 

Yes, you need to add sdd1 back to the array and it will resync.
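
For example (a sketch using the device names above; it is worth checking
the array state with 'mdadm --detail /dev/md0' first):

    mdadm /dev/md0 --add /dev/sdd1    # re-add the kicked member; a resync starts
    cat /proc/mdstat                  # watch the resync progress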

I would need some precise recent history of the array to know why this
happened.  That might not be easy to come by.

NeilBrown