RE: Software raid0 will crash the file-system, when each disk is 5TB

2007-05-17 Thread Jeff Zheng
Fix confirmed: filled the whole 11T array without crashing.
I presume this will go into 2.6.22.

Thanks again.

Jeff

> -Original Message-
> From: [EMAIL PROTECTED] 
> [mailto:[EMAIL PROTECTED] On Behalf Of Jeff Zheng
> Sent: Thursday, 17 May 2007 5:39 p.m.
> To: Neil Brown; [EMAIL PROTECTED]; Michal Piotrowski; Ingo 
> Molnar; [EMAIL PROTECTED]; 
> linux-kernel@vger.kernel.org; [EMAIL PROTECTED]
> Subject: RE: Software raid0 will crash the file-system, when 
> each disk is 5TB
> 
> 
> Yeah, seems you've locked it down :D. I've written 600GB of
> data now, and everything is still fine.
> Will let it run overnight and fill the whole 11T. I'll post
> the result tomorrow.
> 
> Thanks a lot though.
> 
> Jeff 
> 
> > -Original Message-
> > From: Neil Brown [mailto:[EMAIL PROTECTED]
> > Sent: Thursday, 17 May 2007 5:31 p.m.
> > To: [EMAIL PROTECTED]; Jeff Zheng; Michal Piotrowski; Ingo Molnar; 
> > [EMAIL PROTECTED]; linux-kernel@vger.kernel.org; 
> > [EMAIL PROTECTED]
> > Subject: RE: Software raid0 will crash the file-system, 
> when each disk 
> > is 5TB
> > 
> > On Thursday May 17, [EMAIL PROTECTED] wrote:
> > > 
> > > Uhm, I just noticed something.
> > > 'chunk' is unsigned long, and when it gets shifted up, we might lose
> > > bits.  That could still happen with the 4*2.75T arrangement, but is
> > > much more likely in the 2*5.5T arrangement.
> > 
> > Actually, it cannot be a problem with the 4*2.75T arrangement.
> >   chunk << chunksize_bits
> > 
> > will not exceed the size of the underlying device *in kilobytes*.
> > In that case that is 0xAE9EC800, which will fit in a 32-bit long.
> > We don't double it to make sectors until after we add
> > zone->dev_offset, which is "sector_t", and so 64-bit arithmetic is used.
> > 
> > So I'm quite certain this bug will cause exactly the problems 
> > experienced!!
> > 
> > > 
> > > Jeff, can you try this patch?
> > 
> > Don't bother about the other tests I mentioned, just try this one.
> > Thanks.
> > 
> > NeilBrown
> > 
> > > Signed-off-by: Neil Brown <[EMAIL PROTECTED]>
> > > 
> > > ### Diffstat output
> > >  ./drivers/md/raid0.c |2 +-
> > >  1 file changed, 1 insertion(+), 1 deletion(-)
> > > 
> > > diff .prev/drivers/md/raid0.c ./drivers/md/raid0.c
> > > --- .prev/drivers/md/raid0.c  2007-05-17 
> > 10:33:30.0 +1000
> > > +++ ./drivers/md/raid0.c  2007-05-17 15:02:15.0 +1000
> > > @@ -475,7 +475,7 @@ static int raid0_make_request (request_q
> > >   x = block >> chunksize_bits;
> > >   tmp_dev = zone->dev[sector_div(x, zone->nb_dev)];
> > >   }
> > > -	rsect = (((chunk << chunksize_bits) + zone->dev_offset)<<1)
> > > +	rsect = ((((sector_t)chunk << chunksize_bits) + zone->dev_offset)<<1)
> > > 		+ sect_in_chunk;
> > >   
> > >   bio->bi_bdev = tmp_dev->bdev;
> > 


RE: Software raid0 will crash the file-system, when each disk is 5TB

2007-05-16 Thread Jeff Zheng

Yeah, seems you've locked it down :D. I've written 600GB of data now,
and everything is still fine.
Will let it run overnight and fill the whole 11T. I'll post the result
tomorrow.

Thanks a lot though.

Jeff 

> -Original Message-
> From: Neil Brown [mailto:[EMAIL PROTECTED] 
> Sent: Thursday, 17 May 2007 5:31 p.m.
> To: [EMAIL PROTECTED]; Jeff Zheng; Michal Piotrowski; Ingo 
> Molnar; [EMAIL PROTECTED]; 
> linux-kernel@vger.kernel.org; [EMAIL PROTECTED]
> Subject: RE: Software raid0 will crash the file-system, when 
> each disk is 5TB
> 
> On Thursday May 17, [EMAIL PROTECTED] wrote:
> > 
> > Uhm, I just noticed something.
> > 'chunk' is unsigned long, and when it gets shifted up, we might lose
> > bits.  That could still happen with the 4*2.75T arrangement, but is
> > much more likely in the 2*5.5T arrangement.
> 
> Actually, it cannot be a problem with the 4*2.75T arrangement.
>   chunk << chunksize_bits
> 
> will not exceed the size of the underlying device *in kilobytes*.
> In that case that is 0xAE9EC800, which will fit in a 32-bit long.
> We don't double it to make sectors until after we add
> zone->dev_offset, which is "sector_t", and so 64-bit arithmetic is used.
> 
> So I'm quite certain this bug will cause exactly the problems 
> experienced!!
> 
> > 
> > Jeff, can you try this patch?
> 
> Don't bother about the other tests I mentioned, just try this one.
> Thanks.
> 
> NeilBrown
> 
> > Signed-off-by: Neil Brown <[EMAIL PROTECTED]>
> > 
> > ### Diffstat output
> >  ./drivers/md/raid0.c |2 +-
> >  1 file changed, 1 insertion(+), 1 deletion(-)
> > 
> > diff .prev/drivers/md/raid0.c ./drivers/md/raid0.c
> > --- .prev/drivers/md/raid0.c2007-05-17 
> 10:33:30.0 +1000
> > +++ ./drivers/md/raid0.c2007-05-17 15:02:15.0 +1000
> > @@ -475,7 +475,7 @@ static int raid0_make_request (request_q
> > x = block >> chunksize_bits;
> > tmp_dev = zone->dev[sector_div(x, zone->nb_dev)];
> > }
> > -	rsect = (((chunk << chunksize_bits) + zone->dev_offset)<<1)
> > +	rsect = ((((sector_t)chunk << chunksize_bits) + zone->dev_offset)<<1)
> > 		+ sect_in_chunk;
> >   
> > bio->bi_bdev = tmp_dev->bdev;
> 
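[Editor's note: below is a minimal, self-contained userspace sketch of the
overflow Neil describes above, not the kernel code itself. The names chunk,
chunksize_bits and dev_offset mirror raid0_make_request(), but the 64KB chunk
size, the chunk index and the zero dev_offset are purely illustrative, and
uint32_t stands in for the 32-bit "unsigned long" of an i386 kernel.]

    /* raid0-overflow-sketch.c: build with "gcc -o sketch raid0-overflow-sketch.c" */
    #include <stdio.h>
    #include <stdint.h>
    #include <inttypes.h>

    typedef uint64_t sector_t;            /* what the kernel uses with CONFIG_LBD */

    int main(void)
    {
            uint32_t chunk = 80000000;    /* chunk index ~4.8TiB into a ~5.5TB component */
            int chunksize_bits = 6;       /* 64KB chunks; chunk << chunksize_bits is a KB offset */
            sector_t dev_offset = 0;      /* single-zone array, so zone->dev_offset is 0 */

            /* Buggy form: the shift happens in 32 bits and wraps past 4TiB. */
            sector_t bad  = (((chunk << chunksize_bits) + dev_offset) << 1);

            /* Fixed form (Neil's patch): widen to sector_t before shifting. */
            sector_t good = ((((sector_t)chunk << chunksize_bits) + dev_offset) << 1);

            printf("buggy rsect: %" PRIu64 "\n", bad);    /* 1650065408  -- wrapped, wrong */
            printf("fixed rsect: %" PRIu64 "\n", good);   /* 10240000000 -- correct sector */
            return 0;
    }

With 4*2.75T components the per-device kilobyte offset never reaches 2^32, which
is why only the 2*5.5T layout corrupts.]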


RE: Software raid0 will crash the file-system, when each disk is 5TB

2007-05-16 Thread Jeff Zheng

> What is the nature of the corruption?  Is it data in a file 
> that is wrong when you read it back, or does the filesystem 
> metadata get corrupted?
The corruption is in the fs metadata: jfs is completely destroyed; after
umount, fsck does not recognize it as jfs anymore. xfs gives a kernel
crash, but seems still recoverable.
> 
> Can you try the configuration that works, and sha1sum the 
> files after you have written them to make sure that they 
> really are correct?
We have verified the data on the working configuration: we wrote around
900 identical 10GB files and verified that the md5sums are actually the
same. The verification took two days though :)

> My thought here is "maybe there is a bad block on one device, 
> and the block is used for data in the 'working' config, and 
> for metadata in the 'broken' config.
> 
> Can you try a degraded raid10 configuration. e.g.
> 
>mdadm -C /dev/md1 --level=10 --raid-disks=4 /dev/first missing \
>/dev/second missing
> 
> That will lay out the data in exactly the same place as with 
> raid0, but will use totally different code paths to access 
> it.  If you still get a problem, then it isn't in the raid0 code.

I will try this later today, as I'm now trying different sizes for the
components: 3.4T seems to be working; testing 4.1T right now.

> Maybe try version 1 metadata (mdadm --metadata=1).  I doubt 
> that would make a difference, but as I am grasping at straws 
> already, it may be a straw worth trying.

Well, the problem may also be in the 3ware disk array, or in the disk
array driver. The guy complaining about the same problem is also using a
3ware disk array controller. But there is no way to verify that, and a
single disk array has been working fine for us.

Jeff


RE: Software raid0 will crash the file-system, when each disk is 5TB

2007-05-16 Thread Jeff Zheng
I tried the patch; the same problem shows up, but there is no BUG_ON report.

Is there anything else I can do?


Jeff


> Yes, I meant 2T, and yes, the components are always over 2T.  
> So I'm at a complete loss.  The raid0 code follows the same 
> paths and does the same things and uses 64bit arithmetic where needed.
> 
> So I have no idea how there could be a difference between 
> these two cases.  
> 
> I'm at a loss...
> 
> NeilBrown
> 


RE: Software raid0 will crash the file-system, when each disk is 5TB

2007-05-16 Thread Jeff Zheng

> The only difference of any significance between the working 
> and non-working configurations is that in the non-working, 
> the component devices are larger than 2Gig, and hence have 
> sector offsets greater than 32 bits.

Do you mean 2T here? But in both configurations, the component devices
are larger than 2T (2.75T & 5.5T).
 
> This does cause a slightly different code path in one place, 
> but I cannot see it making a difference.  But maybe it does.
> 
> What architecture is this running on?
> What C compiler are you using?

i386 (i686).
gcc 4.0.2 20051125.
The distro is Fedora Core; we've tried FC4 and FC6.
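[Editor's note: the i386 answer is the crux here. A tiny, purely illustrative
check of the integer widths involved; compiled natively, it shows why the same
shift cannot overflow on x86_64.]

    #include <stdio.h>

    int main(void)
    {
            /* On i386/i686 kernels "unsigned long" is 32 bits, so a kilobyte
             * offset wraps at 4TiB; on x86_64 it is 64 bits. */
            printf("unsigned long: %zu bits\n", 8 * sizeof(unsigned long));
            printf("long long    : %zu bits\n", 8 * sizeof(long long));
            return 0;
    }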

> Can you try with this patch?  It is the only thing that I can 
> find that could conceivably go wrong.
> 

OK, I will try the patch and post the result.

Best Regards
Jeff Zheng



RE: Software raid0 will crash the file-system, when each disk is 5TB

2007-05-16 Thread Jeff Zheng
 
You will definitely hit the same problem. As very large hardware disks
become more and more common, this will become a big issue for software
raid.


Jeff

-Original Message-
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] 
Sent: Thursday, 17 May 2007 6:04 a.m.
To: Andreas Dilger
Cc: Jeff Zheng; linux-kernel@vger.kernel.org;
[EMAIL PROTECTED]
Subject: Re: Software raid0 will crash the file-system, when each disk
is 5TB


My experience is that if you don't have CONFIG_LBD enabled, the kernel
will report the larger disk as 2T and everything will work; you just
won't get all the space.

plus he seems to be crashing around 500G of data

And finally (if I am reading the post correctly), if he configures the
drives as 4x2.75TB=11TB instead of 2x5.5TB=11TB, he doesn't have the same
problem.

I'm getting ready to set up a similar machine that will have 3x10TB
(three 15-disk arrays with 750GB drives), but won't be ready to try this
for a few more days.

David Lang


RE: Software raid0 will crash the file-system, when each disk is 5TB

2007-05-16 Thread Jeff Zheng
The problem is that it only happens when you actually write data to the
raid; you need the actual space to reproduce the problem.

Jeff 

-Original Message-
From: Jan Engelhardt [mailto:[EMAIL PROTECTED] 
Sent: Thursday, 17 May 2007 6:17 a.m.
To: [EMAIL PROTECTED]
Cc: Andreas Dilger; Jeff Zheng; linux-kernel@vger.kernel.org;
[EMAIL PROTECTED]
Subject: Re: Software raid0 will crash the file-system, when each disk
is 5TB


On May 16 2007 11:04, [EMAIL PROTECTED] wrote:
>
> I'm getting ready to setup a similar machine that will have 3x10TB (3 
> 15 disk arrays with 750G drives), but won't be ready to try this for a
> few more days.

You could emulate it with VMware. Big disks are quite "cheap" when they
are not allocated.


Jan


RE: Software raid0 will crash the file-system, when each disk is 5TB

2007-05-15 Thread Jeff Zheng
Here is the information of the created raid0. Hope it is enough.

Jeff

The crashing one:
md: bind<sdd>
md: bind<sde>
md: raid0 personality registered for level 0
md0: setting max_sectors to 4096, segment boundary to 1048575
raid0: looking at sde
raid0:   comparing sde(5859284992) with sde(5859284992)
raid0:   END
raid0:   ==> UNIQUE
raid0: 1 zones
raid0: looking at sdd
raid0:   comparing sdd(5859284992) with sde(5859284992)
raid0:   EQUAL
raid0: FINAL 1 zones
raid0: done.
raid0 : md_size is 11718569984 blocks.
raid0 : conf->hash_spacing is 11718569984 blocks.
raid0 : nb_zone is 2.
raid0 : Allocating 8 bytes for hash.
JFS: nTxBlock = 8192, nTxLock = 65536

The working one:
md: bind<sde>
md: bind<sdf>
md: bind<sdg>
md: bind<sdd>
md0: setting max_sectors to 4096, segment boundary to 1048575
raid0: looking at sdd
raid0:   comparing sdd(2929641472) with sdd(2929641472)
raid0:   END
raid0:   ==> UNIQUE
raid0: 1 zones
raid0: looking at sdg
raid0:   comparing sdg(2929641472) with sdd(2929641472)
raid0:   EQUAL
raid0: looking at sdf
raid0:   comparing sdf(2929641472) with sdd(2929641472)
raid0:   EQUAL
raid0: looking at sde
raid0:   comparing sde(2929641472) with sdd(2929641472)
raid0:   EQUAL
raid0: FINAL 1 zones
raid0: done.
raid0 : md_size is 11718565888 blocks.
raid0 : conf->hash_spacing is 11718565888 blocks.
raid0 : nb_zone is 2.
raid0 : Allocating 8 bytes for hash.
JFS: nTxBlock = 8192, nTxLock = 65536
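[Editor's note: the two dumps above already contain the clue behind Neil's
eventual fix (quoted earlier in this archive): the per-device size of the
crashing array, 5859284992 1KB blocks, no longer fits in 32 bits, while
2929641472 (0xAE9EC800) does. A small check using only the numbers printed
above, assuming md reports these sizes in 1KB blocks:]

    #include <stdio.h>
    #include <stdint.h>
    #include <inttypes.h>

    int main(void)
    {
            uint64_t crashing = 5859284992ULL;  /* per-device 1KB blocks, 2*5.5T config  */
            uint64_t working  = 2929641472ULL;  /* per-device 1KB blocks, 4*2.75T config */
            uint64_t limit    = UINT32_MAX;     /* largest KB offset a 32-bit long holds */

            printf("crashing: %" PRIu64 " KB, overflows 32 bits: %s\n",
                   crashing, crashing > limit ? "yes" : "no");   /* yes */
            printf("working : %" PRIu64 " KB, overflows 32 bits: %s\n",
                   working, working > limit ? "yes" : "no");     /* no  */
            return 0;
    }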

-Original Message-
From: Neil Brown [mailto:[EMAIL PROTECTED] 
Sent: Wednesday, 16 May 2007 12:04 p.m.
To: Michal Piotrowski
Cc: Jeff Zheng; Ingo Molnar; [EMAIL PROTECTED];
linux-kernel@vger.kernel.org; [EMAIL PROTECTED]
Subject: Re: Software raid0 will crash the file-system, when each disk
is 5TB

On Wednesday May 16, [EMAIL PROTECTED] wrote:
> >
> > Anybody have a clue?
> >

No...
When a raid0 array is assembled, quite a lot of messages get printed
about the number of zones, hash_spacing, etc.  Can you collect and post
those, both for the failing case (2*5.5T) and the working case
(4*2.75T) if possible.

NeilBrown


Software raid0 will crash the file-system, when each disk is 5TB

2007-05-15 Thread Jeff Zheng
Hi everyone:

We are experiencing problems with software raid0 on very large disk
arrays.
We are using two 3ware disk array controllers, each connected to 8 750GB
hard drives, and we build a software raid0 on top of that. The total
capacity is 5.5TB+5.5TB=11TB.

We use jfs as the file-system and have a test application that writes
data continuously to the disks. After writing 52 10GB files, jfs
crashed, and we are not able to recover it; fsck doesn't recognise it
anymore.
We then tried xfs with the same application; it lasted a little longer,
but gave a kernel crash later.

We then reconfigured the hardware array: this time we configured two
disk arrays from each controller, so we have 4 disk arrays, each with
4 750GB hard drives, and built a new software raid0 on top of that.
Total capacity is still the same, but now 2.75T+2.75T+2.75T+2.75T=11T.

This time we managed to fill the whole 11T without problem; we are
still validating all 11TB of data written to the disks.

It happened on 2.6.20 and 2.6.13.

So I think the problem is in the way software raid handles very large
disks, maybe an integer overflow or something. I've searched the web and
only found one other guy complaining about the same thing on the xfs
mailing list.

Anybody have a clue?


Jeff


Software raid0 will crash the file-system, when each disk is 5TB

2007-05-15 Thread Jeff Zheng
Hi everyone:

We are experiencing problems with software raid0, with very
large disk arrays.
We are using two 3ware disk array controllers, each of them is connected
8 750GB harddrives. And we build a software raid0 on top of that. The
total capacity is 5.5TB+5.5TB=11TB

We use jfs as the file-system, we have a test application that write
data continuously to the disks. After writing 52 10GB files, jfs
crashed. And we are not able to recover it, fsck doesn't recognise it
anymore.
We then tried xfs, same application, lasted a little longer, but gives
kernel crash later.

We then reconfigured the hardware array, this time we configured two
disk array from each controller, than we have 4 disk arrays, each of
them have 4 750GB harddrives. Than build a new software raid0 on top of
that. Total capacity is still the same, but 2.75T+2.75T+2.75T+2.75T=11T.

This time we managed to fill the whole 11T data without problem, we are
still doing validation on all 11TB of data written to the disks.

It happened on 2.6.20 and 2.6.13.

So I think the problem is in the way on software raid handling very
large disk, maybe a integer overflow or something. I've searched on the
web, only find another guy complaining the same thing on the xfs mailing
list.

Anybody have a clue?


Jeff
-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


RE: Software raid0 will crash the file-system, when each disk is 5TB

2007-05-15 Thread Jeff Zheng
Here is the information of the created raid0. Hope it is enough.

Jeff

The crashing one:
md: bindsdd
md: bindsde
md: raid0 personality registered for level 0
md0: setting max_sectors to 4096, segment boundary to 1048575
raid0: looking at sde
raid0:   comparing sde(5859284992) with sde(5859284992)
raid0:   END
raid0:   == UNIQUE
raid0: 1 zones
raid0: looking at sdd
raid0:   comparing sdd(5859284992) with sde(5859284992)
raid0:   EQUAL
raid0: FINAL 1 zones
raid0: done.
raid0 : md_size is 11718569984 blocks.
raid0 : conf-hash_spacing is 11718569984 blocks.
raid0 : nb_zone is 2.
raid0 : Allocating 8 bytes for hash.
JFS: nTxBlock = 8192, nTxLock = 65536

The working one:
md: bindsde
md: bindsdf
md: bindsdg
md: bindsdd
md0: setting max_sectors to 4096, segment boundary to 1048575
raid0: looking at sdd
raid0:   comparing sdd(2929641472) with sdd(2929641472)
raid0:   END
raid0:   == UNIQUE
raid0: 1 zones
raid0: looking at sdg
raid0:   comparing sdg(2929641472) with sdd(2929641472)
raid0:   EQUAL
raid0: looking at sdf
raid0:   comparing sdf(2929641472) with sdd(2929641472)
raid0:   EQUAL
raid0: looking at sde
raid0:   comparing sde(2929641472) with sdd(2929641472)
raid0:   EQUAL
raid0: FINAL 1 zones
raid0: done.
raid0 : md_size is 11718565888 blocks.
raid0 : conf-hash_spacing is 11718565888 blocks.
raid0 : nb_zone is 2.
raid0 : Allocating 8 bytes for hash.
JFS: nTxBlock = 8192, nTxLock = 65536

-Original Message-
From: Neil Brown [mailto:[EMAIL PROTECTED] 
Sent: Wednesday, 16 May 2007 12:04 p.m.
To: Michal Piotrowski
Cc: Jeff Zheng; Ingo Molnar; [EMAIL PROTECTED];
linux-kernel@vger.kernel.org; [EMAIL PROTECTED]
Subject: Re: Software raid0 will crash the file-system, when each disk
is 5TB

On Wednesday May 16, [EMAIL PROTECTED] wrote:
 
  Anybody have a clue?
 

No...
When a raid0 array is assemble, quite a lot of message get printed
about number of zones and hash_spacing etc.  Can you collect and post
those.  Both for the failing case (2*5.5T) and the working case
(4*2.55T) is possible.

NeilBrown
-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/