Hello again,
any idea what can be done in such a case?
Regards
Heiko
Hello,
after a crash (hardware failure) of an OST with two Lustre partitions, one
partition (/dev/sdb) cannot be remounted after the restart.
The second partition (/dev/sdc) mounts fine.
What needs to be done in such a case?
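A rough sketch of checks that are often worth running before retrying the mount
(the /mnt/ost mount point below is only a placeholder, and the commands assume
the OST is the ldiskfs-backed /dev/sdb described above):

  # see whether the device is somehow still mounted or held
  grep sdb /proc/mounts

  # print the Lustre target configuration stored on the device
  tunefs.lustre --print /dev/sdb

  # read-only filesystem check of the ldiskfs backing store (makes no changes)
  e2fsck -fn /dev/sdb

  # then retry the mount with a fresh mount point
  mkdir -p /mnt/ost
  mount -t lustre /dev/sdb /mnt/ost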
I
> clip off one of the OSS:
>
> Aug 13 17:26:48 lustre-oss-0-1 kernel: LustreError: 137-5: UUID
> 'lfs-OST0004_UUID' is not available for connect (no target)
I don't see any hardware issues.
Could this be caused by extremely high load? I'm seeing load of up to 300 on a
dual processor quad
I just realized that I may not have answered your question, and I'm
not sure if the patch is in the source posted at sun.com or not.
If not, it is in the bug as an attachment -
https://bugzilla.lustre.org/show_bug.cgi?id=14040
-frank
On Aug 13, 2008, at 1:07 PM, Frank Leers wrote:
On Aug 13, 2008, at 12:38 PM, Brock Palen wrote:
Is the cache patch for mv_sata noted in the Sun paper on the x4500
available? Or has it been rolled into the source distributed by Sun?
What source are you referring to?
It can be had here http://www.sun.com/servers/x64/x4500/downloads.jsp
Is the cache patch for mv_sata noted in the Sun paper on the x4500
available? Or has it been rolled into the source distributed by Sun?
Trying to avoid data loss.
Brock Palen
www.umich.edu/~brockp
Center for Advanced Computing
[EMAIL PROTECTED]
(734)936-1985
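Until the cache-patch question is settled, one generic precaution against
losing cached writes on power failure is to turn off the volatile write cache
on the drives themselves. Whether this is honoured through the mv_sata driver
is a separate question, and the device name below is only an example:

  # query the current write-cache setting of one drive
  hdparm -W /dev/sdb

  # disable the on-drive write cache (trades some throughput for safety)
  hdparm -W 0 /dev/sdb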
The disks are DDN S2A 9550.
I didn't see any issues on the hardware side, and my I/O tests, which used
all of the OSTs, finished fine. Could a long I/O wait cause an error
like that?
I've seen the OSS load hit 300 during an I/O stress test.
-Alex
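For what it's worth, long I/O waits can push RPC service times past the global
Lustre timeout, which shows up as exactly this kind of disconnect/reconnect
noise. A quick way to inspect and temporarily raise it on the servers, assuming
an lctl that supports get_param/set_param (the value shown is only an example):

  # show the current global obd timeout in seconds
  lctl get_param timeout

  # raise it for the running system (not persistent across reboots)
  lctl set_param timeout=300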
On Wed, 2008-08-13 at 22:39 +0900, Alex Lee wrote:
> I have a system that's been spitting out OST disconnect messages under
> heavy load. I'm guessing the OST eventually reconnects.
> I want to say this happens when the OSS is extremely overloaded, but I
> did notice this happening even under light
I have a system that's been spitting out OST disconnect messages under
heavy load. I'm guessing the OST eventually reconnects.
I want to say this happens when the OSS is extremely overloaded, but I
did notice this happening even under light load. Only the OSS seems to
spit out any error messages.
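One way to confirm that the OSTs really do reconnect rather than staying
evicted is to watch the import state from a client; the exact parameter names
vary a bit between Lustre versions, so treat this as a sketch:

  # on a client: dump the OSC import state for every OST (look for FULL vs DISCONN)
  lctl get_param osc.*.import

  # on the OSS: list the local obd devices and their current status
  lctl dl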
On Wed, 2008-08-13 at 11:55 +0300, Alex wrote:
> Hello Brian,
Hi.
> Thanks for your prompt reply... See my comments inline.
NP.
> I have a cluster with 3 servers (let's say they are web servers
> for simplicity). All web servers are serving the same content from a shared
> storage volume mount
Hello,
after a crash (hardware failure) of an OST with two Lustre partitions, one
partition (/dev/sdb) cannot be remounted after the restart.
The second partition (/dev/sdc) mounts fine.
What needs to be done in such a case?
I tried to move the mountpoint because of the "file exists" message, but tha
Hello Brian,
Thanks for your prompt reply... See my comments inline.
> Right. So you have 8 iSCSI disks.
Yes, let's simplify our test environment.
- I have 2 LVS routers (one active router and one backup router used for
failover) to balance connections through our servers located behind them.
-