Re: [Lustre-discuss] making a client reconnect to OST

2008-01-25 Thread Jim Harm
On the client I tried lctl --device $number deactivate, which worked, followed by lctl --device $number activate, which I believe should have done the same thing; this failed without any error notice. I ended up having to umount and mount, which finally reconnected the OST. At 12:55 PM -0700 1

Re: [Lustre-discuss] Lustre 1.6.4.1 - client lockup

2008-01-25 Thread Harald van Pee
Hi, that's interesting for me. Can you just try what happens if you delete a large directory (lots of files with a couple of GB total space) from this client? I have a test cluster with 1.6.4.1 kernel 2.6.18.8 vanilla running. The clients are patchless, server and clients are rock stable since

Re: [Lustre-discuss] Lustre 1.6.4.1 - client lockup

2008-01-25 Thread Kilian CAVALOTTI
Hi Niklas, On Friday 25 January 2008 07:10:47 am Niklas Edmundsson wrote: > We're able to consistently kill the lustre client with bonnie in > combination with striping. Out of curiosity, I tried to reproduce your experiment, and didn't encounter any problem. All the bonnie processes ran fine.

Re: [Lustre-discuss] lustre fail

2008-01-25 Thread Andreas Dilger
On Jan 24, 2008 18:28 +0100, Papp Tamas wrote: > What does this message mean? > > Jan 24 18:20:53 meta1 kernel: LustreError: 11-0: an error occurred > while communicating with [EMAIL PROTECTED] The ost_connect operation > failed with -19 -19 = -ENODEV (from /usr/include/asm/errno.h) That means
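As a small illustration of the -19 = -ENODEV lookup mentioned above, here is a minimal sketch using Python's standard errno module instead of reading /usr/include/asm/errno.h by hand. The helper name decode_rc is hypothetical and not part of Lustre; the numeric values are those of the platform the script runs on, so it assumes a Linux host.

```python
import errno
import os

def decode_rc(rc):
    """Map a negative kernel-style return code to (symbolic name, message).

    decode_rc is a hypothetical helper for illustration only.
    """
    code = abs(rc)
    # errno.errorcode maps numeric values to names such as "ENODEV"
    name = errno.errorcode.get(code, "UNKNOWN")
    return name, os.strerror(code)

# The ost_connect failure in the log reported -19
name, message = decode_rc(-19)
print(name, "-", message)
```

The same helper can be applied to any other negative return code that shows up in a LustreError line.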

Re: [Lustre-discuss] making a client reconnect to OST

2008-01-25 Thread Andreas Dilger
On Jan 24, 2008 10:23 -0500, Brock Palen wrote: > I have a client (one of our login nodes) that was evicted by one of > the OSTs but not both of them. So some files are accessible, others > are not. Strange thing is that both the OSTs live on the same OSS. > > Is there a way to ask lustre

Re: [Lustre-discuss] e2fsprogs version recommended for Lustre 1.6.4.2

2008-01-25 Thread Lundgren, Andrew
I got them yesterday. Thank you. > -Original Message- > From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] > Sent: Friday, January 25, 2008 12:57 AM > To: Lundgren, Andrew > Cc: Sébastien Buisson; [EMAIL PROTECTED] > Subject: RE: [Lustre-discuss] e2fsprogs version recommended > for Lustre

Re: [Lustre-discuss] lustre errors?

2008-01-25 Thread Isaac Huang
On Wed, Jan 23, 2008 at 10:06:54AM -0500, Brock Palen wrote: > We are seeing a lot of messages like the following: > > Lustre: 3177:0:(router.c:167:lnet_notify()) Ignoring prediction from > [EMAIL PROTECTED] of [EMAIL PROTECTED] down 9786657437 seconds in > the future It's a known & fixed bug

Re: [Lustre-discuss] drbd(fail?)

2008-01-25 Thread Brian J. Murrell
On Thu, 2008-01-24 at 19:26 +0100, Papp Tamás wrote: > helo Everybody! Hi. > Does anybody have an idea, what happened, what would have to make with > any part of the history? I'm not sure how many people here use drbd or are expert enough on it to answer your questions, but perhaps a more drbd

[Lustre-discuss] Lustre 1.6.4.1 - client lockup

2008-01-25 Thread Niklas Edmundsson
Hi again! We're able to consistently kill the lustre client with bonnie in combination with striping. This is Lustre 1.6.4.1, Debian 2.6.18 amd64 kernel with lustre patches on both server and clients (i.e. not patchless client, even though we're pretty sure that it's the same bug that bites us

Re: [Lustre-discuss] e2fsprogs version recommended for Lustre 1.6.4.2

2008-01-25 Thread Girish Shilamkar
Hi, I think you might have tried this link: ftp://ftp.lustre.org/pub/Lustre/Tools/e2fsprogs/ which does give an error 505. The following link does work for me: ftp://ftp.lustre.org/pub/lustre/other/e2fsprogs/ Regards, Girish. On Thu, 2008-01-24 at 12:32 -0700, Lundgren, Andrew wrote: > I tried

Re: [Lustre-discuss] getting filesystem metadata

2008-01-25 Thread James Braid
On 21 Jan 2008, at 15:03, Brock Palen wrote: > It appears if I do a find that the OSTs still do a lot of disk > seeks. Is there a way to explicitly pull this data from just the mdt > without walking all 3 TB of data? I don't think there's a "nice" way to do this just yet. We snapshot the MDS v

Re: [Lustre-discuss] Lustre 1.6.4.2 released

2008-01-25 Thread James Braid
On 22 Jan 2008, at 18:27, Erich Focht wrote: > I'm waiting for a RHEL5.1 client, too, and somewhat hoped it would > come > along with 1.6.4.2. What is the targeted (approximate) release date > for 1.6.5? There are patches in bugzilla somewhere for rhel5.1 support if you need it before the next

Re: [Lustre-discuss] files/directories are temporarily unavailable on patchless clients

2008-01-25 Thread Harald van Pee
On Thursday 24 January 2008 08:13 pm, you wrote: > Hello Harald, > > > Jan 21 18:12:51 node0010 kernel: Lustre: 5717:0: > > (namei.c:235:ll_mdc_blocking_ast()) More than 1 alias dir 134120476 alias > > 2 Jan 21 18:12:51 node0010 kernel: Lustre: 5717:0: > > (namei.c:235:ll_mdc_blocking_ast()) Skippe

Re: [Lustre-discuss] e2fsprogs version recommended for Lustre 1.6.4.2

2008-01-25 Thread Lundgren, Andrew
I tried via a browser and got 505 errors saying I could not change into the directory. I tried via ftp and it blocked due to system load -- Andrew > -Original Message- > From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] > Sent: Thursday, January 24, 2008 12:28 PM > To: Sébastien Bui

Re: [Lustre-discuss] e2fsprogs version recommended for Lustre 1.6.4.2

2008-01-25 Thread Girish Shilamkar
Hi, Please use this link, instead. ftp://ftp.lustre.org/pub/lustre/other/e2fsprogs/ Regards, Girish On Wed, 2008-01-23 at 00:11 +0530, Girish Shilamkar wrote: > Hi, > ftp://ftp.lustre.org/pub/Lustre/Tools/e2fsprogs/ > The latest version of e2fsprogs for lustre can normally be found here

Re: [Lustre-discuss] files/directories are temporarily unavailable on patchless clients

2008-01-25 Thread Bernd Schubert
Hello Harald, > Jan 21 18:12:51 node0010 kernel: Lustre: 5717:0: > (namei.c:235:ll_mdc_blocking_ast()) More than 1 alias dir 134120476 alias 2 > Jan 21 18:12:51 node0010 kernel: Lustre: 5717:0: > (namei.c:235:ll_mdc_blocking_ast()) Skipped 6 previous similar messages this looks very much like a r

[Lustre-discuss] drbd(fail?)

2008-01-25 Thread Papp Tamás
Hello everybody! I have a strange problem with my cluster. Yesterday I saw that node3 of my lustre cluster (it's the pair of node4 of the heartbeat+drbd cluster) had frozen and node4 didn't take over the OST. After reboot it always wrote 'System halted.' on console, but it cannot be down. I

Re: [Lustre-discuss] e2fsprogs version recommended for Lustre 1.6.4.2

2008-01-25 Thread Lundgren, Andrew
Any progress on this? Is there some place else we can download this from? > -Original Message- > From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] > Sent: Tuesday, January 22, 2008 11:41 AM > To: Sébastien Buisson > Cc: Lundgren, Andrew; [EMAIL PROTECTED] > Subject: Re: [Lustre-discuss] e

[Lustre-discuss] lustre fail

2008-01-25 Thread Papp Tamas
Hello! What does this message mean? Jan 24 18:20:53 meta1 kernel: LustreError: 11-0: an error occurred while communicating with [EMAIL PROTECTED] The ost_connect operation failed with -19 Thanks, tamas

[Lustre-discuss] making a client reconnect to OST

2008-01-25 Thread Brock Palen
I have a client (one of our login nodes) that was evicted by one of the OSTs but not both of them. So some files are accessible, others are not. Strange thing is that both the OSTs live on the same OSS. The errors in dmesg are: LustreError: 11-0: an error occurred while communicating with

Re: [Lustre-discuss] Lustre 1.6.4.2 released

2008-01-25 Thread Jody McIntyre
Hi Per, On Tue, Jan 22, 2008 at 03:37:37PM +0100, Per Lundqvist wrote: > Hi Jody, I notice that the rhel kernels are quite old: 2.6.9-55.0.9.EL is > from rhel4 update 5 and 2.6.18-8.1.14.el5 is from rhel5, before update 1. > The latest snapshots are rhel4 update 6 (2.6.9-67.0.1.EL) and rhel5 up

[Lustre-discuss] lustre errors?

2008-01-25 Thread Brock Palen
We are seeing a lot of messages like the following: Lustre: 3177:0:(router.c:167:lnet_notify()) Ignoring prediction from [EMAIL PROTECTED] of [EMAIL PROTECTED] down 9786657437 seconds in the future Lustre: 3243:0:(ldlm_lib.c:519:target_handle_reconnect()) nobackup- OST: 606834a0-0818-74c2