Re: [Lustre-discuss] odd kernel crash after a heartbeat failover

2010-04-16 Thread Andreas Dilger
On 2010-04-16, at 11:29, John White wrote:
> Just to follow-up, after enabling netconsole to get some meaningful
> logging out of these OSSs, it is clear that there's a problem with
> the backend storage communication and that this certainly isn't a
> lustre issue. Thanks folks.
>
> On Apr 1

Re: [Lustre-discuss] llapi stripe_size

2010-04-16 Thread Andreas Dilger
On 2010-04-16, at 11:07, burlen wrote:
> calling llapi_file_create reports the following error:
>
> error: bad stripe_size 4096, must be an even multiple of 65536 bytes:
> Invalid argument (22)
>
> The operation manual said: This value must be an even multiple of
> system page size, as shown by

Re: [Lustre-discuss] odd kernel crash after a heartbeat failover

2010-04-16 Thread John White
Just to follow-up, after enabling netconsole to get some meaningful logging out of these OSSs, it is clear that there's a problem with the backend storage communication and that this certainly isn't a lustre issue. Thanks folks.

John White
High Performance Computing Services (HP

[Lustre-discuss] llapi stripe_size

2010-04-16 Thread burlen
calling llapi_file_create reports the following error:

error: bad stripe_size 4096, must be an even multiple of 65536 bytes: Invalid argument (22)

The operation manual said: This value must be an even multiple of system page size, as shown by getpagesize. The value 4096 above was returned from
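For anyone hitting the same message, here is a minimal sketch of the check the error text implies, assuming the 65536-byte granularity quoted in the error (rather than the page size) is what the library enforces. The helper names `is_valid_stripe_size` and `round_up_stripe_size` are hypothetical and are not part of llapi; the granularity may differ between Lustre versions.

```python
# Granularity taken from the quoted error message ("must be an even
# multiple of 65536 bytes"). This is an assumption, not a documented constant.
STRIPE_GRANULARITY = 65536


def is_valid_stripe_size(stripe_size, granularity=STRIPE_GRANULARITY):
    """Return True if stripe_size is a positive whole multiple of granularity."""
    return stripe_size > 0 and stripe_size % granularity == 0


def round_up_stripe_size(requested, granularity=STRIPE_GRANULARITY):
    """Round a requested size up to the next whole multiple of granularity."""
    if requested <= 0:
        raise ValueError("stripe size must be positive")
    # Ceiling division using integer arithmetic only.
    return -(-requested // granularity) * granularity


# The page size reported in the post fails the check; rounding it up
# gives one 64 KiB unit.
print(is_valid_stripe_size(4096))
print(round_up_stripe_size(4096))
```

Rounding up before passing the value on would avoid the "Invalid argument (22)" result for small requests such as the 4096 above, provided the assumption about the granularity holds.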

Re: [Lustre-discuss] Frequent appearence of LustreError: no handle for file close ino

2010-04-16 Thread Andreas Dilger
On 2010-04-16, at 01:27, Christos Theodosiou wrote:
> our lustre installation uses two failover MDSes, which serve 10
> file-systems. We recently upgraded from 1.8.1.1 to 1.8.2 version.
>
> By monitoring the MDSes I noticed that we get frequent error messages
> (1-4 times/hour) which look like this:

Re: [Lustre-discuss] fseeks on lustre

2010-04-16 Thread Ronald K Long
After doing some more digging, it looks as though a bug was reported on this in 2007:
https://bugzilla.lustre.org/show_bug.cgi?id=12739

We have loaded the patch for lustre attached to this bug; however, when running the set_param command I am getting the following error.

lctl set_param llite*.

[Lustre-discuss] Frequent appearence of LustreError: no handle for file close ino

2010-04-16 Thread Christos Theodosiou
Hi all,

our lustre installation uses two failover MDSes, which serve 10 file-systems. We recently upgraded from 1.8.1.1 to 1.8.2 version. By monitoring the MDSes I noticed that we get frequent error messages (1-4 times/hour) which look like this:

Apr 16 10:36:10 lustre01 kernel: LustreError: 3138