Re: [Lustre-discuss] non-consecutive OST ordering

2010-11-12 Thread Andreas Dilger
On 2010-11-11, at 19:53, Christopher Walker wrote: > Thanks very much for your reply. I've tried remaking the mdsdb and all > of the ostdb's, but I still get the same error -- it checks the first 34 > osts without a problem, but can't find the ostdb file for the 35th > (which has ost_idx 42): > >

Re: [Lustre-discuss] non-consecutive OST ordering

2010-11-12 Thread Wang Yibin
This is a bug in llapi_lov_get_uuids(), which assigns a UUID to the wrong OST index when there are sparse OST(s). Please file a bug for this. Until this bug is fixed, you can apply the following patch to e2fsprogs (version 1.41.12.2.ora1) lfsck.c as a workaround (not verified, though). --- e2fs
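
To illustrate the kind of mismatch described here, the following is a minimal sketch only. It is not the actual llapi_lov_get_uuids() code and not the patch referred to above; the OST names, the index set, and all function names are invented for illustration. It assumes the problem is that UUIDs are packed consecutively while the caller looks them up by their real (sparse) OST index.

```python
# Hypothetical sparse layout: OST indices 0, 1, 2 and 5 exist; 3 and 4 do not.
# All names and values here are made up for illustration.
active = {
    0: "fs-OST0000_UUID",
    1: "fs-OST0001_UUID",
    2: "fs-OST0002_UUID",
    5: "fs-OST0005_UUID",
}


def dense_uuids(osts):
    """Pack UUIDs consecutively, silently dropping the gaps."""
    return [osts[i] for i in sorted(osts)]


def uuid_by_position(osts, ost_idx):
    """Buggy lookup: treats the position in the packed list as the OST index."""
    packed = dense_uuids(osts)
    return packed[ost_idx] if ost_idx < len(packed) else None


def uuid_by_index(osts, ost_idx):
    """Index-preserving lookup: a missing index stays missing."""
    return osts.get(ost_idx)


if __name__ == "__main__":
    for idx in (2, 3, 5):
        print(idx, uuid_by_position(active, idx), uuid_by_index(active, idx))
```

With consecutive indices both lookups agree; once there is a gap, the position-based lookup returns the wrong UUID for some indices and nothing at all for the highest ones, which is consistent with a tool failing to find the file for an OST whose index is larger than the OST count.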

[Lustre-discuss] NFS problem after upgrade to 1.8.3

2010-11-12 Thread Tina Friedrich
Hello List, we re-export the file system via NFS for a couple of things. All the re-exporters are Red Hat 5.5 servers running kernel 2.6.18-194.17.1.el5 (patchless clients). We upgraded our Lustre system from 1.6.7 to 1.8.3.ddn3.3 last week. That seems to have introduced a problem. Since this

Re: [Lustre-discuss] NFS problem after upgrade to 1.8.3

2010-11-12 Thread Bernd Schubert
Hello Tina, On Friday, November 12, 2010, Tina Friedrich wrote: > Hello List, > > we re-export the file system via NFS for a couple of things. All the > re-exporters are Red Hat 5.5 servers running kernel 2.6.18-194.17.1.el5 > (patchless clients). that is your problem. You MUST use a patched ver

Re: [Lustre-discuss] NFS problem after upgrade to 1.8.3

2010-11-12 Thread Tina Friedrich
It does seem to allow NFS exports using RPC version 3 just fine though. It's just versions 1 & 2 where it doesn't work. But thanks, I'll try the patched kernel. Tina On 12/11/10 12:46, Bernd Schubert wrote: > Hello Tina, > > On Friday, November 12, 2010, Tina Friedrich wrote: >> Hello List, >

Re: [Lustre-discuss] NFS problem after upgrade to 1.8.3

2010-11-12 Thread Tina Friedrich
Hello again, nope, running with / exporting from a server with the patched kernel running does not change this behaviour at all. mountvers=3 works, 1 and 2 don't. Tina On 12/11/10 13:28, Tina Friedrich wrote: > It does seem to allow NFS exports using RPC version 3 just fine though. > It's just

Re: [Lustre-discuss] non-consecutive OST ordering

2010-11-12 Thread Christopher Walker
Thanks Andreas. The orphan data is scattered throughout the array, although it's primarily on one OST (30) which seems to have been hit particularly hard by this outage: [r...@iliadaccess04 lfsck2]# grep ERROR lfsck2.out lfsck: ost_idx 5: pass2 ERROR: 3817 dangling inodes found (654297 files t

Re: [Lustre-discuss] non-consecutive OST ordering

2010-11-12 Thread Christopher Walker
Thanks *very* much -- I'll give this a shot later today and let you know how it goes. Best, Chris On 11/12/10 3:17 AM, Wang Yibin wrote: > This is a bug in llapi_lov_get_uuids() which assigns UUID to the wrong OST > index when there are sparse OST(s). > Please file a bug for this. > > Before thi

Re: [Lustre-discuss] NFS problem after upgrade to 1.8.3

2010-11-12 Thread Bernd Schubert
Hello Tina, On 11/12/2010 03:44 PM, Tina Friedrich wrote: > Hello again, > > nope, running with / exporting from a server with the patched kernel > running does not change this behaviour at all. mountvers=3 works, 1 and > 2 don't. I can reproduce it, so NFSv2 support got broken. Which issue ha

Re: [Lustre-discuss] NFS problem after upgrade to 1.8.3

2010-11-12 Thread Tina Friedrich
You're asking questions! No chance we can upgrade the Lustre version again for the next couple of months anyway and it appears to be no problem to make the stupid embedded things use NFSv3. So tar, I would say. Tina On 12/11/10 16:22, Bernd Schubert wrote: > Hello Tina, > > On 11/12/2010 03:44

Re: [Lustre-discuss] non-consecutive OST ordering

2010-11-12 Thread Christopher Walker
Thanks again for this patch. I just have one quick question about this -- 1.41.12.2.ora1 seems to require lustre_user.h from 1.8.x -- is it OK to use a version of lfsck compiled against 1.8.x on a 1.6.6 filesystem, and with {mds,ost}db that were created with 1.41.6? Best, Chris On 11/12/10 3:17 AM,

Re: [Lustre-discuss] non-consecutive OST ordering

2010-11-12 Thread Wang Yibin
For the moment, without investigation, I am not sure about this. There may or may not be a compatibility issue. Please check out the version of e2fsprogs that is identical to the one on your system and patch its lfsck.c accordingly. Then you can compile against 1.6.6. On 2010-11-13, AM 1

Re: [Lustre-discuss] NFS problem after upgrade to 1.8.3

2010-11-12 Thread Alexey Lyashkov
Hi Tina, if I remember correctly, the Lustre inode identifier can't be correctly encoded in an NFS fid v1, due to NFS limits. NFS has too short a structure to store the FS id generated from the Lustre client superblock. That is the reason for the incorrect conversion from NFS fid to Lustre inode identifier, and nfsd can

Re: [Lustre-discuss] NFS problem after upgrade to 1.8.3

2010-11-12 Thread Alexey Lyashkov
Bernd, that is problem not related to stack size. mountd should encode {sb->sb_dev, inode->i_no} in own structure, but for mount v1 sb_dev too short to store full sb_dev (it is 16bit, but nfs fid v1 has 8 for it) bit but lustre client can generate that. in that case nfsd got invalid NFS fid to c