Dear All,

We have a legacy version of Lustre installed as part of a DDN storage solution:

lustre: 2.4.3 (circa 2014)

kernel: patchless_client

Build Version: 
EXAScaler-ddn1.0--PRISTINE-2.6.32-358.23.2.el6_lustre.es279.devel.x86_64



It has been running fine for years, but after a particularly bad power 
failure it started producing the following messages:

Jan 15 10:03:07 mds2 kernel: : LustreError: 
3394:0:(osp_precreate.c:989:osp_precreate_thread()) 
scratch-OST0014-osc-MDT0000: cannot precreate objects: rc = -116
Jan 15 10:03:07 mds2 kernel: : LustreError: 
3394:0:(osp_precreate.c:989:osp_precreate_thread()) Skipped 210 previous 
similar messages
Jan 15 10:07:51 mds2 kernel: : Lustre: scratch-OST000f-osc-MDT0000: slow 
creates, last=[0x1000f0000:0x1217571a:0x0], next=[0x1000f0000:0x1217571a:0x0], 
reserved=0, syn_changes=0, syn_rpc_in_progress=0, status=0
Jan 15 10:07:51 mds2 kernel: : Lustre: Skipped 3 previous similar messages
Jan 15 10:08:32 oss5 kernel: : LustreError: 
26943:0:(ofd_obd.c:1348:ofd_create()) scratch-OST0004: unable to precreate: rc 
= -116
Jan 15 10:08:32 oss5 kernel: : LustreError: 
26943:0:(ofd_obd.c:1348:ofd_create()) Skipped 66 previous similar messages
Jan 15 10:09:26 oss4 kernel: : LustreError: 
18223:0:(ofd_obd.c:1348:ofd_create()) scratch-OST000f: unable to precreate: rc 
= -116
Jan 15 10:09:26 oss4 kernel: : LustreError: 
18223:0:(ofd_obd.c:1348:ofd_create()) Skipped 70 previous similar messages
Jan 15 10:09:37 oss3 kernel: : LustreError: 
16621:0:(ofd_obd.c:1348:ofd_create()) scratch-OST0014: unable to precreate: rc 
= -116
Jan 15 10:09:37 oss3 kernel: : LustreError: 
16621:0:(ofd_obd.c:1348:ofd_create()) Skipped 77 previous similar messages
Jan 15 10:09:38 mds2 kernel: : Lustre: scratch-OST0014-osc-MDT0000: slow 
creates, last=[0x100140000:0x11dd257a:0x0], next=[0x100140000:0x11dd257a:0x0], 
reserved=0, syn_changes=0, syn_rpc_in_progress=0, status=-116
Jan 15 10:13:12 mds2 kernel: : LustreError: 
3404:0:(osp_precreate.c:484:osp_precreate_send()) scratch-OST0004-osc-MDT0000: 
can't precreate: rc = -116
Jan 15 10:13:12 mds2 kernel: : LustreError: 
3404:0:(osp_precreate.c:484:osp_precreate_send()) Skipped 226 previous similar 
messages
Jan 15 10:13:12 mds2 kernel: : LustreError: 
3404:0:(osp_precreate.c:484:osp_precreate_send()) Skipped 226 previous similar 
messages
Jan 15 10:13:12 mds2 kernel: : LustreError: 
3404:0:(osp_precreate.c:989:osp_precreate_thread()) 
scratch-OST0004-osc-MDT0000: cannot precreate objects: rc = -116
Jan 15 10:13:12 mds2 kernel: : LustreError: 
3404:0:(osp_precreate.c:989:osp_precreate_thread()) Skipped 226 previous 
similar messages
Jan 15 10:18:37 oss5 kernel: : LustreError: 
1791:0:(ofd_obd.c:1348:ofd_create()) scratch-OST0004: unable to precreate: rc = 
-116
Jan 15 10:18:37 oss5 kernel: : LustreError: 
1791:0:(ofd_obd.c:1348:ofd_create()) Skipped 77 previous similar messages
Jan 15 10:19:36 oss4 kernel: : LustreError: 
1687:0:(ofd_obd.c:1348:ofd_create()) scratch-OST000f: unable to precreate: rc = 
-116
Jan 15 10:19:36 oss4 kernel: : LustreError: 
1687:0:(ofd_obd.c:1348:ofd_create()) Skipped 77 previous similar messages
Jan 15 10:19:42 oss3 kernel: : LustreError: 
1196:0:(ofd_obd.c:1348:ofd_create()) scratch-OST0014: unable to precreate: rc = 
-116
Jan 15 10:19:42 oss3 kernel: : LustreError: 
1196:0:(ofd_obd.c:1348:ofd_create()) Skipped 75 previous similar messages
Jan 15 10:23:16 mds2 kernel: : LustreError: 
3400:0:(osp_precreate.c:484:osp_precreate_send()) scratch-OST000f-osc-MDT0000: 
can't precreate: rc = -116

The messages concern the same three OSTs (OST0004, OST000f and OST0014) and 
appear both on the OSS servers serving those OSTs and on the MDS responsible 
for that filesystem (/global/scratch). They recur continuously, about every 
4 minutes, starting as soon as the filesystem is mounted and before any I/O 
occurs. In other words, the messages appear even on an inactive filesystem.
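
As a rough way to confirm which OSTs are involved, a log excerpt can be tallied 
with a short script. The following is a minimal Python sketch, not part of any 
Lustre tooling: the names tally_precreate_errors and describe_rc and the regular 
expression are illustrative assumptions based only on the message formats quoted 
above. It also translates the return code into an errno name; on Linux, errno 
116 is ESTALE.

```python
import errno
import os
import re
from collections import Counter

# Matches the OST name and return code in the precreate errors quoted above,
# e.g. "scratch-OST0014: unable to precreate: rc = -116". The optional
# "-osc-MDTxxxx" suffix covers the MDS-side variant. Whitespace is matched
# loosely because mail clients wrap long syslog lines.
PRECREATE_RE = re.compile(
    r"(?P<target>\w+-OST[0-9a-fA-F]{4})(?:-osc-MDT[0-9a-fA-F]{4})?:\s+"
    r"(?:unable to precreate|cannot precreate objects|can't precreate):\s+"
    r"rc\s*=\s*(?P<rc>-?\d+)"
)

# Two messages copied from the log excerpt, including the wrapped line breaks.
SAMPLE = (
    "Jan 15 10:08:32 oss5 kernel: : LustreError: \n"
    "26943:0:(ofd_obd.c:1348:ofd_create()) scratch-OST0004: unable to precreate: rc \n"
    "= -116\n"
    "Jan 15 10:13:12 mds2 kernel: : LustreError: \n"
    "3404:0:(osp_precreate.c:484:osp_precreate_send()) scratch-OST0004-osc-MDT0000: \n"
    "can't precreate: rc = -116\n"
    "Jan 15 10:09:37 oss3 kernel: : LustreError: \n"
    "16621:0:(ofd_obd.c:1348:ofd_create()) scratch-OST0014: unable to precreate: rc \n"
    "= -116\n"
)


def tally_precreate_errors(log_text):
    """Count precreate failures per (OST name, rc) pair in syslog text."""
    counts = Counter()
    for match in PRECREATE_RE.finditer(log_text):
        counts[(match.group("target"), int(match.group("rc")))] += 1
    return counts


def describe_rc(rc):
    """Return (errno name, description) for a negative kernel return code."""
    code = abs(rc)
    return errno.errorcode.get(code, "UNKNOWN"), os.strerror(code)


if __name__ == "__main__":
    for (target, rc), n in sorted(tally_precreate_errors(SAMPLE).items()):
        name, text = describe_rc(rc)
        print(f"{target}: rc={rc} ({name}: {text}) x{n}")
```

Run against the syslog from the MDS and OSS nodes, this should list only the 
affected OSTs together with their return codes.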

While everything seems to work, performance is terrible.  Creating a 
directory on the filesystem can take 1-2 minutes.  The load on the MDS 
climbs to very high values (100-160) during normal I/O operations and the 
filesystem overall is extremely slow.  The MDS also reports "slow creates" 
(see the messages above).

We think the error messages above indicate the problem, but despite many 
hours of searching the web we have not been able to find any documentation on 
what may be causing them or how to correct the issue.

Any help would be greatly appreciated. Thanks a million for any suggestions 
and solutions.

All the best
Roman


_______________________________________________
lustre-discuss mailing list
lustre-discuss@lists.lustre.org
http://lists.lustre.org/listinfo.cgi/lustre-discuss-lustre.org
