Greetings again! Thank you very much. I have reformatted one OST for the lustre00 filesystem (mkfs.lustre --ost --reformat --fsname=lustre00 --mgsnode=192.168.11...@o2ib 801), restarted the Lustre file system, and got all OSTs UP and active. But when I tried to test the system, I hit another error. On the client I created a directory, set striping (I chose small chunks so that files would land on both OSTs), and added files to the directory. But when I ran the lfs getstripe command I got the following crash:
[ka...@client]$ sudo lfs getstripe My_file.tif
OBDS:
0: lustre00-OST0000_UUID ACTIVE
1: lustre00-OST0001_UUID ACTIVE
*** buffer overflow detected ***: lfs terminated
======= Backtrace: =========
/lib64/libc.so.6(__fortify_fail+0x37)[0x7f577afeca27]
/lib64/libc.so.6(+0xdea40)[0x7f577afeaa40]
/lib64/libc.so.6(+0xddd04)[0x7f577afe9d04]
lfs[0x41872c]
lfs[0x418dcc]
lfs[0x4192fd]
lfs[0x403e70]
lfs[0x409678]
lfs[0x404947]
/lib64/libc.so.6(__libc_start_main+0xfd)[0x7f577af2ab6d]
lfs[0x402f89]
======= Memory map: ========
00400000-0045c000 r-xp 00000000 08:01 7401524 /usr/bin/lfs
0065b000-0065c000 r--p 0005b000 08:01 7401524 /usr/bin/lfs
0065c000-0065d000 rw-p 0005c000 08:01 7401524 /usr/bin/lfs
0065d000-00691000 rw-p 0065d000 00:00 0 [heap]
7f577aad4000-7f577aaea000 r-xp 00000000 08:01 1941620 /lib64/libgcc_s.so.1
7f577aaea000-7f577ace9000 ---p 00016000 08:01 1941620 /lib64/libgcc_s.so.1
7f577ace9000-7f577acea000 r--p 00015000 08:01 1941620 /lib64/libgcc_s.so.1
7f577acea000-7f577aceb000 rw-p 00016000 08:01 1941620 /lib64/libgcc_s.so.1
7f577aceb000-7f577ad08000 r-xp 00000000 08:01 1941553 /lib64/libtinfo.so.5.7
7f577ad08000-7f577af07000 ---p 0001d000 08:01 1941553 /lib64/libtinfo.so.5.7
7f577af07000-7f577af0b000 r--p 0001c000 08:01 1941553 /lib64/libtinfo.so.5.7
7f577af0b000-7f577af0c000 rw-p 00020000 08:01 1941553 /lib64/libtinfo.so.5.7
7f577af0c000-7f577b056000 r-xp 00000000 08:01 1941507 /lib64/libc-2.11.1.so
7f577b056000-7f577b256000 ---p 0014a000 08:01 1941507 /lib64/libc-2.11.1.so
7f577b256000-7f577b25a000 r--p 0014a000 08:01 1941507 /lib64/libc-2.11.1.so
7f577b25a000-7f577b25b000 rw-p 0014e000 08:01 1941507 /lib64/libc-2.11.1.so
7f577b25b000-7f577b260000 rw-p 7f577b25b000 00:00 0
7f577b260000-7f577b298000 r-xp 00000000 08:01 1941558 /lib64/libreadline.so.5.2
7f577b298000-7f577b497000 ---p 00038000 08:01 1941558 /lib64/libreadline.so.5.2
7f577b497000-7f577b499000 r--p 00037000 08:01 1941558 /lib64/libreadline.so.5.2
7f577b499000-7f577b49f000 rw-p 00039000 08:01 1941558 /lib64/libreadline.so.5.2
7f577b49f000-7f577b4a1000 rw-p 7f577b49f000 00:00 0
7f577b4a1000-7f577b4bd000 r-xp 00000000 08:01 1941513 /lib64/ld-2.11.1.so
7f577b6af000-7f577b6b2000 rw-p 7f577b6af000 00:00 0
7f577b6b5000-7f577b6b6000 rw-p 7f577b6b5000 00:00 0
7f577b6b6000-7f577b6bb000 rw-s 00000000 00:08 1835008 /SYSV00000000 (deleted)
7f577b6bb000-7f577b6bc000 rw-p 7f577b6bb000 00:00 0
7f577b6bc000-7f577b6bd000 r--p 0001b000 08:01 1941513 /lib64/ld-2.11.1.so
7f577b6bd000-7f577b6be000 rw-p 0001c000 08:01 1941513 /lib64/ld-2.11.1.so
7f577b6be000-7f577b6bf000 rw-p 7f577b6be000 00:00 0
7fff06a07000-7fff06a1c000 rw-p 7ffffffea000 00:00 0 [stack]
7fff06a41000-7fff06a42000 r-xp 7fff06a41000 00:00 0 [vdso]
ffffffffff600000-ffffffffff601000 r-xp 00000000 00:00 0 [vsyscall]
Aborted

Is there some way to repair it? I've tried writeconf but nothing changed.
_________________
Thanks, Katya

Andreas Dilger wrote:
> On 2010-04-19, at 01:41, x...@xgl.pereslavl.ru wrote:
>> I have 1 OST that appears as an inactive device on the client:
>> [Client] lfs df -h
>> UUID                   bytes   Used    Available  Use%  Mounted on
>> lustre00-MDT0000_UUID  814.8G  471.8M  767.8G     0%    /mnt/lustre00[MDT:0]
>> lustre00-OST0000_UUID: inactive device
>> lustre00-OST0001_UUID  7.2T    10.4G   6.8T       0%    /mnt/lustre00[OST:1]
>> How can I activate this device?
>>
>> I have 2 OSSes theoretically configured as a failover pair using
>> heartbeat, 1 MDS, and 2 OSTs accessible from both OSSes.
>> haresources:
>> my1.localdomain Filesystem::/dev/disk/by-id/scsi-801::/mnt/ost0::lustre
>> my2.localdomain
>>     Filesystem::/dev/disk/by-id/scsi-800::/mnt/mdt::lustre
>>     Filesystem::/dev/disk/by-id/scsi-802::/mnt/ost1::lustre
>>
>> On both OSSes this device appears active:
>> [my2.localdomain ~]# lctl dl
>>   5 UP osc lustre00-OST0001-osc lustre00-mdtlov_UUID 5
>>   6 UP osc lustre00-OST0000-osc lustre00-mdtlov_UUID 5
>>   8 UP obdfilter lustre00-OST0001 lustre00-OST0001_UUID 7
>>
>>   0 UP mgc mgc192.168.11....@o2ib 89a7ffad-6d5e-8468-1b95-c694f35b8ad1 5
>>   1 UP ost OSS OSS_uuid 3
>>   2 UP obdfilter lustre-OST0000 lustre-OST0000_UUID 3
>>
>> What am I missing?
>
> If, in fact, the OST is active on both OSSes, that would be VERY bad.
> However, it seems like you have two different OSTs, one in the
> "lustre" filesystem, one in the "lustre00" filesystem, so it seems you
> have some sort of a configuration problem.
>
> Cheers, Andreas
> --
> Andreas Dilger
> Principal Engineer, Lustre Group
> Oracle Corporation Canada Inc.

_______________________________________________
Lustre-discuss mailing list
Lustre-discuss@lists.lustre.org
http://lists.lustre.org/mailman/listinfo/lustre-discuss
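The writeconf step mentioned above normally has to be applied to every target (MDT and all OSTs), not only the reformatted one, and the targets then remounted in a specific order. A sketch of the usual 1.8-era procedure follows; the device paths are taken from the haresources snippet above, and everything else is an assumption about this particular setup, not a verified fix:

```shell
# Sketch only: regenerate the Lustre configuration logs on every target.
# Run each command on the node that currently serves that device.

# 1. Unmount everything: clients first, then the OSTs, then the MDT.
#    On my1 (per haresources):  umount /mnt/ost0
#    On my2:                    umount /mnt/ost1 /mnt/mdt

# 2. Regenerate the config logs on all targets:
tunefs.lustre --writeconf /dev/disk/by-id/scsi-800   # MDT   (on my2)
tunefs.lustre --writeconf /dev/disk/by-id/scsi-801   # OST0, the reformatted one (on my1)
tunefs.lustre --writeconf /dev/disk/by-id/scsi-802   # OST1  (on my2)

# 3. Remount in order: MDT first, then the OSTs, then the clients.
mount -t lustre /dev/disk/by-id/scsi-800 /mnt/mdt    # on my2
mount -t lustre /dev/disk/by-id/scsi-801 /mnt/ost0   # on my1
mount -t lustre /dev/disk/by-id/scsi-802 /mnt/ost1   # on my2
```

This is a cluster-admin fragment, so it cannot be tested outside a Lustre installation; double-check the device-to-node mapping against your heartbeat configuration before running it.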