Re: [lustre-discuss] failed OST recover
> Many years ago when I was using Lustre-1.8.X, I used to suffer the same nightmare as you now. The following procedure saved me. But I am not sure whether it works for you or not.

Thank you! I had found this recipe, but it does not work in new Lustre versions: ll_recover_lost_found_objs does not exist any more. I have 2.12.2 installed. As I understand it, its function is now integrated into the lfsck procedure, but that does not work as I expect. Can anybody give me a clue how to force this procedure? Should I stop all clients and run lfsck with the broken OST enabled? I do not want to experiment while I have tens of users, and one week of Lustre unavailability without significant results looks very bad for me.

> 1. umount all the clients, umount OST.
> 2. mount OST as ldiskfs: mount -t ldiskfs /dev/ /mnt
> 3. Run the command: ll_recover_lost_found_objs -d
>
> On that occasion it restored about 70% of the data.
>
> In case you want to remove the files which were lost on the OST, but unfortunately "rm -f " does not work:
>
> 1. Record the full paths of the files which you want to remove.
> 2. umount all clients, OST, and MDT.
> 3. Mount MDT as ldiskfs: mount -t ldiskfs /dev/ /mnt
> 4. Go to /mnt/ROOT/. You will find the complete directory tree of your Lustre file system, but without the file contents. You can remove the files you want from here.
>
> Cheers,
> T.H.Hsieh
>
> On Mon, Nov 30, 2020 at 01:09:07PM +0300, Sergey Zhumatiy wrote:
>> Hello!
>> Please help me resolve this. One OST in my Lustre installation has failed. It lost all fs metadata, so I couldn't mount it as a Lustre filesystem. I checked it with e2fsck and all data was moved into the lost+found folder. Then I moved this folder to another storage, re-created this OST (with the old target index), then put back the lost+found folder.
>>
>> After mounting this OST as Lustre, I started lfsck on the MDS. After several hours I disabled this OST, because no client could work. Then Lustre became healthy, and I started lfs_migrate from this OST.
>>
>> But it seems that the data was not restored by lfsck: lfs_migrate moved a few files and the rest report 'endpoint not connected'.
>>
>> How can I restore some data and delete unrecoverable data?
>>
>> --
>> With respect
>> Serg.

--
With respect
Serg.
___
lustre-discuss mailing list
lustre-discuss@lists.lustre.org
http://lists.lustre.org/listinfo.cgi/lustre-discuss-lustre.org
Re: [lustre-discuss] failed OST recover
Dear Serg,

Many years ago when I was using Lustre-1.8.X, I used to suffer the same nightmare as you now. The following procedure saved me. But I am not sure whether it works for you or not.

1. umount all the clients, umount OST.
2. mount OST as ldiskfs: mount -t ldiskfs /dev/ /mnt
3. Run the command: ll_recover_lost_found_objs -d

On that occasion it restored about 70% of the data.

In case you want to remove the files which were lost on the OST, but unfortunately "rm -f " does not work:

1. Record the full paths of the files which you want to remove.
2. umount all clients, OST, and MDT.
3. Mount MDT as ldiskfs: mount -t ldiskfs /dev/ /mnt
4. Go to /mnt/ROOT/. You will find the complete directory tree of your Lustre file system, but without the file contents. You can remove the files you want from here.

Cheers,
T.H.Hsieh

On Mon, Nov 30, 2020 at 01:09:07PM +0300, Sergey Zhumatiy wrote:
> Hello!
> Please help me resolve this. One OST in my Lustre installation has failed. It lost all fs metadata, so I couldn't mount it as a Lustre filesystem. I checked it with e2fsck and all data was moved into the lost+found folder. Then I moved this folder to another storage, re-created this OST (with the old target index), then put back the lost+found folder.
>
> After mounting this OST as Lustre, I started lfsck on the MDS. After several hours I disabled this OST, because no client could work. Then Lustre became healthy, and I started lfs_migrate from this OST.
>
> But it seems that the data was not restored by lfsck: lfs_migrate moved a few files and the rest report 'endpoint not connected'.
>
> How can I restore some data and delete unrecoverable data?
>
> --
> With respect
> Serg.
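The two procedures above can be sketched as a small dry-run script. This is only an illustration under stated assumptions: the device names, mount points and file path are hypothetical placeholders, passing the lost+found directory to -d is an assumption (the argument is not shown in the message), and the final umount is an added assumption. By default the script only prints the commands; it runs them only when DRY_RUN=0 is set.

```shell
#!/bin/sh
# Dry-run sketch of the recovery and removal steps described above.
# ASSUMPTIONS: /dev/sdX, /dev/sdY, /mnt/ost, /mnt/mdt and the file path are
# hypothetical placeholders; lost+found as the -d argument is assumed.

# Print the command by default; execute it only when DRY_RUN=0.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        printf '%s\n' "$*"
    else
        "$@"
    fi
}

# Recovery steps 2-3: mount the OST as ldiskfs and run the recovery tool.
# Step 1 (unmounting all clients and the OST) must already be done.
recover_ost() {
    ost_dev=$1
    mnt=$2
    run mount -t ldiskfs "$ost_dev" "$mnt"
    run ll_recover_lost_found_objs -d "$mnt/lost+found"
    run umount "$mnt"   # assumption: unmount before remounting as Lustre
}

# Removal steps 3-4: mount the MDT as ldiskfs and delete the recorded paths,
# given relative to the Lustre root, from under ROOT/.
remove_lost_files() {
    mdt_dev=$1
    mnt=$2
    shift 2
    run mount -t ldiskfs "$mdt_dev" "$mnt"
    for rel in "$@"; do
        run rm -f "$mnt/ROOT/$rel"
    done
    run umount "$mnt"   # assumption, as above
}

recover_ost /dev/sdX /mnt/ost
remove_lost_files /dev/sdY /mnt/mdt home/user/lost_file.dat
```

Reviewing the printed commands before setting DRY_RUN=0 is the point of the design: both procedures modify the backing file systems directly. Note also that, per the follow-up in this thread, ll_recover_lost_found_objs is not available in 2.12.2.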
Re: [lustre-discuss] The safe path for upgrading servers from 2.5.3 to 2.12.x
Dear Nguyen,

Usually the upgrade procedure is the following:

1. Shut down the Lustre file system completely (umount all the clients and servers).
2. On all the clients and servers, install the new version of the Lustre software. If your servers use the ldiskfs backend, please install the recommended version of the e2fsprogs package (see https://www.lustre.org/download/).
3. On all the servers and clients, unload the Lustre modules with:
   lustre_rmmod
   Or you can reboot these machines instead.
4. On each of the servers, run the following command to upgrade:
   tunefs.lustre --writeconf /dev/
   Note that you have to do it for all the MGS / MDT / OST.
5. When running tunefs.lustre, it may prompt you to turn on some options of the ldiskfs file system of the corresponding device using the e2fsprogs utilities. Just follow the indications.
6. Then the upgrade is complete. You can try to restart the Lustre file system.

I have upgraded from version 2.5.X to 2.10.X and 2.12.X directly. Everything looks fine to me.

Cheers,
T.H.Hsieh

On Mon, Nov 30, 2020 at 11:17:03PM +0700, Nguyen Viet Cuong wrote:
> Hi there,
>
> Can anyone advise me on the safe way to upgrade a Lustre server from 2.5.3 to 2.10.x or 2.12.x? I am running 2.5.3 on CentOS 6.5 with an FDR card. I now have to upgrade the card to EDR or HDR.
>
> And has anyone successfully connected servers with a mix of EDR and HDR200? And how?
>
> Best regards,
> Nguyen Viet Cuong
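Steps 3 and 4 above, as they might look on a single server, can be sketched as a dry-run script. The device names are hypothetical placeholders, and the script only prints the commands unless DRY_RUN=0 is set. It assumes the file system is already shut down and the new packages are installed (steps 1 and 2).

```shell
#!/bin/sh
# Dry-run sketch of steps 3-4 of the upgrade procedure on one server.
# ASSUMPTION: /dev/sdX and /dev/sdY are hypothetical target devices.

# Print the command by default; execute it only when DRY_RUN=0.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        printf '%s\n' "$*"
    else
        "$@"
    fi
}

# Unload the Lustre modules, then regenerate the configuration logs on
# every target (MGS / MDT / OST) device passed as an argument.
upgrade_targets() {
    run lustre_rmmod
    for dev in "$@"; do
        run tunefs.lustre --writeconf "$dev"
    done
}

upgrade_targets /dev/sdX /dev/sdY
```

Printing first makes it easy to check that every MGS, MDT and OST device on the server is covered before anything is changed.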
[lustre-discuss] The safe path for upgrading servers from 2.5.3 to 2.12.x
Hi there,

Can anyone advise me on the safe way to upgrade a Lustre server from 2.5.3 to 2.10.x or 2.12.x? I am running 2.5.3 on CentOS 6.5 with an FDR card. I now have to upgrade the card to EDR or HDR.

And has anyone successfully connected servers with a mix of EDR and HDR200? And how?

Best regards,
Nguyen Viet Cuong
[lustre-discuss] Bad performance lustre
Hi Lustre Community,

Could you help me solve my Lustre performance problem?

I have just set up several Lustre gateways between two InfiniBand networks A and B.

Node A         <-->  Gateway        <-->  Node B
Lustre 2.12.2        Lustre 2.13.3        Lustre 2.12.3

I am using Mellanox ConnectX-5 cards on servers with AMD Epyc processors. I have tested and validated all speeds at the InfiniBand level (ib_read_bw and ib_write_bw) and at the TCP/IP level (iperf3).

From a Lustre point of view (LNet selftest), the rates are correct between the nodes of A and the gateways. On the other hand, the performance between the B nodes and the gateways is very, very low:

[LNet Rates of servers]
[R] Avg: 26 RPC/s Min: 26 RPC/s Max: 26 RPC/s
[W] Avg: 36 RPC/s Min: 36 RPC/s Max: 36 RPC/s
[LNet Bandwidth of servers]
[R] Avg: 8.61 MiB/s Min: 8.61 MiB/s Max: 8.61 MiB/s
[W] Avg: 9.59 MiB/s Min: 9.59 MiB/s Max: 9.59 MiB/s

[LNet Rates of servers]
[R] Avg: 33 RPC/s Min: 33 RPC/s Max: 33 RPC/s
[W] Avg: 42 RPC/s Min: 42 RPC/s Max: 42 RPC/s
[LNet Bandwidth of servers]
[R] Avg: 8.62 MiB/s Min: 8.62 MiB/s Max: 8.62 MiB/s
[W] Avg: 8.81 MiB/s Min: 8.81 MiB/s Max: 8.81 MiB/s

Do you have any idea what causes these ridiculously low rates?

Regards,
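For readers who want to reproduce this kind of measurement, the following is a rough sketch of an LNet selftest (lst) session that produces "[LNet Rates of servers]" style output. It is not the poster's actual test: the NIDs, the session, group and batch names, and the 1M bulk write test are all assumptions chosen for illustration. The function only prints the commands; they would be run from a node with the lnet_selftest module loaded on every node involved.

```shell
#!/bin/sh
# Prints a candidate lst session between one node and one gateway.
# ASSUMPTIONS: both NIDs, the names gw_check/clients/servers/bulk and the
# brw write size=1M test are hypothetical choices, not from the message.

lst_plan() {
    from_nid=$1   # NID of the node generating traffic (hypothetical)
    to_nid=$2     # NID of the gateway under test (hypothetical)
    cat <<EOF
export LST_SESSION=\$\$
lst new_session gw_check
lst add_group clients $from_nid
lst add_group servers $to_nid
lst add_batch bulk
lst add_test --batch bulk --from clients --to servers brw write size=1M
lst run bulk
lst stat servers
lst end_session
EOF
}

lst_plan 10.0.0.1@o2ib 10.0.1.1@o2ib1
```

Note that lst stat keeps reporting until it is interrupted, so in practice it is stopped by hand (or run in the background and killed) before lst end_session.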
[lustre-discuss] failed OST recover
Hello!

Please help me resolve this. One OST in my Lustre installation has failed. It lost all fs metadata, so I couldn't mount it as a Lustre filesystem. I checked it with e2fsck and all data was moved into the lost+found folder. Then I moved this folder to another storage, re-created this OST (with the old target index), then put back the lost+found folder.

After mounting this OST as Lustre, I started lfsck on the MDS. After several hours I disabled this OST, because no client could work. Then Lustre became healthy, and I started lfs_migrate from this OST.

But it seems that the data was not restored by lfsck: lfs_migrate moved a few files and the rest report 'endpoint not connected'.

How can I restore some data and delete unrecoverable data?

--
With respect
Serg.
Re: [lustre-discuss] Quota related (Anilkumar Naik)
Sorry, I typed the wrong word. You should replace "qouta" with "quota".

On Mon, Nov 30, 2020 at 2:41 PM, Anilkumar Naik wrote:
> The commands below give errors for me. Based on our Lustre details, could you please provide the exact commands to run on our server? Thank you.
>
> Regards,
> Anilkumar
>
> On Mon, 30 Nov, 2020, 6:59 am 肖正刚, wrote:
>
>> Hi,
>> you can enable user quota on mgs by
>> "
>> lctl conf_param your_fsname.qouta.mdt=u
>> lctl conf_param your_fsname.qouta.ost=u
>> "
>> details about quota in lustre manual chapter 25
>> https://doc.lustre.org/lustre_manual.xhtml#configuringquotas
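With the spelling corrected as noted above, the two commands can be generated for a given file system name as follows. This is a minimal sketch: "testfs" is a hypothetical file system name, and the function only prints the commands, which would then be run on the MGS.

```shell
#!/bin/sh
# Prints the corrected commands for enabling user quota on the MDTs and OSTs.
# ASSUMPTION: "testfs" below is a hypothetical file system name.

quota_commands() {
    fsname=$1
    printf 'lctl conf_param %s.quota.mdt=u\n' "$fsname"
    printf 'lctl conf_param %s.quota.ost=u\n' "$fsname"
}

quota_commands testfs
```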
Re: [lustre-discuss] 2.12 Client connecting to 2.5 Server
Thanks! I will try it and report the result back to you.

Nguyen Viet Cuong

On Mon, Nov 30, 2020 at 4:00 PM Tung-Han Hsieh <thhs...@twcp1.phys.ntu.edu.tw> wrote:
> Hello,
>
> It is OK. We have a cluster with Lustre-2.5.3 installed on the Lustre servers, and clients running Lustre 2.5.3, 2.10.7, and 2.12.5 mount the Lustre-2.5.3 servers. So far there are no problems.
>
> Cheers,
>
> T.H.Hsieh
>
> On Mon, Nov 30, 2020 at 03:48:07PM +0700, Nguyen Viet Cuong wrote:
> > Hi there,
> >
> > Has anyone tried to use a 2.12 client with a 2.5 server? Is it compatible? If I run a test on a live system, is there any risk?
> >
> > Thanks!
> > Cuong Nguyen
Re: [lustre-discuss] 2.12 Client connecting to 2.5 Server
Hello,

It is OK. We have a cluster with Lustre-2.5.3 installed on the Lustre servers, and clients running Lustre 2.5.3, 2.10.7, and 2.12.5 mount the Lustre-2.5.3 servers. So far there are no problems.

Cheers,

T.H.Hsieh

On Mon, Nov 30, 2020 at 03:48:07PM +0700, Nguyen Viet Cuong wrote:
> Hi there,
>
> Has anyone tried to use a 2.12 client with a 2.5 server? Is it compatible? If I run a test on a live system, is there any risk?
>
> Thanks!
> Cuong Nguyen
[lustre-discuss] 2.12 Client connecting to 2.5 Server
Hi there,

Has anyone tried to use a 2.12 client with a 2.5 server? Is it compatible? If I run a test on a live system, is there any risk?

Thanks!
Cuong Nguyen