Re: [lustre-discuss] OST is not mounting

Thomas Roth via lustre-discuss Mon, 13 Nov 2023 04:35:55 -0800

So, did you do the "writeconf"? And the OST mounted afterwards?

As I understand it, the MGS was under the impression that this re-mounting OST was actually a new one using an old index.

So, what made your repaired OST look new/different?

I would probably have mounted it locally, as an ext4 file system, if only to check that there is data still present (ok, "df" would do that, too). "tunefs.lustre --dryrun" will show other quantum numbers that _should not_ change when taking down and remounting an OST.

And since "writeconf" has to be done on all targets, you have to take down your MDS anyhow - so nothing is lost by simply trying an MDS restart?


Regards
Thomas

On 11/5/23 17:11, Backer via lustre-discuss wrote:

Hi,
I am new to this email list. Looking to get some help on why an OST is not getting mounted.
The cluster was running healthy until the OST experienced an issue and Linux re-mounted the OST read-only. After fixing the issue and rebooting the node multiple times, the OST wouldn't mount.
When the mount is attempted, the mount command errors out stating that the index is already in use. The index for the device is 33. There is no place where this index is mounted.
The debug message from the MGS during the mount is attached at the end of this email. It is asking to use writeconf. After using writeconf, the device was mounted. Looking for a couple of things here.
- I am hoping that the writeconf method is the right thing to do here.
- Why did the OST end up in this state after the write failure, when it was re-mounted RO? The write error was due to the iSCSI target going offline and coming back a few seconds later.
20000000:01000000:17.0:1698240468.758487:0:91492:0:(mgs_handler.c:496:mgs_target_reg())
 updating fs1-OST0021, index=33

20000000:00000001:17.0:1698240468.758488:0:91492:0:(mgs_llog.c:4403:mgs_write_log_target())
 Process entered

20000000:00000001:17.0:1698240468.758488:0:91492:0:(mgs_llog.c:671:mgs_set_index())
 Process entered

20000000:00000001:17.0:1698240468.758488:0:91492:0:(mgs_llog.c:572:mgs_find_or_make_fsdb())
 Process entered

20000000:00000001:17.0:1698240468.758489:0:91492:0:(mgs_llog.c:551:mgs_find_or_make_fsdb_nolock())
 Process entered

20000000:00000001:17.0:1698240468.758489:0:91492:0:(mgs_llog.c:565:mgs_find_or_make_fsdb_nolock())
 Process leaving (rc=0 : 0 : 0)

20000000:00000001:17.0:1698240468.758489:0:91492:0:(mgs_llog.c:578:mgs_find_or_make_fsdb())
 Process leaving (rc=0 : 0 : 0)

20000000:02020000:17.0:1698240468.758490:0:91492:0:(mgs_llog.c:711:mgs_set_index())
 140-5: Server fs1-OST0021 requested index 33, but that index is already in use. Use --writeconf to force

20000000:00000001:17.0:1698240468.772355:0:91492:0:(mgs_llog.c:712:mgs_set_index())
 Process leaving via out_up (rc=18446744073709551518 : -98 : 0xffffffffffffff9e)

20000000:00000001:17.0:1698240468.772356:0:91492:0:(mgs_llog.c:4408:mgs_write_log_target())
 Process leaving (rc=18446744073709551518 : -98 : ffffffffffffff9e)

20000000:00020000:17.0:1698240468.772357:0:91492:0:(mgs_handler.c:503:mgs_target_reg())
 Failed to write fs1-OST0021 log (-98)

20000000:00000001:17.0:1698240468.783747:0:91492:0:(mgs_handler.c:504:mgs_target_reg())
 Process leaving via out (rc=18446744073709551518 : -98 : 0xffffffffffffff9e)
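
For readers following the log, the three forms of the return code (unsigned 64-bit, signed, hex) are the same value, and the index 33 is the hex suffix 0021 of the target name. A minimal Python sketch of both conversions follows; the helper names decode_rc and target_index are made up for illustration and are not part of Lustre. On Linux, errno 98 is normally EADDRINUSE, but the errno lookup below is platform dependent, so no output is claimed for it.

```python
import errno
import os


def decode_rc(raw: int) -> int:
    """Interpret an unsigned 64-bit value as a signed two's-complement integer."""
    return raw - (1 << 64) if raw >= (1 << 63) else raw


def target_index(target: str) -> int:
    """Return the index encoded as the last four hex digits of a target name."""
    return int(target[-4:], 16)


rc = decode_rc(18446744073709551518)
print(rc)                            # -98
print(target_index("fs1-OST0021"))   # 33

# Symbolic name and message for the errno; the result depends on the platform.
print(errno.errorcode.get(-rc), os.strerror(-rc))
```

So the MGS is refusing the registration with a negative errno, matching the "index is already in use" message a few lines earlier.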




_______________________________________________
lustre-discuss mailing list
lustre-discuss@lists.lustre.org
http://lists.lustre.org/listinfo.cgi/lustre-discuss-lustre.org
