This looks like LU-16655 (https://jira.whamcloud.com/browse/LU-16655), which
breaks the Object Index (OI) files after an upgrade from 2.12.x to 2.15.[012].

A patch for this has already landed on b2_15 and will be included in
2.15.3. If you've hit this issue, you need to back up and then delete the OI
files (off of Lustre) and run OI Scrub to rebuild them.

I believe the OI Scrub/rebuild is described in the Lustre Manual.
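
As a rough sketch of what that procedure might look like (not a verified
recipe), the script below only prints a candidate command sequence; it runs
nothing. It assumes the OI files are the oi.16.* files in the root of the
MDT when it is mounted as ldiskfs, and that "off of Lustre" means doing the
backup/delete with the MDT stopped. The device and target name come from
this thread; the mount points and backup directory are hypothetical. Check
each step against the Lustre Manual before running anything.

```shell
#!/bin/sh
# Sketch only: PRINT a candidate OI rebuild sequence, do not execute it.
MDT_DEV=/dev/mapper/experimds01-experimds01   # MDT device named in this thread
MDT_NAME=experi01-MDT0000                     # target name from the e2fsck output
LUSTRE_MNT=/mnt/mdt                           # hypothetical Lustre mount point of the MDT
LDISKFS_MNT=/mnt/mdt-ldiskfs                  # hypothetical temporary ldiskfs mount point
OI_BACKUP=/root/oi-backup                     # hypothetical backup dir, not on the MDT

oi_rebuild_plan() {
    cat <<EOF
umount $LUSTRE_MNT
mount -t ldiskfs $MDT_DEV $LDISKFS_MNT
mkdir -p $OI_BACKUP
mv $LDISKFS_MNT/oi.16.* $OI_BACKUP/
umount $LDISKFS_MNT
mount -t lustre $MDT_DEV $LUSTRE_MNT
lctl lfsck_start -M $MDT_NAME -t scrub
lctl get_param -n osd-ldiskfs.$MDT_NAME.oi_scrub
EOF
}

oi_rebuild_plan
```

The last printed command is there to watch the scrub status until it
reports completion, before remounting clients and rechecking "ls -l".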

Cheers, Andreas

On May 3, 2023, at 09:30, Colin Faber via lustre-discuss 
<lustre-discuss@lists.lustre.org> wrote:


Hi,

What does your client log indicate? (dmesg / syslog)

On Wed, May 3, 2023, 7:32 AM Jane Liu via lustre-discuss
<lustre-discuss@lists.lustre.org> wrote:
Hello,

I'm writing to ask for your help on one issue we observed after a major
upgrade of a large Lustre system from RHEL7 + 2.12.9 to RHEL8 + 2.15.2.
We preserved the MDT disk (a vdisk on a VM) and all OST disks (JBOD) from
the RHEL7 system, installed RHEL8, and then attached the preserved disks
to the new OS. However, I ran into an issue after the OS upgrade and
Lustre installation.

I believe the issue is related to metadata.

The old MDS was a virtual machine, and the MDT vdisk was preserved
during the upgrade. When a new VM was created with the same hostname and
IP, the preserved MDT vdisk was attached to it. Everything seemed fine
initially. However, after the client mount was completed, the file
listing displayed question marks, as shown below:

[root@experimds01 ~]# mount -t lustre 11.22.33.44@tcp:/experi01 /mntlustre/
[root@experimds01 ~]# cd /mntlustre/
[root@experimds01 mntlustre]# ls -l
ls: cannot access 'experipro': No such file or directory
ls: cannot access 'admin': No such file or directory
ls: cannot access 'test4': No such file or directory
ls: cannot access 'test3': No such file or directory
total 0
d????????? ? ? ? ?            ? admin
d????????? ? ? ? ?            ? experipro
-????????? ? ? ? ?            ? test3
-????????? ? ? ? ?            ? test4

I shut down the MDT and ran "e2fsck -p
/dev/mapper/experimds01-experimds01". It reported "primary superblock
features different from backup, check forced."

[root@experimds01 ~]# e2fsck -p /dev/mapper/experimds01-experimds01
experi01-MDT0000 primary superblock features different from backup, check forced.
experi01-MDT0000: 9493348/429444224 files (0.5% non-contiguous), 109369520/268428864 blocks

Running e2fsck again showed that the filesystem was clean.
[root@experimds01 /]# e2fsck -p /dev/mapper/experimds01-experimds01
experi01-MDT0000: clean, 9493378/429444224 files, 109369610/268428864 blocks

However, the issue persisted. The file listing continued to display
question marks.

Do you have any idea what could be causing this problem and how to fix
it? By the way, I have an e2image backup of the MDT from the RHEL7
system in case we need to fix it using the backup. Also, after the
upgrade, the command "lfs df" shows that all OSTs and the MDT are fine.

Thank you in advance for your assistance.

Best regards,
Jane
_______________________________________________
lustre-discuss mailing list
lustre-discuss@lists.lustre.org
http://lists.lustre.org/listinfo.cgi/lustre-discuss-lustre.org