Hi Haoyang,

You should probably rebuild MOFED against the new kernel first, and then
rebuild the Lustre server packages so that they are built against that MOFED.
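
To confirm that the load failures all come from the InfiniBand stack, the dmesg output can be summarised per module. The following is only a rough sketch: the function name is made up for illustration, and it assumes the log lines have been unwrapped so that each message sits on one line (the sample lines are taken from the original report).

```python
import re
from collections import defaultdict

# Matches kernel log lines such as:
#   [17509.744317] ko2iblnd: disagrees about version of symbol ib_create_cq
MISMATCH = re.compile(r"(\w+): disagrees about version of symbol (\w+)")


def mismatched_symbols(log_text):
    """Return {module: sorted list of symbols whose versions disagree}."""
    found = defaultdict(set)
    for line in log_text.splitlines():
        match = MISMATCH.search(line)
        if match:
            found[match.group(1)].add(match.group(2))
    return {module: sorted(symbols) for module, symbols in found.items()}


# Sample lines copied from the report, with the mail wrapping undone.
SAMPLE = """\
[17509.744301] ko2iblnd: disagrees about version of symbol ib_fmr_pool_unmap
[17509.744307] ko2iblnd: Unknown symbol ib_fmr_pool_unmap (err -22)
[17509.744317] ko2iblnd: disagrees about version of symbol ib_create_cq
[17509.744319] ko2iblnd: Unknown symbol ib_create_cq (err -22)
[17509.744332] ko2iblnd: disagrees about version of symbol rdma_resolve_addr
[17509.744334] ko2iblnd: Unknown symbol rdma_resolve_addr (err -22)
[17509.744345] ko2iblnd: disagrees about version of symbol ib_create_fmr_pool
"""

if __name__ == "__main__":
    for module, symbols in mismatched_symbols(SAMPLE).items():
        print(module, "->", ", ".join(symbols))
```

If only ko2iblnd shows up, and only with ib_* and rdma_* symbols, that would be consistent with the Lustre modules having been built against a different InfiniBand stack than the one currently loaded.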
1) About the restore
You can try switching back to the old kernel first. However, as you said,
you have already rebuilt MOFED under the new kernel, so once you go back to
the old kernel you will need to rebuild MOFED again (make sure the versions
are the same).
If that does not work, you can try reinstalling the I/O servers as you did
at the very beginning. I recommend installing the OS on a new drive.
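
Whichever kernel you end up booting, it may help to check that each kernel-dependent package was built for it. Below is a minimal sketch of such a check, using the version strings from your report; the function name is hypothetical, and a plain substring comparison is an assumption that is good enough for these strings, not a general RPM version parser.

```python
def built_for_kernel(kernel_release, package_version):
    """Return True if the package version string embeds the kernel release.

    RPM version and release fields cannot contain '-', so packages spell the
    kernel release with '_' instead. Normalise both sides before comparing.
    """
    wanted = kernel_release.replace("-", "_")
    return wanted in package_version.replace("-", "_")


# Version strings copied from the original report.
OLD_KERNEL = "3.10.0-514.2.2.el7_lustre.gba8983e.x86_64"
NEW_KERNEL = "3.10.0-514.2.2.el7_lustre.x86_64"
MOFED = "4.2.1.2.0.1.gf8de107.kver.3.10.0_514.2.2.el7_lustre.gba8983e.x86_64.x86_64"
LUSTRE_AFTER_UPDATE = "2.7.19.8-3.10.0_514.2.2.el7_lustre.x86_64.x86_64"

if __name__ == "__main__":
    for kernel in (OLD_KERNEL, NEW_KERNEL):
        print(kernel)
        print("  mlnx-ofed matches:", built_for_kernel(kernel, MOFED))
        print("  lustre matches:   ", built_for_kernel(kernel, LUSTRE_AFTER_UPDATE))
```

With these strings, the installed mlnx-ofed matches only the old kernel and the updated Lustre matches only the new one, which is the mismatch described in the report. On a live server the kernel release would come from `uname -r` and the package versions from `rpm -q`.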

2) About data loss
No data is lost; it is stored on the MGT, MDT, and OSTs.

Thanks
Regards,

<lustre-discuss-requ...@lists.lustre.org> wrote on Tue, Jul 27, 2021 at 4:28 AM:

> Today's Topics:
>
>    1. Recover from broken lustre updates (Haoyang Liu)
>
>
> ----------------------------------------------------------------------
>
> Message: 1
> Date: Mon, 26 Jul 2021 16:28:26 +0800 (GMT+08:00)
> From: "Haoyang Liu" <liuhaoy...@pku.edu.cn>
> To: lustre-discuss@lists.lustre.org
> Subject: [lustre-discuss] Recover from broken lustre updates
> Message-ID: <5e70f6a4.db93.17ae1edf43b.coremail.liuhaoy...@pku.edu.cn>
> Content-Type: text/plain; charset=UTF-8
>
> Hi all,
>
> I am using Lustre 2.7 with Mellanox InfiniBand. I recently performed a
> system update by mistake, and since the update the Lustre modules won't load.
>
> System configuration before the update:
> centos-7.3, kernel version: 3.10.0-514.2.2.el7_lustre.gba8983e.x86_64
> lustre version: 2.7.19.8-3.10.0_514.2.2.el7_lustre.gba8983e.x86_64_gba8983e.x86_64
> mlnx-ofed version: 4.2.1.2.0.1.gf8de107.kver.3.10.0_514.2.2.el7_lustre.gba8983e.x86_64.x86_64
>
> System configuration after the update:
> centos-7.3, kernel version: 3.10.0-514.2.2.el7_lustre.x86_64
> lustre version: 2.7.19.8-3.10.0_514.2.2.el7_lustre.x86_64.x86_64
> mlnx-ofed version: 4.2.1.2.0.1.gf8de107.kver.3.10.0_514.2.2.el7_lustre.gba8983e.x86_64.x86_64
>
> The update seems to have just replaced the Linux kernel with a different
> patch version (without gba8983e) and rebuilt the Lustre modules (Lustre
> itself was not upgraded). However, the Lustre modules are built against the
> wrong version of mlnx-ofed. dmesg shows the following errors:
>
>
> [17509.744301] ko2iblnd: disagrees about version of symbol ib_fmr_pool_unmap
> [17509.744307] ko2iblnd: Unknown symbol ib_fmr_pool_unmap (err -22)
> [17509.744317] ko2iblnd: disagrees about version of symbol ib_create_cq
> [17509.744319] ko2iblnd: Unknown symbol ib_create_cq (err -22)
> [17509.744332] ko2iblnd: disagrees about version of symbol rdma_resolve_addr
> [17509.744334] ko2iblnd: Unknown symbol rdma_resolve_addr (err -22)
> [17509.744345] ko2iblnd: disagrees about version of symbol ib_create_fmr_pool
> ...
>
> I've tried to build mlnx-ofed under the updated kernel, but the problem
> still exists.
>
> My questions:
> 1) How can I restore the Lustre system to its state before the update? The
> following RPMs are already present on my server:
> ----------------
> kernel-3.10.0-514.2.2.el7_lustre.gba8983e.x86_64.rpm
> kernel-devel-3.10.0-514.2.2.el7_lustre.gba8983e.x86_64.rpm
> kernel-headers-3.10.0-514.2.2.el7_lustre.gba8983e.x86_64.rpm
> kernel-tools-3.10.0-514.2.2.el7_lustre.gba8983e.x86_64.rpm
> kernel-tools-libs-3.10.0-514.2.2.el7_lustre.gba8983e.x86_64.rpm
> kernel-tools-libs-devel-3.10.0-514.2.2.el7_lustre.gba8983e.x86_64.rpm
> kmod-spl-3.10.0-514.2.2.el7_lustre.gba8983e.x86_64-0.6.5.7-1.el7.x86_64.rpm
> kmod-spl-devel-0.6.5.7-1.el7.x86_64.rpm
> kmod-spl-devel-3.10.0-514.2.2.el7_lustre.gba8983e.x86_64-0.6.5.7-1.el7.x86_64.rpm
> kmod-zfs-3.10.0-514.2.2.el7_lustre.gba8983e.x86_64-0.6.5.7-1.el7.x86_64.rpm
> kmod-zfs-devel-0.6.5.7-1.el7.x86_64.rpm
> kmod-zfs-devel-3.10.0-514.2.2.el7_lustre.gba8983e.x86_64-0.6.5.7-1.el7.x86_64.rpm
> libnvpair1-0.6.5.7-1.el7.x86_64.rpm
> libuutil1-0.6.5.7-1.el7.x86_64.rpm
> libzfs2-0.6.5.7-1.el7.x86_64.rpm
> libzfs2-devel-0.6.5.7-1.el7.x86_64.rpm
> libzpool2-0.6.5.7-1.el7.x86_64.rpm
> lustre-2.7.19.8-3.10.0_514.2.2.el7_lustre.gba8983e.x86_64_gba8983e.x86_64.rpm
> lustre-dkms-2.7.19.8-1.el7.noarch.rpm
> lustre-iokit-2.7.19.8-3.10.0_514.2.2.el7_lustre.gba8983e.x86_64_gba8983e.x86_64.rpm
> lustre-modules-2.7.19.8-3.10.0_514.2.2.el7_lustre.gba8983e.x86_64_gba8983e.x86_64.rpm
> lustre-osd-ldiskfs-2.7.19.8-3.10.0_514.2.2.el7_lustre.gba8983e.x86_64_gba8983e.x86_64.rpm
> lustre-osd-ldiskfs-mount-2.7.19.8-3.10.0_514.2.2.el7_lustre.gba8983e.x86_64_gba8983e.x86_64.rpm
> lustre-osd-zfs-2.7.19.8-3.10.0_514.2.2.el7_lustre.gba8983e.x86_64_gba8983e.x86_64.rpm
> lustre-osd-zfs-mount-2.7.19.8-3.10.0_514.2.2.el7_lustre.gba8983e.x86_64_gba8983e.x86_64.rpm
> lustre-source-2.7.19.8-3.10.0_514.2.2.el7_lustre.gba8983e.x86_64_gba8983e.x86_64.rpm
> lustre-tests-2.7.19.8-3.10.0_514.2.2.el7_lustre.gba8983e.x86_64_gba8983e.x86_64.rpm
> mlnx-ofa_kernel-4.2-OFED.4.2.1.2.0.1.gf8de107.x86_64.rpm
> mlnx-ofa_kernel-devel-4.2-OFED.4.2.1.2.0.1.gf8de107.x86_64.rpm
> mlnx-ofa_kernel-modules-4.2-OFED.4.2.1.2.0.1.gf8de107.kver.3.10.0_514.2.2.el7_lustre.gba8983e.x86_64.x86_64.rpm
> perf-3.10.0-514.2.2.el7_lustre.gba8983e.x86_64.rpm
> python-perf-3.10.0-514.2.2.el7_lustre.gba8983e.x86_64.rpm
> spl-0.6.5.7-1.el7.x86_64.rpm
> spl-dkms-0.6.5.7-1.el7.noarch.rpm
> zfs-0.6.5.7-1.el7.x86_64.rpm
> zfs-dkms-0.6.5.7-1.el7.noarch.rpm
> zfs-dracut-0.6.5.7-1.el7.x86_64.rpm
> zfs-test-0.6.5.7-1.el7.x86_64.rpm
> ----------------
>
> 2) What is the risk of data loss?
>
>
> Thanks,
>
> Haoyang
>
> ------------------------------
>
> Subject: Digest Footer
>
> _______________________________________________
> lustre-discuss mailing list
> lustre-discuss@lists.lustre.org
> http://lists.lustre.org/listinfo.cgi/lustre-discuss-lustre.org
>
>
> ------------------------------
>
> End of lustre-discuss Digest, Vol 184, Issue 17
> ***********************************************
>
