[lustre-discuss] How to make OSTs active again
I have a simple Lustre setup: 1 MGS, 1 MDS (2 MDTs), 2 OSS (2 OSTs each) and 1 client node to run some IO load. I was testing what happens if one of the OSS nodes dies (with no impact to the data). To recover from the failed OSS, I created a new instance and attached the 2 OSTs from the failed node. Since I am reusing the existing OSTs from the failed node and their indexes stay the same, I first tried mounting them directly:

mount -t lustre /dev/oracleoci/oraclevdb /mnt/oss-2-ost-1
mount -t lustre /dev/oracleoci/oraclevdc /mnt/oss-2-ost-2

Since I tried this many different times, I also tried the following.

Ran mkfs.lustre on the OSTs:

mkfs.lustre --fsname=lustrefs --index=2 --ost --mgsnode=10.0.6.2@tcp1 /dev/oracleoci/oraclevdb
mkfs.lustre --fsname=lustrefs --index=3 --ost --mgsnode=10.0.6.2@tcp1 /dev/oracleoci/oraclevdc
mount -t lustre /dev/oracleoci/oraclevdb /mnt/oss-2-ost-1
mount -t lustre /dev/oracleoci/oraclevdc /mnt/oss-2-ost-2

Ran mkfs.lustre on the OSTs with --reformat --replace:

mkfs.lustre --fsname=lustrefs --reformat --replace --index=2 --ost --mgsnode=10.0.6.2@tcp1 /dev/oracleoci/oraclevdb
mount -t lustre /dev/oracleoci/oraclevdb /mnt/oss-2-ost-1
mkfs.lustre --fsname=lustrefs --reformat --replace --index=3 --ost --mgsnode=10.0.6.2@tcp1 /dev/oracleoci/oraclevdc
mount -t lustre /dev/oracleoci/oraclevdc /mnt/oss-2-ost-2

Questions:
1. After the OSS node was replaced, the client mount was still hung and I had to reboot the client node for the mount to work. Is there some config I need to set so it auto-recovers?
2. On the client node, the 2 OSTs show as INACTIVE. How do I make them active again? I read on forums to run "lctl --device <device> recover/activate"; I ran that on the MDS and the client, and the OSTs still show INACTIVE. It was confusing what to pass as the device and where to find the correct name.
[root@client-1 ~]# lfs osts
OBDS:
0: lustrefs-OST0000_UUID ACTIVE
1: lustrefs-OST0001_UUID ACTIVE
2: lustrefs-OST0002_UUID INACTIVE
3: lustrefs-OST0003_UUID INACTIVE

[root@client-1 ~]# lctl dl
0 UP mgc MGC10.0.6.2@tcp1 0e4fae60-66e5-963d-1aea-59b80f9fd77b 4
1 UP lov lustrefs-clilov-89259ae86000 6c141ed7-bffe-3d1b-a094-11fbdaab9ac5 3
2 UP lmv lustrefs-clilmv-89259ae86000 6c141ed7-bffe-3d1b-a094-11fbdaab9ac5 4
3 UP mdc lustrefs-MDT0000-mdc-89259ae86000 6c141ed7-bffe-3d1b-a094-11fbdaab9ac5 4
4 UP mdc lustrefs-MDT0001-mdc-89259ae86000 6c141ed7-bffe-3d1b-a094-11fbdaab9ac5 4
5 UP osc lustrefs-OST0002-osc-89259ae86000 6c141ed7-bffe-3d1b-a094-11fbdaab9ac5 4
6 UP osc lustrefs-OST0003-osc-89259ae86000 6c141ed7-bffe-3d1b-a094-11fbdaab9ac5 4
7 UP osc lustrefs-OST0000-osc-89259ae86000 6c141ed7-bffe-3d1b-a094-11fbdaab9ac5 4
8 UP osc lustrefs-OST0001-osc-89259ae86000 6c141ed7-bffe-3d1b-a094-11fbdaab9ac5 4
[root@client-1 ~]#

MDS node:
$ sudo lctl dl
0 UP osd-ldiskfs lustrefs-MDT0001-osd lustrefs-MDT0001-osd_UUID 10
1 UP osd-ldiskfs lustrefs-MDT0000-osd lustrefs-MDT0000-osd_UUID 11
2 UP mgc MGC10.0.6.2@tcp1 acc3160e-9975-9262-89e1-8dc66812ac94 4
3 UP mds MDS MDS_uuid 2
4 UP lod lustrefs-MDT0000-mdtlov lustrefs-MDT0000-mdtlov_UUID 3
5 UP mdt lustrefs-MDT0000 lustrefs-MDT0000_UUID 18
6 UP mdd lustrefs-MDD0000 lustrefs-MDD0000_UUID 3
7 UP qmt lustrefs-QMT0000 lustrefs-QMT0000_UUID 3
8 UP osp lustrefs-MDT0001-osp-MDT0000 lustrefs-MDT0000-mdtlov_UUID 4
9 UP osp lustrefs-OST0002-osc-MDT0000 lustrefs-MDT0000-mdtlov_UUID 4
10 UP osp lustrefs-OST0003-osc-MDT0000 lustrefs-MDT0000-mdtlov_UUID 4
11 UP osp lustrefs-OST0000-osc-MDT0000 lustrefs-MDT0000-mdtlov_UUID 4
12 UP osp lustrefs-OST0001-osc-MDT0000 lustrefs-MDT0000-mdtlov_UUID 4
13 UP lwp lustrefs-MDT0000-lwp-MDT0000 lustrefs-MDT0000-lwp-MDT0000_UUID 4
14 UP lod lustrefs-MDT0001-mdtlov lustrefs-MDT0001-mdtlov_UUID 3
15 UP mdt lustrefs-MDT0001 lustrefs-MDT0001_UUID 14
16 UP mdd lustrefs-MDD0001 lustrefs-MDD0001_UUID 3
17 UP osp lustrefs-MDT0000-osp-MDT0001 lustrefs-MDT0001-mdtlov_UUID 4
18 UP osp lustrefs-OST0002-osc-MDT0001 lustrefs-MDT0001-mdtlov_UUID 4
19 UP osp lustrefs-OST0003-osc-MDT0001 lustrefs-MDT0001-mdtlov_UUID 4
20 UP osp lustrefs-OST0000-osc-MDT0001 lustrefs-MDT0001-mdtlov_UUID 4
21 UP osp lustrefs-OST0001-osc-MDT0001 lustrefs-MDT0001-mdtlov_UUID 4
22 UP lwp lustrefs-MDT0000-lwp-MDT0001 lustrefs-MDT0000-lwp-MDT0001_UUID 4

Thanks,
Pinkesh Valdria
Oracle Cloud Infrastructure
+65-8932-3639 (m) - Singapore
+1-425-205-7834 (m) - USA
https://blogs.oracle.com/author/pinkesh-valdria
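For what it's worth, a minimal sketch of how reactivation is usually done, using the device names from the lctl dl output above (treat this as a sketch to verify, not a confirmed procedure, and note that if the OSTs were reformatted without --replace the MGS sees them as new targets and reactivation alone may not be enough):

# on the MDS, reactivate the osp/osc devices that point at the replaced OSTs
lctl --device lustrefs-OST0002-osc-MDT0000 activate
lctl --device lustrefs-OST0002-osc-MDT0001 activate
lctl --device lustrefs-OST0003-osc-MDT0000 activate
lctl --device lustrefs-OST0003-osc-MDT0001 activate

# on the client, the equivalent per-OSC switch
lctl set_param osc.lustrefs-OST0002-*.active=1
lctl set_param osc.lustrefs-OST0003-*.active=1

The argument to --device is the device name (or number) exactly as printed by lctl dl on that node, which is why the names differ between the client and the MDS.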
[lustre-discuss] Lustre using RDMA (RoCEv2)
This is my first attempt to configure Lustre for RDMA (Mellanox RoCEv2).

lnetctl net show
net:
    - net type: lo
      local NI(s):
        - nid: 0@lo
          status: up

The command below results in an error. The interface (ens800f0) is working and I can ping other nodes on that network.

lnetctl net add --net o2ib --if ens800f0
add:
    - net:
        errno: -100
        descr: "cannot add network: Network is down"

[root@inst-fknk9-relaxing-louse ~]# dmesg | tail
[ 1399.903159] Lustre: Lustre: Build Version: 2.12.6
[ 1427.411527] LNetError: 20092:0:(o2iblnd.c:2781:kiblnd_dev_failover()) Failed to bind ens800f0:192.168.169.112 to device( (null)): -19
[ 1427.564213] LNetError: 20092:0:(o2iblnd.c:3314:kiblnd_startup()) ko2iblnd: Can't initialize device: rc = -19
[ 1428.681259] LNetError: 105-4: Error -100 starting up LNI o2ib
[ 1474.343671] LNetError: 20260:0:(o2iblnd.c:2781:kiblnd_dev_failover()) Failed to bind ens800f0:192.168.169.112 to device( (null)): -19
[ 1474.496347] LNetError: 20260:0:(o2iblnd.c:3314:kiblnd_startup()) ko2iblnd: Can't initialize device: rc = -19
[ 1475.610993] LNetError: 105-4: Error -100 starting up LNI o2ib
[ 1535.441463] LNetError: 20549:0:(o2iblnd.c:2781:kiblnd_dev_failover()) Failed to bind ens800f0:192.168.169.112 to device( (null)): -19
[ 1535.594183] LNetError: 20549:0:(o2iblnd.c:3314:kiblnd_startup()) ko2iblnd: Can't initialize device: rc = -19
[ 1536.709841] LNetError: 105-4: Error -100 starting up LNI o2ib

Interface ens800f0 is the 100Gbps RDMA Mellanox NIC:

ip addr
1: lo: mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
   link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
   inet 127.0.0.1/8 scope host lo
      valid_lft forever preferred_lft forever
2: ens300f0: mtu 9000 qdisc mq state UP group default qlen 1000
   link/ether b8:ce:f6:25:ff:5e brd ff:ff:ff:ff:ff:ff
   inet 172.16.5.112/22 brd 172.16.7.255 scope global dynamic ens300f0
      valid_lft 84734sec preferred_lft 84734sec
3: ens300f1: mtu 1500 qdisc mq state UP group default qlen 1000
   link/ether b8:ce:f6:25:ff:5f brd ff:ff:ff:ff:ff:ff
4: ens800f0: mtu 1500 qdisc mq state UP group default qlen 1000
   link/ether 04:3f:72:e3:08:42 brd ff:ff:ff:ff:ff:ff
   inet 192.168.169.112/22 brd 192.168.171.255 scope global ens800f0
      valid_lft forever preferred_lft forever
5: ens800f1: mtu 1500 qdisc mq state DOWN group default qlen 1000
   link/ether 04:3f:72:e3:08:43 brd ff:ff:ff:ff:ff:ff

OS: RHCK 7.9, 3.10.0-1160.2.1.el7_lustre.x86_64
OFED: Mellanox, ofed_info -n: 4.9-3.1.5.0

cat /etc/lnet.conf is empty.

cat /etc/modprobe.d/lnet.conf
cat: /etc/modprobe.d/lnet.conf: No such file or directory

[root@inst-fknk9-relaxing-louse ~]# modprobe -v lustre
insmod /lib/modules/3.10.0-1160.2.1.el7_lustre.x86_64/extra/lustre/fs/obdclass.ko
insmod /lib/modules/3.10.0-1160.2.1.el7_lustre.x86_64/extra/lustre/fs/ptlrpc.ko
insmod /lib/modules/3.10.0-1160.2.1.el7_lustre.x86_64/extra/lustre/fs/fld.ko
insmod /lib/modules/3.10.0-1160.2.1.el7_lustre.x86_64/extra/lustre/fs/fid.ko
insmod /lib/modules/3.10.0-1160.2.1.el7_lustre.x86_64/extra/lustre/fs/lov.ko
insmod /lib/modules/3.10.0-1160.2.1.el7_lustre.x86_64/extra/lustre/fs/osc.ko
insmod /lib/modules/3.10.0-1160.2.1.el7_lustre.x86_64/extra/lustre/fs/mdc.ko
insmod /lib/modules/3.10.0-1160.2.1.el7_lustre.x86_64/extra/lustre/fs/lmv.ko
insmod /lib/modules/3.10.0-1160.2.1.el7_lustre.x86_64/extra/lustre/fs/lustre.ko
[root@inst-fknk9-relaxing-louse ~]#

Based on discussion threads I found via a Google search, one thread suggested adding the following, but I still get the same error.
echo 'options lnet networks="o2ib(ens800f0)"' > /etc/modprobe.d/lustre.conf
echo 'options lnet networks="o2ib(ens800f0)"' > /etc/modprobe.d/lnet.conf

Thanks,
Pinkesh Valdria
Principal Solutions Architect – HPC
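A minimal diagnostic sketch for the -19 (ENODEV) bind failure, based on my understanding that ko2iblnd needs an RDMA device bound to the Ethernet interface for RoCE (commands assume MOFED is installed and matches the running kernel):

# check that ens800f0 has an associated RDMA device; -19 usually means it does not
ibdev2netdev                  # MOFED utility; should show something like "mlx5_X port 1 ==> ens800f0 (Up)"
lsmod | grep mlx5_ib          # the mlx5 RDMA/verbs driver must be loaded
modprobe mlx5_ib              # load it if missing, then retry lnetctl net add

# once the net adds cleanly, it can be made persistent via /etc/lnet.conf
lnetctl net add --net o2ib --if ens800f0
lnetctl export > /etc/lnet.conf

Also worth checking that the Lustre client build matches the installed OFED: a client built against the in-kernel OFED will not bind to the MOFED mlx5 stack.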
Re: [lustre-discuss] MOFED & Lustre 2.14.51 - install fails with dependency failure related to ksym/MOFED
Running out of ideas. I also searched old messages on this list and on Google and found an unanswered question from August 2020 with the same problem: https://www.mail-archive.com/lustre-discuss@lists.lustre.org/msg16346.html

Hello Laura,

I tried your recommendation of passing mlnx_add_kernel_support.sh --kmp; my steps are below, but I still get the same ksym error for the Lustre clients. I also see that KMP support is still not enabled (maybe KMP support is only available on RedHat and SUSE, but not CentOS, Oracle Linux, etc. - based on this link: https://docs.mellanox.com/display/MLNXOFEDv461000/Installing+Mellanox+OFED).

Step 1: On the build server:

./mlnx_add_kernel_support.sh --make-tgz --verbose --yes --kernel 3.10.0-1160.15.2.el7_lustre.x86_64 --kernel-sources /usr/src/kernels/3.10.0-1160.15.2.el7_lustre.x86_64 --tmpdir /tmp --distro ol7.9 --mlnx_ofed /root/MLNX_OFED_LINUX-5.3-1.0.0.1-ol7.9-x86_64 --kmp
....
Detected MLNX_OFED_LINUX-5.3-1.0.0.1
....
Building MLNX_OFED_LINUX RPMS. Please wait...
....
Running MLNX_OFED_SRC-5.3-1.0.0.1/install.pl --tmpdir /tmp/mlnx_iso.7168_logs --kernel-only --kernel 3.10.0-1160.15.2.el7_lustre.x86_64 --kernel-sources /usr/src/kernels/3.10.0-1160.15.2.el7_lustre.x86_64 --builddir /tmp/mlnx_iso.7168 --build-only --distro ol7.9 --bump-kmp-version 202105221109
....
Creating metadata-rpms for 3.10.0-1160.15.2.el7_lustre.x86_64 ...
Created /tmp/MLNX_OFED_LINUX-5.3-1.0.0.1-ol7.9-x86_64-ext.tgz

Then:

./mlnxofedinstall --kernel 3.10.0-1160.15.2.el7_lustre.x86_64 --kernel-sources /usr/src/kernels/3.10.0-1160.15.2.el7_lustre.x86_64 --add-kernel-support --skip-repo --skip-distro-check --distro ol7.9 --kmp
.....
Installation finished successfully.
...
Updating / installing...
1:mlnx-fw-updater-5.3-1.0.0.1 # [100%]
Failed to update Firmware. See /tmp/MLNX_OFED_LINUX.235507.logs/fw_update.log
...
To load the new driver, run: /etc/init.d/openibd restart

Ran /etc/init.d/openibd restart
Unloading HCA driver: [ OK ]
Loading HCA driver and Access Layer: [ OK ]

Step 2: On the build server, create the Lustre client packages:

./configure --disable-server --enable-client \
  --with-linux=/usr/src/kernels/*_lustre.x86_64 \
  --with-o2ib=/usr/src/ofa_kernel/default
make rpms

Step 3: On the Lustre client node, install MOFED. Untar the MOFED package from Step 1 and run mlnxofedinstall (I tried running with and without --kmp, but got the same ksym error).

3a. Passing the --kmp parameter:
mlnxofedinstall --force --kernel 3.10.0-1160.15.2.el7_lustre.x86_64 --kernel-sources /usr/src/kernels/3.10.0-1160.15.2.el7_lustre.x86_64 --skip-distro-check --distro ol7.9 --kmp

3b. Not passing the --kmp parameter:
mlnxofedinstall --force --kernel 3.10.0-1160.15.2.el7_lustre.x86_64 --kernel-sources /usr/src/kernels/3.10.0-1160.15.2.el7_lustre.x86_64 --skip-distro-check --distro ol7.9

Step 4: On the Lustre client node:

yum localinstall lustre-client-2.14.51-1.el7.x86_64.rpm kmod-lustre-client-2.14.51-1.el7.x86_64.rpm
......
Error: Package: kmod-lustre-client-2.14.51-1.el7.x86_64 (/kmod-lustre-client-2.14.51-1.el7.x86_64)
       Requires: ksym(__ib_create_cq) = 0x1bb05802
Error: Package: kmod-lustre-client-2.14.51-1.el7.x86_64 (/kmod-lustre-client-2.14.51-1.el7.x86_64)
       Requires: ksym(rdma_listen) = 0xf6bd553e
.....

I tried 3 different scenarios:
1. Copied the Lustre client rpms from the build server to the Lustre client node and ran the above command - it failed.
2. On the Lustre client node - built the Lustre packages after running step 3a (mlnxofedinstall with --kmp).
3. On the Lustre client node - built the Lustre packages after running step 3b (mlnxofedinstall without --kmp).

Any other suggestions?
Sidenote: For the Lustre servers I found this workaround; I am not sure yet whether it will create issues once I mount and run Lustre. Previously I was installing Lustre on the servers with the command below and getting ksym errors:

sudo yum install lustre-tests -y

But if I use the following instead, the install works:

rpm -ivh --nodeps lustre-2.14.51-1.el7.x86_64.rpm kmod-lustre-2.14.51-1.el7.x86_64.rpm kmod-lustre-osd-ldiskfs-2.14.51-1.el7.x86_64.rpm lustre-osd-ldiskfs-mount-2.14.51-1.el7.x86_64.rpm lustre-resource-agents-2.14.51-1.el7.x86_64.rpm

I have tested "modprobe lnet" and "lnetctl net add", and the MGS/MGT mount works, but the MDT mount fails with "Invalid filesystem option set: dirdata,uninit_bg,^extents,dir_nlink,quota,project,huge_file,ea_inode,large_dir,flex_bg".

Thanks,
Pinkesh Valdria

From: Pinkesh Valdria
Date: Friday, May 21, 2021 at 6:04 PM
To: "lustre-discuss@lists.lustre.org"
Subject: MOFED & Lustre 2.14.51 - install fails with dependency failure related to ksym/MOFED

Sorry for the long email; I wanted to make sure I share enough details for the community to provide guidance. I am building all Lustre packages for Oracle Linux 7.9 (RHCK) and MOFED 5.3-1.0.0.1 using the steps described here: https
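One way to narrow down the ksym mismatch (a hedged sketch; the package name is taken from the MOFED rpm list later in this thread, and the symbol names come from the errors above):

# what the Lustre client kmod requires
rpm -qp --requires kmod-lustre-client-2.14.51-1.el7.x86_64.rpm | grep ksym | head

# what the installed MOFED kernel modules actually export
rpm -qa | grep -i -e mlnx-ofa_kernel -e kmod-mlnx
rpm -q --provides mlnx-ofa_kernel-modules | grep -e ib_create_cq -e rdma_listen

If the checksums on the two sides differ, or the provides are missing entirely (typical when the MOFED kmods were not built as KMP packages), yum refuses the install even though the modules might load, which is why the --nodeps workaround appears to work.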
[lustre-discuss] MOFED & Lustre 2.14.51 - install fails with dependency failure related to ksym/MOFED
Error: Package: kmod-lustre-2.14.51-1.el7.x86_64 (hpddLustreserver)
       Requires: ksym(rdma_disconnect) = 0x49262e62
Error: Package: kmod-lustre-2.14.51-1.el7.x86_64 (hpddLustreserver)
       Requires: ksym(rdma_connect_locked) = 0x7eaa4a8a
....
.... all ib/rdma related errors similar to the above for kmod-lustre ....
Error: Package: kmod-lustre-2.14.51-1.el7.x86_64 (hpddLustreserver)
       Requires: ksym(ib_destroy_cq_user) = 0x5671830b
You could try using --skip-broken to work around the problem
** Found 3 pre-existing rpmdb problem(s), 'yum check' output follows:
oracle-cloud-agent-1.11.1-5104.el7.x86_64 is a duplicate with oracle-cloud-agent-1.8.2-3843.el7.x86_64
rdma-core-devel-52mlnx1-1.53100.x86_64 has missing requires of pkgconfig(libnl-3.0)
rdma-core-devel-52mlnx1-1.53100.x86_64 has missing requires of pkgconfig(libnl-route-3.0)
[opc@inst-dwnv3-topical-goblin ~]$

RPMS from "LDISKFS and Patching the Linux Kernel":

ls lustre-kernel/RPMS/
* bpftool-3.10.0-1160.15.2.el7_lustre.x86_64.rpm
* bpftool-debuginfo-3.10.0-1160.15.2.el7_lustre.x86_64.rpm
* kernel-3.10.0-1160.15.2.el7_lustre.x86_64.rpm
* kernel-debug-3.10.0-1160.15.2.el7_lustre.x86_64.rpm
* kernel-debug-debuginfo-3.10.0-1160.15.2.el7_lustre.x86_64.rpm
* kernel-debug-devel-3.10.0-1160.15.2.el7_lustre.x86_64.rpm
* kernel-debuginfo-3.10.0-1160.15.2.el7_lustre.x86_64.rpm
* kernel-debuginfo-common-x86_64-3.10.0-1160.15.2.el7_lustre.x86_64.rpm
* kernel-devel-3.10.0-1160.15.2.el7_lustre.x86_64.rpm
* kernel-headers-3.10.0-1160.15.2.el7_lustre.x86_64.rpm
* kernel-tools-3.10.0-1160.15.2.el7_lustre.x86_64.rpm
* kernel-tools-debuginfo-3.10.0-1160.15.2.el7_lustre.x86_64.rpm
* kernel-tools-libs-3.10.0-1160.15.2.el7_lustre.x86_64.rpm
* kernel-tools-libs-devel-3.10.0-1160.15.2.el7_lustre.x86_64.rpm
* perf-3.10.0-1160.15.2.el7_lustre.x86_64.rpm
* perf-debuginfo-3.10.0-1160.15.2.el7_lustre.x86_64.rpm
* python-perf-3.10.0-1160.15.2.el7_lustre.x86_64.rpm
* python-perf-debuginfo-3.10.0-1160.15.2.el7_lustre.x86_64.rpm

MOFED rpms - steps followed: Download the source from the Mellanox site: MLNX_OFED_SRC-5.3-1.0.0.1.tgz

tar -zvxf $HOME/MLNX_OFED_SRC-5.3-1.0.0.1.tgz
cd MLNX_OFED_SRC-5.3-1.0.0.1/
./install.pl --build-only --kernel-only \
  --kernel 3.10.0-1160.15.2.el7.x86_64 \
  --kernel-sources /usr/src/kernels/3.10.0-1160.15.2.el7.x86_64
cp RPMS/*/*/*.rpm $HOME/releases/mofed

Question: I am passing the regular kernel (3.10.0-1160.15.2.el7.x86_64) and its sources (not the Lustre-patched kernel) as input to the MOFED install command above - I hope that is correct.
* kernel-mft-4.16.3-12.kver.3.10.0_1160.15.2.el7.x86_64.x86_64.rpm
* knem-1.1.4.90mlnx1-OFED.5.1.2.5.0.1.ol7u9.x86_64.rpm
* knem-modules-1.1.4.90mlnx1-OFED.5.1.2.5.0.1.kver.3.10.0_1160.15.2.el7.x86_64.x86_64.rpm
* mlnx-nfsrdma-5.3-OFED.5.3.0.3.8.1.kver.3.10.0_1160.15.2.el7.x86_64.x86_64.rpm
* mlnx-nfsrdma-debuginfo-5.3-OFED.5.3.0.3.8.1.kver.3.10.0_1160.15.2.el7.x86_64.x86_64.rpm
* mlnx-ofa_kernel-5.3-OFED.5.3.1.0.0.1.ol7u9.x86_64.rpm
* mlnx-ofa_kernel-debuginfo-5.3-OFED.5.3.1.0.0.1.ol7u9.x86_64.rpm
* mlnx-ofa_kernel-devel-5.3-OFED.5.3.1.0.0.1.ol7u9.x86_64.rpm
* mlnx-ofa_kernel-modules-5.3-OFED.5.3.1.0.0.1.kver.3.10.0_1160.15.2.el7.x86_64.x86_64.rpm
* ofed-scripts-5.3-OFED.5.3.1.0.0.x86_64.rpm

Lustre server packages:

./configure --enable-server \
  --with-linux=/usr/src/kernels/*_lustre.x86_64 \
  --with-o2ib=/usr/src/ofa_kernel/default
make rpms

* kmod-lustre-2.14.51-1.el7.x86_64.rpm
* kmod-lustre-osd-ldiskfs-2.14.51-1.el7.x86_64.rpm
* kmod-lustre-tests-2.14.51-1.el7.x86_64.rpm
* lustre-2.14.51-1.el7.x86_64.rpm
* lustre-2.14.51-1.src.rpm
* lustre-debuginfo-2.14.51-1.el7.x86_64.rpm
* lustre-devel-2.14.51-1.el7.x86_64.rpm
* lustre-iokit-2.14.51-1.el7.x86_64.rpm
* lustre-osd-ldiskfs-mount-2.14.51-1.el7.x86_64.rpm
* lustre-resource-agents-2.14.51-1.el7.x86_64.rpm
* lustre-tests-2.14.51-1.el7.x86_64.rpm

Lustre client packages:

./configure --disable-server --enable-client \
  --with-linux=/usr/src/kernels/*_lustre.x86_64 \
  --with-o2ib=/usr/src/ofa_kernel/default
make rpms

* kmod-lustre-client-2.14.51-1.el7.x86_64.rpm
* kmod-lustre-client-tests-2.14.51-1.el7.x86_64.rpm
* lustre-2.14.51-1.src.rpm
* lustre-client-2.14.51-1.el7.x86_64.rpm
* lustre-client-debuginfo-2.14.51-1.el7.x86_64.rpm
* lustre-client-devel-2.14.51-1.el7.x86_64.rpm
* lustre-client-tests-2.14.51-1.el7.x86_64.rpm
* lustre-iokit-2.14.51-1.el7.x86_64.rpm

Thanks,
Pinkesh Valdria
Principal Solutions Architect – HPC
Oracle Cloud Infrastructure
+65-8932-3639 (m) - Singapore
+1-425-205-7834 (m) - USA
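Since the ksym checksums the Lustre kmod records at build time come from the MOFED development tree, a quick sanity check before "make rpms" is to confirm that the o2ib directory passed to configure carries the symbols from the MOFED build the client will actually run (hedged sketch; the paths are the ones used in the configure lines above, and the Module.symvers location is my assumption about where the MOFED devel package puts it):

# the Module.symvers in the MOFED devel tree defines the ksym checksums Lustre will require
ls /usr/src/ofa_kernel/default/Module.symvers
grep -e ib_create_cq -e rdma_connect /usr/src/ofa_kernel/default/Module.symvers

# these should match what the MOFED kernel-module rpm installed on the client provides
rpm -q --provides mlnx-ofa_kernel-modules | grep ksym | head

If the client node's MOFED was built without KMP support, the provides list above will be empty and the kmod-lustre-client dependency check can never be satisfied, even when the modules themselves are compatible.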
Re: [lustre-discuss] [External] : lustre-discuss Digest, Vol 179, Issue 26
Nikitas,

Your steps were perfect. It worked. I am able to compile the client.

@Andreas Dilger - I am happy to add a wiki page with the steps I followed to get it compiled, or fix the existing page, if I can get a login account to update it. Similarly, I would like to add how to compile Lustre for Oracle Linux UEK kernels, if that's okay.

Thanks,
Pinkesh Valdria
Principal Solutions Architect – HPC
Oracle Cloud Infrastructure
+65-8932-3639 (m) - Singapore
+1-425-205-7834 (m) - USA

On 2/26/21, 12:46 AM, "lustre-discuss on behalf of lustre-discuss-requ...@lists.lustre.org" wrote:

Message: 1
Date: Fri, 26 Feb 2021 08:46:19 +0000
From: Nikitas Angelinas
To: Pinkesh Valdria
Cc: "lustre-discuss@lists.lustre.org"
Subject: Re: [lustre-discuss] Lustre Client compile on Ubuntu18.04 failing

Hi Pinkesh,

Could you please try running "make oldconfig" and then "make modules_prepare" in the kernel source after copying the .config? The latter command should generate the missing files /root/linux-oracle/include/generated/autoconf.h and /root/linux-oracle/include/linux/autoconf.h.

Cheers,
Nikitas

On 2/25/21, 10:14 PM, Pinkesh Valdria via lustre-discuss wrote:

Hello All,

I am trying to compile the Lustre client (2.13.57) on Ubuntu 18.04, following the steps listed here: https://wiki.whamcloud.com/pages/viewpage.action?pageId=63968116, but it is failing with the error below. Any pointer/advice on what I am missing?

uname -r
5.4.0-1035-oracle   # Using this one

cd /root
git clone https://git.launchpad.net/~canonical-kernel/ubuntu/+source/linux-oracle
cd linux-oracle/
git checkout Ubuntu-oracle-5.4-5.4.0-1035.38_18.04.1

BUILDPATH=/root
cd ${BUILDPATH}
git clone git://git.whamcloud.com/fs/lustre-release.git
cd lustre-release
git checkout 2.13.57
git reset --hard && git clean -dfx
sh autogen.sh
./configure --disable-server --with-linux=/root/linux-oracle
# above command fails, saying the .config file is missing

CONFIG_RETPOLINE=y
checking for Linux sources... /root/linux-oracle
checking for /root/linux-oracle... yes
checking for Linux objects... /root/linux-oracle
checking for /root/linux-oracle/.config... no
configure: error: Kernel config could not be found.

cp /boot/config-5.4.0-1035-oracle /root/linux-oracle/.config

# Re-ran it
./configure --disable-server --with-linux=/root/linux-oracle
checking for swig2.0... no
yes
checking whether to build Lustre client support... yes
dirname: missing operand
Try 'dirname --help' for more information.
checking whether mpitests can be built... no
checking whether to build Linux kernel modules... yes (linux-gnu)
find: '/usr/src/kernels/': No such file or directory
checking for Linux sources... /root/linux-oracle
checking for /root/linux-oracle... yes
checking for Linux objects... /root/linux-oracle
checking for /root/linux-oracle/.config... yes
checking for /boot/kernel.h... no
checking for /var/adm/running-kernel.h... no
checking for /root/linux-oracle/include/generated/autoconf.h... no
checking for /root/linux-oracle/include/linux/autoconf.h... no
configure: error: Run make config in /root/linux-oracle.
root@lustre-client-2-12-4-ubuntu1804:~/lustre-release#

Thanks,
Pinkesh Valdria
Principal Solut
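For the archive, a condensed sketch of the sequence that resolved this, per Nikitas's suggestion (the final "make debs" step is my assumption for producing Ubuntu packages; adjust as needed):

cd /root/linux-oracle
cp /boot/config-5.4.0-1035-oracle .config
make oldconfig            # answer prompts for any new config symbols
make modules_prepare      # generates include/generated/autoconf.h that configure looks for
cd /root/lustre-release
./configure --disable-server --with-linux=/root/linux-oracle
make debs                 # build Debian packages (or plain "make" for an in-tree build)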
[lustre-discuss] Lustre Client compile on Ubuntu18.04 failing
Hello All,

I am trying to compile the Lustre client (2.13.57) on Ubuntu 18.04, following the steps listed here: https://wiki.whamcloud.com/pages/viewpage.action?pageId=63968116, but it is failing with the error below. Any pointer/advice on what I am missing?

uname -r
5.4.0-1035-oracle   # Using this one

cd /root
git clone https://git.launchpad.net/~canonical-kernel/ubuntu/+source/linux-oracle
cd linux-oracle/
git checkout Ubuntu-oracle-5.4-5.4.0-1035.38_18.04.1

BUILDPATH=/root
cd ${BUILDPATH}
git clone git://git.whamcloud.com/fs/lustre-release.git
cd lustre-release
git checkout 2.13.57
git reset --hard && git clean -dfx
sh autogen.sh
./configure --disable-server --with-linux=/root/linux-oracle
# above command fails, saying the .config file is missing

CONFIG_RETPOLINE=y
checking for Linux sources... /root/linux-oracle
checking for /root/linux-oracle... yes
checking for Linux objects... /root/linux-oracle
checking for /root/linux-oracle/.config... no
configure: error: Kernel config could not be found.

cp /boot/config-5.4.0-1035-oracle /root/linux-oracle/.config

# Re-ran it
./configure --disable-server --with-linux=/root/linux-oracle
checking for swig2.0... no
yes
checking whether to build Lustre client support... yes
dirname: missing operand
Try 'dirname --help' for more information.
checking whether mpitests can be built... no
checking whether to build Linux kernel modules... yes (linux-gnu)
find: '/usr/src/kernels/': No such file or directory
checking for Linux sources... /root/linux-oracle
checking for /root/linux-oracle... yes
checking for Linux objects... /root/linux-oracle
checking for /root/linux-oracle/.config... yes
checking for /boot/kernel.h... no
checking for /var/adm/running-kernel.h... no
checking for /root/linux-oracle/include/generated/autoconf.h... no
checking for /root/linux-oracle/include/linux/autoconf.h... no
configure: error: Run make config in /root/linux-oracle.
root@lustre-client-2-12-4-ubuntu1804:~/lustre-release#

Thanks,
Pinkesh Valdria
Principal Solutions Architect – HPC
Oracle Cloud Infrastructure
+65-8932-3639 (m) - Singapore
+1-425-205-7834 (m) - USA
[lustre-discuss] Complete list of rules for PCC
I am looking for the various policy rules that can be applied to the Lustre Persistent Client Cache. In the docs I see the example below using projid, fname and uid. Where can I find a complete list of supported rules? Also, is there a way for PCC to only cache the contents of a few folders?

http://doc.lustre.org/lustre_manual.xhtml#pcc.design.rules

The following command adds a PCC backend on a client:

client# lctl pcc add /mnt/lustre /mnt/pcc --param "projid={500,1000}&fname={*.h5},uid=1001 rwid=2"

The first substring of the config parameter is the auto-cache rule, where "&" represents the logical AND operator while "," represents the logical OR operator. The example rule means that new files are only auto-cached if either of the following conditions is satisfied:
- The project ID is either 500 or 1000 and the suffix of the file name is "h5";
- The user ID is 1001.
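For reference, a hedged sketch of the rule keys I am aware of (uid, gid, projid and fname; treat anything beyond the manual's example as an assumption to verify against your Lustre version):

# auto-cache files owned by uid 1001 or 1002 that belong to project 1000
lctl pcc add /mnt/lustre /mnt/pcc --param "projid={1000}&uid={1001,1002} rwid=2"

# show the configured backends and their rules, and remove one
lctl pcc list /mnt/lustre
lctl pcc del /mnt/lustre /mnt/pcc

There is no directory-based rule key that I know of; one approximation is assigning a project ID to the folders you want cached (lfs project -p <id> -s -r <dir>) and matching on projid, since the project ID is inherited by new files created under those directories.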
[lustre-discuss] Bulk Attach/Detach - Lustre PCC (Persistent Client Cache)
I am new to the Lustre PCC (Persistent Client Cache) feature. I was looking at the PCC section of the Lustre documentation and found how I can attach or detach a single file from PCC: http://doc.lustre.org/lustre_manual.xhtml#pcc.operations.detach

Question: Is there a command that takes a folder so that all files under that folder and its sub-folders are attached or detached? How is everyone doing bulk attach or detach? Are folks using a custom script to traverse a directory tree (recursively) and call "lfs pcc detach <file>" for each file found, or is there another command that I missed in the docs?

Thanks,
Pinkesh Valdria
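In case it helps anyone searching the archive later, a hedged sketch of the scripted approach (it assumes the backend was added with rwid=2; I am not aware of a recursive flag for lfs pcc):

# attach every regular file under a directory tree, batching files per lfs invocation
find /mnt/lustre/mydir -type f -print0 | xargs -0 -n 64 lfs pcc attach -i 2

# detach them again
find /mnt/lustre/mydir -type f -print0 | xargs -0 -n 64 lfs pcc detach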
[lustre-discuss] Lustre (latest) access via NFSv4
Can Lustre be accessed via NFSv4? I know we can use NFSv3, but I wanted to ask about NFSv4 support.
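For context, a hedged sketch of how an NFS re-export is typically set up (a Lustre client acting as an NFS gateway; whether NFSv4 specifically works for a given Lustre version is exactly the question above, so treat the vers=4 line as unverified):

# on the gateway node, which already has Lustre mounted at /mnt/lustre
# /etc/exports (an fsid is required when exporting a network filesystem):
/mnt/lustre  *(rw,no_root_squash,fsid=1)
exportfs -ra

# on the NFS client:
mount -t nfs -o vers=3 gateway:/mnt/lustre /mnt/nfs      # the known-working NFSv3 case
mount -t nfs -o vers=4.0 gateway:/mnt/lustre /mnt/nfs    # the NFSv4 case being asked about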
[lustre-discuss] NFS Client Attributes caching - equivalent feature/config in Lustre
Does Lustre have mount options to mimic the NFS mount option behavior listed below? I know that in most cases Lustre performs much better than NFS and can scale to support many clients in parallel. I have a use case where only a few clients access the filesystem, the files are very small but number in the millions, and the files are very infrequently updated. The files are stored on an NFS server and mounted on the clients with the options below, which results in caching of file attributes/metadata on the client, reduces the number of metadata calls, and delivers better performance.

NFS mount options: type nfs (rw,nolock,nocto,actimeo=900,nfsvers=3,proto=tcp)

A custom proprietary application which compiles (make command) against some of these files takes 20-24 seconds to run. The same command run on the same files stored in a BeeGFS parallel filesystem takes 80-90 seconds (4x slower), mainly because there is no client-side attribute caching in BeeGFS and the client has to make far more metadata calls than NFS with cached file attributes.

Question: I have already tried BeeGFS, and I am asking this to determine whether Lustre performance would be better than NFS for very small file workloads (50 byte, 200 byte, 2KB files) with 5 million files spread across nested directories. Does Lustre have mount options to mimic the NFS mount option behavior listed below, or is there some optional feature in Lustre to achieve this caching behavior?

https://linux.die.net/man/5/nfs

ac / noac - Selects whether the client may cache file attributes. If neither option is specified (or if ac is specified), the client caches file attributes. For my custom applications, caching of file attributes is fine (no negative impact) and it helps to improve NFS performance.

actimeo=n - Using actimeo sets all of acregmin, acregmax, acdirmin, and acdirmax to the same value. If this option is not specified, the NFS client uses the defaults for each of these options. For my applications it's okay to cache file attributes/metadata for a few minutes (e.g. 5 minutes); it reduces the number of metadata calls made to the server, and especially with filesystems storing lots of small files it avoids a huge performance penalty.

nolock - When mounting servers that do not support the NLM protocol, or when mounting an NFS server through a firewall that blocks the NLM service port, specify the nolock mount option. Specifying the nolock option may also be advised to improve the performance of a proprietary application which runs on a single client and uses file locks extensively.

Appreciate any guidance.

Thanks,
Pinkesh Valdria
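A hedged note on the Lustre side: Lustre clients already cache attributes and data under DLM locks, so there is no actimeo equivalent to set; the knobs below are the ones I would look at for a small-file, make-style workload (a sketch of likely-relevant tunables, not a tuning recommendation):

lctl set_param llite.*.statahead_max=128      # asynchronous stat-ahead for readdir+stat patterns
lctl set_param mdc.*.max_rpcs_in_flight=64    # allow more concurrent metadata RPCs per MDT
lctl get_param llite.*.max_cached_mb          # size of the client-side page cache for file data

For lock behavior, the closest mount options I know of are flock/localflock/noflock, which control how userspace file locks are handled rather than cache consistency.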
[lustre-discuss] Lustre with 100 Gbps Mellanox CX5 card
Hello Lustre Community,

I am trying to configure Lustre for a 100 Gbps Mellanox CX5 card. I tried version 2.12.3 first, but it failed when I ran "lnetctl net add --net o2ib0 --if enp94s0f0", so I started looking at the Lustre binaries and found the repos below for IB. Is this a special build for Mellanox cards, or should I still be using the common Lustre binaries which are also used for tcp/ksocklnd networks?

[hpddLustreserver]
name=CentOS- - Lustre
baseurl=https://downloads.whamcloud.com/public/lustre/lustre-2.13.0-ib/MOFED-4.7-1.0.0.1/el7/server/
gpgcheck=0

[e2fsprogs]
name=CentOS- - Ldiskfs
baseurl=https://downloads.whamcloud.com/public/e2fsprogs/latest/el7/
gpgcheck=0

[hpddLustreclient]
name=CentOS- - Lustre
baseurl=https://downloads.whamcloud.com/public/lustre/lustre-2.13.0-ib/MOFED-4.7-1.0.0.1/el7/client/
gpgcheck=0

When I use the above repos, the command below returns success, but the options I pass do not take effect. NIC enp94s0f0 is my 100 Gbps card.

lnetctl net add --net o2ib0 --if enp94s0f0 --peer-timeout 100 --peer-credits 16 --credits 2560

Similarly, when I try to configure some options via /etc/modprobe.d/ko2iblnd.conf, they do not take effect and are not applied:

cat /etc/modprobe.d/ko2iblnd.conf
alias ko2iblnd ko2iblnd
options ko2iblnd map_on_demand=256 concurrent_sends=63 peercredits_hiw=31 fmr_pool_size=1280 fmr_flush_trigger=1024 fmr_cache=1

lnetctl net show -v --net o2ib
net:
    - net type: o2ib
      local NI(s):
        - nid: 192.168.1.2@o2ib
          status: up
          interfaces:
              0: enp94s0f0
          statistics:
              send_count: 0
              recv_count: 0
              drop_count: 0
          tunables:
              peer_timeout: 100
              peer_credits: 16
              peer_buffer_credits: 0
              credits: 2560
              peercredits_hiw: 8
              map_on_demand: 0
              concurrent_sends: 16
              fmr_pool_size: 512
              fmr_flush_trigger: 384
              fmr_cache: 1
              ntx: 512
              conns_per_peer: 1
          lnd tunables:
          dev cpt: 0
          tcp bonding: 0
          CPT: "[0,1]"
[root@inst-ran1f-lustre ~]#
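A hedged sketch of the ordering that usually matters here: ko2iblnd module options are only read when the module loads, so editing ko2iblnd.conf after LNet is already up has no effect, and the long options to lnetctl need plain double dashes ("--peer-credits", not the "–peer-credits" that word processors substitute):

# with no Lustre targets/clients mounted, unload the modules so the options are re-read
lustre_rmmod
cat /etc/modprobe.d/ko2iblnd.conf
  options ko2iblnd map_on_demand=256 concurrent_sends=63 peercredits_hiw=31 fmr_pool_size=1280 fmr_flush_trigger=1024 fmr_cache=1

modprobe lnet
lnetctl lnet configure
lnetctl net add --net o2ib0 --if enp94s0f0 --peer-timeout 100 --peer-credits 16 --credits 2560
lnetctl net show -v --net o2ib

One caveat I am not certain about: some builds (for example the MOFED-specific repos above) may clamp or override map_on_demand and the FMR settings depending on whether the card uses FMR or FastReg, so a value of 0 in the output is not necessarily a sign that the conf file was ignored.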
Re: [lustre-discuss] Degraded read performance with Large Bulk IO (16MB RPC)
To close the loop on this topic: the parameters below were not set by default, and hence they were not showing up in the lctl list_param commands. I had to set them first:

lctl set_param llite.*.max_read_ahead_mb=256
lctl set_param llite.*.max_read_ahead_per_file_mb=256

Thanks to the Lustre community for their help tuning Lustre; I was able to tune Lustre on Oracle Cloud Infrastructure to get good performance on bare metal nodes with 2x25Gbps networking. We have open sourced the deployment of Lustre on Oracle Cloud, as well as all the performance tuning done at the infrastructure level and the Lustre FS level, for everyone to benefit from:

https://github.com/oracle-quickstart/oci-lustre
Terraform files: https://github.com/oracle-quickstart/oci-lustre/tree/master/terraform
Tuning scripts: https://github.com/oracle-quickstart/oci-lustre/tree/master/scripts

As a next step, I plan to test deployment of Lustre on a 100 Gbps RoCEv2 RDMA network (Mellanox CX5).

Thanks,
Pinkesh Valdria
Oracle Cloud – Principal Solutions Architect
https://blogs.oracle.com/cloud-infrastructure/lustre-file-system-performance-on-oracle-cloud-infrastructure
https://blogs.oracle.com/author/pinkesh-valdria

From: lustre-discuss on behalf of Pinkesh Valdria
Date: Friday, December 13, 2019 at 11:14 AM
To: "Moreno Diego (ID SIS)", "lustre-discuss@lists.lustre.org"
Subject: Re: [lustre-discuss] Degraded read performance with Large Bulk IO (16MB RPC)

I ran the latest command you provided and it does not show the parameter, as you can see. I can do a screenshare.

[opc@lustre-client-1 ~]$ df -h
Filesystem              Size  Used Avail Use% Mounted on
/dev/sda3                39G  2.5G   36G   7% /
devtmpfs                158G     0  158G   0% /dev
tmpfs                   158G     0  158G   0% /dev/shm
tmpfs                   158G   17M  158G   1% /run
tmpfs                   158G     0  158G   0% /sys/fs/cgroup
/dev/sda1               512M   12M  501M   3% /boot/efi
10.0.3.6@tcp1:/lfsbv     50T   89M   48T   1% /mnt/mdt_bv
10.0.3.6@tcp1:/lfsnvme  185T  8.7M  176T   1% /mnt/mdt_nvme
tmpfs                    32G     0   32G   0% /run/user/1000

[opc@lustre-client-1 ~]$ lctl list_param -R llite | grep max_read_ahead
[opc@lustre-client-1 ~]$

So I ran this:
[opc@lustre-client-1 ~]$ lctl list_param -R llite > llite_parameters.txt
There are other parameters under llite. I attached the complete list.

From: "Moreno Diego (ID SIS)"
Date: Friday, December 13, 2019 at 8:36 AM
To: Pinkesh Valdria, "lustre-discuss@lists.lustre.org"
Subject: Re: [lustre-discuss] Degraded read performance with Large Bulk IO (16MB RPC)

From what I can see, I think you just ran the wrong command (lctl list_param -R *) or it doesn't work as you expected on 2.12.3. But the llite params are definitely there on a *mounted* Lustre client. This will give you the parameters you're looking for and need to modify to have, likely, better read performance:

lctl list_param -R llite | grep max_read_ahead

From: Pinkesh Valdria
Date: Friday, 13 December 2019 at 17:33
To: "Moreno Diego (ID SIS)", "lustre-discuss@lists.lustre.org"
Subject: Re: [lustre-discuss] Degraded read performance with Large Bulk IO (16MB RPC)

This is how I installed the Lustre clients (only showing the package install steps):

cat > /etc/yum.repos.d/lustre.repo << EOF
[hpddLustreserver]
name=CentOS- - Lustre
baseurl=https://downloads.whamcloud.com/public/lustre/latest-release/el7/server/
gpgcheck=0
[e2fsprogs]
name=CentOS- - Ldiskfs
baseurl=https://downloads.whamcloud.com/public/e2fsprogs/latest/el7/
gpgcheck=0
[hpddLustreclient]
name=CentOS- - Lustre
baseurl=https://downloads.whamcloud.com/public/lustre/latest-release/el7/client/
gpgcheck=0
EOF

yum install lustre-client -y
reboot

From: "Moreno Diego (ID SIS)"
Date: Friday, December 13, 2019 at 2:55 AM
To: Pinkesh Valdria, "lustre-discuss@lists.lustre.org"
Subject: Re: [lustre-discuss] Degraded read performance with Large Bulk IO (16MB RPC)

From what I can see they exist on my 2.12.3 client node:

[root@rufus4 ~]# lctl list_param -R llite | grep max_read_ahead
llite.reprofs-9f7c3b4a8800.max_read_ahead_mb
llite.reprofs-9f7c3b4a8800.max_read_ahead_per_file_mb
llite.reprofs-9f7c3b4a8800.max_read_ahead_whole_mb

Regards,
Diego

From: Pinkesh Valdria
Date: Wednesday, 11 December 2019 at 17:46
To: "Moreno Diego (ID SIS)", "lustre-discuss@lists.lustre.org"
Subject: Re: [lustre-discuss] Degraded read performance with Large Bulk IO (16MB RPC)

I was not able to find those parameters on my client nodes, OSS or MGS nodes. Here is how I was extracting all parameters.
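A small follow-up that may help others reading the archive: once a value that works has been found, it can be made persistent from the MGS instead of being re-applied on every client after each mount (hedged; lctl set_param -P needs a reasonably recent Lustre and is run on the MGS node):

mgs# lctl set_param -P llite.*.max_read_ahead_mb=256
mgs# lctl set_param -P llite.*.max_read_ahead_per_file_mb=256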
Re: [lustre-discuss] Lemur Lustre - make rpm fails
Hello Nathaniel,

As a workaround, is there an older Lemur rpm version or an older Lustre version I should use to unblock myself?

https://github.com/whamcloud/lemur/issues/7
https://github.com/whamcloud/lemur/issues/8

Thanks,
Pinkesh Valdria

On 12/11/19, 6:31 AM, "Pinkesh Valdria" wrote:

Hi Nathaniel,

I have an issue ticket opened: https://github.com/whamcloud/lemur/issues/7

I tried to do it locally, and that also fails; the error is below.

[root@lustre-client-4 lemur]# lfs --version
lfs 2.12.3
[root@lustre-client-4 lemur]# uname -a
Linux lustre-client-4 3.10.0-1062.7.1.el7.x86_64 #1 SMP Mon Dec 2 17:33:29 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux
[root@lustre-client-4 lemur]# lsb_release -r
Release: 7.6.1810
[root@lustre-client-4 lemur]#
[root@lustre-client-4 lemur]# make local-rpm
make -C packaging/rpm NAME=lemur VERSION=0.6.0_4_g4655df8 RELEASE=1 URL="https://github.com/intel-hpdd/lemur";
make[1]: Entering directory `/root/lemur/packaging/rpm'
cd ../../ && \
. github.com/intel-hpdd/lemur/vendor/github.com/aws/aws-sdk-go/service/s3/s3manager github.com/intel-hpdd/lemur/cmd/lhsm-plugin-s3
install -d $(dirname /root/rpmbuild/BUILDROOT/lemur-hsm-agent-0.6.0_4_g4655df8-1.x86_64//usr/bin/lhsm-plugin-s3)
install -m 755 lhsm-plugin-s3 /root/rpmbuild/BUILDROOT/lemur-hsm-agent-0.6.0_4_g4655df8-1.x86_64//usr/bin/lhsm-plugin-s3
go build -v -i -ldflags "-X 'main.version=0.6.0_4_g4655df8'" -o lhsm ./cmd/lhsm
github.com/intel-hpdd/lemur/vendor/github.com/intel-hpdd/go-lustre/pkg/pool
github.com/intel-hpdd/lemur/cmd/lhsmd/agent/fileid
github.com/intel-hpdd/lemur/vendor/github.com/intel-hpdd/go-lustre/llapi
github.com/intel-hpdd/lemur/vendor/gopkg.in/yaml.v2
github.com/intel-hpdd/lemur/vendor/gopkg.in/urfave/cli.v1
# github.com/intel-hpdd/lemur/vendor/github.com/intel-hpdd/go-lustre/llapi
cgo-gcc-prolog: In function '_cgo_c110903d49cd_C2func_llapi_get_version':
cgo-gcc-prolog:58:2: warning: 'llapi_get_version' is deprecated (declared at /usr/include/lustre/lustreapi.h:398) [-Wdeprecated-declarations]
cgo-gcc-prolog: In function '_cgo_c110903d49cd_Cfunc_llapi_get_version':
cgo-gcc-prolog:107:2: warning: 'llapi_get_version' is deprecated (declared at /usr/include/lustre/lustreapi.h:398) [-Wdeprecated-declarations]
# github.com/intel-hpdd/lemur/vendor/github.com/intel-hpdd/go-lustre/llapi
vendor/github.com/intel-hpdd/go-lustre/llapi/changelog.go:273:39: cannot use _Ctype_int(r.flags) (type _Ctype_int) as type int32 in argument to _Cfunc_hsm_get_cl_flags
make[2]: *** [lhsm] Error 2
make[2]: Leaving directory `/root/rpmbuild/BUILD/lemur-0.6.0_4_g4655df8/src/github.com/intel-hpdd/lemur'
error: Bad exit status from /var/tmp/rpm-tmp.cPPeEL (%install)
RPM build errors:
Bad exit status from /var/tmp/rpm-tmp.cPPeEL (%install)
make[1]: *** [rpm] Error 1
make[1]: Leaving directory `/root/lemur/packaging/rpm'
make: *** [local-rpm] Error 2
[root@lustre-client-4 lemur]#

Thanks,
Pinkesh Valdria

On 12/10/19, 4:55 AM, "lustre-discuss on behalf of Nathaniel Clark" wrote:

Can you open a ticket for this on https://github.com/whamcloud/lemur/issues and possibly https://jira.whamcloud.com/projects/LMR

You could also try:

$ make local-rpm

Which will avoid the docker stack and just build on the local machine (beware: it sudo's to install rpm build dependencies).
Re: [lustre-discuss] Degraded read performance with Large Bulk IO (16MB RPC)
I ran the latest command you provided and it does not show the parameter, as you can see. I can do a screenshare.

[opc@lustre-client-1 ~]$ df -h
Filesystem              Size  Used Avail Use% Mounted on
/dev/sda3                39G  2.5G   36G   7% /
devtmpfs                158G     0  158G   0% /dev
tmpfs                   158G     0  158G   0% /dev/shm
tmpfs                   158G   17M  158G   1% /run
tmpfs                   158G     0  158G   0% /sys/fs/cgroup
/dev/sda1               512M   12M  501M   3% /boot/efi
10.0.3.6@tcp1:/lfsbv     50T   89M   48T   1% /mnt/mdt_bv
10.0.3.6@tcp1:/lfsnvme  185T  8.7M  176T   1% /mnt/mdt_nvme
tmpfs                    32G     0   32G   0% /run/user/1000

[opc@lustre-client-1 ~]$ lctl list_param -R llite | grep max_read_ahead
[opc@lustre-client-1 ~]$

So I ran this:
[opc@lustre-client-1 ~]$ lctl list_param -R llite > llite_parameters.txt
There are other parameters under llite. I attached the complete list.

From: "Moreno Diego (ID SIS)"
Date: Friday, December 13, 2019 at 8:36 AM
To: Pinkesh Valdria, "lustre-discuss@lists.lustre.org"
Subject: Re: [lustre-discuss] Degraded read performance with Large Bulk IO (16MB RPC)

From what I can see, I think you just ran the wrong command (lctl list_param -R *) or it doesn't work as you expected on 2.12.3. But the llite params are definitely there on a *mounted* Lustre client. This will give you the parameters you're looking for and need to modify to have, likely, better read performance:

lctl list_param -R llite | grep max_read_ahead

From: Pinkesh Valdria
Date: Friday, 13 December 2019 at 17:33
To: "Moreno Diego (ID SIS)", "lustre-discuss@lists.lustre.org"
Subject: Re: [lustre-discuss] Degraded read performance with Large Bulk IO (16MB RPC)

This is how I installed the Lustre clients (only showing the package install steps):

cat > /etc/yum.repos.d/lustre.repo << EOF
[hpddLustreserver]
name=CentOS- - Lustre
baseurl=https://downloads.whamcloud.com/public/lustre/latest-release/el7/server/
gpgcheck=0
[e2fsprogs]
name=CentOS- - Ldiskfs
baseurl=https://downloads.whamcloud.com/public/e2fsprogs/latest/el7/
gpgcheck=0
[hpddLustreclient]
name=CentOS- - Lustre
baseurl=https://downloads.whamcloud.com/public/lustre/latest-release/el7/client/
gpgcheck=0
EOF

yum install lustre-client -y
reboot

From: "Moreno Diego (ID SIS)"
Date: Friday, December 13, 2019 at 2:55 AM
To: Pinkesh Valdria, "lustre-discuss@lists.lustre.org"
Subject: Re: [lustre-discuss] Degraded read performance with Large Bulk IO (16MB RPC)

From what I can see they exist on my 2.12.3 client node:

[root@rufus4 ~]# lctl list_param -R llite | grep max_read_ahead
llite.reprofs-9f7c3b4a8800.max_read_ahead_mb
llite.reprofs-9f7c3b4a8800.max_read_ahead_per_file_mb
llite.reprofs-ffff9f7c3b4a8800.max_read_ahead_whole_mb

Regards,
Diego

From: Pinkesh Valdria
Date: Wednesday, 11 December 2019 at 17:46
To: "Moreno Diego (ID SIS)", "lustre-discuss@lists.lustre.org"
Subject: Re: [lustre-discuss] Degraded read performance with Large Bulk IO (16MB RPC)

I was not able to find those parameters on my client nodes, OSS or MGS nodes. Here is how I was extracting all parameters:

mkdir -p lctl_list_param_R/
cd lctl_list_param_R/
lctl list_param -R * > lctl_list_param_R

[opc@lustre-client-1 lctl_list_param_R]$ less lctl_list_param_R | grep ahead
llite.lfsbv-98231c3bc000.statahead_agl
llite.lfsbv-98231c3bc000.statahead_max
llite.lfsbv-98231c3bc000.statahead_running_max
llite.lfsnvme-98232c30e000.statahead_agl
llite.lfsnvme-98232c30e000.statahead_max
llite.lfsnvme-98232c30e000.statahead_running_max
[opc@lustre-client-1 lctl_list_param_R]$

I also tried these commands:

Not working (on client nodes):
lctl get_param llite.lfsbv-*.max_read_ahead_mb
error: get_param: param_path 'llite/lfsbv-*/max_read_ahead_mb': No such file or directory
[opc@lustre-client-1 lctl_list_param_R]$

Works (on client nodes):
lctl get_param llite.*.statahead_agl
llite.lfsbv-98231c3bc000.statahead_agl=1
llite.lfsnvme-98232c30e000.statahead_agl=1
[opc@lustre-client-1 lctl_list_param_R]$

From: "Moreno Diego (ID SIS)"
Date: Tuesday, December 10, 2019 at 2:06 AM
To: Pinkesh Valdria, "lustre-discuss@lists.lustre.org"
Subject: Re: [lustre-discuss] Degraded read performance with Large Bulk IO (16MB RPC)

With that kind of read performance degradation I would immediately think of llite's max_read_ahead parameters on the client. Specifically these 2:

max_read_ahead_mb: total amount of MB allocated for read ahead, usually quite low for bandwidth benchmarking purposes and when there are several files per client
max_read_ahead_per_file_mb
Re: [lustre-discuss] Degraded read performance with Large Bulk IO (16MB RPC)
This is how I installed the Lustre clients (only showing the package install steps):

cat > /etc/yum.repos.d/lustre.repo << EOF
[hpddLustreserver]
name=CentOS- - Lustre
baseurl=https://downloads.whamcloud.com/public/lustre/latest-release/el7/server/
gpgcheck=0
[e2fsprogs]
name=CentOS- - Ldiskfs
baseurl=https://downloads.whamcloud.com/public/e2fsprogs/latest/el7/
gpgcheck=0
[hpddLustreclient]
name=CentOS- - Lustre
baseurl=https://downloads.whamcloud.com/public/lustre/latest-release/el7/client/
gpgcheck=0
EOF

yum install lustre-client -y
reboot

From: "Moreno Diego (ID SIS)"
Date: Friday, December 13, 2019 at 2:55 AM
To: Pinkesh Valdria, "lustre-discuss@lists.lustre.org"
Subject: Re: [lustre-discuss] Degraded read performance with Large Bulk IO (16MB RPC)

From what I can see they exist on my 2.12.3 client node:

[root@rufus4 ~]# lctl list_param -R llite | grep max_read_ahead
llite.reprofs-9f7c3b4a8800.max_read_ahead_mb
llite.reprofs-9f7c3b4a8800.max_read_ahead_per_file_mb
llite.reprofs-9f7c3b4a8800.max_read_ahead_whole_mb

Regards,
Diego

From: Pinkesh Valdria
Date: Wednesday, 11 December 2019 at 17:46
To: "Moreno Diego (ID SIS)", "lustre-discuss@lists.lustre.org"
Subject: Re: [lustre-discuss] Degraded read performance with Large Bulk IO (16MB RPC)

I was not able to find those parameters on my client nodes, OSS or MGS nodes. Here is how I was extracting all parameters:

mkdir -p lctl_list_param_R/
cd lctl_list_param_R/
lctl list_param -R * > lctl_list_param_R

[opc@lustre-client-1 lctl_list_param_R]$ less lctl_list_param_R | grep ahead
llite.lfsbv-98231c3bc000.statahead_agl
llite.lfsbv-98231c3bc000.statahead_max
llite.lfsbv-98231c3bc000.statahead_running_max
llite.lfsnvme-98232c30e000.statahead_agl
llite.lfsnvme-98232c30e000.statahead_max
llite.lfsnvme-98232c30e000.statahead_running_max
[opc@lustre-client-1 lctl_list_param_R]$

I also tried these commands:

Not working (on client nodes):
lctl get_param llite.lfsbv-*.max_read_ahead_mb
error: get_param: param_path 'llite/lfsbv-*/max_read_ahead_mb': No such file or directory
[opc@lustre-client-1 lctl_list_param_R]$

Works (on client nodes):
lctl get_param llite.*.statahead_agl
llite.lfsbv-98231c3bc000.statahead_agl=1
llite.lfsnvme-98232c30e000.statahead_agl=1
[opc@lustre-client-1 lctl_list_param_R]$

From: "Moreno Diego (ID SIS)"
Date: Tuesday, December 10, 2019 at 2:06 AM
To: Pinkesh Valdria, "lustre-discuss@lists.lustre.org"
Subject: Re: [lustre-discuss] Degraded read performance with Large Bulk IO (16MB RPC)

With that kind of read performance degradation I would immediately think of llite's max_read_ahead parameters on the client. Specifically these 2:

max_read_ahead_mb: total amount of MB allocated for read ahead, usually quite low for bandwidth benchmarking purposes and when there are several files per client
max_read_ahead_per_file_mb: the default is quite low for 16MB RPCs (only a few RPCs per file)

You probably need to check the effect of increasing both of them.

Regards,
Diego

From: lustre-discuss on behalf of Pinkesh Valdria
Date: Tuesday, 10 December 2019 at 09:40
To: "lustre-discuss@lists.lustre.org"
Subject: [lustre-discuss] Degraded read performance with Large Bulk IO (16MB RPC)

I was expecting better or the same read performance with Large Bulk IO (16MB RPCs), but I see a degradation in performance. Do I need to tune any other parameter to benefit from Large Bulk IO? I would appreciate any pointers to troubleshoot further.

Throughput before
- Read: 2563 MB/s
- Write: 2585 MB/s

Throughput after
- Read: 1527 MB/s (down by ~1025)
- Write: 2859 MB/s

Changes I did are:

On OSS:
- lctl set_param obdfilter.lfsbv-*.brw_size=16

On clients (unmounted and remounted):
- lctl set_param osc.lfsbv-OST*.max_pages_per_rpc=4096 (got auto-updated after re-mount)
- lctl set_param osc.*.max_rpcs_in_flight=64 (had to manually increase this to 64, since after re-mount it was auto-set to 8, and read/write performance was poor)
- lctl set_param osc.*.max_dirty_mb=2040 (setting the value to 2048 was failing with a "Numerical result out of range" error; previously it was set to 2000 when I got good performance)

My other settings:
- lnetctl net add --net tcp1 --if $interface --peer-timeout 180 --peer-credits 128 --credits 1024
- echo "options ksocklnd nscheds=10 sock_timeout=100 credits=2560 peer_credits=63 enable_irq_affinity=0" > /etc/modprobe.d/ksocklnd.conf
- lfs setstripe -c 1 -S 1M /mnt/mdt_bv/test1
Re: [lustre-discuss] Degraded read performance with Large Bulk IO (16MB RPC)
I was not able to find those parameters on my client nodes, OSS or MGS nodes. Here is how I was extracting all parameters:

mkdir -p lctl_list_param_R/
cd lctl_list_param_R/
lctl list_param -R * > lctl_list_param_R

[opc@lustre-client-1 lctl_list_param_R]$ less lctl_list_param_R | grep ahead
llite.lfsbv-98231c3bc000.statahead_agl
llite.lfsbv-98231c3bc000.statahead_max
llite.lfsbv-98231c3bc000.statahead_running_max
llite.lfsnvme-98232c30e000.statahead_agl
llite.lfsnvme-98232c30e000.statahead_max
llite.lfsnvme-98232c30e000.statahead_running_max
[opc@lustre-client-1 lctl_list_param_R]$

I also tried these commands:

Not working (on client nodes):
lctl get_param llite.lfsbv-*.max_read_ahead_mb
error: get_param: param_path 'llite/lfsbv-*/max_read_ahead_mb': No such file or directory
[opc@lustre-client-1 lctl_list_param_R]$

Works (on client nodes):
lctl get_param llite.*.statahead_agl
llite.lfsbv-98231c3bc000.statahead_agl=1
llite.lfsnvme-98232c30e000.statahead_agl=1
[opc@lustre-client-1 lctl_list_param_R]$

From: "Moreno Diego (ID SIS)"
Date: Tuesday, December 10, 2019 at 2:06 AM
To: Pinkesh Valdria, "lustre-discuss@lists.lustre.org"
Subject: Re: [lustre-discuss] Degraded read performance with Large Bulk IO (16MB RPC)

With that kind of read performance degradation I would immediately think of llite's max_read_ahead parameters on the client. Specifically these 2:

max_read_ahead_mb: total amount of MB allocated for read ahead, usually quite low for bandwidth benchmarking purposes and when there are several files per client
max_read_ahead_per_file_mb: the default is quite low for 16MB RPCs (only a few RPCs per file)

You probably need to check the effect of increasing both of them.

Regards,
Diego

From: lustre-discuss on behalf of Pinkesh Valdria
Date: Tuesday, 10 December 2019 at 09:40
To: "lustre-discuss@lists.lustre.org"
Subject: [lustre-discuss] Degraded read performance with Large Bulk IO (16MB RPC)

I was expecting better or the same read performance with Large Bulk IO (16MB RPCs), but I see a degradation in performance. Do I need to tune any other parameter to benefit from Large Bulk IO? I would appreciate any pointers to troubleshoot further.

Throughput before
- Read: 2563 MB/s
- Write: 2585 MB/s

Throughput after
- Read: 1527 MB/s (down by ~1025)
- Write: 2859 MB/s

Changes I did are:

On OSS:
- lctl set_param obdfilter.lfsbv-*.brw_size=16

On clients (unmounted and remounted):
- lctl set_param osc.lfsbv-OST*.max_pages_per_rpc=4096 (got auto-updated after re-mount)
- lctl set_param osc.*.max_rpcs_in_flight=64 (had to manually increase this to 64, since after re-mount it was auto-set to 8, and read/write performance was poor)
- lctl set_param osc.*.max_dirty_mb=2040 (setting the value to 2048 was failing with a "Numerical result out of range" error; previously it was set to 2000 when I got good performance)

My other settings:
- lnetctl net add --net tcp1 --if $interface --peer-timeout 180 --peer-credits 128 --credits 1024
- echo "options ksocklnd nscheds=10 sock_timeout=100 credits=2560 peer_credits=63 enable_irq_affinity=0" > /etc/modprobe.d/ksocklnd.conf
- lfs setstripe -c 1 -S 1M /mnt/mdt_bv/test1
Re: [lustre-discuss] Lemur Lustre - make rpm fails
Hi Nathaniel,

I have an issue ticket opened: https://github.com/whamcloud/lemur/issues/7

I tried to do it locally, and that also fails; the error is below.

[root@lustre-client-4 lemur]# lfs --version
lfs 2.12.3
[root@lustre-client-4 lemur]# uname -a
Linux lustre-client-4 3.10.0-1062.7.1.el7.x86_64 #1 SMP Mon Dec 2 17:33:29 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux
[root@lustre-client-4 lemur]# lsb_release -r
Release: 7.6.1810
[root@lustre-client-4 lemur]#
[root@lustre-client-4 lemur]# make local-rpm
make -C packaging/rpm NAME=lemur VERSION=0.6.0_4_g4655df8 RELEASE=1 URL="https://github.com/intel-hpdd/lemur";
make[1]: Entering directory `/root/lemur/packaging/rpm'
cd ../../ && \
. github.com/intel-hpdd/lemur/vendor/github.com/aws/aws-sdk-go/service/s3/s3manager github.com/intel-hpdd/lemur/cmd/lhsm-plugin-s3
install -d $(dirname /root/rpmbuild/BUILDROOT/lemur-hsm-agent-0.6.0_4_g4655df8-1.x86_64//usr/bin/lhsm-plugin-s3)
install -m 755 lhsm-plugin-s3 /root/rpmbuild/BUILDROOT/lemur-hsm-agent-0.6.0_4_g4655df8-1.x86_64//usr/bin/lhsm-plugin-s3
go build -v -i -ldflags "-X 'main.version=0.6.0_4_g4655df8'" -o lhsm ./cmd/lhsm
github.com/intel-hpdd/lemur/vendor/github.com/intel-hpdd/go-lustre/pkg/pool
github.com/intel-hpdd/lemur/cmd/lhsmd/agent/fileid
github.com/intel-hpdd/lemur/vendor/github.com/intel-hpdd/go-lustre/llapi
github.com/intel-hpdd/lemur/vendor/gopkg.in/yaml.v2
github.com/intel-hpdd/lemur/vendor/gopkg.in/urfave/cli.v1
# github.com/intel-hpdd/lemur/vendor/github.com/intel-hpdd/go-lustre/llapi
cgo-gcc-prolog: In function '_cgo_c110903d49cd_C2func_llapi_get_version':
cgo-gcc-prolog:58:2: warning: 'llapi_get_version' is deprecated (declared at /usr/include/lustre/lustreapi.h:398) [-Wdeprecated-declarations]
cgo-gcc-prolog: In function '_cgo_c110903d49cd_Cfunc_llapi_get_version':
cgo-gcc-prolog:107:2: warning: 'llapi_get_version' is deprecated (declared at /usr/include/lustre/lustreapi.h:398) [-Wdeprecated-declarations]
# github.com/intel-hpdd/lemur/vendor/github.com/intel-hpdd/go-lustre/llapi
vendor/github.com/intel-hpdd/go-lustre/llapi/changelog.go:273:39: cannot use _Ctype_int(r.flags) (type _Ctype_int) as type int32 in argument to _Cfunc_hsm_get_cl_flags
make[2]: *** [lhsm] Error 2
make[2]: Leaving directory `/root/rpmbuild/BUILD/lemur-0.6.0_4_g4655df8/src/github.com/intel-hpdd/lemur'
error: Bad exit status from /var/tmp/rpm-tmp.cPPeEL (%install)
RPM build errors:
Bad exit status from /var/tmp/rpm-tmp.cPPeEL (%install)
make[1]: *** [rpm] Error 1
make[1]: Leaving directory `/root/lemur/packaging/rpm'
make: *** [local-rpm] Error 2
[root@lustre-client-4 lemur]#

Thanks,
Pinkesh Valdria

On 12/10/19, 4:55 AM, "lustre-discuss on behalf of Nathaniel Clark" wrote:

Can you open a ticket for this on https://github.com/whamcloud/lemur/issues and possibly https://jira.whamcloud.com/projects/LMR

You could also try:

$ make local-rpm

Which will avoid the docker stack and just build on the local machine (beware: it sudo's to install rpm build dependencies).

--
Nathaniel Clark
Senior Engineer
Whamcloud / DDN

On Mon, 2019-12-09 at 15:04 -0800, Pinkesh Valdria wrote:
> I am trying to install Lemur on CentOS 7.6 (7.6.1810) to integrate
> with Object storage but the install fails. I used the instructions
> on the below page to install. I already had Lustre client (2.12.3)
> installed on the machine, so I started with the steps for Lemur.
>
> https://wiki.whamcloud.com/display/PUB/HPDD+HSM+Agent+and+Data+Movers+%28Lemur%29+Getting+Started+Guide
>
> Steps followed:
>
> git clone https://github.com/whamcloud/lemur.git
[lustre-discuss] Degraded read performance with Large Bulk IO (16MB RPC)
I was expecting better or the same read performance with Large Bulk IO (16MB RPC), but I see a degradation in performance. Do I need to tune any other parameter to benefit from Large Bulk IO? I'd appreciate any pointers to troubleshoot further. Throughput before: Read: 2563 MB/s, Write: 2585 MB/s. Throughput after: Read: 1527 MB/s (down by ~1036 MB/s), Write: 2859 MB/s. The changes I made are: On the OSS: lctl set_param obdfilter.lfsbv-*.brw_size=16 On the clients (unmounted and remounted): lctl set_param osc.lfsbv-OST*.max_pages_per_rpc=4096 (got auto-updated after re-mount) lctl set_param osc.*.max_rpcs_in_flight=64 (had to manually increase this to 64, since after re-mount it was auto-set to 8 and read/write performance was poor) lctl set_param osc.*.max_dirty_mb=2040 (setting the value to 2048 failed with "Numerical result out of range"; previously it was set to 2000 when I got good performance). My other settings: lnetctl net add --net tcp1 --if $interface --peer-timeout 180 --peer-credits 128 --credits 1024 echo "options ksocklnd nscheds=10 sock_timeout=100 credits=2560 peer_credits=63 enable_irq_affinity=0" > /etc/modprobe.d/ksocklnd.conf lfs setstripe -c 1 -S 1M /mnt/mdt_bv/test1 ___ lustre-discuss mailing list lustre-discuss@lists.lustre.org http://lists.lustre.org/listinfo.cgi/lustre-discuss-lustre.org
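A quick way to check whether the larger RPCs are actually being negotiated and used after the remount (a minimal sketch; the fsname lfsbv is taken from the message above, adjust to your setup):
  # On the OSS: confirm the per-OST bulk size, in MB
  lctl get_param obdfilter.lfsbv-*.brw_size
  # On a remounted client: confirm the negotiated RPC size and look at the
  # pages-per-RPC histogram to see whether 16MB (4096-page) RPCs are issued
  lctl get_param osc.lfsbv-OST*.max_pages_per_rpc
  lctl get_param osc.lfsbv-OST*.rpc_stats
If most read RPCs in rpc_stats are still much smaller than 4096 pages, the client readahead settings (for example llite.*.max_read_ahead_mb and llite.*.max_read_ahead_per_file_mb) are worth checking as well, since readahead rather than brw_size often limits the read RPC size.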
[lustre-discuss] Lemur Lustre - make rpm fails
I am trying to install Lemur on CentOS 7.6 (7.6.1810) to integrate with Object storage but the install fails. I used the instructions on below page to install. I already had Lustre client (2.12.3) installed on the machine, so I started with steps for Lemur. https://wiki.whamcloud.com/display/PUB/HPDD+HSM+Agent+and+Data+Movers+%28Lemur%29+Getting+Started+Guide Steps followed: git clone https://github.com/whamcloud/lemur.git cd lemur git checkout master service docker start make rpm [root@lustre-client-4 lemur]# make rpm make -C packaging/docker make[1]: Entering directory `/root/lemur/packaging/docker' make[2]: Entering directory `/root/lemur/packaging/docker/go-el7' Building go-el7/1.13.5-1.fc32 for 1.13.5-1.fc32 docker build -t go-el7:1.13.5-1.fc32 -t go-el7:latest --build-arg=go_version=1.13.5-1.fc32 --build-arg=go_macros_version=3.0.8-4.fc31 . Sending build context to Docker daemon 4.608 kB Step 1/9 : FROM centos:7 ---> 5e35e350aded Step 2/9 : MAINTAINER Robert Read ---> Using cache ---> 4be0d7fa27a2 Step 3/9 : RUN yum install -y @development golang pcre-devel glibc-static which ---> Using cache ---> ac83254f37f7 Step 4/9 : RUN mkdir -p /go/src /go/bin && chmod -R 777 /go ---> Using cache ---> fdbb4d031716 Step 5/9 : ENV GOPATH /go PATH $GOPATH/bin:$PATH ---> Using cache ---> 216c5484727e Step 6/9 : RUN go get github.com/tools/godep && cp /go/bin/godep /usr/local/bin ---> Running in aed86ac3eb87 /bin/sh: go: command not found The command '/bin/sh -c go get github.com/tools/godep && cp /go/bin/godep /usr/local/bin' returned a non-zero code: 127 make[2]: *** [go-el7/1.13.5-1.fc32] Error 127 make[2]: Leaving directory `/root/lemur/packaging/docker/go-el7' make[1]: *** [go-el7] Error 2 make[1]: Leaving directory `/root/lemur/packaging/docker' make: *** [docker] Error 2 [root@lustre-client-4 lemur]# Is this repo for Lemur the most updated version? [root@lustre-client-4 lemur]# lfs --version lfs 2.12.3 [root@lustre-client-4 lemur]# ___ lustre-discuss mailing list lustre-discuss@lists.lustre.org http://lists.lustre.org/listinfo.cgi/lustre-discuss-lustre.org
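The "go: command not found" in the go-el7 image suggests no usable Go toolchain was actually available inside the container. One possible workaround (an assumption, not something confirmed in this thread) is to skip the Docker images entirely: install a Go toolchain by hand and use the local-rpm target suggested in the reply at the top of this thread.
  # Hypothetical workaround: install Go 1.13.5 manually, then build locally
  curl -LO https://dl.google.com/go/go1.13.5.linux-amd64.tar.gz
  tar -C /usr/local -xzf go1.13.5.linux-amd64.tar.gz
  export PATH=$PATH:/usr/local/go/bin
  go version        # should report go1.13.5
  make local-rpm    # builds on this machine instead of inside Docker
This only gets past the missing Go toolchain; as the reply above shows, make local-rpm can still fail on a separate cgo type error in the vendored go-lustre code.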
Re: [lustre-discuss] Lnet Self Test
0 0 S 3.6 0.0 81:30.26 socknal_sd01_03 551 root 20 0 0 0 0 S 2.6 0.0 39:24.00 kswapd0 60860 root 20 0 0 0 0 S 2.3 0.0 30:54.35 socknal_sd00_01 60864 root 20 0 0 0 0 S 2.3 0.0 30:58.20 socknal_sd00_05 64426 root 20 0 0 0 0 S 2.3 0.0 7:28.65 ll_ost_io01_102 60859 root 20 0 0 0 0 S 2.0 0.0 30:56.70 socknal_sd00_00 60861 root 20 0 0 0 0 S 2.0 0.0 30:54.97 socknal_sd00_02 60862 root 20 0 0 0 0 S 2.0 0.0 30:56.06 socknal_sd00_03 60863 root 20 0 0 0 0 S 2.0 0.0 30:56.32 socknal_sd00_04 64334 root 20 0 0 0 0 D 1.3 0.0 7:19.46 ll_ost_io01_010 64329 root 20 0 0 0 0 S 1.0 0.0 7:46.48 ll_ost_io01_005 From: "Moreno Diego (ID SIS)" Date: Wednesday, December 4, 2019 at 11:12 PM To: Pinkesh Valdria , Jongwoo Han Cc: "lustre-discuss@lists.lustre.org" Subject: Re: [lustre-discuss] Lnet Self Test I recently did some work on 40Gb and 100Gb ethernet interfaces and these are a few of the things that helped me during lnet_selftest: On lnet: credits set higher than the default (e.g. 1024 or more), peer_credits to 128 at least for network testing (it's just 8 by default, which is fine for a big cluster but maybe not for lnet_selftest with 2 clients). On ksocklnd module options: more schedulers (10; the default of 6 was not enough for my server), and I also changed some of the buffers (tx_buffer_size and rx_buffer_size set to 1073741824), but you need to be very careful with these. sysctl.conf: increase the buffers (tcp_rmem, tcp_wmem, check window scaling, the net.core max and default values, and consider disabling timestamps if you can afford it). Other: the cpupower governor (set to performance at least for testing) and BIOS settings (e.g. on my AMD routers it was better to disable HT, disable a few virtualization-oriented features and set the PCI config to performance). Basically, be aware that Lustre ethernet performance will take CPU resources, so it is better to optimize for that. Last but not least, be aware that Lustre's ethernet driver (ksocklnd) does not load balance as well as InfiniBand's (ko2iblnd). I have sometimes seen several Lustre peers using the same socklnd thread on the destination while the other socklnd threads were not active, which means that your entire load can end up depending on just one core. For that, the best approach is to try with more clients and check the per-thread CPU load on your node with top. 2 clients do not seem enough to me. With the proper configuration you should be perfectly able to saturate a 25Gb link in lnet_selftest. Regards, Diego
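A consolidated sketch of the kind of settings described above, with illustrative values only (the interface name and the numbers are placeholders, not recommendations from this thread):
  # LNet/ksocklnd module options, read when the ksocklnd module is loaded
  echo "options ksocklnd nscheds=10 credits=1024 peer_credits=128" > /etc/modprobe.d/ksocklnd.conf
  # TCP buffer sizing for high-speed Ethernet testing (example values)
  sysctl -w net.core.rmem_max=16777216 net.core.wmem_max=16777216
  sysctl -w net.ipv4.tcp_rmem="4096 87380 16777216"
  sysctl -w net.ipv4.tcp_wmem="4096 65536 16777216"
  # CPU frequency governor while benchmarking
  cpupower frequency-set -g performance
The module options only take effect after LNet/ksocklnd is reloaded, so they are easiest to apply before the servers and clients mount the filesystem.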
Re: [lustre-discuss] Lnet Self Test
Thanks Jongwoo. I have the MTU set to 9000 and also the ring buffer settings set to the max. ip link set dev $primaryNICInterface mtu 9000 ethtool -G $primaryNICInterface rx 2047 tx 2047 rx-jumbo 8191 I read about changing interrupt coalescing, but I was unable to find which values should be changed, and whether it really helps or not. # Several packets in a rapid sequence can be coalesced into one interrupt passed up to the CPU, providing more CPU time for application processing. Thanks, Pinkesh Valdria Oracle Cloud From: Jongwoo Han Date: Wednesday, December 4, 2019 at 8:07 PM To: Pinkesh Valdria Cc: Andreas Dilger , "lustre-discuss@lists.lustre.org" Subject: Re: [lustre-discuss] Lnet Self Test Have you tried MTU >= 9000 bytes (AKA jumbo frames) on the 25G ethernet and the switch? If it is set to 1500 bytes, the ethernet + IP + TCP frame headers take up quite an amount of each packet, reducing the available bandwidth for data. Jongwoo Han
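On the interrupt coalescing question above, a hedged example of how the settings can be inspected and changed with ethtool (the interface name and numbers are placeholders; appropriate values depend on the NIC and workload, and higher coalescing trades latency for a lower interrupt rate):
  # Show the current coalescing settings
  ethtool -c ens3
  # Example only: disable adaptive RX coalescing and raise the thresholds
  ethtool -C ens3 adaptive-rx off rx-usecs 64 rx-frames 64
Comparing lnet_selftest throughput and CPU usage before and after such a change is the only reliable way to tell whether it helps on a given NIC.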
Re: [lustre-discuss] Lnet Self Test
Thanks Andreas for your response. I ran another LNet self-test with 48 concurrent processes, since the nodes have 52 physical cores, and I was able to achieve the same throughput (2052.71 MiB/s = 2152 MB/s). Is it expected to lose almost 600 MB/s (2750 - 2150 = 600) due to overheads on ethernet with LNet? Thanks, Pinkesh Valdria Oracle Cloud Infrastructure From: Andreas Dilger Date: Wednesday, November 27, 2019 at 1:25 AM To: Pinkesh Valdria Cc: "lustre-discuss@lists.lustre.org" Subject: Re: [lustre-discuss] Lnet Self Test The first thing to note is that lst reports results in binary units (MiB/s) while iperf reports results in decimal units (Gbps). If you do the conversion you get 2055.31 MiB/s = 2155 MB/s. The other thing to check is the CPU usage. For TCP the CPU usage can be high. You should try RoCE+o2iblnd instead. Cheers, Andreas
___ lustre-discuss mailing list lustre-discuss@lists.lustre.org http://lists.lustre.org/listinfo.cgi/lustre-discuss-lustre.org
[lustre-discuss] Lnet Self Test
Hello All, I created a new Lustre cluster on CentOS7.6 and I am running lnet_selftest_wrapper.sh to measure throughput on the network. The nodes are connected to each other using 25Gbps ethernet, so theoretical max is 25 Gbps * 125 = 3125 MB/s. Using iperf3, I get 22Gbps (2750 MB/s) between the nodes. [root@lustre-client-2 ~]# for c in 1 2 4 8 12 16 20 24 ; do echo $c ; ST=lst-output-$(date +%Y-%m-%d-%H:%M:%S) CN=$c SZ=1M TM=30 BRW=write CKSUM=simple LFROM="10.0.3.7@tcp1" LTO="10.0.3.6@tcp1" /root/lnet_selftest_wrapper.sh; done ; When I run lnet_selftest_wrapper.sh (from Lustre wiki) between 2 nodes, I get a max of 2055.31 MiB/s, Is that expected at the Lnet level? Or can I further tune the network and OS kernel (tuning I applied are below) to get better throughput? Result Snippet from lnet_selftest_wrapper.sh [LNet Rates of lfrom] [R] Avg: 4112 RPC/s Min: 4112 RPC/s Max: 4112 RPC/s [W] Avg: 4112 RPC/s Min: 4112 RPC/s Max: 4112 RPC/s [LNet Bandwidth of lfrom] [R] Avg: 0.31 MiB/s Min: 0.31 MiB/s Max: 0.31 MiB/s [W] Avg: 2055.30 MiB/s Min: 2055.30 MiB/s Max: 2055.30 MiB/s [LNet Rates of lto] [R] Avg: 4136 RPC/s Min: 4136 RPC/s Max: 4136 RPC/s [W] Avg: 4136 RPC/s Min: 4136 RPC/s Max: 4136 RPC/s [LNet Bandwidth of lto] [R] Avg: 2055.31 MiB/s Min: 2055.31 MiB/s Max: 2055.31 MiB/s [W] Avg: 0.32 MiB/s Min: 0.32 MiB/s Max: 0.32 MiB/s Tuning applied: Ethernet NICs: ip link set dev ens3 mtu 9000 ethtool -G ens3 rx 2047 tx 2047 rx-jumbo 8191 less /etc/sysctl.conf net.core.wmem_max=16777216 net.core.rmem_max=16777216 net.core.wmem_default=16777216 net.core.rmem_default=16777216 net.core.optmem_max=16777216 net.core.netdev_max_backlog=27000 kernel.sysrq=1 kernel.shmmax=18446744073692774399 net.core.somaxconn=8192 net.ipv4.tcp_adv_win_scale=2 net.ipv4.tcp_low_latency=1 net.ipv4.tcp_rmem = 212992 87380 16777216 net.ipv4.tcp_sack = 1 net.ipv4.tcp_timestamps = 1 net.ipv4.tcp_window_scaling = 1 net.ipv4.tcp_wmem = 212992 65536 16777216 vm.min_free_kbytes = 65536 net.ipv4.tcp_congestion_control = cubic net.ipv4.tcp_timestamps = 0 net.ipv4.tcp_congestion_control = htcp net.ipv4.tcp_no_metrics_save = 0 echo "# # tuned configuration # [main] summary=Broadly applicable tuning that provides excellent performance across a variety of common server workloads [disk] devices=!dm-*, !sda1, !sda2, !sda3 readahead=>4096 [cpu] force_latency=1 governor=performance energy_perf_bias=performance min_perf_pct=100 [vm] transparent_huge_pages=never [sysctl] kernel.sched_min_granularity_ns = 1000 kernel.sched_wakeup_granularity_ns = 1500 vm.dirty_ratio = 30 vm.dirty_background_ratio = 10 vm.swappiness=30 " > lustre-performance/tuned.conf tuned-adm profile lustre-performance Thanks, Pinkesh Valdria ___ lustre-discuss mailing list lustre-discuss@lists.lustre.org http://lists.lustre.org/listinfo.cgi/lustre-discuss-lustre.org
Re: [lustre-discuss] max_pages_per_rpc=4096 fails on the client nodes
For others, in case they face this issue. Solution: I had to unmount and remount the clients for the set_param command to work. From: Pinkesh Valdria Date: Wednesday, August 14, 2019 at 9:25 AM To: "lustre-discuss@lists.lustre.org" Subject: max_pages_per_rpc=4096 fails on the client nodes I want to enable large RPC size. I followed the steps as per the Lustre manual section 33.9.2 Usage (http://doc.lustre.org/lustre_manual.xhtml), but I get the below error when I try to update the client. ___ lustre-discuss mailing list lustre-discuss@lists.lustre.org http://lists.lustre.org/listinfo.cgi/lustre-discuss-lustre.org
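A sketch of the remount sequence implied by the solution above, using the same placeholders as elsewhere in this thread ($fsname, ${mgs_ip} and $mount_point are whatever your site uses):
  # On each client: unmount, remount, then the larger RPC size can be set
  umount $mount_point
  mount -t lustre ${mgs_ip}@tcp1:/$fsname $mount_point
  lctl set_param osc.$fsname-OST*.max_pages_per_rpc=4096
  lctl get_param osc.$fsname-OST*.max_pages_per_rpc
The remount matters because the client renegotiates brw_size with each OST at connect time; until it reconnects, values larger than the previously negotiated maximum are rejected with "Numerical result out of range".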
[lustre-discuss] max_pages_per_rpc=4096 fails on the client nodes
I want to enable large RPC size. I followed the steps as per the Lustre manual section 33.9.2 Usage (http://doc.lustre.org/lustre_manual.xhtml), but I get the below error when I try to update the client. Updated the OSS server: [root@lustre-oss-server-nic0-1 test]# lctl set_param obdfilter.lfsbv-*.brw_size=16 obdfilter.lfsbv-OST.brw_size=16 obdfilter.lfsbv-OST0001.brw_size=16 obdfilter.lfsbv-OST0002.brw_size=16 obdfilter.lfsbv-OST0003.brw_size=16 obdfilter.lfsbv-OST0004.brw_size=16 obdfilter.lfsbv-OST0005.brw_size=16 obdfilter.lfsbv-OST0006.brw_size=16 obdfilter.lfsbv-OST0007.brw_size=16 obdfilter.lfsbv-OST0008.brw_size=16 obdfilter.lfsbv-OST0009.brw_size=16 [root@lustre-oss-server-nic0-1 test]# Add the above change permanently using the MGS node: [root@lustre-mds-server-nic0-1 ~]# lctl set_param -P obdfilter.lfsbv-*.brw_size=16 [root@lustre-mds-server-nic0-1 ~]# Client side update - failed [root@lustre-client-1 ~]# lctl set_param osc.lfsbv-OST*.max_pages_per_rpc=4096 error: set_param: setting /proc/fs/lustre/osc/lfsbv-OST-osc-8e66b4b08000/max_pages_per_rpc=4096: Numerical result out of range error: set_param: setting /proc/fs/lustre/osc/lfsbv-OST0001-osc-8e66b4b08000/max_pages_per_rpc=4096: Numerical result out of range error: set_param: setting /proc/fs/lustre/osc/lfsbv-OST0002-osc-8e66b4b08000/max_pages_per_rpc=4096: Numerical result out of range error: set_param: setting /proc/fs/lustre/osc/lfsbv-OST0003-osc-8e66b4b08000/max_pages_per_rpc=4096: Numerical result out of range ….. ….. 33.9.2. Usage In order to enable a larger RPC size, brw_size must be changed to an IO size value up to 16MB. To temporarily change brw_size, the following command should be run on the OSS: oss# lctl set_param obdfilter.fsname-OST*.brw_size=16 To persistently change brw_size, the following command should be run: oss# lctl set_param -P obdfilter.fsname-OST*.brw_size=16 When a client connects to an OST target, it will fetch brw_size from the target and pick the maximum value of brw_size and its local setting for max_pages_per_rpc as the actual RPC size. Therefore, the max_pages_per_rpc on the client side would have to be set to 16M, or 4096 if the PAGESIZE is 4KB, to enable a 16MB RPC. To temporarily make the change, the following command should be run on the client to set max_pages_per_rpc: client$ lctl set_param osc.fsname-OST*.max_pages_per_rpc=16M To persistently make this change, the following command should be run: client$ lctl set_param -P obdfilter.fsname-OST*.osc.max_pages_per_rpc=16M Caution: The brw_size of an OST can be changed on the fly. However, clients have to be remounted to renegotiate the new maximum RPC size. ___ lustre-discuss mailing list lustre-discuss@lists.lustre.org http://lists.lustre.org/listinfo.cgi/lustre-discuss-lustre.org
Re: [lustre-discuss] lnet_selftest - fails for me
Figured out the issue. I forgot to load the module on the server side. Solution: load the lnet_selftest module on all nodes involved in the testing. [root@lustre-oss-server-nic0-1 ~]# modprobe lnet_selftest [root@lustre-oss-server-nic0-1 ~]# From: lustre-discuss on behalf of Pinkesh Valdria Date: Monday, August 12, 2019 at 10:55 AM To: "lustre-discuss@lists.lustre.org" Subject: [lustre-discuss] lnet_selftest - fails for me Hello, Does anyone know why this simple lnet_selftest is failing? I am able to use the Lustre file system without any problem. ___ lustre-discuss mailing list lustre-discuss@lists.lustre.org http://lists.lustre.org/listinfo.cgi/lustre-discuss-lustre.org
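A small sketch of the fix described above, loading the module everywhere before starting the test (the host names are placeholders for the client and OSS nodes involved in the session):
  for host in lustre-client-1 lustre-oss-server-nic0-1; do
      ssh $host "modprobe lnet_selftest && lsmod | grep -q lnet_selftest && echo $host: lnet_selftest loaded"
  done
The "create session RPC failed ... Unknown error -110" in the original report is a timeout: the lst console could reach the peer over LNet, but the peer had no selftest module loaded to answer the session RPC.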
[lustre-discuss] lnet_selftest - fails for me
Hello, Does anyone know why this simple lnet_selftest is failing. I am able to use the Lustre file system without any problem. I looked at /var/log/messages on the client and server node and there are no errors. Googling for the error, was not helpful. The script: lnet_selftest_wrapper.sh. has content which is mentioned on this page: http://wiki.lustre.org/LNET_Selftest (wrapper script at the end of the page). LFROM="10.0.3.4@tcp1" is the client node where I am running this script. LTO="10.0.3.6@tcp1" is one of the OSS server. I have total 3 of them. [root@lustre-client-1 ~]# ST=lst-output-$(date +%Y-%m-%d-%H:%M:%S) CN=1 SZ=1M TM=60 BRW=read CKSUM=simple LFROM="10.0.3.4@tcp1" LTO="10.0.3.6@tcp1" ./lnet_selftest_wrapper.sh lst-output-2019-08-12-17:47:09 1 1M 60 read simple 10.0.3.4@tcp1 10.0.3.6@tcp1 LST_SESSION = 6787 SESSION: lstread FEATURES: 1 TIMEOUT: 300 FORCE: No 10.0.3.4@tcp1 are added to session create session RPC failed on 12345-10.0.3.6@tcp1: Unknown error -110 No nodes added successfully, deleting group lto Group is deleted Can't get count of nodes from lto: No such file or directory bulk_read is running now Capturing statistics for 60 secs Invalid nid: lto Failed to get count of nodes from lto: Success ___ lustre-discuss mailing list lustre-discuss@lists.lustre.org http://lists.lustre.org/listinfo.cgi/lustre-discuss-lustre.org
[lustre-discuss] LND tunables - how to set on Ethernet network
I have been trying to find out how I can set the below values if I use Lustre & LNET on a 25 Gbps Ethernet network. It seems that for InfiniBand or Intel OPA you can set them in ko2iblnd.conf. tunables: peer_timeout: 180 peer_credits: 8 peer_buffer_credits: 0 credits: 256 lnd tunables: peercredits_hiw: 64 map_on_demand: 32 concurrent_sends: 256 fmr_pool_size: 2048 fmr_flush_trigger: 512 fmr_cache: 1 Also, I have been doing Dynamic Network Configuration using the lnetctl command; in that case, is there a way to set the above values, or does it have to be via some config file only? ___ lustre-discuss mailing list lustre-discuss@lists.lustre.org http://lists.lustre.org/listinfo.cgi/lustre-discuss-lustre.org
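For a TCP network the equivalent knobs live in the ksocklnd module options and on the lnetctl command line rather than in ko2iblnd.conf; most of the lnd tunables listed above (map_on_demand, concurrent_sends, fmr_*) are specific to the InfiniBand LND and have no socklnd counterpart. A hedged sketch, with the interface name and values as placeholders:
  # Dynamic configuration with per-NI tunables
  lnetctl lnet configure
  lnetctl net add --net tcp1 --if ens3 --peer-timeout 180 --peer-credits 128 --credits 1024
  lnetctl net show --verbose
  # Or persistently, as module options read when the ksocklnd module loads
  echo "options ksocklnd peer_credits=128 credits=1024 nscheds=10" > /etc/modprobe.d/ksocklnd.conf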
[lustre-discuss] LNET tunables and LND tunables
Hello, I have a Lustre cluster using a 25 Gbps ethernet network (no InfiniBand). I see a lot of examples online for InfiniBand and what tunables to use for it, but I am struggling to find recommendations when using ethernet networks. I'd appreciate it if someone could share their experience and settings when using ethernet, or if there are any details online for recommended ethernet values. From my Lustre client node, I don't see any LND tunables below like (peercredits_hiw: 64, map_on_demand: 32, concurrent_sends: 256, fmr_pool_size: 2048, fmr_flush_trigger: 512, fmr_cache: 1). [root@lustre-client-1 ~]# lnetctl net show --verbose net: - net type: lo local NI(s): - nid: 0@lo status: up statistics: send_count: 0 recv_count: 0 drop_count: 0 tunables: peer_timeout: 0 peer_credits: 0 peer_buffer_credits: 0 credits: 0 dev cpt: 0 tcp bonding: 0 CPT: "[0,1,2,3,4,5,6,7,8,9,10,11]" - net type: tcp1 local NI(s): - nid: 10.0.3.4@tcp1 status: up interfaces: 0: ens3 statistics: send_count: 657209676 recv_count: 657208330 drop_count: 0 tunables: peer_timeout: 180 peer_credits: 8 peer_buffer_credits: 0 credits: 256 dev cpt: -1 tcp bonding: 0 CPT: "[0,1,2,3,4,5,6,7,8,9,10,11]" [root@lustre-client-1 ~]# OSS Server [root@lustre-oss-server-nic0-1 ~]# lnetctl net show --verbose net: - net type: lo local NI(s): - nid: 0@lo status: up statistics: send_count: 0 recv_count: 0 drop_count: 0 tunables: peer_timeout: 0 peer_credits: 0 peer_buffer_credits: 0 credits: 0 dev cpt: 0 tcp bonding: 0 CPT: "[0,1]" - net type: tcp1 local NI(s): - nid: 10.0.3.6@tcp1 status: up interfaces: 0: eno3d1 statistics: send_count: 232650108 recv_count: 232650019 drop_count: 0 tunables: peer_timeout: 180 peer_credits: 8 peer_buffer_credits: 0 credits: 256 dev cpt: 0 tcp bonding: 0 CPT: "[0,1]" [root@lustre-oss-server-nic0-1 ~]# ___ lustre-discuss mailing list lustre-discuss@lists.lustre.org http://lists.lustre.org/listinfo.cgi/lustre-discuss-lustre.org
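If the per-NI tunables are raised with lnetctl (for example higher credits/peer_credits for ethernet testing), one way to make them survive a reboot is to export the running configuration and let the lnet service re-import it; a hedged sketch, assuming a Lustre release whose lnet systemd unit imports /etc/lnet.conf on start (on some versions an existing NI must first be deleted and re-added with lnetctl net del / net add for new tunables to take effect):
  # After configuring the tcp1 NI with the desired tunables:
  lnetctl export > /etc/lnet.conf    # dump the running LNet configuration as YAML
  systemctl enable lnet              # the lnet service imports /etc/lnet.conf at boot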
[lustre-discuss] Lustre tuning - help
Lustre experts, I recently installed Lustre for the first time. It's working (so I am happy), but now I am trying to do some performance testing/tuning. My goal is to run a SAS workload and use Lustre as the shared file system for SAS Grid, and later to tune Lustre for generic HPC workloads. Through Google searches, I read articles on Lustre and recommendations for tuning from LUG conference slides, etc.: https://cpb-us-e1.wpmucdn.com/blogs.rice.edu/dist/0/2327/files/2014/03/Fragalla.pdf http://cdn.opensfs.org/wp-content/uploads/2019/07/LUG2019-Sysadmin-tutorial.pdf http://support.sas.com/rnd/scalability/grid/SGMonAWS.pdf I have results for IBM Spectrum Scale (GPFS) running on the same hardware/software stack, and based on the Lustre tuning I have done, I am not getting optimal performance. My understanding was that Lustre can deliver better performance compared to GPFS if tuned correctly. I have tried changing the following: Stripe count = 1, 4, 8, 16, 24, -1 (to stripe across all OSTs), and a progressive file layout: lfs setstripe -E 256M -c 1 -E 4G -c 4 -E -1 -c -1 -S 4M /mnt/mdt_bv Stripe size: default (1M), 4M, 64K (since the SAS apps use this). SAS Grid uses large-block, sequential IO patterns (block size: 64K, 128K, 256K; 64K is their preferred value). Question 1: How should I tune the stripe count and stripe size for the above? Also, should I use a progressive file layout? (A layout verification sketch follows after this message.) I would appreciate some feedback on the tuning I have done, whether it is correct, and whether I am missing anything. Details: It's a cloud based solution - Oracle Cloud Infrastructure. Installed Lustre using the instructions from WhamCloud. All nodes run CentOS 7. MGS - 1 node (shared with MDS), MDS - 1 node, OSS - 3 nodes. All nodes are bare metal machines (no VM) with 52 physical cores, 768GB RAM and 2 NICs (2x25 Gbps ethernet, no dual bonding). 1 NIC is configured to connect to the Block Storage disks; the 2nd NIC is configured to talk to the clients, so LNET is configured on the 2nd NIC. Each OSS is connected to 10 Block Volume disks, 800GB each, so 10 OSTs per OSS, for a total of 30 OSTs (21TB storage). There is 1 MDT (800GB) attached to the MDS. Clients are 24-physical-core VMs with 320GB RAM and 1 NIC (24.6 Gbps). Using 3 clients in the above setup.
On all nodes (MDS/OSS/Clients): ### ### OS Performance tuning ### setenforce 0 echo " * hard memlock unlimited * soft memlock unlimited " >> /etc/security/limits.conf # The below applies for both compute and server nodes (storage) cd /usr/lib/tuned/ cp -r throughput-performance/ sas-performance echo "# # tuned configuration # [main] include=throughput-performance summary=Broadly applicable tuning that provides excellent performance across a variety of common server workloads [disk] devices=!dm-*, !sda1, !sda2, !sda3 readahead=>4096 [cpu] force_latency=1 governor=performance energy_perf_bias=performance min_perf_pct=100 [vm] transparent_huge_pages=never [sysctl] kernel.sched_min_granularity_ns = 1000 kernel.sched_wakeup_granularity_ns = 1500 vm.dirty_ratio = 30 vm.dirty_background_ratio = 10 vm.swappiness=30 " > sas-performance/tuned.conf tuned-adm profile sas-performance # Display active profile tuned-adm active Networking: All NICs are configured to use MTU – 9000 Block Volumes/Disks For all OSTs/MDT: cat /sys/block/$disk/queue/max_hw_sectors_kb 32767 echo “32767” > /sys/block/$disk/queue/max_sectors_kb ; echo "192" > /sys/block/$disk/queue/nr_requests ; echo "deadline" > /sys/block/$disk/queue/scheduler ; echo "0" > /sys/block/$disk/queue/read_ahead_kb ; echo "68" > /sys/block/$disk/device/timeout ; Only OSTs: lctl set_param osd-ldiskfs.*.readcache_max_filesize=2M Lustre clients: lctl set_param osc.*.checksums=0 lctl set_param timeout=600 #lctl set_param ldlm_timeout=200 - This fails with below error #error: set_param: param_path 'ldlm_timeout': No such file or directory lctl set_param ldlm_timeout=200 lctl set_param at_min=250 lctl set_param at_max=600 lctl set_param ldlm.namespaces.*.lru_size=128 lctl set_param osc.*.max_rpcs_in_flight=32 lctl set_param osc.*.max_dirty_mb=256 lctl set_param debug="+neterror" # https://cpb-us-e1.wpmucdn.com/blogs.rice.edu/dist/0/2327/files/2014/03/Fragalla.pdf - says turn off checksum at network level ethtool -K ens3 rx off tx off Lustre mounted with -o flock option mount -t lustre -o flock ${mgs_ip}@tcp1:/$fsname $mount_point Once again, appreciate any guidance or help you can provide or you can point me to docs, articles, which will be helpful for me. Thanks, Pinkesh Valdr
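On Question 1 above (the layout verification sketch referenced there): whatever stripe settings are chosen, it is worth confirming what layout new files actually receive before benchmarking; the paths and the PFL layout below are the ones used in this message.
  # Set the PFL layout being tested as the directory default
  lfs setstripe -E 256M -c 1 -E 4G -c 4 -E -1 -c -1 -S 4M /mnt/mdt_bv
  # Show the default layout on the directory
  lfs getstripe -d /mnt/mdt_bv
  # Create a test file with the workload's block size and inspect its layout
  dd if=/dev/zero of=/mnt/mdt_bv/layout_test bs=64k count=16384
  lfs getstripe /mnt/mdt_bv/layout_test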
Re: [lustre-discuss] lctl set_param obdfilter.*.readcache_max_filesize=2M fails
Thanks to Shaun and Chris. Sorry, I forgot to paste it: I tried osd first, and when that didn't work I tried ost, in case ost was the new name. [root@lustre-oss-server-nic0-1 ~]# lctl set_param osd-*.readcache_max_filesize=2M error: set_param: param_path 'osd-*/readcache_max_filesize': No such file or directory [root@lustre-oss-server-nic0-1 ~]# lctl set_param ost-*.readcache_max_filesize=2M error: set_param: param_path 'ost-*/readcache_max_filesize': No such file or directory [root@lustre-oss-server-nic0-1 ~]# This is what list_param shows: [root@lustre-oss-server-nic0-1 ~]# lctl list_param -R * | grep readcache_max_filesize osd-ldiskfs.lfsbv-OST.readcache_max_filesize osd-ldiskfs.lfsbv-OST0001.readcache_max_filesize osd-ldiskfs.lfsbv-OST0002.readcache_max_filesize osd-ldiskfs.lfsbv-OST0003.readcache_max_filesize osd-ldiskfs.lfsbv-OST0004.readcache_max_filesize osd-ldiskfs.lfsbv-OST0005.readcache_max_filesize osd-ldiskfs.lfsbv-OST0006.readcache_max_filesize osd-ldiskfs.lfsbv-OST0007.readcache_max_filesize osd-ldiskfs.lfsbv-OST0008.readcache_max_filesize osd-ldiskfs.lfsbv-OST0009.readcache_max_filesize [root@lustre-oss-server-nic0-1 ~]# So I did some trial and error and found I need to use osd-ldiskfs.* instead of osd.*: [root@lustre-oss-server-nic0-1 ~]# lctl set_param osd-ldiskfs.*.readcache_max_filesize=2M osd-ldiskfs.lfsbv-OST.readcache_max_filesize=2M osd-ldiskfs.lfsbv-OST0001.readcache_max_filesize=2M osd-ldiskfs.lfsbv-OST0002.readcache_max_filesize=2M osd-ldiskfs.lfsbv-OST0003.readcache_max_filesize=2M osd-ldiskfs.lfsbv-OST0004.readcache_max_filesize=2M osd-ldiskfs.lfsbv-OST0005.readcache_max_filesize=2M osd-ldiskfs.lfsbv-OST0006.readcache_max_filesize=2M osd-ldiskfs.lfsbv-OST0007.readcache_max_filesize=2M osd-ldiskfs.lfsbv-OST0008.readcache_max_filesize=2M osd-ldiskfs.lfsbv-OST0009.readcache_max_filesize=2M [root@lustre-oss-server-nic0-1 ~]# From: Chris Horn Date: Thursday, August 8, 2019 at 11:11 AM To: Pinkesh Valdria , Shaun Tancheff , "lustre-discuss@lists.lustre.org" Subject: Re: [lustre-discuss] lctl set_param obdfilter.*.readcache_max_filesize=2M fails You have a typo in the command you ran.
Shaun wrote > lctl set_param osd-*.readcache_max_filesize=2M but you have: > lctl set_param ost-*.readcache_max_filesize=2M ost->osd > Is there commands to see all the currently parameters using some get command lctl list_param -R * # lctl list_param -R * | grep readcache_max_filesize osd-ldiskfs.snx11922-OST0002.readcache_max_filesize osd-ldiskfs.snx11922-OST0003.readcache_max_filesize # Chris Horn From: lustre-discuss on behalf of Pinkesh Valdria Date: Thursday, August 8, 2019 at 11:57 AM To: Shaun Tancheff , "lustre-discuss@lists.lustre.org" Subject: Re: [lustre-discuss] lctl set_param obdfilter.*.readcache_max_filesize=2M fails That also fails [root@lustre-oss-server-nic0-1 ~]# lctl set_param ost-*.readcache_max_filesize=2M error: set_param: param_path 'ost-*/readcache_max_filesize': No such file or directory [root@lustre-oss-server-nic0-1 ~]# Is there commands to see all the currently parameters using some get command From: Shaun Tancheff Date: Thursday, August 8, 2019 at 9:50 AM To: Pinkesh Valdria , "lustre-discuss@lists.lustre.org" Subject: Re: [lustre-discuss] lctl set_param obdfilter.*.readcache_max_filesize=2M fails I think the parameter has changed: 'obdfilter.*.readcache_max_filesize’ => 'osd-*.readcache_max_filesize' So try: lctl set_param osd-*.readcache_max_filesize=2M From: lustre-discuss on behalf of Pinkesh Valdria Date: Thursday, August 8, 2019 at 11:43 AM To: "lustre-discuss@lists.lustre.org" Subject: [lustre-discuss] lctl set_param obdfilter.*.readcache_max_filesize=2M fails ___ lustre-discuss mailing list lustre-discuss@lists.lustre.org http://lists.lustre.org/listinfo.cgi/lustre-discuss-lustre.org
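A brief follow-up sketch on the working parameter name found above, showing how the value can be read back and, hedged on the Lustre version supporting -P for osd parameters, made persistent from the MGS:
  # Verify the current value on the OSS
  lctl get_param osd-ldiskfs.*.readcache_max_filesize
  # Possibly persist it across OSS restarts (run on the MGS; whether -P
  # accepts osd-ldiskfs parameters depends on the Lustre version)
  lctl set_param -P osd-ldiskfs.*.readcache_max_filesize=2M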
Re: [lustre-discuss] lctl set_param obdfilter.*.readcache_max_filesize=2M fails
That also fails [root@lustre-oss-server-nic0-1 ~]# lctl set_param ost-*.readcache_max_filesize=2M error: set_param: param_path 'ost-*/readcache_max_filesize': No such file or directory [root@lustre-oss-server-nic0-1 ~]# Is there commands to see all the currently parameters using some get command From: Shaun Tancheff Date: Thursday, August 8, 2019 at 9:50 AM To: Pinkesh Valdria , "lustre-discuss@lists.lustre.org" Subject: Re: [lustre-discuss] lctl set_param obdfilter.*.readcache_max_filesize=2M fails I think the parameter has changed: 'obdfilter.*.readcache_max_filesize’ => 'osd-*.readcache_max_filesize' So try: lctl set_param osd-*.readcache_max_filesize=2M From: lustre-discuss on behalf of Pinkesh Valdria Date: Thursday, August 8, 2019 at 11:43 AM To: "lustre-discuss@lists.lustre.org" Subject: [lustre-discuss] lctl set_param obdfilter.*.readcache_max_filesize=2M fails ___ lustre-discuss mailing list lustre-discuss@lists.lustre.org http://lists.lustre.org/listinfo.cgi/lustre-discuss-lustre.org
[lustre-discuss] lctl set_param obdfilter.*.readcache_max_filesize=2M fails
Hello Lustre experts, I am fairly new to lustre and I did a deployment of it on Oracle Public Cloud using instructions on whamcloud wiki pages. I am now trying to set some parameters for better performance and need help to understand why I am getting this error: On OSS Servers: [root@lustre-oss-server-nic0-1 ~]# lctl set_param obdfilter.*.readcache_max_filesize=2M error: set_param: param_path 'obdfilter/*/readcache_max_filesize': No such file or directory [root@lustre-oss-server-nic0-1 ~]# Also I have seen this happen on Client nodes: lctl set_param ldlm_timeout=200 - This fails with below error error: set_param: param_path 'ldlm_timeout': No such file or directory Appreciate your help. ___ lustre-discuss mailing list lustre-discuss@lists.lustre.org http://lists.lustre.org/listinfo.cgi/lustre-discuss-lustre.org
Re: [lustre-discuss] MDS/MGS has a block storage device mounted and it does not have any permissions (no read , no write, no execute)
Thanks Andreas. Given below are the output of the commands you asked to run:. > [root@lustre-mds-server-1 opc]# > • Assuming if the above is not an issue, after setting up OSS/OST and > Client node, When my client tries to mount, I get the below error: > [root@lustre-client-1 opc]# mount -t lustre 10.0.2.4@tcp:/lustrewt > /mnt > mount.lustre: mount 10.0.2.4@tcp:/lustrewt at /mnt failed: > Input/output error Is the MGS running? > [root@lustre-client-1 opc]# Andreas: Can you do "lctl ping" from the client to the MGS node? Most commonly this happens because the client still has a firewall configured, or it is defined to have "127.0.0.1" as the local node address. Pinkesh response: [root@lustre-client-1 opc]# lctl ping 10.0.2.6@tcp 12345-0@lo 12345-10.0.2.6@tcp So there is a "lo" mentioned here, could that be a problem? I also ran the mount command on client node to capture logs on both the client node and MDS node. (ran command at 18.11 time) [root@lustre-client-1 opc]# mount -t lustre 10.0.2.6@tcp:/lustrewt /mnt mount.lustre: mount 10.0.2.6@tcp:/lustrewt at /mnt failed: Input/output error Is the MGS running? [root@lustre-client-1 opc]# [root@lustre-mds-server-1 opc]# tail -f /var/log/messages Feb 6 18:11:38 lustre-mds-server-1 kernel: Lustre: MGS: Connection restored to 88e1c321-1eaa-6914-5a37-4fff2063b526 (at 10.0.0.2@tcp) Feb 6 18:11:38 lustre-mds-server-1 kernel: Lustre: Skipped 1 previous similar message Feb 6 18:11:45 lustre-mds-server-1 kernel: Lustre: MGS: Received new LWP connection from 10.0.0.2@tcp, removing former export from same NID Feb 6 18:11:45 lustre-mds-server-1 kernel: Lustre: MGS: Connection restored to 88e1c321-1eaa-6914-5a37-4fff2063b526 (at 10.0.0.2@tcp) [root@lustre-client-1 opc]# less /var/log/messages Feb 6 18:10:01 lustre-client-1 systemd: Removed slice User Slice of root. Feb 6 18:10:01 lustre-client-1 systemd: Stopping User Slice of root. Feb 6 18:11:45 lustre-client-1 kernel: Lustre: 10376:0:(client.c:2114:ptlrpc_expire_one_request()) @@@ Request sent has timed out for slow reply: [sent 1549476698/real 1549476698] req@9259bb42a100 x1624614953288736/t0(0) o503->MGC10.0.2.6@tcp@10.0.2.6@tcp:26/25 lens 272/8416 e 0 to 1 dl 1549476705 ref 2 fl Rpc:X/0/ rc 0/-1 Feb 6 18:11:45 lustre-client-1 kernel: Lustre: 10376:0:(client.c:2114:ptlrpc_expire_one_request()) Skipped 1 previous similar message Feb 6 18:11:45 lustre-client-1 kernel: LustreError: 166-1: MGC10.0.2.6@tcp: Connection to MGS (at 10.0.2.6@tcp) was lost; in progress operations using this service will fail Feb 6 18:11:45 lustre-client-1 kernel: LustreError: 15c-8: MGC10.0.2.6@tcp: The configuration from log 'lustrewt-client' failed (-5). This may be the result of communication errors between this node and the MGS, a bad configuration, or other errors. See the syslog for more information. 
Feb 6 18:11:45 lustre-client-1 kernel: Lustre: MGC10.0.2.6@tcp: Connection restored to MGC10.0.2.6@tcp_0 (at 10.0.2.6@tcp) Feb 6 18:11:45 lustre-client-1 kernel: Lustre: Unmounted lustrewt-client Feb 6 18:11:45 lustre-client-1 kernel: LustreError: 10376:0:(obd_mount.c:1582:lustre_fill_super()) Unable to mount (-5) Thanks, Pinkesh Valdria OCI – Big Data Principal Solutions Architect m: +1-206-234-4314 pinkesh.vald...@oracle.com -Original Message- From: Andreas Dilger Sent: Wednesday, February 6, 2019 2:28 AM To: Pinkesh Valdria Cc: lustre-discuss@lists.lustre.org Subject: Re: [lustre-discuss] MDS/MGS has a block storage device mounted and it does not have any permissions (no read , no write, no execute) On Feb 5, 2019, at 15:39, Pinkesh Valdria wrote: > > Hello All, > > I am new to Lustre. I started by using the docs on this page to deploy > Lustre on Virtual machines running CentOS 7.x (CentOS-7-2018.08.15-0). > Included below are the content of the scripts I used and the error I get. > I have not done any setup for “o2ib0(ib0)” and lnet is using tcp. All the > nodes are on the same network & subnet and cannot communicate on my protocol > and port #. > > Thanks for your help. I am completely blocked and looking for ideas. > (already did google search ☹). > > I have 2 questions: > • The MDT mounted on MDS has no permissions (no read , no write, no > execute), even for root user on MDS/MGS node. Is that expected? . See > “MGS/MDS node setup” section for more details on what I did. > [root@lustre-mds-server-1 opc]# mount -t lustre /dev/sdb /mnt/mdt > > [root@lustre-mds-server-1 opc]# ll /mnt total 0 d-. 1 root > root 0 Jan 1 1970 mdt The mountpoint on the MDS is just there for "df" to work and to manage the block device. It does not provide access to filesystem. You need to do a client mount for that (typically on another node, but
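A hedged troubleshooting sketch for the mount failure above, following Andreas's pointer that this is most commonly a firewall or local NID problem rather than a filesystem problem (the commands use only tools already shown in this thread plus firewalld; 988 is the standard Lustre/LNet TCP port):
  # On the client and on the MGS: the node's own tcp NID should be listed,
  # not only 0@lo
  lctl list_nids
  # From the client, ping the MGS NID
  lctl ping 10.0.2.6@tcp
  # If firewalld is active, LNet's TCP port must be allowed between the nodes
  firewall-cmd --list-all
  firewall-cmd --permanent --add-port=988/tcp && firewall-cmd --reload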
[lustre-discuss] MDS/MGS has a block storage device mounted and it does not have any permissions (no read , no write, no execute)
work up LNET configured [root@lustre-mds-server-1 opc]# lctl list_nids 10.0.2.4@tcp [root@lustre-mds-server-1 opc]# ll /mnt total 0 d-. 1 root root 0 Jan 1 1970 mdt [root@lustre-mds-server-1 opc]# OSS/OST node 1 OSS node with 1 block device for the OST (/dev/sdb). The kernel setup was the same as on the MGS/MDS node (described above), then I ran the below commands: mkfs.lustre --ost --fsname=lustrewt --index=0 --mgsnode=10.0.2.4@tcp /dev/sdb mkdir -p /ostoss_mount mount -t lustre /dev/sdb /ostoss_mount Client node 1 client node. The kernel setup was the same as on the MGS/MDS node (described above), then I ran the below commands: [root@lustre-client-1 opc]# modprobe lustre [root@lustre-client-1 opc]# mount -t lustre 10.0.2.3@tcp:/lustrewt /mnt (This fails with the below error): mount.lustre: mount 10.0.2.4@tcp:/lustrewt at /mnt failed: Input/output error Is the MGS running? [root@lustre-client-1 opc]# Thanks, Pinkesh Valdria OCI - Big Data Principal Solutions Architect m: +1-206-234-4314 pinkesh.vald...@oracle.com ___ lustre-discuss mailing list lustre-discuss@lists.lustre.org http://lists.lustre.org/listinfo.cgi/lustre-discuss-lustre.org
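One hedged way to cross-check the MGS NID question above (note that the mount command and the error message in this message show different MGS addresses, 10.0.2.3 vs 10.0.2.4): the NID a target was formatted with can be read back from the target device itself and compared with what the MGS actually advertises.
  # On the OSS: show the parameters the OST was formatted with, including mgsnode
  tunefs.lustre --dryrun /dev/sdb | grep -i -e mgsnode -e Parameters
  # On the MGS: the NID that clients and servers must use
  lctl list_nids
If the two do not match, the client mount will keep failing with "Is the MGS running?" regardless of the state of the MGS itself.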