[lustre-discuss] Recover from broken lustre updates (Haoyang Liu)

2021-07-26 Thread via lustre-discuss
Hi, Haoyang. Maybe you should rebuild MOFED against the new kernel first, then rebuild the Lustre server packages. 1) About restoring: I think you can try switching to the old kernel first, but as you said, you have rebuilt MOFED under the new kernel, so once you go back to the old kernel you need to

Re: [lustre-discuss] Quota related (Anilkumar Naik)

2020-11-30 Thread
Mon, 30 Nov, 2020, 6:59 am 肖正刚, wrote: > >> Hi, >> you can enable user quota on mgs by >> " >> lctl conf_param your_fsname.quota.mdt=u >> lctl conf_param your_fsname.quota.ost=u >> " >> details about quota in lustre manua

Re: [lustre-discuss] Quota related (Anilkumar Naik)

2020-11-29 Thread
Hi, you can enable user quota on the MGS with:
    lctl conf_param your_fsname.quota.mdt=u
    lctl conf_param your_fsname.quota.ost=u
Details about quotas are in the Lustre manual, chapter 25: https://doc.lustre.org/lustre_manual.xhtml#configuringquotas ___ lustre-discuss

Re: [lustre-discuss] Lustre QoS using TBF (Strikwerda, Ger)

2020-10-26 Thread
Hi, you can find details in the Lustre manual, chapter 34.6.5. When enabling the TBF policy, you can specify one of the types (NID, JobID, OPCode, or UID/GID), or just use "tbf" to enable all of them for fine-grained classification of RPC requests. This feature also supports logical conditional

Re: [lustre-discuss] ls command blocked in some dir

2020-09-23 Thread
> Regards, > Knut > > On Wednesday, 23.09.2020, at 22:56 +0800, 肖正刚 wrote: > > Hi, all > > In one of our lustre

[lustre-discuss] ls command blocked in some dir

2020-09-23 Thread
Hi, all. In one of our Lustre filesystems, we found that: 1) the ls command blocked in some directories, but ls --color=never worked; 2) some files could not be accessed with cat/head/vim/file (I traced the command with "strace file .xxx", and it was stuck in lstat). " fstat(1,

[lustre-discuss] client syslog flood with "client_bulk_callback()) event type 2, status -103"

2020-09-09 Thread
Hi, all. After upgrading the Lustre client from 2.12.2 to 2.12.5, we found some clients flooded with messages like: " [Wed Sep 9 15:49:05 2020] LustreError: 3476:0:(events.c:200:client_bulk_callback()) event type 2, status -103, desc 9a1f3cafd000 [Wed Sep 9 15:49:06 2020] LustreError:
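For readers triaging messages like the one above, the numeric status is a negated errno. The following is a minimal sketch, not something from the thread: it pulls the "status N" field out of a log line and maps it to a symbolic name. The table holds a few Linux errno values hardcoded for illustration, and the helper name decode_status is hypothetical.

```python
import re

# A few Linux errno values (as defined in the Linux errno headers),
# hardcoded so the lookup does not depend on the machine running this.
LINUX_ERRNO = {
    103: "ECONNABORTED",  # software caused connection abort
    107: "ENOTCONN",      # transport endpoint is not connected
    108: "ESHUTDOWN",     # cannot send after transport endpoint shutdown
    110: "ETIMEDOUT",     # connection timed out
}

STATUS_RE = re.compile(r"status (-?\d+)")


def decode_status(log_line):
    """Return the errno name for the 'status N' field, or None if absent."""
    match = STATUS_RE.search(log_line)
    if match is None:
        return None
    code = abs(int(match.group(1)))
    return LINUX_ERRNO.get(code, f"unknown errno {code}")


line = ("LustreError: 3476:0:(events.c:200:client_bulk_callback()) "
        "event type 2, status -103, desc 9a1f3cafd000")
print(decode_status(line))
```

Knowing that -103 is ECONNABORTED only says the bulk transfer was aborted at the connection level; it does not by itself identify the cause.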

Re: [lustre-discuss] some clients dmesg filled up with "dirty page discard"

2020-08-29 Thread
Hi, Andreas. Thanks for your reply. Maybe this is a bug? We never hit this before updating the client to 2.12.5. On Sat, Aug 29, 2020 at 6:37 PM, Andreas Dilger wrote: > On Aug 25, 2020, at 17:42, 肖正刚 wrote: > > > no, on oss we found only the client who reported " dirty page discard " >

Re: [lustre-discuss] some clients dmesg filled up with "dirty page discard"

2020-08-25 Thread
: > The I/O was not fully committed after close() from the client. Are you > experiencing high numbers of evictions? > > On Tue, Aug 25, 2020 at 9:12 AM 肖正刚 wrote: > >> Hi, all >> >> We found that some clients' dmesg filled up with messages like >> "

[lustre-discuss] some clients dmesg filled up with "dirty page discard"

2020-08-25 Thread
Hi, all We found that some clients' dmesg filled up with messages like " Aug 24 19:54:34 ln5 kernel: Lustre: 13565:0:(llite_lib.c:2759:ll_dirty_page_discard_warn()) public1: dirty page discard: 10.10.2.11@o2ib:10.10.2.12@o2ib:/public1/fid: [0x27a82:0x1680f:0x0]/ may get corrupted (rc -108)

[lustre-discuss] depmod error when upgrade 2.12.2 to 2.12.5

2020-08-10 Thread
Hi, all. When upgrading from 2.12.2 to 2.12.5, we hit depmod errors. Can these be ignored, or how do we resolve them? Error info:
    depmod: ERROR: fstatat(4, ptlrpc.ko.xz): No such file or directory
    depmod: ERROR: fstatat(4, fld.ko.xz): No such file or directory
    depmod: ERROR: fstatat(4, mgs.ko.xz): No such file or
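As a first triage step for errors like these, it can help to list which module files depmod failed to stat. The sketch below is illustrative only: it parses the error lines quoted above (the last one truncated as in the post), and the helper name missing_modules is hypothetical. It does not diagnose why the files are missing.

```python
import re

# Matches the 'fstatat(<fd>, <file>)' part of a depmod error line.
FSTATAT_RE = re.compile(r"depmod: ERROR: fstatat\(\d+,\s*([^)]+)\)")


def missing_modules(depmod_output):
    """Return the module file names depmod reported it could not stat."""
    return FSTATAT_RE.findall(depmod_output)


sample = (
    "depmod: ERROR: fstatat(4, ptlrpc.ko.xz): No such file or directory\n"
    "depmod: ERROR: fstatat(4, fld.ko.xz): No such file or directory\n"
    "depmod: ERROR: fstatat(4, mgs.ko.xz): No such file or\n"
)
for name in missing_modules(sample):
    print(name)
```

The resulting names can then be checked against the module directories of the running kernel to see whether stale entries or dangling links remain from the old packages; that cause is an assumption, not something the post confirms.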

Re: [lustre-discuss] infiniband mlx5_0: dump_cqe:286:(pid 25761): dump error cqe

2020-07-30 Thread
Hi, thanks for your suggestion. But rebooting the OSSs in production under massive I/O pressure would be another long, long story. Regards. On Thu, Jul 30, 2020 at 11:31 PM, Weiss, Karsten wrote: > Hi! > > > > (Caveat: I ran into this issue not on Lustre but on HPC MPI jobs on CentOS > 7.7. They only run

[lustre-discuss] infiniband mlx5_0: dump_cqe:286:(pid 25761): dump error cqe

2020-07-30 Thread
Hi, all. We installed lustre-2.12.2 on both servers and clients. Recently, our OSS's syslog has been flooded with messages like the ones below: “ infiniband mlx5_0: dump_cqe:286:(pid 25761): dump error cqe : 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 0010: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00

Re: [lustre-discuss] Is there a ceiling of lustre filesystem a client can mount

2020-07-20 Thread
Hi, Alastair & Mark Hahn. Can mounted Lustre filesystems (same version) impact each other? Can the network become a bottleneck? Regards.

Re: [lustre-discuss] client /server version compatibility (Peeples, Heath)

2020-07-18 Thread
Hi, you can get some information from the changelog, e.g. http://wiki.lustre.org/Lustre_2.12.2_Changelog. BTW, we have tried a 2.12.2 server with 2.7.x and 2.5.x clients; it did not work. Regards.

Re: [lustre-discuss] Is there a ceiling of lustre filesystem a client can mount

2020-07-16 Thread
Hi, Mark Hahn. I really appreciate your detailed reply, and sorry for the ambiguous description. For some reasons, we decided not to expand the existing Lustre filesystem, so what I want to know is the number of Lustre filesystems that a client can mount at the same time. Best regards.

Re: [lustre-discuss] Is there a ceiling of lustre filesystem a client can mount

2020-07-15 Thread
Hi, Jongwoo & Andreas. Sorry for the ambiguous description. What I want to know is the number of Lustre filesystems that a client can mount at the same time. Thanks > > > Message: 1 > Date: Wed, 15 Jul 2020 14:29:10 +0800 > From: ??? > To: lustre-discuss@lists.lustre.org > Subject:

[lustre-discuss] Is there a ceiling of lustre filesystem a client can mount

2020-07-15 Thread
Hi, all. Is there a ceiling on the number of Lustre filesystems that can be mounted in a cluster? If so, what's the number? If not, how many is reasonable? Can mounting multiple filesystems affect the stability of each filesystem or cause other problems? Thanks!

Re: [lustre-discuss] getcwd() fails (Leonardo Saavedra)

2020-07-13 Thread
Hi, Leonardo. Thanks for your reply, but I found that using vasp-5.4.4 works around this issue, so we do not intend to upgrade the kernel for now. On Tue, Jul 14, 2020 at 9:14 AM, wrote: > Send lustre-discuss mailing list submissions to > lustre-discuss@lists.lustre.org > > To subscribe or

Re: [lustre-discuss] getcwd() fails

2020-07-12 Thread
On Fri, Jul 10, 2020 at 5:01 PM, wrote: > Hello, > > > On Fri, Jul 10, 2020 at 11:28 AM 肖正刚 wrote: > >> Hi all, >> >> We run lustre 2.12.2(both server) on CentOS 7.6, we hits getcwd >> error when ran vasp. >> Error message: >> forrtl: severe (121): Cannot a

[lustre-discuss] getcwd() fails

2020-07-10 Thread
Hi all, we run Lustre 2.12.2 (both server) on CentOS 7.6, and we hit a getcwd error when running VASP. Error message: forrtl: severe (121): Cannot access current working directory for unit 7, file "Unknown" Image PC Routine Line Source vasp_std

[lustre-discuss] mlx4 and mlx5 mix environment

2020-06-22 Thread
Hi, all. We set up a cluster using a mix of mlx4 and mlx5 drivers, and everything works well. Later I found some notes in the wiki at http://wiki.lustre.org/Infiniband_Configuration_Howto and at http://lists.onebuilding.org/pipermail/lustre-devel-lustre.org/2016-May/003842.html which were last edited in 2016. So do i

Re: [lustre-discuss] how to mapping of RPC rate to bandwidth/IOPS?

2020-06-13 Thread
Hi, Andreas. Thanks for your reply. It is much clearer to me now. Thanks. On Wed, Jun 10, 2020 at 11:00 AM, Andreas Dilger wrote: > On Jun 2, 2020, at 02:30, 肖正刚 wrote: > > > Hi all, > we use TBF policy(details: > https://jira.whamcloud.com/secure/attachment/14201/Lustre%20NRS%20TBF%20docum

[lustre-discuss] how to mapping of RPC rate to bandwidth/IOPS?

2020-06-02 Thread
Hi all, we use the TBF policy (details: https://jira.whamcloud.com/secure/attachment/14201/Lustre%20NRS%20TBF%20documentation%200.1.pdf) to limit the RPC rate coming from clients, but I do not know how to map rpcrate to bandwidth or IOPS. For example: if I set a client's rpcrate=10, how much bandwidth
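One way to reason about the question above is as an upper bound: if the rate limit counts RPCs per second, bandwidth cannot exceed the RPC rate times the largest bulk RPC size. The sketch below works through rpcrate=10 from the post. The RPC sizes of 1 MiB and 4 MiB are assumptions for illustration (the real value depends on the client's RPC size tuning), and the function name is hypothetical.

```python
def max_bandwidth_mib_per_s(rpc_rate, rpc_size_mib):
    """Upper bound on throughput if every RPC carries a full-size payload."""
    return rpc_rate * rpc_size_mib


# rpcrate=10 is from the question; the RPC sizes are assumed values.
for size_mib in (1, 4):
    bound = max_bandwidth_mib_per_s(10, size_mib)
    print(f"rpcrate=10, {size_mib} MiB RPCs: at most {bound} MiB/s")
```

Actual throughput will be lower whenever RPCs are not full, and for small I/O that cannot be aggregated the RPC rate itself is roughly the IOPS ceiling.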

[lustre-discuss] file restored after modify

2020-05-08 Thread
Hi all, we use Lustre 2.12.2 and hit a strange problem this morning: some files were restored to their old contents after being modified. I modified job.sh, and about 1 minute later the file was restored. When I copied job.sh to test.sh and then modified test.sh, test.sh was not restored. When I modified job.sh as root, the file was not restored; then i use user

Re: [lustre-discuss] can not rebuild lustre 2.12.4 use src rpm package

2020-04-03 Thread
I got around this by disabling the build of the Lustre tests:
    cp rpmbuild/SOURCES/lustre-2.12.4.tar.gz .
    tar -xvf lustre-2.12.4.tar.gz
    cd lustre-2.12.4/
    ./configure --enable-server --disable-tests
    make rpms
On Fri, Apr 3, 2020 at 3:33 PM, 肖正刚 wrote: > Hi, > > i rebuild 2.12.4 clients in cento

[lustre-discuss] can not rebuild lustre 2.12.4 use src rpm package

2020-04-03 Thread
Hi, I rebuilt the 2.12.4 client on CentOS 7.6 with the command line: rpmbuild --rebuild lustre-2.12.4-1.src.rpm. No errors were reported during configure, but when compiling I saw this error: Finding Provides: /usr/lib/rpm/redhat/find-provides Finding Requires(interp): Finding Requires(rpmlib): Finding

[lustre-discuss] MDS HIT LBUG

2020-04-01 Thread
Hi, our MDS hit a bug (a Red Hat issue) as described in https://jira.whamcloud.com/browse/LU-10678 and https://jira.whamcloud.com/browse/LU-11786. Our kernel version: 3.10.0-957.10.1.el7_lustre.x86_64; Lustre version: 2.12.2; OS version: CentOS 7.6. Red Hat said the kernel bug was resolved in

Re: [lustre-discuss] confused about mdt space

2020-04-01 Thread
Now it is clear to me. Thanks, Richard! On Wed, Apr 1, 2020 at 10:59 PM, Mohr Jr, Richard Frank wrote: > > > > On Apr 1, 2020, at 10:07 AM, Mohr Jr, Richard Frank > wrote: > > > > > > > >> On Apr 1, 2020, at 3:55 AM, 肖正刚 wrote: > >> > >> For

Re: [lustre-discuss] confused about mdt space

2020-04-01 Thread
41% of MDT disk space is consumed by inodes. But in the manual I found that the default "inode ratio" is 2 KB, so where does the additional 0.5 KB come from? Thanks. On Wed, Apr 1, 2020 at 1:00 PM, 肖正刚 wrote: > Thanks a lot. > I have two more questions: > 1) Assume I consider the mdt space use the method
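The arithmetic behind the question above can be sketched as follows. The share of a formatted device reserved for inode tables is roughly the inode size divided by the bytes-per-inode ratio. The 2 KB and 2.5 KB ratios come from the post; the 1024-byte inode size is an assumption made here for illustration, and the function name is hypothetical.

```python
def inode_table_fraction(inode_size, bytes_per_inode):
    """Approximate share of the device taken up by preallocated inodes."""
    return inode_size / bytes_per_inode


INODE_SIZE = 1024  # bytes; assumed value, not stated in the thread

for ratio in (2048, 2560):  # 2 KB from the manual, 2 KB + 0.5 KB from the post
    share = inode_table_fraction(INODE_SIZE, ratio)
    print(f"bytes per inode = {ratio}: inodes take about {share:.0%} of the device")
```

Under that assumed inode size, a 2.5 KB ratio gives about 40%, which is close to the 41% reported, while a 2 KB ratio would give 50%.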

Re: [lustre-discuss] confused about mdt space

2020-03-31 Thread
, right? 2) The MDS needs additional space for other uses, such as logs, ACLs, and xattrs; how can this space be estimated? Thanks! On Tue, Mar 31, 2020 at 9:57 PM, Mohr Jr, Richard Frank wrote: > > > > On Mar 30, 2020, at 10:56 PM, 肖正刚 wrote: > > > > Hello, I have some question about metadata space. >

[lustre-discuss] confused about mdt space

2020-03-30 Thread
Hello, I have some questions about metadata space. 1) I have ten 960 GB SAS SSDs for the MDT; after building RAID 10, we have 4.7 TB of free space. After formatting it as an MDT, we only have 2.6 TB free, so where did the other 2.1 TB go? 2) What is the remaining 2.6 TB used for?