Re: [lustre-discuss] Slow mount on clients

2020-02-04 Thread Åke Sandgren
Which then of course means that if the MDS HA has done a failover it
will be slow to mount until it's back in its usual place...

Used to happen to us very often when the server side was somewhat unstable.
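
A minimal sketch of the NID-order point made in the quoted message below: if you know which node currently hosts the MGS, you can put its NID first in the mount spec before mounting. The helper names (split_mount_spec, prefer_nid) are hypothetical, and the parsing assumes the simple nid1:nid2:/fsname form shown in this thread; it only rewrites the string and does not probe the servers.

```python
def split_mount_spec(spec):
    """Split 'nid1:nid2:/fsname' into (['nid1', 'nid2'], 'fsname')."""
    nid_part, sep, fsname = spec.rpartition(":/")
    if not sep or not nid_part or not fsname:
        raise ValueError(f"not a Lustre mount spec: {spec!r}")
    return nid_part.split(":"), fsname


def prefer_nid(spec, active_nid):
    """Return the spec with active_nid moved to the front of the NID list."""
    nids, fsname = split_mount_spec(spec)
    if active_nid not in nids:
        raise ValueError(f"{active_nid} is not listed in {spec!r}")
    # Keep the remaining NIDs in their original relative order.
    ordered = [active_nid] + [n for n in nids if n != active_nid]
    return ":".join(ordered) + ":/" + fsname


# Assumption: the MGS is currently running on 10.210.1.101, as in the quote.
print(prefer_nid("10.210.1.102@tcp:10.210.1.101@tcp:/fs2", "10.210.1.101@tcp"))
# prints 10.210.1.101@tcp:10.210.1.102@tcp:/fs2
```

This only helps while the MGS stays where you expect it; after a failover the "preferred" NID is the wrong one again.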

On 2/4/20 8:55 AM, Moreno Diego (ID SIS) wrote:
> Not sure if it's your case but the order of MGS' NIDs when mounting matters:
> 
> [root@my-ms-01xx-yy ~]# time mount -t lustre 10.210.1.101@tcp:10.210.1.102@tcp:/fs2 /scratch
> 
> real    0m0.215s
> user    0m0.007s
> sys     0m0.059s
> 
> [root@my-ms-01xx-yy ~]# time mount -t lustre 10.210.1.102@tcp:10.210.1.101@tcp:/fs2 /scratch
> 
> real    0m25.196s
> user    0m0.009s
> sys     0m0.033s
> 
> Since the MGS is running on the node having the IP "10.210.1.101", if we 
> first try with the other one there seems to be a timeout of 25s.
> 
> Diego
>  
> 
> On 03.02.20, 23:17, "lustre-discuss on behalf of Andrew Elwell" 
>  andrew.elw...@gmail.com> wrote:
> 
> Hi Folks,
> 
> One of our (recently built) 2.10.x filesystems is slow to mount on
> clients (~20 seconds) whereas the others are nigh on instantaneous.
> 
> We saw this before with a 2.7 filesystem; the slowness went away after
> we did something, but we've no idea what.
> 
> Nothing obvious in the logs.
> 
> Does anyone have suggestions for what causes this, and how to make it
> faster? It's annoying me as "something" isn't right but I can't
> identify what.
> 
> 
> Many thanks
> 
> Andrew
> ___
> lustre-discuss mailing list
> lustre-discuss@lists.lustre.org
> http://lists.lustre.org/listinfo.cgi/lustre-discuss-lustre.org
> 

-- 
Ake Sandgren, HPC2N, Umea University, S-90187 Umea, Sweden
Internet: a...@hpc2n.umu.se   Phone: +46 90 7866134 Fax: +46 90-580 14
Mobile: +46 70 7716134 WWW: http://www.hpc2n.umu.se


Re: [lustre-discuss] Slow mount on clients

2020-02-04 Thread Andrew Elwell
> HA / MGS running on second node in fstab
:-) that was one of the first things we checked, and I've tried
manually mounting it but no change

10.10.36.224@o2ib4:10.10.36.225@o2ib4:/askapfs1  3.7P  3.0P  507T  86%  /askapbuffer

hpc-admin2:~ # lctl ping 10.10.36.224@o2ib4
12345-0@lo
12345-10.10.36.224@o2ib4
hpc-admin2:~ # lctl ping 10.10.36.225@o2ib4
12345-0@lo
12345-10.10.36.225@o2ib4
hpc-admin2:~ # umount /askapbuffer
hpc-admin2:~ # time mount /askapbuffer/

real 1m15.099s
user 0m0.012s
sys 0m0.021s
hpc-admin2:~ #

and on the server:

[root@askap-fs1-mds01 ~]# mount -t lustre
/dev/mapper/array00_2 on /lustre/MGS type lustre (ro)
/dev/mapper/array00_1 on /lustre/askapfs1-MDT0001 type lustre (ro)
[root@askap-fs1-mds01 ~]# lctl list_nids
10.10.36.224@o2ib4
[root@askap-fs1-mds01 ~]# tunefs.lustre --dryrun /dev/mapper/array00_2
checking for existing Lustre data: found
Reading CONFIGS/mountdata

   Read previous values:
Target: MGS
Index:  unassigned
Lustre FS:  askapfs1
Mount type: ldiskfs
Flags:  0x1004
  (MGS no_primnode )
Persistent mount opts: user_xattr,errors=remount-ro
Parameters: failover.node=10.10.36.224@o2ib4:10.10.36.225@o2ib4


   Permanent disk data:
Target: MGS
Index:  unassigned
Lustre FS:  askapfs1
Mount type: ldiskfs
Flags:  0x1004
  (MGS no_primnode )
Persistent mount opts: user_xattr,errors=remount-ro
Parameters: failover.node=10.10.36.224@o2ib4:10.10.36.225@o2ib4

exiting before disk write.
[root@askap-fs1-mds01 ~]#


(MDT is mounted on the other node at this time).
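
As a rough sketch, the failover NIDs in the tunefs.lustre --dryrun output above can be pulled out and compared against the NIDs a client uses in its mount spec. The function names are hypothetical, and the parsing assumes the "failover.node=nid1:nid2" form shown above (colon-separated, one NID per node); it does not handle comma-separated multi-NID entries.

```python
import re


def failover_nids(tunefs_output):
    """Return the unique failover.node NIDs, in order of first appearance."""
    nids = []
    for value in re.findall(r"failover\.node=(\S+)", tunefs_output):
        for nid in value.split(":"):
            if nid not in nids:
                nids.append(nid)
    return nids


def same_nid_order(client_nids, tunefs_output):
    """True if the client lists the same NIDs in the same order."""
    return list(client_nids) == failover_nids(tunefs_output)


# Sample line copied from the dry-run output in this message.
sample = "Parameters: failover.node=10.10.36.224@o2ib4:10.10.36.225@o2ib4\n"
print(failover_nids(sample))
print(same_nid_order(["10.10.36.224@o2ib4", "10.10.36.225@o2ib4"], sample))
```

A match here only shows the client and the MGS configuration agree on the NID list; it says nothing about which node is serving the MGS at the moment.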


[lustre-discuss] Slow release of inodes on OST

2020-02-04 Thread Åke Sandgren
Hi!

When I create a large number of files on an OST and then remove them,
the used inode count on the OST decreases very slowly: it takes several
hours to go from 3M to the correct ~10k.

(I'm running the io500 test suite)

Is there something I can do to make it release them faster?
Right now it has gone from 3M to 1.5M in 6 hours (according to lfs df -i).

These are SSD-based OSTs, in case it matters.
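
To put the numbers above in perspective, here is a small back-of-the-envelope sketch. It assumes the release rate stays constant (which may well not hold), and the function name hours_to_target is made up for illustration.

```python
def hours_to_target(start_count, current_count, elapsed_hours, target_count):
    """Estimate hours left until target_count, assuming a constant rate."""
    released = start_count - current_count
    if released <= 0 or elapsed_hours <= 0:
        raise ValueError("no measurable release rate yet")
    rate_per_hour = released / elapsed_hours
    return max(current_count - target_count, 0) / rate_per_hour


# Figures from this message: 3M -> 1.5M used inodes in 6 hours, target ~10k.
remaining = hours_to_target(3_000_000, 1_500_000, 6, 10_000)
print(f"{remaining:.2f} hours left at the observed rate")
# prints 5.96 hours left at the observed rate
```

That works out to roughly 250k inodes released per hour, so about 12 hours in total for the full 3M.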



Re: [lustre-discuss] Slow release of inodes on OST

2020-02-04 Thread Åke Sandgren
Forgot to mention that I'm running 2.13.0 from git on the servers.

On 2/4/20 3:23 PM, Åke Sandgren wrote:
> Hi!
> 
> When I create a large number of files on an OST and then remove them,
> the used inode count on the OST decreases very slowly: it takes several
> hours to go from 3M to the correct ~10k.
> 
> (I'm running the io500 test suite)
> 
> Is there something I can do to make it release them faster?
> Right now it has gone from 3M to 1.5M in 6 hours (according to lfs df -i).
> 
> These are SSD-based OSTs, in case it matters.
> 
