Actually, I think it is just a bug in the way the slab caches are created. Some of them should be created with a flag marking them as reclaimable, e.g. something like: https://patchwork.kernel.org/patch/9360819/
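For illustration, here is a minimal sketch of the kind of change meant above, assuming the fix is to pass SLAB_RECLAIM_ACCOUNT when the cache is created; the structure and cache name are made up for the example and are not the actual Lustre code or the contents of the linked patch:

#include <linux/module.h>
#include <linux/slab.h>
#include <linux/types.h>

/*
 * Illustrative only. A cache created with SLAB_RECLAIM_ACCOUNT has its
 * pages accounted as SReclaimable in /proc/meminfo (and therefore
 * counted towards the MemAvailable estimate) instead of SUnreclaim.
 */
struct example_obj {
        u64 payload[16];        /* stand-in for the real object */
};

static struct kmem_cache *example_cache;

static int __init example_init(void)
{
        example_cache = kmem_cache_create("example_obj_kmem",
                                          sizeof(struct example_obj), 0,
                                          SLAB_RECLAIM_ACCOUNT, /* mark as reclaimable */
                                          NULL);
        return example_cache ? 0 : -ENOMEM;
}

static void __exit example_exit(void)
{
        kmem_cache_destroy(example_cache);
}

module_init(example_init);
module_exit(example_exit);
MODULE_LICENSE("GPL");

Note that SLAB_RECLAIM_ACCOUNT only changes the accounting; the memory still has to be given back by a shrinker under pressure, which the results quoted below suggest already happens, since generating memory pressure does free the caches.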
Regards.
Jacek Tomaka

On Sun, Apr 14, 2019 at 3:27 PM Jacek Tomaka <jac...@dug.com> wrote:
> Hello,
>
> TL;DR:
> Is there a way to figure out how much memory Lustre will make available
> under memory pressure?
>
> Details:
> We are running the Lustre client on Intel Xeon Phi (KNL) machines with
> 128GB of memory (CentOS 7), and in certain situations we see ~10GB+ of
> memory allocated on the kernel side, e.g.:
>
> vvp_object_kmem   3535336 3536986    176   46    2 : tunables    0    0    0 : slabdata  76891  76891      0
> ll_thread_kmem      33511   33511    344   47    4 : tunables    0    0    0 : slabdata    713    713      0
> lov_session_kmem    34760   34760    592   55    8 : tunables    0    0    0 : slabdata    632    632      0
> osc_extent_kmem   3549831 3551232    168   48    2 : tunables    0    0    0 : slabdata  73984  73984      0
> osc_thread_kmem     14012   14116   2832   11    8 : tunables    0    0    0 : slabdata   1286   1286      0
> osc_object_kmem   3546640 3548350    304   53    4 : tunables    0    0    0 : slabdata  66950  66950      0
> signal_cache      3702537 3707144   1152   28    8 : tunables    0    0    0 : slabdata 132398 132398      0
>
> /proc/meminfo:
> MemAvailable:   114196044 kB
> Slab:            11641808 kB
> SReclaimable:     1410732 kB
> SUnreclaim:      10231076 kB
>
> After executing
>
> echo 1 >/proc/sys/vm/drop_caches
>
> the slabinfo values don't change, but when I actually generate memory
> pressure with:
>
> java -Xmx117G -Xms117G -XX:+AlwaysPreTouch -version
>
> lots of memory gets freed:
>
> vvp_object_kmem    127650  127880    176   46    2 : tunables    0    0    0 : slabdata   2780   2780      0
> ll_thread_kmem      33558   33558    344   47    4 : tunables    0    0    0 : slabdata    714    714      0
> lov_session_kmem    34815   34815    592   55    8 : tunables    0    0    0 : slabdata    633    633      0
> osc_extent_kmem    128640  128880    168   48    2 : tunables    0    0    0 : slabdata   2685   2685      0
> osc_thread_kmem     14038   14116   2832   11    8 : tunables    0    0    0 : slabdata   1286   1286      0
> osc_object_kmem     82998   83263    304   53    4 : tunables    0    0    0 : slabdata   1571   1571      0
> signal_cache        38734   44268   1152   28    8 : tunables    0    0    0 : slabdata   1581   1581      0
>
> /proc/meminfo:
> MemAvailable:   123146076 kB
> Slab:             1959160 kB
> SReclaimable:      334276 kB
> SUnreclaim:       1624884 kB
>
> We see a similar effect when executing:
>
> echo 3 >/proc/sys/vm/drop_caches
>
> but this can take a very long time (10 minutes).
>
> So essentially, on a machine using the Lustre client, MemAvailable is no
> longer a good predictor of the amount of memory that can be allocated.
> Is there a way to query Lustre and compensate for the Lustre cache memory
> that will be made available under memory pressure?
>
> Regards.
> --
> Jacek Tomaka

--
Jacek Tomaka
Geophysical Software Developer

DownUnder GeoSolutions
76 Kings Park Road
West Perth 6005 WA, Australia
tel +61 8 9287 4143
jac...@dug.com
www.dug.com
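In the meantime, a rough workaround for the question in the quoted message is to sum active_objs * objsize for the Lustre client caches in /proc/slabinfo and treat that as additional memory likely to be freed under pressure, on top of MemAvailable. A minimal sketch follows; the cache-name list is an assumption based on the output above, not an interface Lustre provides, and reading /proc/slabinfo typically requires root:

/* Estimate memory held by (assumed) Lustre client slab caches by
 * summing active_objs * objsize from /proc/slabinfo. Illustrative only. */
#include <stdio.h>
#include <string.h>

int main(void)
{
        static const char *caches[] = {
                "vvp_object_kmem", "osc_extent_kmem", "osc_object_kmem",
                "lov_session_kmem", "ll_thread_kmem", "osc_thread_kmem",
        };
        FILE *f = fopen("/proc/slabinfo", "r");
        char line[512];
        unsigned long long total = 0;

        if (!f) {
                perror("/proc/slabinfo");
                return 1;
        }
        while (fgets(line, sizeof(line), f)) {
                char name[64];
                unsigned long active, num, objsize;
                size_t i;

                /* header lines fail this parse and are skipped */
                if (sscanf(line, "%63s %lu %lu %lu",
                           name, &active, &num, &objsize) != 4)
                        continue;
                for (i = 0; i < sizeof(caches) / sizeof(caches[0]); i++)
                        if (!strcmp(name, caches[i]))
                                total += (unsigned long long)active * objsize;
        }
        fclose(f);
        printf("Lustre client slab usage: %llu MiB\n", total >> 20);
        return 0;
}

This is only an approximation: it counts the listed slab objects, not everything the Lustre client may give back (or hold on to) under real memory pressure.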