Re: [lustre-discuss] Lustre client memory and MemoryAvailable

2019-04-15 Thread Jacek Tomaka
I did: for i in {1..100}; do cat /proc/36960/stack >$i; sleep 1; done in one bash shell, and in the other one (36960): time -p echo 3 >/proc/sys/vm/drop_caches It took about two minutes; unfortunately, most of the time it claims that it was not doing anything kernel side: [] 0x with the ex

Re: [lustre-discuss] Lustre client memory and MemoryAvailable

2019-04-15 Thread Jacek Tomaka
>That would be interesting. About a dozen copies of > cat /proc/$PID/stack >taken in quick succession would be best, where $PID is the pid of >the shell process which wrote to drop_caches. Will do later today. I have found a candidate node with the problem, just need to wait for the current task

Re: [lustre-discuss] Lustre client memory and MemoryAvailable

2019-04-15 Thread NeilBrown
On Mon, Apr 15 2019, Jacek Tomaka wrote: > Thanks Patrick for getting the ball rolling! > >>1/ w.r.t drop_caches, "2" is *not* "inode and dentry". The '2' bit >> causes all registered shrinkers to be run, until they report there is >> nothing left that can be discarded. If this is taking 10 mi

Re: [lustre-discuss] inodes not adding up

2019-04-15 Thread Mohr Jr, Richard Frank (Rick Mohr)
> On Apr 13, 2019, at 4:57 AM, Youssef Eldakar wrote: > > For one Lustre filesystem, inode count in the summary is notably less than > what the individual OST inode counts would add up to: The first thing to understand is that every Lustre file will consume one inode on the MDT, and this ino

Re: [lustre-discuss] lfsck repair quota

2019-04-15 Thread Fernando Perez
Dear Lustre users, Could anyone confirm that the correct way to repair wrong quotas on an ldiskfs MDT is lctl lfsck_start -t layout -A? Thanks in advance. Regards. = Fernando Pérez Institut de Ciències del Mar (CSIC) Departament Oceanografía Físi

[lustre-discuss] Problem with OPA fabric leading to unexpected lustre behaviour

2019-04-15 Thread Kurt Strosahl
Good Morning, I'm presently working on an issue with my OPA network that seems to be having an unusual impact on Lustre. What happens is that when one of the nodes on the OPA fabric reboots, it sometimes has trouble reaching one of the four LNet routers that we have set up. This isn't, i

[lustre-discuss] File missing with "Invalid argument" error

2019-04-15 Thread Tung-Han Hsieh
Dear All, We are facing a serious problem after a mistake made during Lustre (1.8.8) maintenance. We had a bad OST and wanted to remove it. So we went to the MDS and ran lctl conf_param foo-OST.osc.active=0 After doing this, on the MDS there are still logs residing in the /proc/fs/lustre/osc/ director