Caching of file descriptors can be disabled with "Cache_FDs = FALSE;" in the
CACHEINODE {} block.
Regards, The Spectrum Scale (GPFS) team
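As a sketch of where that setting lives: it would go in the ganesha configuration file (commonly /etc/ganesha/ganesha.conf; exact block syntax depends on the ganesha version, so treat this as an assumed layout rather than a verified one):

```ini
# Hypothetical ganesha.conf fragment: disables file-descriptor caching
# as described above. Block and option names as quoted in the thread.
CACHEINODE {
    Cache_FDs = FALSE;
}
```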
Forwarding Malahal's reply.
The max open files limit of the ganesha process is set internally (and written to
the /etc/sysconfig/ganesha file as the NOFILE parameter) based on the MFTC (max
files to cache) GPFS parameter.
Regards, The Spectrum Scale (GPFS) team
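The relationship can be inspected on a CES node; a hedged sketch (the /etc/sysconfig/ganesha path is from the thread, and the current shell stands in for the ganesha.nfsd PID, which you would substitute on a real node):

```shell
# NOFILE as written by the CES tooling (path per the thread; commented out
# because the file only exists on a CES node):
#   grep NOFILE /etc/sysconfig/ganesha
# Any process's effective open-file limits are visible in /proc; using the
# current shell ($$) as a stand-in for the ganesha.nfsd PID.
grep 'Max open files' /proc/$$/limits
```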
On 3/24/17 12:17 PM, IBM Spectrum Scale wrote:
also running version 4.2.2.2.
On 3/24/17 2:57 PM, Matt Weil wrote:
On 3/24/17 1:13 PM, Bryan Banister wrote:
Hi Vipul,
Hmm… interesting. We have dedicated systems running CES and nothing else, so
the only thing opening files on GPFS is ganesha. IBM Support recommended we
massively increase the maxFilesToCache to fix the performance issues we were
having.
Thanks Sven!
We recently upgraded to 4.2.2 and will see about lowering the maxFilesToCache to
something more appropriate. We’re not offering NFS access as a performance
solution… but it can’t come to a crawl either!
Your help is greatly appreciated as always,
-Bryan
Changes in the ganesha management code were made in April 2016 to reduce the
need for a high maxFilesToCache value: the ganesha daemon adjusts its allowed
file cache by reading the maxFilesToCache value and then reducing its
allowed NOFILE value. The code shipped with the 4.2.2 release.
you want a high
I believe it was created with -n 5000. Here's the exact command that was
used:
/usr/lpp/mmfs/bin/mmcrfs dnb03 -F ./disc_mmcrnsd_dnb03.lst -T
/gpfsm/dnb03 -j cluster -B 1M -n 5000 -N 20M -r1 -R2 -m2 -M2 -A no -Q
yes -v yes -i 512 --metadata-block-size=256K -L 8388608
-Aaron
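The -n value is the estimated number of nodes that will mount the filesystem; it sizes internal data structures at creation time. A sketch of how the recorded value could be confirmed afterwards (GPFS cluster commands, shown for reference only and assuming mmlsfs accepts the -n attribute flag):

```shell
# Show the estimated-nodes attribute recorded when the filesystem was created
/usr/lpp/mmfs/bin/mmlsfs dnb03 -n
```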
On 3/24/17 2:05
was this filesystem created with -n 5000 ? or was that changed later with
mmchfs ?
please send the mmlsconfig/mmlscluster output to me at oeh...@us.ibm.com
On Fri, Mar 24, 2017 at 10:58 AM Aaron Knister
wrote:
> I feel a little awkward about posting lists of IP's
It's large, I do know that much. I'll defer to one of our other storage
admins. Jordan, do you have that number handy?
-Aaron
On 3/24/17 2:03 PM, Fosburgh,Jonathan wrote:
7PB filesystem and only 28 million inodes in use? What is your average file
size? Our large filesystem is 7.5P (currently 71% used) with over 1 billion
inodes in use.
--
Jonathan Fosburgh
Principal Application Systems Analyst
Storage Team
IT Operations
I feel a little awkward about posting lists of IP's and hostnames on
the mailing list (even though they're all internal) but I'm happy to
send to you directly. I've attached both an lsfs and an mmdf output of
the fs in question here since that may be useful for others to see. Just
a note
Thanks Bob, Jonathan.
We're running GPFS 4.1.1.10 and no HSM/LTFSEE.
I'm currently gathering, as requested, a snap from all nodes (with
traces). With 3500 nodes this ought to be entertaining.
-Aaron
On 3/24/17 12:50 PM, Oesterlin, Robert wrote:
Hi Aaron
Yes, I have seen this several times
ok, that seems a different problem than i was thinking.
can you send output of mmlscluster, mmlsconfig, mmlsfs all ?
also are you getting close to fill grade on inodes or capacity on any of
the filesystems ?
sven
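Inode and capacity fill grade can be checked quickly; a sketch (df -i works on any mounted filesystem, while mmdf is the GPFS-specific command and is only shown as a comment since it needs a live cluster):

```shell
# Generic view: inode usage of a mounted filesystem
df -i /
# GPFS view with per-pool and free-inode detail (cluster command, reference only):
#   /usr/lpp/mmfs/bin/mmdf dnb03
```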
On Fri, Mar 24, 2017 at 10:34 AM Aaron Knister
wrote:
you must be on sles as this segfaults only on sles to my knowledge :-)
i am looking for an NSD or manager node in your cluster that runs at 100%
cpu usage.
do you have zimon deployed to look at cpu utilization across your nodes ?
sven
On Fri, Mar 24, 2017 at 10:08 AM Aaron Knister
Hi Bryan,
Making sure Malahal's reply was received by the user group.
>> Then we noticed that the CES host had 5.4 million files open
This is technically not possible with ganesha alone. A process can only
open 1 million files on a RHEL distro. Either we have leaks in the kernel or
some other
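The "1 million files" ceiling corresponds to the kernel's fs.nr_open default on RHEL (1048576), which is the hard upper bound on any single process's NOFILE limit; a quick generic-Linux check:

```shell
# Hard ceiling on per-process file descriptors (RHEL default: 1048576)
cat /proc/sys/fs/nr_open
```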
Hi Sven,
Which NSD server should I run top on, the fs manager? If so the CPU load
is about 155%. I'm working on perf top but not off to a great start...
# perf top
PerfTop:    1095 irqs/sec  kernel:61.9%  exact: 0.0%  [1000Hz cycles],  (all, 28 CPUs)
while this is happening run top and see if there is very high cpu
utilization at this time on the NSD server.
if there is, run perf top (you might need to install the perf command) and see
if the top cpu contender is a spinlock. if so, send a screenshot of perf
top as i may know what that is and
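The steps above can be sketched as generic commands (the spinlock symbol name and the package name `perf` are assumptions, not from the thread):

```shell
# 1. Check for a hot core on the suspect NSD server
top -b -n 1 | head -n 15
# 2. Profile it; a spinlock such as _raw_spin_lock dominating the sample list
#    is the symptom described above (perf may need installing first,
#    e.g. yum install perf). Commented out since it runs interactively:
# perf top
```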
Hi Aaron
Yes, I have seen this several times over the last 6 months. I opened at least
one PMR on it and they never could track it down. I did some snap dumps but
without some traces, they did not have enough. I ended up getting out of it by
selectively rebooting some of my NSD servers. My
This is one of the more annoying long waiter problems. We've seen it several
times and I'm not sure they all had the same cause.
What version of GPFS?
Do you have anything like Tivoli HSM or LTFSEE?
Since yesterday morning we've noticed some deadlocks on one of our
filesystems that seem to be triggered by writing to it. The waiters on
the clients look like this:
0x19450B0 ( 6730) waiting 2063.294589599 seconds, SyncHandlerThread:
on ThCond 0x1802585CB10 (0xC9002585CB10)
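Waiters like the one above can be listed directly on any node; a sketch using the standard GPFS diagnostic command (cluster-only, shown for reference):

```shell
# Dump current waiters on the local node (run as root on a GPFS node)
/usr/lpp/mmfs/bin/mmdiag --waiters
```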