Also, if snapshotting multiple filesets, it's important to group these into
a single mmcrsnapshot command. Then you get a single quiesce, instead of
one per fileset.
i.e. do:
snapname=$(date --utc +@GMT-%Y.%m.%d-%H.%M.%S)
mmcrsnapshot gpfs0 fileset1:$snapname,fileset2:$snapname,fileset3:$snapname
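For many filesets, the list can be built in a loop. A hedged sketch (fileset names are examples; the trailing echo prints the command instead of running it):

```shell
# Build the fileset:snapshot list so all filesets are quiesced once
snapname=$(date --utc +@GMT-%Y.%m.%d-%H.%M.%S)
list=""
for fs in fileset1 fileset2 fileset3; do
    list="${list:+$list,}$fs:$snapname"
done
echo mmcrsnapshot gpfs0 "$list"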
Just ran an upgrade on an EMS, and the only changes I see are these updated
packages on the ems:
+gpfs.docs-5.1.2-0.9.noarch            Mon 20 Dec 2021 11:56:43 AM CET
+gpfs.ess.firmware-6.0.0-15.ppc64le    Mon 20 Dec 2021 11:56:42 AM CET
+gpfs.msg.en_US-5.1.2-0.9.noarch
I believe this should be a fully working solution. I see no problem
enabling RDMA between a subset of nodes -- just disable verbsRdma on the
nodes you want to use plain IP.
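A hedged sketch of the commands involved (node names are examples; verify the exact mmchconfig syntax for your release):

```
# Disable RDMA on the nodes that should use plain IP
mmchconfig verbsRdma=disable -N node5,node6
# Check the per-node value afterwards
mmlsconfig verbsRdma
```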
-jf
On Thu, Dec 9, 2021 at 11:04 AM Walter Sklenka
wrote:
> Dear spectrum scale users!
>
> May I ask you a design quest
> I have also played with the -A and -a parameters with no combination that I
> can find making it any better.
>
>
>
> Thanks for the feedback.
>
>
>
> *From:* gpfsug-discuss-boun...@spectrumscale.org <
> gpfsug-discuss-boun...@spectrumscale.org> *On Behalf Of
So… the advertisement says we should be able to do 1M files/s…
http://files.gpfsug.org/presentations/2018/USA/SpectrumScalePolicyBP.pdf
First thing I would try is whether limiting which nodes are used for the
processing helps. Maybe limit it to the NSD servers (-N nodenames)?
Also, --choice-algorithm
eo
>
> Paul Scherrer Institut
> Dr. Leonardo Sala
> Group Leader High Performance Computing
> Deputy Section Head Science IT
> Science IT
> WHGA/036
> Forschungstrasse 111
> 5232 Villigen PSI
> Switzerland
>
> Phone: +41 56 310 3369
> leonardo.s...@psi.ch
> www.psi.ch
I’ve done this a few times. Once with IPoIB as daemon network, and then
created a separate routed network on the hypervisor to bridge (?) between
VM and IPoIB network.
Example RHEL config where bond0 is an IP-over-IB bond on the hypervisor:
To give the VMs access to the daemon network, w
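The example was cut off above; a hedged sketch of the general shape of such a routed setup, since IPoIB cannot be L2-bridged (all addresses and interface names are made up):

```
# On the hypervisor: allow forwarding between the VM network and bond0
echo 'net.ipv4.ip_forward = 1' > /etc/sysctl.d/90-vmrouting.conf
sysctl -p /etc/sysctl.d/90-vmrouting.conf

# /etc/sysconfig/network-scripts/ifcfg-virbr1  (VM-facing bridge)
DEVICE=virbr1
TYPE=Bridge
ONBOOT=yes
BOOTPROTO=static
IPADDR=192.168.100.1
NETMASK=255.255.255.0

# In each VM: route the daemon network via the hypervisor's bridge IP
ip route add 10.10.0.0/16 via 192.168.100.1

# On the other daemon-network nodes: route the VM subnet back via the
# hypervisor's bond0 address
ip route add 192.168.100.0/24 via 10.10.0.5
```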
One thing to check: Storwize/SVC code will *always* guess wrong on
prefetching for GPFS. You can see this as much higher read data
throughput on mdisks vs. on vdisks in the webui. To fix it, disable
cache_prefetch with "chsystem -cache_prefetch off".
This being a global setting, you pr
It has features for both being an Object store for other applications
(running openstack swift/S3), and for migrating/tiering filesystem data to
an object store like Amazon S3, IBM COS, etc...
-jf
fre. 21. mai 2021 kl. 10:42 skrev David Reynolds :
> When we talk about supported protocols on S
A couple of ideas.
The KC recommends adding WEIGHT(DIRECTORY_HASH) to group deletions within a
directory. Then maybe also do it as a 2-step process, in the same policy
run. Where you delete all non-directories first, and then deletes the
directories in a depth-first order using WEIGHT(Length(PATH_
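An untested, hedged sketch of what such a two-rule policy might look like (rule names invented; check the KC for the exact attribute tests):

```
/* Pass 1: delete non-directories, grouped per directory */
RULE 'delFiles' DELETE
  WEIGHT(DIRECTORY_HASH)
  WHERE NOT (MISC_ATTRIBUTES LIKE '%D%')

/* Pass 2: delete the now-empty directories, deepest paths first */
RULE 'delDirs' DELETE
  DIRECTORIES_PLUS
  WEIGHT(LENGTH(PATH_NAME))
  WHERE MISC_ATTRIBUTES LIKE '%D%'
```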
Have you installed the gpfs.pm-ganesha package, and do you have any active
NFS exports/clients ?
-jf
On Tue, Apr 20, 2021 at 12:19 PM Dorigo Alvise (PSI)
wrote:
> Dear Community,
>
>
>
> I’ve activated CES-related metrics by simply doing:
>
> [root@xbl-ces-91 ~]# mmperfmon config show |egrep
No — all copying between filesets requires a full data copy. No simple rename.
This might be worthy of an RFE, as it’s a bit unexpected, and could
potentially work more efficiently..
-jf
man. 22. mar. 2021 kl. 10:39 skrev Ulrich Sibiller <
u.sibil...@science-computing.de>:
> Hello,
>
> we usua
I’ve tried benchmarking many vs. few vdisks per RG, and never could see any
performance difference.
Usually we create 1 vdisk per enclosure per RG, thinking this will allow
us to grow with same size vdisks when adding additional enclosures in the
future.
Don’t think mmvdisk can be told to creat
=1048576 modsnap=1 extperms=0x2,xa
replmeta *illReplicated
unbalanced* dev=3824,150 archive compressed crtime 1610965529.23447
That program can probably easily be modified to only list these files..
-jf
On Fri, Feb 19, 2021 at 1:50 PM Jan-Frode Myklebust
wrote:
> We just discussed thi
We just discussed this a bit internally, and I found *something* that might
help... There's a mmrestripefs --inode-criteria command that can be used to
identify files with these unknown-to-ILM flags set. Something like:
# echo illreplicated > criteria
# mmrestripefs gpfs01 -p --inode-criteria criteria
Agree.. Write a policy that takes a "mmapplypolicy -M var=val" argument,
and figure out the workdays outside of the policy. Something like:
# cat test.policy
define( access_age, (DAYS(CURRENT_TIMESTAMP) - DAYS(ACCESS_TIME)))
/* list migrated files */
RULE EXTERNAL LIST 'oldFiles' EXEC ''
RULE
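The policy was cut off above; a hedged sketch of how the -M substitution could fit together (MAXDAYS and the rule name are made up):

```
/* test.policy -- MAXDAYS is substituted at run time via -M */
define( access_age, (DAYS(CURRENT_TIMESTAMP) - DAYS(ACCESS_TIME)))
RULE EXTERNAL LIST 'oldFiles' EXEC ''
RULE 'listOld' LIST 'oldFiles' WHERE access_age > MAXDAYS
```

The invocation would then be something like `mmapplypolicy gpfs0 -P test.policy -M MAXDAYS=30`, with MAXDAYS computed by the calling script.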
This sounds like a bug to me... (I wouldn't expect it to matter that
mmchattr runs on a different node than the other file access). I would
check "mmdiag --iohist verbose" during these slow reads, to see if it
gives a hint at what it's doing, versus what it shows during "mmchattr".
Maybe one is
triggering prefetch, while
My understanding of these limits is that they are there to keep the
configuration files from becoming too large, which makes
changing/processing them somewhat slow.
For SMB shares, you might be able to limit the number of configured shares
by using wildcards in the config (%U). These wildcarded entrie
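For illustration, a Samba-style sketch of the idea (share name and path are made up; whether the CES mmsmb layer accepts substitution variables in these fields needs checking for your release):

```
[userhome]
    path = /gpfs/gpfs0/home/%U
    valid users = %U
```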
I would not mount a GPFS filesystem within a GPFS filesystem. Technically
it should work, but I’d expect it to cause surprises if ever the lower
filesystem experienced problems. Alone, a filesystem might recover
automatically by remounting. But if there’s another filesystem mounted
within, I expect
Nice to see it working well!
But, what about ACLs? Does your rsync pull in all needed metadata, or do you
also need to sync ACLs ? Any plans for how to solve that ?
On Tue, Nov 17, 2020 at 12:52 PM Andi Christiansen
wrote:
> Hi all,
>
> thanks for all the information, there was some interesting
I would expect you should be able to get it back up using the routine at
https://www.ibm.com/support/knowledgecenter/en/STXKQY_5.0.5/com.ibm.spectrum.scale.v5r05.doc/bl1adv_failsynch.htm
Maybe you just need to force remove quorum-role from the dead node ?
-jf
On Tue, Aug 18, 2020 at 2:16 PM H
States or your local IBM Service Center in
> other countries.
>
> The forum is informally monitored as time permits and should not be used
> for priority messages to the Spectrum Scale (GPFS) team.
>
> [image: Inactive hide details for Jan-Frode Myklebust ---15-07-2020
> 08.44.
It looks like the old NFS4 ACL patch for rsync is no longer needed.
Starting with rsync-3.2.0 (and backported to rsync-3.1.2-9 in RHEL7), it
will now copy NFS4 ACLs if we tell it to ignore the posix ACLs:
rsync -X --filter '-x system.posix_acl' file-with-acl copy-with-acl
tir. 16. jun. 2020 kl. 15:32 skrev Giovanni Bracco :
>
> > I would correct MaxMBpS -- put it at something reasonable, enable
> > verbsRdmaSend=yes and
> > ignorePrefetchLUNCount=yes.
>
> Now we have set:
> verbsRdmaSend yes
> ignorePrefetchLUNCount yes
> maxMBpS 8000
>
> but the only parameter whi
On Thu, Jun 11, 2020 at 9:53 AM Giovanni Bracco
wrote:
>
> >
> > You could potentially still do SRP from QDR nodes, and via NSD for your
> > omnipath nodes. Going via NSD seems like a bit pointless indirection.
>
> not really: both clusters, the 400 OPA nodes and the 300 QDR nodes share
> the sam
fre. 5. jun. 2020 kl. 15:53 skrev Giovanni Bracco :
> answer in the text
>
> On 05/06/20 14:58, Jan-Frode Myklebust wrote:
> >
> > Could maybe be interesting to drop the NSD servers, and let all nodes
> > access the storage via srp ?
>
> no we can not: the product
Could maybe be interesting to drop the NSD servers, and let all nodes
access the storage via srp ?
Maybe turn off readahead, since it can cause performance degradation when
GPFS reads 1 MB blocks scattered on the NSDs, so that read-ahead always
reads too much. This might be the cause of the slow r
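If the NSDs are plain block devices on the servers, one hedged way to test this (device name is an example):

```
# Disable kernel readahead on an NSD block device, then re-test
blockdev --setra 0 /dev/mapper/mpatha
blockdev --getra /dev/mapper/mpatha
```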
No, this is a common misconception. You don’t need any NSD servers. NSD
servers are only needed if you have nodes without direct block access.
Remote cluster or not, disk access will be over local block device (without
involving NSD servers in any way), or NSD server if local access isn’t
availab
Seeing that as a %changelog in the RPMs would be fantastic.. :-)
-jf
tir. 26. mai 2020 kl. 15:44 skrev Carl Zetie - ca...@us.ibm.com <
ca...@us.ibm.com>:
> Achim, I think the request here (lost in translation?) is for a list of
> the bugs that 5.0.5.0 addresses. And we're currently looking to
(almost verbatim copy of my previous email — in case anybody else needs it,
or has ideas for improvements :-)
The way I would do this is to install "haproxy" on all these nodes, and
have haproxy terminate SSL and balance incoming requests over the 3
CES-addresses. For S3 -- we only need to provide
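A hedged haproxy sketch of that setup (IPs, port, and certificate path are all made up):

```
frontend s3_https
    bind *:443 ssl crt /etc/haproxy/certs/s3.pem
    default_backend ces_s3

backend ces_s3
    balance roundrobin
    server ces1 10.0.0.11:8080 check
    server ces2 10.0.0.12:8080 check
    server ces3 10.0.0.13:8080 check
```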
I don’t know the answer — but as an alternative solution, have you
considered splitting the read only clients out into a separate cluster.
Then you could enforce the read-only setting using «mmauth grant ... -a ro».
That should be supported.
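The grant would be something like this, with made-up cluster and filesystem names:

```
# On the owning cluster: give the read-only client cluster ro access
mmauth grant roclients.example.com -f gpfs0 -a ro
```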
-jf
ons. 4. mar. 2020 kl. 12:05 skrev Agostino Fu
ppc64le. The Readme for 5.3.5 lists FW860.60 again, same as 5.3.4?
>
>
>
> Cheers,
>
>
>
> Heiner
>
> *From: * on behalf of Jan-Frode
> Myklebust
> *Reply to: *gpfsug main discussion list
> *Date: *Thursday, 30 January 2020 at 18:00
> *To: *gpfsug
I *think* this was a known bug in the Power firmware included with 5.3.4,
and that it was fixed in the FW860.70. Something hanging/crashing in IPMI.
-jf
tor. 30. jan. 2020 kl. 17:10 skrev Wahl, Edward :
> Interesting. We just deployed an ESS here and are running into a very
> similar proble
The ESS v5.2 release stream, with GPFS v4.2.3.x, is still being
maintained for customers that are stuck on v4. You should probably install
that on your ESS if you want to add it to your existing cluster.
BTW: I think Tomer misunderstood the task a bit. It sounded like you needed
to keep the exis
Adding the GL2 into your existing cluster shouldn’t be any problem. You
would just delete the existing cluster on the GL2, then on the EMS run
something like:
gssaddnode -N gssio1-hs --cluster-node netapp-node --nodetype gss
--accept-license
gssaddnode -N gssio2-hs --cluster-node netapp-node
|
> DISPLAY_NULL(xattr_integer('gpfs.FileHeat',7,2,'B')) || ' ' ||
> DISPLAY_NULL(FILE_HEAT) || ' ' ||
> DISPLAY_NULL(hex(xattr('gpfs.FileHeat'))) || ' ' ||
> getmmconfig('fileHeatPeriodMinutes') || ' ' ||
&
What about filesystem atime updates. We recently changed the default to
«relatime». Could that maybe influence heat tracking?
-jf
tir. 13. aug. 2019 kl. 11:29 skrev Ulrich Sibiller <
u.sibil...@science-computing.de>:
> On 12.08.19 15:38, Marc A Kaplan wrote:
> > My Admin guide says:
> >
> >
I would mainly consider future upgrades. F.ex. do one vdisk per disk shelf
per rg. F.ex. for a GL6S you would have 12 vdisks, and if you add a GL4S
you would add 8 more vdisks, then each spindle of both systems should get
approximately the same number of IOs.
Another thing to consider is re-alloca
I’ve had a situation recently where mmnsddiscover didn’t help, but
mmshutdown/mmstartup on that node did fix it.
This was with v5.0.2-3 on ppc64le.
-jf
tir. 25. jun. 2019 kl. 17:02 skrev Son Truong :
>
> Hello Renar,
>
> Thanks for that command, very useful and I can now see the problematic
It’s a multiple of full blocks.
-jf
tir. 21. mai 2019 kl. 20:06 skrev Todd Ruston :
> Hi Indulis,
>
> Yes, thanks for the reminder. I'd come across that, and our system is
> currently set to a stub size of zero (the default, I presume). I'd intended
> to ask in my original query whether anyon
scale.org
>
> When replying, please edit your Subject line so it is more specific
> than "Re: Contents of gpfsug-discuss digest..."
>
>
> Today's Topics:
>
>1. Re: Adding ESS to existing Scale Cluster (Sanchez, Paul)
>2. New ESS install - Ne
Have you tried:
updatenode nodename -P gss_ofed
But, is this the known issue listed in the qdg?
https://www.ibm.com/support/knowledgecenter/SSYSP8_5.3.2/ess_qdg.pdf
-jf
ons. 3. apr. 2019 kl. 19:26 skrev Oesterlin, Robert <
robert.oester...@nuance.com>:
> Any insight on what command I nee
There seems to be some change or a bug here.. But try usage=dataOnly
pool=neverused failureGroup=xx.. and it should have the same function as
long as you never place anything in this pool.
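A hedged stanza sketch for such a never-used placeholder (all names and the failure group number are examples):

```
%nsd:
  nsd=placeholder001
  device=/dev/sdx
  servers=nsdserver1
  usage=dataOnly
  pool=neverused
  failureGroup=99
```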
-jf
tor. 28. mar. 2019 kl. 18:43 skrev Luke Sudbery :
> We have a 2 site Lenovo DSS-G based filesystem, whic
I've been looking for a good way of listing this as well. Could you please
share your policy ?
-jf
On Thu, Mar 28, 2019 at 1:52 PM Dorigo Alvise (PSI)
wrote:
> Hello,
> to get the list (and size) of files that fit into inodes what I do, using
> a policy, is listing "online" (not evicted) fil
You could put snapshot data in a separate storage pool. Then it should be
visible how much space it occupies, but it’s a bit hard to see how this
will be usable/manageable..
-jf
tir. 29. jan. 2019 kl. 20:08 skrev Christopher Black :
> Thanks for the quick and detailed reply! I had read the manua
I don't think it's going to figure this out automatically between the two
rules.. I believe you will need to do something like the below (untested,
and definitely not perfect!!) rebalancing:
define(
weight_junkedness,
(CASE
/* Create 3 classes of files */
I’d be curious to hear if all these arguments against iSCSI shouldn’t also
apply to NSD protocol over TCP/IP?
-jf
man. 17. des. 2018 kl. 01:22 skrev Jonathan Buzzard <
jonathan.buzz...@strath.ac.uk>:
> On 13/12/2018 20:54, Buterbaugh, Kevin L wrote:
>
> [SNIP]
>
> >
> > Two things that I am alre
I have been running GPFS over iSCSI, and know of customers who are also.
Probably not in the most demanding environments, but from my experience
iSCSI works perfectly fine as long as you have a stable network. Having a
dedicated (simple) storage network for iSCSI is probably a good idea (just
like
/mibgroup/hardware/fsys/mnttypes.h
@@ -121,6 +121,9 @@
#ifndef MNTTYPE_GFS2
#define MNTTYPE_GFS2 "gfs2"
#endif
+#ifndef MNTTYPE_GPFS
+#define MNTTYPE_GPFS "gpfs"
+#endif
#ifndef MNTTYPE_XFS
#define MNTTYPE_XFS "xfs"
#endif
On Wed, Nov 7, 2018 at 12:20
Looking at the CHANGELOG for net-snmp, it seems it needs to know about each
filesystem it's going to support, and I see no GPFS/mmfs. It has entries
like:
- Added simfs (OpenVZ filesystem) to hrStorageTable and hrFSTable.
- Added CVFS (CentraVision File System) to hrStorageTable and
Also beware there are 2 different linux NFS "async" settings. A client-side
setting (mount -o async), which still causes sync on file close() -- and a
server (knfs) side setting (/etc/exports) that violates NFS protocol and
returns requests before data has hit stable storage.
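To make the two knobs concrete, hedged examples (server, path and client names are made up):

```
# Client side: "async" mount option -- data is still committed on close()
mount -o async server:/gpfs/export /mnt/export

# Server side (knfsd): "async" in /etc/exports -- replies before data is
# on stable storage, violating the NFS protocol
/gpfs/export  client.example.com(rw,async)
```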
-jf
On Wed, Oct 17
nc on each file..
>
> you'll never outperform e.g. 128 (maybe slower), but, parallel threads
> (running write-behind) <---> with one single but fast threads,
>
> so as Alex suggest.. if possible.. take gpfs client of kNFS for those
> types of workloads..
>
>
Do you know if the slow throughput is caused by the network/nfs-protocol
layer, or does it help to use faster storage (ssd)? If on storage, have you
considered if HAWC can help?
I’m thinking about adding an SSD pool as a first tier to hold the active
dataset for a similar setup, but that’s mainly
Not sure if better or worse idea, but I believe robocopy supports syncing
just the ACLs, so if you do SMB mounts from both sides, that might be an
option.
-jf
tir. 25. sep. 2018 kl. 20:05 skrev Bryan Banister :
> I have to correct myself, looks like using nfs4_getacl, nfs4_setfacl,
> nfs4_editfac
That reminds me of a point Sven made when I was trying to optimize mdtest
results with metadata on FlashSystem... He sent me the following:
-- started at 11/15/2015 15:20:39 --
mdtest-1.9.3 was launched with 138 total task(s) on 23 node(s)
Command line used: /ghome/oehmes/mpi/bin/mdtest-pcmpi9131-
Just found the Spectrum Scale policy "best practices" presentation from the
latest UG:
http://files.gpfsug.org/presentations/2018/USA/SpectrumScalePolicyBP.pdf
which mentions:
"mmapplypolicy … --choice-algorithm fast && ... WEIGHT(0) … (avoids final
sort of all selected files by weight)"
and lo
Since I'm pretty proud of my awk one-liner, and maybe it's useful for this
kind of charging, here's how to sum up how much data each user has in the
filesystem (without regard to whether the data blocks are offline, online,
replicated or compressed):
# cat full-file-list.policy
RULE EXTERNAL LIST 'fil
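The policy and the one-liner were cut off above; a hedged sketch of the summation step, assuming the policy's SHOW clause emits USER_ID and FILE_SIZE as the first two space-separated fields (sample data inline):

```shell
# Sum bytes per user from a policy-generated list (sample input inline)
cat > file-list.sample <<'EOF'
1001 10
1002 5
1001 7
EOF
awk '{ sum[$1] += $2 } END { for (u in sum) print u, sum[u] }' file-list.sample | sort
```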
Stock | IBM Pittsburgh Lab | 720-430-8821
> sto...@us.ibm.com
>
>
>
> From:Jan-Frode Myklebust
> To:gpfsug main discussion list
> Date:03/16/2018 04:30 AM
>
> Subject:Re: [gpfsug-discuss] GPFS autoload - wait for IB ports
>
>
> Ray Coetzee
> Mob: +44 759 704 7060
>
> Skype: ray.coetzee
>
> Email: coetzee@gmail.com
>
>
> On Mon, Apr 23, 2018 at 12:02 AM, Jan-Frode Myklebust
> wrote:
>
>>
>> Yes, I've been struggelig with something similiar this week. Ganesha
>>
Yes, I've been struggling with something similar this week. Ganesha dying
with SIGABRT -- nothing else logged. After catching a few coredumps, it has
been identified as a problem with some udp-communication during mounts from
solaris clients. Disabling udp as transport on the shares serverside did
target.wants/NetworkManager-wait-online.service'
>
> in many cases .. it helps ..
>
>
>
>
>
> From:Jan-Frode Myklebust
> To:gpfsug main discussion list
> Date:03/15/2018 06:18 PM
> Subject:Re: [gpfsug-discuss] GPFS
I found some discussion on this at
https://www.ibm.com/developerworks/community/forums/html/threadTopic?id=----14471957&ps=25
and there it's claimed that none of the callback events are early enough to
resolve this. That we need a pre-preStartup trigger. Any idea if this has
The FAQ has a note stating:
1. AFM, Asynch Disaster Recovery with AFM, Integrated Protocols, and
Installation Toolkit are not supported on RHEL 7.4.
Could someone please clarify this sentence ? It can't be right that none of
these features are supported with RHEL 7.4, or .. ?
-jf
Have you seen
https://www.ibm.com/support/knowledgecenter/en/STXKQY_4.2.0/com.ibm.spectrum.scale.v4r2.adv.doc/bl1adv_dr.htm
?
Seems to cover what you’re looking for..
-jf
ons. 24. jan. 2018 kl. 07:33 skrev Harold Morales :
> Thanks for answering.
>
> Essentially, the idea being explored is to
Don’t have documentation/whitepaper, but as I recall, it will first
allocate round-robin over failureGroup, then round-robin over nsdServers,
and then round-robin over volumes. So if these new NSDs are defined as
different failureGroup from the old disks, that might explain it..
-jf
lør. 13. jan.
Can’t you just reverse the mmchrecoverygroup --servers order, before
starting the io-server?
-jf
fre. 22. des. 2017 kl. 18:45 skrev Damir Krstic :
> It's been a very frustrating couple of months with our 2 ESS systems. IBM
> tells us we had blueflame bug and they came on site and updated our ESS
Bill, could you say something about what the metadata-storage here was?
ESS/NL-SAS/3way replication?
I just asked about this in the internal slack channel #scale-help today..
-jf
fre. 1. des. 2017 kl. 13:44 skrev Bill Hartner :
> > "It has a significant performance penalty for small files in
Olaf, this looks like a Lenovo «ESS GLxS» version. Should be using same
number of spindles for any size filesystem, so I would also expect them to
perform the same.
-jf
ons. 15. nov. 2017 kl. 11:26 skrev Olaf Weiser :
> to add a comment ... .. very simply... depending on how you allocate th
Please see
https://www.ibm.com/developerworks/community/wikis/home?lang=en#!/wiki/General%20Parallel%20File%20System%20(GPFS)/page/Back%20Up%20GPFS%20Configuration
But also check «mmcesdr primary backup». I don't remember if it included
all of mmbackupconfig/mmccr, but I think it did, and it also i
You can lower LEASE_LIFETIME and GRACE_PERIOD to shorten the time it's in
grace, to make it more bearable. Making export changes dynamic is something
that's fixed in newer versions of nfs-ganesha than what's shipped with
Scale:
https://github.com/nfs-ganesha/nfs-ganesha/releases/tag/V2.4.0:
"dy
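Hedged sketch of the tuning commands (attribute spelling varies by release; check `mmnfs config list` first):

```
mmnfs config change LEASE_LIFETIME=20
mmnfs config change GRACE_PERIOD=20
```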
You don't have room to write 180GB of file data, only ~100GB. When you
write f.ex. 90 GB of file data, each filesystem block will get one copy
written to each of your disks, occupying 180 GB of total disk space. So
you can always read it from the other disks if one should fail.
This is controlled
If you do a "mmchconfig restripeOnDiskFailure=yes", such a callback will be
added for node-join events. That can be quite useful for stretched
clusters, where you want to replicate all blocks to both locations, and
this way recover automatically.
-jf
On Wed, Aug 9, 2017 at 2:14 PM, Oesterlin,
Nils Haustein did such a migration from v7000 Unified to ESS last year.
Used SOBAR to avoid recalls from HSM. I believe he wrote a whitepaper on
the process..
-jf
tir. 18. jul. 2017 kl. 21.21 skrev Simon Thompson (IT Research Support) <
s.j.thomp...@bham.ac.uk>:
> So just following up on my ques
You had it here:
[root@server ~]# mmlsrecoverygroup BB1RGL -L
                  declustered
 recovery group      arrays    vdisks  pdisks  format version
 --------------   -----------  ------  ------  --------------
 BB1RGL                    3       18     119  4.2.0.1

 declustered   needs                           replace  scrub  background activity
    array      service  vdisks  pdisks  spares
Switch to node affinity policy, and it will stick to where you move it.
"mmces address policy node-affinity".
-jf
tir. 13. jun. 2017 kl. 06.21 skrev :
> On Mon, 12 Jun 2017 20:06:09 -, "Simon Thompson (IT Research Support)"
> said:
>
> > mmces node suspend -N
> >
> > Is what you want. This
I also don't know much about this, but the ESS quick deployment guide is
quite clear on the we should use connected mode for IPoIB:
--
Note: If using bonded IP over IB, do the following: Ensure that the
CONNECTED_MODE=yes statement exists in the corresponding slave-bond
interface scrip
This doesn't sound like normal behaviour. It shouldn't matter which
filesystem your tiebreaker disks belong to. I think the failure was caused
by something else, but am not able to guess from the little information you
posted.. The mmfs.log will probably tell you the reason.
-jf
ons. 3. mai 2017
e need either long IO blocked reads
> >or writes (from the GPFS end).
> >
> >We also originally had soft as the default option, but saw issues then
> >and the docs suggested hard, so we switched and also enabled sync (we
> >figured maybe it was NFS client with uncommited w
I *think* I've seen this, and that we then had open TCP connection from
client to NFS server according to netstat, but these connections were not
visible from netstat on NFS-server side.
Unfortunately I don't remember what the fix was..
-jf
tir. 25. apr. 2017 kl. 16.06 skrev Simon Thompson (
I agree with Luis -- why so many nodes?
"""
So if i would go for 4 NSD server, 6 protocol nodes and 2 tsm backup nodes
and at least 3 test server a total of 11 server is needed.
"""
If this is your whole cluster, why not just 3x P822L/P812L running single
partition per node, hosting a cluster of
e
> mix both protocols for such case.
>
>
> Is the spreadsheet publicly available or do we need to ask IBM ?
>
>
> Thank for your help,
>
> Frank.
>
>
> --
> *From:* Jan-Frode Myklebust
> *Sent:* Saturday, April 22, 2017 10:50
That's a tiny maxFilesToCache...
I would start by implementing the settings from
/usr/lpp/mmfs/*/gpfsprotocolldefaul* plus a 64GB pagepool for your
protocol nodes, and leave further tuning to when you see you have issues.
Regarding sizing, we have a spreadsheet somewhere where you can input some
Maybe try mmumount -f on the remaining 4 nodes?
-jf
ons. 5. apr. 2017 kl. 18.54 skrev Buterbaugh, Kevin L <
kevin.buterba...@vanderbilt.edu>:
> Hi Simon,
>
> No, I do not.
>
> Let me also add that this is a filesystem that I migrated users off of and
> to another GPFS filesystem. I moved the l
Why would you need a NSD protocol router when the NSD servers can have a
mix of infiniband and ethernet adapters? F.ex. 4x EDR + 2x 100GbE per
io-node in an ESS should give you lots of bandwidth for your common
ethernet medium.
-jf
On Thu, Mar 16, 2017 at 1:52 AM, Aaron Knister
wrote:
> *dra
Another thing to consider is how many disk block pointers you have room for
in the inode, and when you'll need to add additional indirect blocks. Ref:
http://files.gpfsug.org/presentations/2016/south-bank/
D2_P2_A_spectrum_scale_metadata_dark_V2a.pdf
If I understand that presentation correctly.. a
There's a manual for it now.. and it points out "The tool impacts
performance.." Also it has caused mmfsd crashes for me earlier, so I've
learned to be wary of running it..
The manual also says it's collecting using mmfsadm, and the mmfsadm manual
warns that it might cause GPFS to fail ("in certa
Unmount filesystems cleanly "mmumount all -a", stop gpfs "mmshutdown -N
gss_ppc64" and poweroff "xdsh gss_ppc64 poweroff".
-jf
fre. 3. mar. 2017 kl. 20.55 skrev Joseph Grace :
> Please excuse the newb question but we have a planned power outage coming
> and I can't locate a procedure to shutdown
ust
> using that for testing, the NFS clients all access it though nfsv4 which
> looks to be using kerberos from this.
>
> On 01/03/17 20:21, Jan-Frode Myklebust wrote:
> > This looks to me like a quite plain SYS authorized NFS, maybe also verify
> > that the individua
> LOG_LEVEL: EVENT
> ==
>
> Idmapd Configuration
> ==
> DOMAIN: DS.LEEDS.AC.UK
> ==
>
> On 01/03/17 14:12, Jan-Frode Myklebust wrote:
> > Lets figure out how your NFS is authenticating
Lets figure out how your NFS is authenticating then. The userdefined
authentication you have, could mean that your linux host is configured to
authenticated towards AD --- or it could be that you're using simple
sys-authentication for NFS.
Could you please post the output of:
mmnfs export list
mm
AFM apparently keeps track of this, so maybe it would be possible to run
AFM-SW with disconnected home and query the queue of changes? But would
require some way of clearing the queue as well..
-jf
On Monday, February 27, 2017, Marc A Kaplan wrote:
> Diffing file lists can be fast - IF yo
I just had a similar experience from a SanDisk InfiniFlash system
SAS-attached to a single host. Gpfsperf reported 3.2 Gbyte/s for writes,
and 250-300 Mbyte/s on sequential reads!! Random reads were on the order of
2 Gbyte/s.
After a bit of head scratching and fumbling around I found out that reducing
The 4.2.2.2 readme says:
* Fix a multipath device failure that reads "blk_cloned_rq_check_limits:
over max size limit" which can occur when kernel function bio_get_nr_vecs()
returns a value which is larger than the value of max sectors of the block
device.
-jf
On Sat, Feb 11, 2017 at 7:32 PM
Just some datapoints, in hope that it helps..
I've seen metadata performance improvements by turning down hyperthreading
from 8/core to 4/core on Power8. Also it helped distributing the token
managers over multiple nodes (6+) instead of fewer.
I would expect this to flow over IP, not IB.
-jf
> Damir
>
> On Wed, Jan 11, 2017 at 12:38 PM Jan-Frode Myklebust
> wrote:
>
> And there you have:
>
> [ems1-fdr,compute,gss_ppc64]
> verbsRdmaSend yes
>
> Try turning this off.
>
>
> -jf
> ons. 11. jan. 2017 kl. 18.54 skrev Damir Krstic :
>
>
And there you have:
[ems1-fdr,compute,gss_ppc64]
verbsRdmaSend yes
Try turning this off.
-jf
ons. 11. jan. 2017 kl. 18.54 skrev Damir Krstic :
> Thanks for all the suggestions. Here is our mmlsconfig file. We just
> purchased another GL6. During the installation of the new GL6 IBM will
> upgra
My first guess would also be rdmaSend, which the gssClientConfig.sh enables
by default, but isn't scalable to large clusters. It fits with your error
message:
https://www.ibm.com/developerworks/community/wikis/home?lang=en#!/wiki/General%20Parallel%20File%20System%20%28GPFS%29/page/Best%20Practice
I also struggle with where to look for CES log files.. but maybe the new
"mmprotocoltrace" command can be useful?
# mmprotocoltrace start smb
### reproduce problem
# mmprotocoltrace stop smb
Check log files it has collected.
-jf
On Wed, Jan 11, 2017 at 10:27 AM, Sobey, Richard A
wrote:
>
Yaron, doesn't "-1" make each of these disks an independent failure group?
From 'man mmcrnsd':
"The default is -1, which indicates this disk has no point of failure in
common with any other disk."
-jf
man. 9. jan. 2017 kl. 21.54 skrev Yaron Daniel :
> Hi
>
> So - do u able to have GPFS rep
Untested, and I have no idea if it will work on the number of files and
directories you have, but maybe you can fix it by rsyncing just the
directories?
rsync -av --dry-run --include='*/' --exclude='*' source/ destination/
-jf
man. 9. jan. 2017 kl. 16.09 skrev :
> Hi All,
>
> We have just com