Re: [gpfsug-discuss] snapshots causing filesystem quiesce

2022-02-02 Thread Jan-Frode Myklebust
Also, if snapshotting multiple filesets, it's important to group these into a single mmcrsnapshot command. Then you get a single quiesce, instead of one per fileset. i.e. do: snapname=$(date --utc +@GMT-%Y.%m.%d-%H.%M.%S) mmcrsnapshot gpfs0 fileset1:$snapname,fileset2:$snapname,fileset3:sna
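
Spelled out as a runnable sketch (device and fileset names are the examples from the post; substitute your own):

# One mmcrsnapshot call covering all filesets = one quiesce for the whole batch
snapname=$(date --utc +@GMT-%Y.%m.%d-%H.%M.%S)
mmcrsnapshot gpfs0 "fileset1:${snapname},fileset2:${snapname},fileset3:${snapname}"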

Re: [gpfsug-discuss] ESS 6.1.2.1 changes

2021-12-20 Thread Jan-Frode Myklebust
Just ran an upgrade on an EMS, and the only changes I see are these updated packages on the ems: +gpfs.docs-5.1.2-0.9.noarchMon 20 Dec 2021 11:56:43 AM CET +gpfs.ess.firmware-6.0.0-15.ppc64leMon 20 Dec 2021 11:56:42 AM CET +gpfs.msg.en_US-5.1.2-0.9.noarch

Re: [gpfsug-discuss] alternate path between ESS Servers for Datamigration

2021-12-09 Thread Jan-Frode Myklebust
I believe this should be a fully working solution. I see no problem enabling RDMA between a subset of nodes -- just disable verbsRdma on the nodes you want to use plain IP. -jf On Thu, Dec 9, 2021 at 11:04 AM Walter Sklenka wrote: > Dear spectrum scale users! > > May I ask you a design quest

Re: [gpfsug-discuss] mmapplypolicy slow

2021-08-03 Thread Jan-Frode Myklebust
> I have also played with the -A and -a parameters with no combination that I > can find making it any better. > > > > Thanks for the feedback. > > > > *From:* gpfsug-discuss-boun...@spectrumscale.org < > gpfsug-discuss-boun...@spectrumscale.org> *On Behalf Of

Re: [gpfsug-discuss] mmapplypolicy slow

2021-08-03 Thread Jan-Frode Myklebust
So…. the advertisement says we should be able to do 1M files/s… http://files.gpfsug.org/presentations/2018/USA/SpectrumScalePolicyBP.pdf The first thing I would try is whether limiting which nodes are used for the processing helps. Maybe limit to the NSD-servers (-N nodenames) ? Also, --choice-algorithm

Re: [gpfsug-discuss] Using VMs as quorum / admin nodes in a GPFS infiniband cluster

2021-06-17 Thread Jan-Frode Myklebust
eo > > Paul Scherrer Institut > Dr. Leonardo Sala > Group Leader High Performance Computing > Deputy Section Head Science IT > Science IT > WHGA/036 > Forschungstrasse 111 > 5232 Villigen PSI > Switzerland > > Phone: +41 56 310 3369 > leonardo.s...@psi.ch > www.psi.ch

Re: [gpfsug-discuss] Using VMs as quorum / admin nodes in a GPFS infiniband cluster

2021-06-07 Thread Jan-Frode Myklebust
I’ve done this a few times. Once with IPoIB as daemon network, and then created a separate routed network on the hypervisor to bridge (?) between VM and IPoIB network. Example RHEL config where bond0 is an IP-over-IB bond on the hypervisor: To give the VMs access to the daemon network, w

Re: [gpfsug-discuss] Long IO waiters and IBM Storwize V5030

2021-05-28 Thread Jan-Frode Myklebust
One thing to check: Storwize/SVC code will *always* guess wrong on prefetching for GPFS. You can see this as a lot higher read data throughput on mdisks vs. on vdisks in the webui. To fix it, disable cache_prefetch with "chsystem -cache_prefetch off". This being a global setting, you pr

Re: [gpfsug-discuss] Spectrum Scale & S3

2021-05-21 Thread Jan-Frode Myklebust
It has features for both being an Object store for other applications (running openstack swift/S3), and for migrating/tiering filesystem data to an object store like Amazon S3, IBM COS, etc... -jf fre. 21. mai 2021 kl. 10:42 skrev David Reynolds : > When we talk about supported protocols on S

Re: [gpfsug-discuss] Quick delete of huge tree

2021-04-20 Thread Jan-Frode Myklebust
A couple of ideas. The KC recommends adding WEIGHT(DIRECTORY_HASH) to group deletions within a directory. Then maybe also do it as a 2-step process, in the same policy run, where you delete all non-directories first, and then delete the directories in depth-first order using WEIGHT(Length(PATH_
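
An untested sketch of such a two-pass policy (paths, rule names and the dry-run flow are placeholders; note that it deletes everything under the directory you point it at):

cat > /tmp/purge.policy <<'EOF'
/* Pass 1: ordinary files (directories are excluded by default), grouped per directory */
RULE 'delfiles' DELETE
     WEIGHT(DIRECTORY_HASH)
     WHERE TRUE
/* Pass 2: only directories fall through to this rule, deleted deepest-path-first */
RULE 'deldirs' DELETE
     DIRECTORIES_PLUS
     WEIGHT(LENGTH(PATH_NAME))
     WHERE TRUE
EOF
# Dry-run first against the subtree to purge (never the filesystem root):
mmapplypolicy /gpfs/fs1/scratch/tree-to-remove -P /tmp/purge.policy -I test
# ...then rerun with "-I yes" (and perhaps -N <helper nodes>) to actually delete.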

Re: [gpfsug-discuss] NFSIO metrics absent in pmcollector

2021-04-20 Thread Jan-Frode Myklebust
Have you installed the gpfs.pm-ganesha package, and do you have any active NFS exports/clients ? -jf On Tue, Apr 20, 2021 at 12:19 PM Dorigo Alvise (PSI) wrote: > Dear Community, > > > > I’ve activated CES-related metrics by simply doing: > > [root@xbl-ces-91 ~]# mmperfmon config show |egrep

Re: [gpfsug-discuss] Move data to fileset seamlessly

2021-03-22 Thread Jan-Frode Myklebust
No — all copying between filesets require full data copy. No simple rename. This might be worthy of an RFE, as it’s a bit unexpected, and could potentially work more efficiently.. -jf man. 22. mar. 2021 kl. 10:39 skrev Ulrich Sibiller < u.sibil...@science-computing.de>: > Hello, > > we usua

Re: [gpfsug-discuss] dssgmkfs.mmvdisk number of NSD's

2021-02-28 Thread Jan-Frode Myklebust
I’ve tried benchmarking many vs. few vdisks per RG, and never could see any performance difference. Usually we create 1 vdisk per enclosure per RG, thinking this will allow us to grow with same size vdisks when adding additional enclosures in the future. Don’t think mmvdisk can be told to creat

Re: [gpfsug-discuss] policy ilm features?

2021-02-19 Thread Jan-Frode Myklebust
=1048576 modsnap=1 extperms=0x2,xa replmeta *illReplicated unbalanced* dev=3824,150 archive compressed crtime 1610965529.23447 That program can probably easily be modified to only list these files.. -jf On Fri, Feb 19, 2021 at 1:50 PM Jan-Frode Myklebust wrote: > We just discussed thi

Re: [gpfsug-discuss] policy ilm features?

2021-02-19 Thread Jan-Frode Myklebust
We just discussed this a bit internally, and I found *something* that might help... There's a mmrestripefs --inode-criteria command that can be used to identify files with these unknown-to-ILM flags set. Something like: # echo illreplicated > criteria # mmrestripefs gpfs01 -p --inode-criteria crit
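
A hedged sketch of that check (filesystem name and paths are illustrative, and the -o output-file option is an assumption to verify against the mmrestripefs man page for your release):

# Put the flag(s) of interest into a criteria file, then let mmrestripefs report
# which inodes carry them during the -p (file placement) pass from the post.
echo illreplicated > /tmp/criteria
mmrestripefs gpfs01 -p --inode-criteria /tmp/criteria -o /tmp/flagged-inodes
# /tmp/flagged-inodes should then list the matching inode numbers for follow-up.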

Re: [gpfsug-discuss] gpfsug-discuss Digest, Vol 108, Issue 18

2021-02-01 Thread Jan-Frode Myklebust
Agree.. Write a policy that takes a "mmapplypolicy -M var=val" argument, and figure out the workdays outside of the policy. Something like: # cat test.policy define( access_age, (DAYS(CURRENT_TIMESTAMP) - DAYS(ACCESS_TIME))) /* list migrated files */ RULE EXTERNAL LIST 'oldFiles' EXEC '' RULE
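
A hedged completion of that sketch, showing how the -M substitution ties an externally computed value into the rules (variable name, threshold and paths are illustrative):

cat > /tmp/test.policy <<'EOF'
define( access_age, (DAYS(CURRENT_TIMESTAMP) - DAYS(ACCESS_TIME)) )
/* list migrated files */
RULE EXTERNAL LIST 'oldFiles' EXEC ''
RULE 'old' LIST 'oldFiles' WHERE access_age > ALLOWED_AGE
EOF
# Compute the threshold (e.g. number of workdays) outside the policy, then
# substitute it into the rules at run time with -M:
mmapplypolicy gpfs0 -P /tmp/test.policy -M ALLOWED_AGE=30 -I defer -f /tmp/oldfiles
# With "-I defer -f", the matches should land in /tmp/oldfiles.list.oldFiles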

Re: [gpfsug-discuss] Spectrum Scale 5 and Reading Compressed Data

2021-01-20 Thread Jan-Frode Myklebust
This sounds like a bug to me... (I wouldn't expect mmchattr to behave differently on a different node than other file access does). I would check "mmdiag --iohist verbose" during these slow reads, to see if it gives a hint at what it's doing, versus what it shows during "mmchattr". Maybe one is triggering prefetch, while

Re: [gpfsug-discuss] Protocol limits

2020-12-09 Thread Jan-Frode Myklebust
My understanding of these limits is that they exist to keep the configuration files from becoming too large, which would make changing/processing them somewhat slow. For SMB shares, you might be able to limit the number of configured shares by using wildcards in the config (%U). These wildcarded entrie

Re: [gpfsug-discuss] Mounting filesystem on top of an existing filesystem

2020-11-19 Thread Jan-Frode Myklebust
I would not mount a GPFS filesystem within a GPFS filesystem. Technically it should work, but I’d expect it to cause surprises if ever the lower filesystem experienced problems. On its own, a filesystem might recover automatically by remounting. But if there’s another filesystem mounted within, I expect

Re: [gpfsug-discuss] Migrate/syncronize data from Isilon to Scale over NFS?

2020-11-17 Thread Jan-Frode Myklebust
Nice to see it working well! But, what about ACLs? Does your rsync pull in all needed metadata, or do you also need to sync ACLs ? Any plans for how to solve that ? On Tue, Nov 17, 2020 at 12:52 PM Andi Christiansen wrote: > Hi all, > > thanks for all the information, there was some interesting

Re: [gpfsug-discuss] Tiny cluster quorum problem

2020-08-18 Thread Jan-Frode Myklebust
I would expect you should be able to get it back up using the routine at https://www.ibm.com/support/knowledgecenter/en/STXKQY_5.0.5/com.ibm.spectrum.scale.v5r05.doc/bl1adv_failsynch.htm Maybe you just need to force remove quorum-role from the dead node ? -jf On Tue, Aug 18, 2020 at 2:16 PM H

Re: [gpfsug-discuss] rsync NFS4 ACLs

2020-07-17 Thread Jan-Frode Myklebust
States or your local IBM Service Center in > other countries. > > The forum is informally monitored as time permits and should not be used > for priority messages to the Spectrum Scale (GPFS) team. > > [image: Inactive hide details for Jan-Frode Myklebust ---15-07-2020 > 08.44.

[gpfsug-discuss] rsync NFS4 ACLs

2020-07-15 Thread Jan-Frode Myklebust
It looks like the old NFS4 ACL patch for rsync is no longer needed. Starting with rsync-3.2.0 (and backported to rsync-3.1.2-9 in RHEL7), it will now copy NFS4 ACLs if we tell it to ignore the posix ACLs: rsync -X --filter '-x system.posix_acl' file-with-acl copy-with-acl
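
A worked example of the same invocation on a whole tree, with a quick check afterwards (paths are placeholders):

# -a for the usual attributes, -X for xattrs (which carry the NFSv4 ACL here),
# while the filter keeps rsync from also copying the POSIX ACL xattr.
rsync -aX --filter '-x system.posix_acl' /gpfs/fs1/project/ /gpfs/fs2/project/
# Spot-check that the NFSv4 ACL arrived on the copy:
mmgetacl /gpfs/fs2/project/some-file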

Re: [gpfsug-discuss] very low read performance in simple spectrum scale/gpfs cluster with a storage-server SAN: effect of ignorePrefetchLUNCount

2020-06-16 Thread Jan-Frode Myklebust
tir. 16. jun. 2020 kl. 15:32 skrev Giovanni Bracco : > > > I would correct MaxMBpS -- put it at something reasonable, enable > > verbsRdmaSend=yes and > > ignorePrefetchLUNCount=yes. > > Now we have set: > verbsRdmaSend yes > ignorePrefetchLUNCount yes > maxMBpS 8000 > > but the only parameter whi

Re: [gpfsug-discuss] very low read performance in simple spectrum scale/gpfs cluster with a storage-server SAN

2020-06-11 Thread Jan-Frode Myklebust
On Thu, Jun 11, 2020 at 9:53 AM Giovanni Bracco wrote: > > > > > You could potentially still do SRP from QDR nodes, and via NSD for your > > omnipath nodes. Going via NSD seems like a bit pointless indirection. > > not really: both clusters, the 400 OPA nodes and the 300 QDR nodes share > the sam

Re: [gpfsug-discuss] very low read performance in simple spectrum scale/gpfs cluster with a storage-server SAN

2020-06-05 Thread Jan-Frode Myklebust
fre. 5. jun. 2020 kl. 15:53 skrev Giovanni Bracco : > answer in the text > > On 05/06/20 14:58, Jan-Frode Myklebust wrote: > > > > Could maybe be interesting to drop the NSD servers, and let all nodes > > access the storage via srp ? > > no we can not: the product

Re: [gpfsug-discuss] very low read performance in simple spectrum scale/gpfs cluster with a storage-server SAN

2020-06-05 Thread Jan-Frode Myklebust
Could maybe be interesting to drop the NSD servers, and let all nodes access the storage via srp ? Maybe turn off readahead, since it can cause performance degradation when GPFS reads 1 MB blocks scattered on the NSDs, so that read-ahead always reads too much. This might be the cause of the slow r

Re: [gpfsug-discuss] Multi-cluster question (was Re: gpfsug-discuss Digest, Vol 100, Issue 32)

2020-05-31 Thread Jan-Frode Myklebust
No, this is a common misconception. You don’t need any NSD servers. NSD servers are only needed if you have nodes without direct block access. Remote cluster or not, disk access will be over local block device (without involving NSD servers in any way), or NSD server if local access isn’t availab

Re: [gpfsug-discuss] Spectrum Scale 5.0.5.0 is available on FixCentral

2020-05-26 Thread Jan-Frode Myklebust
Seeing that as a %changelog in the RPMs would be fantastic.. :-) -jf tir. 26. mai 2020 kl. 15:44 skrev Carl Zetie - ca...@us.ibm.com < ca...@us.ibm.com>: > Achim, I think the request here (lost in translation?) is for a list of > the bugs that 5.0.5.0 addresses. And we're currently looking to

Re: [gpfsug-discuss] Enabling SSL/HTTPS/ on Object S3.

2020-05-07 Thread Jan-Frode Myklebust
(almost verbatim copy of my previous email — in case anybody else needs it, or has ideas for improvements :-) The way I would do this is to install "haproxy" on all these nodes, and have haproxy terminate SSL and balance incoming requests over the 3 CES-addresses. For S3 -- we only need to provide
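
A minimal haproxy.cfg sketch of that setup (untested; the certificate path, CES addresses and backend port are placeholders to adapt to your object service):

global
    daemon

defaults
    mode http
    timeout connect 5s
    timeout client  1m
    timeout server  1m

# Terminate TLS here, then balance plain HTTP across the three CES addresses.
frontend s3_https
    bind *:443 ssl crt /etc/haproxy/certs/s3-bundle.pem
    default_backend ces_s3

backend ces_s3
    balance roundrobin
    server ces1 192.0.2.11:8080 check
    server ces2 192.0.2.12:8080 check
    server ces3 192.0.2.13:8080 check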

Re: [gpfsug-discuss] Read-only mount option for GPFS version 4.2.3.19

2020-03-04 Thread Jan-Frode Myklebust
I don’t know the answer — but as an alternative solution, have you considered splitting the read only clients out into a separate cluster. Then you could enforce the read-only setting using «mmauth grant ... -a ro». That should be supported. -jf ons. 4. mar. 2020 kl. 12:05 skrev Agostino Fu
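
For illustration, the grant itself is a one-liner on the cluster that owns the filesystem (cluster and filesystem names are placeholders):

# Restrict the remote "read-only" client cluster to ro access on filesystem fs1
mmauth grant roclients.example.com -f fs1 -a ro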

Re: [gpfsug-discuss] gui_refresh_task_failed for HW_INVENTORY with two active GUI nodes

2020-02-03 Thread Jan-Frode Myklebust
ppc64le. The Readme for 5.3.5 lists FW860.60 again, same as 5.3.4? > > > > Cheers, > > > > Heiner > > *From: * on behalf of Jan-Frode > Myklebust > *Reply to: *gpfsug main discussion list > *Date: *Thursday, 30 January 2020 at 18:00 > *To: *gpfsug

Re: [gpfsug-discuss] gui_refresh_task_failed for HW_INVENTORY with two active GUI nodes

2020-01-30 Thread Jan-Frode Myklebust
I *think* this was a known bug in the Power firmware included with 5.3.4, and that it was fixed in the FW860.70. Something hanging/crashing in IPMI. -jf tor. 30. jan. 2020 kl. 17:10 skrev Wahl, Edward : > Interesting. We just deployed an ESS here and are running into a very > similar proble

Re: [gpfsug-discuss] How to join GNR nodes to a non-GNR cluster

2019-12-05 Thread Jan-Frode Myklebust
The ESS v5.2 release stream, with GPFS v4.2.3.x, is still being maintained for customers that are stuck on v4. You should probably install that on your ESS if you want to add it to your existing cluster. BTW: I think Tomer misunderstood the task a bit. It sounded like you needed to keep the exis

Re: [gpfsug-discuss] How to join GNR nodes to a non-GNR cluster

2019-12-04 Thread Jan-Frode Myklebust
Adding the GL2 into your existing cluster shouldn’t be any problem. You would just delete the existing cluster on the GL2, then on the EMS run something like: gssaddnode -N gssio1-hs --cluster-node netapp-node --nodetype gss --accept-license gssaddnode -N gssio2-hs --cluster-node netapp-node

Re: [gpfsug-discuss] Fileheat - does work! Complete test/example provided here.

2019-09-03 Thread Jan-Frode Myklebust
| > DISPLAY_NULL(xattr_integer('gpfs.FileHeat',7,2,'B')) || ' ' || > DISPLAY_NULL(FILE_HEAT) || ' ' || > DISPLAY_NULL(hex(xattr('gpfs.FileHeat'))) || ' ' || > getmmconfig('fileHeatPeriodMinutes') || ' ' || &

Re: [gpfsug-discuss] Fileheat

2019-08-13 Thread Jan-Frode Myklebust
What about filesystem atime updates. We recently changed the default to «relatime». Could that maybe influence heat tracking? -jf tir. 13. aug. 2019 kl. 11:29 skrev Ulrich Sibiller < u.sibil...@science-computing.de>: > On 12.08.19 15:38, Marc A Kaplan wrote: > > My Admin guide says: > > > >

Re: [gpfsug-discuss] Any guidelines for choosing vdisk size?

2019-07-01 Thread Jan-Frode Myklebust
I would mainly consider future upgrades. F.ex. do one vdisk per disk shelf per rg. F.ex. for a GL6S you would have 12 vdisks, and if you add a GL4S you would add 8 more vdisks, then each spindle of both systems should get approximately the same number of IOs. Another thing to consider is re-alloca

Re: [gpfsug-discuss] rescan-scsi-bus.sh and "Local access to NSD failed with EIO, switching to access the disk remotely."

2019-06-25 Thread Jan-Frode Myklebust
I’ve had a situation recently where mmnsddiscover didn’t help, but mmshutdown/mmstartup on that node did fix it. This was with v5.0.2-3 on ppc64le. -jf tir. 25. jun. 2019 kl. 17:02 skrev Son Truong : > > Hello Renar, > > Thanks for that command, very useful and I can now see the problematic

Re: [gpfsug-discuss] [EXTERNAL] Intro, and Spectrum Archive self-service recall interface question

2019-05-21 Thread Jan-Frode Myklebust
It’s a multiple of full blocks. -jf tir. 21. mai 2019 kl. 20:06 skrev Todd Ruston : > Hi Indulis, > > Yes, thanks for the reminder. I'd come across that, and our system is > currently set to a stub size of zero (the default, I presume). I'd intended > to ask in my original query whether anyon

Re: [gpfsug-discuss] gpfsug-discuss Digest, Vol 87, Issue 4

2019-04-03 Thread Jan-Frode Myklebust
scale.org > > When replying, please edit your Subject line so it is more specific > than "Re: Contents of gpfsug-discuss digest..." > > > Today's Topics: > >1. Re: Adding ESS to existing Scale Cluster (Sanchez, Paul) >2. New ESS install - Ne

Re: [gpfsug-discuss] New ESS install - Network adapter down level

2019-04-03 Thread Jan-Frode Myklebust
Have you tried: updatenode nodename -P gss_ofed But, is this the known issue listed in the qdg? https://www.ibm.com/support/knowledgecenter/SSYSP8_5.3.2/ess_qdg.pdf -jf ons. 3. apr. 2019 kl. 19:26 skrev Oesterlin, Robert < robert.oester...@nuance.com>: > Any insight on what command I nee

Re: [gpfsug-discuss] Filesystem descriptor discs for GNR

2019-03-28 Thread Jan-Frode Myklebust
There seems to be a change or a bug here.. But try usage=dataOnly pool=neverused failureGroup=xx.. and it should serve the same function as long as you never place anything in this pool. -jf tor. 28. mar. 2019 kl. 18:43 skrev Luke Sudbery : > We have a 2 site Lenovo DSS-G based filesystem, whic
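
An illustrative NSD stanza for that workaround (all names, the device and the failure group are placeholders):

# A small LUN that will only ever hold a filesystem descriptor: declared as
# dataOnly, but parked in a pool that no placement rule ever targets.
%nsd:
  nsd=descdisk01
  device=/dev/mapper/descdisk01
  servers=quorumnode1
  usage=dataOnly
  pool=neverused
  failureGroup=30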

Re: [gpfsug-discuss] Getting which files are store fully in inodes

2019-03-28 Thread Jan-Frode Myklebust
I've been looking for a good way of listing this as well. Could you please share your policy ? -jf On Thu, Mar 28, 2019 at 1:52 PM Dorigo Alvise (PSI) wrote: > Hello, > to get the list (and size) of files that fit into inodes what I do, using > a policy, is listing "online" (not evicted) fil

Re: [gpfsug-discuss] Querying size of snapshots

2019-01-29 Thread Jan-Frode Myklebust
You could put snapshot data in a separate storage pool. Then it should be visible how much space it occupies, but it’s a bit hard to see how this will be usable/manageable.. -jf tir. 29. jan. 2019 kl. 20:08 skrev Christopher Black : > Thanks for the quick and detailed reply! I had read the manua

Re: [gpfsug-discuss] Couple of questions related to storage pools and mmapplypolicy

2018-12-18 Thread Jan-Frode Myklebust
I don't think it's going to figure this out automatically between the two rules.. I believe you will need to do something like the below (untested, and definitely not perfect!!) rebalancing: define( weight_junkedness, (CASE /* Create 3 classes of files */

Re: [gpfsug-discuss] Anybody running GPFS over iSCSI?

2018-12-16 Thread Jan-Frode Myklebust
I’d be curious to hear if all these arguments against iSCSI shouldn’t also apply to NSD protocol over TCP/IP? -jf man. 17. des. 2018 kl. 01:22 skrev Jonathan Buzzard < jonathan.buzz...@strath.ac.uk>: > On 13/12/2018 20:54, Buterbaugh, Kevin L wrote: > > [SNIP] > > > > > Two things that I am alre

Re: [gpfsug-discuss] Anybody running GPFS over iSCSI? -

2018-12-16 Thread Jan-Frode Myklebust
I have been running GPFS over iSCSI, and know of customers who are also. Probably not in the most demanding environments, but from my experience iSCSI works perfectly fine as long as you have a stable network. Having a dedicated (simple) storage network for iSCSI is probably a good idea (just like

Re: [gpfsug-discuss] gpfs mount point not visible in snmp hrStorageTable

2018-11-07 Thread Jan-Frode Myklebust
/mibgroup/hardware/fsys/mnttypes.h
@@ -121,6 +121,9 @@
 #ifndef MNTTYPE_GFS2
 #define MNTTYPE_GFS2 "gfs2"
 #endif
+#ifndef MNTTYPE_GPFS
+#define MNTTYPE_GPFS "gpfs"
+#endif
 #ifndef MNTTYPE_XFS
 #define MNTTYPE_XFS "xfs"
 #endif

On Wed, Nov 7, 2018 at 12:20

Re: [gpfsug-discuss] gpfs mount point not visible in snmp hrStorageTable

2018-11-07 Thread Jan-Frode Myklebust
Looking at the CHANGELOG for net-snmp, it seems it needs to know about each filesystem it's going to support, and I see no GPFS/mmfs. It has entries like: - Added simfs (OpenVZ filesystem) to hrStorageTable and hrFSTable. - Added CVFS (CentraVision File System) to hrStorageTable and

Re: [gpfsug-discuss] Preliminary conclusion: single client, single thread, small files - native Scale vs NFS

2018-10-17 Thread Jan-Frode Myklebust
Also beware there are 2 different linux NFS "async" settings. A client side setting (mount -o async), which still causes a sync on file close() -- and a server (knfs) side setting (/etc/exports) that violates NFS protocol and returns requests before data has hit stable storage. -jf On Wed, Oct 17
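
For illustration, where the two unrelated "async" knobs live (host names and paths are placeholders):

# 1) Client-side mount option: writes are buffered on the client, but close()
#    and fsync() still push the data to the server (close-to-open semantics).
mount -t nfs -o async,vers=3 nfsserver:/export /mnt/scratch

# 2) Server-side (kernel NFS) export option in /etc/exports: the server replies
#    before data reaches stable storage, which violates the NFS protocol:
#
#    /export   *(rw,async)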

Re: [gpfsug-discuss] Preliminary conclusion: single client, single thread, small files - native Scale vs NFS

2018-10-17 Thread Jan-Frode Myklebust
nc on each file.. > > you'll never outperform e.g. 128 (maybe slower), but, parallel threads > (running write-behind) <---> with one single but fast threads, > > so as Alex suggest.. if possible.. take gpfs client of kNFS for those > types of workloads.. > >

Re: [gpfsug-discuss] Preliminary conclusion: single client, single thread, small files - native Scale vs NFS

2018-10-17 Thread Jan-Frode Myklebust
Do you know if the slow throughput is caused by the network/nfs-protocol layer, or does it help to use faster storage (ssd)? If on storage, have you considered if HAWC can help? I’m thinking about adding an SSD pool as a first tier to hold the active dataset for a similar setup, but that’s mainly

Re: [gpfsug-discuss] replicating ACLs across GPFS's?

2018-09-25 Thread Jan-Frode Myklebust
Not sure if it's a better or worse idea, but I believe robocopy supports syncing just the ACLs, so if you do SMB mounts from both sides, that might be an option. -jf tir. 25. sep. 2018 kl. 20:05 skrev Bryan Banister : > I have to correct myself, looks like using nfs4_getacl, nfs4_setfacl, > nfs4_editfac

Re: [gpfsug-discuss] Metadata with GNR code

2018-09-21 Thread Jan-Frode Myklebust
That reminds me of a point Sven made when I was trying to optimize mdtest results with metadata on FlashSystem... He sent me the following: -- started at 11/15/2015 15:20:39 -- mdtest-1.9.3 was launched with 138 total task(s) on 23 node(s) Command line used: /ghome/oehmes/mpi/bin/mdtest-pcmpi9131-

[gpfsug-discuss] mmapplypolicy --choice-algorithm fast

2018-05-28 Thread Jan-Frode Myklebust
Just found the Spectrum Scale policy "best practices" presentation from the latest UG: http://files.gpfsug.org/presentations/2018/USA/SpectrumScalePolicyBP.pdf which mentions: "mmapplypolicy … --choice-algorithm fast && ... WEIGHT(0) … (avoids final sort of all selected files by weight)" and lo

Re: [gpfsug-discuss] Recharging where HSM is used

2018-05-03 Thread Jan-Frode Myklebust
Since I'm pretty proud of my awk one-liner, and maybe it's useful for this kind of charging, here's how to sum up how much data each user has in the filesystem (without regards to if the data blocks are offline, online, replicated or compressed): # cat full-file-list.policy RULE EXTERNAL LIST 'fil
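
An untested sketch of that approach: a LIST policy dumping owner and size for every file, then an awk pass summing per user (rule names, SHOW fields and the awk column numbers are assumptions to check against your list-file layout):

cat > /tmp/full-file-list.policy <<'EOF'
RULE EXTERNAL LIST 'allfiles' EXEC ''
RULE 'dump' LIST 'allfiles'
     SHOW(VARCHAR(USER_ID) || ' ' || VARCHAR(FILE_SIZE))
     /* FILE_SIZE = logical size, regardless of offline/replicated/compressed state */
EOF
mmapplypolicy gpfs0 -P /tmp/full-file-list.policy -I defer -f /tmp/flist
# List-file lines look roughly like: <inode> <gen> <snapid> <uid> <bytes> -- <path>
awk '{ sum[$4] += $5 } END { for (u in sum) printf "%s %.1f GiB\n", u, sum[u]/2^30 }' \
    /tmp/flist.list.allfiles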

Re: [gpfsug-discuss] GPFS autoload - wait for IB portstobecomeactive

2018-04-27 Thread Jan-Frode Myklebust
Stock | IBM Pittsburgh Lab | 720-430-8821 > sto...@us.ibm.com > > > > From:Jan-Frode Myklebust > To:gpfsug main discussion list > Date:03/16/2018 04:30 AM > > Subject:Re: [gpfsug-discuss] GPFS autoload - wait for IB ports >

Re: [gpfsug-discuss] gpfsug-discuss Digest, Vol 75, Issue 34

2018-04-22 Thread Jan-Frode Myklebust
> > Ray Coetzee > Mob: +44 759 704 7060 > > Skype: ray.coetzee > > Email: coetzee@gmail.com > > > On Mon, Apr 23, 2018 at 12:02 AM, Jan-Frode Myklebust > wrote: > >> >> Yes, I've been struggelig with something similiar this week. Ganesha >>

Re: [gpfsug-discuss] gpfsug-discuss Digest, Vol 75, Issue 34

2018-04-22 Thread Jan-Frode Myklebust
Yes, I've been struggling with something similar this week. Ganesha dying with SIGABRT -- nothing else logged. After catching a few coredumps, it has been identified as a problem with some UDP communication during mounts from Solaris clients. Disabling UDP as transport on the shares server-side did

Re: [gpfsug-discuss] GPFS autoload - wait for IB ports tobecomeactive

2018-03-16 Thread Jan-Frode Myklebust
target.wants/NetworkManager-wait-online. > service' > > in many cases .. it helps .. > > > > > > From:Jan-Frode Myklebust > To:gpfsug main discussion list > Date:03/15/2018 06:18 PM > Subject:Re: [gpfsug-discuss] GPFS

Re: [gpfsug-discuss] GPFS autoload - wait for IB ports to becomeactive

2018-03-15 Thread Jan-Frode Myklebust
I found some discussion on this at https://www.ibm.com/developerworks/community/forums/html/threadTopic?id=----14471957&ps=25 and there it's claimed that none of the callback events are early enough to resolve this. That we need a pre-preStartup trigger. Any idea if this has

[gpfsug-discuss] AFM and RHEL 7.4

2018-01-25 Thread Jan-Frode Myklebust
The FAQ has a note stating: 1. AFM, Asynch Disaster Recovery with AFM, Integrated Protocols, and Installation Toolkit are not supported on RHEL 7.4. Could someone please clarify this sentence ? It can't be right that none of these features are supported with RHEL 7.4, or .. ? -jf

Re: [gpfsug-discuss] storage-based replication for Spectrum Scale

2018-01-23 Thread Jan-Frode Myklebust
Have you seen https://www.ibm.com/support/knowledgecenter/en/STXKQY_4.2.0/com.ibm.spectrum.scale.v4r2.adv.doc/bl1adv_dr.htm ? Seems to cover what you’re looking for.. -jf ons. 24. jan. 2018 kl. 07:33 skrev Harold Morales : > Thanks for answering. > > Essentially, the idea being explored is to

Re: [gpfsug-discuss] pool block allocation algorithm

2018-01-13 Thread Jan-Frode Myklebust
Don’t have documentation/whitepaper, but as I recall, it will first allocate round-robin over failureGroup, then round-robin over nsdServers, and then round-robin over volumes. So if these new NSDs are defined as different failureGroup from the old disks, that might explain it.. -jf lør. 13. jan.

Re: [gpfsug-discuss] ESS bring up the GPFS in recovery group without takeover

2017-12-22 Thread Jan-Frode Myklebust
Can’t you just reverse the mmchrecoverygroup --servers order, before starting the io-server? -jf fre. 22. des. 2017 kl. 18:45 skrev Damir Krstic : > It's been a very frustrating couple of months with our 2 ESS systems. IBM > tells us we had blueflame bug and they came on site and updated our ESS

Re: [gpfsug-discuss] Online data migration tool

2017-12-01 Thread Jan-Frode Myklebust
Bill, could you say something about what the metadata-storage here was? ESS/NL-SAS/3way replication? I just asked about this in the internal slack channel #scale-help today.. -jf fre. 1. des. 2017 kl. 13:44 skrev Bill Hartner : > > "It has a significant performance penalty for small files in

Re: [gpfsug-discuss] Write performances and filesystem size

2017-11-15 Thread Jan-Frode Myklebust
Olaf, this looks like a Lenovo «ESS GLxS» version. Should be using same number of spindles for any size filesystem, so I would also expect them to perform the same. -jf ons. 15. nov. 2017 kl. 11:26 skrev Olaf Weiser : > to add a comment ... .. very simply... depending on how you allocate th

Re: [gpfsug-discuss] Backing up GPFS config

2017-11-14 Thread Jan-Frode Myklebust
Please see https://www.ibm.com/developerworks/community/wikis/home?lang=en#!/wiki/General%20Parallel%20File%20System%20(GPFS)/page/Back%20Up%20GPFS%20Configuration But also check «mmcesdr primary backup». I don't remember if it included all of mmbackupconfig/mmccr, but I think it did, and it also i

Re: [gpfsug-discuss] Experience with CES NFS export management

2017-10-23 Thread Jan-Frode Myklebust
You can lower LEASE_LIFETIME and GRACE_PERIOD to shorten the time it's in grace, to make it more bearable. Making export changes dynamic is something that's fixed in newer versions of nfs-ganesha than what's shipped with Scale: https://github.com/nfs-ganesha/nfs-ganesha/releases/tag/V2.4.0: "dy
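
One hedged way to do that on the CES side (attribute names are from the post; verify the exact mmnfs syntax and defaults for your release with "mmnfs config list", and expect the change to restart the NFS service):

mmnfs config change "LEASE_LIFETIME=20"
mmnfs config change "GRACE_PERIOD=20"
mmnfs config list | egrep 'LEASE_LIFETIME|GRACE_PERIOD'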

Re: [gpfsug-discuss] how gpfs work when disk fail

2017-10-09 Thread Jan-Frode Myklebust
You don't have room to write 180GB of file data, only ~100GB. When you write f.ex. 90 GB of file data, each filesystem block will get one copy written to each of your disks, occupying 180 GB of total disk space. So you can always read it from the other disks if one should fail. This is controlled

Re: [gpfsug-discuss] Is GPFS starting NSDs automatically?

2017-08-09 Thread Jan-Frode Myklebust
If you do a "mmchconfig restripeOnDiskFailure=yes", such a callback will be added for node-join events. That can be quite useful for stretched clusters, where you want to replicate all blocks to both locations, and this way recover automatically. -jf On Wed, Aug 9, 2017 at 2:14 PM, Oesterlin,

Re: [gpfsug-discuss] SOBAR questions

2017-07-19 Thread Jan-Frode Myklebust
Nils Haustein did such a migration from v7000 Unified to ESS last year. Used SOBAR to avoid recalls from HSM. I believe he wrote a whitepaper on the process.. -jf tir. 18. jul. 2017 kl. 21.21 skrev Simon Thompson (IT Research Support) < s.j.thomp...@bham.ac.uk>: > So just following up on my ques

Re: [gpfsug-discuss] get free space in GSS

2017-07-09 Thread Jan-Frode Myklebust
You had it here:

[root@server ~]# mmlsrecoverygroup BB1RGL -L

                    declustered
 recovery group       arrays     vdisks  pdisks  format version
 ----------------   -----------  ------  ------  --------------
 BB1RGL                       3      18     119         4.2.0.1

 declustered   needs                          replace   scrub   background activity
    array     service  vdisks  pdisks  spares

Re: [gpfsug-discuss] 'mmces address move' weirdness?

2017-06-12 Thread Jan-Frode Myklebust
Switch to node affinity policy, and it will stick to where you move it. "mmces address policy node-affinity". -jf tir. 13. jun. 2017 kl. 06.21 skrev : > On Mon, 12 Jun 2017 20:06:09 -, "Simon Thompson (IT Research Support)" > said: > > > mmces node suspend -N > > > > Is what you want. This

Re: [gpfsug-discuss] connected v. datagram mode

2017-05-12 Thread Jan-Frode Myklebust
I also don't know much about this, but the ESS quick deployment guide is quite clear that we should use connected mode for IPoIB: -- Note: If using bonded IP over IB, do the following: Ensure that the CONNECTED_MODE=yes statement exists in the corresponding slave-bond interface scrip
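
For illustration, the relevant piece of a RHEL slave-interface script (interface and bond names are placeholders; only the CONNECTED_MODE line comes from the quoted guide):

# /etc/sysconfig/network-scripts/ifcfg-ib0  -- one IPoIB slave of bond0
TYPE=InfiniBand
DEVICE=ib0
ONBOOT=yes
BOOTPROTO=none
MASTER=bond0
SLAVE=yes
CONNECTED_MODE=yes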

Re: [gpfsug-discuss] Tiebreaker disk question

2017-05-03 Thread Jan-Frode Myklebust
This doesn't sound like normal behaviour. It shouldn't matter which filesystem your tiebreaker disks belong to. I think the failure was caused by something else, but am not able to guess from the little information you posted.. The mmfs.log will probably tell you the reason. -jf ons. 3. mai 2017

Re: [gpfsug-discuss] NFS issues

2017-04-26 Thread Jan-Frode Myklebust
e need either long IO blocked reads > >or writes (from the GPFS end). > > > >We also originally had soft as the default option, but saw issues then > >and the docs suggested hard, so we switched and also enabled sync (we > >figured maybe it was NFS client with uncommited w

Re: [gpfsug-discuss] NFS issues

2017-04-25 Thread Jan-Frode Myklebust
I *think* I've seen this, and that we then had open TCP connection from client to NFS server according to netstat, but these connections were not visible from netstat on NFS-server side. Unfortunately I don't remember what the fix was.. -jf tir. 25. apr. 2017 kl. 16.06 skrev Simon Thompson (

Re: [gpfsug-discuss] Used virtualization technologies for GPFS/Spectrum Scale

2017-04-24 Thread Jan-Frode Myklebust
I agree with Luis -- why so many nodes? """ So if i would go for 4 NSD server, 6 protocol nodes and 2 tsm backup nodes and at least 3 test server a total of 11 server is needed. """ If this is your whole cluster, why not just 3x P822L/P812L running single partition per node, hosting a cluster of

Re: [gpfsug-discuss] Protocol node recommendations

2017-04-23 Thread Jan-Frode Myklebust
e > mix both protocols for such case. > > > Is the spreadsheet publicly available or do we need to ask IBM ? > > > Thank for your help, > > Frank. > > > -- > *From:* Jan-Frode Myklebust > *Sent:* Saturday, April 22, 2017 10:50

Re: [gpfsug-discuss] Protocol node recommendations

2017-04-22 Thread Jan-Frode Myklebust
That's a tiny maxFilesToCache... I would start by implementing the settings from /usr/lpp/mmfs/*/gpfsprotocolldefaul* plus a 64GB pagepool for your protocol nodes, and leave further tuning for when you see you have issues. Regarding sizing, we have a spreadsheet somewhere where you can input some

Re: [gpfsug-discuss] Can't delete filesystem

2017-04-05 Thread Jan-Frode Myklebust
Maybe try mmumount -f on the remaining 4 nodes? -jf ons. 5. apr. 2017 kl. 18.54 skrev Buterbaugh, Kevin L < kevin.buterba...@vanderbilt.edu>: > Hi Simon, > > No, I do not. > > Let me also add that this is a filesystem that I migrated users off of and > to another GPFS filesystem. I moved the l

Re: [gpfsug-discuss] gpfsug-discuss Digest, Vol 62, Issue 33

2017-03-16 Thread Jan-Frode Myklebust
Why would you need an NSD protocol router when the NSD servers can have a mix of InfiniBand and Ethernet adapters? F.ex. 4x EDR + 2x 100GbE per io-node in an ESS should give you lots of bandwidth for your common Ethernet medium. -jf On Thu, Mar 16, 2017 at 1:52 AM, Aaron Knister wrote: > *dra

Re: [gpfsug-discuss] default inode size

2017-03-15 Thread Jan-Frode Myklebust
Another thing to consider is how many disk block pointers you have room for in the inode, and when you'll need to add additional indirect blocks. Ref: http://files.gpfsug.org/presentations/2016/south-bank/D2_P2_A_spectrum_scale_metadata_dark_V2a.pdf If I understand that presentation correctly.. a

Re: [gpfsug-discuss] Running gpfs.snap outside of problems

2017-03-09 Thread Jan-Frode Myklebust
There's a manual for it now.. and it points out "The tool impacts performance.." Also it has caused mmfsd crashes for me earlier, so I've learned to be wary of running it.. The manual also says it collects data using mmfsadm, and the mmfsadm manual warns that it might cause GPFS to fail ("in certa

Re: [gpfsug-discuss] shutdown

2017-03-03 Thread Jan-Frode Myklebust
Unmount filesystems cleanly "mmumount all -a", stop gpfs "mmshutdown -N gss_ppc64" and poweroff "xdsh gss_ppc64 poweroff". -jf fre. 3. mar. 2017 kl. 20.55 skrev Joseph Grace : > Please excuse the newb question but we have a planned power outage coming > and I can't locate a procedure to shutdown

Re: [gpfsug-discuss] Issues getting SMB shares working.

2017-03-02 Thread Jan-Frode Myklebust
ust > using that for testing, the NFS clients all access it though nfsv4 which > looks to be using kerberos from this. > > On 01/03/17 20:21, Jan-Frode Myklebust wrote: > > This looks to me like a quite plain SYS authorized NFS, maybe also verify > > that the individua

Re: [gpfsug-discuss] Issues getting SMB shares working.

2017-03-01 Thread Jan-Frode Myklebust
> LOG_LEVEL: EVENT > == > > Idmapd Configuration > == > DOMAIN: DS.LEEDS.AC.UK > == > > On 01/03/17 14:12, Jan-Frode Myklebust wrote: > > Lets figure out how your NFS is authenticating

Re: [gpfsug-discuss] Issues getting SMB shares working.

2017-03-01 Thread Jan-Frode Myklebust
Lets figure out how your NFS is authenticating then. The userdefined authentication you have, could mean that your linux host is configured to authenticated towards AD --- or it could be that you're using simple sys-authentication for NFS. Could you please post the output of: mmnfs export list mm

Re: [gpfsug-discuss] Tracking deleted files

2017-02-27 Thread Jan-Frode Myklebust
AFM apparently keeps track of this, so maybe it would be possible to run AFM-SW with a disconnected home and query the queue of changes? But it would require some way of clearing the queue as well.. -jf On Monday, February 27, 2017, Marc A Kaplan wrote: > Diffing file lists can be fast - IF yo

Re: [gpfsug-discuss] bizarre performance behavior

2017-02-17 Thread Jan-Frode Myklebust
I just had a similar experience with a SanDisk InfiniFlash system SAS-attached to a single host. gpfsperf reported 3.2 Gbyte/s for writes, and 250-300 Mbyte/s on sequential reads!! Random reads were on the order of 2 Gbyte/s. After a bit of head scratching and fumbling around I found out that reducin

Re: [gpfsug-discuss] Getting 'blk_cloned_rq_check_limits: over max size limit' errors after updating the systems to kernel 2.6.32-642.el6 or later

2017-02-12 Thread Jan-Frode Myklebust
The 4.2.2.2 readme says: * Fix a multipath device failure that reads "blk_cloned_rq_check_limits: over max size limit" which can occur when kernel function bio_get_nr_vecs() returns a value which is larger than the value of max sectors of the block device. -jf On Sat, Feb 11, 2017 at 7:32 PM

Re: [gpfsug-discuss] Manager nodes

2017-01-24 Thread Jan-Frode Myklebust
Just some datapoints, in hope that it helps.. I've seen metadata performance improvements by turning down hyperthreading from 8/core to 4/core on Power8. Also it helped distributing the token managers over multiple nodes (6+) instead of fewer. I would expect this to flow over IP, not IB. -jf

Re: [gpfsug-discuss] nodes being ejected out of the cluster

2017-01-11 Thread Jan-Frode Myklebust
> Damir > > On Wed, Jan 11, 2017 at 12:38 PM Jan-Frode Myklebust > wrote: > > And there you have: > > [ems1-fdr,compute,gss_ppc64] > verbsRdmaSend yes > > Try turning this off. > > > -jf > ons. 11. jan. 2017 kl. 18.54 skrev Damir Krstic : > >

Re: [gpfsug-discuss] nodes being ejected out of the cluster

2017-01-11 Thread Jan-Frode Myklebust
And there you have: [ems1-fdr,compute,gss_ppc64] verbsRdmaSend yes Try turning this off. -jf ons. 11. jan. 2017 kl. 18.54 skrev Damir Krstic : > Thanks for all the suggestions. Here is our mmlsconfig file. We just > purchased another GL6. During the installation of the new GL6 IBM will > upgra

Re: [gpfsug-discuss] nodes being ejected out of the cluster

2017-01-11 Thread Jan-Frode Myklebust
My first guess would also be rdmaSend, which the gssClientConfig.sh enables by default, but isn't scalable to large clusters. It fits with your error message: https://www.ibm.com/developerworks/community/wikis/home?lang=en#!/wiki/General%20Parallel%20File%20System%20%28GPFS%29/page/Best%20Practice

Re: [gpfsug-discuss] CES log files

2017-01-11 Thread Jan-Frode Myklebust
I also struggle with where to look for CES log files.. but maybe the new "mmprotocoltrace" command can be useful? # mmprotocoltrace start smb ### reproduce problem # mmprotocoltrace stop smb Check log files it has collected. -jf On Wed, Jan 11, 2017 at 10:27 AM, Sobey, Richard A wrote: >

Re: [gpfsug-discuss] replication and no failure groups

2017-01-09 Thread Jan-Frode Myklebust
Yaron, doesn't "-1" make each of these disks an independent failure group? From 'man mmcrnsd': "The default is -1, which indicates this disk has no point of failure in common with any other disk." -jf man. 9. jan. 2017 kl. 21.54 skrev Yaron Daniel : > Hi > > So - do u able to have GPFS rep

Re: [gpfsug-discuss] AFM Migration Issue

2017-01-09 Thread Jan-Frode Myklebust
Untested, and I have no idea if it will work on the number of files and directories you have, but maybe you can fix it by rsyncing just the directories? rsync -av --dry-run --include='*/' --exclude='*' source/ destination/ -jf man. 9. jan. 2017 kl. 16.09 skrev : > Hi All, > > We have just com
