Re: [gpfsug-discuss] kernel 3.10.0-1160.36.2.el7.x86_64 (CVE-2021-33909) not compatible with DB2 (for TSM, HPSS, possibly other IBM apps)

2021-07-30 Thread Jaime Pinto

Hey Jonathan

3.10.0-1160.31.1 seems to be one of the last kernel releases prior to the 
CVE-2021-33909 exploit.
3.10.0-1160.36.2.el7.x86_64 seems to be the first on the Red Hat repo that fixes 
the exploit, but it's not working for our combination of TSM/DB2 versions:
* TSM 8.1.8
* DB2 v11.1.4.4

I'll just keep one eye on the repo for the next kernel available and try it 
again. Until then I'll stick with 3.10.0-1062.18.1

On the HPSS side 3.10.0-1160.36.2.el7.x86_64 worked fine with DB2 11.5, but not 
with 10.5

Thanks
Jaime


On 7/30/2021 07:27:49, Jonathan Buzzard wrote:

On 30/07/2021 05:16, Jaime Pinto wrote:


Alert for sysadmins managing TSM/DB2 servers and those responsible for 
applying security patches, regarding kernel 3.10.0-1160.36.2.el7.x86_64 in 
particular, despite the security concerns raised by CVE-2021-33909:

Please hold off on upgrading your Red Hat systems (possibly CentOS too). I just 
found out the hard way that kernel 3.10.0-1160.36.2.el7.x86_64 is not 
compatible with DB2: after the node reboot DB2 would not work anymore, not 
only for TSM but also for HPSS. I had to revert the kernel to 
3.10.0-1062.18.1.el7.x86_64 to get DB2 working properly again.



For the record I have been running Spectrum Protect Extended Edition 8.1.12 on 
3.10.0-1160.31.1 (genuine RHEL 7.9) since the 11th of June this year.

I would say therefore there is no need to roll back quite so far as 
3.10.0-1062.18.1, which is quite ancient now.

Can't test anything newer as I am literally in the middle of migrating our TSM 
server to new hardware and a RHEL 8.4 install. Spent yesterday in the data 
centre re-cabling the disk arrays to the new server; neat, tidy and labelled 
this time :-)


JAB.



---
Jaime Pinto - Storage Analyst
SciNet HPC Consortium - Compute/Calcul Canada
www.scinet.utoronto.ca - www.computecanada.ca
University of Toronto
661 University Ave. (MaRS), Suite 1140
Toronto, ON, M5G1M1
P: 416-978-2755
C: 416-505-1477
___
gpfsug-discuss mailing list
gpfsug-discuss at spectrumscale.org
http://gpfsug.org/mailman/listinfo/gpfsug-discuss


[gpfsug-discuss] kernel 3.10.0-1160.36.2.el7.x86_64 (CVE-2021-33909) not compatible with DB2 (for TSM, HPSS, possibly other IBM apps)

2021-07-30 Thread Jaime Pinto

Alert for sysadmins managing TSM/DB2 servers and those responsible for 
applying security patches, regarding kernel 3.10.0-1160.36.2.el7.x86_64 in 
particular, despite the security concerns raised by CVE-2021-33909:

Please hold off on upgrading your Red Hat systems (possibly CentOS too). I just 
found out the hard way that kernel 3.10.0-1160.36.2.el7.x86_64 is not 
compatible with DB2: after the node reboot DB2 would not work anymore, not 
only for TSM but also for HPSS. I had to revert the kernel to 
3.10.0-1062.18.1.el7.x86_64 to get DB2 working properly again.
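
For anyone who needs to stay on (or go back to) the older kernel until this is 
sorted out, a hedged sketch of how to do that on RHEL/CentOS 7; the menu-entry 
index differs per host, so list the entries first and pick your own (this assumes 
the stock grub2 setup with GRUB_DEFAULT=saved, and a BIOS-style /etc/grub2.cfg 
path -- adjust for EFI):

~~~
# list the boot entries grub2 knows about, with their indexes
awk -F\' '/^menuentry /{print i++ " : " $2}' /etc/grub2.cfg

# make the 3.10.0-1062.18.1 entry the default (substitute the index shown above)
grub2-set-default 1
reboot

# optionally keep yum from pulling the problem kernel back in for now,
# by adding a line like this to /etc/yum.conf:
#   exclude=kernel-3.10.0-1160.36.2*
~~~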

---
Jaime Pinto - Storage Analyst
SciNet HPC Consortium - Compute/Calcul Canada
www.scinet.utoronto.ca - www.computecanada.ca
University of Toronto
661 University Ave. (MaRS), Suite 1140
Toronto, ON, M5G1M1
P: 416-978-2755
C: 416-505-1477
___
gpfsug-discuss mailing list
gpfsug-discuss at spectrumscale.org
http://gpfsug.org/mailman/listinfo/gpfsug-discuss


Re: [gpfsug-discuss] SS 5.0.x and quota issues

2020-05-18 Thread Jaime Pinto

So Bob,
Yes, we too have observed an uncharacteristic lag in the correction of the 
internal quota accounting on GPFS since we updated from version 3.3 to version 
4.x some 7-8 years ago. That lag remains through version 5.0.x as well, and it 
has persisted through several appliances (DDN, G200, GSS, ESS and now DSS-G). In 
our university environment there is also a lot of data churn, in particular 
small files.

The workaround has always been to periodically run mmcheckquota on the top 
independent fileset to expedite that correction (I have a crontab script that 
measures the relative size of the in-doubt columns on the mmrepquota report, 
both size and inodes, and if either exceeds 2% for any USR/GRP/FILESET entry I 
run mmcheckquota).
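
For illustration, a hedged sketch of that kind of cron check; the field numbers 
assume the per-fileset report layout we see from 'mmrepquota -u -g -j' (usage 
in column 4, in-doubt in column 7), so verify them against your own output, or 
switch to the -Y parseable format:

~~~
#!/bin/bash
# Hypothetical sketch: run mmcheckquota when any in-doubt value exceeds ~2% of usage.
DEV=fs0        # filesystem device (placeholder)
PCT=2          # threshold in percent

over=$(/usr/lpp/mmfs/bin/mmrepquota -u -g -j "$DEV" 2>/dev/null | awk -v p="$PCT" '
    $4 ~ /^[0-9]+$/ && $4 > 0 && ($7 / $4) * 100 > p { n++ }
    END { print n + 0 }')

if [ "$over" -gt 0 ]; then
    /usr/lpp/mmfs/bin/mmcheckquota "$DEV"
fi
~~~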

We have opened support calls with IBM about this issue in the past, hoping 
there was some GPFS configuration parameter we could adjust to have this 
correction done automatically, but we never got a solution. We gave up.

And that begs the question: what do you mean by "... 5.0.4-4 ... that has a fix for 
mmcheckquota"? Isn't mmcheckquota zeroing the in-doubt columns when you run it?

The fix should be for gpfs (something buried in the code over many versions). 
As far as I can tell there has never been anything wrong with mmcheckquota.

Thanks
Jaime


On 5/18/2020 08:59:09, Cregan, Bob wrote:

Hi,
       At Imperial we have been experiencing an issue with SS 5.0.x and quotas. The main 
symptom is a slow decay in the accuracy of reported quota usage when compared to the 
actual usage as reported by "du". This discrepancy can be as little as a few 
percent and as much as many hundreds of percent. We also sometimes see bizarre effects such as negative 
file number counts being reported.

We have been working with IBM  and have put in the latest 5.0.4-4 (that has a 
fix for mmcheckquota) that we have been pinning our hopes on, but this has not 
worked.

Is anyone else experiencing similar issues? We need to try and get an idea if 
this is an issue peculiar to our set up or a more general SS problem.

We are using user and group quotas in a fileset context.

Thanks


*Bob Cregan*
HPC Systems Analyst

Information & Communication Technologies

Imperial College London,
South Kensington Campus London, SW7 2AZ
T: 07712388129
E: b.cre...@imperial.ac.uk

W: www.imperial.ac.uk/ict/rcs <http://www.imperial.ac.uk/ict/rcs>

@imperialRCS @imperialRSE



___
gpfsug-discuss mailing list
gpfsug-discuss at spectrumscale.org
http://gpfsug.org/mailman/listinfo/gpfsug-discuss



 TELL US ABOUT YOUR SUCCESS STORIES
 http://www.scinethpc.ca/testimonials
---
Jaime Pinto - Storage Analyst
SciNet HPC Consortium - Compute/Calcul Canada
www.scinet.utoronto.ca - www.computecanada.ca
University of Toronto
661 University Ave. (MaRS), Suite 1140
Toronto, ON, M5G1M1
P: 416-978-2755
C: 416-505-1477
___
gpfsug-discuss mailing list
gpfsug-discuss at spectrumscale.org
http://gpfsug.org/mailman/listinfo/gpfsug-discuss


Re: [gpfsug-discuss] Odd networking/name resolution issue

2020-05-10 Thread Jaime Pinto

The rationale for my suggestion doesn't have much to do with the central DNS 
server, but everything to do with the DNS client side of the service.
If you have a very busy cluster at times, with a number of nodes sustaining 
1000+ IOPS for instance, the OS on a client can barely spare a cycle to ask 
the DNS server which IP is associated with the name of the interface leading 
to the GPFS infrastructure, or even to process that response when it returns, 
on the same interface where it is already contended and trying to process all 
the GPFS data transactions. You can end up in temporary catch-22 situations. 
This can generate a backlog of waiters, and the eventual expelling of some 
nodes when the cluster managers don't hear from them in a reasonable time.

It doesn't really matter if you have a central DNS server on steroids.

Jaime

On 5/10/2020 03:35:29, TURNER Aaron wrote:

Following on from Jonathan Buzzard's comments, I'd also like to point out that 
I've never known a central DNS failure in a UK HEI for as long as I can 
remember, and it was certainly not my intention to suggest that, as I think a 
central DNS issue is highly unlikely. And indeed, as I originally noted, the 
standard command-line tools on the nodes resolve the names as expected, so 
whatever is going on looks like it affects GPFS only. It may even be that the 
repetition of the domain names in the logs is just a function of something it 
does when logging a node that is failing to connect for some other reason 
entirely. It's just not something I recall having seen before, and I wanted to 
see if anyone else had seen it.
--
*From:* gpfsug-discuss-boun...@spectrumscale.org 
 on behalf of Jonathan Buzzard 

*Sent:* 09 May 2020 23:22
*To:* gpfsug-discuss@spectrumscale.org 
*Subject:* Re: [gpfsug-discuss] Odd networking/name resolution issue
On 09/05/2020 12:06, Jaime Pinto wrote:
DNS shouldn't be relied upon on a GPFS cluster for internal 
communication/management or data.




The 1980's have called and want their lack of IP resolution protocols
back :-)

I would kindly disagree. If your DNS is not working then your cluster is
fubar anyway and a zillion other things will also break very rapidly.
For us at least half of the running jobs would be dead in a few minutes
as failure to contact license servers would cause the software to stop.
All authentication and account lookup is also going to fail as well.

You could distribute a hosts file but frankly outside of a storage only
cluster (as opposed to one with hundreds if not thousands of compute
nodes) that is frankly madness and will inevitably come to bite you in
the ass because they *will* get out of sync. The only hosts entry we
have is for the Salt Stack host because it tries to do things before the
DNS resolvers have been set up and consequently breaks otherwise. Which
IMHO is duff on its part.

I would add I can't think of a time in the last 16 years where internal
DNS at any University I have worked at has stopped working for even one
millisecond. If DNS is that flaky at your institution then I suggest
sacking the people responsible for its maintenance as being incompetent
twits. It is just such a vanishingly remote possibility that it's not
worth bothering about. Frankly an aircraft falling out of the sky and
squishing your data centre seems more likely to me.

Finally, in a world of IPv6, anything other than DNS is utter
madness IMHO.


JAB.

--
Jonathan A. Buzzard Tel: +44141-5483420
HPC System Administrator, ARCHIE-WeSt.
University of Strathclyde, John Anderson Building, Glasgow. G4 0NG
___
gpfsug-discuss mailing list
gpfsug-discuss at spectrumscale.org
http://gpfsug.org/mailman/listinfo/gpfsug-discuss
The University of Edinburgh is a charitable body, registered in Scotland, with 
registration number SC005336.

___
gpfsug-discuss mailing list
gpfsug-discuss at spectrumscale.org

Re: [gpfsug-discuss] Odd networking/name resolution issue

2020-05-09 Thread Jaime Pinto

DNS shouldn't be relied upon on a GPFS cluster for internal 
communication/management or data.

As a starting point, make sure the IPs and names of all manager/quorum nodes 
and clients have *unique* entries in the hosts files of all other nodes in the 
clusters, matching how they were joined and licensed in the first place. If you 
issue 'mmlscluster' on the cluster manager for the servers and clients, those 
results should be used to build the common hosts file for all 
nodes involved.
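
As a sketch of that step (the column layout of mmlscluster varies a bit between 
releases, so verify the awk field numbers against your own output before 
trusting the result):

~~~
# build candidate hosts entries (IP, daemon node name, short name) from mmlscluster
/usr/lpp/mmfs/bin/mmlscluster | awk '
    $1 ~ /^[0-9]+$/ && $3 ~ /^[0-9]+\./ {      # node table rows only
        split($2, h, ".")                      # short name from the daemon node name
        printf "%-16s %s %s\n", $3, $2, h[1]
    }' > /tmp/gpfs-hosts.candidate
# review /tmp/gpfs-hosts.candidate, then merge it into /etc/hosts on every node
~~~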

Also, all nodes should have a common ntp configuration, pointing to the same 
*internal* ntp server, easily reachable via a name/IP that is also in the hosts file.

And obviously, you need a stable network, eth or IB. Have a good monitoring 
tool in place, to rule out network as a possible culprit. In the particular 
case of IB, check that the fabric managers are doing their jobs properly.

And keep one eye on the 'tail -f /var/mmfs/gen/mmfslog' output of the managers 
and the nodes being expelled for other clues.

Jaime



On 5/9/2020 06:25:28, TURNER Aaron wrote:

Dear All,

We are getting, on an intermittent basis with currently no obvious pattern, an 
issue with GPFS nodes reporting rejecting nodes of the form:

nodename.domain.domain.domain

DNS resolution using the standard command-line tools of the IP address present 
in the logs does not repeat the domain, and so far it seems isolated to GPFS.

Ultimately the nodes are rejected as not responding on the network.

Has anyone seen this sort of behaviour before?

Regards

Aaron Turner
The University of Edinburgh is a charitable body, registered in Scotland, with 
registration number SC005336.

___
gpfsug-discuss mailing list
gpfsug-discuss at spectrumscale.org
http://gpfsug.org/mailman/listinfo/gpfsug-discuss



 TELL US ABOUT YOUR SUCCESS STORIES
 http://www.scinethpc.ca/testimonials
---
Jaime Pinto - Storage Analyst
SciNet HPC Consortium - Compute/Calcul Canada
www.scinet.utoronto.ca - www.computecanada.ca
University of Toronto
661 University Ave. (MaRS), Suite 1140
Toronto, ON, M5G1M1
P: 416-978-2755
C: 416-505-1477
___
gpfsug-discuss mailing list
gpfsug-discuss at spectrumscale.org
http://gpfsug.org/mailman/listinfo/gpfsug-discuss


[gpfsug-discuss] GPFS vulnerability with possible root exploit on versions prior to 5.0.4.3 (and 4.2.3.21)

2020-04-22 Thread Jaime Pinto

In case you missed it (the forum has been pretty quiet about this one), 
CVE-2020-4273 had an update yesterday:

https://www.ibm.com/support/pages/node/6151701?myns=s033=OCSTXKQY=E_sp=s033-_-OCSTXKQY-_-E

If you can't do the upgrade now, at least apply the mitigation to the client 
nodes generally exposed to unprivileged users:

Check the setuid bit:
ls -l /usr/lpp/mmfs/bin | grep r-s | awk '{system("ls -l /usr/lpp/mmfs/bin/"$9)}'

Apply the mitigation:
ls -l /usr/lpp/mmfs/bin | grep r-s | awk '{system("chmod u-s /usr/lpp/mmfs/bin/"$9)}'

Verification:
ls -l /usr/lpp/mmfs/bin | grep r-s | awk '{system("ls -l /usr/lpp/mmfs/bin/"$9)}'
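
For what it's worth, a find-based way to do the same three steps for the setuid 
binaries (same intent as the one-liners above, shown here only as an alternative):

~~~
find /usr/lpp/mmfs/bin -type f -perm -4000 -ls                       # check
find /usr/lpp/mmfs/bin -type f -perm -4000 -exec chmod u-s {} \;     # mitigate
find /usr/lpp/mmfs/bin -type f -perm -4000 -ls                       # verify (no output expected)
~~~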

All the best
Jaime

 TELL US ABOUT YOUR SUCCESS STORIES
 http://www.scinethpc.ca/testimonials
---
Jaime Pinto - Storage Analyst
SciNet HPC Consortium - Compute/Calcul Canada
www.scinet.utoronto.ca - www.computecanada.ca
University of Toronto
661 University Ave. (MaRS), Suite 1140
Toronto, ON, M5G1M1
P: 416-978-2755
C: 416-505-1477
___
gpfsug-discuss mailing list
gpfsug-discuss at spectrumscale.org
http://gpfsug.org/mailman/listinfo/gpfsug-discuss


Re: [gpfsug-discuss] fast search for archivable data sets

2020-04-03 Thread Jaime Pinto

Hi Jim,

If you have never worked with policy rules before, you may want to start small 
and build up your confidence.

In the /usr/lpp/mmfs/samples/ilm path you will find several examples of 
templates that you can use to play around. I would start with the 'list' rules 
first.
Some of those templates are a bit complex, so here is one script that I use on 
a regular basis to detect files larger than 1MB (you can even exclude specific 
filesets):

~~~
dss-mgt1:/scratch/r/root/mmpolicyRules # cat mmpolicyRules-list-large
/* A macro to abbreviate VARCHAR */
define([vc],[VARCHAR($1)])

/* Define an external list */
RULE EXTERNAL LIST 'largefiles' EXEC 
'/gpfs/fs0/scratch/r/root/mmpolicyRules/mmpolicyExec-list'

/* Generate a list of all files that have more than 1MB of space allocated. */
RULE 'r2' LIST 'largefiles'
SHOW('-u' vc(USER_ID) || ' -s' || vc(FILE_SIZE))
/*FROM POOL 'system'*/
FROM POOL 'data'
/*FOR FILESET('root')*/
WEIGHT(FILE_SIZE)
WHERE KB_ALLOCATED > 1024

/* Files in special filesets, such as mmpolicyRules, are never moved or deleted 
*/
RULE 'ExcSpecialFile' EXCLUDE
FOR FILESET('mmpolicyRules','todelete','tapenode-stuff','toarchive')
~~~



And here is another to detect files not looked at for more than 6 months. I 
found it more effective to use atime and ctime. You could combine this with the 
one above to detect file size as well.

~~~
dss-mgt1:/scratch/r/root/mmpolicyRules # cat 
mmpolicyRules-list-atime-ctime-gt-6months
/* A macro to abbreviate VARCHAR */
define([vc],[VARCHAR($1)])

/* Define an external list */
RULE EXTERNAL LIST 'accessedfiles' EXEC 
'/gpfs/fs0/scratch/r/root/mmpolicyRules/mmpolicyExec-list'

/* Generate a list of all files, directories, plus all other file system 
objects,
   like symlinks, named pipes, etc, accessed prior to a certain date AND that 
are
   not owned by root. Include the owner's id with each object and sort them by
   the owner's id */

/* Files in special filesets, such as mmpolicyRules, are never moved or deleted 
*/
RULE 'ExcSpecialFile' EXCLUDE
FOR FILESET ('scratch-root','todelete','root')

RULE 'r5' LIST 'accessedfiles'
DIRECTORIES_PLUS
FROM POOL 'data'
SHOW('-u' vc(USER_ID) || ' -a' || vc(ACCESS_TIME) || ' -c' || 
vc(CREATION_TIME) || ' -s ' || vc(FILE_SIZE))
WHERE (DAYS(CURRENT_TIMESTAMP) - DAYS(ACCESS_TIME) > 183) AND 
(DAYS(CURRENT_TIMESTAMP) - DAYS(CREATION_TIME) > 183) AND NOT USER_ID = 0
AND NOT (PATH_NAME LIKE '/gpfs/fs0/scratch/r/root/%')
~~~


Note that both these scripts work on a system-wide (or root fileset) basis, and 
will not give you totals for specific directories, unless you run them several 
times on specific directories (not very efficient). To produce per-directory 
lists you would need to do some post-processing on the lists, with 'awk' or 
some other scripting language, as in the sketch below. If you need some samples 
I can send you a few.
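
As a sketch of such post-processing (assuming the SHOW() strings above, so sizes 
appear as '-s<bytes>' or '-s <bytes>', and that the path is the last 
whitespace-separated field; paths with embedded spaces would need the ' -- ' 
separator handled more carefully):

~~~
# roll a policy LIST output file up into per-directory totals
awk '
    {
        size = 0
        for (i = 1; i <= NF; i++) {
            if ($i ~ /^-s[0-9]+$/)            size = substr($i, 3)   # fused form: -s12345
            else if ($i == "-s" && i < NF)    size = $(i + 1)        # separated form: -s 12345
        }
        path = $NF
        split(path, p, "/")
        dir = "/" p[2] "/" p[3] "/" p[4]      # aggregate on the first 3 path components
        bytes[dir] += size
        files[dir]++
    }
    END { for (d in bytes) printf "%12d %18d  %s\n", files[d], bytes[d], d }
' largefiles.list | sort -k2,2nr | head -50
~~~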


And finally, you need to be more specific about what you mean by 'archivable'. 
Once you produce the lists you can do several things with them, or leverage the 
rules to actually execute things, such as move, delete, or HSM stuff. The 
/usr/lpp/mmfs/samples/ilm path has some samples as well.



On 4/3/2020 18:25:33, Jim Kavitsky wrote:

Hello everyone,
I'm managing a low-multi-petabyte Scale filesystem with hundreds of millions of 
inodes, and I'm looking for the best way to locate archivable directories. For 
example, these might be directories whose contents were greater than 5 or 
10 TB, and whose atimes were older than two years.

Has anyone found a great way to do this with a policy engine run? If not, is 
there another good way that anyone would recommend? Thanks in advance,


Yes, there is another way: the 'mmfind' utility, also in the same samples path. You have 
to compile it for your OS (see mmfind.README). This is a very powerful canned procedure that 
lets you run the "-exec" option just as in the normal Linux version of 'find'. 
I use it very often, and it's just as efficient as the policy-rules-based 
alternative above.
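
For example (a sketch only; exact predicate support depends on the mmfind build 
shipped with your release, see mmfind.README, and the paths are placeholders):

~~~
cd /usr/lpp/mmfs/samples/ilm
# list files under 'project' not accessed for ~2 years and larger than 1 GiB
./mmfind /gpfs/fs0/project -type f -atime +730 -size +1G -exec ls -lh {} \;
~~~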

Good luck.

Keep safe and confined.

Jaime




Jim Kavitsky

___
gpfsug-discuss mailing list
gpfsug-discuss at spectrumscale.org
http://gpfsug.org/mailman/listinfo/gpfsug-discuss



 TELL US ABOUT YOUR SUCCESS STORIES
 http://www.scinethpc.ca/testimonials
---
Jaime Pinto - Storage Analyst
SciNet HPC Consortium - Compute/Calcul Canada
www.scinet.utoronto.ca - www.computecanada.ca
University of Toronto
661 University Ave. (MaRS), Suite 1140
Toronto, ON, M5G1M1
P: 416-978-2755
C: 416-505-1477
___

Re: [gpfsug-discuss] mmbackup monitoring

2020-03-25 Thread Jaime Pinto

Additionally, mmbackup creates by default a .mmbackupCfg directory on the root 
of the fileset where it dumps several files and directories with the progress 
of the backup. For instance: expiredFiles/, prepFiles/, updatedFiles/, 
dsminstr.log, ...

You may then create a script to search these directories for logs/lists of what 
has happened, and generate a more detailed report of what happened during the 
backup. In our case I generate a daily report of how many files and how much 
data have been sent to the TSM server and deleted for each user, including 
their paths. You can do more tricks if you want.
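
For illustration, a hedged sketch of that kind of report; the layout of the 
files under .mmbackupCfg (updatedFiles/, expiredFiles/) is undocumented and can 
change between releases, and the path depth used for the per-user split is an 
assumption, so adjust both to your own system:

~~~
#!/bin/bash
# Hypothetical sketch: per-user entry counts from the mmbackup work files.
FSET=/gpfs/fs1/home          # fileset root (placeholder)
for d in updatedFiles expiredFiles; do
    [ -d "$FSET/.mmbackupCfg/$d" ] || continue
    echo "== $d =="
    # assumes paths like /gpfs/fs1/home/<user>/..., i.e. the user is the 5th "/" field
    cat "$FSET/.mmbackupCfg/$d"/* 2>/dev/null \
        | awk -F/ '{ count[$5]++ } END { for (u in count) printf "%10d  %s\n", count[u], u }' \
        | sort -nr | head -20
done
~~~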

Jaime


On 3/25/2020 10:15:59, Skylar Thompson wrote:

We execute mmbackup via a regular TSM client schedule with an incremental
action, with a virtualmountpoint set to an empty, local "canary" directory.
mmbackup runs as a preschedule command, and the client -domain parameter is
set only to backup the canary directory. dsmc will backup the canary
directory as a filespace only if mmbackup succeeds (exits with 0). We can
then monitor the canary and infer the status of the associated GPFS
filespace or fileset.
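
For illustration, a hedged sketch of what such a client configuration might 
look like; VIRTUALMOUNTPOINT, DOMAIN and PRESCHEDULECMD are standard BA client 
options, but the stanza below is illustrative rather than a verified config, 
and the server name, node name and paths are made up:

~~~
* dsm.sys (server stanza excerpt) -- illustrative sketch only
SERVERNAME         TSMSERVER1
   TCPSERVERADDRESS   tsm1.example.org
   NODENAME           gpfsnode01
   PASSWORDACCESS     GENERATE
   * expose an empty local directory as its own "canary" filespace
   VIRTUALMOUNTPOINT  /var/local/tsm-canary
   * the schedule's domain is only the canary; mmbackup covers GPFS itself
   DOMAIN             /var/local/tsm-canary
   * run mmbackup first; a non-zero exit aborts the scheduled incremental,
   * so the canary filespace only gets a new backup when mmbackup succeeded
   PRESCHEDULECMD     "/usr/local/sbin/run-mmbackup.sh"
~~~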

On Wed, Mar 25, 2020 at 10:01:04AM +, Jonathan Buzzard wrote:


What is the best way of monitoring whether or not mmbackup has managed to
complete a backup successfully?

Traditionally one uses a TSM monitoring solution of choice to make sure
nodes were backing up (I am assuming mmbackup is being used in conjunction
with TSM here).

However mmbackup does not update the backup_end column in the filespaceview
table (at least in 4.2) which makes things rather more complicated.

The best I can come up with is querying the events table to see if the
client schedule completed, but that gives a false sense of security as the
schedule completing does not mean the backup completed as far as I know.

What solutions are you all using, or does mmbackup in 5.x update the
filespaceview table?




 TELL US ABOUT YOUR SUCCESS STORIES
 http://www.scinethpc.ca/testimonials
---
Jaime Pinto - Storage Analyst
SciNet HPC Consortium - Compute/Calcul Canada
www.scinet.utoronto.ca - www.computecanada.ca
University of Toronto
661 University Ave. (MaRS), Suite 1140
Toronto, ON, M5G1M1
P: 416-978-2755
C: 416-505-1477
___
gpfsug-discuss mailing list
gpfsug-discuss at spectrumscale.org
http://gpfsug.org/mailman/listinfo/gpfsug-discuss


Re: [gpfsug-discuss] mmbackup [--tsm-servers TSMServer[, TSMServer...]]

2020-02-11 Thread Jaime Pinto

Hi Mark,
Just a follow up to your suggestion few months ago.

I finally got to a point where I do 2 independent backups of the same path to 2 servers, and they are pretty even, finishing within 4 hours each, when 
serialized.


I would now just like to use one mmbackup instance to back up to 2 servers at the same time, with the --tsm-servers option; however it's not being 
accepted/recognized (see below).


So, what is the proper syntax for this option?

Thanks
Jaime

# /usr/lpp/mmfs/bin/mmbackup /gpfs/fs1/home -N tapenode3-ib ‐‐tsm‐servers TAPENODE3,TAPENODE4 -s /dev/shm --tsm-errorlog $tmpDir/home-tsm-errorlog 
--scope inodespace -v -a 8 -L 2

mmbackup: Incorrect extra argument: ‐‐tsm‐servers
Usage:
  mmbackup {Device | Directory} [-t {full | incremental}]
   [-N {Node[,Node...] | NodeFile | NodeClass}]
   [-g GlobalWorkDirectory] [-s LocalWorkDirectory]
   [-S SnapshotName] [-f] [-q] [-v] [-d]
   [-a IscanThreads] [-n DirThreadLevel]
   [-m ExecThreads | [[--expire-threads ExpireThreads] 
[--backup-threads BackupThreads]]]
   [-B MaxFiles | [[--max-backup-count MaxBackupCount] 
[--max-expire-count MaxExpireCount]]]
   [--max-backup-size MaxBackupSize] [--qos QosClass] [--quote | 
--noquote]
   [--rebuild] [--scope {filesystem | inodespace}]
   [--backup-migrated | --skip-migrated] [--tsm-servers 
TSMServer[,TSMServer...]]
   [--tsm-errorlog TSMErrorLogFile] [-L n] [-P PolicyFile]

Changing the order of the options/arguments makes no difference.

Even when I explicitly specify only one server, mmbackup still doesn't seem to 
recognize the ‐‐tsm‐servers option (it thinks it's some kind of argument):

# /usr/lpp/mmfs/bin/mmbackup /gpfs/fs1/home -N tapenode3-ib ‐‐tsm‐servers TAPENODE3 -s /dev/shm --tsm-errorlog $tmpDir/home-tsm-errorlog --scope 
inodespace -v -a 8 -L 2

mmbackup: Incorrect extra argument: ‐‐tsm‐servers
Usage:
  mmbackup {Device | Directory} [-t {full | incremental}]
   [-N {Node[,Node...] | NodeFile | NodeClass}]
   [-g GlobalWorkDirectory] [-s LocalWorkDirectory]
   [-S SnapshotName] [-f] [-q] [-v] [-d]
   [-a IscanThreads] [-n DirThreadLevel]
   [-m ExecThreads | [[--expire-threads ExpireThreads] 
[--backup-threads BackupThreads]]]
   [-B MaxFiles | [[--max-backup-count MaxBackupCount] 
[--max-expire-count MaxExpireCount]]]
   [--max-backup-size MaxBackupSize] [--qos QosClass] [--quote | 
--noquote]
   [--rebuild] [--scope {filesystem | inodespace}]
   [--backup-migrated | --skip-migrated] [--tsm-servers 
TSMServer[,TSMServer...]]
   [--tsm-errorlog TSMErrorLogFile] [-L n] [-P PolicyFile]
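
For what it's worth, the commands pasted above appear to contain non-ASCII 
hyphen characters ('‐', U+2010) in front of tsm-servers, which mmbackup would 
treat as an unknown extra argument; a sketch of the same invocation retyped 
with plain ASCII double hyphens, worth ruling out before anything else:

~~~
/usr/lpp/mmfs/bin/mmbackup /gpfs/fs1/home -N tapenode3-ib \
    --tsm-servers TAPENODE3,TAPENODE4 \
    -s /dev/shm --tsm-errorlog "$tmpDir/home-tsm-errorlog" \
    --scope inodespace -v -a 8 -L 2
~~~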



I defined the 2 servers stanzas as follows:

# cat dsm.sys
SERVERNAME          TAPENODE3
SCHEDMODE           PROMPTED
ERRORLOGRETENTION   0 D
TCPSERVERADDRESS    10.20.205.51
NODENAME            home
COMMMETHOD          TCPIP
TCPPort             1500
PASSWORDACCESS      GENERATE
TXNBYTELIMIT        1048576

SERVERNAME          TAPENODE4
SCHEDMODE           PROMPTED
ERRORLOGRETENTION   0 D
TCPSERVERADDRESS    192.168.94.128
NODENAME            home
COMMMETHOD          TCPIP
TCPPort             1500
PASSWORDACCESS      GENERATE
TXNBYTELIMIT        1048576
TCPBuffsize         512






On 2019-11-03 8:56 p.m., Jaime Pinto wrote:



On 11/3/2019 20:24:35, Marc A Kaplan wrote:

Please show us the 2 or 3 mmbackup commands that you would like to run 
concurrently.


Hey Marc,
They would be pretty similar, with the only difference being the target TSM server, determined by sourcing a different dsmenv1 (2 or 3) prior to the 
start of each instance, each with its own dsm.sys (3 wrappers).
(source dsmenv1; /usr/lpp/mmfs/bin/mmbackup /gpfs/fs1/home -N tapenode3-ib -s /dev/shm --tsm-errorlog $tmpDir/home-tsm-errorlog  -g 
/gpfs/fs1/home/.mmbackupCfg1  --scope inodespace -v -a 8 -L 2)
(source dsmenv3; /usr/lpp/mmfs/bin/mmbackup /gpfs/fs1/home -N tapenode3-ib -s /dev/shm --tsm-errorlog $tmpDir/home-tsm-errorlog  -g 
/gpfs/fs1/home/.mmbackupCfg2  --scope inodespace -v -a 8 -L 2)
(source dsmenv3; /usr/lpp/mmfs/bin/mmbackup /gpfs/fs1/home -N tapenode3-ib -s /dev/shm --tsm-errorlog $tmpDir/home-tsm-errorlog  -g 
/gpfs/fs1/home/.mmbackupCfg3  --scope inodespace -v -a 8 -L 2)


I was playing with the -L (to control the policy), but you bring up a very good point I had not experimented with, such as a single traverse for 
multiple target servers. It may be just what I need. I'll try this next.


Thank you very much,
Jaime



Peeking into the script, I find:

if [[ $scope == "inode-space" ]]
then
    deviceSuffix="${deviceName}.${filesetName}"
else
    deviceSuffix="${deviceName}"
fi


I believe mmbackup is designed to allow concurrent backup of different 
independent filesets within the same filesystem, or different filesystems...

Re: [gpfsug-discuss] mmbackup ‐g GlobalWorkDirectory not being followed

2019-11-03 Thread Jaime Pinto



On 11/3/2019 20:24:35, Marc A Kaplan wrote:

Please show us the 2 or 3 mmbackup commands that you would like to run 
concurrently.


Hey Marc,
They would be pretty similar, with the only difference being the target TSM 
server, determined by sourcing a different dsmenv1 (2 or 3) prior to the start 
of each instance, each with its own dsm.sys (3 wrappers).
(source dsmenv1; /usr/lpp/mmfs/bin/mmbackup /gpfs/fs1/home -N tapenode3-ib -s 
/dev/shm --tsm-errorlog $tmpDir/home-tsm-errorlog  -g 
/gpfs/fs1/home/.mmbackupCfg1  --scope inodespace -v -a 8 -L 2)
(source dsmenv3; /usr/lpp/mmfs/bin/mmbackup /gpfs/fs1/home -N tapenode3-ib -s 
/dev/shm --tsm-errorlog $tmpDir/home-tsm-errorlog  -g 
/gpfs/fs1/home/.mmbackupCfg2  --scope inodespace -v -a 8 -L 2)
(source dsmenv3; /usr/lpp/mmfs/bin/mmbackup /gpfs/fs1/home -N tapenode3-ib -s 
/dev/shm --tsm-errorlog $tmpDir/home-tsm-errorlog  -g 
/gpfs/fs1/home/.mmbackupCfg3  --scope inodespace -v -a 8 -L 2)

I was playing with the -L (to control the policy), but you bring up a very good 
point I had not experimented with, such as a single traverse for multiple 
target servers. It may be just what I need. I'll try this next.

Thank you very much,
Jaime



Peeking into the script, I find:

if [[ $scope == "inode-space" ]]
then
    deviceSuffix="${deviceName}.${filesetName}"
else
    deviceSuffix="${deviceName}"
fi


I believe mmbackup is designed to allow concurrent backup of different 
independent filesets within the same filesystem, Or different filesystems...

And a single mmbackup instance can drive several TSM servers, which can be 
named with an option or in the dsm.sys file:

# --tsm-servers TSMserver[,TSMserver...]
# List of TSM servers to use instead of the servers in the dsm.sys file.




From: Jaime Pinto 
To: gpfsug main discussion list 
Date: 11/01/2019 07:40 PM
Subject: [EXTERNAL] [gpfsug-discuss] mmbackup ‐g GlobalWorkDirectory not being 
followed
Sent by: gpfsug-discuss-boun...@spectrumscale.org

--



How can I force secondary processes to use the folder instructed by the -g 
option?

I started a mmbackup with ‐g /gpfs/fs1/home/.mmbackupCfg1 and another with ‐g 
/gpfs/fs1/home/.mmbackupCfg2 (and another with ‐g /gpfs/fs1/home/.mmbackupCfg3 
...)

However I'm still seeing transient files being written into a 
"/gpfs/fs1/home/.mmbackupCfg" folder (created by magic!!!). This absolutely 
cannot happen, since it mixes up workfiles from multiple mmbackup instances for 
different target TSM servers.

See below the "-f /gpfs/fs1/home/.mmbackupCfg/prepFiles" created by 
mmapplypolicy (forked by mmbackup):

DEBUGtsbackup33: /usr/lpp/mmfs/bin/mmapplypolicy "/gpfs/fs1/home" -g 
/gpfs/fs1/home/.mmbackupCfg2 -N tapenode3-ib -s /dev/shm -L 2 --qos maintenance -a 8  -P 
/var/mmfs/mmbackup/.mmbackupRules.fs1.home -I prepare -f 
/gpfs/fs1/home/.mmbackupCfg/prepFiles --irule0 --sort-buffer-size=5% --scope inodespace


Basically, I don't want a "/gpfs/fs1/home/.mmbackupCfg" folder to ever exist. 
Otherwise I'll be forced to serialize these backups, to avoid the different mmbackup 
instances tripping over each other. The serializing is very undesirable.

Thanks
Jaime






 
 TELL US ABOUT YOUR SUCCESS STORIES
 http://www.scinethpc.ca/testimonials
---
Jaime Pinto - Storage Analyst
SciNet HPC Consortium - Compute/Calcul Canada
www.scinet.utoronto.ca - www.computecanada.ca
University of Toronto
661 University Ave. (MaRS), Suite 1140
Toronto, ON, M5G1M1
P: 416-978-2755
C: 416-505-1477
___
gpfsug-discuss mailing list
gpfsug-discuss at spectrumscale.org
http://gpfsug.org/mailman/listinfo/gpfsug-discuss


[gpfsug-discuss] mmbackup ‐g GlobalWorkDirectory not being followed

2019-11-01 Thread Jaime Pinto

How can I force secondary processes to use the folder instructed by the -g 
option?

I started a mmbackup with ‐g /gpfs/fs1/home/.mmbackupCfg1 and another with ‐g 
/gpfs/fs1/home/.mmbackupCfg2 (and another with ‐g /gpfs/fs1/home/.mmbackupCfg3 
...)

However I'm still seeing transient files being written into a 
"/gpfs/fs1/home/.mmbackupCfg" folder (created by magic!!!). This absolutely 
cannot happen, since it mixes up workfiles from multiple mmbackup instances for 
different target TSM servers.

See below the "-f /gpfs/fs1/home/.mmbackupCfg/prepFiles" created by 
mmapplypolicy (forked by mmbackup):

DEBUGtsbackup33: /usr/lpp/mmfs/bin/mmapplypolicy "/gpfs/fs1/home" -g 
/gpfs/fs1/home/.mmbackupCfg2 -N tapenode3-ib -s /dev/shm -L 2 --qos maintenance -a 8  -P 
/var/mmfs/mmbackup/.mmbackupRules.fs1.home -I prepare -f 
/gpfs/fs1/home/.mmbackupCfg/prepFiles --irule0 --sort-buffer-size=5% --scope inodespace


Basically, I don't want a "/gpfs/fs1/home/.mmbackupCfg" folder to ever exist. 
Otherwise I'll be forced to serialize these backups, to avoid the different mmbackup 
instances tripping over each other. The serializing is very undesirable.

Thanks
Jaime



 
 TELL US ABOUT YOUR SUCCESS STORIES
 http://www.scinethpc.ca/testimonials
---
Jaime Pinto - Storage Analyst
SciNet HPC Consortium - Compute/Calcul Canada
www.scinet.utoronto.ca - www.computecanada.ca
University of Toronto
661 University Ave. (MaRS), Suite 1140
Toronto, ON, M5G1M1
P: 416-978-2755
C: 416-505-1477
___
gpfsug-discuss mailing list
gpfsug-discuss at spectrumscale.org
http://gpfsug.org/mailman/listinfo/gpfsug-discuss


[gpfsug-discuss] mmbackup: how to keep list(expiredFiles, updatedFiles) files

2019-03-12 Thread Jaime Pinto
How can I instruct mmbackup to *NOT* delete the temporary directories  
and files created inside the FILESET/.mmbackupCfg folder?


I can see that during the process the folders expiredFiles &  
updatedFiles are there, and contain the lists I'm interested in for  
post-analysis.


Thanks
Jaime



---
Jaime Pinto - Storage Analyst
SciNet HPC Consortium - Compute/Calcul Canada
www.scinet.utoronto.ca - www.computecanada.ca
University of Toronto
661 University Ave. (MaRS), Suite 1140
Toronto, ON, M5G1M1
P: 416-978-2755
C: 416-505-1477


This message was sent using IMP at SciNet Consortium, University of Toronto.


___
gpfsug-discuss mailing list
gpfsug-discuss at spectrumscale.org
http://gpfsug.org/mailman/listinfo/gpfsug-discuss


Re: [gpfsug-discuss] mmapplypolicy on nested filesets ...

2018-06-25 Thread Jaime Pinto
It took a while before I could get back to this issue, but I want to  
confirm that Marc's suggestions worked like a charm, and did exactly  
what I hoped for:


* remove any FOR FILESET(...) specifications
* mmapplypolicy  
/path/to/the/root/directory/of/the/independent-fileset-you-wish-to-scan ...  
--scope inodespace  -P your-policy-rules-file ...


I didn't have to do anything else but exclude a few filesets from the scan.
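
A concrete form of that invocation, as a sketch (the paths and policy file name 
are placeholders); -I test makes it a dry run that only reports what the rules 
would select:

~~~
/usr/lpp/mmfs/bin/mmapplypolicy /gpfs/fs0/scratch \
    --scope inodespace -P /root/policies/purge-scratch.pol -I test -L 2
~~~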

Thanks
Jaime


Quoting "Marc A Kaplan" :


I suggest you remove any FOR FILESET(...) specifications from your rules
and then run

mmapplypolicy
/path/to/the/root/directory/of/the/independent-fileset-you-wish-to-scan
... --scope inodespace  -P your-policy-rules-file ...

See also the (RTFineM) for the --scope option and the Directory argument
of the mmapplypolicy command.

That is the best, most efficient way to scan all the files that are in a
particular inode-space.  Also, you must have all filesets of interest
"linked" and the file system must be mounted.

Notice that "independent" means that the fileset name is used to denote
both a fileset and an inode-space, where said inode-space contains the
fileset of that name and possibly other "dependent" filesets...

IF one wished to search the entire file system for files within several
different filesets, one could use rules with

FOR FILESET('fileset1','fileset2','and-so-on')

Or even more flexibly

WHERE   FILESET_NAME LIKE  'sql-like-pattern-with-%s-and-maybe-_s'

Or even more powerfully

WHERE  regex(FILESET_NAME, 'extended-regular-.*-expression')





From:   "Jaime Pinto" 
To: "gpfsug main discussion list" 
Date:   04/18/2018 01:00 PM
Subject:[gpfsug-discuss] mmapplypolicy on nested filesets ...
Sent by:gpfsug-discuss-boun...@spectrumscale.org



A few months ago I asked about limits and dynamics of traversing
dependent vs. independent filesets on this forum. I used the
information provided to make decisions and set up our new DSS-based
GPFS storage system. Now I have a problem I couldn't yet figure out
how to make work:

'project' and 'scratch' are top *independent* filesets of the same
file system.

'proj1', 'proj2' are dependent filesets nested under 'project'
'scra1', 'scra2' are dependent filesets nested under 'scratch'

I would like to run a purging policy on all contents under 'scratch'
(which includes 'scra1', 'scra2'), and TSM backup policies on all
contents under 'project' (which includes 'proj1', 'proj2').

HOWEVER:
When I run the purging policy on the whole gpfs device (with both
'project' and 'scratch' filesets)

* if I use FOR FILESET('scratch') on the list rules, the 'scra1' and
'scra2' filesets under scratch are excluded (totally unexpected)

* if I use FOR FILESET('scra1') I get an error that scra1 is a dependent
fileset (OK, that is expected)

* if I use /*FOR FILESET('scratch')*/, all contents under 'project',
'proj1', 'proj2' are traversed as well, and I don't want that (it
takes too much time)

* if I use /*FOR FILESET('scratch')*/, and instead of the whole device
I apply the policy to the /scratch mount point only, the policy still
traverses all the content of 'project', 'proj1', 'proj2', which I
don't want. (again, totally unexpected)

QUESTION:

How can I craft the syntax of the mmapplypolicy in combination with
the RULE filters, so that I can traverse all the contents under the
'scratch' independent fileset, including the nested dependent filesets
'scra1','scra2', and NOT traverse the other independent filesets at
all (since this takes too much time)?

Thanks
Jaime


PS: FOR FILESET('scra*') does not work.




  
 TELL US ABOUT YOUR SUCCESS STORIES
 http://www.scinethpc.ca/testimonials
---
Jaime Pinto - Storage Analyst
SciNet HPC Consortium - Compute/Calcul Canada
www.scinet.utoronto.ca - www.computecanada.ca
University of Toronto
661 University Ave. (MaRS), Suite 1140
Toronto, ON, M5G1M1
P: 416-978-2755
C: 416-505-1477


This message was sent using IMP at SciNet Consortium, University of
Toronto.

___
gpfsug-discuss mailing list
gpfsug-discuss at spectrumscale.org
https://urldefense.proofpoint.com/v2/url?u=http-3A__gpfsug.org_mailman_listinfo_gpfsug-2Ddiscuss=DwICAg=jf_iaSHvJObTbx-siA1ZOg=cvpnBBH0j41aQy0RPiG2xRL_M8mTc1izuQD3_PmtjZ8=y0aRzkzp0QA9QR8eh3XtN6PETqWYDCNvItdihzdueTE=aff0vMJkKd-Z3pw3-jckmI3ejqXh8aSr8rxkKf3OGdk=














 
 TELL US ABOUT YOUR SUCCESS STORIES
 http://www.scinethpc.ca/testimonials

Re: [gpfsug-discuss] Capacity pool filling

2018-06-07 Thread Jaime Pinto
I think the restore is bringing back a lot of material with atime > 90 days,  
so it is passing through gpfs23data and going directly to  
gpfs23capacity.


I also think you may not have stopped the crontab script as you  
believe you did.


Jaime





Quoting "Buterbaugh, Kevin L" :


Hi All,

First off, I?m on day 8 of dealing with two different   
mini-catastrophes at work and am therefore very sleep deprived and   
possibly missing something obvious ? with that disclaimer out of the  
 way?


We have a filesystem with 3 pools:  1) system (metadata only), 2)   
gpfs23data (the default pool if I run mmlspolicy), and 3)   
gpfs23capacity (where files with an atime - yes atime - of more than  
 90 days get migrated to by a script that runs out of cron each   
weekend.


However ? this morning the free space in the gpfs23capacity pool is   
dropping ? I?m down to 0.5 TB free in a 582 TB pool ? and I cannot   
figure out why.  The migration script is NOT running ? in fact, it?s  
 currently disabled.  So I can only think of two possible   
explanations for this:


1.  There are one or more files already in the gpfs23capacity pool   
that someone has started updating.  Is there a way to check for that  
 ? i.e. a way to run something like ?find /gpfs23 -mtime -7 -ls? but  
 restricted to only files in the gpfs23capacity pool.  Marc Kaplan -  
 can mmfind do that??  ;-)


2.  We are doing a large volume of restores right now because one of  
 the mini-catastrophes I?m dealing with is one NSD (gpfs23data pool)  
 down due to a issue with the storage array.  We?re working with the  
 vendor to try to resolve that but are not optimistic so we have   
started doing restores in case they come back and tell us it?s not   
recoverable.  We did run ?mmfileid? to identify the files that have   
one or more blocks on the down NSD, but there are so many that what   
we?re doing is actually restoring all the files to an alternate path  
 (easier for out tape system), then replacing the corrupted files,   
then deleting any restores we don?t need.  But shouldn?t all of that  
 be going to the gpfs23data pool?  I.e. even if we?re restoring  
files  that are in the gpfs23capacity pool shouldn?t the fact that  
we?re  restoring to an alternate path (i.e. not overwriting files  
with the  tape restores) and the default pool is the gpfs23data pool  
mean that  nothing is being restored to the gpfs23capacity pool???


Is there a third explanation I?m not thinking of?
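
Regarding question 1, a minimal policy-rule sketch (the file names are made up, 
in the same style as the samples in /usr/lpp/mmfs/samples/ilm) that would list 
only files in the gpfs23capacity pool changed within the last week; with 
EXEC '' and -I defer, mmapplypolicy just writes the list file rather than 
running anything:

~~~
/* list-recent-capacity.pol -- hypothetical sketch */
RULE EXTERNAL LIST 'recentcap' EXEC ''
RULE 'r1' LIST 'recentcap'
    FROM POOL 'gpfs23capacity'
    SHOW(VARCHAR(USER_ID) || ' ' || VARCHAR(FILE_SIZE))
    WHERE (DAYS(CURRENT_TIMESTAMP) - DAYS(MODIFICATION_TIME)) <= 7

/* run with something like:
   mmapplypolicy /gpfs23 -P list-recent-capacity.pol -I defer -f /tmp/recentcap -L 2 */
~~~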

Thanks...

?
Kevin Buterbaugh - Senior System Administrator
Vanderbilt University - Advanced Computing Center for Research and Education
kevin.buterba...@vanderbilt.edu -   
(615)875-9633












 
 TELL US ABOUT YOUR SUCCESS STORIES
 http://www.scinethpc.ca/testimonials
---
Jaime Pinto - Storage Analyst
SciNet HPC Consortium - Compute/Calcul Canada
www.scinet.utoronto.ca - www.computecanada.ca
University of Toronto
661 University Ave. (MaRS), Suite 1140
Toronto, ON, M5G1M1
P: 416-978-2755
C: 416-505-1477


This message was sent using IMP at SciNet Consortium, University of Toronto.

___
gpfsug-discuss mailing list
gpfsug-discuss at spectrumscale.org
http://gpfsug.org/mailman/listinfo/gpfsug-discuss


Re: [gpfsug-discuss] mmapplypolicy on nested filesets ...

2018-04-18 Thread Jaime Pinto

Ok Marc and Frederick, there is hope.
I'll conduct more experiments and report back
Thanks for the suggestions.
Jaime

Quoting "Marc A Kaplan" <makap...@us.ibm.com>:


I suggest you remove any FOR FILESET(...) specifications from your rules
and then run

mmapplypolicy
/path/to/the/root/directory/of/the/independent-fileset-you-wish-to-scan
... --scope inodespace  -P your-policy-rules-file ...

See also the (RTFineM) for the --scope option and the Directory argument
of the mmapplypolicy command.

That is the best, most efficient way to scan all the files that are in a
particular inode-space.  Also, you must have all filesets of interest
"linked" and the file system must be mounted.

Notice that "independent" means that the fileset name is used to denote
both a fileset and an inode-space, where said inode-space contains the
fileset of that name and possibly other "dependent" filesets...

IF one wished to search the entire file system for files within several
different filesets, one could use rules with

FOR FILESET('fileset1','fileset2','and-so-on')

Or even more flexibly

WHERE   FILESET_NAME LIKE  'sql-like-pattern-with-%s-and-maybe-_s'

Or even more powerfully

WHERE  regex(FILESET_NAME, 'extended-regular-.*-expression')





From:   "Jaime Pinto" <pi...@scinet.utoronto.ca>
To: "gpfsug main discussion list" <gpfsug-discuss@spectrumscale.org>
Date:   04/18/2018 01:00 PM
Subject:[gpfsug-discuss] mmapplypolicy on nested filesets ...
Sent by:gpfsug-discuss-boun...@spectrumscale.org



A few months ago I asked about limits and dynamics of traversing
dependent vs. independent filesets on this forum. I used the
information provided to make decisions and set up our new DSS-based
GPFS storage system. Now I have a problem I couldn't yet figure out
how to make work:

'project' and 'scratch' are top *independent* filesets of the same
file system.

'proj1', 'proj2' are dependent filesets nested under 'project'
'scra1', 'scra2' are dependent filesets nested under 'scratch'

I would like to run a purging policy on all contents under 'scratch'
(which includes 'scra1', 'scra2'), and TSM backup policies on all
contents under 'project' (which includes 'proj1', 'proj2').

HOWEVER:
When I run the purging policy on the whole gpfs device (with both
'project' and 'scratch' filesets)

* if I use FOR FILESET('scratch') on the list rules, the 'scra1' and
'scra2' filesets under scratch are excluded (totally unexpected)

* if I use FOR FILESET('scra1') I get an error that scra1 is a dependent
fileset (OK, that is expected)

* if I use /*FOR FILESET('scratch')*/, all contents under 'project',
'proj1', 'proj2' are traversed as well, and I don't want that (it
takes too much time)

* if I use /*FOR FILESET('scratch')*/, and instead of the whole device
I apply the policy to the /scratch mount point only, the policy still
traverses all the content of 'project', 'proj1', 'proj2', which I
don't want. (again, totally unexpected)

QUESTION:

How can I craft the syntax of the mmapplypolicy in combination with
the RULE filters, so that I can traverse all the contents under the
'scratch' independent fileset, including the nested dependent filesets
'scra1','scra2', and NOT traverse the other independent filesets at
all (since this takes too much time)?

Thanks
Jaime


PS: FOR FILESET('scra*') does not work.




  
 TELL US ABOUT YOUR SUCCESS STORIES
 http://www.scinethpc.ca/testimonials
---
Jaime Pinto - Storage Analyst
SciNet HPC Consortium - Compute/Calcul Canada
www.scinet.utoronto.ca - www.computecanada.ca
University of Toronto
661 University Ave. (MaRS), Suite 1140
Toronto, ON, M5G1M1
P: 416-978-2755
C: 416-505-1477


This message was sent using IMP at SciNet Consortium, University of
Toronto.

___
gpfsug-discuss mailing list
gpfsug-discuss at spectrumscale.org
https://urldefense.proofpoint.com/v2/url?u=http-3A__gpfsug.org_mailman_listinfo_gpfsug-2Ddiscuss=DwICAg=jf_iaSHvJObTbx-siA1ZOg=cvpnBBH0j41aQy0RPiG2xRL_M8mTc1izuQD3_PmtjZ8=y0aRzkzp0QA9QR8eh3XtN6PETqWYDCNvItdihzdueTE=aff0vMJkKd-Z3pw3-jckmI3ejqXh8aSr8rxkKf3OGdk=














 
 TELL US ABOUT YOUR SUCCESS STORIES
 http://www.scinethpc.ca/testimonials
---
Jaime Pinto - Storage Analyst
SciNet HPC Consortium - Compute/Calcul Canada
www.scinet.utoronto.ca - www.computecanada.ca
University of Toronto
661 University Ave. (MaRS), Suite 1140

Re: [gpfsug-discuss] Maximum Number of filesets on GPFS v5?

2018-02-05 Thread Jaime Pinto
We are considering moving from user/group-based quotas to path-based  
quotas with nested filesets. We are also facing challenges traversing  
'dependent filesets' for the daily TSM backups of projects and for purging the  
scratch area.


We're about to deploy a new GPFS storage cluster, some 12-15 PB, 13K+  
users and 5K+ groups as the baseline, with substantial scaling expected  
within the next 3-5 years in all dimensions. Therefore,  
decisions we make now under GPFS v4.x through v5.x will have  
consequences in the very near future, if they are not the proper ones.


Thanks
Jaime


Quoting "Daniel Kidger" <daniel.kid...@uk.ibm.com>:


Jamie, I believe at least one of those limits is 'maximum supported'
rather than an architectural limit.   Is your use case one which
would push these boundaries?  If so care to describe what you would
wish to do? Daniel


 DR DANIEL KIDGER
IBM Technical Sales Specialist
Software Defined Solution Sales

+44-(0)7818 522 266
daniel.kid...@uk.ibm.com


  - Original message -----
From: "Jaime Pinto" <pi...@scinet.utoronto.ca>
Sent by: gpfsug-discuss-boun...@spectrumscale.org
To: "gpfsug main discussion list" <gpfsug-discuss@spectrumscale.org>,
"Truong Vu" <truo...@us.ibm.com>
Cc: gpfsug-discuss@spectrumscale.org
Subject: Re: [gpfsug-discuss] Maximum Number of filesets on GPFS v5?
Date: Mon, Feb 5, 2018 2:56 PM
  Thanks Truong
Jaime

Quoting "Truong Vu" <truo...@us.ibm.com>:



Hi Jamie,

The limits are the same in 5.0.0.  We'll look into the FAQ.

Thanks,
Tru.




From: gpfsug-discuss-requ...@spectrumscale.org
To: gpfsug-discuss@spectrumscale.org
Date: 02/05/2018 07:00 AM
Subject: gpfsug-discuss Digest, Vol 73, Issue 9
Sent by: gpfsug-discuss-boun...@spectrumscale.org



Send gpfsug-discuss mailing list submissions to
gpfsug-discuss@spectrumscale.org

To subscribe or unsubscribe via the World Wide Web, visit



https://urldefense.proofpoint.com/v2/url?u=http-3A__gpfsug.org_mailman_listinfo_gpfsug-2Ddiscuss=DwICAg=jf_iaSHvJObTbx-siA1ZOg=HQmkdQWQHoc1Nu6Mg_g8NVugim3OiUUy5n0QgLQcbkM=doLWvSNAkaAwsGv0OWEMdk4umwTUPj5qHjnchKlkNE4=ptDCYhJK4ltkJaYKCaTThZHUXCFrHGIIPVCgBD-VH8s=[2]


or, via email, send a message with subject or body 'help' to
gpfsug-discuss-requ...@spectrumscale.org

You can reach the person managing the list at
gpfsug-discuss-ow...@spectrumscale.org

When replying, please edit your Subject line so it is more specific
than "Re: Contents of gpfsug-discuss digest..."


Today's Topics:

   1. Maximum Number of filesets on GPFS v5? (Jaime Pinto)




------


Message: 1
Date: Sun, 04 Feb 2018 14:58:39 -0500
From: "Jaime Pinto" <pi...@scinet.utoronto.ca>
To: "gpfsug main discussion list"

<gpfsug-discuss@spectrumscale.org>

Subject: [gpfsug-discuss] Maximum Number of filesets on GPFS v5?
Message-ID:
<20180204145839.77101pngtlr3q...@support.scinet.utoronto.ca>
Content-Type: text/plain; charset=ISO-8859-1;
DelSp="Yes";
format="flowed"

Here is what I found for versions 4 & 3.5:
* Maximum Number of Dependent Filesets: 10,000
* Maximum Number of Independent Filesets: 1,000



https://www.ibm.com/support/knowledgecenter/en/STXKQY/gpfsclustersfaq.html#filesets[3]




I'm having some difficulty finding published documentation on
limitations for version 5:



https://www.ibm.com/support/knowledgecenter/STXKQY_5.0.0/com.ibm.spectrum.scale.v5r00.doc/6027-2699.htm[4]





https://www.ibm.com/support/knowledgecenter/STXKQY_5.0.0/com.ibm.spectrum.scale.v5r00.doc/bl1pdg_increasefilesetspace.htm[5]



Any hints?

Thanks
Jaime


---
Jaime Pinto
SciNet HPC Consortium - Compute/Calcul Canada
www.scinet.utoronto.ca - www.computecanada.ca
University of Toronto



This message was sent using IMP at SciNet Consortium, University of
Toronto.




--

___
gpfsug-discuss mailing list
gpfsug-discuss at spectrumscale.org


https://urldefense.proofpoint.com/v2/url?u=http-3A__gpfsug.org_mailman_listinfo_gpfsug-2Ddiscuss=DwICAg=jf_iaSHvJObTbx-siA1ZOg=HQmkdQWQHoc1Nu6Mg_g8NVugim3OiUUy5n0QgLQcbkM=doLWvSNAkaAwsGv0OWEMdk4umwTUPj5qHjnchKlkNE4=ptDCYhJK4ltkJaYKCaTThZHUXCFrHGIIPVCgBD-VH8s=[6]




End of gpfsug-discuss Digest, Vol 73, Issue 9
*






  
 TELL US ABOUT YOUR SUCCESS STORIES
 http://www.scinethpc.ca/testimonials

Re: [gpfsug-discuss] Maximum Number of filesets on GPFS v5?

2018-02-05 Thread Jaime Pinto

Thanks Truong
Jaime

Quoting "Truong Vu" <truo...@us.ibm.com>:



Hi Jamie,

The limits are the same in 5.0.0.  We'll look into the FAQ.

Thanks,
Tru.




From:   gpfsug-discuss-requ...@spectrumscale.org
To: gpfsug-discuss@spectrumscale.org
Date:   02/05/2018 07:00 AM
Subject:gpfsug-discuss Digest, Vol 73, Issue 9
Sent by:gpfsug-discuss-boun...@spectrumscale.org



Send gpfsug-discuss mailing list submissions to
 gpfsug-discuss@spectrumscale.org

To subscribe or unsubscribe via the World Wide Web, visit

https://urldefense.proofpoint.com/v2/url?u=http-3A__gpfsug.org_mailman_listinfo_gpfsug-2Ddiscuss=DwICAg=jf_iaSHvJObTbx-siA1ZOg=HQmkdQWQHoc1Nu6Mg_g8NVugim3OiUUy5n0QgLQcbkM=doLWvSNAkaAwsGv0OWEMdk4umwTUPj5qHjnchKlkNE4=ptDCYhJK4ltkJaYKCaTThZHUXCFrHGIIPVCgBD-VH8s=

or, via email, send a message with subject or body 'help' to
 gpfsug-discuss-requ...@spectrumscale.org

You can reach the person managing the list at
 gpfsug-discuss-ow...@spectrumscale.org

When replying, please edit your Subject line so it is more specific
than "Re: Contents of gpfsug-discuss digest..."


Today's Topics:

   1. Maximum Number of filesets on GPFS v5? (Jaime Pinto)


--

Message: 1
Date: Sun, 04 Feb 2018 14:58:39 -0500
From: "Jaime Pinto" <pi...@scinet.utoronto.ca>
To: "gpfsug main discussion list" <gpfsug-discuss@spectrumscale.org>
Subject: [gpfsug-discuss] Maximum Number of filesets on GPFS v5?
Message-ID:
 <20180204145839.77101pngtlr3q...@support.scinet.utoronto.ca>
Content-Type: text/plain;charset=ISO-8859-1;
DelSp="Yes";
 format="flowed"

Here is what I found for versions 4 & 3.5:
* Maximum Number of Dependent Filesets: 10,000
* Maximum Number of Independent Filesets: 1,000

https://www.ibm.com/support/knowledgecenter/en/STXKQY/gpfsclustersfaq.html#filesets



I'm having some difficulty finding published documentation on
limitations for version 5:

https://www.ibm.com/support/knowledgecenter/STXKQY_5.0.0/com.ibm.spectrum.scale.v5r00.doc/6027-2699.htm


https://www.ibm.com/support/knowledgecenter/STXKQY_5.0.0/com.ibm.spectrum.scale.v5r00.doc/bl1pdg_increasefilesetspace.htm


Any hints?

Thanks
Jaime


---
Jaime Pinto
SciNet HPC Consortium - Compute/Calcul Canada
www.scinet.utoronto.ca - www.computecanada.ca
University of Toronto



This message was sent using IMP at SciNet Consortium, University of
Toronto.




--

___
gpfsug-discuss mailing list
gpfsug-discuss at spectrumscale.org
https://urldefense.proofpoint.com/v2/url?u=http-3A__gpfsug.org_mailman_listinfo_gpfsug-2Ddiscuss=DwICAg=jf_iaSHvJObTbx-siA1ZOg=HQmkdQWQHoc1Nu6Mg_g8NVugim3OiUUy5n0QgLQcbkM=doLWvSNAkaAwsGv0OWEMdk4umwTUPj5qHjnchKlkNE4=ptDCYhJK4ltkJaYKCaTThZHUXCFrHGIIPVCgBD-VH8s=



End of gpfsug-discuss Digest, Vol 73, Issue 9
*











 
 TELL US ABOUT YOUR SUCCESS STORIES
 http://www.scinethpc.ca/testimonials
---
Jaime Pinto
SciNet HPC Consortium - Compute/Calcul Canada
www.scinet.utoronto.ca - www.computecanada.ca
University of Toronto
661 University Ave. (MaRS), Suite 1140
Toronto, ON, M5G1M1
P: 416-978-2755
C: 416-505-1477


This message was sent using IMP at SciNet Consortium, University of Toronto.

___
gpfsug-discuss mailing list
gpfsug-discuss at spectrumscale.org
http://gpfsug.org/mailman/listinfo/gpfsug-discuss


[gpfsug-discuss] Maximum Number of filesets on GPFS v5?

2018-02-04 Thread Jaime Pinto

Here is what I found for versions 4 & 3.5:
* Maximum Number of Dependent Filesets: 10,000
* Maximum Number of Independent Filesets: 1,000

https://www.ibm.com/support/knowledgecenter/en/STXKQY/gpfsclustersfaq.html#filesets


I'm having some difficulty finding published documentation on  
limitations for version 5:


https://www.ibm.com/support/knowledgecenter/STXKQY_5.0.0/com.ibm.spectrum.scale.v5r00.doc/6027-2699.htm

https://www.ibm.com/support/knowledgecenter/STXKQY_5.0.0/com.ibm.spectrum.scale.v5r00.doc/bl1pdg_increasefilesetspace.htm

Any hints?

Thanks
Jaime


---
Jaime Pinto
SciNet HPC Consortium - Compute/Calcul Canada
www.scinet.utoronto.ca - www.computecanada.ca
University of Toronto



This message was sent using IMP at SciNet Consortium, University of Toronto.


___
gpfsug-discuss mailing list
gpfsug-discuss at spectrumscale.org
http://gpfsug.org/mailman/listinfo/gpfsug-discuss


Re: [gpfsug-discuss] Fileset quotas enforcement

2017-10-25 Thread Jaime Pinto

Did you try to run mmcheckquota on the device?

I observed that in the most recent versions (over the last 3 years)  
there is a really long lag for GPFS to process the internal accounting,  
so there is a slippage effect that skews quota operations.  
mmcheckquota is supposed to reset and zero all those cumulative deltas  
effective immediately.
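
As a minimal sketch, using the device name from the report quoted below 
(smfslv0): force the accounting to be reconciled, then re-check the per-fileset 
entries (the -j form reports fileset quotas):

~~~
/usr/lpp/mmfs/bin/mmcheckquota smfslv0
/usr/lpp/mmfs/bin/mmrepquota -j smfslv0
~~~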


Jaime

Quoting "Emmanuel Barajas Gonzalez" <vanfa...@mx1.ibm.com>:


Hello Spectrum Scale team!   I'm working on the implementation of
quotas per fileset and I followed the basic instructions described in
the documentation. Currently the gpfs device has per-fileset quotas
and there is one fileset with a block soft and a hard limit set. My
problem is that I'm still able to write more and more files beyond
the quota (the grace period has expired as well). How can I make
sure quotas will be enforced and that no user will be able to consume
more space than specified?

mmrepquota smfslv0
                              Block Limits
Name     fileset  type         KB  quota  limit  in_doubt  grace
root     root     USR         512      0      0         0  none
root     cp1      USR       64128      0      0         0  none
system   root     GRP         512      0      0         0  none
system   cp1      GRP       64128      0      0         0  none
valid    root     GRP           0      0      0         0  none
root     root     FILESET     512      0      0         0  none
cp1      root     FILESET   64128   2048   2048         0  expired

Thanks in advance!   Best regards,
__
Emmanuel Barajas Gonzalez TRANSPARENT CLOUD TIERING FOR DS8000

Phone:
52-33-3669-7000 x5547 E-mail: vanfa...@mx1.ibm.com Follow me:
@van_falen
  2200 Camino A El Castillo
El Salto, JAL 45680
Mexico












 
 TELL US ABOUT YOUR SUCCESS STORIES
 http://www.scinethpc.ca/testimonials
---
Jaime Pinto
SciNet HPC Consortium - Compute/Calcul Canada
www.scinet.utoronto.ca - www.computecanada.ca
University of Toronto
661 University Ave. (MaRS), Suite 1140
Toronto, ON, M5G1M1
P: 416-978-2755
C: 416-505-1477


This message was sent using IMP at SciNet Consortium, University of Toronto.

___
gpfsug-discuss mailing list
gpfsug-discuss at spectrumscale.org
http://gpfsug.org/mailman/listinfo/gpfsug-discuss


Re: [gpfsug-discuss] Quota and hardlimit enforcement

2017-07-31 Thread Jaime Pinto
www.huk.de





___
gpfsug-discuss mailing list
gpfsug-discuss at spectrumscale.org<http://spectrumscale.org/>
http://gpfsug.org/mailman/listinfo/gpfsug-discuss








___
gpfsug-discuss mailing list
gpfsug-discuss at spectrumscale.org<http://spectrumscale.org/>
http://gpfsug.org/mailman/listinfo/gpfsug-discuss









 
  TELL US ABOUT YOUR SUCCESS STORIES
     http://www.scinethpc.ca/testimonials
 
---
Jaime Pinto
SciNet HPC Consortium - Compute/Calcul Canada
www.scinet.utoronto.ca - www.computecanada.ca
University of Toronto
661 University Ave. (MaRS), Suite 1140
Toronto, ON, M5G1M1
P: 416-978-2755
C: 416-505-1477


This message was sent using IMP at SciNet Consortium, University of Toronto.

___
gpfsug-discuss mailing list
gpfsug-discuss at spectrumscale.org
http://gpfsug.org/mailman/listinfo/gpfsug-discuss


Re: [gpfsug-discuss] Quota and hardlimit enforcement

2017-07-31 Thread Jaime Pinto

Renar

For as long as the usage is below the hard limit (space or inodes) and
you are within the grace period, you'll be able to write.


I don't think you can set the grace period to a specific value as a  
quota parameter, such as none. That is set at the filesystem creation  
time. BTW, grace period limit has been a mystery to me for many years.  
My impression is that GPFS keeps changing it internally depending on  
the position of the moon. I think ours is 2 hours, but at times I can  
see users writing for longer.


Jaime



Quoting "Grunenberg, Renar" <renar.grunenb...@huk-coburg.de>:


Hallo All,
we are on Version 4.2.3.2 and see some misunderstanding in the
enforcement of hard-limit definitions on a fileset quota. What we see
is: we put some 200 GB files on the following quota definitions: quota
150 GB, limit 250 GB, grace none.
After creating one 200 GB file we hit the soft quota limit; that's
ok. But after the second file was created we expected an I/O
error, but it didn't happen. We defined all the well-known parameters
(-Q, ...) on the filesystem. Is this a bug or a feature? mmcheckquota
was already run first.

Regards Renar.


Renar Grunenberg
Abteilung Informatik ? Betrieb

HUK-COBURG
Bahnhofsplatz
96444 Coburg
Telefon:09561 96-44110
Telefax:09561 96-44104
E-Mail: renar.grunenb...@huk-coburg.de
Internet:   www.huk.de

HUK-COBURG Haftpflicht-Unterstützungs-Kasse kraftfahrender Beamter  
Deutschlands a. G. in Coburg

Reg.-Gericht Coburg HRB 100; St.-Nr. 9212/101/00021
Sitz der Gesellschaft: Bahnhofsplatz, 96444 Coburg
Vorsitzender des Aufsichtsrats: Prof. Dr. Heinrich R. Schradin.
Vorstand: Klaus-Jürgen Heitmann (Sprecher), Stefan Gronbach, Dr.  
Hans Olav Herøy, Dr. Jörg Rheinländer (stv.), Sarah Rössler, Daniel  
Thomas (stv.).


Diese Nachricht enthält vertrauliche und/oder rechtlich geschützte  
Informationen.
Wenn Sie nicht der richtige Adressat sind oder diese Nachricht  
irrtümlich erhalten haben,

informieren Sie bitte sofort den Absender und vernichten Sie diese Nachricht.
Das unerlaubte Kopieren sowie die unbefugte Weitergabe dieser   
Nachricht ist nicht gestattet.


This information may contain confidential and/or privileged information.
If you are not the intended recipient (or have received this   
information in error) please notify the

sender immediately and destroy this information.
Any unauthorized copying, disclosure or distribution of the material  
 in this information is strictly forbidden.










 
  TELL US ABOUT YOUR SUCCESS STORIES
 http://www.scinethpc.ca/testimonials
 ****
---
Jaime Pinto
SciNet HPC Consortium - Compute/Calcul Canada
www.scinet.utoronto.ca - www.computecanada.ca
University of Toronto
661 University Ave. (MaRS), Suite 1140
Toronto, ON, M5G1M1
P: 416-978-2755
C: 416-505-1477


This message was sent using IMP at SciNet Consortium, University of Toronto.

___
gpfsug-discuss mailing list
gpfsug-discuss at spectrumscale.org
http://gpfsug.org/mailman/listinfo/gpfsug-discuss


Re: [gpfsug-discuss] Spectrum Scale - Spectrum Protect - SpaceManagement (GPFS HSM)

2017-06-02 Thread Jaime Pinto
It has been a while since I used HSM with GPFS via TSM, but as far as  
I can remember, unprivileged users can run dsmmigrate and dsmrecall.


Based on the instructions at the link, dsmrecall may now leverage the
Recommended Access Order (RAO) available on enterprise drives; however,
root would have to be the one to invoke that feature.


In that case we may have to develop a middleware/wrapper for dsmrecall  
that will run as root and act on behalf of the user when optimization  
is requested.
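
Something along the lines of the rough, untested sketch below is what I have
in mind: a small script that sysadmins expose through sudo, which sanity-checks
the requested paths and then calls dsmrecall as root on the user's behalf
(all names and checks here are made up, only lightly thought through):

   #!/bin/bash
   # hypothetical /usr/local/sbin/recall-wrapper, to be exposed via sudo
   user="$SUDO_USER"
   [ -n "$user" ] || { echo "must be called through sudo" >&2; exit 1; }
   for f in "$@"; do
       case "$f" in
           /gpfs/*) ;;                              # only paths inside GPFS
           *) echo "skipping $f (outside /gpfs)" >&2; continue ;;
       esac
       if [ "$(stat -c %U "$f")" != "$user" ]; then # only the caller's own files
           echo "skipping $f (not owned by $user)" >&2
           continue
       fi
       dsmrecall "$f"                               # the recall itself runs as root
   done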


Someone here more familiar with the latest version of TSM-HSM may be  
able to give us some hints on how people are doing this in practice.


Jaime




Quoting "Andrew Beattie" <abeat...@au1.ibm.com>:


Thanks Jaime, How do you get around optimised recalls? From what I
can see the optimised recall process needs a root-level account to
retrieve a list of files
https://www.ibm.com/support/knowledgecenter/SSSR2R_7.1.1/com.ibm.itsm.hsmul.doc/c_recall_optimized_tape.html[1]
Regards, Andrew Beattie Software Defined Storage  - IT Specialist
Phone: 614-2133-7927 E-mail: abeat...@au1.ibm.com[2] -
Original message -----
From: "Jaime Pinto" <pi...@scinet.utoronto.ca>
To: "gpfsug main discussion list" <gpfsug-discuss@spectrumscale.org>,
"Andrew Beattie" <abeat...@au1.ibm.com>
Cc: gpfsug-discuss@spectrumscale.org
Subject: Re: [gpfsug-discuss] Spectrum Scale - Spectrum Protect -
Space Management (GPFS HSM)
Date: Fri, Jun 2, 2017 7:28 PM
  We have that situation.
Users don't need to login to NSD's

What you need is to add at least one gpfs client to the cluster (or
multi-cluster), mount the DMAPI enabled file system, and use that
node
as a gateway for end-users. They can access the contents on the mount

point with their own underprivileged accounts.

Whether or not on a schedule, the moment an application or linux
command (such as cp, cat, vi, etc) accesses a stub, the file will be

staged.

Jaime

Quoting "Andrew Beattie" <abeat...@au1.ibm.com>:


Quick question,   Does anyone have a Scale / GPFS environment (HPC)
where users need the ability to recall data sets after they have

been

stubbed, but only System Administrators are permitted to log onto

the

NSD servers for security purposes.   And if so how do you provide

the

ability for the users to schedule their data set recalls?

Regards,

Andrew Beattie Software Defined Storage  - IT Specialist Phone:
614-2133-7927 E-mail: abeat...@au1.ibm.com[1]


Links:
--
[1] mailto:abeat...@au1.ibm.com[3]



  
   TELL US ABOUT YOUR SUCCESS STORIES
  http://www.scinethpc.ca/testimonials[4]
  
---
Jaime Pinto
SciNet HPC Consortium - Compute/Calcul Canada
www.scinet.utoronto.ca - www.computecanada.ca
University of Toronto
661 University Ave. (MaRS), Suite 1140
Toronto, ON, M5G1M1
P: 416-978-2755
C: 416-505-1477





This message was sent using IMP at SciNet Consortium, University of Toronto.

___
gpfsug-discuss mailing list
gpfsug-discuss at spectrumscale.org
http://gpfsug.org/mailman/listinfo/gpfsug-discuss


Re: [gpfsug-discuss] Spectrum Scale - Spectrum Protect - Space Management (GPFS HSM)

2017-06-02 Thread Jaime Pinto

We have that situation.
Users don't need to login to NSD's

What you need is to add at least one gpfs client to the cluster (or  
multi-cluster), mount the DMAPI enabled file system, and use that node  
as a gateway for end-users. They can access the contents on the mount  
point with their own underprivileged accounts.


Whether or not on a schedule, the moment an application or linux  
command (such as cp, cat, vi, etc) accesses a stub, the file will be  
staged.
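
For example, something as trivial as

   cat /gpfs/fs0/projectX/archived-file.dat > /dev/null

(path made up here) is enough to trigger a transparent recall of that file
back from the HSM pool.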


Jaime




Quoting "Andrew Beattie" <abeat...@au1.ibm.com>:


Quick question,   Does anyone have a Scale / GPFS environment (HPC)
where users need the ability to recall data sets after they have been
stubbed, but only System Administrators are permitted to log onto the
NSD servers for security purposes.   And if so how do you provide the
ability for the users to schedule their data set recalls?   Regards,
Andrew Beattie Software Defined Storage  - IT Specialist Phone:
614-2133-7927 E-mail: abeat...@au1.ibm.com[1]


Links:
--
[1] mailto:abeat...@au1.ibm.com








 
  TELL US ABOUT YOUR SUCCESS STORIES
 http://www.scinethpc.ca/testimonials
 ****
---
Jaime Pinto
SciNet HPC Consortium - Compute/Calcul Canada
www.scinet.utoronto.ca - www.computecanada.ca
University of Toronto
661 University Ave. (MaRS), Suite 1140
Toronto, ON, M5G1M1
P: 416-978-2755
C: 416-505-1477


This message was sent using IMP at SciNet Consortium, University of Toronto.

___
gpfsug-discuss mailing list
gpfsug-discuss at spectrumscale.org
http://gpfsug.org/mailman/listinfo/gpfsug-discuss


Re: [gpfsug-discuss] mmbackup with TSM INCLUDE/EXCLUDE was Re: What is an independent fileset? was: mmbackup with fileset : scope errors

2017-05-29 Thread Jaime Pinto
s
Mon May 29 15:54:52 2017 mmbackup:Determining file system changes for  
wosgpfs [TAPENODE3].
Mon May 29 15:54:52 2017 mmbackup:changed=3, expired=0, unsupported=0  
for server [TAPENODE3]
Mon May 29 15:54:52 2017 mmbackup:Sending files to the TSM server [3  
changed, 0 expired].

mmbackup: TSM Summary Information:
Total number of objects inspected:  3
Total number of objects backed up:  3
Total number of objects updated:0
Total number of objects rebound:0
Total number of objects deleted:0
Total number of objects expired:0
Total number of objects failed: 0
Total number of objects encrypted:  0
Total number of bytes inspected:4096
Total number of bytes transferred:  512
--
mmbackup: Backup of /wosgpfs completed successfully at Mon May 29  
15:54:56 EDT 2017.

--

real0m9.276s
user0m2.906s
sys 0m3.212s

_


Thanks for all the help
Jaime








From:   Jez Tucker <jtuc...@pixitmedia.com>
To: gpfsug-discuss@spectrumscale.org
Date:   05/18/2017 03:33 PM
Subject:Re: [gpfsug-discuss] What is an independent fileset? was:
mmbackup with fileset : scope errors
Sent by:gpfsug-discuss-boun...@spectrumscale.org



Hi

  When mmbackup has passed the preflight stage (pretty quickly) you'll
find the autogenerated ruleset as /var/mmfs/mmbackup/.mmbackupRules*

Best,

Jez


On 18/05/17 20:02, Jaime Pinto wrote:
Ok Marc

I'll follow your option 2) suggestion, and capture what mmbackup is using
as a rule first, then modify it.

I imagine by 'capture' you are referring to the -L n level I use?

-L n
 Controls the level of information displayed by the
 mmbackup command. Larger values indicate the
 display of more detailed information. n should be one of
 the following values:

 3
  Displays the same information as 2, plus each
  candidate file and the applicable rule.

 4
  Displays the same information as 3, plus each
  explicitly EXCLUDEed or LISTed
  file, and the applicable rule.

 5
  Displays the same information as 4, plus the
  attributes of candidate and EXCLUDEed or
  LISTed files.

 6
  Displays the same information as 5, plus
  non-candidate files and their attributes.

Thanks
Jaime




Quoting "Marc A Kaplan" <makap...@us.ibm.com>:

1. As I surmised, and I now have verification from Mr. mmbackup, mmbackup
wants to support incremental backups (using what it calls its shadow
database) and keep both your sanity and its sanity -- so mmbackup limits
you to either full filesystem or full inode-space (independent fileset.)
If you want to do something else, okay, but you have to be careful and be
sure of yourself. IBM will not be able to jump in and help you if and when

it comes time to restore and you discover that your backup(s) were not
complete.

2. If you decide you're a big boy (or woman or XXX) and want to do some
hacking ...  Fine... But even then, I suggest you do the smallest hack
that will mostly achieve your goal...
DO NOT think you can create a custom policy rules list for mmbackup out of

thin air  Capture the rules mmbackup creates and make small changes to

that --
And as with any disaster recovery plan.   Plan your Test and Test your

Plan  Then do some dry run recoveries before you really "need" to do a

real recovery.

I only even suggest this because Jaime says he has a huge filesystem with
several dependent filesets and he really, really wants to do a partial
backup, without first copying or re-organizing the filesets.

HMMM otoh... if you have one or more dependent filesets that are
smallish, and/or you don't need the backups -- create independent
filesets, copy/move/delete the data, rename, voila.



From:   "Jaime Pinto" <pi...@scinet.utoronto.ca>
To: "Marc A Kaplan" <makap...@us.ibm.com>
Cc: "gpfsug main discussion list" <gpfsug-discuss@spectrumscale.org>
Date:   05/18/2017 12:36 PM
Subject:Re: [gpfsug-discuss] What is an independent fileset? was:
mmbackupwith fileset : scope errors



Marc

The -P option may be a very good workaround, but I still have to test it.

I'm currently trying to craft the mm rule, as minimalist as possible,
however I'm not sure about what attributes mmbackup expects to see.

Below is my first attempt. It would be nice to get comments from
somebody familiar with the inner works of mmbackup.

Thanks
Jaime


/* A macro to abbreviate VARCHAR */
define([vc],[VARCHAR($1)])

/* Define three exter

Re: [gpfsug-discuss] What is an independent fileset? was: mmbackup with fileset : scope errors

2017-05-18 Thread Jaime Pinto

Ok Marc

I'll follow your option 2) suggestion, and capture what mmbackup is  
using as a rule first, then modify it.


I imagine by 'capture' you are referring to the -L n level I use?

-L n
 Controls the level of information displayed by the
 mmbackup command. Larger values indicate the
 display of more detailed information. n should be one of
 the following values:

 3
  Displays the same information as 2, plus each
  candidate file and the applicable rule.

 4
  Displays the same information as 3, plus each
  explicitly EXCLUDEed or LISTed
  file, and the applicable rule.

 5
  Displays the same information as 4, plus the
  attributes of candidate and EXCLUDEed or
  LISTed files.

 6
  Displays the same information as 5, plus
  non-candidate files and their attributes.

Thanks
Jaime




Quoting "Marc A Kaplan" <makap...@us.ibm.com>:


1. As I surmised, and I now have verification from Mr. mmbackup, mmbackup
wants to support incremental backups (using what it calls its shadow
database) and keep both your sanity and its sanity -- so mmbackup limits
you to either full filesystem or full inode-space (independent fileset.)
If you want to do something else, okay, but you have to be careful and be
sure of yourself. IBM will not be able to jump in and help you if and when
it comes time to restore and you discover that your backup(s) were not
complete.

2. If you decide you're a big boy (or woman or XXX) and want to do some
hacking ...  Fine... But even then, I suggest you do the smallest hack
that will mostly achieve your goal...
DO NOT think you can create a custom policy rules list for mmbackup out of
thin air  Capture the rules mmbackup creates and make small changes to
that --
And as with any disaster recovery plan.   Plan your Test and Test your
Plan  Then do some dry run recoveries before you really "need" to do a
real recovery.

I only even suggest this because Jaime says he has a huge filesystem with
several dependent filesets and he really, really wants to do a partial
backup, without first copying or re-organizing the filesets.

HMMM otoh... if you have one or more dependent filesets that are
smallish, and/or you don't need the backups -- create independent
filesets, copy/move/delete the data, rename, voila.



From:   "Jaime Pinto" <pi...@scinet.utoronto.ca>
To: "Marc A Kaplan" <makap...@us.ibm.com>
Cc: "gpfsug main discussion list" <gpfsug-discuss@spectrumscale.org>
Date:   05/18/2017 12:36 PM
Subject:Re: [gpfsug-discuss] What is an independent fileset? was:
mmbackupwith fileset : scope errors



Marc

The -P option may be a very good workaround, but I still have to test it.

I'm currently trying to craft the mm rule, as minimalist as possible,
however I'm not sure about what attributes mmbackup expects to see.

Below is my first attempt. It would be nice to get comments from
somebody familiar with the inner works of mmbackup.

Thanks
Jaime


/* A macro to abbreviate VARCHAR */
define([vc],[VARCHAR($1)])

/* Define three external lists */
RULE EXTERNAL LIST 'allfiles' EXEC
'/scratch/r/root/mmpolicyRules/mmpolicyExec-list'

/* Generate a list of all files, directories, plus all other file
system objects,
like symlinks, named pipes, etc. Include the owner's id with each
object and
sort them by the owner's id */

RULE 'r1' LIST 'allfiles'
 DIRECTORIES_PLUS
 SHOW('-u' vc(USER_ID) || ' -a' || vc(ACCESS_TIME) || ' -m' ||
vc(MODIFICATION_TIME) || ' -s ' || vc(FILE_SIZE))
 FROM POOL 'system'
 FOR FILESET('sysadmin3')

/* Files in special filesets, such as those excluded, are never traversed
*/
RULE 'ExcSpecialFile' EXCLUDE
 FOR FILESET('scratch3','project3')





Quoting "Marc A Kaplan" <makap...@us.ibm.com>:


Jaime,

  While we're waiting for the mmbackup expert to weigh in, notice that

the

mmbackup command does have a -P option that allows you to provide a
customized policy rules file.

So... a fairly safe hack is to do a trial mmbackup run, capture the
automatically generated policy file, and then augment it with FOR
FILESET('fileset-I-want-to-backup') clauses Then run the mmbackup

for

real with your customized policy file.

mmbackup uses mmapplypolicy which by itself is happy to limit its
directory scan to a particular fileset by using

mmapplypolicy /path-to-any-directory-within-a-gpfs-filesystem --scope
fileset 

However, mmbackup probably has other worries and for simpliciity and
helping make sure you get complete, sensible backups, apparently has
imposed some restrictions to preserve sanity (yours and our support

team!

;-) )  ...   (For example, suppose you were doing incremental ba

Re: [gpfsug-discuss] What is an independent fileset? was: mmbackup with fileset : scope errors

2017-05-18 Thread Jaime Pinto

Marc

The -P option may be a very good workaround, but I still have to test it.

I'm currently trying to craft the mm rule, as minimalist as possible,  
however I'm not sure about what attributes mmbackup expects to see.


Below is my first attempt. It would be nice to get comments from  
somebody familiar with the inner works of mmbackup.


Thanks
Jaime


/* A macro to abbreviate VARCHAR */
define([vc],[VARCHAR($1)])

/* Define three external lists */
RULE EXTERNAL LIST 'allfiles' EXEC  
'/scratch/r/root/mmpolicyRules/mmpolicyExec-list'


/* Generate a list of all files, directories, plus all other file  
system objects,
   like symlinks, named pipes, etc. Include the owner's id with each  
object and

   sort them by the owner's id */

RULE 'r1' LIST 'allfiles'
DIRECTORIES_PLUS
SHOW('-u' vc(USER_ID) || ' -a' || vc(ACCESS_TIME) || ' -m' ||  
vc(MODIFICATION_TIME) || ' -s ' || vc(FILE_SIZE))

FROM POOL 'system'
FOR FILESET('sysadmin3')

/* Files in special filesets, such as those excluded, are never traversed */
RULE 'ExcSpecialFile' EXCLUDE
FOR FILESET('scratch3','project3')
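
If this pans out, the plan is to feed the captured/tweaked rules back to
mmbackup via the -P option, presumably with something like (untested; the
rules file path is just wherever I end up saving the ruleset):

   mmbackup /gpfs/sgfs1 -P /root/mmbackupRules.sysadmin3 -N tsm-helper1-ib0 -s /dev/shm --tsm-errorlog $logfile -L 2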





Quoting "Marc A Kaplan" <makap...@us.ibm.com>:


Jaime,

  While we're waiting for the mmbackup expert to weigh in, notice that the
mmbackup command does have a -P option that allows you to provide a
customized policy rules file.

So... a fairly safe hack is to do a trial mmbackup run, capture the
automatically generated policy file, and then augment it with FOR
FILESET('fileset-I-want-to-backup') clauses Then run the mmbackup for
real with your customized policy file.

mmbackup uses mmapplypolicy which by itself is happy to limit its
directory scan to a particular fileset by using

mmapplypolicy /path-to-any-directory-within-a-gpfs-filesystem --scope
fileset 

However, mmbackup probably has other worries and for simpliciity and
helping make sure you get complete, sensible backups, apparently has
imposed some restrictions to preserve sanity (yours and our support team!
;-) )  ...   (For example, suppose you were doing incremental backups,
starting at different paths each time? -- happy to do so, but when
disaster strikes and you want to restore -- you'll end up confused and/or
unhappy!)

"converting from one fileset to another" --- sorry there is no such thing.
 Filesets are kinda like little filesystems within filesystems.  Moving a
file from one fileset to another requires a copy operation.   There is no
fast move nor hardlinking.

--marc



From:   "Jaime Pinto" <pi...@scinet.utoronto.ca>
To: "gpfsug main discussion list" <gpfsug-discuss@spectrumscale.org>,
"Marc   A Kaplan" <makap...@us.ibm.com>
Date:   05/18/2017 09:58 AM
Subject:Re: [gpfsug-discuss] What is an independent fileset? was:
mmbackupwith fileset : scope errors



Thanks for the explanation Marc and Luis,

It begs the question: why are filesets created as dependent by
default, if the adverse repercussions can be so great afterward? Even
in my case, where I manage GPFS and TSM deployments (and I have been
around for a while), I didn't realize at all that not adding an extra
option at fileset creation time would cause me huge trouble with
scaling later on as I try to use mmbackup.

When you have different groups managing file systems and backups that
don't read each other's manuals ahead of time, you have a really
bad recipe.

I'm looking forward to your explanation as to why mmbackup cares one
way or another.

I'm also hoping for a hint as to how to configure backup exclusion
rules on the TSM side to exclude fileset traversing on the GPFS side.
Is mmbackup smart enough (actually smarter than TSM client itself) to
read the exclusion rules on the TSM configuration and apply them
before traversing?

Thanks
Jaime

Quoting "Marc A Kaplan" <makap...@us.ibm.com>:


When I see "independent fileset" (in Spectrum/GPFS/Scale)  I always

think

and try to read that as "inode space".

An "independent fileset" has all the attributes of an (older-fashioned)
dependent fileset PLUS all of its files are represented by inodes that

are

in a separable range of inode numbers - this allows GPFS to efficiently

do

snapshots of just that inode-space (uh... independent fileset)...

And... of course the files of dependent filesets must also be

represented

by inodes -- those inode numbers are within the inode-space of whatever
the containing independent fileset is... as was chosen when you created
the fileset   If you didn't say otherwise, inodes come from the
default "root" fileset

Clear as your bath-water, no?

So why does mmbackup care one way or another ???   Stay tuned

BTW - if you look at the bits of the inode numbers carefully --- you may
not immediately discern what I mean by a "separable range of inode
numbers" -- (very technical hint) you may need to

Re: [gpfsug-discuss] What is an independent fileset? was: mmbackup with fileset : scope errors

2017-05-18 Thread Jaime Pinto

Thanks for the explanation Marc and Luis,

It begs the question: why are filesets created as dependent by
default, if the adverse repercussions can be so great afterward? Even
in my case, where I manage GPFS and TSM deployments (and I have been
around for a while), I didn't realize at all that not adding an extra
option at fileset creation time would cause me huge trouble with
scaling later on as I try to use mmbackup.

When you have different groups managing file systems and backups that
don't read each other's manuals ahead of time, you have a really
bad recipe.


I'm looking forward to your explanation as to why mmbackup cares one  
way or another.


I'm also hoping for a hint as to how to configure backup exclusion  
rules on the TSM side to exclude fileset traversing on the GPFS side.  
Is mmbackup smart enough (actually smarter than TSM client itself) to  
read the exclusion rules on the TSM configuration and apply them  
before traversing?


Thanks
Jaime

Quoting "Marc A Kaplan" <makap...@us.ibm.com>:


When I see "independent fileset" (in Spectrum/GPFS/Scale)  I always think
and try to read that as "inode space".

An "independent fileset" has all the attributes of an (older-fashioned)
dependent fileset PLUS all of its files are represented by inodes that are
in a separable range of inode numbers - this allows GPFS to efficiently do
snapshots of just that inode-space (uh... independent fileset)...

And... of course the files of dependent filesets must also be represented
by inodes -- those inode numbers are within the inode-space of whatever
the containing independent fileset is... as was chosen when you created
the fileset   If you didn't say otherwise, inodes come from the
default "root" fileset

Clear as your bath-water, no?

So why does mmbackup care one way or another ???   Stay tuned

BTW - if you look at the bits of the inode numbers carefully --- you may
not immediately discern what I mean by a "separable range of inode
numbers" -- (very technical hint) you may need to permute the bit order
before you discern a simple pattern...



From:   "Luis Bolinches" <luis.bolinc...@fi.ibm.com>
To: gpfsug-discuss@spectrumscale.org
Cc: gpfsug-discuss@spectrumscale.org
Date:   05/18/2017 02:10 AM
Subject:Re: [gpfsug-discuss] mmbackup with fileset : scope errors
Sent by:gpfsug-discuss-boun...@spectrumscale.org



Hi

There is no direct way to convert a fileset that is dependent to
independent or vice versa.

I would suggest taking a look at chapter 5 of the 2014 redbook, which has lots of
definitions about GPFS ILM including filesets:
http://www.redbooks.ibm.com/abstracts/sg248254.html?Open It is not the only
place where this is explained, but I honestly believe it is a good single starting
point. It also needs an update, as it does not have anything on CES nor ESS,
so anyone on this list should feel free to give feedback on that page; people with
funding decisions listen there.

So you are limited to either migrating the data from that fileset to a new
independent fileset (multiple ways to do that) or using the TSM client
config.

- Original message -
From: "Jaime Pinto" <pi...@scinet.utoronto.ca>
Sent by: gpfsug-discuss-boun...@spectrumscale.org
To: "gpfsug main discussion list" <gpfsug-discuss@spectrumscale.org>,
"Jaime Pinto" <pi...@scinet.utoronto.ca>
Cc:
Subject: Re: [gpfsug-discuss] mmbackup with fileset : scope errors
Date: Thu, May 18, 2017 4:43 AM

There is hope. See reference link below:
https://www.ibm.com/support/knowledgecenter/en/STXKQY_4.1.1/com.ibm.spectrum.scale.v4r11.ins.doc/bl1ins_tsm_fsvsfset.htm


The issue has to do with dependent vs. independent filesets, something
I didn't even realize existed until now. Our filesets are dependent
(for no particular reason), so I have to find a way to turn them into
independent.

The proper option syntax is "--scope inodespace", and the error
message actually flagged that out, however I didn't know how to
interpret what I saw:


# mmbackup /gpfs/sgfs1/sysadmin3 -N tsm-helper1-ib0 -s /dev/shm
--scope inodespace --tsm-errorlog $logfile -L 2

mmbackup: Backup of /gpfs/sgfs1/sysadmin3 begins at Wed May 17
21:27:43 EDT 2017.

Wed May 17 21:27:45 2017 mmbackup:mmbackup: Backing up *dependent*
fileset sysadmin3 is not supported
Wed May 17 21:27:45 2017 mmbackup:This fileset is not suitable for
fileset level backup.  exit 1


Will post the outcome.
Jaime



Quoting "Jaime Pinto" <pi...@scinet.utoronto.ca>:


Quoting "Luis Bolinches" <luis.bolinc...@fi.ibm.com>:


Hi

have you tried to add exceptions on the TSM client config file?


Hey Luis,

That would work as well (mechanically), however 

Re: [gpfsug-discuss] mmbackup with fileset : scope errors

2017-05-17 Thread Jaime Pinto

There is hope. See reference link below:
https://www.ibm.com/support/knowledgecenter/en/STXKQY_4.1.1/com.ibm.spectrum.scale.v4r11.ins.doc/bl1ins_tsm_fsvsfset.htm

The issue has to do with dependent vs. independent filesets, something  
I didn't even realize existed until now. Our filesets are dependent  
(for no particular reason), so I have to find a way to turn them into  
independent.


The proper option syntax is "--scope inodespace", and the error  
message actually flagged that out, however I didn't know how to  
interpret what I saw:



# mmbackup /gpfs/sgfs1/sysadmin3 -N tsm-helper1-ib0 -s /dev/shm  
--scope inodespace --tsm-errorlog $logfile -L 2


mmbackup: Backup of /gpfs/sgfs1/sysadmin3 begins at Wed May 17  
21:27:43 EDT 2017.


Wed May 17 21:27:45 2017 mmbackup:mmbackup: Backing up *dependent*  
fileset sysadmin3 is not supported
Wed May 17 21:27:45 2017 mmbackup:This fileset is not suitable for  
fileset level backup.  exit 1



Will post the outcome.
Jaime



Quoting "Jaime Pinto" <pi...@scinet.utoronto.ca>:


Quoting "Luis Bolinches" <luis.bolinc...@fi.ibm.com>:


Hi

have you tried to add exceptions on the TSM client config file?


Hey Luis,

That would work as well (mechanically), however it's not elegant or
efficient. When you have over 1PB and 200M files on scratch it will
take many hours and several helper nodes to traverse that fileset just
to be negated by TSM. In fact exclusions on TSM are just as inefficient.
Considering that I want to keep project and sysadmin on different
domains it's much worse, since we have to traverse and exclude
scratch & (project|sysadmin) twice, once to capture sysadmin and again
to capture project.

If I have to use exclusion rules, they have to rely solely on gpfs rules, and
somehow not traverse scratch at all.

I suspect there is a way to do this properly, however the examples in
the gpfs guide and other references are not exhaustive. They only show
a couple of trivial cases.

However my situation is not unique. I suspect there are many facilities
having to deal with backup of HUGE filesets.

So the search is on.

Thanks
Jaime






Assuming your GPFS dir is /IBM/GPFS and your fileset to exclude is linked
on /IBM/GPFS/FSET1

dsm.sys
...

DOMAIN /IBM/GPFS
EXCLUDE.DIR /IBM/GPFS/FSET1


From:   "Jaime Pinto" <pi...@scinet.utoronto.ca>
To: "gpfsug main discussion list" <gpfsug-discuss@spectrumscale.org>
Date:   17-05-17 23:44
Subject:[gpfsug-discuss] mmbackup with fileset : scope errors
Sent by:gpfsug-discuss-boun...@spectrumscale.org



I have a g200 /gpfs/sgfs1 filesystem with 3 filesets:
* project3
* scratch3
* sysadmin3

I have no problems mmbacking up /gpfs/sgfs1 (or sgfs1), however we
have no need or space to include *scratch3* on TSM.

Question: how to craft the mmbackup command to backup
/gpfs/sgfs1/project3 and/or /gpfs/sgfs1/sysadmin3 only?

Below are 3 types of errors:

1) mmbackup /gpfs/sgfs1/sysadmin3 -N tsm-helper1-ib0 -s /dev/shm
--tsm-errorlog $logfile -L 2

ERROR: mmbackup: Options /gpfs/sgfs1/sysadmin3 and --scope filesystem
cannot be specified at the same time.

2) mmbackup /gpfs/sgfs1/sysadmin3 -N tsm-helper1-ib0 -s /dev/shm
--scope inodespace --tsm-errorlog $logfile -L 2

ERROR: Wed May 17 16:27:11 2017 mmbackup:mmbackup: Backing up
dependent fileset sysadmin3 is not supported
Wed May 17 16:27:11 2017 mmbackup:This fileset is not suitable for
fileset level backup.  exit 1

3) mmbackup /gpfs/sgfs1/sysadmin3 -N tsm-helper1-ib0 -s /dev/shm
--scope filesystem --tsm-errorlog $logfile -L 2

ERROR: mmbackup: Options /gpfs/sgfs1/sysadmin3 and --scope filesystem
cannot be specified at the same time.

These examples don't really cover my case:
https://www.ibm.com/support/knowledgecenter/en/STXKQY_4.2.3/com.ibm.spectrum.scale.v4r23.doc/bl1adm_mmbackup.htm#mmbackup__mmbackup_examples


Thanks
Jaime


 
  TELL US ABOUT YOUR SUCCESS STORIES
 http://www.scinethpc.ca/testimonials
 
---
Jaime Pinto
SciNet HPC Consortium - Compute/Calcul Canada
www.scinet.utoronto.ca - www.computecanada.ca
University of Toronto
661 University Ave. (MaRS), Suite 1140
Toronto, ON, M5G1M1
P: 416-978-2755
C: 416-505-1477


This message was sent using IMP at SciNet Consortium, University of
Toronto.

___
gpfsug-discuss mailing list
gpfsug-discuss at spectrumscale.org
http://gpfsug.org/mailman/listinfo/gpfsug-discuss





Ellei edellä ole toisin mainittu: / Unless stated otherwise above:
Oy IBM Finland Ab
PL 265, 00101 Helsinki, Finland
Business ID, Y-tunnus: 0195876-3
Registered in Finland









Re: [gpfsug-discuss] mmbackup with fileset : scope errors

2017-05-17 Thread Jaime Pinto

Quoting "Luis Bolinches" <luis.bolinc...@fi.ibm.com>:


Hi

have you tried to add exceptions on the TSM client config file?


Hey Luis,

That would work as well (mechanically), however it's not elegant or
efficient. When you have over 1PB and 200M files on scratch it will
take many hours and several helper nodes to traverse that fileset just
to be negated by TSM. In fact exclusions on TSM are just as
inefficient. Considering that I want to keep project and sysadmin on
different domains it's much worse, since we have to traverse and
exclude scratch & (project|sysadmin) twice, once to capture sysadmin
and again to capture project.


If I have to use exclusion rules, they have to rely solely on gpfs rules,
and somehow not traverse scratch at all.


I suspect there is a way to do this properly, however the examples in
the gpfs guide and other references are not exhaustive. They only show
a couple of trivial cases.


However my situation is not unique. I suspect there are many facilities
having to deal with backup of HUGE filesets.


So the search is on.

Thanks
Jaime






Assuming your GPFS dir is /IBM/GPFS and your fileset to exclude is linked
on /IBM/GPFS/FSET1

dsm.sys
...

DOMAIN /IBM/GPFS
EXCLUDE.DIR /IBM/GPFS/FSET1


From:   "Jaime Pinto" <pi...@scinet.utoronto.ca>
To: "gpfsug main discussion list" <gpfsug-discuss@spectrumscale.org>
Date:   17-05-17 23:44
Subject:[gpfsug-discuss] mmbackup with fileset : scope errors
Sent by:gpfsug-discuss-boun...@spectrumscale.org



I have a g200 /gpfs/sgfs1 filesystem with 3 filesets:
* project3
* scratch3
* sysadmin3

I have no problems mmbacking up /gpfs/sgfs1 (or sgfs1), however we
have no need or space to include *scratch3* on TSM.

Question: how to craft the mmbackup command to backup
/gpfs/sgfs1/project3 and/or /gpfs/sgfs1/sysadmin3 only?

Below are 3 types of errors:

1) mmbackup /gpfs/sgfs1/sysadmin3 -N tsm-helper1-ib0 -s /dev/shm
--tsm-errorlog $logfile -L 2

ERROR: mmbackup: Options /gpfs/sgfs1/sysadmin3 and --scope filesystem
cannot be specified at the same time.

2) mmbackup /gpfs/sgfs1/sysadmin3 -N tsm-helper1-ib0 -s /dev/shm
--scope inodespace --tsm-errorlog $logfile -L 2

ERROR: Wed May 17 16:27:11 2017 mmbackup:mmbackup: Backing up
dependent fileset sysadmin3 is not supported
Wed May 17 16:27:11 2017 mmbackup:This fileset is not suitable for
fileset level backup.  exit 1

3) mmbackup /gpfs/sgfs1/sysadmin3 -N tsm-helper1-ib0 -s /dev/shm
--scope filesystem --tsm-errorlog $logfile -L 2

ERROR: mmbackup: Options /gpfs/sgfs1/sysadmin3 and --scope filesystem
cannot be specified at the same time.

These examples don't really cover my case:
https://www.ibm.com/support/knowledgecenter/en/STXKQY_4.2.3/com.ibm.spectrum.scale.v4r23.doc/bl1adm_mmbackup.htm#mmbackup__mmbackup_examples


Thanks
Jaime


  
   TELL US ABOUT YOUR SUCCESS STORIES
  http://www.scinethpc.ca/testimonials
  
---
Jaime Pinto
SciNet HPC Consortium - Compute/Calcul Canada
www.scinet.utoronto.ca - www.computecanada.ca
University of Toronto
661 University Ave. (MaRS), Suite 1140
Toronto, ON, M5G1M1
P: 416-978-2755
C: 416-505-1477


This message was sent using IMP at SciNet Consortium, University of
Toronto.

___
gpfsug-discuss mailing list
gpfsug-discuss at spectrumscale.org
http://gpfsug.org/mailman/listinfo/gpfsug-discuss





Ellei edellä ole toisin mainittu: / Unless stated otherwise above:
Oy IBM Finland Ab
PL 265, 00101 Helsinki, Finland
Business ID, Y-tunnus: 0195876-3
Registered in Finland








 
  TELL US ABOUT YOUR SUCCESS STORIES
 http://www.scinethpc.ca/testimonials
     
---
Jaime Pinto
SciNet HPC Consortium - Compute/Calcul Canada
www.scinet.utoronto.ca - www.computecanada.ca
University of Toronto
661 University Ave. (MaRS), Suite 1140
Toronto, ON, M5G1M1
P: 416-978-2755
C: 416-505-1477


This message was sent using IMP at SciNet Consortium, University of Toronto.

___
gpfsug-discuss mailing list
gpfsug-discuss at spectrumscale.org
http://gpfsug.org/mailman/listinfo/gpfsug-discuss


[gpfsug-discuss] mmbackup with fileset : scope errors

2017-05-17 Thread Jaime Pinto

I have a g200 /gpfs/sgfs1 filesystem with 3 filesets:
* project3
* scratch3
* sysadmin3

I have no problems mmbacking up /gpfs/sgfs1 (or sgfs1), however we  
have no need or space to include *scratch3* on TSM.


Question: how to craft the mmbackup command to backup  
/gpfs/sgfs1/project3 and/or /gpfs/sgfs1/sysadmin3 only?


Below are 3 types of errors:

1) mmbackup /gpfs/sgfs1/sysadmin3 -N tsm-helper1-ib0 -s /dev/shm  
--tsm-errorlog $logfile -L 2


ERROR: mmbackup: Options /gpfs/sgfs1/sysadmin3 and --scope filesystem  
cannot be specified at the same time.


2) mmbackup /gpfs/sgfs1/sysadmin3 -N tsm-helper1-ib0 -s /dev/shm  
--scope inodespace --tsm-errorlog $logfile -L 2


ERROR: Wed May 17 16:27:11 2017 mmbackup:mmbackup: Backing up  
dependent fileset sysadmin3 is not supported
Wed May 17 16:27:11 2017 mmbackup:This fileset is not suitable for  
fileset level backup.  exit 1


3) mmbackup /gpfs/sgfs1/sysadmin3 -N tsm-helper1-ib0 -s /dev/shm  
--scope filesystem --tsm-errorlog $logfile -L 2


ERROR: mmbackup: Options /gpfs/sgfs1/sysadmin3 and --scope filesystem  
cannot be specified at the same time.


These examples don't really cover my case:
https://www.ibm.com/support/knowledgecenter/en/STXKQY_4.2.3/com.ibm.spectrum.scale.v4r23.doc/bl1adm_mmbackup.htm#mmbackup__mmbackup_examples

Thanks
Jaime


 
  TELL US ABOUT YOUR SUCCESS STORIES
 http://www.scinethpc.ca/testimonials
 
---
Jaime Pinto
SciNet HPC Consortium - Compute/Calcul Canada
www.scinet.utoronto.ca - www.computecanada.ca
University of Toronto
661 University Ave. (MaRS), Suite 1140
Toronto, ON, M5G1M1
P: 416-978-2755
C: 416-505-1477


This message was sent using IMP at SciNet Consortium, University of Toronto.

___
gpfsug-discuss mailing list
gpfsug-discuss at spectrumscale.org
http://gpfsug.org/mailman/listinfo/gpfsug-discuss


Re: [gpfsug-discuss] BIG LAG since 3.5 on quota accounting reconciliation

2017-05-11 Thread Jaime Pinto

Just bumping up.
When I first posted this subject at the end of March there was a UG
meeting that drew people's attention elsewhere.


I hope to get some comments now.

Thanks
Jaime

Quoting "Jaime Pinto" <pi...@scinet.utoronto.ca>:


In the old days of DDN 9900 and gpfs 3.4 I only had to run mmcheckquota
once a month, usually after the massive monthly purge.

I noticed that starting with the GSS and ESS appliances under 3.5
I needed to run mmcheckquota more often, at least once a week, or as
often as daily, to clear the slippage errors in the accounting
information, otherwise users complained that they were hitting their
quotas, even though they deleted a lot of stuff.

More recently we adopted a G200 appliance (1.8PB), with v4.1, and now
things have gotten worse, and I have to run it twice daily, just in
case.

So, what am I missing? Is there a parameter since 3.5 and through 4.1
that we can set, so that GPFS will reconcile the quota accounting
internally more often and on its own?

Thanks
Jaime







 
  TELL US ABOUT YOUR SUCCESS STORIES
 http://www.scinethpc.ca/testimonials
 ****
---
Jaime Pinto
SciNet HPC Consortium - Compute/Calcul Canada
www.scinet.utoronto.ca - www.computecanada.ca
University of Toronto
661 University Ave. (MaRS), Suite 1140
Toronto, ON, M5G1M1
P: 416-978-2755
C: 416-505-1477


This message was sent using IMP at SciNet Consortium, University of Toronto.

___
gpfsug-discuss mailing list
gpfsug-discuss at spectrumscale.org
http://gpfsug.org/mailman/listinfo/gpfsug-discuss


Re: [gpfsug-discuss] help with multi-cluster setup: Network isunreachable

2017-05-09 Thread Jaime Pinto
As it turned out, the 'authorized_keys' file placed in the  
/var/mmfs/ssl directory of the NSD server for the new storage cluster 4  
(4.1.1-14) needed an explicit entry of the following format in the  
bracket associated with clients on cluster 0:

nistCompliance=off

Apparently the default for 4.1.x is:
nistCompliance=SP800-131A

I just noticed that on cluster 3 (4.1.1-7) that entry is also present  
in the bracket associated with clients on cluster 0. I guess the Seagate  
fellows that helped us install the G200 in our facility had that  
figured out.


The original "TLS handshake" error message kind of gave me a hint of  
the problem, however the 4.1 installation manual specifically  
mentioned that this could be an issue only on 4.2 onward. The  
troubleshoot guide for 4.2 has this excerpt:


"Ensure that the configurations of GPFS and the remote key management  
(RKM) server are
compatible when it comes to the version of the TLS protocol used upon  
key retrieval (GPFS uses the nistCompliance configuration variable to  
control that). In particular, if nistCompliance=SP800-131A is set in  
GPFS, ensure that the TLS v1.2 protocol is enabled in
the RKM server. If this does not resolve the issue, contact the IBM  
Support Center."

So, how am I to know that nistCompliance=off is even an option?



For backward compatibility with the older storage clusters on 3.5, the  
client cluster needs to have nistCompliance=off
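
If I read the manuals right, the cluster-wide default can also be inspected
and changed with something like

   mmlsconfig nistCompliance
   mmchconfig nistCompliance=off

(please double-check against your release), although in our case what actually
mattered was the nistCompliance=off line inside the per-cluster bracket of the
authorized_keys file.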


I hope this helps the fellows in mixed-version environments, since  
it's not obvious from the 3.5/4.1 installation manuals or the  
troubleshooting guide what we should do.


Thanks everyone for the help.
Jaime





Quoting "Uwe Falke" <uwefa...@de.ibm.com>:


Hi, Jaime,
I'd suggest you trace a client while trying to connect and check what
addresses it is going to talk to actually. It is a bit tedious, but you
will be able to find this in the trace report file. You might also get an
idea what's going wrong...
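
A rough recipe (option spellings from memory, so please check them against your
release; <client> and <remotefs> are placeholders):

   mmtracectl --start -N <client>
   mmmount <remotefs>            # or whatever reproduces the failed join
   mmtracectl --stop -N <client>

then grep the formatted trace report (by default under /tmp/mmfs) for the
addresses the daemon actually tries to contact.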



Mit freundlichen Grüßen / Kind regards


Dr. Uwe Falke

IT Specialist
High Performance Computing Services / Integrated Technology Services /
Data Center Services
---
IBM Deutschland
Rathausstr. 7
09111 Chemnitz
Phone: +49 371 6978 2165
Mobile: +49 175 575 2877
E-Mail: uwefa...@de.ibm.com
---
IBM Deutschland Business & Technology Services GmbH / Geschäftsführung:
Andreas Hasse, Thomas Wolter
Sitz der Gesellschaft: Ehningen / Registergericht: Amtsgericht Stuttgart,
HRB 17122




From:   "Jaime Pinto" <pi...@scinet.utoronto.ca>
To: "gpfsug main discussion list" <gpfsug-discuss@spectrumscale.org>
Date:   05/08/2017 06:06 PM
Subject:[gpfsug-discuss] help with multi-cluster setup: Network is
unreachable
Sent by:gpfsug-discuss-boun...@spectrumscale.org



We have a setup in which "cluster 0" is made up of clients only on
gpfs v4.1, i.e., no NSDs or formal storage on this primary membership.

All storage for those clients come in a multi-cluster fashion, from
clusters 1 (3.5.0-23), 2 (3.5.0-11) and 3 (4.1.1-7).

We recently added a new storage cluster 4 (4.1.1-14), and for some
obscure reason we keep getting "Network is unreachable" during mount
by clients, even though there were no issues or errors with the
multi-cluster setup, ie, 'mmremotecluster add' and 'mmremotefs add'
worked fine, and all clients have an entry in /etc/fstab for the file
system associated with the new cluster 4. The weird thing is that we
can mount cluster 3 fine (also 4.1).

Another piece of information is that as far as GPFS goes all clusters
are configured to communicate exclusively over InfiniBand, each on a
different 10.20.x.x network, but broadcast 10.20.255.255. As far as
the IB network goes there are no problems routing/pinging around all
the clusters. So this must be internal to GPFS.

None of the clusters have the subnet parameter set explicitly at
configuration, and on reading the 3.5 and 4.1 manuals it doesn't seem
we need to. All have cipherList AUTHONLY. One difference is that
cluster 4 has DMAPI enabled (don't think it matters).

Below is an excerpt of the /var/mmfs/gen/mmfslog on one of the clients
during mount (10.20.179.1 is one of the NSD servers on cluster 4):
Mon May  8 11:35:27.773 2017: [I] Waiting to join remote cluster
wosgpfs.wos-gateway01-ib0
Mon May  8 11:35:28.777 2017: [W] The TLS handshake with node
10.20.179.1 failed with error 447 (client side).
Mon May  8 11:35:28.781 2017: [E] Failed to join remote cluster
wosgpfs.wos-gateway01-ib0
Mon May  8 11:35:28.782 2017: [W] Command: err 719: mount
wosgpfs.wos-gateway01-ib0:wosgpfs
Mon May  8 11:35:28.783 2017: Network is unreachable


I see this reference to "TLS handshake&

Re: [gpfsug-discuss] help with multi-cluster setup: Network is unreachable

2017-05-08 Thread Jaime Pinto

Quoting valdis.kletni...@vt.edu:


On Mon, 08 May 2017 12:06:22 -0400, "Jaime Pinto" said:


Another piece of information is that as far as GPFS goes all clusters
are configured to communicate exclusively over Infiniband, each on a
different 10.20.x.x network, but broadcast 10.20.255.255. As far as


Have you verified that broadcast setting actually works, and packets
aren't being discarded as martians?



Yes, we have. They are fine.

I'm seeing "failure to join the cluster" messages prior to the  
"network unreachable" in the mmfslog files, so I'm starting to suspect  
minor disparities between older releases of 3.5.x.x at one end and  
newer 4.1.x.x at the other. I'll dig a little more and report the  
findings.


Thanks
Jaime







 
  TELL US ABOUT YOUR SUCCESS STORIES
 http://www.scinethpc.ca/testimonials
 ********
---
Jaime Pinto
SciNet HPC Consortium - Compute/Calcul Canada
www.scinet.utoronto.ca - www.computecanada.ca
University of Toronto
661 University Ave. (MaRS), Suite 1140
Toronto, ON, M5G1M1
P: 416-978-2755
C: 416-505-1477


This message was sent using IMP at SciNet Consortium, University of Toronto.

___
gpfsug-discuss mailing list
gpfsug-discuss at spectrumscale.org
http://gpfsug.org/mailman/listinfo/gpfsug-discuss


Re: [gpfsug-discuss] help with multi-cluster setup: Network is unreachable

2017-05-08 Thread Jaime Pinto
I only ask that we look beyond the trivial. The existing multi-cluster  
setup with mixed versions of servers already works fine with 4000+  
clients on 4.1. We still have 3 legacy servers on 3.5, and we already have  
a server on 4.1 also serving fine. The brand new 4.1 server we added  
last week seems to be at odds for some reason that is not so obvious.


Thanks
Jaime

Quoting "Buterbaugh, Kevin L" <kevin.buterba...@vanderbilt.edu>:


Hi Eric, Jaime,

Interesting comment as we do exactly the opposite!

I always make sure that my servers are running a particular version   
before I upgrade any clients.  Now we never mix and match major   
versions (i.e. 4.x and 3.x) for long - those kinds of upgrades we do  
rapidly.  But right now I've got clients running 4.2.0-3 talking   
just fine to 4.2.2.3 servers.


To be clear, I'm not saying I'm right and Eric's wrong at all - just  
an observation / data point.  YMMV?


Kevin

On May 8, 2017, at 11:34 AM, J. Eric Wonderley   
<eric.wonder...@vt.edu<mailto:eric.wonder...@vt.edu>> wrote:


Hi Jaime:

I think typically you want to keep the clients ahead of the server   
in version.  I would advance the version of your client nodes.


New clients can communicate with older versions of server NSDs.
Vice versa... not so much.

___
gpfsug-discuss mailing list
gpfsug-discuss at spectrumscale.org<http://spectrumscale.org>
http://gpfsug.org/mailman/listinfo/gpfsug-discuss

?
Kevin Buterbaugh - Senior System Administrator
Vanderbilt University - Advanced Computing Center for Research and Education
kevin.buterba...@vanderbilt.edu<mailto:kevin.buterba...@vanderbilt.edu> -   
(615)875-9633












 
  TELL US ABOUT YOUR SUCCESS STORIES
 http://www.scinethpc.ca/testimonials
 ****
---
Jaime Pinto
SciNet HPC Consortium - Compute/Calcul Canada
www.scinet.utoronto.ca - www.computecanada.ca
University of Toronto
661 University Ave. (MaRS), Suite 1140
Toronto, ON, M5G1M1
P: 416-978-2755
C: 416-505-1477


This message was sent using IMP at SciNet Consortium, University of Toronto.

___
gpfsug-discuss mailing list
gpfsug-discuss at spectrumscale.org
http://gpfsug.org/mailman/listinfo/gpfsug-discuss


Re: [gpfsug-discuss] help with multi-cluster setup: Network is unreachable

2017-05-08 Thread Jaime Pinto
Sorry, I made a mistake on the original description: all our clients  
are already on 4.1.1-7.

Jaime


Quoting "J. Eric Wonderley" <eric.wonder...@vt.edu>:


Hi Jaime:

I think typically you want to keep the clients ahead of the server in
version.  I would advance the version of your client nodes.

New clients can communicate with older versions of server NSDs.  Vice
versa... not so much.








 
  TELL US ABOUT YOUR SUCCESS STORIES
 http://www.scinethpc.ca/testimonials
 ****
---
Jaime Pinto
SciNet HPC Consortium - Compute/Calcul Canada
www.scinet.utoronto.ca - www.computecanada.ca
University of Toronto
661 University Ave. (MaRS), Suite 1140
Toronto, ON, M5G1M1
P: 416-978-2755
C: 416-505-1477


This message was sent using IMP at SciNet Consortium, University of Toronto.


___
gpfsug-discuss mailing list
gpfsug-discuss at spectrumscale.org
http://gpfsug.org/mailman/listinfo/gpfsug-discuss


[gpfsug-discuss] BIG LAG since 3.5 on quota accounting reconciliation

2017-03-31 Thread Jaime Pinto
In the old days of DDN 9900 and gpfs 3.4 I only had to run  
mmcheckquota once a month, usually after the massive monthly purge.


I noticed that starting with the GSS and ESS appliances under 3.5  
I needed to run mmcheckquota more often, at least once a week, or as  
often as daily, to clear the slippage errors in the accounting  
information, otherwise users complained that they were hitting their  
quotas, even though they deleted a lot of stuff.


More recently we adopted a G200 appliance (1.8PB), with v4.1, and now  
things have gotten worse, and I have to run it twice daily, just in  
case.


So, what am I missing? Is there a parameter since 3.5 and through 4.1  
that we can set, so that GPFS will reconcile the quota accounting  
internally more often and on its own?


Thanks
Jaime


 
  TELL US ABOUT YOUR SUCCESS STORIES
 http://www.scinethpc.ca/testimonials
 
---
Jaime Pinto
SciNet HPC Consortium - Compute/Calcul Canada
www.scinet.utoronto.ca - www.computecanada.ca
University of Toronto
661 University Ave. (MaRS), Suite 1140
Toronto, ON, M5G1M1
P: 416-978-2755
C: 416-505-1477


This message was sent using IMP at SciNet Consortium, University of Toronto.

___
gpfsug-discuss mailing list
gpfsug-discuss at spectrumscale.org
http://gpfsug.org/mailman/listinfo/gpfsug-discuss


Re: [gpfsug-discuss] fix mmrepquota report format during grace periods

2017-03-28 Thread Jaime Pinto

Aah! Another one of those options not so well documented or exposed:
Usage:
  mmrepquota [-u] [-g] [-e] [-q] [-n] [-v] [-t]
 [--block-size {BlockSize | auto}] {-a | Device[:Fileset] ...}
  or
  mmrepquota -j [-e] [-q] [-n] [-v] [-t]
 [--block-size {BlockSize | auto}] {-a | Device ...}


I agree that this way it would be easier for a script to deal with  
fields that have spaces, using ':' as a field separator. However it  
mangles all the information together, making it very difficult for  
a sysadmin's eyes to deal with in its original format.
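
For instance, with ':' as the separator a script can pull exactly the fields it
needs from the sample below with a one-liner such as

   mmrepquota -Y dns | awk -F: '$8 == "USR" {print $10, $11, $15}'

(field positions taken from the HEADER line in Bob's example; that would print
the name, blockUsage and blockGrace columns).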


I'll take it under consideration for the scripted version (many of the scripts  
are due to be revised); however, the best outcome would be for the original  
plain reports to be consistent.


Thanks
Jaime

Quoting "Oesterlin, Robert" <robert.oester...@nuance.com>:


Try running it with the '-Y' option; it returns an easy-to-parse output:
mmrepquota -Y dns
mmrepquota::HEADER:version:reserved:reserved:filesystemName:quotaType:id:name:blockUsage:blockQuota:blockLimit:blockInDoubt:blockGrace:filesUsage:filesQuota:filesLimit:filesInDoubt:filesGrace:remarks:quota:defQuota:fid:filesetname:
mmrepquota::0:1:::dns:USR:0:root:0:0:0:0:none:1:0:0:0:none:i:on:off:0:root:
mmrepquota::0:1:::dns:USR:0:root:0:0:0:0:none:1:0:0:0:none:i:on:off:1:users:
mmrepquota::0:1:::dns:GRP:0:root:0:0:0:0:none:1:0:0:0:none:i:on:off:0:root:
mmrepquota::0:1:::dns:GRP:0:root:0:0:0:0:none:1:0:0:0:none:i:on:off:1:users:
mmrepquota::0:1:::dns:FILESET:0:root:0:0:0:0:none:1:0:0:0:none:i:on:off:::
mmrepquota::0:1:::dns:FILESET:1:users:0:4294967296:4294967296:0:none:1:0:0:0:none:e:on:off:::

Bob Oesterlin
Sr Principal Storage Engineer, Nuance



On 3/28/17, 9:47 AM, "gpfsug-discuss-boun...@spectrumscale.org on   
behalf of Jaime Pinto" <gpfsug-discuss-boun...@spectrumscale.org on   
behalf of pi...@scinet.utoronto.ca> wrote:


Any chance you guys in the GPFS devel team could patch the mmrepquota
code so that during grace periods the report column for "none" would
still be replaced with >>>*ONE*<<< word? By that I mean, instead of "2
days" for example, just print "2-days" or "2days" or "2_days", and so
on.

I have a number of scripts that fail for users when they are over
their quotas under grace periods, because the report shifts the
remaining information for that user 1 column to the right.

Obviously it would cost me absolutely nothing to patch my scripts to
deal with this, however the principle here is that the reports
generated by GPFS should be the ones that keep consistency.

Thanks
Jaime




  
   TELL US ABOUT YOUR SUCCESS STORIES

http://www.scinethpc.ca/testimonials

  
---
Jaime Pinto
SciNet HPC Consortium - Compute/Calcul Canada
www.scinet.utoronto.ca - www.computecanada.ca
University of Toronto
661 University Ave. (MaRS), Suite 1140
Toronto, ON, M5G1M1
P: 416-978-2755
C: 416-505-1477


This message was sent using IMP at SciNet Consortium, University  
 of Toronto.


___
gpfsug-discuss mailing list
gpfsug-discuss at spectrumscale.org
  
http://gpfsug.org/mailman/listinfo/gpfsug-discuss



___
gpfsug-discuss mailing list
gpfsug-discuss at spectrumscale.org
http://gpfsug.org/mailman/listinfo/gpfsug-discuss








 
  TELL US ABOUT YOUR SUCCESS STORIES
     http://www.scinethpc.ca/testimonials
 
---
Jaime Pinto
SciNet HPC Consortium - Compute/Calcul Canada
www.scinet.utoronto.ca - www.computecanada.ca
University of Toronto
661 University Ave. (MaRS), Suite 1140
Toronto, ON, M5G1M1
P: 416-978-2755
C: 416-505-1477


This message was sent using IMP at SciNet Consortium, University of Toronto.

___
gpfsug-discuss mailing list
gpfsug-discuss at spectrumscale.org
http://gpfsug.org/mailman/listinfo/gpfsug-discuss


[gpfsug-discuss] fix mmrepquota report format during grace periods

2017-03-28 Thread Jaime Pinto
Any chance you guys in the GPFS devel team could patch the mmrepquota  
code so that during grace periods the report column for "none" would  
still be replaced with >>>*ONE*<<< word? By that I mean, instead of "2  
days" for example, just print "2-days" or "2days" or "2_days", and so  
on.


I have a number of scripts that fail for users when they are over  
their quotas under grace periods, because the report shifts the  
remaining information for that user 1 column to the right.


Obviously it would cost me absolutely nothing to patch my scripts to  
deal with this, however the principle here is that the reports  
generated by GPFS should be the ones that keep consistency.
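
In the meantime the workaround in my scripts is something along these lines
(a sketch only; the exact unit strings would need checking):

   mmrepquota -u <device> | sed -E 's/([0-9]+) (days|hours|minutes)/\1\2/g'

i.e. glue the number and its unit back into a single token before any field
counting happens, which is exactly the kind of patching I'd rather not have
to carry around.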


Thanks
Jaime




 
  TELL US ABOUT YOUR SUCCESS STORIES
 http://www.scinethpc.ca/testimonials
 
---
Jaime Pinto
SciNet HPC Consortium - Compute/Calcul Canada
www.scinet.utoronto.ca - www.computecanada.ca
University of Toronto
661 University Ave. (MaRS), Suite 1140
Toronto, ON, M5G1M1
P: 416-978-2755
C: 416-505-1477


This message was sent using IMP at SciNet Consortium, University of Toronto.

___
gpfsug-discuss mailing list
gpfsug-discuss at spectrumscale.org
http://gpfsug.org/mailman/listinfo/gpfsug-discuss


Re: [gpfsug-discuss] replicating ACLs across GPFS's?

2017-01-05 Thread Jaime Pinto

Great guys!!!
Just what I was looking for.
Everyone is always so helpful on this forum.
Thanks a lot.
Jaime

Quoting "Laurence Horrocks-Barlow" <laure...@qsplace.co.uk>:


Are you talking about the GPFSUG github?

https://github.com/gpfsug/gpfsug-tools

The patched rsync there I believe was done by Orlando.

-- Lauz


On 05/01/2017 22:01, Buterbaugh, Kevin L wrote:

Hi Jaime,

IBM developed a patch for rsync that can replicate ACLs - we've   
used it and it works great - can't remember where we downloaded it   
from, though.  Maybe someone else on the list who *isn't* having a   
senior moment can point you to it?


Kevin


On Jan 5, 2017, at 3:53 PM, Jaime Pinto <pi...@scinet.utoronto.ca> wrote:

Does anyone know of a functional standalone tool to   
systematically and recursively find and replicate ACLs that works   
well with GPFS?


* We're currently using rsync, which will replicate permissions   
fine, however it leaves the ACL's behind. The --perms option for   
rsync is blind to ACLs.


* The native linux trick below works well with ext4 after an   
rsync, but makes a mess on GPFS.

% getfacl -R /path/to/source > /root/perms.acl
% setfacl --restore=/root/perms.acl

* The native GPFS mmgetacl/mmputacl pair does not have a built-in   
recursive option.
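
The closest I can come up with is wrapping that pair with find, along
these lines (a rough sketch only, assuming the destination tree already
mirrors the source, e.g. right after the rsync; paths are placeholders):

cd /path/to/source
find . -depth -print | while read -r f
do
    # pull the ACL from the source entry and push it onto the matching
    # destination entry
    mmgetacl -o /tmp/acl.$$ "$f" && mmputacl -i /tmp/acl.$$ "/path/to/destination/$f"
done
rm -f /tmp/acl.$$

It feels clunky and slow for millions of files, though.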


Any ideas?

Thanks
Jaime

---
Jaime Pinto
SciNet HPC Consortium - Compute/Calcul Canada
www.scinet.utoronto.ca - www.computecanada.ca
University of Toronto
661 University Ave. (MaRS), Suite 1140
Toronto, ON, M5G1M1
P: 416-978-2755
C: 416-505-1477


This message was sent using IMP at SciNet Consortium, University   
of Toronto.



___
gpfsug-discuss mailing list
gpfsug-discuss at spectrumscale.org
http://gpfsug.org/mailman/listinfo/gpfsug-discuss

___
gpfsug-discuss mailing list
gpfsug-discuss at spectrumscale.org
http://gpfsug.org/mailman/listinfo/gpfsug-discuss


___
gpfsug-discuss mailing list
gpfsug-discuss at spectrumscale.org
http://gpfsug.org/mailman/listinfo/gpfsug-discuss





This message was sent using IMP at SciNet Consortium, University of Toronto.

___
gpfsug-discuss mailing list
gpfsug-discuss at spectrumscale.org
http://gpfsug.org/mailman/listinfo/gpfsug-discuss


Re: [gpfsug-discuss] quota on secondary groups for a user?

2016-08-04 Thread Jaime Pinto

OK

More info:

Users can apply the 'sg group1' or 'sg group2' command from a shell or
script to switch the group mask from that point on, and dodge the
quota that may have been exceeded on a group.
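
For instance, something as simple as this is enough to dodge it (a
sketch; the path is a placeholder, and it assumes group2 is the one
over quota while group1 still has room):

# the new file gets charged to group1 instead of the exhausted group2
sg group1 -c "dd if=/dev/zero of=/gpfs/scratch/group2dir/bigfile bs=1M count=1024"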


However, for the group owner or the other members of a group at its
limit, I could not find a tool they can use on their own to find out
who the largest users are; 'du' takes too long, and some users don't
give read permissions on their directories.


As part of the puzzle solution I have to come up with a root wrapper  
that can make the contents of the mmrepquota report available to them.
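
Something along these lines is what I have in mind, exposed through
sudo (only a sketch; 'gpfs0' is a placeholder device name):

#!/bin/bash
# show the caller only the mmrepquota group lines for the groups they
# belong to
fs=gpfs0
caller=${SUDO_USER:-$(id -un)}
groups_regex=$(id -Gn "$caller" | tr ' ' '|')
/usr/lpp/mmfs/bin/mmrepquota -g "$fs" | egrep "^(${groups_regex}) "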


Jaime



Quoting "Buterbaugh, Kevin L" <kevin.buterba...@vanderbilt.edu>:


Hi Jaime,

Thank you so much for doing this and reporting back the results!
They're in line with what I would expect to happen.  I was going to
test this as well, but we have had to extend our downtime until
noontime tomorrow, so I haven't had a chance to do so yet.  Now I
don't have to... ;-)


Kevin

On Aug 4, 2016, at 10:59 AM, Jaime Pinto <pi...@scinet.utoronto.ca> wrote:


Since there were inconsistencies in the responses, I decided to rig   
a couple of accounts/groups on our LDAP to test "My interpretation",  
 and determined that I was wrong. When Kevin mentioned it would mean  
 a bug I had to double-check:


If a user hits the hard quota or exceeds the grace period on the   
soft quota on any of the secondary groups that user will be stopped   
from further writing to those groups as well, just as in the primary  
 group.


I hope this clears the waters a bit. I still have to solve my puzzle.

Thanks everyone for the feedback.
Jaime



Quoting "Jaime Pinto"   
<pi...@scinet.utoronto.ca<mailto:pi...@scinet.utoronto.ca>>:


Quoting "Buterbaugh, Kevin L"   
<kevin.buterba...@vanderbilt.edu<mailto:kevin.buterba...@vanderbilt.edu>>:


Hi Sven,

Wait - am I misunderstanding something here?  Let's say that I have
'user1' who has primary group 'group1' and secondary group 'group2'.
And let's say that they write to a directory where the bit on the
directory forces all files created in that directory to have group2
associated with them.  Are you saying that those files still count
against group1's group quota???

Thanks for clarifying...

Kevin

Not really,

My interpretation is that all files written with group2 will count
towards the quota on that group. However, any user with group2 as the
primary group will be prevented from writing any further once the
group2 quota is reached, while the culprit user1, whose primary group
is group1, won't be detected by gpfs and can just keep on writing
group2 files.

As far as the individual user quota goes, it doesn't matter: group1 or
group2, it will be counted towards that user's usage.

It would be interesting if the behavior was more as expected. I just
checked with my Lustre counter-parts and they tell me whichever
secondary group is hit first, however many there may be, the user will
be stopped. The problem then becomes identifying which of the secondary
groups hit the limit for that user.

Jaime



On Aug 3, 2016, at 11:35 AM, Sven Oehme <oeh...@gmail.com> wrote:


Hi,

quotas are only counted against primary group

sven


On Wed, Aug 3, 2016 at 9:22 AM, Jaime Pinto <pi...@scinet.utoronto.ca> wrote:
Suppose I want to set both USR and GRP quotas for a user, however 
GRP is not the primary group. Will gpfs enforce the secondary group   
  quota for that user?


What I mean is, if the user keeps writing files with secondary
group  as the attribute, and that overall group quota is reached,
will that  user be stopped by gpfs?


Thanks
Jaime




   
TELL US ABOUT YOUR SUCCESS STORIES
   http://www.scinethpc.ca/testimonials
   
---
Jaime Pinto
SciNet HPC Consortium  - Compute/Calcul Canada
www.scinet.utoronto.ca - www.computecanada.org

University of Toronto
256 McCaul Street, Room 235
Toronto, ON, M5T1W5
P: 416-978-2755
C: 416-505-1477




This message was sent using IMP at SciNet Consortium, University of Toronto.

___
gpfsug-discuss mailing list
gpfsug-discuss at spectrumscale.org
http://gpfsug.org/mailman/listinfo/gpfsug-discuss







********
 TELL US ABOUT YOUR SUCCESS STORIES
http://www.scinethpc.ca/

Re: [gpfsug-discuss] quota on secondary groups for a user?

2016-08-04 Thread Jaime Pinto
Since there were inconsistencies in the responses, I decided to rig a  
couple of accounts/groups on our LDAP to test "My interpretation", and  
determined that I was wrong. When Kevin mentioned it would mean a bug  
I had to double-check:


If a user hits the hard quota or exceeds the grace period on the soft  
quota on any of the secondary groups that user will be stopped from  
further writing to those groups as well, just as in the primary group.


I hope this clears the waters a bit. I still have to solve my puzzle.

Thanks everyone for the feedback.
Jaime



Quoting "Jaime Pinto" <pi...@scinet.utoronto.ca>:


Quoting "Buterbaugh, Kevin L" <kevin.buterba...@vanderbilt.edu>:


Hi Sven,

Wait - am I misunderstanding something here?  Let's say that I have
'user1' who has primary group 'group1' and secondary group 'group2'.
And let's say that they write to a directory where the bit on the
directory forces all files created in that directory to have group2
associated with them.  Are you saying that those files still count
against group1's group quota???

Thanks for clarifying...

Kevin


Not really,

My interpretation is that all files written with group2 will count
towards the quota on that group. However, any user with group2 as the
primary group will be prevented from writing any further once the
group2 quota is reached, while the culprit user1, whose primary group
is group1, won't be detected by gpfs and can just keep on writing
group2 files.

As far as the individual user quota goes, it doesn't matter: group1 or
group2, it will be counted towards that user's usage.

It would be interesting if the behavior was more as expected. I just
checked with my Lustre counter-parts and they tell me whichever
secondary group is hit first, however many there may be, the user will
be stopped. The problem then becomes identifying which of the secondary
groups hit the limit for that user.

Jaime




On Aug 3, 2016, at 11:35 AM, Sven Oehme <oeh...@gmail.com> wrote:


Hi,

quotas are only counted against primary group

sven


On Wed, Aug 3, 2016 at 9:22 AM, Jaime Pinto <pi...@scinet.utoronto.ca> wrote:
Suppose I want to set both USR and GRP quotas for a user, however
GRP is not the primary group. Will gpfs enforce the secondary group  
  quota for that user?


What I mean is, if the user keeps writing files with secondary   
group  as the attribute, and that overall group quota is reached,   
will that  user be stopped by gpfs?


Thanks
Jaime





 TELL US ABOUT YOUR SUCCESS STORIES
http://www.scinethpc.ca/testimonials

---
Jaime Pinto
SciNet HPC Consortium  - Compute/Calcul Canada
www.scinet.utoronto.ca - www.computecanada.org

University of Toronto
256 McCaul Street, Room 235
Toronto, ON, M5T1W5
P: 416-978-2755
C: 416-505-1477





This message was sent using IMP at SciNet Consortium, University of Toronto.

___
gpfsug-discuss mailing list
gpfsug-discuss at spectrumscale.org
http://gpfsug.org/mailman/listinfo/gpfsug-discuss








 
  TELL US ABOUT YOUR SUCCESS STORIES
 http://www.scinethpc.ca/testimonials
 
---
Jaime Pinto
SciNet HPC Consortium  - Compute/Calcul Canada
www.scinet.utoronto.ca - www.computecanada.org
University of Toronto
256 McCaul Street, Room 235
Toronto, ON, M5T1W5
P: 416-978-2755
C: 416-505-1477


This message was sent using IMP at SciNet Consortium, University of Toronto.

___
gpfsug-discuss mailing list
gpfsug-discuss at spectrumscale.org
http://gpfsug.org/mailman/listinfo/gpfsug-discuss


Re: [gpfsug-discuss] quota on secondary groups for a user?

2016-08-03 Thread Jaime Pinto

Quoting "Buterbaugh, Kevin L" <kevin.buterba...@vanderbilt.edu>:


Hi Sven,

Wait - am I misunderstanding something here?  Let's say that I have
'user1' who has primary group 'group1' and secondary group 'group2'.
And let's say that they write to a directory where the bit on the
directory forces all files created in that directory to have group2
associated with them.  Are you saying that those files still count
against group1's group quota???

Thanks for clarifying...

Kevin


Not really,

My interpretation is that all files written with group2 will count
towards the quota on that group. However, any user with group2 as the
primary group will be prevented from writing any further once the
group2 quota is reached, while the culprit user1, whose primary group
is group1, won't be detected by gpfs and can just keep on writing
group2 files.

As far as the individual user quota goes, it doesn't matter: group1 or
group2, it will be counted towards that user's usage.


It would be interesting if the behavior was more as expected. I just  
checked with my Lustre counter-parts and they tell me whichever  
secondary group is hit first, however many there may be, the user will  
be stopped. The problem then becomes identifying which of the  
secondary groups hit the limit for that user.


Jaime




On Aug 3, 2016, at 11:35 AM, Sven Oehme <oeh...@gmail.com> wrote:


Hi,

quotas are only counted against primary group

sven


On Wed, Aug 3, 2016 at 9:22 AM, Jaime Pinto <pi...@scinet.utoronto.ca> wrote:
Suppose I want to set both USR and GRP quotas for a user, however   
GRP is not the primary group. Will gpfs enforce the secondary group   
quota for that user?


What I mean is, if the user keeps writing files with secondary group  
 as the attribute, and that overall group quota is reached, will  
that  user be stopped by gpfs?


Thanks
Jaime




 
  TELL US ABOUT YOUR SUCCESS STORIES
 http://www.scinethpc.ca/testimonials
 ****
---
Jaime Pinto
SciNet HPC Consortium  - Compute/Calcul Canada
www.scinet.utoronto.ca - www.computecanada.org

University of Toronto
256 McCaul Street, Room 235
Toronto, ON, M5T1W5
P: 416-978-2755
C: 416-505-1477





This message was sent using IMP at SciNet Consortium, University of Toronto.

___
gpfsug-discuss mailing list
gpfsug-discuss at spectrumscale.org
http://gpfsug.org/mailman/listinfo/gpfsug-discuss


Re: [gpfsug-discuss] GPFS on ZFS! ... ?

2016-06-13 Thread Jaime Pinto

As Marc, I also have questions related to performance.

Assuming we let ZFS take care of the underlying software raid, what
would be the difference between GPFS and Lustre, for instance, for the
"parallel serving" at scale part of the file system? What would keep
GPFS from performing or functioning just as well?


Thanks
Jaime

Quoting "Marc A Kaplan" <makap...@us.ibm.com>:


How do you set the size of a ZFS file that is simulating a GPFS disk?  How
do you "tell" GPFS about that?

How efficient is this layering, compared to just giving GPFS direct access
to the same kind of LUNs that ZFS is using?

Hmmm... to partially answer my question, I do something similar, but
strictly for testing non-performance critical GPFS functions.
On any file system one can:

  dd if=/dev/zero of=/fakedisks/d3 count=1 bs=1M seek=3000  # create a
fake 3GB disk for GPFS

Then use a GPFS nsd configuration record like this:

%nsd: nsd=d3  device=/fakedisks/d3  usage=dataOnly pool=xtra
servers=bog-xxx

Which starts out as sparse and the filesystem will dynamically "grow" as
GPFS writes to it...
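
The stanza goes into a file that is then fed to mmcrnsd as usual (the
file name here is just an example):

mmcrnsd -F /tmp/fakedisks.stanza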

But I have no idea how well this will work for a critical "production"
system...

tx, marc kaplan.



From:   "Allen, Benjamin S." <bsal...@alcf.anl.gov>
To: gpfsug main discussion list <gpfsug-discuss@spectrumscale.org>
Date:   06/13/2016 12:34 PM
Subject:Re: [gpfsug-discuss] GPFS on ZFS?
Sent by:gpfsug-discuss-boun...@spectrumscale.org



Jaime,

See
https://www.ibm.com/support/knowledgecenter/STXKQY_4.2.0/com.ibm.spectrum.scale.v4r2.adm.doc/bl1adm_nsddevices.htm
. An example I have for adding /dev/nvme* devices:

* GPFS doesn't know that /dev/nvme* are valid block devices; use a
user exit script to let it know about them

cp /usr/lpp/mmfs/samples/nsddevices.sample /var/mmfs/etc/nsddevices

* Edit /var/mmfs/etc/nsddevices, and add to linux section:

if [[ $osName = Linux ]]
then
  : # Add function to discover disks in the Linux environment.
  for dev in $( cat /proc/partitions | grep nvme | awk '{print $4}' )
  do
    echo $dev generic
  done
fi

* Copy edited nsddevices to the rest of the nodes at the same path
for host in n01 n02 n03 n04; do
  scp /var/mmfs/etc/nsddevices ${host}:/var/mmfs/etc/nsddevices
done


Ben


On Jun 13, 2016, at 11:26 AM, Jaime Pinto <pi...@scinet.utoronto.ca>

wrote:


Hi Chris

As I understand, GPFS likes to 'see' the block devices, even on a

hardware raid solution such as DDN's.


How is that accomplished when you use ZFS for software raid?
On page 4, I see this info, and I'm trying to interpret it:

General Configuration
...
* zvols
* nsddevices
 - echo "zdX generic"


Thanks
Jaime

Quoting "Hoffman, Christopher P" <cphof...@lanl.gov>:


Hi Jaime,

What in particular would you like explained more? I'd be more than

happy to discuss things further.


Chris
____
From: gpfsug-discuss-boun...@spectrumscale.org

[gpfsug-discuss-boun...@spectrumscale.org] on behalf of Jaime Pinto
[pi...@scinet.utoronto.ca]

Sent: Monday, June 13, 2016 10:11
To: gpfsug main discussion list
Subject: Re: [gpfsug-discuss] GPFS on ZFS?

I just came across this presentation on "GPFS with underlying ZFS
block devices", by Christopher Hoffman, Los Alamos National Lab,
although some of the
implementation remains obscure.

http://files.gpfsug.org/presentations/2016/anl-june/LANL_GPFS_ZFS.pdf

It would be great to have more details, in particular the possibility
of straight use of GPFS on ZFS, instead of the 'archive' use case as
described on the presentation.

Thanks
Jaime




Quoting "Jaime Pinto" <pi...@scinet.utoronto.ca>:


Since we can not get GNR outside ESS/GSS appliances, is anybody using
ZFS for software raid on commodity storage?

Thanks
Jaime







 
  TELL US ABOUT YOUR SUCCESS STORIES
 http://www.scinethpc.ca/testimonials
 
---
Jaime Pinto
SciNet HPC Consortium  - Compute/Calcul Canada
www.scinet.utoronto.ca - www.computecanada.org
University of Toronto
256 McCaul Street, Room 235
Toronto, ON, M5T1W5
P: 416-978-2755
C: 416-505-1477


This message was sent using IMP at SciNet Consortium, University of

Toronto.


___
gpfsug-discuss mailing list
gpfsug-discuss at spectrumscale.org
http://gpfsug.org/mailman/listinfo/gpfsug-discuss
___
gpfsug-discuss mailing list
gpfsug-discuss at spectrumscale.org
http://gpfsug.org/mailman/listinfo/gpfsug-discuss









     TELL US ABOUT YOUR SUCCESS STORIES
http://www.scinethpc.ca/testimonials

---
Jaime Pinto
SciNet HPC Consortium  - Comp

[gpfsug-discuss] GPFS on ZFS?

2016-04-18 Thread Jaime Pinto
Since we can not get GNR outside ESS/GSS appliances, is anybody using  
ZFS for software raid on commodity storage?


Thanks
Jaime


---
Jaime Pinto
SciNet HPC Consortium  - Compute/Calcul Canada
www.scinet.utoronto.ca - www.computecanada.org
University of Toronto
256 McCaul Street, Room 235
Toronto, ON, M5T1W5
P: 416-978-2755
C: 416-505-1477


This message was sent using IMP at SciNet Consortium, University of Toronto.


___
gpfsug-discuss mailing list
gpfsug-discuss at spectrumscale.org
http://gpfsug.org/mailman/listinfo/gpfsug-discuss


Re: [gpfsug-discuss] backup and disaster recovery solutions

2016-04-11 Thread Jaime Pinto

Hi Mark

Personally I'm aware of the HSM features.

However I was specifically referring to TSM Backup restore. I was told
the new GUI for unprivileged users looks identical to what root would
see, but unprivileged users would only be able to see material for
which they have read permissions, and restore only to paths where they
have write permissions. The GUI is supposed to be a different platform
than the Java/WebSphere one we have seen in the past to manage TSM.
I'm looking forward to it as well.


Jaime



Quoting Marc A Kaplan <makap...@us.ibm.com>:


IBM HSM products have always supported unprivileged, user triggered recall
of any file.  I am not familiar with any particular GUI, but from the CLI,
it's easy enough:

dd if=/pathtothefileyouwantrecalled  of=/dev/null bs=1M count=2  &  #
pulling the first few blocks will trigger a complete recall if the file
happens to be on HSM

We also had IBM HSM for mainframe MVS, years and years ago, which is now
called DFHSM for  z/OS.   (I remember using this from TSO...)

If the file has been migrated to a tape archive, accessing the file will
trigger a tape mount, which can take a while depending on how fast your
tape mounting (robot?) operates and what other requests may be queued
ahead of yours!











 
  TELL US ABOUT YOUR SUCCESS STORIES
 http://www.scinethpc.ca/testimonials
 ****
---
Jaime Pinto
SciNet HPC Consortium  - Compute/Calcul Canada
www.scinet.utoronto.ca - www.computecanada.org
University of Toronto
256 McCaul Street, Room 235
Toronto, ON, M5T1W5
P: 416-978-2755
C: 416-505-1477


This message was sent using IMP at SciNet Consortium, University of Toronto.


___
gpfsug-discuss mailing list
gpfsug-discuss at spectrumscale.org
http://gpfsug.org/mailman/listinfo/gpfsug-discuss


Re: [gpfsug-discuss] backup and disaster recovery solutions

2016-04-11 Thread Jaime Pinto
I heard as recently as last Friday from IBM support/vendors/developers
of GPFS/TSM/HSM that the newest release of Spectrum Protect (7.11)
offers a GUI that is user centric, and will allow unprivileged users
to restore their own material via a newer WebGUI (one that also works
with Firefox, Chrome and on Linux, not only IE on Windows). Users may
authenticate via AD or LDAP, and traverse only what they would be
allowed to via Linux permissions and ACLs.


Jaime

Quoting Jonathan Buzzard <jonat...@buzzard.me.uk>:


On Mon, 2016-04-11 at 10:34 -0400, Jaime Pinto wrote:

Do you want backups or periodic frozen snapshots of the file system?

Backups can entail some level of version control, so that you or
end-users can get files back at certain points in time, in case of
accidental deletions. Besides, 1.5PB is a lot of material, so you may
not want to take full snapshots that often. In that case, a
combination of daily incremental backups using TSM with GPFS's
mmbackup can be a good option. TSM also does a very good job at
controlling how material is distributed across multiple tapes, and
that is something that requires a lot of micro-management if you want
a home-grown solution of rsync+LTFS.


Is there any other viable option other than TSM for backing up 1.5PB of
data? All other backup software does not handle this at all well.


On the other hand, you could use GPFS built-in tools such as
mmapplypolicy to identify candidates for incremental backup, and send
them to LTFS. Just more micro-management, and you may have to come up
with your own tool to let end-users restore their stuff, or you'll
have to act on their behalf.



I was not aware of a way of letting end users restore their stuff from
*backup* for any of the major backup software while respecting the file
system level security of the original file system. If you let the end
user have access to the backup they can restore any file to any location
which is generally not a good idea.

I do have a concept of creating a read only Fuse mounted file system
from a TSM point in time synthetic backup, and then using the shadow
copy feature of Samba to enable restores using the "Previous Versions"
feature of windows file manager.

I got as far as getting a directory tree you could browse through but
then had an enforced change of jobs and don't have access to a TSM
server any more to continue development.

Note if anyone from IBM is listening that would be a super cool feature.


JAB.

--
Jonathan A. Buzzard Email: jonathan (at) buzzard.me.uk
Fife, United Kingdom.


___
gpfsug-discuss mailing list
gpfsug-discuss at spectrumscale.org
http://gpfsug.org/mailman/listinfo/gpfsug-discuss




---
Jaime Pinto
SciNet HPC Consortium  - Compute/Calcul Canada
www.scinet.utoronto.ca - www.computecanada.org
University of Toronto
256 McCaul Street, Room 235
Toronto, ON, M5T1W5
P: 416-978-2755
C: 416-505-1477


This message was sent using IMP at SciNet Consortium, University of Toronto.


___
gpfsug-discuss mailing list
gpfsug-discuss at spectrumscale.org
http://gpfsug.org/mailman/listinfo/gpfsug-discuss


Re: [gpfsug-discuss] GPFS(snapshot, backup) vs. GPFS(backup scripts) vs. TSM(backup)

2016-03-19 Thread Jaime Pinto

OK, that is good to know.
I'll give it a try with snapshots then. We already have 3.5 almost
everywhere, and are planning a 4.2 upgrade (reading the posts with
interest).

Thanks
Jaime

Quoting Yuri L Volobuev <volob...@us.ibm.com>:




Under both 3.2 and 3.3 mmbackup would always lock up our cluster when
using snapshot. I never understood the behavior without snapshot, and
the lock up was intermittent in the carved-out small test cluster, so
I never felt confident enough to deploy over the larger 4000+ clients
cluster.


Back then, GPFS code had a deficiency: migrating very large files didn't
work well with snapshots (and with some mm command operations).  In order to
create a snapshot, we have to have the file system in a consistent state
for a moment, and we get there by performing a "quiesce" operation.  This
is done by flushing all dirty buffers to disk, stopping any new incoming
file system operations at the gates, and waiting for all in-flight
operations to finish.  This works well when all in-flight operations
actually finish reasonably quickly.  That assumption was broken if an
external utility, e.g. mmapplypolicy, used gpfs_restripe_file API on a very
large file, e.g. to migrate the file's blocks to a different storage pool.
The quiesce operation would need to wait for that API call to finish, as
it's an in-flight operation, but migrating a multi-TB file could take a
while, and during this time all new file system ops would be blocked.  This
was solved several years ago by changing the API and its callers to do the
migration one block range at a time, thus making each individual syscall
short and allowing quiesce to barge in and do its thing.  All currently
supported levels of GPFS have this fix.  I believe mmbackup was affected by
the same GPFS deficiency and benefited from the same fix.

yuri








 
  TELL US ABOUT YOUR SUCCESS STORIES
 http://www.scinethpc.ca/testimonials
 ****
---
Jaime Pinto
SciNet HPC Consortium  - Compute/Calcul Canada
www.scinet.utoronto.ca - www.computecanada.org
University of Toronto
256 McCaul Street, Room 235
Toronto, ON, M5T1W5
P: 416-978-2755
C: 416-505-1477


This message was sent using IMP at SciNet Consortium, University of Toronto.


___
gpfsug-discuss mailing list
gpfsug-discuss at spectrumscale.org
http://gpfsug.org/mailman/listinfo/gpfsug-discuss


[gpfsug-discuss] Use of commodity HDs on large GPFS client base clusters?

2016-03-15 Thread Jaime Pinto
I'd like to hear about performance considerations from sites that may
be using "non-IBM sanctioned" storage hardware or appliances, as
opposed to DDN, GSS, ESS (we have all of these).


For instance, how would that compare with ESS, which I understand has
some sort of "dispersed parity" feature that substantially diminishes
rebuild time in case of HD failures?


I'm particularly interested in HPC sites with 5000+ clients mounting
such a commodity NSDs+HDs setup.


Thanks
Jaime


---
Jaime Pinto
SciNet HPC Consortium  - Compute/Calcul Canada
www.scinet.utoronto.ca - www.computecanada.org
University of Toronto
256 McCaul Street, Room 235
Toronto, ON, M5T1W5
P: 416-978-2755
C: 416-505-1477


This message was sent using IMP at SciNet Consortium, University of Toronto.


___
gpfsug-discuss mailing list
gpfsug-discuss at spectrumscale.org
http://gpfsug.org/mailman/listinfo/gpfsug-discuss


Re: [gpfsug-discuss] GPFS+TSM+HSM: staging vs. migration priority

2016-03-10 Thread Jaime Pinto

Hey Dominic

Just submitted a  new request:

Headline: GPFS+TSM+HSM: staging vs. migration priority

ID: 85292

Thank you
Jaime



Quoting Dominic Mueller-Wicke01 <dominic.muel...@de.ibm.com>:



Hi Jaime,

I received the same request from other customers as well.
Could you please open an RFE for the theme and send me the RFE ID? I will
discuss it with product management then. RFE Link:
https://www.ibm.com/developerworks/rfe/execute?use_case=changeRequestLanding_ID=0_ID=360=11=12

Greetings, Dominic.

__

Dominic Mueller-Wicke | IBM Spectrum Protect Development | Technical Lead |
+49 7034 64 32794 | dominic.muel...@de.ibm.com

Vorsitzende des Aufsichtsrats: Martina Koederitz; Geschäftsführung: Dirk
Wittkopp
Sitz der Gesellschaft: Böblingen; Registergericht: Amtsgericht Stuttgart,
HRB 243294



From:   Jaime Pinto <pi...@scinet.utoronto.ca>
To: gpfsug main discussion list <gpfsug-discuss@spectrumscale.org>,
Marc A Kaplan <makap...@us.ibm.com>
Cc: Dominic Mueller-Wicke01/Germany/IBM@IBMDE
Date:   09.03.2016 16:22
Subject:Re: [gpfsug-discuss] GPFS+TSM+HSM: staging vs. migration
priority



Interesting perspective Mark.

I'm inclined to think EBUSY would be more appropriate.

Jaime

Quoting Marc A Kaplan <makap...@us.ibm.com>:


For a write or create operation ENOSPC  would make some sense.
But if the file already exists and I'm just opening for read access I
would be very confused by ENOSPC.
How should the system respond:  "Sorry, I know about that file, I have it
safely stored away in HSM, but it is not available right now. Try again
later!"

EAGAIN or EBUSY might be the closest in ordinary language...
But EAGAIN is used when a system call is interrupted and can be retried
right away...
So EBUSY?

The standard return codes in Linux are:

#define EPERM        1  /* Operation not permitted */
#define ENOENT       2  /* No such file or directory */
#define ESRCH        3  /* No such process */
#define EINTR        4  /* Interrupted system call */
#define EIO          5  /* I/O error */
#define ENXIO        6  /* No such device or address */
#define E2BIG        7  /* Argument list too long */
#define ENOEXEC      8  /* Exec format error */
#define EBADF        9  /* Bad file number */
#define ECHILD      10  /* No child processes */
#define EAGAIN      11  /* Try again */
#define ENOMEM      12  /* Out of memory */
#define EACCES      13  /* Permission denied */
#define EFAULT      14  /* Bad address */
#define ENOTBLK     15  /* Block device required */
#define EBUSY       16  /* Device or resource busy */
#define EEXIST      17  /* File exists */
#define EXDEV       18  /* Cross-device link */
#define ENODEV      19  /* No such device */
#define ENOTDIR     20  /* Not a directory */
#define EISDIR      21  /* Is a directory */
#define EINVAL      22  /* Invalid argument */
#define ENFILE      23  /* File table overflow */
#define EMFILE      24  /* Too many open files */
#define ENOTTY      25  /* Not a typewriter */
#define ETXTBSY     26  /* Text file busy */
#define EFBIG       27  /* File too large */
#define ENOSPC      28  /* No space left on device */
#define ESPIPE      29  /* Illegal seek */
#define EROFS       30  /* Read-only file system */
#define EMLINK      31  /* Too many links */
#define EPIPE       32  /* Broken pipe */
#define EDOM        33  /* Math argument out of domain of func */
#define ERANGE      34  /* Math result not representable */














  
   TELL US ABOUT YOUR SUCCESS STORIES
  http://www.scinethpc.ca/testimonials
  ****
---
Jaime Pinto
SciNet HPC Consortium  - Compute/Calcul Canada
www.scinet.utoronto.ca - www.computecanada.org
University of Toronto
256 McCaul Street, Room 235
Toronto, ON, M5T1W5
P: 416-978-2755
C: 416-505-1477


This message was sent using IMP at SciNet Consortium, University of
Toronto.












 
  TELL US ABOUT YOUR SUCCESS STORIES
 http://www.scinethpc.ca/testimonials
 ****
---
Jaime Pinto
SciNet HPC Consortium  - Compute/Calcul Canada
www.scinet.utoronto.ca - www.computecanada.org
University of Toronto
256 McCaul Street, Room 235
Toronto, ON, M5T1W5
P: 416-978-2755
C: 416-505-1477


This message was sent using IMP at 

Re: [gpfsug-discuss] GPFS(snapshot, backup) vs. GPFS(backup scripts) vs. TSM(backup)

2016-03-09 Thread Jaime Pinto

Quoting Yaron Daniel <y...@il.ibm.com>:


Hi

Did u use mmbackup with TSM ?

https://www.ibm.com/support/knowledgecenter/STXKQY_4.2.0/com.ibm.spectrum.scale.v4r2.adm.doc/bl1adm_mmbackup.htm


I have used mmbackup in test mode a few times before, while under gpfs
3.2 and 3.3, but not yet under 3.5 or the 4.x series (not installed in
our facility yet).


Under both 3.2 and 3.3 mmbackup would always lock up our cluster when  
using snapshot. I never understood the behavior without snapshot, and  
the lock up was intermittent in the carved-out small test cluster, so  
I never felt confident enough to deploy over the larger 4000+ clients  
cluster.


Another issue was that the version of mmbackup then would not let me
choose the client environment associated with a particular gpfs file
system, fileset or path, and the equivalent storage pool and/or
policy on the TSM side.


With the native TSM client we can do this by configuring the dsmenv  
file, and even the NODEMANE/ASNODE, etc, with which to access TSM, so  
we can keep the backups segregated on different pools/tapes if  
necessary (by user, by group, by project, etc)


The problem we all agree on is that TSM client traversal is VERY
SLOW, and cannot be parallelized. I always knew that the mmbackup
client was supposed to replace the TSM client for the traversal, and
then pass the "necessary parameters" and files to the native TSM
client, so it could take over for the remainder of the workflow.


Therefore, the remaining problems are as follows:
* I never understood the snapshot-induced lockup, and how to fix it.
Was it due to the size of our cluster or the version of GPFS? Has it
been addressed under the 3.5 or 4.x series? Without the snapshot, how
would mmbackup know what has already gone to backup since the previous
incremental backup? Does it check each file against what is already on
TSM to build the list of candidates? What is the experience out there?


* In the v4r2 version of the manual for the mmbackup utility we still
don't seem to be able to specify which TSM BA Client dsmenv to use
as a parameter. All we can do is choose the --tsm-servers
TSMServer[,TSMServer...] . I can only conclude that all the contents
of any backup on the GPFS side will always end up in a default storage
pool and use the standard TSM policy if nothing else is done. I'm now
wondering if it would be ok to simply 'source dsmenv' from a shell for
each instance of mmbackup we fire up, in addition to setting the
other variables such as MMBACKUP_DSMC_MISC, MMBACKUP_DSMC_BACKUP,
etc., as described on the man page.
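
In other words, a per-filesystem wrapper roughly like this (a sketch
only; the dsmenv file, server stanza, node names and paths are
placeholders, and I still need to verify that mmbackup honours the
sourced environment):

#!/bin/bash
# pick the TSM client environment for this filesystem, then fire up mmbackup
. /opt/tivoli/tsm/client/ba/bin/dsmenv.scratch     # sets DSM_DIR/DSM_CONFIG for this backup node
export MMBACKUP_DSMC_MISC="-asnodename=SCRATCH"    # extra options for mmbackup to pass to dsmc
/usr/lpp/mmfs/bin/mmbackup /gpfs/scratch -t incremental --tsm-servers TSM_SCRATCH -N nsd01,nsd02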


* What about the restore side of things? Most mm* commands can only be
executed by root. Would we still have to rely on the TSM BA Client
(dsmc|dsmj) if unprivileged users want to restore their own stuff?


I guess I'll have to conduct more experiments.





Please also review this :

http://files.gpfsug.org/presentations/2015/SBENDER-GPFS_UG_UK_2015-05-20.pdf



This is pretty good as a high-level overview. Much better than a few
others I've seen with the release of the Spectrum Suite, since it
focuses entirely on GPFS/TSM/backup (and HSM). It would be nice to have
some typical implementation examples.




Thanks a lot for the references Yaron, and again thanks for any  
further comments.

Jaime





Regards





Yaron Daniel
 94 Em Ha'Moshavot Rd

Server, Storage and Data Services - Team Leader
 Petach Tiqva, 49527
Global Technology Services
 Israel
Phone:
+972-3-916-5672


Fax:
+972-3-916-5672


Mobile:
+972-52-8395593


e-mail:
y...@il.ibm.com


IBM Israel







gpfsug-discuss-boun...@spectrumscale.org wrote on 03/09/2016 09:56:13 PM:


From: Jaime Pinto <pi...@scinet.utoronto.ca>
To: gpfsug main discussion list <gpfsug-discuss@spectrumscale.org>
Date: 03/09/2016 09:56 PM
Subject: [gpfsug-discuss] GPFS(snapshot, backup) vs. GPFS(backup
scripts) vs. TSM(backup)
Sent by: gpfsug-discuss-boun...@spectrumscale.org

Here is another area where I've been reading material from several
sources for years, and in fact trying one solution over the other from
time to time in a test environment. However, to date I have not been
able to find a single document where all these different IBM
alternatives for backup are discussed at length, with the pros and cons
well explained, along with the how-to's.

I'm currently using TSM(built-in backup client), and over the years I
developed a set of tricks to rely on disk based volumes as
intermediate cache, and multiple backup client nodes, to split the
load and substantially improve the performance of the backup compared
to when I first deployed this solution. However I suspect it could
still be improved further if I was to apply tools from the GPFS side
of the equation.

I would appreciate any comments/pointers.

Thanks
Jaime





---
Jaime Pinto
SciNet HPC Consortium  - Compute/Calcul Canada
www.scinet.utoronto.ca - www.computecanada.org
University of Toronto

[gpfsug-discuss] GPFS(snapshot, backup) vs. GPFS(backup scripts) vs. TSM(backup)

2016-03-09 Thread Jaime Pinto
Here is another area where I've been reading material from several
sources for years, and in fact trying one solution over the other from
time to time in a test environment. However, to date I have not been
able to find a single document where all these different IBM
alternatives for backup are discussed at length, with the pros and cons
well explained, along with the how-to's.


I'm currently using TSM(built-in backup client), and over the years I  
developed a set of tricks to rely on disk based volumes as  
intermediate cache, and multiple backup client nodes, to split the  
load and substantially improve the performance of the backup compared  
to when I first deployed this solution. However I suspect it could  
still be improved further if I was to apply tools from the GPFS side  
of the equation.


I would appreciate any comments/pointers.

Thanks
Jaime





---
Jaime Pinto
SciNet HPC Consortium  - Compute/Calcul Canada
www.scinet.utoronto.ca - www.computecanada.org
University of Toronto
256 McCaul Street, Room 235
Toronto, ON, M5T1W5
P: 416-978-2755
C: 416-505-1477


This message was sent using IMP at SciNet Consortium, University of Toronto.


___
gpfsug-discuss mailing list
gpfsug-discuss at spectrumscale.org
http://gpfsug.org/mailman/listinfo/gpfsug-discuss


Re: [gpfsug-discuss] GPFS+TSM+HSM: staging vs. migration priority

2016-03-09 Thread Jaime Pinto

Interesting perspective Mark.

I'm inclined to think EBUSY would be more appropriate.

Jaime

Quoting Marc A Kaplan <makap...@us.ibm.com>:


For a write or create operation ENOSPC  would make some sense.
But if the file already exists and I'm just opening for read access I
would be very confused by ENOSPC.
How should the system respond:  "Sorry, I know about that file, I have it
safely stored away in HSM, but it is not available right now. Try again
later!"

EAGAIN or EBUSY might be the closest in ordinary language...
But EAGAIN is used when a system call is interrupted and can be retried
right away...
So EBUSY?

The standard return codes in Linux are:

#define EPERM        1  /* Operation not permitted */
#define ENOENT       2  /* No such file or directory */
#define ESRCH        3  /* No such process */
#define EINTR        4  /* Interrupted system call */
#define EIO          5  /* I/O error */
#define ENXIO        6  /* No such device or address */
#define E2BIG        7  /* Argument list too long */
#define ENOEXEC      8  /* Exec format error */
#define EBADF        9  /* Bad file number */
#define ECHILD      10  /* No child processes */
#define EAGAIN      11  /* Try again */
#define ENOMEM      12  /* Out of memory */
#define EACCES      13  /* Permission denied */
#define EFAULT      14  /* Bad address */
#define ENOTBLK     15  /* Block device required */
#define EBUSY       16  /* Device or resource busy */
#define EEXIST      17  /* File exists */
#define EXDEV       18  /* Cross-device link */
#define ENODEV      19  /* No such device */
#define ENOTDIR     20  /* Not a directory */
#define EISDIR      21  /* Is a directory */
#define EINVAL      22  /* Invalid argument */
#define ENFILE      23  /* File table overflow */
#define EMFILE      24  /* Too many open files */
#define ENOTTY      25  /* Not a typewriter */
#define ETXTBSY     26  /* Text file busy */
#define EFBIG       27  /* File too large */
#define ENOSPC      28  /* No space left on device */
#define ESPIPE      29  /* Illegal seek */
#define EROFS       30  /* Read-only file system */
#define EMLINK      31  /* Too many links */
#define EPIPE       32  /* Broken pipe */
#define EDOM        33  /* Math argument out of domain of func */
#define ERANGE      34  /* Math result not representable */














 
  TELL US ABOUT YOUR SUCCESS STORIES
 http://www.scinethpc.ca/testimonials
 ****
---
Jaime Pinto
SciNet HPC Consortium  - Compute/Calcul Canada
www.scinet.utoronto.ca - www.computecanada.org
University of Toronto
256 McCaul Street, Room 235
Toronto, ON, M5T1W5
P: 416-978-2755
C: 416-505-1477


This message was sent using IMP at SciNet Consortium, University of Toronto.


___
gpfsug-discuss mailing list
gpfsug-discuss at spectrumscale.org
http://gpfsug.org/mailman/listinfo/gpfsug-discuss


[gpfsug-discuss] GPFS+TSM+HSM: staging vs. migration priority

2016-03-08 Thread Jaime Pinto
I'm wondering whether the new version of the "Spectrum Suite" will
allow us to set the priority of HSM migration higher than that of
staging.



I ask this because back in 2011, when we were still using Tivoli HSM
with GPFS, during mixed requests for migration and staging operations
we had a very annoying behavior in which staging would always take
precedence over migration. The end result was that GPFS would fill
up to 100% and induce a deadlock on the cluster, unless we identified
all the user-driven stage requests in time and killed them all. We
contacted IBM support a few times asking for a way to fix this, and were
told it was built into TSM. Back then we gave up on IBM's HSM primarily
for this reason, although performance was also a consideration (more
on this in another post).


We are now reconsidering HSM for a new deployment, however only if  
this issue has been resolved (among a few others).


What has been some of the experience out there?

Thanks
Jaime




---
Jaime Pinto
SciNet HPC Consortium  - Compute/Calcul Canada
www.scinet.utoronto.ca - www.computecanada.org
University of Toronto
256 McCaul Street, Room 235
Toronto, ON, M5T1W5
P: 416-978-2755
C: 416-505-1477


This message was sent using IMP at SciNet Consortium, University of Toronto.


___
gpfsug-discuss mailing list
gpfsug-discuss at spectrumscale.org
http://gpfsug.org/mailman/listinfo/gpfsug-discuss


Re: [gpfsug-discuss] mmlsnode: Unable to determine the local node identity.

2016-02-10 Thread Jaime Pinto

Quoting "Buterbaugh, Kevin L" <kevin.buterba...@vanderbilt.edu>:


Hi Jaime,

Have you tried wiping out /var/mmfs/gen/* and /var/mmfs/etc/* on the  
 old nodeA?


Kevin


That did the trick.
Thanks Kevin and all that responded privately.
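
For the record, the whole dance boiled down to roughly this (a sketch;
the node name is a placeholder, adjust to your cluster):

# on the repaired node, with GPFS stopped, wipe the stale local configuration
mmshutdown
rm -rf /var/mmfs/gen/* /var/mmfs/etc/*
# then, from a healthy node in the cluster, remove and re-add it
mmdelnode -N nodeB
mmaddnode -N nodeB
mmstartup -N nodeB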

Jaime






On Feb 10, 2016, at 1:26 PM, Jaime Pinto <pi...@scinet.utoronto.ca> wrote:

Dear group

I'm trying to deal with this in the most elegant way possible:

Once upon a time there were nodeA and nodeB in the cluster, in an
'on-demand manual HA' fashion.


* nodeA died, so I migrated the whole OS/software/application stack  
 from backup over to 'nodeB', IP/hostname, etc, hence 'old nodeB'   
effectively became the new nodeA.


* Getting the new nodeA to rejoin the cluster was already a pain,   
but through a mmdelnode and mmaddnode operation we eventually got   
it to mount gpfs.


Well ...

* Old nodeA is now fixed and back on the network, and I'd like to
re-purpose it as the new standby nodeB (IP and hostname already
applied). As the subject says, I'm now facing node identity issues.
From the FS manager I already tried to del/add nodeB, even nodeA, etc.,
however GPFS seems to keep some information cached somewhere in the
cluster.


* At this point I even turned old nodeA into a nodeC with a   
different IP, etc, but that doesn't help either. I can't even start  
 gpfs on nodeC.


Question: what is the appropriate process to clean this mess from   
the GPFS perspective?


I can't touch the new nodeA. It's highly committed in production already.

Thanks
Jaime






********
---
Jaime Pinto
SciNet HPC Consortium  - Compute/Calcul Canada
www.scinet.utoronto.ca - www.computecanada.org
University of Toronto
256 McCaul Street, Room 235
Toronto, ON, M5T1W5
P: 416-978-2755
C: 416-505-1477


This message was sent using IMP at SciNet Consortium, University of Toronto.


___
gpfsug-discuss mailing list
gpfsug-discuss at spectrumscale.org
http://gpfsug.org/mailman/listinfo/gpfsug-discuss










 
  TELL US ABOUT YOUR SUCCESS STORIES
 http://www.scinethpc.ca/testimonials
 ********
---
Jaime Pinto
SciNet HPC Consortium  - Compute/Calcul Canada
www.scinet.utoronto.ca - www.computecanada.org
University of Toronto
256 McCaul Street, Room 235
Toronto, ON, M5T1W5
P: 416-978-2755
C: 416-505-1477


This message was sent using IMP at SciNet Consortium, University of Toronto.


___
gpfsug-discuss mailing list
gpfsug-discuss at spectrumscale.org
http://gpfsug.org/mailman/listinfo/gpfsug-discuss