Re: [gpfsug-discuss] Migrate/syncronize data from Isilon to Scale over NFS?

2020-12-04 Thread Dugan, Michael J
I have a cluster with two filesystems and I need to migrate a fileset from one 
to the other.
I would normally do this with tar and rsync but I decided to experiment with 
AFM following
the document below. In my test setup I'm finding that hardlinks are not 
preserved by the migration.
Is that expected or am I doing something wrong?

I'm using gpfs-5.0.5.4.

Thanks.

--Mike


--
Michael J. Dugan
Manager of Systems Programming and Administration
Research Computing Services | IS&T | Boston University
617-358-0030
du...@bu.edu
http://www.bu.edu/tech



From: gpfsug-discuss-boun...@spectrumscale.org 
 on behalf of Venkateswara R Puvvada 

Sent: Monday, November 23, 2020 9:41 PM
To: gpfsug main discussion list 
Cc: gpfsug-discuss-boun...@spectrumscale.org 

Subject: Re: [gpfsug-discuss] Migrate/syncronize data from Isilon to Scale over 
NFS?

AFM provides near-zero downtime for migration. As of today, AFM migration 
does not support migrating ACLs or other EAs from a non-Scale (GPFS) source.

https://www.ibm.com/support/knowledgecenter/STXKQY_5.1.0/com.ibm.spectrum.scale.v5r10.doc/bl1ins_uc_migrationusingafmmigrationenhancements.htm

~Venkat (vpuvv...@in.ibm.com)



From:"Frederick Stock" 
To:gpfsug-discuss@spectrumscale.org
Cc:gpfsug-discuss@spectrumscale.org
Date:11/17/2020 03:14 AM
Subject:    [EXTERNAL] Re: [gpfsug-discuss] Migrate/syncronize data from 
Isilon to Scale overNFS?
Sent by:gpfsug-discuss-boun...@spectrumscale.org




Have you considered using the AFM feature of Spectrum Scale?  I doubt it will 
provide any speed improvement but it would allow for data to be accessed as it 
was being migrated.

Fred
__
Fred Stock | IBM Pittsburgh Lab | 720-430-8821
sto...@us.ibm.com


- Original message -
From: Andi Christiansen 
Sent by: gpfsug-discuss-boun...@spectrumscale.org
To: "gpfsug-discuss@spectrumscale.org" 
Cc:
Subject: [EXTERNAL] [gpfsug-discuss] Migrate/syncronize data from Isilon to 
Scale over NFS?
Date: Mon, Nov 16, 2020 2:44 PM

Hi all,

i have got a case where a customer wants 700TB migrated from isilon to Scale 
and the only way for him is exporting the same directory on NFS from two 
different nodes...

as of now we are using multiple rsync processes on different parts of folders 
within the main directory. this is really slow and will take forever.. right 
now 14 rsync processes spread across 3 nodes fetching from 2..

does anyone know of a way to speed it up? right now we see from 1Gbit to 3Gbit 
if we are lucky(total bandwidth) and there is a total of 30Gbit from scale 
nodes and 20Gbits from isilon so we should be able to reach just under 20Gbit...


if anyone have any ideas they are welcome!


Thanks in advance
Andi Christiansen
___
gpfsug-discuss mailing list
gpfsug-discuss at spectrumscale.org
http://gpfsug.org/mailman/listinfo/gpfsug-discuss

___
gpfsug-discuss mailing list
gpfsug-discuss at spectrumscale.org
http://gpfsug.org/mailman/listinfo/gpfsug-discuss



___
gpfsug-discuss mailing list
gpfsug-discuss at spectrumscale.org
http://gpfsug.org/mailman/listinfo/gpfsug-discuss


Re: [gpfsug-discuss] Migrate/syncronize data from Isilon to Scale over NFS?

2020-11-27 Thread Venkateswara R Puvvada
Hi Yeep,

>If ACLs and other EAs migration from non scale is not supported by AFM, 
is there any 3rd party tool that could complement that when paired with 
AFM?

rsync can be used to just fix metadata like ACLs and EAs.  AFM does not 
revalidate the files with the source system if rsync changes the ACLs on them, 
so ACLs can only be fixed during or after the cutover.  ACL inheritance 
may be used by setting ACLs on the required parent dirs upfront, if that 
option is sufficient; there was a user who migrated to Scale using this 
method.
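
For example, a metadata-only fix-up pass at cutover could look roughly like this
(a sketch only: the mount points are placeholders and the interaction with AFM
revalidation would need testing):

  # -A copies ACLs, -X copies other extended attributes; --existing avoids
  # creating files that AFM has not cached yet. Unchanged file data is not resent.
  rsync -a -A -X --existing /mnt/isilon_export/ /gpfs/fs1/migrated_fileset/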

~Venkat (vpuvv...@in.ibm.com)



From:   "T.A. Yeep" 
To: gpfsug main discussion list 
Cc: gpfsug-discuss-boun...@spectrumscale.org
Date:   11/24/2020 07:40 PM
Subject:    [EXTERNAL] Re: [gpfsug-discuss] Migrate/syncronize data 
from Isilon to Scale over NFS?
Sent by:gpfsug-discuss-boun...@spectrumscale.org



Hi Venkat,

If ACLs and other EAs migration from non scale is not supported by AFM, is 
there any 3rd party tool that could complement that when paired with AFM?

On Tue, Nov 24, 2020 at 10:41 AM Venkateswara R Puvvada <
vpuvv...@in.ibm.com> wrote:
AFM provides near-zero downtime for migration. As of today, AFM 
migration does not support migrating ACLs or other EAs from a non-Scale 
(GPFS) source.

https://www.ibm.com/support/knowledgecenter/STXKQY_5.1.0/com.ibm.spectrum.scale.v5r10.doc/bl1ins_uc_migrationusingafmmigrationenhancements.htm


~Venkat (vpuvv...@in.ibm.com)



From:"Frederick Stock" 
To:gpfsug-discuss@spectrumscale.org
Cc:gpfsug-discuss@spectrumscale.org
Date:11/17/2020 03:14 AM
Subject:    [EXTERNAL] Re: [gpfsug-discuss] Migrate/syncronize data 
from Isilon to Scale overNFS?
Sent by:gpfsug-discuss-boun...@spectrumscale.org



Have you considered using the AFM feature of Spectrum Scale?  I doubt it 
will provide any speed improvement but it would allow for data to be 
accessed as it was being migrated.

Fred
__
Fred Stock | IBM Pittsburgh Lab | 720-430-8821
sto...@us.ibm.com
 
 
- Original message -
From: Andi Christiansen 
Sent by: gpfsug-discuss-boun...@spectrumscale.org
To: "gpfsug-discuss@spectrumscale.org" 
Cc:
Subject: [EXTERNAL] [gpfsug-discuss] Migrate/syncronize data from Isilon 
to Scale over NFS?
Date: Mon, Nov 16, 2020 2:44 PM
 
Hi all,
 
i have got a case where a customer wants 700TB migrated from isilon to 
Scale and the only way for him is exporting the same directory on NFS from 
two different nodes...
 
as of now we are using multiple rsync processes on different parts of 
folders within the main directory. this is really slow and will take 
forever.. right now 14 rsync processes spread across 3 nodes fetching from 
2.. 
 
does anyone know of a way to speed it up? right now we see from 1Gbit to 
3Gbit if we are lucky(total bandwidth) and there is a total of 30Gbit from 
scale nodes and 20Gbits from isilon so we should be able to reach just 
under 20Gbit...
 
 
if anyone have any ideas they are welcome! 


Thanks in advance 
Andi Christiansen
___
gpfsug-discuss mailing list
gpfsug-discuss at spectrumscale.org
http://gpfsug.org/mailman/listinfo/gpfsug-discuss
 
___
gpfsug-discuss mailing list
gpfsug-discuss at spectrumscale.org
http://gpfsug.org/mailman/listinfo/gpfsug-discuss



___
gpfsug-discuss mailing list
gpfsug-discuss at spectrumscale.org
http://gpfsug.org/mailman/listinfo/gpfsug-discuss


-- 
Best regards 
T.A. Yeep
Mobile: +6-016-719 8506 | Tel: +6-03-7628 0526 | www.robusthpc.com


___
gpfsug-discuss mailing list
gpfsug-discuss at spectrumscale.org
http://gpfsug.org/mailman/listinfo/gpfsug-discuss 





___
gpfsug-discuss mailing list
gpfsug-discuss at spectrumscale.org
http://gpfsug.org/mailman/listinfo/gpfsug-discuss


Re: [gpfsug-discuss] Migrate/syncronize data from Isilon to Scale over NFS?

2020-11-24 Thread T.A. Yeep
Hi Venkat,

If migration of ACLs and other EAs from a non-Scale source is not supported by AFM, is
there any 3rd-party tool that could complement that when paired with AFM?

On Tue, Nov 24, 2020 at 10:41 AM Venkateswara R Puvvada 
wrote:

> AFM provides near zero downtime for migration.  As of today,  AFM
> migration does not support ACLs or other EAs migration from non scale
> (GPFS) source.
>
>
> https://www.ibm.com/support/knowledgecenter/STXKQY_5.1.0/com.ibm.spectrum.scale.v5r10.doc/bl1ins_uc_migrationusingafmmigrationenhancements.htm
>
> ~Venkat (vpuvv...@in.ibm.com)
>
>
>
> From:"Frederick Stock" 
> To:gpfsug-discuss@spectrumscale.org
> Cc:gpfsug-discuss@spectrumscale.org
> Date:        11/17/2020 03:14 AM
> Subject:    [EXTERNAL] Re: [gpfsug-discuss] Migrate/syncronize data
> from Isilon to Scale overNFS?
> Sent by:gpfsug-discuss-boun...@spectrumscale.org
> --
>
>
>
> Have you considered using the AFM feature of Spectrum Scale?  I doubt it
> will provide any speed improvement but it would allow for data to be
> accessed as it was being migrated.
>
> Fred
> __
> Fred Stock | IBM Pittsburgh Lab | 720-430-8821
> sto...@us.ibm.com
>
>
> - Original message -
> From: Andi Christiansen 
> Sent by: gpfsug-discuss-boun...@spectrumscale.org
> To: "gpfsug-discuss@spectrumscale.org" 
> Cc:
> Subject: [EXTERNAL] [gpfsug-discuss] Migrate/syncronize data from Isilon
> to Scale over NFS?
> Date: Mon, Nov 16, 2020 2:44 PM
>
> Hi all,
>
> i have got a case where a customer wants 700TB migrated from isilon to
> Scale and the only way for him is exporting the same directory on NFS from
> two different nodes...
>
> as of now we are using multiple rsync processes on different parts of
> folders within the main directory. this is really slow and will take
> forever.. right now 14 rsync processes spread across 3 nodes fetching from
> 2..
>
> does anyone know of a way to speed it up? right now we see from 1Gbit to
> 3Gbit if we are lucky(total bandwidth) and there is a total of 30Gbit from
> scale nodes and 20Gbits from isilon so we should be able to reach just
> under 20Gbit...
>
>
> if anyone have any ideas they are welcome!
>
>
> Thanks in advance
> Andi Christiansen
> ___
> gpfsug-discuss mailing list
> gpfsug-discuss at spectrumscale.org
> *http://gpfsug.org/mailman/listinfo/gpfsug-discuss*
> <http://gpfsug.org/mailman/listinfo/gpfsug-discuss>
>
> ___
> gpfsug-discuss mailing list
> gpfsug-discuss at spectrumscale.org
> http://gpfsug.org/mailman/listinfo/gpfsug-discuss
>
>
>
> ___
> gpfsug-discuss mailing list
> gpfsug-discuss at spectrumscale.org
> http://gpfsug.org/mailman/listinfo/gpfsug-discuss
>


-- 
Best regards

*T.A. Yeep*Mobile: +6-016-719 8506 | Tel: +6-03-7628 0526 |
www.robusthpc.com
___
gpfsug-discuss mailing list
gpfsug-discuss at spectrumscale.org
http://gpfsug.org/mailman/listinfo/gpfsug-discuss


Re: [gpfsug-discuss] Migrate/syncronize data from Isilon to Scale over NFS?

2020-11-23 Thread Venkateswara R Puvvada
AFM provides near-zero downtime for migration. As of today, AFM 
migration does not support migrating ACLs or other EAs from a non-Scale 
(GPFS) source.

https://www.ibm.com/support/knowledgecenter/STXKQY_5.1.0/com.ibm.spectrum.scale.v5r10.doc/bl1ins_uc_migrationusingafmmigrationenhancements.htm

~Venkat (vpuvv...@in.ibm.com)



From:   "Frederick Stock" 
To: gpfsug-discuss@spectrumscale.org
Cc: gpfsug-discuss@spectrumscale.org
Date:   11/17/2020 03:14 AM
Subject:[EXTERNAL] Re: [gpfsug-discuss] Migrate/syncronize data 
from Isilon to Scale over   NFS?
Sent by:gpfsug-discuss-boun...@spectrumscale.org



Have you considered using the AFM feature of Spectrum Scale?  I doubt it 
will provide any speed improvement but it would allow for data to be 
accessed as it was being migrated.

Fred
__
Fred Stock | IBM Pittsburgh Lab | 720-430-8821
sto...@us.ibm.com
 
 
- Original message -
From: Andi Christiansen 
Sent by: gpfsug-discuss-boun...@spectrumscale.org
To: "gpfsug-discuss@spectrumscale.org" 
Cc:
Subject: [EXTERNAL] [gpfsug-discuss] Migrate/syncronize data from Isilon 
to Scale over NFS?
Date: Mon, Nov 16, 2020 2:44 PM
 
Hi all,
 
i have got a case where a customer wants 700TB migrated from isilon to 
Scale and the only way for him is exporting the same directory on NFS from 
two different nodes...
 
as of now we are using multiple rsync processes on different parts of 
folders within the main directory. this is really slow and will take 
forever.. right now 14 rsync processes spread across 3 nodes fetching from 
2.. 
 
does anyone know of a way to speed it up? right now we see from 1Gbit to 
3Gbit if we are lucky(total bandwidth) and there is a total of 30Gbit from 
scale nodes and 20Gbits from isilon so we should be able to reach just 
under 20Gbit...
 
 
if anyone have any ideas they are welcome! 


Thanks in advance 
Andi Christiansen
___
gpfsug-discuss mailing list
gpfsug-discuss at spectrumscale.org
http://gpfsug.org/mailman/listinfo/gpfsug-discuss 
 
___
gpfsug-discuss mailing list
gpfsug-discuss at spectrumscale.org
http://gpfsug.org/mailman/listinfo/gpfsug-discuss 





___
gpfsug-discuss mailing list
gpfsug-discuss at spectrumscale.org
http://gpfsug.org/mailman/listinfo/gpfsug-discuss


Re: [gpfsug-discuss] Migrate/syncronize data from Isilon to Scale over NFS?

2020-11-18 Thread Chris Schlipalius
> I would counsel in the strongest possible terms against that approach.
>
> Basically you have to be assured that none of your file names have "wacky"
> characters in them, because handling "wacky" characters in file
> names is exceedingly difficult. I cannot stress how hard it is and the above
> example does not handle all "wacky" characters in file names.

 

Well, that's indeed another kettle of fish if you have irregular/special naming 
of files; no, I didn't cover that. And if you have millions of files, yes, a list 
would be unwieldy, so then I would be tarring up dirs before moving… and then 
untarring on GPFS, or breaking the list up into sets or sub-lists.

If you have these wacky types of file names, well, there are fixes as described 
in the rsync manpages… not easy, but possible.

 

I.e.:

 

1.   -s, --protect-args

 

2.   As per usual you can escape the spaces, or substitute for spaces. 
rsync -avuz u...@server1.com:"${remote_path// /\\ }" .

 

3.   Single quote the file name and path inside double quotes.
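
To illustrate options 1 and 3 above (the server name and paths are placeholders, 
not taken from the thread):

  # Option 1: --protect-args (-s) passes the remote name through without
  # remote-shell word splitting; local double quotes keep it one argument here
  rsync -av --protect-args "user@server1:/ifs/data/My Project Files/" /gpfs/fs1/dest/

  # Option 3: single quotes inside double quotes, so the remote shell also
  # sees the spaced path as a single argument
  rsync -av "user@server1:'/ifs/data/My Project Files/'" /gpfs/fs1/dest/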

 

 

 

> One thing I didn't mention is that I would run anything with in a screen (or
> tmux if that is your poison) and turn on logging.

 

Absolutely agree…

 

> For those interested I am in the process of cleaning up the script a bit and
> will post it somewhere in due course.
>
> JAB.

 

Would be interesting to see….

 

I’ve also had success on GPFS with DCP and possibly this would be another 
option 

 

Regards,

Chris Schlipalius

 

Team Lead, Data Storage Infrastructure, Supercomputing Platforms, Pawsey 
Supercomputing Centre (CSIRO)

1 Bryce Avenue

Kensington  WA  6151

Australia

 

___
gpfsug-discuss mailing list
gpfsug-discuss at spectrumscale.org
http://gpfsug.org/mailman/listinfo/gpfsug-discuss


Re: [gpfsug-discuss] Migrate/syncronize data from Isilon to Scale over NFS?

2020-11-18 Thread Valdis Klētnieks
On Wed, 18 Nov 2020 11:48:52 +, Jonathan Buzzard said:

> So what do I mean by "wacky" characters. Well remember a file name can
> have just about anything in it on Linux with the exception of '/', and

You want to see some fireworks?  At least at one time, it was possible to use
a file system debugger that's all too trusting of hexadecimal input and create
a directory entry of '../'. Let's just say that fs/namei.c was also far too 
trusting,
and fsck was more than happy to make *different* errors than the kernel was.

> The obvious ones are spaces, but it's not just ASCII 0x20, but tabs too.
> Then there is the use of the wildcard characters, especially '?' but
> also '*'.

Don't forget ESC, CR, LF, backticks, forward ticks, semicolons, and pretty much
anything else that will give a shell indigestion. SQL isn't the only thing 
prone to
injection attacks.. :)



___
gpfsug-discuss mailing list
gpfsug-discuss at spectrumscale.org
http://gpfsug.org/mailman/listinfo/gpfsug-discuss


Re: [gpfsug-discuss] Migrate/syncronize data from Isilon to Scale over NFS?

2020-11-18 Thread Andi Christiansen
Hi Jonathan,

I would be very interested in seeing your scripts when they are posted. Let me 
know where to get them!

Thanks a bunch!
Andi Christiansen

> On 11/18/2020 12:48 PM Jonathan Buzzard  wrote:
> 
>  
> On 17/11/2020 23:17, Chris Schlipalius wrote:
> > So at my last job we used to rsync data between isilons across campus, 
> > and isilon to Windows File Cluster (and back).
> > 
> > I recommend using dry run to generate a list of files and then use this 
> > to run with rysnc.
> > 
> > This allows you also to be able to break up the transfer into batches, 
> > and check if files have changed before sync (say if your isilon files 
> > are not RO.
> > 
> > Also ensure you have a recent version of rsync that preserves extended 
> > attributes and check your ACLS.
> > 
> > A dry run example:
> > 
> > https://unix.stackexchange.com/a/261372 
> > 
> > I always felt more comfortable having a list of files before a sync….
> > 
> 
> I would counsel in the strongest possible terms against that approach.
> 
> Basically you have to be assured that none of your file names have 
> "wacky" characters in them, because handling "wacky" characters in file 
> names is exceedingly difficult. I cannot stress how hard it is and the 
> above example does not handle all "wacky" characters in file names.
> 
> So what do I mean by "wacky" characters. Well remember a file name can 
> have just about anything in it on Linux with the exception of '/', and 
> users especially when using a GUI, and even more so if they are Mac 
> users can and do use what I will call "wacky" characters in their file 
> names.
> 
> The obvious ones are spaces, but it's not just ASCII 0x20, but tabs too. 
> Then there is the use of the wildcard characters, especially '?' but 
> also '*'.
> 
> Not too difficult to handle you might say. Right now deal with a file 
> name with a newline character in it :-) Don't ask me how or why you even 
> do that but let me assure you that I have seen them on more than one 
> occasion. And now your dry run list is broken...
> 
> Not only that if you have a few hundred million files to move a list 
> just becomes unwieldy anyway.
> 
> One thing I didn't mention is that I would run anything with in a screen 
> (or tmux if that is your poison) and turn on logging.
> 
> For those interested I am in the process of cleaning up the script a bit 
> and will post it somewhere in due course.
> 
> 
> JAB.
> 
> -- 
> Jonathan A. Buzzard Tel: +44141-5483420
> HPC System Administrator, ARCHIE-WeSt.
> University of Strathclyde, John Anderson Building, Glasgow. G4 0NG
> ___
> gpfsug-discuss mailing list
> gpfsug-discuss at spectrumscale.org
> http://gpfsug.org/mailman/listinfo/gpfsug-discuss
___
gpfsug-discuss mailing list
gpfsug-discuss at spectrumscale.org
http://gpfsug.org/mailman/listinfo/gpfsug-discuss


Re: [gpfsug-discuss] Migrate/syncronize data from Isilon to Scale over NFS?

2020-11-18 Thread Jonathan Buzzard

On 17/11/2020 23:17, Chris Schlipalius wrote:
So at my last job we used to rsync data between isilons across campus, 
and isilon to Windows File Cluster (and back).


I recommend using dry run to generate a list of files and then use this 
to run with rysnc.


This allows you also to be able to break up the transfer into batches, 
and check if files have changed before sync (say if your isilon files 
are not RO.


Also ensure you have a recent version of rsync that preserves extended 
attributes and check your ACLS.


A dry run example:

https://unix.stackexchange.com/a/261372 


I always felt more comfortable having a list of files before a sync….



I would counsel in the strongest possible terms against that approach.

Basically you have to be assured that none of your file names have 
"wacky" characters in them, because handling "wacky" characters in file 
names is exceedingly difficult. I cannot stress how hard it is and the 
above example does not handle all "wacky" characters in file names.


So what do I mean by "wacky" characters. Well remember a file name can 
have just about anything in it on Linux with the exception of '/', and 
users especially when using a GUI, and even more so if they are Mac 
users can and do use what I will call "wacky" characters in their file 
names.


The obvious ones are spaces, but it's not just ASCII 0x20, but tabs too. 
Then there is the use of the wildcard characters, especially '?' but 
also '*'.


Not too difficult to handle, you might say. Right, now deal with a file 
name with a newline character in it :-) Don't ask me how or why you would even 
do that, but let me assure you that I have seen them on more than one 
occasion. And now your dry run list is broken...
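
If you do want to hunt for such names, or sidestep the broken-list problem, a 
null-delimited pipeline is one way to do it. The paths below are placeholders, 
and GNU find plus bash are assumed:

  # Count file names containing a newline; $'...' is bash ANSI-C quoting
  find /ifs/data -name $'*\n*' -print0 | tr -dc '\0' | wc -c

  # A null-delimited list keeps "wacky" names intact all the way into rsync
  cd /ifs/data && find . -print0 | rsync -a --from0 --files-from=- . /gpfs/fs1/dest/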


Not only that, if you have a few hundred million files to move, a list 
just becomes unwieldy anyway.


One thing I didn't mention is that I would run anything like this within a screen 
(or tmux if that is your poison) and turn on logging.


For those interested I am in the process of cleaning up the script a bit 
and will post it somewhere in due course.



JAB.

--
Jonathan A. Buzzard Tel: +44141-5483420
HPC System Administrator, ARCHIE-WeSt.
University of Strathclyde, John Anderson Building, Glasgow. G4 0NG
___
gpfsug-discuss mailing list
gpfsug-discuss at spectrumscale.org
http://gpfsug.org/mailman/listinfo/gpfsug-discuss


Re: [gpfsug-discuss] Migrate/syncronize data from Isilon to Scale over NFS?

2020-11-17 Thread Chris Schlipalius
So at my last job we used to rsync data between isilons across campus, and 
isilon to Windows File Cluster (and back).

I recommend using a dry run to generate a list of files and then using this to run 
with rsync.

This also allows you to break up the transfer into batches, and to check whether 
files have changed before the sync (say, if your Isilon files are not read-only).

Also ensure you have a recent version of rsync that preserves extended 
attributes, and check your ACLs.

 

A dry run example:

https://unix.stackexchange.com/a/261372

 

I always felt more comfortable having a list of files before a sync….
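
For what it is worth, a rough outline of that dry-run-then-batch workflow (paths 
and the batch size are placeholders; the caveats elsewhere in this thread about 
odd file names apply to the generated list):

  # 1. Dry run to build a file list without transferring anything
  rsync -an --out-format='%n' /mnt/isilon/data/ /gpfs/fs1/data/ > filelist.txt

  # 2. Break the list into batches
  split -l 50000 filelist.txt batch.

  # 3. Run one rsync per batch (batches can be spread across nodes/sessions)
  rsync -av --files-from=batch.aa /mnt/isilon/data/ /gpfs/fs1/data/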

 

 

 

Regards,

Chris Schlipalius

 

Team Lead, Data Storage Infrastructure, Supercomputing Platforms, Pawsey 
Supercomputing Centre (CSIRO)

1 Bryce Avenue

Kensington  WA  6151

Australia

 

Tel  +61 8 6436 8815 

Email  chris.schlipal...@pawsey.org.au

Web  www.pawsey.org.au

 

 

___
gpfsug-discuss mailing list
gpfsug-discuss at spectrumscale.org
http://gpfsug.org/mailman/listinfo/gpfsug-discuss


Re: [gpfsug-discuss] Migrate/syncronize data from Isilon to Scale over NFS?

2020-11-17 Thread Andi Christiansen
Hi Jonathan,

Yes, you are correct! But we plan to resync this once or twice every week for 
the next 3-4 months to be sure everything is as it should be.

Right now we are focused on getting them synced up and then we will run 
scheduled resyncs/checks once or twice a week depending on the data growth :)

Thanks
Andi Christiansen

> On 11/17/2020 2:53 PM Jonathan Buzzard  wrote:
> 
>  
> On 17/11/2020 11:51, Andi Christiansen wrote:
> > Hi all,
> > 
> > thanks for all the information, there was some interesting things
> > amount it..
> > 
> > I kept on going with rsync and ended up making a file with all top
> > level user directories and splitting them into chunks of 347 per
> > rsync session(total 42000 ish folders). yesterday we had only 14
> > sessions with 3000 folders in each and that was too much work for one
> > rsync session..
> 
> Unless you use something similar to my DB suggestion it is almost 
> inevitable that some of those rsync sessions are going to have issues 
> and you will have no way to track it or even know it has happened unless 
> you do a single final giant catchup/check rsync.
> 
> I should add that a copy of the sqlite DB is cover your backside 
> protection when a user pops up claiming that you failed to transfer one 
> of their vitally important files six months down the line and the old 
> system is turned off and scrapped.
> 
> 
> JAB.
> 
> -- 
> Jonathan A. Buzzard Tel: +44141-5483420
> HPC System Administrator, ARCHIE-WeSt.
> University of Strathclyde, John Anderson Building, Glasgow. G4 0NG
> ___
> gpfsug-discuss mailing list
> gpfsug-discuss at spectrumscale.org
> http://gpfsug.org/mailman/listinfo/gpfsug-discuss
___
gpfsug-discuss mailing list
gpfsug-discuss at spectrumscale.org
http://gpfsug.org/mailman/listinfo/gpfsug-discuss


Re: [gpfsug-discuss] Migrate/syncronize data from Isilon to Scale over NFS?

2020-11-17 Thread Jonathan Buzzard

On 17/11/2020 15:55, Simon Thompson wrote:



Fortunately, we seem committed to GPFS so it might be we never have to do
another bulk transfer outside of the filesystem...


Until you want to move a v3 or v4 created file-system to v5 block sizes __


You forget the v2 to v3 switch for more than two billion files. Either 
that or you were not using it back then. Then there was the v3.2 one if you 
ever wanted to mount it on Windows.




I hopes we won't be doing that sort of thing again...



Yep, going to be recycling my scripts in the coming week for a v4 to v5 
with capacity upgrade on our DSS-G. That basically involves a trashing 
of the file system and a restore from backup.


Going to be doing the "your data will be restored based on a metric of 
how many files and how much data you have" ploy again :-)


I too hope that will be the last time I have to do anything similar but 
my experience of the last couple of decades says that is likely to be a 
forlorn hope :-(


I speculate that one day the 10,000 file set limit will be lifted, but 
only if you reformat your file system...


JAB.

--
Jonathan A. Buzzard Tel: +44141-5483420
HPC System Administrator, ARCHIE-WeSt.
University of Strathclyde, John Anderson Building, Glasgow. G4 0NG
___
gpfsug-discuss mailing list
gpfsug-discuss at spectrumscale.org
http://gpfsug.org/mailman/listinfo/gpfsug-discuss


Re: [gpfsug-discuss] Migrate/syncronize data from Isilon to Scale over NFS?

2020-11-17 Thread Simon Thompson


>Fortunately, we seem committed to GPFS so it might be we never have to do
>another bulk transfer outside of the filesystem...

Until you want to move a v3 or v4 created file-system to v5 block sizes __

I hope we won't be doing that sort of thing again...

Simon

___
gpfsug-discuss mailing list
gpfsug-discuss at spectrumscale.org
http://gpfsug.org/mailman/listinfo/gpfsug-discuss


Re: [gpfsug-discuss] Migrate/syncronize data from Isilon to Scale over NFS?

2020-11-17 Thread Skylar Thompson
On Tue, Nov 17, 2020 at 01:53:43PM +, Jonathan Buzzard wrote:
> On 17/11/2020 11:51, Andi Christiansen wrote:
> > Hi all,
> > 
> > thanks for all the information, there was some interesting things
> > amount it..
> > 
> > I kept on going with rsync and ended up making a file with all top
> > level user directories and splitting them into chunks of 347 per
> > rsync session(total 42000 ish folders). yesterday we had only 14
> > sessions with 3000 folders in each and that was too much work for one
> > rsync session..
> 
> Unless you use something similar to my DB suggestion it is almost inevitable
> that some of those rsync sessions are going to have issues and you will have
> no way to track it or even know it has happened unless you do a single final
> giant catchup/check rsync.
> 
> I should add that a copy of the sqlite DB is cover your backside protection
> when a user pops up claiming that you failed to transfer one of their
> vitally important files six months down the line and the old system is
> turned off and scrapped.

That's not a bad idea, and I like it more than the method I set up where we
captured the output of find from both sides of the transfer and preserved
it for posterity, but it obviously did require a hard-stop date on the source.

Fortunately, we seem committed to GPFS so it might be we never have to do
another bulk transfer outside of the filesystem...

-- 
-- Skylar Thompson (skyl...@u.washington.edu)
-- Genome Sciences Department (UW Medicine), System Administrator
-- Foege Building S046, (206)-685-7354
-- Pronouns: He/Him/His
___
gpfsug-discuss mailing list
gpfsug-discuss at spectrumscale.org
http://gpfsug.org/mailman/listinfo/gpfsug-discuss


Re: [gpfsug-discuss] Migrate/syncronize data from Isilon to Scale over NFS?

2020-11-17 Thread Jonathan Buzzard

On 17/11/2020 11:51, Andi Christiansen wrote:

Hi all,

thanks for all the information, there was some interesting things
amount it..

I kept on going with rsync and ended up making a file with all top
level user directories and splitting them into chunks of 347 per
rsync session(total 42000 ish folders). yesterday we had only 14
sessions with 3000 folders in each and that was too much work for one
rsync session..


Unless you use something similar to my DB suggestion it is almost 
inevitable that some of those rsync sessions are going to have issues 
and you will have no way to track it or even know it has happened unless 
you do a single final giant catchup/check rsync.


I should add that a copy of the sqlite DB is cover your backside 
protection when a user pops up claiming that you failed to transfer one 
of their vitally important files six months down the line and the old 
system is turned off and scrapped.



JAB.

--
Jonathan A. Buzzard Tel: +44141-5483420
HPC System Administrator, ARCHIE-WeSt.
University of Strathclyde, John Anderson Building, Glasgow. G4 0NG
___
gpfsug-discuss mailing list
gpfsug-discuss at spectrumscale.org
http://gpfsug.org/mailman/listinfo/gpfsug-discuss


Re: [gpfsug-discuss] Migrate/syncronize data from Isilon to Scale over NFS?

2020-11-17 Thread Andi Christiansen
Hi Jan,

We are syncing ACLs, groups, owners and timestamps as well :)

/Andi Christiansen

> On 11/17/2020 1:07 PM Jan-Frode Myklebust  wrote:
> 
> 
> Nice to see it working well!
> 
> But, what about ACLs? Does you rsync pull in all needed metadata, or do 
> you also need to sync ACLs ? Any plans for how to solve that ?
> 
> On Tue, Nov 17, 2020 at 12:52 PM Andi Christiansen 
>  wrote:
> 
> > > Hi all,
> > 
> > thanks for all the information, there was some interesting things 
> > amount it..
> > 
> > I kept on going with rsync and ended up making a file with all top 
> > level user directories and splitting them into chunks of 347 per rsync 
> > session(total 42000 ish folders). yesterday we had only 14 sessions with 
> > 3000 folders in each and that was too much work for one rsync session..
> > 
> > i divided them out among all GPFS nodes to have them fetch an area 
> > each and actually doing that 3 times on each node and that has now boosted 
> > the bandwidth usage from 3Gbit to around 16Gbit in total..
> > 
> > all nodes have been seing doing work above 7Gbit individual which 
> > is actually near to what i was expecting without any modifications to the 
> > NFS server or TCP tuning..
> > 
> > CPU is around 30-50% on each server and mostly below or around 30% 
> > so it seems like it could have handled abit more sessions..
> > 
> > Small files are really a killer but with all 96+ sessions we have 
> > now its not often all sessions are handling small files at the same time so 
> > we have an average of about 10-12Gbit bandwidth usage.
> > 
> > Thanks all! ill keep you in mind if for some reason we see it 
> > slowing down again but for now i think we will try to see if it will go the 
> > last mile with a bit more sessions on each :)
> > 
> > Best Regards
> > Andi Christiansen
> > 
> > > On 11/17/2020 9:57 AM Uwe Falke  > mailto:uwefa...@de.ibm.com > wrote:
> > >
> > > 
> > > Hi, Andi, sorry I just took your 20Gbit for the sign of 2x10Gbps 
> > bons, but
> > > it is over two nodes, so no bonding. But still, I'd expect to 
> > open several
> > > TCP connections in parallel per source-target pair  (like with 
> > several
> > > rsyncs per source node) would bear an advantage (and still I 
> > thing NFS
> > > doesn't do that, but I can be wrong).
> > > If more nodes have access to the Isilon data they could also 
> > participate
> > > (and don't need NFS exports for that).
> > >
> > > Mit freundlichen Grüßen / Kind regards
> > >
> > > Dr. Uwe Falke
> > > IT Specialist
> > > Hybrid Cloud Infrastructure / Technology Consulting & 
> > Implementation
> > > Services
> > > +49 175 575 2877 Mobile
> > > Rathausstr. 7, 09111 Chemnitz, Germany
> > > uwefa...@de.ibm.com mailto:uwefa...@de.ibm.com
> > >
> > > IBM Services
> > >
> > > IBM Data Privacy Statement
> > >
> > > IBM Deutschland Business & Technology Services GmbH
> > > Geschäftsführung: Sven Schooss, Stefan Hierl
> > > Sitz der Gesellschaft: Ehningen
> > > Registergericht: Amtsgericht Stuttgart, HRB 17122
> > >
> > >
> > >
> > > From:   Uwe Falke/Germany/IBM
> > > To: gpfsug main discussion list 
> > mailto:gpfsug-discuss@spectrumscale.org >
> > > Date:   17/11/2020 09:50
> > > Subject:Re: [EXTERNAL] [gpfsug-discuss] 
> > Migrate/syncronize data
> > > from Isilon to Scale over   NFS?
> > >
> > >
> > > Hi Andi,
> > >
> > > what about leaving NFS completeley out and using rsync  (multiple 
> > rsyncs
> > > in parallel, of course) directly between your source and target 
> > servers?
> > > I am not sure how many TCP connections (suppose it is NFS4) in 
> > parallel
> > > are opened between client and server, using a 2x bonded interface 
> > well
> > > requires at least two.  That combined with the DB approach 
> > suggested by
> > > Jonathan to control the activity of the rsync streams would be my 
> > best
> > > guess.
> > > If you have many small files, the overhead might still kill you. 
> > Tarring
> > > them up into larger aggregates for transfer would help a lot, but 
> > then you
> > > must be sure they won't change or you need to implement your own 
> > version
> > > control for that class of files.
> > >
> > > Mit freundlichen Grüßen / Kind regards
> > >
> > > Dr. Uwe Falke
> > > IT Specialist
> > > Hybrid Cloud Infrastructure / Technology Consulting & 
> > Implementation
> > > Services
> > > +49 175 575 2877 Mobile
> > > Rathausstr. 7, 09111 Chemnitz, German

Re: [gpfsug-discuss] Migrate/syncronize data from Isilon to Scale over NFS?

2020-11-17 Thread Jan-Frode Myklebust
Nice to see it working well!

But what about ACLs? Does your rsync pull in all needed metadata, or do you
also need to sync ACLs? Any plans for how to solve that?

On Tue, Nov 17, 2020 at 12:52 PM Andi Christiansen 
wrote:

> Hi all,
>
> thanks for all the information, there was some interesting things amount
> it..
>
> I kept on going with rsync and ended up making a file with all top level
> user directories and splitting them into chunks of 347 per rsync
> session(total 42000 ish folders). yesterday we had only 14 sessions with
> 3000 folders in each and that was too much work for one rsync session..
>
> i divided them out among all GPFS nodes to have them fetch an area each
> and actually doing that 3 times on each node and that has now boosted the
> bandwidth usage from 3Gbit to around 16Gbit in total..
>
> all nodes have been seing doing work above 7Gbit individual which is
> actually near to what i was expecting without any modifications to the NFS
> server or TCP tuning..
>
> CPU is around 30-50% on each server and mostly below or around 30% so it
> seems like it could have handled abit more sessions..
>
> Small files are really a killer but with all 96+ sessions we have now its
> not often all sessions are handling small files at the same time so we have
> an average of about 10-12Gbit bandwidth usage.
>
> Thanks all! ill keep you in mind if for some reason we see it slowing down
> again but for now i think we will try to see if it will go the last mile
> with a bit more sessions on each :)
>
> Best Regards
> Andi Christiansen
>
> > On 11/17/2020 9:57 AM Uwe Falke  wrote:
> >
> >
> > Hi, Andi, sorry I just took your 20Gbit for the sign of 2x10Gbps bons,
> but
> > it is over two nodes, so no bonding. But still, I'd expect to open
> several
> > TCP connections in parallel per source-target pair  (like with several
> > rsyncs per source node) would bear an advantage (and still I thing NFS
> > doesn't do that, but I can be wrong).
> > If more nodes have access to the Isilon data they could also participate
> > (and don't need NFS exports for that).
> >
> > Mit freundlichen Grüßen / Kind regards
> >
> > Dr. Uwe Falke
> > IT Specialist
> > Hybrid Cloud Infrastructure / Technology Consulting & Implementation
> > Services
> > +49 175 575 2877 Mobile
> > Rathausstr. 7, 09111 Chemnitz, Germany
> > uwefa...@de.ibm.com
> >
> > IBM Services
> >
> > IBM Data Privacy Statement
> >
> > IBM Deutschland Business & Technology Services GmbH
> > Geschäftsführung: Sven Schooss, Stefan Hierl
> > Sitz der Gesellschaft: Ehningen
> > Registergericht: Amtsgericht Stuttgart, HRB 17122
> >
> >
> >
> > From:   Uwe Falke/Germany/IBM
> > To: gpfsug main discussion list 
> > Date:   17/11/2020 09:50
> > Subject:Re: [EXTERNAL] [gpfsug-discuss] Migrate/syncronize data
> > from Isilon to Scale over   NFS?
> >
> >
> > Hi Andi,
> >
> > what about leaving NFS completeley out and using rsync  (multiple rsyncs
> > in parallel, of course) directly between your source and target servers?
> > I am not sure how many TCP connections (suppose it is NFS4) in parallel
> > are opened between client and server, using a 2x bonded interface well
> > requires at least two.  That combined with the DB approach suggested by
> > Jonathan to control the activity of the rsync streams would be my best
> > guess.
> > If you have many small files, the overhead might still kill you. Tarring
> > them up into larger aggregates for transfer would help a lot, but then
> you
> > must be sure they won't change or you need to implement your own version
> > control for that class of files.
> >
> > Mit freundlichen Grüßen / Kind regards
> >
> > Dr. Uwe Falke
> > IT Specialist
> > Hybrid Cloud Infrastructure / Technology Consulting & Implementation
> > Services
> > +49 175 575 2877 Mobile
> > Rathausstr. 7, 09111 Chemnitz, Germany
> > uwefa...@de.ibm.com
> >
> > IBM Services
> >
> > IBM Data Privacy Statement
> >
> > IBM Deutschland Business & Technology Services GmbH
> > Geschäftsführung: Sven Schooss, Stefan Hierl
> > Sitz der Gesellschaft: Ehningen
> > Registergericht: Amtsgericht Stuttgart, HRB 17122
> >
> >
> >
> >
> > From:   Andi Christiansen 
> > To: "gpfsug-discuss@spectrumscale.org"
> > 
> > Date:   16/11/2020 20:44
> > Subject:[EXTERNAL] [gpfsug-discuss] Migrate/syncronize data from
> > Isilon to Scale overNFS?
> > Sent by:gpfsug-discuss-boun...@spectrumscale.org
> >
> >
> >
> > Hi all,
> >
> > i have got a case where a customer wants 700TB migrated from isilon to
> > Scale and the only way for him is exporting the same directory on NFS
> from
> > two different nodes...
> >
> > as of now we are using multiple rsync processes on different parts of
> > folders within the main directory. this is really slow and will take
> > forever.. right now 14 rsync processes spread across 3 nodes fetching
> from
> > 2..
> >
> > does anyone know of a way to speed it up? right now we see from 1Gbit to
> > 3Gbit if we are lucky(total b

Re: [gpfsug-discuss] Migrate/syncronize data from Isilon to Scale over NFS?

2020-11-17 Thread Andi Christiansen
Hi all,

Thanks for all the information, there were some interesting things amongst it.

I kept on going with rsync and ended up making a file with all top-level user 
directories and splitting them into chunks of 347 per rsync session (42000-ish 
folders in total). Yesterday we had only 14 sessions with 3000 folders in each, 
and that was too much work for one rsync session.
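
Mechanically, the splitting looks something like this (a sketch; paths, chunk 
size and the per-node launch below are illustrative only):

  # Build the list of top-level user directories and cut it into chunks of 347
  ls -1 /mnt/isilon/home > dirs.txt
  split -l 347 dirs.txt chunk.

  # Each GPFS node then works through a few chunks, e.g. on the first node:
  # (-r is needed: --files-from does not recurse into listed directories by itself)
  rsync -a -r --files-from=chunk.aa /mnt/isilon/home/ /gpfs/fs1/home/ &
  rsync -a -r --files-from=chunk.ab /mnt/isilon/home/ /gpfs/fs1/home/ &
  rsync -a -r --files-from=chunk.ac /mnt/isilon/home/ /gpfs/fs1/home/ &
  wait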

I divided them out among all the GPFS nodes to have each fetch an area, actually 
doing that 3 times on each node, and that has now boosted the bandwidth usage 
from 3Gbit to around 16Gbit in total.

All nodes have been seen doing work above 7Gbit individually, which is actually 
near to what I was expecting without any modifications to the NFS server or TCP 
tuning.

CPU is around 30-50% on each server, and mostly below or around 30%, so it seems 
like it could have handled a bit more sessions.

Small files are really a killer, but with all 96+ sessions we have now it's not 
often that all sessions are handling small files at the same time, so we have an 
average of about 10-12Gbit bandwidth usage.

Thanks all! I'll keep you in mind if for some reason we see it slowing down 
again, but for now I think we will try to see if it will go the last mile with a 
bit more sessions on each :)

Best Regards
Andi Christiansen

> On 11/17/2020 9:57 AM Uwe Falke  wrote:
> 
>  
> Hi, Andi, sorry I just took your 20Gbit for the sign of 2x10Gbps bons, but 
> it is over two nodes, so no bonding. But still, I'd expect to open several 
> TCP connections in parallel per source-target pair  (like with several 
> rsyncs per source node) would bear an advantage (and still I thing NFS 
> doesn't do that, but I can be wrong). 
> If more nodes have access to the Isilon data they could also participate 
> (and don't need NFS exports for that).
> 
> Mit freundlichen Grüßen / Kind regards
> 
> Dr. Uwe Falke
> IT Specialist
> Hybrid Cloud Infrastructure / Technology Consulting & Implementation 
> Services
> +49 175 575 2877 Mobile
> Rathausstr. 7, 09111 Chemnitz, Germany
> uwefa...@de.ibm.com
> 
> IBM Services
> 
> IBM Data Privacy Statement
> 
> IBM Deutschland Business & Technology Services GmbH
> Geschäftsführung: Sven Schooss, Stefan Hierl
> Sitz der Gesellschaft: Ehningen
> Registergericht: Amtsgericht Stuttgart, HRB 17122
> 
> 
> 
> From:   Uwe Falke/Germany/IBM
> To: gpfsug main discussion list 
> Date:   17/11/2020 09:50
> Subject:Re: [EXTERNAL] [gpfsug-discuss] Migrate/syncronize data 
> from Isilon to Scale over   NFS?
> 
> 
> Hi Andi, 
> 
> what about leaving NFS completeley out and using rsync  (multiple rsyncs 
> in parallel, of course) directly between your source and target servers? 
> I am not sure how many TCP connections (suppose it is NFS4) in parallel 
> are opened between client and server, using a 2x bonded interface well 
> requires at least two.  That combined with the DB approach suggested by 
> Jonathan to control the activity of the rsync streams would be my best 
> guess.
> If you have many small files, the overhead might still kill you. Tarring 
> them up into larger aggregates for transfer would help a lot, but then you 
> must be sure they won't change or you need to implement your own version 
> control for that class of files.
> 
> Mit freundlichen Grüßen / Kind regards
> 
> Dr. Uwe Falke
> IT Specialist
> Hybrid Cloud Infrastructure / Technology Consulting & Implementation 
> Services
> +49 175 575 2877 Mobile
> Rathausstr. 7, 09111 Chemnitz, Germany
> uwefa...@de.ibm.com
> 
> IBM Services
> 
> IBM Data Privacy Statement
> 
> IBM Deutschland Business & Technology Services GmbH
> Geschäftsführung: Sven Schooss, Stefan Hierl
> Sitz der Gesellschaft: Ehningen
> Registergericht: Amtsgericht Stuttgart, HRB 17122
> 
> 
> 
> 
> From:   Andi Christiansen 
> To: "gpfsug-discuss@spectrumscale.org" 
> 
> Date:   16/11/2020 20:44
> Subject:[EXTERNAL] [gpfsug-discuss] Migrate/syncronize data from 
> Isilon to Scale overNFS?
> Sent by:gpfsug-discuss-boun...@spectrumscale.org
> 
> 
> 
> Hi all, 
> 
> i have got a case where a customer wants 700TB migrated from isilon to 
> Scale and the only way for him is exporting the same directory on NFS from 
> two different nodes... 
> 
> as of now we are using multiple rsync processes on different parts of 
> folders within the main directory. this is really slow and will take 
> forever.. right now 14 rsync processes spread across 3 nodes fetching from 
> 2.. 
> 
> does anyone know of a way to speed it up? right now we see from 1Gbit to 
> 3Gbit if we are lucky(total bandwidth) and there is a total of 30Gbit from 
> scale nodes and 20Gbits from isilon so we should be able to reach just 
> under 20Gbit... 
> 
> 
> if anyone have any ideas they are welcome! 
> 
> 
> Thanks in advance 
> Andi Christiansen ___
> gpfsug-discuss mailing list
> gpfsug-discuss at spectrumscale.org
> http://gpfsug.org/mailman/listinfo/gpfsug-discuss 
> 
> 
> 

Re: [gpfsug-discuss] Migrate/syncronize data from Isilon to Scale over NFS?

2020-11-17 Thread Uwe Falke
Hi Andi, sorry, I just took your 20Gbit as a sign of 2x10Gbps bonds, but 
it is over two nodes, so no bonding. Still, I'd expect that opening several 
TCP connections in parallel per source-target pair (like with several 
rsyncs per source node) would bring an advantage (and I still think NFS 
doesn't do that, but I can be wrong). 
If more nodes have access to the Isilon data they could also participate 
(and don't need NFS exports for that).

Mit freundlichen Grüßen / Kind regards

Dr. Uwe Falke
IT Specialist
Hybrid Cloud Infrastructure / Technology Consulting & Implementation 
Services
+49 175 575 2877 Mobile
Rathausstr. 7, 09111 Chemnitz, Germany
uwefa...@de.ibm.com

IBM Services

IBM Data Privacy Statement

IBM Deutschland Business & Technology Services GmbH
Geschäftsführung: Sven Schooss, Stefan Hierl
Sitz der Gesellschaft: Ehningen
Registergericht: Amtsgericht Stuttgart, HRB 17122



From:    Uwe Falke/Germany/IBM
To:      gpfsug main discussion list
Date:    17/11/2020 09:50
Subject: Re: [EXTERNAL] [gpfsug-discuss] Migrate/syncronize data 
from Isilon to Scale over NFS?


Hi Andi, 

what about leaving NFS completeley out and using rsync  (multiple rsyncs 
in parallel, of course) directly between your source and target servers? 
I am not sure how many TCP connections (suppose it is NFS4) in parallel 
are opened between client and server, using a 2x bonded interface well 
requires at least two.  That combined with the DB approach suggested by 
Jonathan to control the activity of the rsync streams would be my best 
guess.
If you have many small files, the overhead might still kill you. Tarring 
them up into larger aggregates for transfer would help a lot, but then you 
must be sure they won't change or you need to implement your own version 
control for that class of files.

Mit freundlichen Grüßen / Kind regards

Dr. Uwe Falke
IT Specialist
Hybrid Cloud Infrastructure / Technology Consulting & Implementation 
Services
+49 175 575 2877 Mobile
Rathausstr. 7, 09111 Chemnitz, Germany
uwefa...@de.ibm.com

IBM Services

IBM Data Privacy Statement

IBM Deutschland Business & Technology Services GmbH
Geschäftsführung: Sven Schooss, Stefan Hierl
Sitz der Gesellschaft: Ehningen
Registergericht: Amtsgericht Stuttgart, HRB 17122




From:    Andi Christiansen
To:      "gpfsug-discuss@spectrumscale.org"
Date:    16/11/2020 20:44
Subject: [EXTERNAL] [gpfsug-discuss] Migrate/syncronize data from 
Isilon to Scale over NFS?
Sent by: gpfsug-discuss-boun...@spectrumscale.org



Hi all, 

i have got a case where a customer wants 700TB migrated from isilon to 
Scale and the only way for him is exporting the same directory on NFS from 
two different nodes... 

as of now we are using multiple rsync processes on different parts of 
folders within the main directory. this is really slow and will take 
forever.. right now 14 rsync processes spread across 3 nodes fetching from 
2.. 

does anyone know of a way to speed it up? right now we see from 1Gbit to 
3Gbit if we are lucky(total bandwidth) and there is a total of 30Gbit from 
scale nodes and 20Gbits from isilon so we should be able to reach just 
under 20Gbit... 


if anyone have any ideas they are welcome! 


Thanks in advance 
Andi Christiansen ___
gpfsug-discuss mailing list
gpfsug-discuss at spectrumscale.org
http://gpfsug.org/mailman/listinfo/gpfsug-discuss 






___
gpfsug-discuss mailing list
gpfsug-discuss at spectrumscale.org
http://gpfsug.org/mailman/listinfo/gpfsug-discuss


Re: [gpfsug-discuss] Migrate/syncronize data from Isilon to Scale over NFS?

2020-11-17 Thread Uwe Falke
Hi Andi, 

What about leaving NFS out completely and using rsync (multiple rsyncs 
in parallel, of course) directly between your source and target servers? 
I am not sure how many TCP connections are opened in parallel between 
client and server (supposing it is NFSv4), but making good use of a 2x bonded 
interface requires at least two. That, combined with the DB approach suggested 
by Jonathan to control the activity of the rsync streams, would be my best 
guess.
If you have many small files, the overhead might still kill you. Tarring 
them up into larger aggregates for transfer would help a lot, but then you 
must be sure they won't change, or you need to implement your own version 
control for that class of files.
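
For the tar idea, something along these lines would do it (the host name and 
paths are placeholders; only sensible for data that will not change underneath 
you):

  # Stream a whole directory as one tar aggregate instead of millions of
  # per-file NFS operations, unpacking directly on the Scale node
  tar -C /ifs/data/project1 -cf - . | ssh scale-node1 'tar -C /gpfs/fs1/project1 -xpf -'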

Mit freundlichen Grüßen / Kind regards

Dr. Uwe Falke
IT Specialist
Hybrid Cloud Infrastructure / Technology Consulting & Implementation 
Services
+49 175 575 2877 Mobile
Rathausstr. 7, 09111 Chemnitz, Germany
uwefa...@de.ibm.com

IBM Services

IBM Data Privacy Statement

IBM Deutschland Business & Technology Services GmbH
Geschäftsführung: Sven Schooss, Stefan Hierl
Sitz der Gesellschaft: Ehningen
Registergericht: Amtsgericht Stuttgart, HRB 17122



From:    Andi Christiansen
To:      "gpfsug-discuss@spectrumscale.org"
Date:    16/11/2020 20:44
Subject: [EXTERNAL] [gpfsug-discuss] Migrate/syncronize data from 
Isilon to Scale over NFS?
Sent by: gpfsug-discuss-boun...@spectrumscale.org



Hi all, 

i have got a case where a customer wants 700TB migrated from isilon to 
Scale and the only way for him is exporting the same directory on NFS from 
two different nodes... 

as of now we are using multiple rsync processes on different parts of 
folders within the main directory. this is really slow and will take 
forever.. right now 14 rsync processes spread across 3 nodes fetching from 
2.. 

does anyone know of a way to speed it up? right now we see from 1Gbit to 
3Gbit if we are lucky(total bandwidth) and there is a total of 30Gbit from 
scale nodes and 20Gbits from isilon so we should be able to reach just 
under 20Gbit... 


if anyone have any ideas they are welcome! 


Thanks in advance 
Andi Christiansen ___
gpfsug-discuss mailing list
gpfsug-discuss at spectrumscale.org
http://gpfsug.org/mailman/listinfo/gpfsug-discuss 





___
gpfsug-discuss mailing list
gpfsug-discuss at spectrumscale.org
http://gpfsug.org/mailman/listinfo/gpfsug-discuss


Re: [gpfsug-discuss] Migrate/syncronize data from Isilon to Scale over NFS?

2020-11-16 Thread Jonathan Buzzard

On 16/11/2020 21:58, Skylar Thompson wrote:

When we did a similar (though larger, at ~2.5PB) migration, we used rsync
as well, but ran one rsync process per Isilon node, and made sure the NFS
clients were hitting separate Isilon nodes for their reads. We also didn't
have more than one rsync process running per client, as the Linux NFS
client (at least in CentOS 6) was terrible when it came to concurrent access.



The million dollar question IMHO is the number of files and their sizes.

Basically, if you have a million 1KB files to move it is going to take 
much longer than 100 1GB files. That is, the overhead of dealing with 
each file is a real bitch and kills your attainable transfer speed stone 
dead.


One option I have used in the past is to use your last backup and 
restore to the new system, then rsync in the changes. That way you don't 
impact the source file system which is live.


Another option I have used is to inform users in advance that data will 
be transferred based on a metric of how many files and how much data 
they have. So the less data and fewer files the quicker you will get 
access to the new system once access to the old system is turned off.


It is amazing how much users clear up junk under this scenario. Last 
time I did this a single user went from over 17 million files to 11 
thousand! In total many, many TB of data just vanished from the system 
(around half of the data went puff) as users actually got around to some 
housekeeping LOL. Moving less data and fewer files is always less painful.



Whatever method you end up using, I can guarantee you will be much happier
once you are on GPFS. :)


Goes without saying :-)


JAB.

--
Jonathan A. Buzzard Tel: +44141-5483420
HPC System Administrator, ARCHIE-WeSt.
University of Strathclyde, John Anderson Building, Glasgow. G4 0NG
___
gpfsug-discuss mailing list
gpfsug-discuss at spectrumscale.org
http://gpfsug.org/mailman/listinfo/gpfsug-discuss


Re: [gpfsug-discuss] Migrate/syncronize data from Isilon to Scale over NFS?

2020-11-16 Thread Jonathan Buzzard

On 16/11/2020 19:44, Andi Christiansen wrote:

Hi all,

i have got a case where a customer wants 700TB migrated from isilon to 
Scale and the only way for him is exporting the same directory on NFS 
from two different nodes...


as of now we are using multiple rsync processes on different parts of 
folders within the main directory. this is really slow and will take 
forever.. right now 14 rsync processes spread across 3 nodes fetching 
from 2..


does anyone know of a way to speed it up? right now we see from 1Gbit to 
3Gbit if we are lucky(total bandwidth) and there is a total of 30Gbit 
from scale nodes and 20Gbits from isilon so we should be able to reach 
just under 20Gbit...



if anyone have any ideas they are welcome!



My biggest recommendation when doing this is to use a sqlite database to 
keep track of what is going on.


The main issue is that you are almost certainly going to need to do more 
than one rsync pass unless your source Isilon system has no user 
activity, and with 700TB to move that seems unlikely. Typically you do 
an initial rsync to move the bulk of the data while the users are still 
live, then shutdown user access to the source system and do the final 
rsync which hopefully has a significantly smaller amount of data to 
actually move.


So this is what I have done on a number of occasions now. I create a 
very simple sqlite DB with a list of source and destination folders and 
a status code. Initially the status code is set to -1.


Then I have a perl script which looks at the sqlite DB, picks a row with 
a status code of -1, and sets the status code to -2, aka that directory 
is in progress. It then proceeds to run the rsync and when it finishes 
it updates the status code to the exit code of the rsync process.
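
In shell terms the idea is roughly the following (a stripped-down sketch, not 
the actual Perl script; the DB path is a placeholder and the row-claiming race 
between workers is glossed over):

  # One-off: build the work table (status -1 = pending)
  sqlite3 /gpfs/fs1/sync.db "CREATE TABLE work (id INTEGER PRIMARY KEY, src TEXT, dst TEXT, status INTEGER DEFAULT -1);"

  # Worker loop, run as many copies as you want rsync streams
  while :; do
      row=$(sqlite3 /gpfs/fs1/sync.db "SELECT id, src, dst FROM work WHERE status=-1 LIMIT 1;")
      [ -z "$row" ] && break
      id=${row%%|*}; rest=${row#*|}; src=${rest%%|*}; dst=${rest#*|}
      sqlite3 /gpfs/fs1/sync.db "UPDATE work SET status=-2 WHERE id=$id;"   # mark in progress
      rsync -aAX "$src/" "$dst/"
      rc=$?
      sqlite3 /gpfs/fs1/sync.db "UPDATE work SET status=$rc WHERE id=$id;"  # record the rsync exit code
  done

  # State of play, including anything that exited non-zero
  sqlite3 /gpfs/fs1/sync.db "SELECT status, count(*) FROM work GROUP BY status;"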


As long as all the rsync processes have access to the same copy of the 
sqlite DB (simplest to put it on either the source or destination file 
system) then all is good. You can fire off multiple rsync's on multiple 
nodes and they will all keep churning away till there is no more work to 
be done.


The advantage is you can easily interrogate the DB to find out the state 
of play. That is how many of your transfers have completed, how many are 
yet to be done, which ones are currently being transferred etc. without 
logging onto multiple nodes.


*MOST* importantly you can see if any of the rsync's had an error, by 
simply looking for status codes greater than zero. I cannot stress how 
important this is. Noting that if the source is still active you will 
see errors down to files being deleted on the source file system before 
rsync has a chance to copy them. However this has a specific exit code 
(24) so is easy to spot and not worry about.


Finally it is also very simple to set the status codes to -1 again and 
set the process away again. So the final run is easier to do.


If you want to mail me off list I can dig out a copy of the Perl code I 
used if you're interested. There are several versions, as I have tended to 
tailor it to each transfer.



JAB.

--
Jonathan A. Buzzard Tel: +44141-5483420
HPC System Administrator, ARCHIE-WeSt.
University of Strathclyde, John Anderson Building, Glasgow. G4 0NG
___
gpfsug-discuss mailing list
gpfsug-discuss at spectrumscale.org
http://gpfsug.org/mailman/listinfo/gpfsug-discuss


Re: [gpfsug-discuss] Migrate/syncronize data from Isilon to Scale over NFS?

2020-11-16 Thread Skylar Thompson
When we did a similar (though larger, at ~2.5PB) migration, we used rsync
as well, but ran one rsync process per Isilon node, and made sure the NFS
clients were hitting separate Isilon nodes for their reads. We also didn't
have more than one rsync process running per client, as the Linux NFS
client (at least in CentOS 6) was terrible when it came to concurrent access.
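
In practice that just meant pointing each client's NFS mount at a different 
Isilon node, e.g. (node names and the export path are placeholders):

  # On the first Scale/client node:
  mount -t nfs isilon-node1:/ifs/data /mnt/isilon
  # On the second Scale/client node:
  mount -t nfs isilon-node2:/ifs/data /mnt/isilon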

Whatever method you end up using, I can guarantee you will be much happier
once you are on GPFS. :)

On Mon, Nov 16, 2020 at 08:44:14PM +0100, Andi Christiansen wrote:
> Hi all,
> 
> i have got a case where a customer wants 700TB migrated from isilon to Scale 
> and the only way for him is exporting the same directory on NFS from two 
> different nodes...
> 
> as of now we are using multiple rsync processes on different parts of folders 
> within the main directory. this is really slow and will take forever.. right 
> now 14 rsync processes spread across 3 nodes fetching from 2.. 
> 
> does anyone know of a way to speed it up? right now we see from 1Gbit to 
> 3Gbit if we are lucky(total bandwidth) and there is a total of 30Gbit from 
> scale nodes and 20Gbits from isilon so we should be able to reach just under 
> 20Gbit...
> 
> 
> if anyone have any ideas they are welcome! 
> 
> 
> Thanks in advance 
> Andi Christiansen

> ___
> gpfsug-discuss mailing list
> gpfsug-discuss at spectrumscale.org
> http://gpfsug.org/mailman/listinfo/gpfsug-discuss


-- 
-- Skylar Thompson (skyl...@u.washington.edu)
-- Genome Sciences Department (UW Medicine), System Administrator
-- Foege Building S046, (206)-685-7354
-- Pronouns: He/Him/His
___
gpfsug-discuss mailing list
gpfsug-discuss at spectrumscale.org
http://gpfsug.org/mailman/listinfo/gpfsug-discuss


Re: [gpfsug-discuss] Migrate/syncronize data from Isilon to Scale over NFS?

2020-11-16 Thread Frederick Stock
Have you considered using the AFM feature of Spectrum Scale?  I doubt it will provide any speed improvement but it would allow for data to be accessed as it was being migrated.
Fred
__
Fred Stock | IBM Pittsburgh Lab | 720-430-8821
sto...@us.ibm.com
 
 
- Original message -
From: Andi Christiansen
Sent by: gpfsug-discuss-boun...@spectrumscale.org
To: "gpfsug-discuss@spectrumscale.org"
Cc:
Subject: [EXTERNAL] [gpfsug-discuss] Migrate/syncronize data from Isilon to Scale over NFS?
Date: Mon, Nov 16, 2020 2:44 PM
Hi all,
 
i have got a case where a customer wants 700TB migrated from isilon to Scale and the only way for him is exporting the same directory on NFS from two different nodes...
 
as of now we are using multiple rsync processes on different parts of folders within the main directory. this is really slow and will take forever.. right now 14 rsync processes spread across 3 nodes fetching from 2.. 
 
does anyone know of a way to speed it up? right now we see from 1Gbit to 3Gbit if we are lucky(total bandwidth) and there is a total of 30Gbit from scale nodes and 20Gbits from isilon so we should be able to reach just under 20Gbit...
 
 
if anyone have any ideas they are welcome!

Thanks in advance
Andi Christiansen
___
gpfsug-discuss mailing list
gpfsug-discuss at spectrumscale.org
http://gpfsug.org/mailman/listinfo/gpfsug-discuss
 

___
gpfsug-discuss mailing list
gpfsug-discuss at spectrumscale.org
http://gpfsug.org/mailman/listinfo/gpfsug-discuss