Hello to all!

I'm pleased to have joined the GPFS UG mailing list. I'm experimenting with 
GPFS on zLinux running in z/VM on a z13 mainframe. I work for the UK Met Office 
in the GPCS team (general purpose compute service/mainframe team) and I'm based 
in Exeter, Devon.

I've joined with a specific question to ask, in short: how can I automate 
sending files to a cloud object store as they arrive in GPFS and keep a copy of 
the file in GPFS?

The longer spiel is this: We have an HPC that throws out a lot of NetCDF files 
via FTP for use in forecasts. We're currently undergoing a change in working 
practice, so that data processing is beginning to be done in the cloud.  At the 
same time we're also attempting to de-duplicate the data being sent from the 
HPC by creating one space to receive it and then having consumers use it or send 
it on as necessary from there. The data volume is terabytes per day, and the 
timeliness of its arrival at downstream systems is fairly important (forecasts 
cease to be forecasts if they're too late).

We're using zLinux because the mainframe already receives much of the data from 
the HPC, has access to a SAN with SSD storage, has the network connections it 
needs and generally seems to be the least work to put something in place.

Getting a supported clustered filesystem on zLinux is tricky, but GPFS fits the 
bill and having hardware, storage, OS and filesystem from one provider (IBM) 
should hopefully save some headaches.

We're using Amazon as our cloud provider, and have 2x10Gb direct links to their 
London data centre with a ping of about 15ms, so fairly low latency. The 
developers using the data want it in S3 so they can access it from serverless 
environments and won't need to have EC2 instances loitering to look after the 
data.

We were initially interested in using mmcloudgateway/cloud data sharing to send 
the data, but it's not available for s390x (only x86_64), so I'm now looking at 
setting up an external storage pool for talking to S3 and then having some kind 
of ILM soft quota trigger to send the data once enough of it has arrived. I'm 
still exploring options, though, including asking the user group of experienced 
folks what they think is best!
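
To make the external pool idea concrete, here is a minimal sketch of the kind 
of interface script an external pool (or external list) rule could point at. 
It assumes the calling convention of a command word followed by a file list 
path, and file list records of the form "inode generation snapid -- path"; 
both should be checked against the documentation for the installed release. 
The names handle, read_paths, parse_filelist_line and the upload callable are 
hypothetical, and the actual S3 transfer is deliberately left as a 
placeholder. The sketch only copies files and never removes them from GPFS.

```python
#!/usr/bin/env python3
"""Sketch of an external pool interface script (untested against GPFS)."""
import sys


def parse_filelist_line(line):
    """Extract the path from one file list record.

    Assumed record layout: '<inode> <generation> <snapid> [attrs] -- <path>'.
    Returns None when the ' -- ' separator or the path is missing.
    """
    _head, sep, path = line.rstrip("\n").partition(" -- ")
    if not sep or not path:
        return None
    return path


def read_paths(filelist):
    """Yield every parseable path from the file list handed to the script."""
    with open(filelist, encoding="utf-8", errors="surrogateescape") as fh:
        for line in fh:
            path = parse_filelist_line(line)
            if path is not None:
                yield path


def handle(command, filelist, upload):
    """Dispatch one invocation and return a process exit code.

    `upload` is a hypothetical callable taking a local path; it stands in
    for whatever performs the copy to the object store.
    """
    if command == "TEST":
        # For TEST the second argument is assumed not to be a file list,
        # so it is not opened; just report that the script is usable.
        return 0
    if command in ("MIGRATE", "PREMIGRATE", "LIST"):
        failures = 0
        for path in read_paths(filelist):
            try:
                upload(path)  # copy only; the file stays in GPFS
            except OSError as exc:
                print(f"upload failed for {path}: {exc}", file=sys.stderr)
                failures += 1
        return 1 if failures else 0
    return 1  # command not handled by this sketch


def upload_placeholder(path):
    """Placeholder: replace with a real transfer to the object store."""
    raise NotImplementedError(f"no uploader configured for {path}")


if __name__ == "__main__":
    if len(sys.argv) < 3:
        sys.exit("usage: script COMMAND FILELIST [OPTIONS]")
    sys.exit(handle(sys.argv[1], sys.argv[2], upload_placeholder))
```

Passing the uploader in as a callable keeps the file list handling separate 
from the transfer mechanism, so the parsing can be exercised without any 
cloud access.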

So, any help or advice would be greatly appreciated!

Regards,

Peter Chase
GPCS Team
Met Office  FitzRoy Road  Exeter  Devon  EX1 3PB  United Kingdom
Email: peter.ch...@metoffice.gov.uk
Website: www.metoffice.gov.uk

_______________________________________________
gpfsug-discuss mailing list
gpfsug-discuss at spectrumscale.org
http://gpfsug.org/mailman/listinfo/gpfsug-discuss