Hi, I've tried increasing all the refresh intervals, but even at 300 seconds, there is very little performance increase.
The job runs in several steps, and gets held up in two places, as far as I can see. The first is a kind of parallelisation step where about 1000-3000 files are created in the current working folder on a single compute node. The second is a step where lots of small output files are written on each of the compute nodes involved in the job.

Compared with running the same data set on a non-AFM cache fileset in the same storage system, it runs at least a factor of 5 slower, even with really high refresh intervals.

The Scale documentation states that afmRefreshAsync is only configurable cluster-wide. Is it also configurable on a per-fileset level?
https://www.ibm.com/support/knowledgecenter/en/STXKQY_5.0.3/com.ibm.spectrum.scale.v5r03.doc/bl1adm_configurationparametersAFM.htm

The software is XDS, http://xds.mpimf-heidelberg.mpg.de/
Unfortunately it is closed-source software, so it is not possible to adapt it.

Regards,
Andreas Mattsson
____________________________________________
Andreas Mattsson
Systems Engineer
MAX IV Laboratory
Lund University
P.O. Box 118, SE-221 00 Lund, Sweden
Visiting address: Fotongatan 2, 224 84 Lund
Mobile: +46 706 64 95 44
andreas.matts...@maxiv.lu.se
www.maxiv.se
________________________________
From: gpfsug-discuss-boun...@spectrumscale.org <gpfsug-discuss-boun...@spectrumscale.org> on behalf of Venkateswara R Puvvada <vpuvv...@in.ibm.com>
Sent: 27 September 2019 10:23:13
To: gpfsug main discussion list
Subject: Re: [gpfsug-discuss] afmRefreshAsync questions

Hi,

Both storage and client clusters have to be on 5.0.3.x to get the AFM revalidation performance with afmRefreshAsync. What are the refresh intervals? You could also try increasing them. Is this config option set at fileset level or cluster level?
~Venkat (vpuvv...@in.ibm.com)

From: Andreas Mattsson <andreas.matts...@maxiv.lu.se>
To: GPFS User Group <gpfsug-discuss@spectrumscale.org>
Date: 09/26/2019 03:26 PM
Subject: [EXTERNAL] [gpfsug-discuss] afmRefreshAsync questions
Sent by: gpfsug-discuss-boun...@spectrumscale.org
________________________________

Hi,

We have a data analysis software that isn't running well at all in our AFM caches: it runs 4-6 times slower on an AFM cache than on a non-AFM fileset on the same storage system. I wanted to try out the afmRefreshAsync feature that came with 5.0.3 to see if it is the cache data refresh that is holding things up. Enabling this feature has had zero impact on the performance of the software, though.

The storage cluster is running 5.0.3.x, and afmRefreshAsync has been set there, but at the moment the remote-mounting client cluster is still running 5.0.2.x. Would this feature still have any effect in this setup?

Regards,
Andreas Mattsson
____________________________________________
Andreas Mattsson
Systems Engineer
MAX IV Laboratory
Lund University
P.O. Box 118, SE-221 00 Lund, Sweden
Visiting address: Fotongatan 2, 224 84 Lund
Mobile: +46 706 64 95 44
andreas.matts...@maxiv.lu.se
www.maxiv.se
_______________________________________________
gpfsug-discuss mailing list
gpfsug-discuss at spectrumscale.org
http://gpfsug.org/mailman/listinfo/gpfsug-discuss