Not that it answers Stuart's questions in any way, but we gave up on the same problem on a similar setup, rescued an old fileserver off the scrapheap (RAID6 of 12 x 7.2k rpm SAS on a PERC H710P) and just served the reference data over NFS - good enough to keep the compute busy rather than stuck in cxiWaitEventWait. If there's significant demand for AlphaFold then somebody's arm will be twisted for a new server with some NVMe.

If I remember right, the reference data is ~2.3TB, ruling out our usual approach of just reading the problematic files into a ramdisk first.

We are also interested in hearing how it might be usably served from GPFS.

Thanks,
Jon
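For what it's worth, the "read the problematic files into a ramdisk first" approach Jon describes can be sketched roughly as below. This is a minimal illustration, not anything from the AlphaFold tooling: the `stage_to_ramdisk` helper, the `/dev/shm` default, and the file names are all assumptions, and as Jon notes the full ~2.3TB reference set won't fit in memory anyway.

```python
# Hypothetical sketch: stage smaller reference files into a tmpfs-backed
# directory (e.g. /dev/shm on Linux) before a run, so random reads hit RAM
# instead of the parallel filesystem. Fails for AlphaFold's full ~2.3TB DB.
import os
import shutil
import tempfile

def stage_to_ramdisk(src_paths, ramdisk="/dev/shm"):
    """Copy files into a tmpfs-backed directory; return the staged paths."""
    dest_dir = tempfile.mkdtemp(prefix="refdata-", dir=ramdisk)
    staged = []
    for src in src_paths:
        dst = os.path.join(dest_dir, os.path.basename(src))
        shutil.copy(src, dst)  # sequential copy is fast; random reads then hit RAM
        staged.append(dst)
    return staged
```

A job script would call this once at startup and point the application at the staged copies instead of the originals.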
--
Dr. Jonathan Diprose <j...@well.ox.ac.uk>              Tel: 01865 287873
Research Computing Manager
Henry Wellcome Building for Genomic Medicine
Roosevelt Drive, Headington, Oxford OX3 7BN

________________________________________
From: gpfsug-discuss-boun...@spectrumscale.org [gpfsug-discuss-boun...@spectrumscale.org] on behalf of Stuart Barkley [stua...@4gh.net]
Sent: 19 October 2021 18:16
To: gpfsug-discuss@spectrumscale.org
Subject: [gpfsug-discuss] alphafold and mmap performance

Over the years there have been several discussions about performance problems with mmap() on GPFS/Spectrum Scale. We are currently having problems with mmap() performance on our systems with the new alphafold <https://github.com/deepmind/alphafold> protein folding software. Things look similar to previous times we have had mmap() problems.

The software component "hhblits" appears to mmap a large file of genomic data and then do random reads throughout the file. GPFS appears to issue 4K reads for each block, limiting performance.

The first run takes 20+ hours. Subsequent identical runs complete in just 1-2 hours. After clearing the Linux page cache (echo 3 > /proc/sys/vm/drop_caches) the slow performance returns for the next run.

GPFS server is 4.2.3-5 running on DDN hardware (CentOS 7.3). The default GPFS client is 4.2.3-22 (CentOS 7.9).

We have tried a number of things, including the Spectrum Scale 5.0.5-9 client, which should have Sven's recent mmap performance improvements. Are those improvements in the client code or the server code?

Only now do I notice a suggestion:

    mmchconfig prefetchAggressivenessRead=0 -i

I did not use this. Would a performance change be expected? Would the pagepool size be involved in this?

Stuart Barkley
--
I've never been lost; I was once bewildered for three days, but never lost!
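The access pattern Stuart describes can be reproduced with a small microbenchmark along the following lines. This is a hedged sketch, not hhblits itself: the function name, record size, and read count are illustrative. The point is that each cold access to a random offset in the mapping faults in a single 4 KiB page, so throughput on a cold cache is bounded by the filesystem's small random-read performance rather than its streaming bandwidth - which would be consistent with the huge gap between first and subsequent runs.

```python
# Hypothetical microbenchmark mimicking an hhblits-style access pattern:
# mmap a large file, then read small records at random offsets. On a cold
# cache each access page-faults and the filesystem serves one 4 KiB page;
# on a warm cache the reads are served from memory.
import mmap
import os
import random
import time

def random_mmap_reads(path, n_reads=10000, rec_size=64, seed=0):
    """Read n_reads records of rec_size bytes at random offsets via mmap."""
    rng = random.Random(seed)
    size = os.path.getsize(path)
    total = 0
    with open(path, "rb") as f, \
         mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as m:
        t0 = time.perf_counter()
        for _ in range(n_reads):
            off = rng.randrange(0, size - rec_size)
            total += len(m[off:off + rec_size])  # touches one (or two) pages
        elapsed = time.perf_counter() - t0
    return total, elapsed
```

Running this against a file on GPFS before and after `echo 3 > /proc/sys/vm/drop_caches` should show the same cold/warm asymmetry as the full application, which makes it a cheap way to test whether a client version or tuning change (e.g. the `prefetchAggressivenessRead=0` suggestion) actually helps.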
                                        -- Daniel Boone

_______________________________________________
gpfsug-discuss mailing list
gpfsug-discuss at spectrumscale.org
http://gpfsug.org/mailman/listinfo/gpfsug-discuss