Hi Mike,

Forking the discussion into its own thread so it has a consistent topic and is more discoverable.
On Thu, Jul 18, 2019 at 4:21 PM Michael Williams <michael.r.c.willi...@gmail.com> wrote:
>
> Hi Szilárd,
>
> Thanks for the interesting observations on recent hardware. I was wondering
> if you could comment on the use of somewhat older server CPUs and
> motherboards (versus more cutting-edge consumer parts). I recently noticed
> that Haswell-era Xeon CPUs (E5 v3) are quite affordable now (~$400 for
> 12-core models with 40 PCIe lanes) and so are the corresponding two-socket
> server motherboards. Of course the RAM is slower than what can be used with
> the latest Ryzen or i7/i9 CPUs.

When it comes to GPU-accelerated runs, given that most of the arithmetically
intensive computation is offloaded, the headline features of more modern
processors / CPU instruction sets (like AVX-512) don't help much. As most
bio-MD (unless running huge systems) fits in the CPU cache, RAM performance
and additional memory channels also have little to no impact (one exception
being the 1st-gen AMD Zen architecture, but that's another topic). What
dominates the CPU's contribution to performance is cache size (and
speed/efficiency) and the number and speed of the CPU cores. This is somewhat
non-trivial to assess, as the clock-speed specs don't always reflect the
stable clocks these CPUs actually run at, but roughly you can use
(#cores x sustained frequency) as a metric to gauge the performance of a CPU
*in such a scenario*; e.g. 12 cores sustaining 3 GHz and 9 cores sustaining
4 GHz give the same core x GHz figure. You can find more on this in our
recent paper, where we compare the performance of the best bang-for-buck
modern servers (spoiler alert: AMD EPYC was already the champion and will
especially be so with the Rome architecture) against upgraded older Xeon v2
nodes; see: https://doi.org/10.1002/jcc.26011

> Are there any other bottlenecks with this somewhat older server hardware
> that I might not be aware of?

There can be. PCIe topology can be an issue: you want a symmetric layout,
e.g. two x16 buses connected directly to each socket (for dual-socket
systems), rather than many lanes routed through a PCIe switch that all hang
off the same socket. You can also run into significant GPU-to-GPU
communication limitations on older-generation hardware (like v2/v3 Xeons),
but GROMACS does not make use of direct GPU-to-GPU transfers yet (partly for
that very reason); with the upcoming releases, though, that may become a
slight concern if you want to scale across multiple GPUs.
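In case it is useful: a quick way to check what PCIe layout a given node
actually has (a sketch assuming a Linux box with NVIDIA GPUs, the NVIDIA
driver, and the standard pciutils/numactl tools installed) is:

  # How each GPU connects to the CPUs; the legend in the output explains the
  # link types (PIX/PXB = through PCIe bridges/switches, PHB = through the
  # host bridge of one socket, SYS = across sockets), and the CPU affinity
  # column tells you which cores sit closest to each GPU.
  nvidia-smi topo -m

  # Tree view of the PCI hierarchy; shows whether the x16 slots hang off
  # different root ports/sockets or share a switch.
  lspci -tv

  # Socket/NUMA layout of the host, useful for pinning runs near "their" GPU.
  numactl --hardware

The topology matrix makes it easy to spot the asymmetric setups mentioned
above (e.g. both GPUs sitting behind one switch on a single socket).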
I hope that helps, let me know if you have any other questions!

Cheers,
--
Szilárd

> Thanks again for the interesting information and practical advice on this
> topic.
>
> Mike
>
> On Jul 18, 2019, at 2:21 AM, Szilárd Páll <pall.szil...@gmail.com> wrote:
> >
> > PS: You will get more PCIe lanes without motherboard trickery -- and note
> > that consumer motherboards with PCIe switches can sometimes cause
> > instabilities under heavy compute load -- if you buy the aging and quite
> > overpriced i9 X-series like the i9-7920X with 12 cores, or the
> > Threadripper 2950X with 16 cores and 60 PCIe lanes.
> >
> > Also note that more cores always win when CPU performance matters, and
> > while 8 cores are generally sufficient, in some use cases they may not be
> > (like runs with free energy).
> >
> > --
> > Szilárd
> >
> > On Thu, Jul 18, 2019 at 10:08 AM Szilárd Páll <pall.szil...@gmail.com> wrote:
> >
> >> On Wed, Jul 17, 2019 at 7:00 PM Moir, Michael (MMoir) <mm...@chevron.com> wrote:
> >>
> >>> This is not quite true. I certainly observed this degradation in
> >>> performance using the 9900K with two GPUs, as Szilárd states, using a
> >>> motherboard with one PCIe controller, but the limitation is from the
> >>> motherboard, not from the CPU.
> >>
> >> Sorry, but that's not the case. PCIe controllers have been integrated
> >> into CPUs for many years; see
> >>
> >> https://www.intel.com/content/dam/www/public/us/en/documents/white-papers/ia-introduction-basics-paper.pdf
> >> https://www.microway.com/hpc-tech-tips/common-pci-express-myths-gpu-computing/
> >>
> >> So no, the limitation is the CPU itself. Consumer CPUs these days have
> >> 24 lanes total, some of which are used to connect the CPU to the chipset,
> >> so effectively you get 16-20 lanes (BTW, here too the new AMD CPUs win,
> >> as they provide 16 lanes for GPUs and similar devices plus 4 lanes for
> >> NVMe, all on PCIe 4.0).
> >>
> >>> It is possible to obtain a motherboard that contains two PCIe
> >>> controllers, which overcomes this obstacle for not a whole lot more
> >>> money.
> >>
> >> It is possible to buy motherboards with PCIe switches. These don't
> >> increase the number of lanes; they just do what a switch does: as long
> >> as not all connected devices try to use the full capacity of the CPU (!)
> >> at the same time, each connected device can get full speed.
> >> e.g.:
> >> https://techreport.com/r.x/2015_11_19_Gigabytes_Z170XGaming_G1_motherboard_reviewed/05-diagram_pcie_routing.gif
> >>
> >> Cheers,
> >> --
> >> Szilárd
> >>
> >> Mike
> >>>
> >>> -----Original Message-----
> >>> From: gromacs.org_gmx-users-boun...@maillist.sys.kth.se <gromacs.org_gmx-users-boun...@maillist.sys.kth.se> On Behalf Of Szilárd Páll
> >>> Sent: Wednesday, July 17, 2019 8:14 AM
> >>> To: Discussion list for GROMACS users <gmx-us...@gromacs.org>
> >>> Subject: [**EXTERNAL**] Re: [gmx-users] Xeon Gold + RTX 5000
> >>>
> >>> Hi Alex,
> >>>
> >>> I've not had a chance to test the new 3rd-gen Ryzen CPUs, but all
> >>> public benchmarks out there point to the fact that they are a major
> >>> improvement over the previous-generation Ryzen -- which was already
> >>> quite competitive with Intel for GPU-accelerated GROMACS runs,
> >>> especially in performance per price.
> >>>
> >>> One caveat for dual-GPU setups on the i9 9900 or the Ryzen 3900X is
> >>> that they don't have enough PCIe lanes for peak CPU-GPU transfer (only
> >>> x8 for each of the two GPUs), which will lead to slightly lower
> >>> performance (I'd estimate <5-10%) compared to i) having a single GPU
> >>> plugged into the machine, or ii) CPUs like Threadripper or the i9
> >>> 79xx-series processors, which have more PCIe lanes.
> >>>
> >>> However, if throughput is the goal, the ideal use-case, especially for
> >>> small simulation systems like <=50k atoms, is to run e.g. 2 runs per
> >>> GPU, hence 4 runs on a 2-GPU system, in which case the impact of the
> >>> aforementioned limitation is further reduced.
> >>>
> >>> Cheers,
> >>> --
> >>> Szilárd
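To make the "2 runs per GPU" recipe quoted above concrete, a minimal launch
sketch for a 2-GPU, 16-core (SMT off) box could look like the following; the
directory names run1..run4, the thread counts, and the pin offsets are
placeholders to adapt to your own machine and prepared .tpr files:

  # Two simulations share GPU 0 and two share GPU 1; each run gets 4 cores,
  # pinned to disjoint core ranges so the runs do not land on top of each other.
  (cd run1 && gmx mdrun -deffnm md -nb gpu -pme gpu -gpu_id 0 -ntmpi 1 -ntomp 4 -pin on -pinoffset 0  -pinstride 1) &
  (cd run2 && gmx mdrun -deffnm md -nb gpu -pme gpu -gpu_id 0 -ntmpi 1 -ntomp 4 -pin on -pinoffset 4  -pinstride 1) &
  (cd run3 && gmx mdrun -deffnm md -nb gpu -pme gpu -gpu_id 1 -ntmpi 1 -ntomp 4 -pin on -pinoffset 8  -pinstride 1) &
  (cd run4 && gmx mdrun -deffnm md -nb gpu -pme gpu -gpu_id 1 -ntmpi 1 -ntomp 4 -pin on -pinoffset 12 -pinstride 1) &
  wait

With an MPI-enabled build, mpirun -np 4 gmx_mpi mdrun -multidir run1 run2
run3 run4 does roughly the same thing in a single command.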
> >>>
> >>>> On Tue, Jul 16, 2019 at 7:18 PM Alex <nedoma...@gmail.com> wrote:
> >>>>
> >>>> That is excellent information, thank you. None of us have dealt with
> >>>> AMD CPUs in a while, so would the combination of a Ryzen 3900X and two
> >>>> Quadro 2080 Ti be a good choice?
> >>>>
> >>>> Again, thanks!
> >>>>
> >>>> Alex
> >>>>
> >>>>> On 7/16/2019 8:41 AM, Szilárd Páll wrote:
> >>>>> Hi Alex,
> >>>>>
> >>>>>> On Mon, Jul 15, 2019 at 8:53 PM Alex <nedoma...@gmail.com> wrote:
> >>>>>> Hi all and especially Szilard!
> >>>>>>
> >>>>>> My glorious management asked me to post this here. One of our group
> >>>>>> members, an ex-NAMD guy, wants to use Gromacs for biophysics and the
> >>>>>> following basics have been spec'ed for him:
> >>>>>>
> >>>>>> CPU: Xeon Gold 6244
> >>>>>> GPU: RTX 5000 or 6000
> >>>>>>
> >>>>>> I'll be surprised if he runs systems with more than 50K particles.
> >>>>>> Could you please comment on whether this is a cost-efficient and
> >>>>>> reasonably powerful setup? Your past suggestions have been
> >>>>>> invaluable for us.
> >>>>>
> >>>>> That will be reasonably fast, but cost efficiency will be awful, to
> >>>>> be honest:
> >>>>> - that CPU is a ~$3000 part and won't perform much better than a
> >>>>> $400-500 desktop CPU like an i9 9900, let alone a Ryzen 3900X, which
> >>>>> would be significantly faster.
> >>>>> - Quadro cards are also pretty low in bang for buck: a 2080 Ti will
> >>>>> be close to the RTX 6000 for ~5x less, and the 2080 or 2070 Super a
> >>>>> bit slower for at least another 1.5x less.
> >>>>>
> >>>>> Single run at a time or possibly multiple? The proposed (or any 8+
> >>>>> core) workstation CPU is fast enough in the majority of simulations
> >>>>> to pair well with two of those GPUs if used for two concurrent
> >>>>> simulations. If that's a relevant use-case, I'd recommend two 2070
> >>>>> Super or 2080 cards.
> >>>>>
> >>>>> Cheers,
> >>>>> --
> >>>>> Szilárd
> >>>>>
> >>>>>> Thank you,
> >>>>>>
> >>>>>> Alex