"Smith, Barry F." <bsm...@mcs.anl.gov> writes: >> On Aug 14, 2019, at 2:37 PM, Jed Brown <j...@jedbrown.org> wrote: >> >> Mark Adams via petsc-dev <petsc-dev@mcs.anl.gov> writes: >> >>> On Wed, Aug 14, 2019 at 2:35 PM Smith, Barry F. <bsm...@mcs.anl.gov> wrote: >>> >>>> >>>> Mark, >>>> >>>> Would you be able to make one run using single precision? Just single >>>> everywhere since that is all we support currently? >>>> >>>> >>> Experience in engineering at least is single does not work for FE >>> elasticity. I have tried it many years ago and have heard this from others. >>> This problem is pretty simple other than using Q2. I suppose I could try >>> it, but just be aware the FE people might say that single sucks. >> >> When they say that single sucks, is it for the definition of the >> operator or the preconditioner? >> >> As point of reference, we can apply Q2 elasticity operators in double >> precision at nearly a billion dofs/second per GPU. > > And in single you get what?
I don't have exact numbers, but <2x faster on V100, and it sort of
doesn't matter because preconditioning cost will dominate. The big win
of single is on consumer-grade GPUs, which DOE doesn't install and
NVIDIA forbids to be used in data centers (because they're so
cost-effective ;-)).

>> I'm skeptical of big wins in preconditioning (especially setup) due to
>> the cost and irregularity of indexing being large compared to the
>> bandwidth cost of the floating point values.
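
A rough back-of-the-envelope sketch of the indexing argument quoted above. It is not from the thread: it assumes an assembled CSR matrix with 32-bit column indices and counts only the bytes streamed per stored nonzero in a mat-vec, ignoring row pointers, vector traffic, and the irregular access pattern itself. The function names are hypothetical.

```python
def csr_bytes_per_nonzero(value_bytes, index_bytes=4):
    """Bytes streamed per stored nonzero: one value plus one column index.

    Assumption: CSR storage with fixed-width integer column indices.
    """
    return value_bytes + index_bytes


def single_over_double_bound(index_bytes=4):
    """Upper bound on the bandwidth-limited speedup from double to single."""
    double = csr_bytes_per_nonzero(8, index_bytes)  # 8-byte values
    single = csr_bytes_per_nonzero(4, index_bytes)  # 4-byte values
    return double / single


if __name__ == "__main__":
    # 12 bytes vs 8 bytes per nonzero with 32-bit indices
    print(single_over_double_bound())  # 1.5
```

Under these assumptions, halving the value size cuts traffic per nonzero only from 12 to 8 bytes, so the best case is about 1.5x rather than 2x, and wider indices shrink that bound further.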