But that would bring the theoretical performance to 160 TFLOPS per box (8 x 19.5 + 4.6), which also doesn't match!
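To make the mismatch concrete, here is a rough Python sketch of the arithmetic using only the figures quoted in this thread. The count of 8 GPUs per box is my assumption (it is what 82.2 = 8 x 9.7 + 4.6 implies), and the function names are just for illustration.

```python
# Back-of-the-envelope check of the numbers quoted in this thread.
# Assumption: 8 A100 GPUs per DGX box (implied by 82.2 = 8 x 9.7 + 4.6).
GPUS_PER_BOX = 8
# Two AMD 7742 CPUs: 128 cores x 2.25 GHz x 16 FP64 ops/cycle, in TFLOPS.
CPU_TFLOPS = 128 * 2.25 * 16 / 1000
BOXES = 560                    # Selene, as stated in the original question
LISTED_RPEAK_PFLOPS = 79.2     # top500 figure, as stated in the original question


def box_tflops(gpu_tflops):
    """Theoretical FP64 peak of one box: GPUs plus CPUs, in TFLOPS."""
    return GPUS_PER_BOX * gpu_tflops + CPU_TFLOPS


def system_pflops(gpu_tflops, boxes=BOXES):
    """Theoretical FP64 peak of the whole system, in PFLOPS."""
    return boxes * box_tflops(gpu_tflops) / 1000


if __name__ == "__main__":
    for label, gpu in (("plain FP64", 9.7), ("FP64 tensor core", 19.5)):
        print(f"{label}: {box_tflops(gpu):.1f} TFLOPS per box, "
              f"{system_pflops(gpu):.1f} PFLOPS for {BOXES} boxes "
              f"(listed: {LISTED_RPEAK_PFLOPS} PFLOPS)")
```

With these inputs the plain FP64 figure comes out near 46 PFLOPS and the tensor core figure near 90 PFLOPS, so the listed 79.2 PFLOPS sits between the two and matches neither.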
On Thu, Jun 3, 2021, 5:50 PM Carlos Bederián <[email protected]> wrote:

> A100 does 19.5 FP64 TFLOPS using tensor cores.
>
> On Thu, Jun 3, 2021 at 9:08 AM harsh_google lastname <
> [email protected]> wrote:
>
>> I am calculating the theoretical peak (FP64) performance of the Nvidia
>> DGX A100 system.
>>
>> Now, A100 datasheet lists FP64 performance to be 9.7 TFLOPS.
>> Two AMD 7742 CPUs will give 128 cores x 2.25 GHz base clock x 16 FP64 ops
>> / cycle = 4.6 TFLOPS.
>> This gives a total of 82.2 TFLOPS per DGX-A100.
>>
>> Here is my problem. For any system with DGX A100 on top500.org, numbers
>> just don't add up. For eg: Selene has 560 DGX boxes, but its theoretical
>> peak is listed as 79.2 PFLOPS, whereas I expect it should be 46 PFLOPS (ie
>> 82.2 TFLOPS x560). The same is true for any other DGX based system listed
>> on top500. What am I missing here?
>>
>> Thanks!
>>
>> Harsh Hemani
>> _______________________________________________
>> Beowulf mailing list, [email protected] sponsored by Penguin Computing
>> To change your subscription (digest mode or unsubscribe) visit
>> https://beowulf.org/cgi-bin/mailman/listinfo/beowulf
>>
>
