But that would bring the theoretical performance to 160 TFLOPS per box,
which also doesn't match!
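
For reference, here is a minimal sketch of the arithmetic in this thread, using only the figures quoted below. The one assumption is 8 A100 GPUs per DGX A100 box, which is what the 82.2 TFLOPS figure implies; the names in the script are just illustrative.

```python
# Peak FP64 arithmetic from this thread; inputs are the figures quoted below.
GPUS_PER_BOX = 8   # assumption: 8 A100s per DGX A100 (implied by the 82.2 figure)
BOXES = 560        # Selene box count as quoted in the original question

# Two EPYC 7742 CPUs: cores x base clock (Hz) x FP64 ops per cycle, in TFLOPS
CPU_TFLOPS = 128 * 2.25e9 * 16 / 1e12


def box_tflops(gpu_tflops):
    """Theoretical peak of one box: all GPUs plus both CPUs, in TFLOPS."""
    return GPUS_PER_BOX * gpu_tflops + CPU_TFLOPS


for label, gpu in (("FP64 without tensor cores", 9.7),
                   ("FP64 with tensor cores", 19.5)):
    per_box = box_tflops(gpu)
    total_pflops = per_box * BOXES / 1000
    print(f"{label}: {per_box:.1f} TFLOPS/box, "
          f"{total_pflops:.1f} PFLOPS for {BOXES} boxes")
```

With these inputs the two cases come to roughly 46.0 and 89.9 PFLOPS for 560 boxes, so neither reproduces the listed 79.2 PFLOPS.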

On Thu, Jun 3, 2021, 5:50 PM Carlos Bederián <[email protected]>
wrote:

> A100 does 19.5 FP64 TFLOPS using tensor cores.
>
> On Thu, Jun 3, 2021 at 9:08 AM harsh_google lastname <
> [email protected]> wrote:
>
>> I am calculating the theoretical peak (FP64) performance of the Nvidia
>> DGX A100 system.
>>
>> Now, the A100 datasheet lists FP64 performance as 9.7 TFLOPS.
>> Two AMD 7742 CPUs will give 128 cores x 2.25 GHz base clock x 16 FP64 ops
>> / cycle = 4.6 TFLOPS.
>> This gives a total of 82.2 TFLOPS per DGX-A100.
>>
>> Here is my problem. For any system with DGX A100 on top500.org, the numbers
>> just don't add up. For example, Selene has 560 DGX boxes, but its theoretical
>> peak is listed as 79.2 PFLOPS, whereas I expect it should be 46 PFLOPS (i.e.,
>> 82.2 TFLOPS x 560). The same is true for any other DGX-based system listed
>> on top500. What am I missing here?
>>
>> Thanks!
>>
>> Harsh Hemani
>> _______________________________________________
>> Beowulf mailing list, [email protected] sponsored by Penguin Computing
>> To change your subscription (digest mode or unsubscribe) visit
>> https://beowulf.org/cgi-bin/mailman/listinfo/beowulf
>>
>
