Yes, these soft-float math routines (in libm.so) make the Arm binary extremely slow.

-----Original Message-----
From: Antoine Pitrou <anto...@python.org>
Sent: Thursday, April 22, 2021 17:20
To: dev@arrow.apache.org
Subject: Re: [C++] Indeterminate poor performance of random number generator
On 22/04/2021 at 03:38, Yibo Cai wrote:
>
> Both use the same libstdc++.
> But std::bernoulli_distribution is inlined, so they are indeed different for
> clang and gcc.
> https://godbolt.org/z/aT84x5Yec
> Looks like a pure compiler thing.

It looks like clang generates calls to logl() and __divtf3() (soft-float long double division) inside the loop.

Perhaps that can be avoided by reimplementing the Bernoulli distribution. If we don't care too much about accuracy and extreme probability values (very close to 0 or 1), that should be relatively easy.

Regards

Antoine.
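
To illustrate the reimplementation suggested above, here is a minimal sketch of a Bernoulli distribution that stays entirely in integer arithmetic on the hot path, so no long double routines such as logl() or __divtf3() are needed inside the loop. The class name SimpleBernoulli and the 32-bit threshold scheme are assumptions made for illustration, not what Arrow or libstdc++ actually does. As noted above, it trades accuracy for speed: the probability is quantized to multiples of 2^-32, so values extremely close to 0 or 1 are not represented faithfully.

```cpp
#include <cstdint>
#include <random>

// Hypothetical helper (not Arrow's or libstdc++'s implementation).
// Draws true with probability approximately p by comparing 32 random bits
// against a precomputed integer threshold.
class SimpleBernoulli {
 public:
  explicit SimpleBernoulli(double p) {
    constexpr std::uint64_t kRange = std::uint64_t{1} << 32;  // 2^32
    if (!(p > 0.0)) {
      threshold_ = 0;        // p <= 0 or NaN: never true
    } else if (p >= 1.0) {
      threshold_ = kRange;   // every 32-bit draw is below 2^32: always true
    } else {
      // Scaling by 2^32 is exact in double; the cast truncates toward zero.
      threshold_ = static_cast<std::uint64_t>(p * static_cast<double>(kRange));
    }
  }

  // Engine is assumed to produce values in [0, 2^k - 1] with k >= 32,
  // e.g. std::mt19937 or std::mt19937_64, so the low 32 bits are uniform.
  template <typename Engine>
  bool operator()(Engine& engine) const {
    static_assert(Engine::min() == 0, "engine must start at 0");
    static_assert((Engine::max() & 0xFFFFFFFFu) == 0xFFFFFFFFu,
                  "engine must provide at least 32 uniform low bits");
    const std::uint64_t draw = static_cast<std::uint32_t>(engine());
    return draw < threshold_;
  }

 private:
  std::uint64_t threshold_;
};
```

Usage would look like `std::mt19937 gen(seed); SimpleBernoulli dist(0.3); bool b = dist(gen);`. The only floating-point work is one double multiplication in the constructor, which happens outside the generation loop.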