I'd in general expect good C code to optimize a little better; and in particular, decomposing an IEEE float is almost certainly more expensive in Haskell than in C, because unions let you cheat. (And I recall looking at the implementation of decodeFloat once; it's significantly longer than that C.) But I have to wonder if that code would be better done with something more native; the implementation may be a portable default, and you might be able to find something x86-specific that is faster.
On Tue, Mar 6, 2018 at 5:35 AM, Mateusz Kowalczyk <fuuze...@fuuzetsu.co.uk> wrote: > On 03/05/2018 10:23 PM, Brandon Allbery wrote: > > If the FFI version is done with "safe", consider using "unsafe" instead. > > There are technical reasons why this is slightly incorrect, but unless > > you're fiddling with the CPU's FP control flags they're mostly irrelevant > > and you can treat isNaN as pure and non-side-effectful, significantly > > reducing the overhead. You may also be able to use "ccall" to take > > advantage of C compiler level optimizations, or simply to directly > invoke a > > CPU-based test with asm(); but you'll need to hide that in a C > preprocessor > > #define, so that it looks syntactically like a function call to the FFI. > > > > (One of the technical reasons is that various OSes have been known to > > introduce bugs in their FP register and state handling across system > calls, > > in which case the "safe" version may turn "complete FP chaos" into merely > > "wrong answer". It's your call whether, or which side, of this bothers > you.) > > Perhaps I was a little unclear. The FFI-using isDoubleNaN is something > GHC does! > > ``` > libraries/base/GHC/Float.hs:foreign import ccall unsafe "isDoubleNaN" > isDoubleNaN :: Double -> Int > ``` > ``` > HsInt > isDoubleNaN(HsDouble d) > { > union stg_ieee754_dbl u; > > u.d = d; > > return ( > u.ieee.exponent == 2047 /* 2^11 - 1 */ && /* Is the exponent all > ones? */ > (u.ieee.mantissa0 != 0 || u.ieee.mantissa1 != 0) > /* and the mantissa non-zero? */ > ); > } > ``` > > My question is whether it could do better by not doing FFI and instead > computing natively and if not, why not? > > > On Mon, Mar 5, 2018 at 4:53 PM, Mateusz Kowalczyk < > fuuze...@fuuzetsu.co.uk> > > wrote: > > > >> Hi, > >> > >> Recently at a client I was profiling some code and isDoubleNaN lit up. > >> We were checking a lot of doubles for NaN as that's what customer would > >> send in. > >> > >> I went to investigate and I found that FFI is used to achieve this. I > >> was always under the impression that FFI costs a little. I had at the > >> time replaced the code with a hack with great results: > >> > >> ``` > >> isNaN' :: Double -> Bool > >> isNaN' d = d /= d > >> ``` > >> > >> While this worked and provided good speedup in my case, this fails > >> catastrophically if the program is compiled with -ffast-math. This is > >> expected. I have since reverted it. Seeking an alternative solution I > >> have thought about re-implementing the C code with a native Haskell > >> version: after all it just checks a few bits. Apparently unsafeCoerce# > >> and friends were a big no-no but I found > >> https://phabricator.haskell.org/D3358 . I have implemented the code at > >> the bottom of this post. Obviously it's missing endianness (compile-time > >> switch). > >> > >> This seems to be faster for smaller `mkInput` list than Prelude.isNaN > >> but slower slightly on the one below. The `/=` version is the fastest > >> but very fragile. > >> > >> My question to you all is whether implementing a version of this > >> function in Haskell makes sense and if not, why not? The > >> stgDoubleToWord64 is implemented in CMM and I don't know anything about > >> the costs of that. > >> > >> * Is there a cheaper alternative to FFI way? > >> * If yes, does anyone know how to write it such that it compiles to same > >> code but without the call overhead? I must have failed below as it's > >> slower on some inputs. > >> > >> Basically if a faster way exists for isNaN, something I have to do a > >> lot, I'd love to hear about it. > >> > >> I leave you with basic code I managed to come up with. 8.4.x only. > >> > >> > >> ``` > >> {-# LANGUAGE MagicHash #-} > >> {-# OPTIONS_GHC -O2 -ddump-simpl -ddump-stg -ddump-to-file -ddump-asm > #-} > >> module Main (main) where > >> > >> import GHC.Float > >> import GHC.Prim > >> > >> isNaN' :: Double -> Bool > >> isNaN' d = d /= d > >> > >> isNaNBits :: Double -> Bool > >> isNaNBits (D# d) = case (bits `and#` expMask) `eqWord#` expMask of > >> 1# -> case bits `and#` mantissaMask of > >> 0## -> False > >> _ -> True > >> _ -> False > >> where > >> bits :: Word# > >> bits = stgDoubleToWord64 d > >> > >> expMask, mantissaMask :: Word# > >> expMask = 0x7FF0000000000000## > >> mantissaMask = 0x000FFFFFFFFFFFFF## > >> > >> main :: IO () > >> main = sumFilter isNaN {-isNaN'-} {-isNaNBits-} (mkInput 100000000) > >> `seq` pure () > >> where > >> nan :: Double > >> nan = log (-1) > >> > >> mkInput :: Int -> [Double] > >> mkInput n = take n $ cycle [1, nan] > >> > >> sumFilter :: (Double -> Bool) -> [Double] -> Double > >> sumFilter p = Prelude.sum . Prelude.filter (not . p) > >> ``` > >> > >> -- > >> Mateusz K. > >> _______________________________________________ > >> ghc-devs mailing list > >> ghc-devs@haskell.org > >> http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs > >> > > > > > > > > > -- > Mateusz K. > -- brandon s allbery kf8nh sine nomine associates allber...@gmail.com ballb...@sinenomine.net unix, openafs, kerberos, infrastructure, xmonad http://sinenomine.net
_______________________________________________ ghc-devs mailing list ghc-devs@haskell.org http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs