In an effort to get this thread back on track, I tried implementing cos(_:) in pure generic Swift code, using the BinaryFloatingPoint protocol. It deviates from the _cos(_:) intrinsic by no more than 5.26362703423544e-11. For some reason, adding more terms to the approximation costs only a small performance penalty.
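A minimal sketch of how a deviation figure like the one above could be checked. It restates the first-quadrant polynomial from the implementation further down for Double only, and compares it with the platform cos() on an evenly spaced grid over [0, pi/2]. The names cosFirstQuadrant and maxDeviation, the grid, and the sample count are assumptions for illustration, not the method used to get the number quoted above.

```swift
import Foundation

// Hypothetical Double-only restatement of the first-quadrant polynomial;
// the coefficients are copied from the implementation later in this message.
func cosFirstQuadrant(_ x: Double) -> Double {
    let x2 = x * x
    var y = -0.0000000000114707451267755432394
    // Horner evaluation in powers of x^2, highest-order coefficient first.
    for c in [0.000000002087675698165412591559,
              -0.000000275573192239332256421489,
              0.00002480158730158702330045157,
              -0.00138888888888888880310186415,
              0.04166666666666666665319411988,
              -0.4999999999999999999991637437,
              0.9999999999999999999999914771] {
        y = x2 * y + c
    }
    return y
}

// Largest absolute difference from the platform cos() on an evenly
// spaced grid over [0, pi/2]. The grid is an assumption of this sketch.
func maxDeviation(samples: Int) -> Double {
    var worst = 0.0
    for i in 0...samples {
        let x = Double(i) / Double(samples) * (Double.pi / 2)
        worst = max(worst, abs(cosFirstQuadrant(x) - cos(x)))
    }
    return worst
}

print(maxDeviation(samples: 100_000))
```

A finer grid, or a sweep over every representable Float in the interval, would give a tighter bound than this coarse scan.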
To make the benchmarks fair, and to explore the idea of distributing a Math module without killing people on the cross-module optimization boundary, I enabled some of the unsafe compiler attributes. All of these benchmarks are cross-module calls, as if the math module were downloaded as a dependency through SPM.

== Relative execution time (lower is better) ==

llvm intrinsic                   :  3.133
glibc cos()                      :  3.124
no attributes                    : 43.675
with specialization              :  4.162
with inlining                    :  3.108
with inlining and specialization :  3.264

As you can see, the pure Swift generic implementation narrowly beats the compiler intrinsic (and the glibc cos(), but I guess they’re the same thing) when inlining is used. For some reason, though, generic specialization and inlining don’t get along very well: enabling both is slower than inlining alone.

Here’s the source implementation. It uses a Taylor series (!), which probably isn’t optimal, but it does prove that cos() and sin() can be implemented as generics in pure Swift, be distributed as a module outside the stdlib, and still achieve performance competitive with the llvm intrinsics.
    @_inlineable
    //@_specialize(where F == Float)
    //@_specialize(where F == Double)
    public func cos<F>(_ x:F) -> F where F:BinaryFloatingPoint {
        // remainder(dividingBy:) yields a value in [-pi, pi], so x ends up in [0, pi]
        let x:F = abs(x.remainder(dividingBy: 2 * F.pi)),
            quadrant:Int = Int(x * (2 / F.pi))

        switch quadrant {
        case 0:
            return  cos(on_first_quadrant: x)
        case 1:
            return -cos(on_first_quadrant: F.pi - x)
        case 2:
            return -cos(on_first_quadrant: x - F.pi)
        case 3:
            // cos is positive in the fourth quadrant; not reached in
            // practice, since x is already reduced to [0, pi] above
            return  cos(on_first_quadrant: 2 * F.pi - x)
        default:
            fatalError("unreachable")
        }
    }

    @_versioned
    @_inlineable
    //@_specialize(where F == Float)
    //@_specialize(where F == Double)
    func cos<F>(on_first_quadrant x:F) -> F where F:BinaryFloatingPoint {
        let x2:F = x * x
        // Horner evaluation in powers of x^2, highest-order coefficient first
        var y:F = -0.0000000000114707451267755432394
        for c:F in [0.000000002087675698165412591559,
                    -0.000000275573192239332256421489,
                    0.00002480158730158702330045157,
                    -0.00138888888888888880310186415,
                    0.04166666666666666665319411988,
                    -0.4999999999999999999991637437,
                    0.9999999999999999999999914771
                    ] {
            y = x2 * y + c
        }
        return y
    }

On Thu, Aug 3, 2017 at 7:04 AM, Stephen Canon via swift-evolution <swift-evolution@swift.org> wrote:

> On Aug 2, 2017, at 7:03 PM, Karl Wagner via swift-evolution <swift-evolution@swift.org> wrote:
>
>> It’s important to remember that computers are mathematical machines, and
>> some functions which are implemented in hardware on essentially every
>> platform (like sin/cos/etc) are definitely best implemented as compiler
>> intrinsics.
>
> sin/cos/etc are implemented in software, not hardware. x86 does have the
> FSIN/FCOS instructions, but (almost) no one actually uses them to implement
> the sin( ) and cos( ) functions; they are a legacy curiosity, both too slow
> and too inaccurate for serious use today. There are no analogous
> instructions on ARM or PPC.
>
> – Steve
>
> _______________________________________________
> swift-evolution mailing list
> swift-evolution@swift.org
> https://lists.swift.org/mailman/listinfo/swift-evolution