While working on the kernels API, one thought that occurred to me is that it might be simpler (for kernel implementers) to use "private" exceptions (caught internally, never exposed to public API users) to interrupt kernel execution on errors. Currently we have kernels that can fail, but we only check for failures after the kernel has executed on the entire array rather than interrupting on error. I was curious what the performance implications are, but also had a secondary concern about what our policy would be regarding using exceptions for error control flow.
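To make that "check after the fact" pattern concrete, here is a minimal sketch; the Ctx struct and the names in it are hypothetical stand-ins for illustration, not the actual FunctionContext API:

#include <cstdint>
#include <string>

// Hypothetical error-recording context (a stand-in for something like
// FunctionContext): it just remembers the first error that occurred.
struct Ctx {
  bool has_error = false;
  std::string message;

  void SetError(const std::string& msg) {
    if (!has_error) {
      has_error = true;
      message = msg;
    }
  }
};

// The kernel always runs over the entire array; an error is only recorded
// in the context, and the caller checks ctx->has_error once afterwards.
static void AddNonZero(Ctx* ctx, const double* x, const double* y,
                       double* out, int64_t length) {
  for (int64_t i = 0; i < length; ++i) {
    if (y[i] == 0) {
      ctx->SetError("addend was zero");
    }
    out[i] = x[i] + y[i];
  }
}

The downside is that the loop keeps doing work after the first error, which is what got me thinking about interrupting execution instead.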
To motivate my interest, there are a few paths forward if we wanted to enable kernels to abort on their first error:

1. Every kernel implementation can return a success/failure value (e.g. 0/-1 or true/false)
2. Exceptions can be thrown only in kernels that can fail (these exceptions must be caught at the call site and never allowed to propagate publicly)
3. A different API calling convention can be developed for kernels that can fail vs. kernels that can't fail (worst option)

I wrote an illustrative benchmark to help show what I mean:

https://github.com/wesm/arrow/blob/internal-exc-control-flow/cpp/src/arrow/exc_benchmark.cc

So the innermost loop of a kernel using exceptions would look something like

struct AddNonZeroExc {
  static double Call(double x, double y) {
    if (ARROW_PREDICT_FALSE(y == 0)) {
      throw std::runtime_error("addend was zero");
    }
    return x + y;
  }
};

and then this would be called like

try {
  for (int64_t i = 0; i < kLength; ++i) {
    *out++ = Operator::Call(*x++, *y++);
  }
} catch (const std::exception&) {
  return false;
}
return true;

With return values for errors, the kernel would instead be

struct AddNonZeroRetval {
  static int Call(double x, double y, double* out) {
    if (ARROW_PREDICT_FALSE(y == 0)) {
      return -1;
    }
    *out = x + y;
    return 0;
  }
};

and then this is called like

template <typename Operator>
static bool LoopRetval(const double* x, const double* y, double* out) {
  for (int64_t i = 0; i < kLength; ++i) {
    if (ARROW_PREDICT_FALSE(Operator::Call(*x++, *y++, out++) != 0)) {
      return false;
    }
  }
  return true;
}

If there were a scenario where it made sense to use exceptions for internal control flow (never exposed in any public API), what would others think about that?

For what it's worth, the benchmarks show little performance difference when there are no errors. For short-running kernels on small arrays, the cost of catching the exception can exceed the cost of running the kernel itself, as demonstrated below. So using an exception to interrupt execution in the unlikely error case does not seem to improve performance.

1024 elements (24KB processed memory)

----------------------------------------------------------------------------
Benchmark                         Time           CPU  Iterations
----------------------------------------------------------------------------
BenchAddNoFailRetval            708 ns        708 ns     1990102   1.34701G items/s
BenchAddNoFailExc               709 ns        709 ns     1987780   1.34462G items/s
BenchAddFailRetval              716 ns        716 ns     1961577   1.33218G items/s
BenchAddFailExc                1931 ns       1931 ns      728017   505.713M items/s
BenchAddCantFailRetval          246 ns        246 ns     5702803   3.87169G items/s
BenchAddCantFailNoRetval        249 ns        249 ns     5577482   3.83466G items/s

1M elements (24MB processed memory)

----------------------------------------------------------------------------
Benchmark                         Time           CPU  Iterations
----------------------------------------------------------------------------
BenchAddNoFailRetval        1062696 ns    1062694 ns        1304   941.004M items/s
BenchAddNoFailExc           1036151 ns    1036122 ns        1346   965.138M items/s
BenchAddFailRetval          1049943 ns    1049922 ns        1310   952.451M items/s
BenchAddFailExc             1049455 ns    1049439 ns        1331   952.89M items/s
BenchAddCantFailRetval       813654 ns     813606 ns        1715   1.20029G items/s
BenchAddCantFailNoRetval     807262 ns     807261 ns        1676   1.20972G items/s

(Hopefully I didn't make mistakes in the benchmark)

The other path forward is not to bother with trying to abort kernel execution at all, and instead set an error Status on a FunctionContext object like we're doing currently for the Cast kernels (FTR this is also what Gandiva does). This is probably what I'll stick with for now.

Thanks,
Wes