>
> If there were a scenario where it made sense to use exceptions for
> internal control flow (but never being exposed in any public API),
> what would others think about that?

I would lean against this.  This seems like a slippery slop, but wouldn't
veto it if others felt strongly.

The other path forward is to not bother with trying to abort kernel
> execution and instead set an error Status on a FunctionContext object
> like we're doing currently for the Cast kernels (FTR this is also what
> Gandiva does). This is probably what I'll stick with for now.


This would be my preferred approach, but again I don't have a strong
technical argument, it is more the style that I'm accustomed to.

On Mon, May 4, 2020 at 3:09 PM Wes McKinney <wesmck...@gmail.com> wrote:

> While working on the kernels API, one thought that occurred to me is
> that it might be simpler (for kernel implementers) to use "private"
> exceptions (caught internally, never exposed to public API users) to
> interrupt kernel execution on errors. Currently we have kernels that
> can fail, but we only check for failures after the kernel has executed
> on the entire array rather than interrupting on error. I was curious
> what are the performance implications, but had a secondary concern
> about what would be our policy regarding using exceptions for error
> control flow.
>
> To motivate my interest, there are a couple of paths forward if we
> wanted to enable kernels to abort on their first error
>
> 1. Every kernel implementation can return a true/false value (e.g.
> 0/-1 or true/false) indicating success/failure
> 2. Exceptions can be thrown only in kernels that can fail (these
> exceptions must be caught at the call site and never allowed to be
> thrown publicly)
> 3. A different API calling convention can be developed for kernels
> that can fail / can't fail (worst option)
>
> I wrote an illustrative benchmark to help understand what I'm saying
>
>
> https://github.com/wesm/arrow/blob/internal-exc-control-flow/cpp/src/arrow/exc_benchmark.cc
>
> So the innermost loop of a kernel using exceptions would look something
> like
>
> struct AddNonZeroExc {
>   static double Call(double x, double y) {
>     if (ARROW_PREDICT_FALSE(y == 0)) {
>       throw std::runtime_error("addend was zero");
>     }
>     return x + y;
>   }
> };
>
> and then this would be called like
>
>   try {
>     for (int64_t i = 0; i < kLength; ++i) {
>       *out++ = Operator::Call(*x++, *y++);
>     }
>   } catch (const std::exception&) {
>     return false;
>   }
>   return true;
>
> and with return values for errors
>
> struct AddNonZeroRetval {
>   static int Call(double x, double y, double* out) {
>     if (ARROW_PREDICT_FALSE(y == 0)) {
>       return -1;
>     }
>     *out = x + y;
>     return 0;
>   }
> };
>
> and then this is called like
>
> template <typename Operator>
> static bool LoopRetval(const double* x, const double* y,
>                        double* out) {
>   for (int64_t i = 0; i < kLength; ++i) {
>     if (ARROW_PREDICT_FALSE(Operator::Call(*x++, *y++, out++) != 0)) {
>       return false;
>     }
>   }
>   return true;
> }
>
> If there were a scenario where it made sense to use exceptions for
> internal control flow (but never being exposed in any public API),
> what would others think about that?
>
> For what it's worth, the benchmarks show little performance
> difference, when there are no errors. For short-running kernels on
> small arrays, the cost of catching the exception can exceed the cost
> of running the kernel, as demonstrated below. So using an exception to
> interrupt execution in the unlikely error case does not seem to
> improve performance.
>
> 1024 elements (24KB processed memory)
>
> ----------------------------------------------------------------
> Benchmark                         Time           CPU Iterations
> ----------------------------------------------------------------
> BenchAddNoFailRetval            708 ns        708 ns    1990102
> 1.34701G items/s
> BenchAddNoFailExc               709 ns        709 ns    1987780
> 1.34462G items/s
> BenchAddFailRetval              716 ns        716 ns    1961577
> 1.33218G items/s
> BenchAddFailExc                1931 ns       1931 ns     728017
> 505.713M items/s
> BenchAddCantFailRetval          246 ns        246 ns    5702803
> 3.87169G items/s
> BenchAddCantFailNoRetval        249 ns        249 ns    5577482
> 3.83466G items/s
>
> 1M elements (24MB processed memory)
>
> ----------------------------------------------------------------
> Benchmark                         Time           CPU Iterations
> ----------------------------------------------------------------
> BenchAddNoFailRetval        1062696 ns    1062694 ns       1304
> 941.004M items/s
> BenchAddNoFailExc           1036151 ns    1036122 ns       1346
> 965.138M items/s
> BenchAddFailRetval          1049943 ns    1049922 ns       1310
> 952.451M items/s
> BenchAddFailExc             1049455 ns    1049439 ns       1331
> 952.89M items/s
> BenchAddCantFailRetval       813654 ns     813606 ns       1715
> 1.20029G items/s
> BenchAddCantFailNoRetval     807262 ns     807261 ns       1676
> 1.20972G items/s
>
> (Hopefully I didn't make mistakes in the benchmark)
>
> The other path forward is to not bother with trying to abort kernel
> execution and instead set an error Status on a FunctionContext object
> like we're doing currently for the Cast kernels (FTR this is also what
> Gandiva does). This is probably what I'll stick with for now.
>
> Thanks,
> Wes
>

Reply via email to