On Tue, May 5, 2020 at 11:08 PM Micah Kornfield <emkornfi...@gmail.com> wrote: > > > > > If there were a scenario where it made sense to use exceptions for > > internal control flow (but never being exposed in any public API), > > what would others think about that? > > I would lean against this. This seems like a slippery slop, but wouldn't > veto it if others felt strongly.
I think this is my position in the end also (if this comes up again). > The other path forward is to not bother with trying to abort kernel > > execution and instead set an error Status on a FunctionContext object > > like we're doing currently for the Cast kernels (FTR this is also what > > Gandiva does). This is probably what I'll stick with for now. > > > This would be my preferred approach, but again I don't have a strong > technical argument, it is more the style that I'm accustomed to. > > On Mon, May 4, 2020 at 3:09 PM Wes McKinney <wesmck...@gmail.com> wrote: > > > While working on the kernels API, one thought that occurred to me is > > that it might be simpler (for kernel implementers) to use "private" > > exceptions (caught internally, never exposed to public API users) to > > interrupt kernel execution on errors. Currently we have kernels that > > can fail, but we only check for failures after the kernel has executed > > on the entire array rather than interrupting on error. I was curious > > what are the performance implications, but had a secondary concern > > about what would be our policy regarding using exceptions for error > > control flow. > > > > To motivate my interest, there are a couple of paths forward if we > > wanted to enable kernels to abort on their first error > > > > 1. Every kernel implementation can return a true/false value (e.g. > > 0/-1 or true/false) indicating success/failure > > 2. Exceptions can be thrown only in kernels that can fail (these > > exceptions must be caught at the call site and never allowed to be > > thrown publicly) > > 3. A different API calling convention can be developed for kernels > > that can fail / can't fail (worst option) > > > > I wrote an illustrative benchmark to help understand what I'm saying > > > > > > https://github.com/wesm/arrow/blob/internal-exc-control-flow/cpp/src/arrow/exc_benchmark.cc > > > > So the innermost loop of a kernel using exceptions would look something > > like > > > > struct AddNonZeroExc { > > static double Call(double x, double y) { > > if (ARROW_PREDICT_FALSE(y == 0)) { > > throw std::runtime_error("addend was zero"); > > } > > return x + y; > > } > > }; > > > > and then this would be called like > > > > try { > > for (int64_t i = 0; i < kLength; ++i) { > > *out++ = Operator::Call(*x++, *y++); > > } > > } catch (const std::exception&) { > > return false; > > } > > return true; > > > > and with return values for errors > > > > struct AddNonZeroRetval { > > static int Call(double x, double y, double* out) { > > if (ARROW_PREDICT_FALSE(y == 0)) { > > return -1; > > } > > *out = x + y; > > return 0; > > } > > }; > > > > and then this is called like > > > > template <typename Operator> > > static bool LoopRetval(const double* x, const double* y, > > double* out) { > > for (int64_t i = 0; i < kLength; ++i) { > > if (ARROW_PREDICT_FALSE(Operator::Call(*x++, *y++, out++) != 0)) { > > return false; > > } > > } > > return true; > > } > > > > If there were a scenario where it made sense to use exceptions for > > internal control flow (but never being exposed in any public API), > > what would others think about that? > > > > For what it's worth, the benchmarks show little performance > > difference, when there are no errors. For short-running kernels on > > small arrays, the cost of catching the exception can exceed the cost > > of running the kernel, as demonstrated below. So using an exception to > > interrupt execution in the unlikely error case does not seem to > > improve performance. > > > > 1024 elements (24KB processed memory) > > > > ---------------------------------------------------------------- > > Benchmark Time CPU Iterations > > ---------------------------------------------------------------- > > BenchAddNoFailRetval 708 ns 708 ns 1990102 > > 1.34701G items/s > > BenchAddNoFailExc 709 ns 709 ns 1987780 > > 1.34462G items/s > > BenchAddFailRetval 716 ns 716 ns 1961577 > > 1.33218G items/s > > BenchAddFailExc 1931 ns 1931 ns 728017 > > 505.713M items/s > > BenchAddCantFailRetval 246 ns 246 ns 5702803 > > 3.87169G items/s > > BenchAddCantFailNoRetval 249 ns 249 ns 5577482 > > 3.83466G items/s > > > > 1M elements (24MB processed memory) > > > > ---------------------------------------------------------------- > > Benchmark Time CPU Iterations > > ---------------------------------------------------------------- > > BenchAddNoFailRetval 1062696 ns 1062694 ns 1304 > > 941.004M items/s > > BenchAddNoFailExc 1036151 ns 1036122 ns 1346 > > 965.138M items/s > > BenchAddFailRetval 1049943 ns 1049922 ns 1310 > > 952.451M items/s > > BenchAddFailExc 1049455 ns 1049439 ns 1331 > > 952.89M items/s > > BenchAddCantFailRetval 813654 ns 813606 ns 1715 > > 1.20029G items/s > > BenchAddCantFailNoRetval 807262 ns 807261 ns 1676 > > 1.20972G items/s > > > > (Hopefully I didn't make mistakes in the benchmark) > > > > The other path forward is to not bother with trying to abort kernel > > execution and instead set an error Status on a FunctionContext object > > like we're doing currently for the Cast kernels (FTR this is also what > > Gandiva does). This is probably what I'll stick with for now. > > > > Thanks, > > Wes > >