I agree there are pros and cons here (an up-front investment that hopefully yields future productivity gains). If a test harness and format meeting the requirements could be created without too much pain (I'm thinking less than a week of full-time effort, including refactoring some existing tests), it would save developers a lot of time going forward. As one example to guide the effort, I would look at the large number of hand-coded functional tests found in this file:
https://github.com/apache/arrow/blob/master/cpp/src/arrow/compute/kernels/scalar_arithmetic_test.cc

On Sat, May 15, 2021 at 5:02 AM Antoine Pitrou <anto...@python.org> wrote:
>
> I think people who think this would be beneficial should try to devise a
> text format to represent compute test data. As Eduardo pointed out,
> there are various complications that need to be catered for.
>
> To me, it's not obvious that building the necessary infrastructure in
> C++ to ingest that text format will be more pleasant than our current
> way of writing tests. As a data point, the JSON integration code in C++
> is really annoying to maintain.
>
> Regards
>
> Antoine.
>
>
> On 15/05/2021 00:03, Wes McKinney wrote:
> > In C++, we have the "ArrayFromJSON" function which is an even simpler
> > way of specifying input data compared with the integration tests.
> > That's one possible starting point.
> >
> > The "interpreted tests" could be all specified and driven by minimal
> > dependency Python code, as one possible way to approach things.
> >
> > On Fri, May 14, 2021 at 1:57 PM Jorge Cardoso Leitão
> > <jorgecarlei...@gmail.com> wrote:
> >>
> >> Hi,
> >>
> >> (this problem also exists in Rust, btw)
> >>
> >> Couldn't we use something like we do for our integration tests? Create a
> >> separate binary that would allow us to call e.g.
> >>
> >> test-compute --method equal --json-file <path> --arg "column1" --arg
> >> "column2" --expected "column3"
> >>
> >> (or simply pass the input via stdin)
> >>
> >> and then use Python to call the binary?
> >>
> >> The advantage I see here is that we would compile the binary with flags to
> >> disable unnecessary code, use debug, etc., thereby reducing compile time
> >> if the kernel needs to be changed.
> >>
> >> IMO our "json integration" format is a reliable way of passing data across;
> >> it is very easy to read and write, and all our implementations can already
> >> read it for integration tests.
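To make this concrete, here is a minimal sketch of how a Python harness might turn a JSON-described test case into an invocation of the proposed binary. The `test-compute` name, its flags, and the case layout are all hypothetical, taken from the command line sketched above:

```python
import json

# A hypothetical test case in the spirit of the proposal above: the JSON
# integration format would carry the columns, and the harness would point
# the (hypothetical) test-compute binary at them.
case = {
    "method": "equal",
    "json_file": "cases/equal_int32.json",
    "args": ["column1", "column2"],
    "expected": "column3",
}

def build_invocation(case):
    """Build the argv for the proposed test-compute binary from a test case."""
    argv = ["test-compute", "--method", case["method"],
            "--json-file", case["json_file"]]
    for arg in case["args"]:
        argv += ["--arg", arg]
    argv += ["--expected", case["expected"]]
    return argv

print(build_invocation(case))
```

A driver would then hand each argv to `subprocess.run` (or stream the case over stdin) and report mismatches, keeping all the orchestration out of the compiled binary.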
> >>
> >> wrt the "cross implementation" aspect, the equality operation seems a
> >> natural candidate for cross-implementation checks, as that one has
> >> important implications in all our integration tests. filter, take, slice,
> >> and boolean ops may also be easy to agree upon. "add" and the like are a
> >> bit more difficult due to how overflow should be handled (abort vs
> >> saturate vs None), but nothing that we can't tackle. ^_^
> >>
> >> Best,
> >> Jorge
> >>
> >> On Fri, May 14, 2021 at 8:25 PM David Li <lidav...@apache.org> wrote:
> >>
> >>> I think even if it's not (easily) generalizable across languages, it'd
> >>> still be a win for C++ (and hopefully languages that bind to
> >>> C++). Also, I don't think they're meant to completely replace
> >>> language-specific tests, but rather complement them, and make it
> >>> easier to add and maintain tests in the overwhelmingly common case.
> >>>
> >>> I do feel it's somewhat painful to write these kinds of tests in C++,
> >>> largely because of the iteration time and the difficulty of repeating
> >>> tests across various configurations. I also think this could be an
> >>> opportunity to leverage things like Hypothesis/property-based testing
> >>> or perhaps fuzzing to make the kernels even more robust.
> >>>
> >>> -David
> >>>
> >>> On 2021/05/14 18:09:45, Eduardo Ponce <edponc...@gmail.com> wrote:
> >>>> Another aspect to keep in mind is that some tests require internal
> >>>> options to be changed before executing the compute functions (e.g.,
> >>>> check overflow, allow NaN comparisons, change validity bits, etc.).
> >>>> Also, there are tests that take randomized inputs, and others make use
> >>>> of the min/max values for each specific data type. Certainly, these
> >>>> details can be generalized across languages/testing frameworks, but
> >>>> not without careful treatment.
> >>>>
> >>>> Moreover, each language implementation still needs to test
> >>>> language-specific or internal functions, so having a meta test
> >>>> framework will not necessarily get rid of language-specific tests.
> >>>>
> >>>> ~Eduardo
> >>>>
> >>>> On Fri, May 14, 2021 at 1:56 PM Weston Pace <weston.p...@gmail.com>
> >>>> wrote:
> >>>>
> >>>>> With that in mind, it seems the somewhat recurring discussion on
> >>>>> coming up with a language-independent standard for logical query plans
> >>>>> (
> >>>>> https://lists.apache.org/thread.html/rfab15e09c97a8fb961d6c5db8b2093824c58d11a51981a40f40cc2c0%40%3Cdev.arrow.apache.org%3E
> >>>>> )
> >>>>> would be relevant. Each test case would then be a triplet of (Input
> >>>>> Dataframe, Logical Plan, Output Dataframe). So perhaps tackling this
> >>>>> effort would be a way to make progress on both fronts.
> >>>>>
> >>>>> On Fri, May 14, 2021 at 7:39 AM Julian Hyde <jhyde.apa...@gmail.com>
> >>>>> wrote:
> >>>>>>
> >>>>>> Do any of these compute functions have analogs in other
> >>>>>> implementations of Arrow (e.g. Rust)?
> >>>>>>
> >>>>>> I believe that as much as possible of Arrow's compute functionality
> >>>>>> should be cross-language. Perhaps there are language-specific
> >>>>>> differences in how functions are invoked, but the basic functionality
> >>>>>> is the same.
> >>>>>>
> >>>>>> If people buy into that premise, then a single suite of tests is a
> >>>>>> powerful way to make that happen. The tests can be written in a
> >>>>>> high-level language and can generate tests in each implementation
> >>>>>> language. (For these purposes, the "high-level language" could be a
> >>>>>> special text format, could be a data language such as JSON, or could
> >>>>>> be a programming language such as Python; it doesn't matter much.)
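The (Input Dataframe, Logical Plan, Output Dataframe) triplet could be prototyped with plain JSON and a toy interpreter. The plan grammar and column encoding below are invented purely for illustration and are not any existing Arrow format:

```python
import json

# A sketch of the triplet idea: each test case pairs input columns and an
# expected output with a (highly simplified) logical plan.
case = json.loads("""
{
  "input": {"a": [1, 2, 3], "b": [3, 2, 1]},
  "plan": [{"op": "equal", "args": ["a", "b"], "as": "eq"}],
  "output": {"eq": [false, true, false]}
}
""")

def run_plan(columns, plan):
    """Evaluate the toy plan against named columns, element-wise."""
    cols = dict(columns)
    for step in plan:
        if step["op"] == "equal":
            x, y = (cols[name] for name in step["args"])
            cols[step["as"]] = [u == v for u, v in zip(x, y)]
        else:
            raise ValueError(f"unknown op: {step['op']}")
    return cols

# Check every expected output column against the interpreter's result.
for name, expected in case["output"].items():
    assert run_plan(case["input"], case["plan"])[name] == expected
```

The same case file could then be replayed against each implementation's real plan executor, with this reference interpreter serving only as a cross-check.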
> >>>>>>
> >>>>>> For example,
> >>>>>>
> >>>>>> assertThatCall("foo(1, 2)", returns("3"))
> >>>>>>
> >>>>>> might actually call foo with arguments 1 and 2, or it might generate
> >>>>>> a C++ or Rust test that does the same.
> >>>>>>
> >>>>>>
> >>>>>> Julian
> >>>>>>
> >>>>>>
> >>>>>>> On May 14, 2021, at 8:45 AM, Antoine Pitrou <anto...@python.org>
> >>>>>>> wrote:
> >>>>>>>
> >>>>>>>
> >>>>>>> On 14/05/2021 15:30, Wes McKinney wrote:
> >>>>>>>> hi folks,
> >>>>>>>>
> >>>>>>>> As we build more functions (kernels) in the project, I note that
> >>>>>>>> the amount of hand-coded C++ code relating to testing function
> >>>>>>>> correctness is growing significantly. Many of these tests are
> >>>>>>>> quite simple and could be expressed in a text format that can be
> >>>>>>>> parsed and evaluated. Thoughts about building something like that
> >>>>>>>> to make it easier to write functional correctness tests?
> >>>>>>>
> >>>>>>> Or perhaps build up higher-level C++ functions if desired?
> >>>>>>>
> >>>>>>> Or even write some of those tests as part of the PyArrow test
> >>>>>>> suite.
> >>>>>>>
> >>>>>>> I'm not sure adding a custom (and probably inflexible) text format
> >>>>>>> is really a good use of our time.
> >>>>>>>
> >>>>>>> Regards
> >>>>>>>
> >>>>>>> Antoine.
> >>>>>>
> >>>>>
> >>>>
> >>>
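As a rough sketch of the "interpreted" flavor of the assertThatCall idea: a driver could parse the call text, evaluate it against a registry of compute functions, and compare the result with the expected value. The registry contents and helper names here are hypothetical; a generator backend could instead emit a C++ or Rust test from the same two strings:

```python
import ast

# Stand-in compute functions; a real registry would dispatch to kernels.
REGISTRY = {"add": lambda a, b: a + b}

def assert_that_call(call_text, expected_text):
    """Parse e.g. 'add(1, 2)', run it against the registry, check the result."""
    node = ast.parse(call_text, mode="eval").body
    func = REGISTRY[node.func.id]
    args = [ast.literal_eval(a) for a in node.args]
    result = func(*args)
    expected = ast.literal_eval(expected_text)
    assert result == expected, f"{call_text} -> {result!r}, expected {expected!r}"
    return result

assert_that_call("add(1, 2)", "3")
```

Because both arguments are plain text, the same test corpus could drive this interpreter, a code generator, or a subprocess harness without modification.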