I think even if it's not (easily) generalizable across languages, it'd
still be a win for C++ (and hopefully languages that bind to
C++). Also, I don't think they're meant to completely replace
language-specific tests, but rather complement them, and make it
easier to add and maintain tests in the overwhelmingly common case.

I do feel it's somewhat painful to write these kinds of tests in C++,
largely because of the iteration time and the difficulty of repeating
tests across various configurations. I also think this could be an
opportunity to leverage things like Hypothesis/property-based testing
or perhaps fuzzing to make the kernels even more robust.
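
As a rough sketch of what a text-driven test could look like (Python here only for quick iteration; the KERNELS table, the "name(args) -> expected" line format, and the helper names are all made up for illustration and are not part of Arrow):

```python
import ast
import operator

# Hypothetical stand-ins for real compute functions (assumption, not Arrow API).
KERNELS = {
    "add": operator.add,
    "multiply": operator.mul,
}


def parse_case(line):
    """Parse 'name(arg, ...) -> expected' into (name, args, expected)."""
    call, expected = line.split("->", 1)
    name, _, arg_text = call.strip().partition("(")
    arg_text = arg_text.strip()
    if not arg_text.endswith(")"):
        raise ValueError(f"malformed call: {line!r}")
    # Wrap the argument text in a tuple literal so literal_eval yields a tuple.
    args = ast.literal_eval("(" + arg_text[:-1] + ",)")
    return name.strip(), args, ast.literal_eval(expected.strip())


def run_cases(text):
    """Run every case in `text`; return a list of (line, actual) failures."""
    failures = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        name, args, expected = parse_case(line)
        actual = KERNELS[name](*args)
        if actual != expected:
            failures.append((line, actual))
    return failures


CASES = """
# function(args) -> expected
add(1, 2) -> 3
multiply(3, 4) -> 12
add(1, -2) -> -1
"""

if __name__ == "__main__":
    print(run_cases(CASES) or "all cases passed")
```

The same case file could in principle be fed to a runner in each implementation language, and repeating it across configurations would just be a loop around run_cases.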

-David

On 2021/05/14 18:09:45, Eduardo Ponce <edponc...@gmail.com> wrote: 
> Another aspect to keep in mind is that some tests require internal options
> to be changed before executing the compute functions (e.g., check overflow,
> allow NaN comparisons, change validity bits, etc.). Also, there are tests
> that take randomized inputs and others that use the min/max values for
> each specific data type. Certainly, these details can be generalized across
> languages/testing frameworks but not without careful treatment.
> 
> Moreover, each language implementation still needs to test
> language-specific or internal functions, so having a meta test framework
> will not necessarily get rid of language-specific tests.
> 
> ~Eduardo
> 
> On Fri, May 14, 2021 at 1:56 PM Weston Pace <weston.p...@gmail.com> wrote:
> 
> > With that in mind it seems the somewhat recurring discussion on coming
> > up with a language independent standard for logical query plans
> > (
> > https://lists.apache.org/thread.html/rfab15e09c97a8fb961d6c5db8b2093824c58d11a51981a40f40cc2c0%40%3Cdev.arrow.apache.org%3E
> > )
> > would be relevant.  Each test case would then be a triplet of (Input
> > Dataframe, Logical Plan, Output Dataframe).  So perhaps tackling this
> > effort would make progress on both fronts.
> >
> > On Fri, May 14, 2021 at 7:39 AM Julian Hyde <jhyde.apa...@gmail.com>
> > wrote:
> > >
> > > Do any of these compute functions have analogs in other
> > implementations of Arrow (e.g. Rust)?
> > >
> > > I believe that as much as possible of Arrow’s compute functionality
> > should be cross-language. Perhaps there are language-specific differences
> > in how functions are invoked, but the basic functionality is the same.
> > >
> > > If people buy into that premise, then a single suite of tests is a
> > powerful way to make that happen. The tests can be written in a high-level
> > language and can generate tests in each implementation language. (For these
> > purposes, the “high-level language” could be a special text format, could
> > be a data language such as JSON, or could be a programming language such as
> > Python; it doesn’t matter much.)
> > >
> > > For example,
> > >
> > >   assertThatCall("foo(1, 2)", returns("3"))
> > >
> > > might actually call foo with arguments 1 and 2, or it might generate a
> > C++ or Rust test that does the same.
> > >
> > >
> > > Julian
> > >
> > >
> > > > On May 14, 2021, at 8:45 AM, Antoine Pitrou <anto...@python.org>
> > wrote:
> > > >
> > > >
> > > > Le 14/05/2021 à 15:30, Wes McKinney a écrit :
> > > >> hi folks,
> > > >> As we build more functions (kernels) in the project, I note that the
> > > >> amount of hand-coded C++ code relating to testing function correctness
> > > >> is growing significantly. Many of these tests are quite simple and
> > > >> could be expressed in a text format that can be parsed and evaluated.
> > > >> Thoughts about building something like that to make it easier to write
> > > >> functional correctness tests?
> > > >
> > > > Or perhaps build up higher-level C++ functions if desired?
> > > >
> > > > Or even write some of those tests as part of the PyArrow test suite.
> > > >
> > > > I'm not sure adding a custom (and probably inflexible) text format is
> > really a good use of our time.
> > > >
> > > > Regards
> > > >
> > > > Antoine.
> > >
> >
> 
