I agree there are pros and cons here (an up-front investment that hopefully yields future productivity gains). If a test harness and format meeting the requirements could be created without too much pain (I'm thinking less than a week of full-time effort, including refactoring some existing tests), it would save developers a lot of time going forward. As one example to guide the effort, I would look at the large number of hand-coded functional tests found in this file:
https://github.com/apache/arrow/blob/master/cpp/src/arrow/compute/kernels/scalar_arithmetic_test.cc

On Sat, May 15, 2021 at 5:02 AM Antoine Pitrou <anto...@python.org> wrote:
>
> I think people who think this would be beneficial should try to devise a
> text format to represent compute test data. As Eduardo pointed out,
> there are various complications that need to be catered for.
>
> To me, it's not obvious that building the necessary infrastructure in
> C++ to ingest that text format will be more pleasant than our current
> way of writing tests. As a data point, the JSON integration code in C++
> is really annoying to maintain.
>
> Regards
>
> Antoine.
>
>
> On 15/05/2021 00:03, Wes McKinney wrote:
> > In C++, we have the "ArrayFromJSON" function which is an even simpler
> > way of specifying input data compared with the integration tests.
> > That's one possible starting point.
> >
> > The "interpreted tests" could be all specified and driven by minimal
> > dependency Python code, as one possible way to approach things.
> >
> > On Fri, May 14, 2021 at 1:57 PM Jorge Cardoso Leitão
> > <jorgecarlei...@gmail.com> wrote:
> >>
> >> Hi,
> >>
> >> (this problem also exists in Rust, btw)
> >>
> >> Couldn't we use something like we do for our integration tests? Create a
> >> separate binary that would allow us to call e.g.
> >>
> >> test-compute --method equal --json-file <path> --arg "column1" --arg
> >> "column2" --expected "column3"
> >>
> >> (or simply pass the input via stdin)
> >>
> >> and then use Python to call the binary?
> >>
> >> The advantage I see here is that we would compile the binary with flags to
> >> disable unnecessary code, use debug, etc., thereby reducing compile time
> >> if the kernel needs to be changed.
> >>
> >> IMO our "json integration" format is a reliable way of passing data across;
> >> it is very easy to read and write, and all our implementations can already
> >> read it for integration tests.
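To make this concrete, here is a minimal sketch of how a Python harness might turn a JSON-described test case into an invocation of the proposed binary. The `test-compute` name, its flags, and the case layout are all hypothetical, taken from the command line sketched above:

```python
import json

# A hypothetical test case in the spirit of the proposal above: the JSON
# integration format would carry the columns, and the harness would point
# the (hypothetical) test-compute binary at them.
case = {
    "method": "equal",
    "json_file": "cases/equal_int32.json",
    "args": ["column1", "column2"],
    "expected": "column3",
}

def build_invocation(case):
    """Build the argv for the proposed test-compute binary from a test case."""
    argv = ["test-compute", "--method", case["method"],
            "--json-file", case["json_file"]]
    for arg in case["args"]:
        argv += ["--arg", arg]
    argv += ["--expected", case["expected"]]
    return argv

print(build_invocation(case))
```

A driver would then hand each argv to `subprocess.run` (or stream the case over stdin) and report mismatches, keeping all the orchestration out of the compiled binary.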
> >>
> >> wrt the "cross implementation" aspect, the equality operation seems a
> >> natural candidate for cross-implementation checks, as that one has
> >> important implications in all our integration tests. filter, take, slice,
> >> and boolean ops may also be easy to agree upon. "add" and the like are a
> >> bit more difficult due to how overflow should be handled (abort vs
> >> saturate vs None), but nothing that we can't tackle. ^_^
> >>
> >> Best,
> >> Jorge
> >>
> >> On Fri, May 14, 2021 at 8:25 PM David Li <lidav...@apache.org> wrote:
> >>
> >>> I think even if it's not (easily) generalizable across languages, it'd
> >>> still be a win for C++ (and hopefully languages that bind to
> >>> C++). Also, I don't think they're meant to completely replace
> >>> language-specific tests, but rather complement them, and make it
> >>> easier to add and maintain tests in the overwhelmingly common case.
> >>>
> >>> I do feel it's somewhat painful to write these kinds of tests in C++,
> >>> largely because of the iteration time and the difficulty of repeating
> >>> tests across various configurations. I also think this could be an
> >>> opportunity to leverage things like Hypothesis/property-based testing
> >>> or perhaps fuzzing to make the kernels even more robust.
> >>>
> >>> -David
> >>>
> >>> On 2021/05/14 18:09:45, Eduardo Ponce <edponc...@gmail.com> wrote:
> >>>> Another aspect to keep in mind is that some tests require internal
> >>>> options to be changed before executing the compute functions (e.g.,
> >>>> check overflow, allow NaN comparisons, change validity bits, etc.).
> >>>> Also, there are tests that take randomized inputs, and others make use
> >>>> of the min/max values for each specific data type. Certainly, these
> >>>> details can be generalized across languages/testing frameworks, but
> >>>> not without careful treatment.
> >>>>
> >>>> Moreover, each language implementation still needs to test
> >>>> language-specific or internal functions, so having a meta test
> >>>> framework will not necessarily get rid of language-specific tests.
> >>>>
> >>>> ~Eduardo
> >>>>
> >>>> On Fri, May 14, 2021 at 1:56 PM Weston Pace <weston.p...@gmail.com>
> >>>> wrote:
> >>>>
> >>>>> With that in mind, it seems the somewhat recurring discussion on
> >>>>> coming up with a language-independent standard for logical query plans
> >>>>> (
> >>>>> https://lists.apache.org/thread.html/rfab15e09c97a8fb961d6c5db8b2093824c58d11a51981a40f40cc2c0%40%3Cdev.arrow.apache.org%3E
> >>>>> )
> >>>>> would be relevant. Each test case would then be a triplet of (Input
> >>>>> Dataframe, Logical Plan, Output Dataframe). So perhaps tackling this
> >>>>> effort would be a way to make progress on both fronts.
> >>>>>
> >>>>> On Fri, May 14, 2021 at 7:39 AM Julian Hyde <jhyde.apa...@gmail.com>
> >>>>> wrote:
> >>>>>>
> >>>>>> Do any of these compute functions have analogs in other
> >>>>>> implementations of Arrow (e.g. Rust)?
> >>>>>>
> >>>>>> I believe that as much as possible of Arrow's compute functionality
> >>>>>> should be cross-language. Perhaps there are language-specific
> >>>>>> differences in how functions are invoked, but the basic functionality
> >>>>>> is the same.
> >>>>>>
> >>>>>> If people buy into that premise, then a single suite of tests is a
> >>>>>> powerful way to make that happen. The tests can be written in a
> >>>>>> high-level language and can generate tests in each implementation
> >>>>>> language. (For these purposes, the "high-level language" could be a
> >>>>>> special text format, could be a data language such as JSON, or could
> >>>>>> be a programming language such as Python; it doesn't matter much.)
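The (Input Dataframe, Logical Plan, Output Dataframe) triplet could be prototyped with plain JSON and a toy interpreter. The plan grammar and column encoding below are invented purely for illustration and are not any existing Arrow format:

```python
import json

# A sketch of the triplet idea: each test case pairs input columns and an
# expected output with a (highly simplified) logical plan.
case = json.loads("""
{
  "input": {"a": [1, 2, 3], "b": [3, 2, 1]},
  "plan": [{"op": "equal", "args": ["a", "b"], "as": "eq"}],
  "output": {"eq": [false, true, false]}
}
""")

def run_plan(columns, plan):
    """Evaluate the toy plan against named columns, element-wise."""
    cols = dict(columns)
    for step in plan:
        if step["op"] == "equal":
            x, y = (cols[name] for name in step["args"])
            cols[step["as"]] = [u == v for u, v in zip(x, y)]
        else:
            raise ValueError(f"unknown op: {step['op']}")
    return cols

# Check every expected output column against the interpreter's result.
for name, expected in case["output"].items():
    assert run_plan(case["input"], case["plan"])[name] == expected
```

The same case file could then be replayed against each implementation's real plan executor, with this reference interpreter serving only as a cross-check.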
> >>>>>>
> >>>>>> For example,
> >>>>>>
> >>>>>> assertThatCall("foo(1, 2)", returns("3"))
> >>>>>>
> >>>>>> might actually call foo with arguments 1 and 2, or it might generate
> >>>>>> a C++ or Rust test that does the same.
> >>>>>>
> >>>>>>
> >>>>>> Julian
> >>>>>>
> >>>>>>
> >>>>>>> On May 14, 2021, at 8:45 AM, Antoine Pitrou <anto...@python.org>
> >>>>>>> wrote:
> >>>>>>>
> >>>>>>>
> >>>>>>> On 14/05/2021 15:30, Wes McKinney wrote:
> >>>>>>>> hi folks,
> >>>>>>>>
> >>>>>>>> As we build more functions (kernels) in the project, I note that
> >>>>>>>> the amount of hand-coded C++ code relating to testing function
> >>>>>>>> correctness is growing significantly. Many of these tests are
> >>>>>>>> quite simple and could be expressed in a text format that can be
> >>>>>>>> parsed and evaluated. Thoughts about building something like that
> >>>>>>>> to make it easier to write functional correctness tests?
> >>>>>>>
> >>>>>>> Or perhaps build up higher-level C++ functions if desired?
> >>>>>>>
> >>>>>>> Or even write some of those tests as part of the PyArrow test
> >>>>>>> suite.
> >>>>>>>
> >>>>>>> I'm not sure adding a custom (and probably inflexible) text format
> >>>>>>> is really a good use of our time.
> >>>>>>>
> >>>>>>> Regards
> >>>>>>>
> >>>>>>> Antoine.
> >>>>>>
> >>>>>
> >>>>
> >>>
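As a rough sketch of the "interpreted" flavor of the assertThatCall idea: a driver could parse the call text, evaluate it against a registry of compute functions, and compare the result with the expected value. The registry contents and helper names here are hypothetical; a generator backend could instead emit a C++ or Rust test from the same two strings:

```python
import ast

# Stand-in compute functions; a real registry would dispatch to kernels.
REGISTRY = {"add": lambda a, b: a + b}

def assert_that_call(call_text, expected_text):
    """Parse e.g. 'add(1, 2)', run it against the registry, check the result."""
    node = ast.parse(call_text, mode="eval").body
    func = REGISTRY[node.func.id]
    args = [ast.literal_eval(a) for a in node.args]
    result = func(*args)
    expected = ast.literal_eval(expected_text)
    assert result == expected, f"{call_text} -> {result!r}, expected {expected!r}"
    return result

assert_that_call("add(1, 2)", "3")
```

Because both arguments are plain text, the same test corpus could drive this interpreter, a code generator, or a subprocess harness without modification.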