roee88 commented on pull request #11067: URL: https://github.com/apache/arrow/pull/11067#issuecomment-911998111
> I skimmed through it and it looks good, thanks a lot for this!
>
> What we did in Rust was to use the c data interface that C++ exposes in Python to make calls from within the process, see [here](https://github.com/apache/arrow-rs/tree/master/arrow-pyarrow-integration-testing).
>
> I think it would be beneficial to add integration tests against pyarrow. In Rust we found a couple of memory leaks and double frees during development by testing against pyarrow / c++.

Thanks Jorge. We have @tomersolomon1 working on integration tests against pyarrow as a follow-up to this PR. The [approach](https://github.com/roee88/arrow/issues/6) that we are trying for the integration tests is to use jpype (as suggested by @pitrou) and run the tests against the same datasets used in the IPC integration tests. It's too early to say whether that makes sense. I think that eventually the same set of tests should be used for all languages.

> I would also setup an environment to run those tests against e.g. valgrind, since in FFI is very easy to trigger UB.

I'm not familiar with using valgrind in Java, but I will definitely check. FWIW, the tests do fail in case of a memory leak for memory allocated with a buffer allocator (`allocator#close()` raises an exception if there are allocated bytes left).

Somewhat unrelated, but I know that you ran tests with valgrind for arrow2; did you also run them for arrow-rs? I thought that there is a memory leak in arrow-rs because no one has ported your export API fixes yet. I'm asking just out of curiosity, to check my memory leak assumption.

--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscr...@arrow.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org