paleolimbot commented on issue #33094: URL: https://github.com/apache/arrow/issues/33094#issuecomment-1387452294
I've done some bisecting of the tests in pursuit of a minimal reproducer here. Since it appears that the docker image used in the nightly test is the only way to reproduce this, I looked up some image details and what gets run.

- The image is defined in arrow/docker-compose.yml ("ubuntu-r-valgrind").
- It's based on Winston Chang's r-debug image.
- The image might have an Ubuntu version mismatch going on: some of the options seem to suggest it's an 18.04 image, but I'm pretty sure it's 22.04 that's running.
- The script that runs is ci/scripts/r_valgrind.sh. It basically runs r/tests/testthat.R with `R -d valgrind`.

```
  ubuntu-r-valgrind:
    # Only 18.04 and amd64 supported
    # Usage:
    #   docker-compose build ubuntu-r-valgrind
    #   docker-compose run ubuntu-r-valgrind
    image: ${REPO}:amd64-ubuntu-18.04-r-valgrind
    build:
      context: .
      dockerfile: ci/docker/linux-r.dockerfile
      cache_from:
        - ${REPO}:amd64-ubuntu-18.04-r-valgrind
      args:
        base: wch1/r-debug:latest
        r_bin: RDvalgrind
        tz: ${TZ}
    environment:
      <<: [*ccache, *sccache]
      ARROW_R_DEV: ${ARROW_R_DEV}
      # AVX512 not supported by Valgrind (similar to ARROW-9851) some runners support AVX512 and some do not
      # so some build might pass without this setting, but we want to ensure that we stay to AVX2 regardless of runner.
      EXTRA_CMAKE_FLAGS: "-DARROW_RUNTIME_SIMD_LEVEL=AVX2"
      ARROW_SOURCE_HOME: "/arrow"
    volumes: *ubuntu-volumes
    command: >
      /bin/bash -c "
        /arrow/ci/scripts/r_valgrind.sh /arrow"
```

To find a test file with a leak, I modified `r/tests/testthat.R` with a filter to run specific tests:

```r
# Tried:
# filter = "^Array" (no leaks)
# filter = "^dataset" (no leaks)
# filter = "^dplyr" (leaks!)
# filter = "^dplyr-[g-u]" (leaks!)
# filter = "^dplyr-[s-u]" (leaks!)
# filter = "^dplyr-summarize" (leaks!)
test_check("arrow", reporter = arrow_reporter, filter = "^dplyr-summarize")
```

Next, I'll see if I can isolate one test in the summarize tests that leaks consistently.
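For anyone following along, here is a rough sketch of how each `filter` value narrows the set of test files. It assumes testthat's documented behaviour of matching the regular expression against the file name with the `test-` prefix and `.R` suffix stripped. The file names and the `matching_files()` helper are hypothetical, made up for illustration; they are not the actual contents of r/tests/testthat/ or anything testthat exports.

```r
# Hypothetical test file names, for illustration only; the real files live
# under r/tests/testthat/ and are not listed in this comment.
test_files <- c(
  "test-Array.R",
  "test-dataset.R",
  "test-dplyr-filter.R",
  "test-dplyr-summarize.R",
  "test-dplyr-union.R"
)

# Approximate how testthat applies `filter`: strip the "test-" prefix and
# the ".R" extension, then match the regular expression on what is left.
# (Hypothetical helper, not part of testthat.)
matching_files <- function(files, filter) {
  stems <- sub("\\.[rR]$", "", sub("^test-", "", files))
  files[grepl(filter, stems)]
}

# The dplyr filters tried above, from widest to narrowest.
filters <- c("^dplyr", "^dplyr-[g-u]", "^dplyr-[s-u]", "^dplyr-summarize")
for (f in filters) {
  cat(f, "->", paste(matching_files(test_files, f), collapse = ", "), "\n")
}
```

Each narrower pattern selects a subset of the previous one, so as long as the leak keeps showing up, the culprit is still somewhere in the files that match.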