Hi Experts, PyArrow's *Table.from_pylist* does not release memory until the program terminates. I created a sample script to highlight the issue. I have also tried setting `pa.jemalloc_set_decay_ms(0)`, but it didn't help much. Could you please check this and let me know whether there is a known issue or a workaround to resolve it?

    >>> pyarrow.__version__
    '12.0.0'

OS Details:
OS: macOS 13.4 (22F66)
Kernel Version: Darwin 22.5.0

Sample code to reproduce (it needs memory_profiler):

    # file_name: test_exec.py
    import pyarrow as pa
    import time
    import random
    import string
    from memory_profiler import profile


    def get_sample_data():
        record1 = {}
        for col_id in range(15):
            record1[f"column_{col_id}"] = string.ascii_letters[10 : random.randint(17, 49)]
        return [record1]


    def construct_data(data):
        count = 1
        while count < 10:
            pa.Table.from_pylist(data * 100000)
            count += 1
        return True


    @profile
    def main():
        data = get_sample_data()
        construct_data(data)
        print("construct data completed!")


    if __name__ == "__main__":
        main()
        time.sleep(600)

memory_profiler output:

    Filename: test_exec.py

    Line #    Mem usage    Increment  Occurrences   Line Contents
    =============================================================
        41     65.6 MiB     65.6 MiB           1   @profile
        42                                         def main():
        43     65.6 MiB      0.0 MiB           1       data = get_sample_data()
        44    203.8 MiB    138.2 MiB           1       construct_data(data)
        45    203.8 MiB      0.0 MiB           1       print("construct data completed!")

Regards,
Alex