Hi, I'm trying to read 4096 Parquet files with a total size of 6GB, following this cookbook: https://arrow.apache.org/cookbook/java/dataset.html#query-parquet-file
I'm using 100 threads, each processing one file at a time, on a 72-core machine with a 32GB heap. The files are pre-loaded in memory. However, it takes about 10 minutes to process all 4096 files, even though they total only 6GB, and the process seems to be CPU-bound.

Is this the expected read performance for Parquet files, or am I doing something wrong? Any help or tips would be appreciated.

Thanks,
Paulo
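
P.S. For scale, here is a rough back-of-the-envelope calculation of the throughput those numbers imply. It is only a sketch: it assumes that 6GB means 6 * 1024 MiB, that the run took exactly 10 minutes, and that the work was spread evenly over the 100 threads. The class and method names are made up for illustration and are not part of Arrow.

```java
public class ThroughputEstimate {

    // Aggregate read throughput across all threads, in MiB per second.
    static double aggregateMiBPerSec(double totalMiB, double seconds) {
        return totalMiB / seconds;
    }

    // Average size of one file, in MiB.
    static double avgFileMiB(double totalMiB, int files) {
        return totalMiB / files;
    }

    // Number of files completed per second across all threads.
    static double filesPerSec(int files, double seconds) {
        return files / seconds;
    }

    // Wall-clock seconds one thread spends on one file, assuming every
    // thread is busy for the whole run and files are spread evenly.
    static double secondsPerFilePerThread(int files, double seconds, int threads) {
        return threads * seconds / files;
    }

    public static void main(String[] args) {
        int files = 4096;            // from the post
        double totalMiB = 6 * 1024;  // assumption: 6GB taken as 6144 MiB
        double seconds = 10 * 60;    // assumption: "about 10 minutes" taken as 600 s
        int threads = 100;           // from the post

        System.out.printf("aggregate throughput: %.2f MiB/s%n",
                aggregateMiBPerSec(totalMiB, seconds));
        System.out.printf("average file size:    %.2f MiB%n",
                avgFileMiB(totalMiB, files));
        System.out.printf("files per second:     %.2f%n",
                filesPerSec(files, seconds));
        System.out.printf("seconds per file per thread: %.2f%n",
                secondsPerFilePerThread(files, seconds, threads));
    }
}
```

Under those assumptions this works out to about 10 MiB/s in aggregate, an average file size of 1.5 MiB, and roughly 14.6 seconds of wall-clock time per file per thread. With 100 threads on 72 cores the threads are not all running at once, so the CPU time per file is probably somewhat lower than that last figure.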
