[ https://issues.apache.org/jira/browse/PARQUET-1055?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Ryan Blue updated PARQUET-1055:
-------------------------------
    Fix Version/s:     (was: 1.9.1)

> Improve the creation of ExecutorService when reading footers
> ------------------------------------------------------------
>
>                 Key: PARQUET-1055
>                 URL: https://issues.apache.org/jira/browse/PARQUET-1055
>             Project: Parquet
>          Issue Type: Improvement
>          Components: parquet-mr
>    Affects Versions: 1.9.0
>            Reporter: Benoit Lacelle
>            Priority: Minor
>
> While benchmarking the loading of a large set of Parquet files (3000+) from
> the local FS, we observed some inefficiencies in the number of threads
> created when reading footers.
> When reading, the code takes the parallelism from the Hadoop configuration
> (default: 5) and allocates 2 ExecutorServices with 5 threads each to read
> footers. This is especially inefficient when there are fewer Callables to
> handle than the configured parallelism.

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
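
One possible way to address the inefficiency described in the issue is to cap the pool size at the number of tasks actually submitted. The sketch below is only an illustration in plain `java.util.concurrent`, not the parquet-mr code or the fix that was eventually applied; the class `BoundedFooterPool` and the helpers `poolSize` and `runAll` are hypothetical names, and the value 5 is the default parallelism quoted in the report.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical illustration; not part of parquet-mr.
public class BoundedFooterPool {

    // Never create more threads than there are tasks (and always at least one).
    static int poolSize(int configuredParallelism, int taskCount) {
        return Math.max(1, Math.min(configuredParallelism, taskCount));
    }

    // Runs all tasks on a single pool sized by poolSize(...) and returns
    // the results in the same order as the submitted tasks.
    static <T> List<T> runAll(List<Callable<T>> tasks, int configuredParallelism)
            throws InterruptedException, ExecutionException {
        List<T> results = new ArrayList<>();
        if (tasks.isEmpty()) {
            return results; // nothing to do, so no threads are created
        }
        ExecutorService pool =
                Executors.newFixedThreadPool(poolSize(configuredParallelism, tasks.size()));
        try {
            for (Future<T> future : pool.invokeAll(tasks)) {
                results.add(future.get());
            }
            return results;
        } finally {
            pool.shutdown(); // release the worker threads once the work is done
        }
    }

    public static void main(String[] args) throws Exception {
        // Stand-ins for footer-reading Callables; two tasks, parallelism of 5.
        List<Callable<String>> tasks = new ArrayList<>();
        tasks.add(() -> "footer-a");
        tasks.add(() -> "footer-b");
        System.out.println(poolSize(5, tasks.size()));
        System.out.println(runAll(tasks, 5));
    }
}
```

With two tasks and a configured parallelism of 5, this sketch starts a pool of 2 threads rather than 5, and an empty task list creates no pool at all.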