Xinyu Zeng created ORC-1232:
-------------------------------

             Summary: Disable metrics collector by default
                 Key: ORC-1232
                 URL: https://issues.apache.org/jira/browse/ORC-1232
             Project: ORC
          Issue Type: Improvement
            Reporter: Xinyu Zeng

ORC-961 introduced a metrics collector for the reader. However, it can slow down reading ORC files, so it may be helpful to disable it by default.

Reproducible experiment result:

Alibaba Cloud [ecs.s6-c1m4.xlarge|https://help.aliyun.com/document_detail/25378.html#s6], running Ubuntu 20.04, ESSD PL1 40GB

The original file is a 4.1GB CSV file of generated strings with some degree of repetitiveness (the values of one column follow a Zipfian distribution). The ORC file, written with dictionary encoding and no block compression, is 319MB.

Time of running orc-scan with metrics enabled: 7.5s
Time of running orc-scan with metrics disabled: 1.5s

Metrics were disabled by adding

readerOpts.setReaderMetrics(nullptr);

after https://github.com/apache/orc/blob/02e48107b36b8ed868797dadcd7355a632519d48/tools/src/FileScan.cc#L26
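
The effect of passing nullptr as the metrics pointer can be illustrated with a small standalone sketch. This is not the ORC implementation: SketchMetrics, ScopedTimer, decodeBatch and scanAll are hypothetical names made up for illustration, and the sketch only assumes that a collector of this kind times each call and updates atomic counters unless the pointer is null. It does not attempt to reproduce the timings reported above.

```cpp
#include <atomic>
#include <cassert>
#include <chrono>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for a reader metrics struct (not the ORC definition).
struct SketchMetrics {
  std::atomic<uint64_t> decodeCalls{0};
  std::atomic<uint64_t> decodeLatencyUs{0};
};

// RAII timer that records a call only when a metrics object is supplied.
// With a null pointer, no clock is read and no atomic counter is touched.
class ScopedTimer {
 public:
  explicit ScopedTimer(SketchMetrics* metrics) : metrics_(metrics) {
    if (metrics_ != nullptr) {
      start_ = std::chrono::steady_clock::now();
    }
  }

  ~ScopedTimer() {
    if (metrics_ == nullptr) {
      return;  // metrics disabled: nothing to record
    }
    const auto elapsed = std::chrono::duration_cast<std::chrono::microseconds>(
        std::chrono::steady_clock::now() - start_);
    assert(elapsed.count() >= 0);  // steady_clock never goes backwards
    metrics_->decodeLatencyUs.fetch_add(static_cast<uint64_t>(elapsed.count()));
    metrics_->decodeCalls.fetch_add(1);
  }

  ScopedTimer(const ScopedTimer&) = delete;
  ScopedTimer& operator=(const ScopedTimer&) = delete;

 private:
  SketchMetrics* metrics_;
  std::chrono::steady_clock::time_point start_{};
};

// Hypothetical hot-path function: sums one batch, optionally timed.
uint64_t decodeBatch(const std::vector<uint64_t>& batch, SketchMetrics* metrics) {
  ScopedTimer timer(metrics);
  uint64_t sum = 0;
  for (uint64_t value : batch) {
    sum += value;
  }
  return sum;
}

// Scans every batch; pass nullptr to skip all metrics bookkeeping.
uint64_t scanAll(const std::vector<std::vector<uint64_t>>& batches,
                 SketchMetrics* metrics) {
  uint64_t total = 0;
  for (const auto& batch : batches) {
    total += decodeBatch(batch, metrics);
  }
  return total;
}
```

Calling scanAll(batches, &metrics) and scanAll(batches, nullptr) yields the same result; only the first updates the counters. In a sketch like this, the per-call cost of the enabled path is two clock reads and two atomic additions, which matters most when each timed call does very little work.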