Arina Ielchiieva created DRILL-7419:
---------------------------------------

             Summary: Enhance Drill splitting logic for compressed files
                 Key: DRILL-7419
                 URL: https://issues.apache.org/jira/browse/DRILL-7419
             Project: Apache Drill
          Issue Type: Improvement
    Affects Versions: 1.16.0
            Reporter: Arina Ielchiieva


By default Drill treats all compressed files as non-splittable. Drill uses BlockMapBuilder to split a file into blocks where possible. According to its code, it tries to split the file only if blockSplittable is set to true and the file IS NOT compressed. So even if the format is block splittable, a file that arrives compressed won't be split.
https://github.com/apache/drill/blob/master/exec/java-exec/src/main/java/org/apache/drill/exec/store/schedule/BlockMapBuilder.java#L115

But some compression codecs are splittable, for example bzip2 (https://i.stack.imgur.com/jpprr.jpg). The codec type should be taken into account when deciding whether a file can be split.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
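
The difference between the current check and the proposed one can be sketched in a few lines of Java. This is only an illustration under stated assumptions, not Drill's actual code: the class SplitDecision, its methods, and the extension lists are hypothetical, and codecs are guessed from the file extension to keep the example self-contained. A real change would more likely inspect the resolved Hadoop codec, for instance whether it implements SplittableCompressionCodec (as the bzip2 codec does).

```java
import java.util.Locale;
import java.util.Set;

// Hypothetical sketch; none of these names come from the Drill code base.
final class SplitDecision {

    // Assumption: an illustrative, incomplete list of compressed-file extensions.
    private static final Set<String> COMPRESSED_EXTENSIONS =
            Set.of("gz", "bz2", "snappy", "lz4", "deflate");

    // Assumption: only bzip2 is listed, because it is the example named in the issue.
    private static final Set<String> SPLITTABLE_CODEC_EXTENSIONS = Set.of("bz2");

    // Returns the lower-cased text after the last dot, or "" if there is none.
    static String extension(String fileName) {
        int dot = fileName.lastIndexOf('.');
        if (dot < 0 || dot == fileName.length() - 1) {
            return "";
        }
        return fileName.substring(dot + 1).toLowerCase(Locale.ROOT);
    }

    static boolean isCompressed(String fileName) {
        return COMPRESSED_EXTENSIONS.contains(extension(fileName));
    }

    // Behavior as described in the issue: any compressed file is never split.
    static boolean canSplitCurrent(boolean blockSplittable, String fileName) {
        return blockSplittable && !isCompressed(fileName);
    }

    // Proposed behavior: a compressed file may still be split
    // when its codec is itself splittable.
    static boolean canSplitProposed(boolean blockSplittable, String fileName) {
        if (!blockSplittable) {
            return false;
        }
        return !isCompressed(fileName)
                || SPLITTABLE_CODEC_EXTENSIONS.contains(extension(fileName));
    }

    public static void main(String[] args) {
        for (String name : new String[] {"data.csv", "data.csv.gz", "data.csv.bz2"}) {
            System.out.println(name
                    + " current=" + canSplitCurrent(true, name)
                    + " proposed=" + canSplitProposed(true, name));
        }
    }
}
```

With a block-splittable format, the two checks differ only for data.csv.bz2: the current rule refuses to split it, while the codec-aware rule allows it. A gzip file stays unsplit under both rules.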