[ https://issues.apache.org/jira/browse/DRILL-4380?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Dechang Gu closed DRILL-4380.
-----------------------------

Verified. LGTM.

> Fix performance regression: in creation of FileSelection in 
> ParquetFormatPlugin to not set files if metadata cache is available.
> --------------------------------------------------------------------------------------------------------------------------------
>
>                 Key: DRILL-4380
>                 URL: https://issues.apache.org/jira/browse/DRILL-4380
>             Project: Apache Drill
>          Issue Type: Bug
>            Reporter: Parth Chandra
>             Fix For: 1.5.0
>
>
> The regression has been caused by the changes in 
> 367d74a65ce2871a1452361cbd13bbd5f4a6cc95 (DRILL-2618: handle queries over 
> empty folders consistently so that they report table not found rather than 
> failing.)
> In ParquetFormatPlugin, the original code created a FileSelection object 
> with the following code:
> {code}
> return new FileSelection(fileNames, metaRootPath.toString(), metadata, 
> selection.getFileStatusList(fs));
> {code}
> The selection.getFileStatusList call made an inexpensive call to 
> FileSelection.init(). The call was inexpensive because the 
> FileSelection.files member was not set, so the code did not need to make an 
> expensive call to get the file statuses corresponding to the files in the 
> FileSelection.files member.
> In the new code, this is replaced by:
> {code}
>   final FileSelection newSelection = FileSelection.create(null, fileNames, 
> metaRootPath.toString());
>         return ParquetFileSelection.create(newSelection, metadata);
> {code}
> This sets the FileSelection.files member but not the FileSelection.statuses 
> member. A subsequent call to FileSelection.getStatuses (in 
> ParquetGroupScan()) now makes an expensive call to get all the statuses.
> It appears that there was an implicit assumption that the 
> FileSelection.statuses member should be set before the FileSelection.files 
> member is set. This assumption is no longer true.
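
The mechanism described above can be illustrated with a minimal, self-contained sketch. It is not Drill's code: CountingFileSystem and SelectionSketch are hypothetical stand-ins, and the lazy lookup in getStatuses is only an assumption about how the real FileSelection behaves, based on the description in this issue. The sketch shows why a selection built with files but without statuses pays one file system lookup per file on the first getStatuses call, while a selection built with statuses already supplied pays none.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Hypothetical stand-in for a file system; counts how many status lookups are made. */
class CountingFileSystem {
    int statusCalls = 0;

    String getStatus(String path) {
        statusCalls++;              // each lookup models an expensive file system call
        return "status:" + path;
    }
}

/** Hypothetical, simplified model of a selection whose statuses are resolved lazily. */
class SelectionSketch {
    private final List<String> files;   // file names; may be null
    private List<String> statuses;      // file statuses; may be null until first use

    SelectionSketch(List<String> statuses, List<String> files) {
        this.statuses = statuses;
        this.files = files;
    }

    /** Returns the statuses, looking them up one file at a time if they were not supplied. */
    List<String> getStatuses(CountingFileSystem fs) {
        if (statuses == null) {
            List<String> resolved = new ArrayList<>();
            if (files != null) {
                for (String file : files) {
                    resolved.add(fs.getStatus(file));   // the expensive path
                }
            }
            statuses = resolved;        // cache so later calls are cheap
        }
        return statuses;
    }
}

class SelectionSketchDemo {
    public static void main(String[] args) {
        List<String> files = Arrays.asList("a.parquet", "b.parquet", "c.parquet");

        // Statuses supplied up front: no lookups are needed later.
        CountingFileSystem fs1 = new CountingFileSystem();
        new SelectionSketch(Arrays.asList("s1", "s2", "s3"), files).getStatuses(fs1);
        System.out.println("lookups with statuses supplied: " + fs1.statusCalls);

        // Only files supplied: the first getStatuses call looks up every file.
        CountingFileSystem fs2 = new CountingFileSystem();
        new SelectionSketch(null, files).getStatuses(fs2);
        System.out.println("lookups with only files supplied: " + fs2.statusCalls);
    }
}
```

In this model, the cost of the second case grows with the number of files in the selection, which is consistent with the regression described above for tables with a metadata cache.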



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
