[
https://issues.apache.org/jira/browse/DRILL-7763?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17161031#comment-17161031
]
ASF GitHub Bot commented on DRILL-7763:
---------------------------------------
vvysotskyi commented on a change in pull request #2092:
URL: https://github.com/apache/drill/pull/2092#discussion_r457178768
##########
File path:
exec/java-exec/src/test/java/org/apache/drill/exec/metastore/TestMetastoreWithEasyFormatPlugin.java
##########
@@ -1079,14 +1079,14 @@ public void testFilesPruningWithLimit() throws Exception {
queryBuilder()
.sql("select * from dfs.tmp.`%s` limit 1", tableName)
.planMatcher()
- .include("Limit", "numFiles=1,")
+ .include("Limit", "numFiles=12,")
Review comment:
This is the regression I mentioned in the previous comment: it is
suboptimal to read all 12 files instead of just the single one.
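The pruning the reviewer expects can be sketched as follows. This is an illustrative model only, not Drill's actual scan code, and `FilePruningSketch`/`filesRead` are hypothetical names: with `limit 1`, a limit-aware scan that tracks how many rows it still needs should open only the first file rather than all 12.

```java
import java.util.List;

class FilePruningSketch {
  // Counts how many files a limit-aware scan would actually open:
  // once `limit` rows have been emitted, the remaining files are pruned.
  static int filesRead(List<Integer> rowsPerFile, int limit) {
    int emitted = 0;
    int filesOpened = 0;
    for (int rows : rowsPerFile) {
      if (emitted >= limit) {
        break;  // limit satisfied: skip all remaining files
      }
      filesOpened++;
      emitted += Math.min(rows, limit - emitted);
    }
    return filesOpened;
  }
}
```

With 12 files of 100 rows each and `limit 1`, this model opens a single file, which is what the `numFiles=1` plan pattern in the original test asserted.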
##########
File path:
exec/java-exec/src/main/java/org/apache/drill/exec/store/dfs/easy/EasyGroupScan.java
##########
@@ -172,6 +175,26 @@ private EasyGroupScan(final EasyGroupScan that) {
mappings = that.mappings;
partitionDepth = that.partitionDepth;
metadataProvider = that.metadataProvider;
+ maxRecords = that.maxRecords;
+ supportsLimitPushdown = that.formatPlugin.easyConfig().supportsLimitPushdown;
+ }
+
+ // Constructor to get the limit pushed down
+ private EasyGroupScan(final EasyGroupScan that, int maxRecords) {
Review comment:
Please do not introduce a new constructor. `maxRecords` is not final,
so its value can be assigned after using the existing copy constructor.
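The reviewer's suggestion can be sketched like this. The class and method names below are simplified stand-ins, not Drill's actual `EasyGroupScan`: because `maxRecords` is a non-final field, it can be set right after the existing copy constructor runs, so no extra constructor overload is needed.

```java
class EasyGroupScanSketch {
  int maxRecords = -1;  // non-final; -1 means "no limit pushed down"

  EasyGroupScanSketch() {}

  // The existing copy constructor, reused unchanged.
  EasyGroupScanSketch(EasyGroupScanSketch that) {
    this.maxRecords = that.maxRecords;
  }

  // Apply the pushed-down limit by copying, then assigning the field,
  // instead of introducing a second (copy + limit) constructor.
  EasyGroupScanSketch applyLimit(int maxRecords) {
    EasyGroupScanSketch copy = new EasyGroupScanSketch(this);
    copy.maxRecords = maxRecords;  // plain assignment after construction
    return copy;
  }
}
```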
##########
File path:
exec/java-exec/src/main/java/org/apache/drill/exec/store/dfs/easy/EasyGroupScan.java
##########
@@ -17,31 +17,30 @@
*/
package org.apache.drill.exec.store.dfs.easy;
-import java.io.IOException;
-import java.util.ArrayList;
-import java.util.List;
-
+import com.fasterxml.jackson.annotation.JacksonInject;
Review comment:
Please keep the original imports order.
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
[email protected]
> Add Limit Pushdown to File Based Storage Plugins
> ------------------------------------------------
>
> Key: DRILL-7763
> URL: https://issues.apache.org/jira/browse/DRILL-7763
> Project: Apache Drill
> Issue Type: Improvement
> Affects Versions: 1.17.0
> Reporter: Charles Givre
> Assignee: Charles Givre
> Priority: Major
> Fix For: 1.18.0
>
>
> As currently implemented, when querying a file, Drill will read the entire
> file even if a limit is specified in the query. This PR does a few things:
> # Refactors the EasyGroupScan, EasySubScan, and EasyFormatConfig to allow
> the option of pushing down limits.
> # Applies this to all the EVF based format plugins which are: LogRegex,
> PCAP, SPSS, Esri, Excel and Text (CSV).
> Due to JSON's fluid schema, it would be unwise to apply the limit pushdown, as
> it could result in very inconsistent schemata.
--
This message was sent by Atlassian Jira
(v8.3.4#803005)