Aman Sinha created DRILL-3856:
---------------------------------

             Summary: Enhance Scan costing to include factors other than row count
                 Key: DRILL-3856
                 URL: https://issues.apache.org/jira/browse/DRILL-3856
             Project: Apache Drill
          Issue Type: Bug
          Components: Query Planning & Optimization
    Affects Versions: 1.1.0
            Reporter: Aman Sinha

The computeSelfCost() methods of DrillScanRel and ScanPrel currently compute the CPU cost of a Scan as a function of row count and column count only. This works fine as long as there is a single type of Scan plan. With the addition of the native reader for Hive Parquet tables, there are now two ways to do the same scan: a HiveScan and a Drill native scan. Both scans produce the same row count, so the costing needs a way to differentiate between the two. The CPU and memory cost of the Drill native scan is expected to be lower than that of the HiveScan, so these factors need to be included in the costing.
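
To make the request concrete, here is a rough sketch of a scan cost that takes the reader type into account. Everything in it is an assumption for illustration: the class ScanCostSketch, the enum ReaderKind, and the factor values are hypothetical and are not part of Drill's planner code. The baseline of row count times column count is also a simplification of "a function of row count and column count".

```java
// Illustrative sketch only: hypothetical names, not Drill's planner API.
final class ScanCostSketch {

    /** Hypothetical per-reader weights; the values are placeholders, not measurements. */
    enum ReaderKind {
        HIVE_SCAN(1.0, 1.0),
        DRILL_NATIVE_SCAN(0.5, 0.5);

        final double cpuFactor;
        final double memoryFactor;

        ReaderKind(double cpuFactor, double memoryFactor) {
            this.cpuFactor = cpuFactor;
            this.memoryFactor = memoryFactor;
        }
    }

    /** Assumed baseline (rowCount * columnCount), scaled by a reader-specific CPU factor. */
    static double cpuCost(double rowCount, int columnCount, ReaderKind reader) {
        return rowCount * columnCount * reader.cpuFactor;
    }

    /** Same assumed baseline, scaled by a reader-specific memory factor. */
    static double memoryCost(double rowCount, int columnCount, ReaderKind reader) {
        return rowCount * columnCount * reader.memoryFactor;
    }
}
```

With identical row and column counts, the two readers now get different costs, so a cost-based planner would have a basis for preferring the native scan. The actual factors would have to come from measurement.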