Andrew Sherman created IMPALA-10588:
---------------------------------------
Summary: PlannerTest/resource-requirements.test fails with bad mem estimates (from number of files?)
Key: IMPALA-10588
URL: https://issues.apache.org/jira/browse/IMPALA-10588
Project: IMPALA
Issue Type: Bug
Affects Versions: Impala 4.0
Reporter: Andrew Sherman

We see an unexpected plan for "select * from tpch_orc_def.lineitem" with Hive v3.

The first line that differs is
{code}
Per-Host Resource Estimates: Memory=188MB
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
{code}
but in the scan we see
{code}
HDFS partitions=1/1 files=1 size=142.84MB
{code}
instead of the expected
{code}
HDFS partitions=1/1 files=5 size=142.72MB
{code}

Could this be a regression from the recent change IMPALA-10503, which changed data loading?

{code}
Section PLAN of query:
select * from tpch_orc_def.lineitem

Actual does not match expected result:
Max Per-Host Resource Reservation: Memory=12.00MB Threads=2
Per-Host Resource Estimates: Memory=188MB
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
Analyzed query: SELECT * FROM tpch_orc_def.lineitem

F00:PLAN FRAGMENT [UNPARTITIONED] hosts=1 instances=1
|  Per-Host Resources: mem-estimate=188.00MB mem-reservation=12.00MB thread-reservation=2
PLAN-ROOT SINK
|  output exprs: tpch_orc_def.lineitem.l_orderkey, tpch_orc_def.lineitem.l_partkey, tpch_orc_def.lineitem.l_suppkey, tpch_orc_def.lineitem.l_linenumber, tpch_orc_def.lineitem.l_quantity, tpch_orc_def.lineitem.l_extendedprice, tpch_orc_def.lineitem.l_discount, tpch_orc_def.lineitem.l_tax, tpch_orc_def.lineitem.l_returnflag, tpch_orc_def.lineitem.l_linestatus, tpch_orc_def.lineitem.l_shipdate, tpch_orc_def.lineitem.l_commitdate, tpch_orc_def.lineitem.l_receiptdate, tpch_orc_def.lineitem.l_shipinstruct, tpch_orc_def.lineitem.l_shipmode, tpch_orc_def.lineitem.l_comment
|  mem-estimate=100.00MB mem-reservation=4.00MB spill-buffer=2.00MB thread-reservation=0
|
00:SCAN HDFS [tpch_orc_def.lineitem]
   HDFS partitions=1/1 files=1 size=142.84MB
   stored statistics:
     table: rows=6.00M size=142.84MB
     columns: all
   extrapolated-rows=disabled max-scan-range-rows=6.00M
   mem-estimate=88.00MB mem-reservation=8.00MB thread-reservation=1
   tuple-ids=0 row-size=231B cardinality=6.00M
   in pipelines: 00(GETNEXT)

Expected:
Max Per-Host Resource Reservation: Memory=12.00MB Threads=2
Per-Host Resource Estimates: Memory=140MB
Analyzed query: SELECT * FROM tpch_orc_def.lineitem

F00:PLAN FRAGMENT [UNPARTITIONED] hosts=1 instances=1
|  Per-Host Resources: mem-estimate=140.00MB mem-reservation=12.00MB thread-reservation=2
PLAN-ROOT SINK
|  output exprs: tpch_orc_def.lineitem.l_orderkey, tpch_orc_def.lineitem.l_partkey, tpch_orc_def.lineitem.l_suppkey, tpch_orc_def.lineitem.l_linenumber, tpch_orc_def.lineitem.l_quantity, tpch_orc_def.lineitem.l_extendedprice, tpch_orc_def.lineitem.l_discount, tpch_orc_def.lineitem.l_tax, tpch_orc_def.lineitem.l_returnflag, tpch_orc_def.lineitem.l_linestatus, tpch_orc_def.lineitem.l_shipdate, tpch_orc_def.lineitem.l_commitdate, tpch_orc_def.lineitem.l_receiptdate, tpch_orc_def.lineitem.l_shipinstruct, tpch_orc_def.lineitem.l_shipmode, tpch_orc_def.lineitem.l_comment
|  mem-estimate=100.00MB mem-reservation=4.00MB spill-buffer=2.00MB thread-reservation=0
|
00:SCAN HDFS [tpch_orc_def.lineitem]
   HDFS partitions=1/1 files=5 size=142.72MB
   stored statistics:
     table: rows=6.00M size=142.72MB
     columns: all
   extrapolated-rows=disabled max-scan-range-rows=1.73M
   mem-estimate=40.00MB mem-reservation=8.00MB thread-reservation=1
   tuple-ids=0 row-size=231B cardinality=6.00M
   in pipelines: 00(GETNEXT)
{code}
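
The differences between the actual and expected scan nodes in this report can be pulled out mechanically. Below is a minimal Python sketch, not part of the Impala test framework: the helper names `parse_fields`, `diff_fields`, and `mb` are hypothetical, and the input strings are the scan-node fields copied from the two plans quoted in this report. It also checks one observation from those plans: in both of them the fragment's per-host estimate equals the PLAN-ROOT SINK estimate plus the scan estimate (100 + 88 = 188 and 100 + 40 = 140). Whether the planner always combines estimates this way is an assumption, not something this report establishes.

```python
import re

# Scan-node fields copied from the "Actual" and "Expected" plans in this report.
ACTUAL_SCAN = (
    "HDFS partitions=1/1 files=1 size=142.84MB "
    "max-scan-range-rows=6.00M mem-estimate=88.00MB"
)
EXPECTED_SCAN = (
    "HDFS partitions=1/1 files=5 size=142.72MB "
    "max-scan-range-rows=1.73M mem-estimate=40.00MB"
)

# PLAN-ROOT SINK estimate, identical in both plans.
SINK_MEM_ESTIMATE = "100.00MB"


def parse_fields(line):
    """Return a dict of the key=value tokens found in a plan line."""
    return dict(re.findall(r"([\w-]+)=(\S+)", line))


def diff_fields(actual, expected):
    """Map each differing key to its (actual, expected) pair of values."""
    a, e = parse_fields(actual), parse_fields(expected)
    return {
        key: (a.get(key), e.get(key))
        for key in sorted(set(a) | set(e))
        if a.get(key) != e.get(key)
    }


def mb(value):
    """Convert a string such as '88.00MB' to a float number of megabytes."""
    if not value.endswith("MB"):
        raise ValueError("expected a value ending in MB: " + value)
    return float(value[:-2])


if __name__ == "__main__":
    for key, (got, want) in diff_fields(ACTUAL_SCAN, EXPECTED_SCAN).items():
        print(f"{key}: actual={got} expected={want}")

    # Sum of sink and scan estimates for each plan, in MB.
    for label, scan in (("actual", ACTUAL_SCAN), ("expected", EXPECTED_SCAN)):
        total = mb(SINK_MEM_ESTIMATE) + mb(parse_fields(scan)["mem-estimate"])
        print(f"{label}: sink + scan = {total:.2f}MB")
```

The fields that differ are `files`, `size`, `max-scan-range-rows`, and `mem-estimate`, which fits the suspicion in the summary that the change in file count is what moved the memory estimate.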